MAJOR: solves problem related to ELABFTW_API_URL variable

If no value was specified for this variable (or .env was missing), EAU (ELABFTW_API_URL) was set to None and the program got stuck in a prompt loop. Solved by turning EAU into a required variable in APIHandler (and editing a lot of code throughout src/).
quality improvements
2026-05-14 17:24:02 +02:00 · 2026-05-14 17:21:07 +02:00 · 2026-05-14 17:09:35 +02:00 · 2026-05-14 17:08:56 +02:00 · 2026-05-14 01:40:54 +02:00 · 2026-05-13 21:01:05 +02:00
17 changed files with 11363 additions and 232 deletions
--- a/.env.example
+++ b/.env.example
@@ -0,0 +1,4 @@
+api_key=""
+elabid=""
+ELABFTW_API_URL="https://elabftw.fisica.unina.it/api/v2"
+operative_unit="cnr-spin.na"
--- a/.gitignore
+++ b/.gitignore
@@ -1,3 +1,6 @@
+# ignores bkp files of drawio
+.$*.bkp
+
 # ignores logs of h5tojson, jsontoh5
 *.log

@@ -5,6 +8,7 @@
 output/*.json
 output/*.h5
 output/*.nxs
+output/attachments/*.*

 # ---> Python
 # Byte-compiled / optimized / DLL files
--- a/docs/images/ts-warning.png
+++ b/docs/images/ts-warning.png
--- a/docs/images/usage-apigen.png
+++ b/docs/images/usage-apigen.png
--- a/docs/images/usage-difference-dotenv.png
+++ b/docs/images/usage-difference-dotenv.png
--- a/docs/images/usage-elabid.png
+++ b/docs/images/usage-elabid.png
--- a/docs/images/usage-name.png
+++ b/docs/images/usage-name.png
--- a/docs/images/usage-venv.png
+++ b/docs/images/usage-venv.png
--- a/docs/user-manual-v0.2.1-alpha.pdf
+++ b/docs/user-manual-v0.2.1-alpha.pdf
--- a/docs/user-manual_01.adoc
+++ b/docs/user-manual_01.adoc
@@ -0,0 +1,53 @@
+== Introduction
+// TO-DO: Grammar-check. I'm totally fried right now and can't seem to complete even a single proper
+*{software-family}* - short for _**e**LabFTW to Ne**X**us **Pars**er_ - is (hopefully) a family of specialized parsing software applications, mainly developed in Python, whose primary job is to automatically transform experimental metadata and data - originally stored as JSON objects inside an electronic lab notebook - into standardized, self-descriptive **NeXus files**.
+
+The software is designed to fetch "scattered" data (often distributed across multiple linked entries) from our eLNfootnote:[Acronym for "_electronic Lab Notebook_".] of choice - link:{elabftw-site}[**eLabFTW**^] - where the data is originally stored as JSON objects. It then parses the included metadata to resolve the full dataset which is then used to create a dictionary following a pre-established schema (dependent on the analysis or fabrication method, e.g., PLD, XRD, or RHEED), and finally uses said dictionary to produce an **HDF5/NeXus file** which complies with the **FAIR Principles** and the guidelines given within the context of the Italian PNRRfootnote:pnrr[PNRR stands for _National Recovery and Resilience Plan_.] **NFFA-DI**.
+
+Specifically, *{software-name}* is designed for *Pulsed Laser Deposition / PLD* fabrications.
+
+=== NFFA-DI and FAIR Principles
+PNRR (_Piano Nazionale di Ripresa e Resilienza_) is Italy's national recovery plan from the aftermath of COVID-19. +
+*NFFA-DI* (_Nano Foundries and Fine Analysis - Digital Infrastructure_) is a project within this plan aimed at creating a distributed digital infrastructure for nanoscience and nanotechnology. In practice, NFFA-DI provides a unified cyber-platform for researchers to access advanced instrumentation, simulation tools, and data management services across multiple Italian research centers.
+
+Like most modern scientific projects, NFFA-DI is _FAIR by design_, meaning it strives for total compliance with the *FAIR Principles*. FAIR is the acronym of the four main characteristics all compliant projects should share:
+
+> * Findable: «Metadata and data should be easy to find for both humans and computers.»
+> * Accessible: «Once the user finds the required data, she/he/they need to know how they can be accessed, possibly including authentication and authorisation.»
+> * Interoperable: «The data usually need to be integrated with other data. In addition, the data need to interoperate with applications or workflows for analysis, storage, and processing.»
+> * Reusable: «Metadata and data should be well-described so that they can be replicated and/or combined in different settings.»
+> 
+> Source: link:{go-fair-site}[GO FAIR^]
+
+
+{software-name} contributes to NFFA-DI goals by enabling automated data harmonization: converting local PLD experiment records into a common, shareable format (NeXus) with a mutually agreed upon schema, thereby making the data interoperable across the entire NFFA-DI ecosystem.
+
+TIP: More info on NFFA-DI at link:{nffa-di-site}[nffa-di.it^].
+
+=== eLabFTW
+*eLabFTW* is an open-source, web-based electronic laboratory notebook and resource manager. It acts as a central digital hub for one or more laboratories, organizing information (as database entries) into two main constructs:
+
+* **Experiments**: They are the core feature of eLabFTW, can contain structured data (via custom JSON fields), unstructured text, timestamps, tags, links to files (attachments), and relations to database items.
+* **Resources** or **Items**: This is a separate, structured inventory for items like raw materials (targets, substrates), instruments (UHV machines) or samples. Each entry is built from customizable templates with defined metadata (e.g. for a substrate batch we have name, manufacturer, geometry, available pieces left...).
+
+Although separated into different database constructs, experiments and items all have their own unique, incremental internal ID, which we'll simply call *elabid* to distinguish it from other identifiers, with no academic utility but extremely important when dealing with eLabFTW from a developer's perspective.
+
+// method-specific
+In a software like eLabFTW where data can (and will) be spread out through multiple entries, a particularly useful feature is **linking**: the software allows you to link experiments or items with each other, using elabid's as identifiers. For a PLD deposition, you can link the experiment describing a single layer to the target used, the substrate, the PLD instrument and the sample produced itself (all of which are eLabFTW items). This creates a complete provenance graph which can be (not-so) easily resolved starting from the sample's metadata and a chain of HTTP requests.
+
+In this context, {software-name} interacts with eLabFTW via its REST API (Application Programming Interface). It reads a starting sample's ID (the entry point), fetches the relevant JSON metadata, chains requests using the elabid's of the sample's linked resources and experiments, rebuilds the entire dataset and, if available, downloads attached instrument files (e.g., RHEED intensities, images) to package all of it into the final NeXus file.
+
+=== The output: HDF5 and NeXus files
+The output of {software-family} is an **HDF5 (Hierarchical Data Format ver. 5) file**, which is a powerful file format designed to store and organize large volumes of numerical data. It acts like a virtual file system inside a single file, using a hierarchical group/dataset structure in the same way a file system uses folders and files - with both elements having their own metadata; this way the file is self-describing, containing all relevant information like a small database. HDF5 also supports efficient slicing, compression and parallel I/O. The file extension of such format is `.h5`.
+
+On the other hand, *NeXus* is a common data standard [.underline]#built on top of HDF5#. It defines fixed conventions for naming groups, datasets and attributes, specifically for neutron, X-ray, and now materials science experiments. NeXus provides "application definitions" (like _NXpld_fabrication_ for PLD) that specify exactly which fields must/may appear. NeXus is also heavily promoted by _FAIRmat_, a German-based consortium, part of the NFDI, whose main mission is providing scientists «with a FAIR data infrastructure and the skills and tool they need to make the most of it»footnote:[As stated on their link:{fairmat-site}[website^].]. The file extension of such format is `.nxs`, but generally file viewers treat the two formats similarly.
+
+Last but not least, NeXus is also the format of choice for data sharing in the NFFA-DI guidelines. Which brings us to the reason why {software-family} exists.
+
+[#reading-nxs]
+==== Reading HDF5/NeXus files
+While writing an HDF5/NeXus file usually requires dedicated software and/or a good knowledge of programming and familiarity with specific libraries (like h5py), there are multiple ways to read these files even without such knowledge.
+
+One such way is the online NeXus file viewer of the NCNR (_NIST Center for Neutron Research_), available on their link:{ncnr-viewer}[website^]. The "_Browse..._" button at the bottom allows for uploading both h5 and nxs files, although drag and drop also works.
+
+Another similar but, in my opinion, more elegant online file viewer is the one hosted by the HDF Group: link:{hdf5-viewer}[MyHDF5^]. Besides its more modern appearance, this viewer doesn't upload files to any remote server, with every operation happening locally in your browser. Drag and drop also works better, meaning you won't accidentally reload the page if you miss the drop area, and the viewer can open multiple files at once and load h5 files from a URL.
--- a/docs/user-manual_02.adoc
+++ b/docs/user-manual_02.adoc
@@ -0,0 +1,171 @@
+== Using the software
+WARNING: This software requires Python 3.12 or later. +
+The module *venv* and the package manager *pip* are also required.
+
+=== Downloading the source code
+IMPORTANT: Currently ({revdate}) the source code is hosted on a private Gitea instance, owned by {author}. +
+If the site is down for maintenance or temporarily unavailable please contact the webmaster via mailto:{email}[e-mail].
+
+// TO-DO: add link to direct download of package
+The source code can be acquired directly via *git*, or downloaded from the official repository on link:{repo-url}[Gitea D'Amico^].
+
+[source,bash,subs="verbatim,attributes"]
+----
+git clone {repo-url}.git {software-name}
+cd {software-name} # enter directory
+ls
+  LICENSE    docs/     output/           src/
+  README.md  glossary  requirements.txt  tests/
+----
+
+Optionally, you can access the code in the development branch by executing:
+[source,bash]
+----
+git checkout dev
+----
+
+=== Preparing the environment
+Before starting, note that {software-name} {revnumber} requires a total of 6 modules to be installed, which are listed link:{repo-url}/src/branch/main/requirements.txt[here^]. Since installing a Python module system-wide is almost never a good idea, start by creating and activating a virtual environment.
+
+In the software folder, run:
+
+[source,bash]
+----
+# Calls venv module to create new Python virtual environment in .venv:
+python3 -m venv .venv
+# If command is successful, running ls should show a new .venv folder:
+ls -d .*
+  .venv
+# Activate venv:
+source .venv/bin/activate
+----
+
+.Most shells like Bash show very clearly when you're working inside a virtual environment.
+[#usage-venv]
+image::usage-venv.png[]
+
+At this point you're free to install the requirements through *pip*:
+[source,bash]
+----
+# Install from list in requirements.txt:
+pip install -r requirements.txt
+----
+
+Most of the warnings displayed by pip are harmless and can generally be ignored. +
+Unless pip exits abruptly with an error, your environment is ready to work.
+
+=== Configuration through .env file
+// foggetaboutit
+Much like the previous step, configuring the software with your settings (API key, eLabFTW URL...) is something you do once and then usually forget about.
+
+Inside the {software-name} folder there's a file called `.env.example`. Rename it removing ".example", then open it with your editor of choice. This is your *.env* (or *dotenv*) file.
+
+[source,bash]
+----
+mv .env.example .env
+vim .env
+# The file presents itself like this:
+ 1 | api_key=""
+ 2 | elabid=""
+ 3 | ELABFTW_API_URL="https://elabftw.fisica.unina.it/api/v2"
+ 4 | operative_unit="cnr-spin.na"
+----
+
+* *api_key* is your own personal eLabFTW API key. Generating one is an easy task explained in full detail below.
+* *elabid* is the elabid of the resource you'd like to select (your starting sample); this field can (and probably should) be left blank - in which case the application prompts you for an elabid at runtime, and your answer will not be stored, meaning you can easily rerun the program with a different target.
+* *ELABFTW_API_URL* is the URL of your eLabFTW instance; if you're running this from the laboratories in Monte S. Angelo, Naples, you're probably leaving this field as it is.
+* *operative_unit* is the operative unit you ran your experiments from. It's only needed to compose the filename of the NeXus, which can be easily modified anytime later, and it's not necessary for creating the file itself.
+
+None of these fields are required, meaning you can technically skip this entire section. If any of the first three keys are blank or missing you will be prompted to provide the necessary info at runtime, and your answers will not be memorized - meaning e.g. you will have to provide your API key every time you run the program.
+
+NOTE: Do [.underline]#NOT# confuse .env with .venv: the first is a [.underline]#file# containing all the environmental variables you need to run {software-name} properly, the latter is a [.underline]#directory# containing your virtual environment with all the required modules.
+
+==== Generating an eLabFTW API key
+eLabFTW has its own link:{elabftw-apikey-docs}[API documentation^] on which you can rely. A new API key can be generated in the Settings → API Keys page by giving it a name and an access level:
+
+.Screenshot from our eLabFTW. The key must have a name and permissions. Naturally, the key you see here in clear has been invalidated.
+[#api-gen]
+image::usage-apigen.png[align=center,width=75%]
+
+The *name* of the key is a descriptor for you to remember why you created it in the first place - something like "parser_key01". The *permissions* can either be "_Read/Write_" or "_Read-Only_": in the first scenario the key may also be used to edit or create entries you own on eLabFTW, while a read-only key only allows GET requests. {software-name} doesn't require writing permissions, so both options will do.
+
+[WARNING]
+.A few warnings.
+====
+* The key eLabFTW generates is [.underline]#only shown once#, then stored encrypted in the database. This means that after closing or refreshing the page the key [.underline]#is lost forever# if not saved somewhere else. Which brings us to the second warning.
+* Store and protect your API key like you would your password, as [.underline]#it gives full/limited access to your account# exactly like your password, but without the protection given by 2FA/MFA. For this purpose there are many offline (like link:{keepass-site}[KeePass^]) or online (like link:{bitwarden-site}[BitWarden^]) **password managers**.
+* Your .env file is [.underline]#NOT# a safe place to _store_ your API key. Once pasted there be very careful who you share your files with, and be careful not to expose your key when sending your NeXus files to other computers. If you don't trust your awareness leave the api_key field blank and just paste your API key in the terminal every time you run {software-name}.
+====
+
+=== Running the program
+Open a terminal into the project folder. Before attempting to run the program:
+
+* Make sure your virtual environment is active, or if it isn't run: +
+`source .venv/bin/activate`
+* Make sure the required modules are installed, or if they aren't run: +
+`pip install -r requirements.txt`
+* Make sure your .env file is properly set, or if it isn't make sure you know how to paste into the terminal the API key, the elabid of the required source and the URL of your eLabFTW instance (ending in `/api/v2`).
+
+When you're ready, run:
+[source,bash]
+----
+python3 src/main.py
+----
+
+If your .env file is completely filled out with valid values, the only output you may see on the terminal is warnings or, in the worst case, errors. The next chapter will cover all such cases. If your .env file lacks one or more values you will be asked to input the missing info at runtime.
+
+==== Entering missing values if prompted
+If you decide to run without a valid .env file (again, worst-case-scenario) you will be prompted to enter the required information directly into the terminal.
+
+.The difference between running {software-name} with no .env, and with a properly filled out .env. Same parameters, same output.
+[#usage-difference-dotenv]
+image::usage-difference-dotenv.png[]
+
+First and foremost you will be prompted for a valid API key. To paste your key in the terminal either right-click (_PowerShell_ and other terminal emulators), right-click > _Paste_, Ctrl + Shift + V (on most terminal emulators) or middle-click (Linux).
+
+Then you will be prompted for an elabid - which is a positive integer number. You can find your sample's elabid on eLabFTW, above the sample's name and before the sample's label and status. See xref:usage-elabid[xrefstyle="short"].
+
+Last but not least, you will be prompted for a valid eLabFTW API endpoint URL. This URL consists of the base URL of your eLabFTW instance followed by `/api/v2`. For instance: _++https://elabftw.fisica.unina.it/api/v2++_. +
+{software-name} {revnumber} does not validate this URL, nor does it return a specific error if the URL is wrong.
+
+WARNING: Make sure the URL you paste doesn't end with a trailing slash. +
+++https://elabftw.fisica.unina.it/api/v2++ ✓ +
+++https://elabftw.fisica.unina.it/api/v2/++ ✗
+
+You won't be prompted for the operative unit, so that will require either setting up a .env or manually editing your NeXus files' names. The list of officially approved acronyms for the operative units can be consulted on NFFA-DI's link:{nffa-di-uo-acronyms}[official website^].
+
+.Where to find the elabid of a sample.
+[#usage-elabid]
+image::usage-elabid.png[]
+
+==== Retrieving and verifying your file
+By default the NeXus file will be saved in the `output/` folder. Currently ({revdate}) the software will also save a JSON dictionary with the full chain of all metadata collected on the sample. There is also an `attachments/` folder containing all the attachments downloaded during execution, which will be removed later on.
+
+The file will be recognizable by its name, which should already be in compliance with the following NFFA-DI naming guidelines:
+
+> «Each file generated in the context of a Proposal stored on OFED must use the following naming convention: ++nffa-di_[proposal_id]_[UO]_[UO_internal_id]++» - where _proposal_id_ is the approved ID of the research proposal, _UO_ is the link:{nffa-di-uo-acronyms}[official code^] of the operative unit, and «_UO_internal_id_ is a combination of the technique/instrument acronym and an Experiment ID freely decided». +
+> «Each file generated in the context of an In-house Research Project stored on OFED must use the following naming convention: nffa-di_[UO]_[project_id]_key, where the first part of the name adheres to the name of the bucket, while key is arbitrary.»
+>
+> Source: link:{nffa-di-rdp}[NFFA-DI Research Data Policy^]
+
+This means that for a NeXus file of a PLD fabrication where proposal_id is _EXMPL01_, the operative unit is CNR-SPIN Naples and the sample's internal ID is _Na-26-012_, the accepted filename will be:
+
+image::usage-name.png[]
+
+A NeXus file can be verified through one of the readers listed in xref:reading-nxs[xrefstyle="short"]. Pay attention to the following aspects:
+
+* Do I visualize the file correctly?
+* Does the file respect the fabrication method's schema?
+* Is every required field present? Do I read the same values on eLabFTW and in the NeXus file? Are the units of measurement present?
+* Can I visualize heatmaps and N-axis graphs correctly?
+
+If the answer to all previous questions is "Yes", then the output file is NFFA-DI compliant.
+
+////
+collect nxs file
+filename is: [paste link of guidelines here]
+output folder is: output/
+attachments will be in: output/attachments - to be removed
+???
+profit
+////
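Editor's sketch for the manual section above: it states that {software-name} does not validate the API URL and that blank .env keys fall back to a runtime prompt. The following is a minimal illustration of both ideas, not the project's implementation. The names `normalize_api_url`, `resolve_setting` and `API_SUFFIX` are hypothetical, and only the standard library is used.

```python
from urllib.parse import urlsplit

# Suffix the manual says every eLabFTW API root must end with.
API_SUFFIX = "/api/v2"


def normalize_api_url(raw: str) -> str:
    """Strip whitespace and trailing slashes, then check the URL shape.

    Hypothetical helper: raises ValueError instead of failing later
    with an unrelated HTTP error.
    """
    url = raw.strip().rstrip("/")
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"Not an absolute HTTP(S) URL: {raw!r}")
    if not parts.path.endswith(API_SUFFIX):
        raise ValueError(f"URL must end with {API_SUFFIX!r}: {raw!r}")
    return url


def resolve_setting(name, env, prompt):
    """Return env[name] if it is non-blank, otherwise ask via prompt(name).

    `env` is any mapping (e.g. os.environ after loading .env) and
    `prompt` any callable (e.g. input or getpass.getpass).
    """
    value = (env.get(name) or "").strip()
    return value or prompt(name).strip()
```

In a real run one might call `resolve_setting("ELABFTW_API_URL", os.environ, input)` and pass the result through `normalize_api_url`, so that a pasted URL with a trailing slash is accepted rather than rejected.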
--- a/docs/user-manual_03.adoc
+++ b/docs/user-manual_03.adoc
@@ -0,0 +1,3 @@
+== Troubleshooting
+
+WIP
--- a/docs/user-manual_main.adoc
+++ b/docs/user-manual_main.adoc
@@ -0,0 +1,41 @@
+= {software-name} User Manual: eLabFTW to NeXus Parser for PLD Fabrications 
+:author: Emanuele D'Amico
+:description: eLabFTW to NeXus Parser for PLD Fabrications
+:doctype: book
+:email: emanuele+expars@damico.ing
+:imagesdir: images
+:keywords: nffa-di, elabftw, nexus, parser, data science, mdmc, naples, cnr-spin, cnr, spin institute, python, hdf5, cli
+:revdate: 2026-05-14
+:revnumber: v0.2.1
+:revremark: alpha untested
+:stem: latexmath
+:toc:
+// custom attributes
+:disclamer: I'm in no position to give anyone coding/development/programming/testing tips. The only tips I can give you are based on my personal knowledge of this specific project.
+:software-family: eXPars
+:software-name: {software-family}-PLD
+:repo-url: https://gitea.damico.ing/emanuele/eXParser-PLD
+:repo-ssh: ssh://git@gitea.damico.ing/emanuele/eXParser-PLD.git
+:elabftw-apikey-docs: https://doc.elabftw.net/docs/usage/api/#generating-a-key
+:elabftw-site: https://elabftw.net
+:nffa-di-site: https://nffa-di.it/en/about-us/project/
+:nffa-di-rdp: https://nffa-di.it/it/research-data-policy/#3.1
+:nffa-di-uo-acronyms: https://nffa-di.it/en/uo-acronyms-for-data-infrastructure-naming-convention
+:go-fair-site: https://www.go-fair.org/fair-principles/
+:fairmat-site: https://www.fairmat-nfdi.eu/fairmat/about-fairmat/consortium-fairmat#mission
+:keepass-site: https://keepassxc.org/
+:bitwarden-site: https://bitwarden.com/
+:ncnr-viewer: https://ncnr.nist.gov/ncnrdata/view/nexus-hdf-viewer.html
+:hdf5-viewer: https://myhdf5.hdfgroup.org/
+
+include::user-manual_01.adoc[]
+
+include::user-manual_02.adoc[]
+
+//include::user-manual_03.adoc[]
+
+///////////////////////////////////////////////////////////////////////////
+// Look out for "method-specific" comments I've left before sections
+// containing information about one method in particular (e.g. PLD fab.)
+// because that needs to be edited when writing the user manuals of other
+// eXParser's
--- a/requirements.txt
+++ b/requirements.txt
@@ -3,3 +3,4 @@ asyncio
 h5py
 pillow
 elabapi_python
+dotenv
--- a/src/APIHandler.py
+++ b/src/APIHandler.py
@@ -1,4 +1,6 @@
 import os, requests
+from dotenv import load_dotenv
+from getpass import getpass
 import elabapi_python as elabapi


@@ -10,19 +12,22 @@ class APIHandler:
    (since the API doesn't support downloading attachments AFAIK).

    Args:
-        api_key:          A valid API key for the eLabFTW instance where the data is stored, with permissions to access the relevant entries.
+        api_key: str:           A valid API key for the eLabFTW instance where the data is stored, with permissions to access the relevant entries.
                                eLabFTW's API keys are well documented here: https://doc.elabftw.net/docs/usage/api/.
                                If you don't have an API key and are uncapable of creating one, contact your eLabFTW administrator.
                                Or RTFM and create one yourself, it's not that hard.
-        ELABFTW_API_URL:  Complete URL of the eLabFTW instance's root for the API endpoints.
+        ELABFTW_API_URL: str:   Complete URL of the eLabFTW instance's root for the API endpoints.
                                In full caps because it won't (shouldn't) be changed much.
    """

    # TO-DO: remove static url.
-    def __init__(
-        self, api_key="", ELABFTW_API_URL="https://elabftw.fisica.unina.it/api/v2"
-    ):
-        """Init method, apikey suggested but not required (empty by default)."""
+    def __init__(self, api_key="", ELABFTW_API_URL=None):
+        """Init method, api_key suggested but not required (empty by default)."""
+        # if not ELABFTW_API_URL:
+        #    load_dotenv()
+        #    ELABFTW_API_URL = os.getenv("ELABFTW_API_URL") or input(
+        #        "Enter a valid eLabFTW API URL (ends with '/api/v2'): "
+        #    )
        self.api_key = api_key
        self.auth = {"Authorization": api_key}
        self.content = {"Content-Type": "application/json"}
@@ -33,9 +38,9 @@ class APIHandler:
        """
        Returns raw data (as dictionary) from its elabid and entry type.

-        args:
-            elabid:     elabftw internal id of the selected resource.
-            entryType:  Resource type. Anything other than "experiments" or "items" WILL raise an error.
+        Args:
+            elabid: int:     elabftw internal id of the selected resource.
+            entryType: str:  Resource type. Anything other than "experiments" or "items" WILL raise an error.
        """
        if entryType not in ["experiments", "items"]:
            raise Exception(
@@ -64,12 +69,12 @@ class APIHandler:
                case 404:
                    # Lapalissian:
                    raise ConnectionError(
-                        f"404: Not Found. This means there's no resource with this elabid (wrong elabid?) on your eLabFTW (wrong endpoint?)."
+                        "404: Not Found. This means there's no resource with this elabid (wrong elabid?) on your eLabFTW (wrong endpoint?)."
                    )
                case 400:
                    # I genuinely have no idea:
                    raise ConnectionError(
-                        f"400: Bad Request. This means the API endpoint you tried to reach is invalid. Did you tamper with the source code? If not, contact the developer."
+                        "400: Bad Request. This means the API endpoint you tried to reach is invalid. Did you tamper with the source code? If not, contact the developer."
                    )
                case _:
                    # For some fucking reason, this is the only error I actually get from the API...
@@ -80,17 +85,19 @@ class APIHandler:
        entry_data = response.json()
        return entry_data

-    def download_all_attachments_data(self, elabid, entryType="experiments"):
+    def download_attachment_data(self, elabid, upload_id, entryType="experiments"):
        """
-        Downloads attachments of a certain eLabFTW experiment (default) or item.
-        Only returns their binary data. Use method download_attachments_to_disk to save to file.
+        Downloads a specific attachment of a certain eLabFTW experiment (default) or item.
+        Only returns its binary data. Use method download_attachment_to_disk to save to file.
+
        NOTE: Output is a dictionary where:
-            * The keys are the attachments' filenames;
-            * The values are the binary data for those attachments.
+            * The key is the attachment's filename;
+            * The value is the attachment's binary data.

        Args:
-            elabid:     eLabFTW internal ID of the selected resource.
-            entryType:  Resource type. Anything other than "experiments" or "items" WILL raise an error.
+            elabid: int:     eLabFTW internal ID of the selected resource.
+            upload_id: int:  eLabFTW internal ID of the selected upload.
+            entryType: str:  Resource type. Anything other than "experiments" or "items" WILL raise an error.
        """
        if entryType not in ["experiments", "items"]:
            raise Exception(
@@ -98,7 +105,7 @@ class APIHandler:
            )

        config = elabapi.Configuration()
-        config.api_key["api_key"] = api_key
+        config.api_key["api_key"] = self.api_key
        config.api_key_prefix["api_key"] = "Authorization"
        config.host = self.elaburl
        config.debug = False
@@ -108,33 +115,37 @@ class APIHandler:
        )
        uploads_api = elabapi.UploadsApi(api_client)

-        # Actual uploads (dictionary):
-        uploads = {
+        # Scans through the attachments and selects the one with the corresponding ID.
+        attachment = {
            upload.real_name: uploads_api.read_upload(
-                entryType, elabid, upload.id, format="binary", _preload_content=False
+                entryType, elabid, upload_id, format="binary", _preload_content=False
            ).data
            for upload in uploads_api.read_uploads(entryType, elabid)
+            if upload.id == upload_id
        }

-        return uploads
+        return attachment

-    def download_attachments_to_disk(
+    def download_attachment_to_disk(
        self,
        elabid,
+        upload_id,
        entryType="experiments",
        dump_dir="output/attachments",
        # persistent=True,
    ):
        """
-        Downloads attachments of a certain eLabFTW experiment (default) or item.
+        Downloads a specific attachment of a certain eLabFTW experiment (default) or item.
        Downloads their binary data through method download_attachments_data and dumps it to dump_dir.
+        Returns full path of the output file.

        Args:
-            elabid:     eLabFTW internal ID of the selected resource.
-            entryType:  Resource type. Anything other than "experiments" or "items" WILL raise an error.
-            dump_dir:   Directory to which to save the attachments. Default is "output/attachments".
-            persistent: [Unused] Decides if the files will stay on disk after all operations are completed.
-                        If set to False, deletes the file upon exiting.
+            elabid: int:        eLabFTW internal ID of the selected resource.
+            upload_id: int:     eLabFTW internal ID of the selected upload.
+            entryType: str:     Resource type. Anything other than "experiments" or "items" WILL raise an error.
+            dump_dir: str:      Directory to which to save the attachments. Default is "output/attachments".
+            persistent: bool:   [Unused] Decides if the files will stay on disk after all operations are completed.
+                                If set to False, deletes the file upon exiting. Default = True.
        """

        if entryType not in ["experiments", "items"]:
@@ -142,9 +153,17 @@ class APIHandler:
                "You can only download attachments from experiments or items."
            )

-        uploads = download_all_attachments_data(elabid, entryType=entryType)
+        uploads = self.download_attachment_data(elabid, upload_id, entryType=entryType)
        for file in uploads:
-            raw_data = uploads["file"]
-            with open(os.path.join(dump_dir, f"exp{elabid}-{file}"), "wb") as f:
+            raw_data = uploads[file]
+            full_path = os.path.join(dump_dir, f"exp{elabid}-{file}")
+            with open(full_path, "wb") as f:
                f.write(raw_data)
-        return
+        return full_path
+
+
+# Testing methods
+if __name__ == "__main__":
+    api_key = getpass("Paste API key here [no echo]: ")
+    handler = APIHandler(api_key=api_key)
+    handler.download_attachment_to_disk(elabid=58, upload_id=81)
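Editor's sketch for the change described in the commit message (making ELABFTW_API_URL a required value instead of letting it default to None). This is a hypothetical stand-in, not the project's `APIHandler`: the class name `StrictHandler` and the method `entry_url` are invented for illustration, and the `<root>/<entry type>/<elabid>` URL layout is an assumption about how entries are addressed.

```python
class StrictHandler:
    """Hypothetical, simplified stand-in for APIHandler (illustration only)."""

    def __init__(self, api_key: str, api_url: str):
        # Fail fast: a missing URL is reported here, not after a prompt loop.
        if not api_url:
            raise ValueError(
                "ELABFTW_API_URL is required and must end with '/api/v2'"
            )
        self.api_key = api_key
        self.elaburl = api_url.rstrip("/")  # tolerate a trailing slash
        self.auth = {"Authorization": api_key}

    def entry_url(self, elabid: int, entry_type: str = "experiments") -> str:
        """Build the URL of one entry (assumed layout, see lead-in)."""
        if entry_type not in ("experiments", "items"):
            raise ValueError("entry_type must be 'experiments' or 'items'")
        return f"{self.elaburl}/{entry_type}/{elabid}"
```

With this shape, a caller that forgets the URL gets an immediate `ValueError` at construction time, which is easier to diagnose than a request sent to `None`.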
--- a/src/classes.py
+++ b/src/classes.py
@@ -1,4 +1,5 @@
 import os, json, requests
+from getpass import getpass
 from APIHandler import APIHandler


@@ -14,6 +15,10 @@ class Layer:
    """

    def __init__(self, layer_data):
+        """
+        Properties/Attributes:
+            Too many to list.
+        """
        try:
            self.elabid = layer_data["id"]
            self.operator = layer_data["fullname"]
@@ -129,14 +134,30 @@ class Layer:
        self.start_time = layer_data.get("created_at") or None
        self.description = layer_data.get("body") or None

-    def get_instruments(self, api_key):
-        raw_lasersys_data = APIHandler(api_key).get_entry_from_elabid(
+    def get_instruments(self, api_key, ELABFTW_API_URL):
+        """
+        Returns a dictionary of all the instruments used to create the layer.
+        The format of the dictionary is:
+            {
+                "laser_system": str,
+                "deposition_chamber": str,
+                "rheed_system": str
+            }
+
+        Args:
+            api_key: str:           A valid API key for the eLabFTW instance where the data is stored, with permissions to access the relevant entries.
+                                    eLabFTW's API keys are well documented here: https://doc.elabftw.net/docs/usage/api/.
+                                    If you don't have an API key and are unable to create one, contact your eLabFTW administrator.
+                                    Or RTFM and create one yourself, it's not that hard.
+            ELABFTW_API_URL: str:   URL for the API root endpoint of the eLabFTW instance. Ends with '/api/v2' - no trailing slash.
+        """
+        raw_lasersys_data = APIHandler(api_key, ELABFTW_API_URL).get_entry_from_elabid(
            self.laser_system_elabid, entryType="items"
        )
-        raw_chamber_data = APIHandler(api_key).get_entry_from_elabid(
+        raw_chamber_data = APIHandler(api_key, ELABFTW_API_URL).get_entry_from_elabid(
            self.chamber_elabid, entryType="items"
        )
-        raw_rheedsys_data = APIHandler(api_key).get_entry_from_elabid(
+        raw_rheedsys_data = APIHandler(api_key, ELABFTW_API_URL).get_entry_from_elabid(
            self.rheed_system_elabid, entryType="items"
        )
        instruments_used = {
@@ -149,8 +170,8 @@ class Layer:
    def list_attachments(self):
        """
        Returns a dictionary of all the attachments linked to the layer, where:
-            * Each key is the attachment's elabid;
-            * Each value is a dictionary containing the attachment's filename, hashname and related experiment elabid (= self.elabid).
+            * Each key is the attachment's progressive ID (0, 1...);
+            * Each value is a dictionary containing the attachment's elabid, filename, hashname and related experiment elabid (= self.elabid).

        Data is already in layer_data, so the API key is unrequired. Same goes for:
            * fetch_textual_uploads() - no arguments;
@@ -162,7 +183,8 @@ class Layer:
        if self.uploads == []:
            return {}
        attachments = {
-            attachment["id"]: {
+            self.uploads.index(attachment): {
+                "id": attachment["id"],
                "filename": attachment["real_name"],
                "hashname": attachment["long_name"],
                "related_experiment": attachment["item_id"],
@@ -218,10 +240,21 @@ class Entrypoint:
    """

    def __init__(self, sample_data):
+        """
+        Properties/Attributes:
+            * name: str:                        Name of the sample. Fairly important, and always present unless someone screws up REALLY bad.
+            * linked_items: dict:               Dictionary generated by eLabFTW containing metadata on the items linked to the entrypoint.
+            * batch_elabid: int:                eLabFTW internal id of the batch of the substrate used as the foundation of the sample.
+            * proposal: int:                    eLabFTW internal id of the proposal linked to the sample.
+            * linked_experiments: dict:         Dictionary generated by eLabFTW containing metadata on the experiments linked to the entrypoint.
+            * linked_experiments_elabid: list:  List of eLabFTW internal ids of the experiments linked to the entrypoint.
+        """
        try:
+            self.name = sample_data["title"]
            self.extra = sample_data["metadata_decoded"]["extra_fields"]
            self.linked_items = sample_data["items_links"]  # dict
            self.batch_elabid = self.extra["Substrate batch"]["value"]  # elabid
+            self.proposal = self.extra["Proposal"].get("value") or None  # proposal
            self.linked_experiments = sample_data["related_experiments_links"]  # dict
            self.linked_experiments_elabid = [
                i["entityid"] for i in self.linked_experiments
@@ -232,10 +265,6 @@ class Entrypoint:
            raise KeyError(
                f'The provided dictionary lacks a "{k}" key. Check the sample entry on eLabFTW and make sure you used the correct Resource template.'
            )
-        # Non-required attributes:
-        self.name = (
-            sample_data.get("title") or None
-        )  # error prevention is more important than preventing empty fields here


 class Material:
@@ -251,6 +280,14 @@ class Material:
    """

    def __init__(self, material_data):
+        """
+        Properties/Attributes:
+            * name: str:            Name of the material.
+            * compound_elabid: int: eLabFTW internal id of the compound.
+            * dimensions: str:      Dimensions of the material, in standard format.
+                                    The class recognizes the unit of measurement and acts accordingly.
+            * dimensions_unit: str: Unit of measurement - either "mm x mm", "inches" or None.
+        """
        try:
            self.name = material_data["title"]  # required
            self.extra = material_data["metadata_decoded"]["extra_fields"]
@@ -271,8 +308,24 @@ class Material:
                f'The provided dictionary lacks a "{k}" key. Check the target/substrate entry on eLabFTW and make sure you used the correct Resource template.'
            )

-    def get_compound_data(self, apikey):
-        raw_compound_data = APIHandler(apikey).get_entry_from_elabid(
+    def get_compound_data(self, apikey, ELABFTW_API_URL):
+        """
+        Returns a dictionary with the relevant data on the compound of which the material is made.
+        The format of the dictionary is:
+            {
+                "name": str,
+                "chemical_formula": str,
+                "cas_number": str
+            }
+
+        Args:
+            apikey: str:            A valid API key for the eLabFTW instance where the data is stored, with permissions to access the relevant entries.
+                                    eLabFTW's API keys are well documented here: https://doc.elabftw.net/docs/usage/api/.
+                                    If you don't have an API key and are unable to create one, contact your eLabFTW administrator.
+                                    Or RTFM and create one yourself, it's not that hard.
+            ELABFTW_API_URL: str:   URL for the API root endpoint of the eLabFTW instance. Ends with '/api/v2' - no trailing slash.
+        """
+        raw_compound_data = APIHandler(apikey, ELABFTW_API_URL).get_entry_from_elabid(
            self.compound_elabid, entryType="items"
        )
        name = raw_compound_data["title"]
@@ -286,13 +339,50 @@ class Material:
        }
        return compound_data

-    def get_compound_formula(self, apikey):
-        formula = self.get_compound_data(apikey).get("chemical_formula")
+    def get_compound_formula(self, apikey, ELABFTW_API_URL):
+        """
+        Returns a string with the chemical formula of the compound.
+
+        Args:
+            apikey: str:            A valid API key for the eLabFTW instance where the data is stored, with permissions to access the relevant entries.
+                                    eLabFTW's API keys are well documented here: https://doc.elabftw.net/docs/usage/api/.
+                                    If you don't have an API key and are unable to create one, contact your eLabFTW administrator.
+                                    Or RTFM and create one yourself, it's not that hard.
+            ELABFTW_API_URL: str:   URL for the API root endpoint of the eLabFTW instance. Ends with '/api/v2' - no trailing slash.
+        """
+        formula = self.get_compound_data(apikey, ELABFTW_API_URL).get(
+            "chemical_formula"
+        )
        return formula


 class Substrate(Material):
+    """
+    Substrate(material_data) - where material_data is a Python dictionary.
+
+    Inherits from Material and is meant to be used exclusively for eLabFTW Resources of the "Substrate" category.
+    """
+
    def __init__(self, material_data):
+        """
+        Properties/Attributes common to all Materials:
+            * name: str:            Name of the material.
+            * compound_elabid: int: eLabFTW internal id of the compound.
+            * dimensions: str:      Dimensions of the material, in standard format.
+                                    The class recognizes the unit of measurement and acts accordingly.
+            * dimensions_unit: str: Unit of measurement - either "mm x mm", "inches" or None.
+
+        Specific properties/attributes:
+            * orientation: str:
+            * miscut_angle: str:
+            * miscut_angle_unit: str:
+            * miscut_direction: str:
+            * thickness: str:
+            * thickness_unit: str:
+            * surface_treatment: str:
+            * manufacturer: str:
+            * batch_id: str:
+        """
        super().__init__(material_data)
        try:
            self.orientation = self.extra["Orientation"]["value"]
@@ -312,7 +402,28 @@ class Substrate(Material):


 class Target(Material):
+    """
+    Target(material_data) - where material_data is a Python dictionary.
+
+    Inherits from Material and is meant to be used exclusively for eLabFTW Resources of the "PLD Target" category.
+    """
+
    def __init__(self, material_data):
+        """
+        Properties/Attributes common to all Materials:
+            * name: str:            Name of the material.
+            * compound_elabid: int: eLabFTW internal id of the compound.
+            * dimensions: str:      Dimensions of the material, in standard format.
+                                    The class recognizes the unit of measurement and acts accordingly.
+            * dimensions_unit: str: Unit of measurement - either "mm x mm", "inches" or None.
+
+        Specific properties/attributes:
+            * thickness: str:
+            * thickness_unit: str:
+            * shape: str:
+            * solid_form: str:
+            * manufacturer: str:
+        """
        super().__init__(material_data)
        try:
            self.thickness = self.extra["Thickness"]["value"]
@@ -328,7 +439,37 @@ class Target(Material):
        self.description = material_data.get("body") or ""


+class Proposal:
+    """
+    Proposal(proposal_data) - where proposal_data is a Python dictionary.
+
+    Recovers only the relevant info on a proposal linked to the entrypoint sample.
+    Which currently is just its name.
+
+    If the name starts with "Proposal " (space included) that gets omitted from the output.
+    """
+
+    def __init__(self, proposal_data):
+        """
+        Properties/Attributes:
+            * name: str:    Name of the proposal.
+                            If the name starts with "Proposal " (space included) that gets omitted from the output.
+        """
+        if proposal_data["title"].startswith("Proposal "):
+            self.name = proposal_data["title"][len("Proposal ") :]
+        else:
+            self.name = proposal_data["title"]
+
+
 if __name__ == "__main__":
-    head = APIHandler("MyApiKey-123456789abcdef")
-    print(f"Example header:\n\t{head.header}\n")
-    print("Warning: you're not supposed to be running this as the main program.")
+    # head = APIHandler("MyApiKey-123456789abcdef")
+    # print(f"Example header:\n\t{head.header}\n")
+    # print("Warning: you're not supposed to be running this as the main program.")
+    api_key = getpass("Paste API key here [no echo]: ")
+    ELABFTW_API_URL = input("Enter a valid eLabFTW API URL (ends with '/api/v2'): ")
+    handler = APIHandler(api_key, ELABFTW_API_URL)
+    exp58 = handler.get_entry_from_elabid(elabid=58, entryType="experiments")
+    layer58 = Layer(exp58)
+    print(layer58.list_attachments())
+    print(layer58.fetch_textual_uploads())
+    print(layer58.fetch_images())
--- a/src/main.py
+++ b/src/main.py
@@ -1,6 +1,7 @@
 #!/usr/bin/env python3
 import os, json, requests, h5py
 import numpy as np
+from dotenv import load_dotenv
 from getpass import getpass
 from APIHandler import APIHandler
 from classes import *
@@ -10,12 +11,16 @@ from PIL import Image

 def call_entrypoint_from_elabid(elabid):
    """
-    Calls an entrypoint sample from eLabFTW using its elabid, then returns an object of the Entrypoint class.
+    Calls a sample from eLabFTW through its elabid, then returns an object of the Entrypoint class.
+    The Entrypoint serves as the starting point in the construction of the dataset.

    If the entry is not a sample (category_title not matching exactly "Sample") returns ValueError.
+    It's most likely the first error you might encounter (with a valid API key).
+
+    Arg: elabid: int    eLabFTW internal id of the selected resource.
    """
    try:
-        sample_data = APIHandler(apikey).get_entry_from_elabid(
+        sample_data = APIHandler(api_key, ELABFTW_API_URL).get_entry_from_elabid(
            elabid, entryType="items"
        )
        if not sample_data.get("category_title") == "Sample":
@@ -32,11 +37,14 @@ def call_material_from_elabid(elabid):
    """
    Calls a material from eLabFTW using its elabid, then returns an object of the Material class.

-    If the entry is neither a PLD Target or a Substrate batch returns ValueError. Such entries always have a category_title key with its value matching exactly "PLD Target" or "Substrate".
+    If the entry is neither a PLD Target nor a Substrate batch, raises ValueError.
+    Such entries always have a category_title key with its value matching exactly "PLD Target" or "Substrate".
    Because of an old typo, the value "Subtrate" (second 's' is missing) is also accepted.
+
+    Arg: elabid: int    eLabFTW internal id of the selected resource.
    """
    try:
-        material_data = APIHandler(apikey).get_entry_from_elabid(
+        material_data = APIHandler(api_key, ELABFTW_API_URL).get_entry_from_elabid(
            elabid, entryType="items"
        )
        material_category = material_data.get("category_title")
@@ -57,14 +65,17 @@ def call_material_from_elabid(elabid):

 def call_layers_from_list(elabid_list):
    """
-    Calls a list of (PLD deposition) experiments from eLabFTW using their elabid - which means the input must be a list of integers instead of a single one - then returns a list of Layer-class objects.
+    Calls a list of (PLD deposition) experiments from eLabFTW through their elabids, then returns a list of Layer-class objects.

-    If one of the entries is not related to a deposition layer (category_title not matching exactly "PLD Deposition") that entry is skipped, with no error raised.
+    If one of the entries is not related to a deposition layer (category_title not matching exactly "PLD Deposition")
+    that entry is skipped, with no error raised.
+
+    Arg: elabid_list: list(int):    List of elabids of eLabFTW experiments.
    """
    list_of_layers = []
    for elabid in elabid_list:
        try:
-            layer_data = APIHandler(apikey).get_entry_from_elabid(
+            layer_data = APIHandler(api_key, ELABFTW_API_URL).get_entry_from_elabid(
                elabid, entryType="experiments"
            )
            if not layer_data.get("category_title") == "PLD Deposition":
@@ -80,15 +91,46 @@ def call_layers_from_list(elabid_list):
                + str(e)
                + f"\nPlease solve the problem before retrying."
                + "\n\n"
-                + f"Last resource attempted to call: {ELABFTW_API_URL}/experiments/{elabid}"
+                + f"Last resource attempted to call at base url: /experiments/{elabid}"
            )
    return list_of_layers  # list of Layer-class objects


+def call_proposal_from_elabid(elabid):
+    """
+    Calls a proposal item from eLabFTW using its elabid and creates a Proposal-class object.
+
+    Returns the proposal's name (attribute Proposal.name, a str).
+    If the name starts with "Proposal " (space included) that gets omitted from the output.
+
+    Arg: elabid: int    eLabFTW internal id of the selected resource.
+    """
+    try:
+        proposal_data = APIHandler(api_key, ELABFTW_API_URL).get_entry_from_elabid(
+            elabid, entryType="items"
+        )
+        proposal_category = proposal_data.get("category_title") or ""  # avoids a TypeError if the key is missing
+        # TO-DO: correct this typo on elabftw: Subtrate → Substrate.
+        if (
+            "Proposal" not in proposal_category
+        ):  # to avoid that same old problem with trailing spaces
+            print(f"Category of the resource: {proposal_category}.")
+            raise ValueError(
+                f"The referenced resource (elabid = {elabid}) is not a proposal."
+            )
+        else:
+            proposal = Proposal(proposal_data)
+    except ConnectionError as e:
+        raise ConnectionError(e)
+    return proposal.name  # String
+
+
 def chain_entrypoint_to_batch(sample_object):
    """
-    Takes an Entrypoint-class object, looks at its .batch_elabid attribute and returns a Material-class object containing data on the substrate batch associated to the starting sample.
+    Takes an Entrypoint-class object, looks at its .batch_elabid attribute and retrieves data on the substrate batch associated with the starting sample.
+    Returns a Material-class object.

+    Arg: sample_object: Entrypoint:     Entrypoint-class object.
    Dependency: call_material_from_elabid.
    """
    material_elabid = sample_object.batch_elabid
@@ -98,10 +140,11 @@ def chain_entrypoint_to_batch(sample_object):

 def chain_entrypoint_to_layers(sample_object):
    """
-    Takes an Entrypoint-class object, looks at its .linked_experiments_elabid attribute (list) and returns a list of Layer-class objects containing data on the deposition layers associated to the starting sample - using the function call_layers_from_list.
-
-    The list is sorted by progressive layer number (layer_number attribute).
+    Takes an Entrypoint-class object, looks at its .linked_experiments_elabid attribute (list) and
+    retrieves data on the deposition layers associated with the starting sample.
+    Returns a list of Layer-class objects, sorted by progressive layer number (layer_number attribute).

+    Arg: sample_object: Entrypoint:     Entrypoint-class object.
    Dependency: call_layers_from_list.
    """
    linked_experiments_elabid = (
@@ -114,8 +157,10 @@ def chain_entrypoint_to_layers(sample_object):

 def chain_layer_to_target(layer_object):
    """
-    Takes a Layer-class object, looks at its .target_elabid attribute and returns a Material-class object containing data on the PLD target used in the deposition of said layer.
+    Takes a Layer-class object, looks at its .target_elabid attribute and retrieves data on the PLD target used in the deposition of said layer.
+    Returns a Material-class object.

+    Arg: layer_object: Layer:   Layer-class object.
    Dependency: call_material_from_elabid.
    """
    target_elabid = layer_object.target_elabid
@@ -125,16 +170,23 @@ def chain_layer_to_target(layer_object):

 def deduplicate_instruments_from_layers(layers):
    """
-    Takes a list of Layer-class objects and for each layer gets the instruments used (laser, depo chamber and RHEED), returns dictionary with one item per category. This means that if more layers share the same instruments it returns a dictionary with just their names as strings (no lists or sub-dictionaries).
+    For each layer gets the instruments used (laser, depo chamber and RHEED).
+    Creates three sets with all the instruments used regardless of the layer they've been used for.
+    Turns the sets into three strings (joined with commas), then returns a dictionary in the format:
+        {
+            "laser_system": "Laser A, Laser B...",
+            "deposition_chamber": "DC A, DC B...",
+            "rheed_system": "RHEED A, RHEED B..."
+        }

-    If different layers have different instruments (e.g. laser systems) the user is prompted to only select one.
+    Arg: layers: list(Layer):   List of Layer-class objects.
    """
    lasers = []
    chambers = []
    rheeds = []
    elegant_dict = {}
    for lyr in layers:
-        instruments = lyr.get_instruments(apikey)
+        instruments = lyr.get_instruments(api_key, ELABFTW_API_URL)
        lasers.append(instruments["laser_system"])
        chambers.append(instruments["deposition_chamber"])
        rheeds.append(instruments["rheed_system"])
@@ -152,57 +204,8 @@ def deduplicate_instruments_from_layers(layers):
        "deposition_chamber": ", ".join(ded_chambers),
        "rheed_system": ", ".join(ded_rheeds),
    }  # dictionary's name is a joke
-    # updated_dict = {} # use this for containing the final dataset
-    # for ded in elegant_dict:
-    #     if len(elegant_dict[ded]) == 0:
-    #         # if len of list is 0 - empty list - raise error
-    #         raise IndexError(f"Missing data: no Laser System, Chamber and/or RHEED System is specified in any of the Deposition-type experiments related to this sample. Fix this on eLabFTW before retrying. Affected list: {ded}.")
-    #     elif len(elegant_dict[ded]) > 1:
-    #         # if len of list is > 1 - too many values - allow the user to pick one
-    #         print("Warning: different instruments have been used for different layers - which is currently not allowed.")
-    #         # there's a better way to do this but I can't remember now for the life of me...
-    #         i = 0
-    #         while i < len(elegant_dict[ded]):
-    #             print(f"{i} - {elegant_dict[ded][i]}")
-    #             i += 1
-    #         ans = None
-    #         while not type(ans) == int or not ans in range(0, len(elegant_dict[ded])):
-    #             ans = input("Please pick one of the previous (0, 1, ...) [default = 0]: ") or "0"
-    #             if ans.isdigit():
-    #                 ans = int(ans)
-    #             continue # unnecessary?
-    #         updated_dict[ded] = elegant_dict[ded][ans]
-    #     elif elegant_dict[ded][0] in ["", 0, None]:
-    #         # if len is 1 BUT value is "", 0 or None raise error
-    #         raise ValueError(f"Missing data: a Laser System, Chamber and/or RHEED System which is specified across all the Deposition-type experiments related to this sample is either empty or invalid. Fix this on eLabFTW before retrying. Affected list: {ded}.")
-    #     else:
-    #         # if none of the previous (only 1 value), that single value is used
-    #         updated_dict[ded] = elegant_dict[ded][0]
-    # instruments_used_dict = {
-    #     "laser_system": updated_dict["Laser Systems"],
-    #     "deposition_chamber": updated_dict["Deposition Chamber"],
-    #     "rheed_system": updated_dict["RHEED Systems"],
-    # }
    return elegant_dict

-    ### OLD CODE
-    # if 0 in [ len(i) for i in elegant_list ]:
-    #     # i.e. if length of one of the lists in elegant_list is zero (missing data):
-    #     raise IndexError("Missing data: no Laser System, Chamber and/or RHEED System is specified in any of the Deposition-type experiments related to this sample.")
-    # if not all([ len(i) == 1 for i in elegant_list ]):
-    #     print("Warning: different instruments have been used for different layers - which is currently not allowed.")
-    #     # for every element in elegant list check if len > 1 and if it is
-    #     print("Selecting the first occurence for every category...")
-    ###
-    # lasers = { f"layer_{lyr.layer_number}": lyr.laser_system for lyr in layers }
-    # chambers = { f"layer_{lyr.layer_number}": lyr.deposition_chamber for lyr in layers }
-    # rheeds = { f"layer_{lyr.layer_number}": lyr.rheed_system for lyr in layers }
-    # instruments_used_dict = {
-    #     "laser_system": lasers,
-    #     "deposition_chamber": chambers,
-    #     "rheed_system": rheeds,
-    # }
-

 def select_rheed_data(layer):
    """
@@ -222,10 +225,15 @@ def select_rheed_data(layer):
            "related_experiment": elabid
        }
    """
+
    n = layer.layer_number
    textual_uploads = layer.fetch_textual_uploads()
    images = layer.fetch_images()

+    # Check for length. Three cases:
+    # 1. len is 0, no file of this category → return {}
+    # 2. len is more than 1, user must select
+    # 3. len is 1, God's in his heaven, all's right with the world
    if len(textual_uploads) == 0:
        rheed_data_file = {}
    elif len(textual_uploads) > 1:
@@ -248,8 +256,12 @@ def select_rheed_data(layer):
            continue
        rheed_data_file = textual_uploads[ans]  # still a dictionary
    else:
-        rheed_data_file = textual_uploads[0]
+        rheed_data_file = textual_uploads[
+            next(iter(textual_uploads))
+        ]  # this prism of pork gets the value of the only key in the dictionary
+        # it's proof like no other that my code is human-generated, and that I suck at coding. It's hubris manifest.

+    # As above so below
    if len(images) == 0:
        rheed_image_file = {}
    elif len(images) > 1:
@@ -272,16 +284,11 @@ def select_rheed_data(layer):
            continue
        rheed_image_file = images[ans]  # still a dictionary
    else:
-        rheed_image_file = images[0]
+        rheed_image_file = images[next(iter(images))]

    return (rheed_data_file, rheed_image_file)


-def download_rheed_data():
-
-    return
-
-
 def analyse_rheed_data(data):
    """
    Takes the content of a tsv file and returns a dictionary with timestamps and intensities.
@@ -291,13 +298,15 @@ def analyse_rheed_data(data):
    Time    Layer1_Int1     Layer1_Int2     Layer1_Int3
    -----

-    Distinct ValueErrors are raised if:
+    Exceptions:
+        1. Distinct ValueErrors are raised if:
            * The array is not 2-dimensional;
-    * The total number of columns does not equate exactly 1+3 (= 4).
+            * The total number of columns is less than 1+3 (= 4).
+        2. If the file has more than 4 columns the function prints a warning, then ignores the other columns.
+        3. No exception is raised for files where the first column is not the time, or the others are not intensities.

    Time is expressed in seconds, intensities are adimensional on 8 bits (min. 0, max. 255).

-    # TO-DO: complete this description...
    Written with help from DeepSeek.
    """
    # Verifying the format of the input file:
@@ -337,13 +346,19 @@ def make_nexus_schema_dictionary(substrate_object, layers):
    and a list of Layer-class objects (output of the chain_entrypoint_to_layers() function).

    Returns dictionary with the same schema as the NeXus standard for PLD fabrications.
+
+    Args:
+        substrate_object: Substrate:    Substrate-class object.
+        layers: list(Layer):            List of Layer-class objects.
    """
    instruments = deduplicate_instruments_from_layers(layers)
    pld_fabrication = {
        "sample": {
            "substrate": {
                "name": substrate_object.name,
-                "chemical_formula": substrate_object.get_compound_formula(apikey),
+                "chemical_formula": substrate_object.get_compound_formula(
+                    api_key, ELABFTW_API_URL
+                ),
                "orientation": substrate_object.orientation,
                "miscut_angle": {
                    "value": substrate_object.miscut_angle,
@@ -371,7 +386,9 @@ def make_nexus_schema_dictionary(substrate_object, layers):
        target_object = chain_layer_to_target(layer)
        target_dict = {
            "name": target_object.name,
-            "chemical_formula": target_object.get_compound_formula(apikey),
+            "chemical_formula": target_object.get_compound_formula(
+                api_key, ELABFTW_API_URL
+            ),
            "description": target_object.description,
            "shape": target_object.shape,
            "dimensions": target_object.dimensions,
@@ -465,11 +482,27 @@ def make_nexus_schema_dictionary(substrate_object, layers):
            },
            "instruments_used": instruments[name],
        }
-        rheed_data[name] = {}
+        rheed_data[name] = {
+            "layer_number": layer.layer_number,
+            "data": select_rheed_data(
+                layer
+            ),  # tuple: (rheed_data_file, rheed_image_file)
+        }
    return pld_fabrication


-def build_nexus_file(pld_fabrication, output_path, rheed_osc=None, heatmap_matrix=None):
+def build_nexus_file(pld_fabrication, output_path="output/nffa-di_unnamed.h5"):
+    """
+    The function which actually builds the NeXus file for *PLD DEPOSITIONS*.
+    Saves the file in the specified directory.
+
+    Args:
+        pld_fabrication:    A dictionary following the schema produced by the function
+                            make_nexus_schema_dictionary; it should not be built by hand.
+        output_path:        The full path to the output file, including filename complete with extension.
+                            It's a string, preferably built with os.path.join.
+                            Default value is: "output/nffa-di_unnamed.h5" - which is NOT NFFA-DI compliant.
+    """
    # NOTE: look at the mail attachment from Emiliano...
    with h5py.File(output_path, "w") as f:
        nx_pld_entry = f.create_group("pld_fabrication")
@@ -735,6 +768,92 @@ def build_nexus_file(pld_fabrication, output_path, rheed_osc=None, heatmap_matri
        nx_rheed = nx_pld_entry.create_group("rheed_data")
        nx_rheed.attrs["NX_class"] = "NXdata"

+        rheed_data = pld_fabrication["rheed_data"]
+        for layer in rheed_data:
+            nx_rheed_layer = nx_rheed.create_group(layer)
+
+            layer_dict = rheed_data[layer]
+            n = layer_dict["layer_number"]
+            rheed_data_file = layer_dict["data"][0]  # first in the tuple
+            rheed_image_file = layer_dict["data"][1]  # second in the tuple
+            handler = APIHandler(api_key, ELABFTW_API_URL)
+
+            # TO-DO: maybe make a dedicated function???
+            data_path = None
+            image_path = None
+
+            if rheed_data_file != {}:
+                try:
+                    elabid = rheed_data_file["related_experiment"]
+                    upload_id = rheed_data_file["id"]
+                except KeyError as ke:
+                    raise KeyError(
+                        f"Missing key in your file: {rheed_data_file.get('filename') or '<missing name>'}: {ke}"
+                    )
+                data_path = handler.download_attachment_to_disk(
+                    elabid=elabid, upload_id=upload_id
+                )
+
+            if rheed_image_file != {}:
+                try:
+                    upload_id = rheed_image_file["id"]
+                    elabid = rheed_image_file["related_experiment"]
+                except KeyError as ke:
+                    raise KeyError(
+                        f"Missing key in your file: {rheed_image_file.get('filename') or '<missing name>'}: {ke}"
+                    )
+                image_path = handler.download_attachment_to_disk(
+                    elabid=elabid, upload_id=upload_id
+                )
+
+            if data_path and os.path.isfile(data_path):
+                with open(data_path, "r") as o:
+                    osc = np.loadtxt(o, delimiter="\t")
+                try:
+                    rheed_osc = (
+                        analyse_rheed_data(data=osc) or None
+                    )  # analyze rheed data first, build the file later
+                except ValueError as ve:
+                    raise ValueError(
+                        f"Error with function analyse_rheed_data. {ve}\nPlease make sure the Realtime Window Analysis file is at least 4 columns wide - where the first column represents time and the next three are RHEED intensities."
+                    )
+                if rheed_osc is not None:
+                    # Time axis (needed?)
+                    t_ds = nx_rheed_layer.create_dataset("time", data=rheed_osc["time"])
+                    t_ds.attrs["units"] = "s"
+                    t_ds.attrs["long_name"] = "Time"
+
+                    # Intensity shape (n_layers, n_timepoints, 3)
+                    i_ds = nx_rheed_layer.create_dataset(
+                        "intensity", data=rheed_osc["intensity"]
+                    )
+                    i_ds.attrs["units"] = "a.u."
+                    i_ds.attrs["long_name"] = "RHEED Intensity"
+
+                    # NXdata attributes — NeXus 3.x notation
+                    nx_rheed_layer.attrs["signal"] = "intensity"
+                    nx_rheed_layer.attrs["axes"] = [
+                        ".",
+                        "time",
+                        ".",
+                    ]  # only time axis (1) is named
+                    nx_rheed_layer.attrs["time_indices"] = np.array([1], dtype=np.int32)
+
+            heatmap_matrix = None  # stays None when this layer has no image
+            if image_path and os.path.isfile(image_path):
+                img = Image.open(image_path).convert("L")  # 8-bit grayscale
+                heatmap_matrix = np.array(img, dtype=np.uint8)
+                # heatmap_matrix = heatmap_matrix.astype(np.float32) / 255.0  # toggle to normalize matrix values
+
+            if heatmap_matrix is not None:
+                heatmap = nx_rheed_layer.create_dataset(
+                    "diffraction_image", data=heatmap_matrix
+                )
+                heatmap.attrs["long_name"] = "Diffraction Image"
+                heatmap.attrs["units"] = "a.u."
+                heatmap.attrs["interpretation"] = "image"  # 2-D data; "spectrum" is meant for 1-D
+    return
+    # TO-DO: ↓↓↓ comment cleanup ↓↓↓
+    #
    # here's what we gon do: (to be read with the voice of Mike from Breaking Bad)
    # 1. rheed_osc and heatmap_matrix are NOT given in input to the function so no need for checking that
    # 2. loop through the layers, each with its elabid and metadata
@@ -753,96 +872,45 @@ def build_nexus_file(pld_fabrication, output_path, rheed_osc=None, heatmap_matri
    #   * Layer.fetch_textual_uploads() - dictionary
    #   * Layer.fetch_images() - dictionary

-        if rheed_osc is not None:
-            # Time axis
-            t_ds = nx_rheed.create_dataset("time", data=rheed_osc["time"])
-            t_ds.attrs["units"] = "s"
-            t_ds.attrs["long_name"] = "Time"
-
-            # Intensity: shape (n_layers, n_timepoints, 3)
-            i_ds = nx_rheed.create_dataset("intensity", data=rheed_osc["intensity"])
-            i_ds.attrs["units"] = "a.u."
-            i_ds.attrs["long_name"] = "RHEED Intensity"
-
-            # NXdata attributes: correct NeXus 3.x notation
-            nx_rheed.attrs["signal"] = "intensity"
-            nx_rheed.attrs["axes"] = [
-                ".",
-                "time",
-                ".",
-            ]  # only axis 1 (time) is named
-            nx_rheed.attrs["time_indices"] = np.array([1], dtype=np.int32)
-            # ###########
-            # nx_rheed = nx_pld_entry.create_group("rheed_data")
-            # nx_rheed.attrs["NX_class"] = "NXdata"
-
-            # nx_rheed.create_dataset("time", data=rheed_osc["time"])
-            # nx_rheed["time"].attrs["units"] = "s"
-
-            # nx_rheed.create_dataset("intensity", data=rheed_osc["intensity"])
-            # #nx_rheed["intensity"].attrs["units"] = "counts"
-            # nx_rheed["intensity"].attrs["long_name"] = "RHEED intensity"
-            # nx_rheed.attrs["signal"] = "intensity"
-            # nx_rheed.attrs["axes"] = "layer:time:channel"
-            # nx_rheed.attrs["layer_indices"] = [0]  # layer axis
-            # nx_rheed.attrs["time_indices"] = [1]   # time axis
-            # nx_rheed.attrs["channel_indices"] = [2]
-        if heatmap_matrix is not None:
-            heatmap = nx_rheed.create_dataset("diffraction_image", data=heatmap_matrix)
-            heatmap.attrs["long_name"] = "Diffraction Image"
-            heatmap.attrs["units"] = "a.u."
-            # this is of my own initiative. good???
-            heatmap.attrs["interpretation"] = "spectrum"
-            # suggested by DeepSeek, useful? probably not.
-            # heatmap.attrs["suggested_colormap"] = "inferno"
-            # heatmap.attrs["scale_min"] = 0.0
-            # heatmap.attrs["scale_max"] = 1.0
-    return
-

 if __name__ == "__main__":
-    # TO-DO: place the API base URL somewhere else.
-    ELABFTW_API_URL = "https://elabftw.fisica.unina.it/api/v2"
-    apikey = getpass("Paste API key here: ")
-    elabid = input("Enter elabid of your starting sample [default = 1111]: ") or 1111
-    data = APIHandler(apikey).get_entry_from_elabid(elabid)
+    load_dotenv()
+    api_key = os.getenv("api_key") or getpass("Paste API key here: ", echo_char="*")
+    elabid = (
+        os.getenv("elabid")
+        or input("Enter elabid of your starting sample [default = 1111]: ")
+        or 1111
+    )
+    ELABFTW_API_URL = os.getenv("ELABFTW_API_URL") or input(
+        "Enter a valid eLabFTW API URL (ends with '/api/v2'): "
+    )
+    handler = APIHandler(api_key, ELABFTW_API_URL)
+    data = handler.get_entry_from_elabid(elabid)
    sample = Entrypoint(data)
-    sample_name = sample.name.strip().replace(" ", "_")
+    sample_name = sample.name.strip().replace(
+        " ", "-"
+    )  # raises an error if "title" is missing or is not a str
+    operative_unit = os.getenv("operative_unit") or None
+    if operative_unit:
+        operative_unit = operative_unit.strip().replace(" ", "-")
+    if sample.proposal:
+        sample_proposal = call_proposal_from_elabid(sample.proposal)
+    else:
+        sample_proposal = None
    substrate_object = chain_entrypoint_to_batch(sample)  # Substrate-class object
    layers = chain_entrypoint_to_layers(sample)  # list of Layer-class objects
    n_layers = len(layers)  # total number of layers on the sample
-    result = make_nexus_schema_dictionary(substrate_object, layers)
    # print(make_nexus_schema_dictionary(substrate_object, layers)) # debug
-    with open(f"output/sample-{sample_name}.json", "w") as f:
-        json.dump(result, f, indent=3)
-    # TO-DO: remove the hard-coded path of the RWA file
-    # ideally the script should download a TXT/CSV file from each layer
-    # (IF PRESENT ←→ also handle missing file error)
-    # and merge all data in a single file to analyse it
-    # WARNING: fails if file is missing
+    fn_base = (
+        "nffa-di_"
+        + (f"{sample_proposal}_" if sample_proposal else "")
+        + (f"{operative_unit}_" if operative_unit else "")
+        + "PLD_"
+        + sample_name[:9]
+    )

-    with open("tests/Realtime_Window_Analysis.txt", "r") as o:
-        osc = np.loadtxt(o, delimiter="\t")
-    try:
-        rheed_osc = (
-            analyse_rheed_data(data=osc) or None
-        )  # analyze rheed data first, build the file later
-    except ValueError as ve:
-        raise ValueError(
-            f"Error with function analyse_rheed_data. {ve}\nPlease make sure the Realtime Window Analysis file is exactly 4 columns wide - where the first column represents time and the others are RHEED intensities."
-        )
-    # This one tries to open a png image.
-    # Emiliano said to keep it to one image per layer tops.
-    # In this test I will only consider one image.
-    # TO-DO: make it format-agnostic. If not possible, make it PNG-only.
-    if os.path.isfile("tests/LAO_16min50s_736C_STO.bmp"):  # if BMP
-        # if os.path.isfile("tests/LAO_16min50s_736C_STO.png"): # if PNG
-        img = Image.open("tests/LAO_16min50s_736C_STO.bmp").convert("L")
-        mx = np.array(img, dtype=np.uint8)
-        # mx = mx.astype(np.float32) / 255.0  # consider deleting???
-    build_nexus_file(
-        result,
-        output_path=f"output/sample-{sample_name}-nexus.h5",
-        rheed_osc=rheed_osc,
-        heatmap_matrix=mx,
-    )
+    result = make_nexus_schema_dictionary(substrate_object, layers)
+    with open(f"output/{fn_base}.json", "w") as f:
+        json.dump(result, f, indent=3)
+
+    build_nexus_file(result, output_path=f"output/{fn_base}.nxs")
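The two attachment-download blocks in the diff above are nearly identical, and the TO-DO next to them asks whether they deserve a dedicated function. Below is a minimal sketch of what such a helper might look like. The name `download_if_present` and the stand-in `fake_download` are hypothetical and not part of the repository; the `download` argument is assumed to behave like `handler.download_attachment_to_disk`, i.e. to accept `elabid` and `upload_id` keywords and return a local path.

```python
from typing import Callable, Optional


def download_if_present(file_info: dict, download: Callable[..., str]) -> Optional[str]:
    """Download one attachment described by file_info; return None if it is empty."""
    if not file_info:
        return None
    try:
        elabid = file_info["related_experiment"]
        upload_id = file_info["id"]
    except KeyError as ke:
        # Report the name of the file actually being processed.
        name = file_info.get("filename") or "<missing name>"
        raise KeyError(f"Missing key in your file: {name}: {ke}") from ke
    return download(elabid=elabid, upload_id=upload_id)


def fake_download(elabid, upload_id):
    """Hypothetical stand-in for handler.download_attachment_to_disk."""
    return f"output/attachments/{elabid}_{upload_id}"


data_path = download_if_present(
    {"id": 7, "related_experiment": 42, "filename": "rwa.txt"}, fake_download
)
image_path = download_if_present({}, fake_download)
```

With a helper of this shape, each call builds its error message from the dictionary it was given, so the data and image branches cannot report each other's filename.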
Author	SHA256	Message	Date
PioApocalypse	685d15d55b	MAJOR: solves problem related to ELABFTW_API_URL variable if no value was specified for such variable (or .env was missing) EAU would be set to None and get stuck in a prompt loop solved by turning EAU into a required variable in APIHandler (and editing a lot of code through all of src/)	2026-05-14 17:24:02 +02:00
PioApocalypse	1ce381f341	quality improvements API key prompt is now "echo on" - echo off was useless given the context sample name gets trimmed so only STD-ID is preserved in the filename filename now contains technique (PLD) and ends in .nxs all should be right in the world - and nffa-di data research policy compliant, spec.lly sect. 3.1.7-3.1.9	2026-05-14 17:21:07 +02:00
PioApocalypse	e1d5dfa487	example env now contains approved operative unit code for CNR-SPIN@Na	2026-05-14 17:09:35 +02:00
PioApocalypse	45220bbaf3	docs finished up to usage, ignores drawio bkp	2026-05-14 17:08:56 +02:00
PioApocalypse	dc916b1207	new docs, up to installation procedure	2026-05-14 01:40:54 +02:00
PioApocalypse	50a1ba9f22	first docfiles (asciidoc) - not completed not even the introduction is full	2026-05-13 21:01:05 +02:00
emanuele	8962135f0e	adds example .env file	2026-05-13 12:38:00 +02:00
emanuele	ee96100a73	uses dotenv to store api key and other important variables if a value is not found in .env it will be prompted, but not checked next step is user docs	2026-05-13 12:31:26 +02:00
emanuele	686f869d10	documents all the functions/classes/methods (by hand) no AI used, it took more than I'm willing to admit but it's done	2026-05-13 12:12:32 +02:00
emanuele	2eea3fc2dd	ignores output/attachments	2026-05-13 10:27:40 +02:00
emanuele	cbf5cdd115	clears comments	2026-05-13 10:26:15 +02:00
emanuele	a6d4c72f9c	adds dependency: dotenv	2026-05-13 09:53:57 +02:00
emanuele	7e808509cc	THIS should solve the naming problem new class for the Proposals, only outputs their names if name contains "Proposal ", that gets cropped out if no proposal is specified the name of the sample shall not include one	2026-05-12 22:59:19 +02:00
emanuele	2bbab96ca7	rm unnecessary fstring	2026-05-12 16:48:04 +02:00
PioApocalypse	f84478a7a4	this should solve the filename problem	2026-05-12 16:08:49 +02:00
PioApocalypse	19a802694f	MAJOR: fundamental functions of the parser are ready and tested! TO-DO: 1. follow the "TO-DO" comments to clean the code 2. filename should be NFFA-DI compliant like: nffa-di_NA01_Napoli_Na-26-015.h5 3. rheed data analysis should take two distinct functions one for the raw stream and one for the image 4. if time allows: consider moving most of main.py in separate modules	2026-05-12 15:38:06 +02:00
PioApocalypse	df927b7c0e	Layer class methods to list attachments up and tested	2026-05-12 13:51:59 +02:00
PioApocalypse	ccf74fca26	methods to download experiments attachments up and tested to-do: clean code	2026-05-12 13:36:52 +02:00
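
Several commits in the log above concern the output filename. As a rough sketch only, the concatenation that builds `fn_base` in the `__main__` block could be expressed as a small function. The name `build_filename_base` and its `technique` parameter are hypothetical, and the sample values are illustrative, borrowed from the example filename quoted in the commit log.

```python
from typing import Optional


def build_filename_base(
    sample_name: str,
    proposal: Optional[str] = None,
    operative_unit: Optional[str] = None,
    technique: str = "PLD",
) -> str:
    """Join the filename parts with underscores, skipping the optional ones."""
    parts = ["nffa-di"]
    if proposal:
        parts.append(proposal)
    if operative_unit:
        parts.append(operative_unit.strip().replace(" ", "-"))
    parts.append(technique)
    # Keep only the first 9 characters of the sample name, as the diff does.
    parts.append(sample_name.strip().replace(" ", "-")[:9])
    return "_".join(parts)


with_unit = build_filename_base("Na-26-015 LAO on STO", operative_unit="cnr-spin.na")
with_all = build_filename_base("Na-26-015", proposal="NA01", operative_unit="cnr-spin.na")
bare = build_filename_base("Na-26-015")
```

Collecting the parts in a list and joining them once keeps the underscore handling in a single place, which may be easier to adjust if the naming policy changes.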