new docs, up to installation procedure

2026-05-14 01:40:54 +02:00
parent 50a1ba9f22
commit dc916b1207
5 changed files with 77 additions and 12 deletions
--- a/docs/images/usage-venv.jpg
+++ b/docs/images/usage-venv.jpg
--- a/docs/images/usage-venv.png
+++ b/docs/images/usage-venv.png
--- a/docs/user-manual_intro.adoc
+++ b/docs/user-manual_intro.adoc
@@ -1,8 +1,8 @@
 == Introduction
 // TO-DO: Grammar-check. I'm totally fried right now and can't seem to complete even a single proper
-*eXParser* - short for _**e**LabFTW to Ne**X**us **Parser**_ - is (hopefully) a family of specialized parsing software applications, mainly developed in Python, whose primary job is to automatically transform experimental metadata and data - originally stored as JSON objects inside an electronic lab notebook - into standardized, self-descriptive **NeXus files**.
+*{software-family}* - short for _**e**LabFTW to Ne**X**us **Pars**er_ - is (hopefully) a family of specialized parsing software applications, mainly developed in Python, whose primary job is to automatically transform experimental metadata and data - originally stored as JSON objects inside an electronic lab notebook - into standardized, self-descriptive **NeXus files**.

-The software is designed to fetch "scattered" data (often distributed across multiple linked entries) from our eLNfootnote:[Acronym for "_electronic Lab Notebook_".] of choice - link:{elabftw-site}[**eLabFTW**^] - where the data is originally stored as JSON objects. It then parses the included metadata to resolve the full dataset which is then used to create a dictionary following a pre-established schema (dependent on the analysis or fabrication method, e.g., PLD, XRD, or RHEED), and finally uses said dictionary to produce an **HDF5/NeXus File** which complies with the **FAIR Principles** and the guidelines given within the context of the Italian PNRRfootnote:pnrr[PNRR stands for _National Recovery and Resilience Plan_.] **NFFA-DI**.
+The software is designed to fetch "scattered" data (often distributed across multiple linked entries) from our eLNfootnote:[Acronym for "_electronic Lab Notebook_".] of choice - link:{elabftw-site}[**eLabFTW**^] - where the data is originally stored as JSON objects. It then parses the included metadata to resolve the full dataset, builds from it a dictionary following a pre-established schema (dependent on the analysis or fabrication method, e.g., PLD, XRD, or RHEED), and finally uses that dictionary to produce an **HDF5/NeXus file** that complies with the **FAIR Principles** and with the guidelines given within the context of the Italian PNRRfootnote:pnrr[PNRR stands for _National Recovery and Resilience Plan_.] **NFFA-DI**.

 Specifically, *{software-name}* is designed for *Pulsed Laser Deposition / PLD* fabrications.

@@ -37,8 +37,16 @@ In a software like eLabFTW where data can (and will) be spread out through multi

 To this end, {software-name} interacts with eLabFTW via its REST API (Application Programming Interface). It reads a starting sample's ID (the entry point), fetches the relevant JSON metadata, chains requests using the elabids of the sample's linked resources and experiments, and rebuilds the entire dataset. If available, it also downloads attached instrument files (e.g., RHEED intensities, images), then packages everything into the final NeXus file.
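
The request chaining described above can be pictured as a graph walk that starts at the entry point and follows links until nothing new turns up. The following is only a minimal sketch under stated assumptions, not the actual implementation: the in-memory `FAKE_ELN` store stands in for the REST API, and the `links` key, the IDs and the titles are all hypothetical.

```python
from collections import deque
from typing import Callable

# Hypothetical stand-in for the eLabFTW REST API: entry ID -> JSON body.
# The "links" key, the IDs and the titles are invented for illustration.
FAKE_ELN = {
    101: {"title": "Sample A", "links": [202, 303]},
    202: {"title": "Substrate", "links": []},
    303: {"title": "PLD layer 1", "links": [202, 404]},
    404: {"title": "Target", "links": []},
}


def fetch_entry(entry_id: int) -> dict:
    """Pretend API call; a real client would issue an HTTP GET here."""
    return FAKE_ELN[entry_id]


def resolve_dataset(
    start_id: int, fetch: Callable[[int], dict] = fetch_entry
) -> dict[int, dict]:
    """Breadth-first walk from the entry point, fetching each linked entry once."""
    resolved: dict[int, dict] = {}
    queue = deque([start_id])
    while queue:
        entry_id = queue.popleft()
        if entry_id in resolved:
            continue  # already fetched; this also guards against circular links
        entry = fetch(entry_id)
        resolved[entry_id] = entry
        queue.extend(entry.get("links", []))
    return resolved


if __name__ == "__main__":
    for entry_id, entry in resolve_dataset(101).items():
        print(entry_id, entry["title"])
```

Passing the fetch function as a parameter keeps the walk independent of how entries are retrieved, so the same logic can be exercised offline with a fake store like the one above.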

-=== HDF5 and NeXus Files
+=== The output: HDF5 and NeXus files
+The output of {software-family} is an **HDF5 (Hierarchical Data Format version 5) file**. HDF5 is a file format designed to store and organize large volumes of numerical data. It acts like a virtual file system inside a single file, using a hierarchical structure of groups and datasets in the same way a file system uses folders and files. Both kinds of element carry their own metadata, so the file is self-describing and holds all relevant information, much like a small database. HDF5 also supports efficient slicing, compression and parallel I/O. The usual file extension for this format is `.h5`.
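
To make the folder-and-file analogy concrete, a nested Python dictionary can be flattened into HDF5-style paths. This is a pure-Python illustration that does not touch a real HDF5 file or any HDF5 library; the `flatten` helper and every key name in `example` are hypothetical and are not taken from a NeXus definition.

```python
def flatten(tree: dict, prefix: str = "") -> dict[str, object]:
    """Map a nested dictionary to {path: value}, using '/' as in HDF5 paths."""
    paths: dict[str, object] = {}
    for key, value in tree.items():
        path = f"{prefix}/{key}"
        if isinstance(value, dict):
            # A nested dictionary plays the role of a group (a "folder").
            paths.update(flatten(value, path))
        else:
            # A leaf value plays the role of a dataset (a "file").
            paths[path] = value
    return paths


# Hypothetical content, for illustration only.
example = {
    "entry": {
        "sample": {"name": "sample-01", "thickness_nm": 12.5},
        "instrument": {"laser": {"wavelength_nm": 248}},
    }
}

if __name__ == "__main__":
    for path, value in flatten(example).items():
        print(path, "=", value)
```

Each resulting key, such as `/entry/sample/name`, reads like an absolute path in a file system, which is how groups and datasets are addressed inside an HDF5 file.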

-== Using the software
+*NeXus*, on the other hand, is a common data standard [.underline]#built on top of HDF5#. It defines fixed conventions for naming groups, datasets and attributes, specifically for neutron, X-ray, and now materials science experiments. NeXus provides "application definitions" (like _NXpld_fabrication_ for PLD) that specify exactly which fields must or may appear. NeXus is also heavily promoted by _FAIRmat_, a Germany-based consortium within the NFDI, whose main mission is providing scientists «with a FAIR data infrastructure and the skills and tool they need to make the most of it»footnote:[As stated on their link:{fairmat-site}[website^].]. The usual file extension for this format is `.nxs`, but file viewers generally treat `.h5` and `.nxs` files alike.

-== Troubleshooting
+Last but not least, NeXus is also the format of choice for data sharing in the NFFA-DI guidelines, which brings us to the reason why {software-family} exists.
+
+==== Reading HDF5/NeXus files
+While writing an HDF5/NeXus file usually requires dedicated software and/or a good knowledge of programming and familiarity with specific libraries (like h5py), there are multiple ways to read these files even without such knowledge.
+
+One such way is the online NeXus file viewer of the NCNR (_NIST Center for Neutron Research_), available on their link:{ncnr-viewer}[website^]. The "_Browse..._" button at the bottom lets you upload both `.h5` and `.nxs` files; drag and drop also works.
+
+Another similar, and in my opinion more elegant, online file viewer is link:{hdf5-viewer}[MyHDF5^], hosted by The HDF Group. Besides its more modern appearance, this viewer doesn't upload files to any remote server: every operation happens locally in your browser. Drag and drop is also more forgiving (you won't accidentally reload the page if you miss the drop area), and the viewer can open multiple files at once and load `.h5` files from a URL.
--- a/docs/user-manual_main.adoc
+++ b/docs/user-manual_main.adoc
@@ -1,9 +1,9 @@
-= eXP for PLD User Manual: eLabFTW to NeXus Parser for PLD Fabrications 
+= {software-name} User Manual: eLabFTW to NeXus Parser for PLD Fabrications 
 :author: Emanuele D'Amico
 :description: eLabFTW to NeXus Parser for PLD Fabrications
-:email: emanuele.damico@cnr.it
+:email: emanuele+expars@damico.ing
 :keywords: nffa-di, elabftw, nexus, parser, data science, mdmc, naples, cnr-spin, cnr, spin institute, python, hdf5, cli
-:revdate: 2026-05-13
+:revdate: 2026-05-14
 :revnumber: v0.2.1
 :revremark: alpha untested
 :stem: latexmath
@@ -11,15 +11,21 @@
 :doctype: book
 // custom attributes
 :disclamer: I'm in no position to give anyone coding/development/programming/testing tips. The only tips I can give you are based on my personal knowledge of this specific project.
-:software-name: eXParser-PLD
-:url-repo: https://gitea.damico.ing/emanuele/eXParser-PLD
-:ssh-repo: ssh://git@gitea.damico.ing/emanuele/eXParser-PLD.git
+:software-family: eXPars
+:software-name: {software-family}-PLD
+:repo-url: https://gitea.damico.ing/emanuele/eXParser-PLD
+:repo-ssh: ssh://git@gitea.damico.ing/emanuele/eXParser-PLD.git
 :elabftw-site: https://elabftw.net
 :nffa-di-site: https://nffa-di.it/en/about-us/project/
 :go-fair-site: https://www.go-fair.org/fair-principles/
+:fairmat-site: https://www.fairmat-nfdi.eu/fairmat/about-fairmat/consortium-fairmat#mission
+:ncnr-viewer: https://ncnr.nist.gov/ncnrdata/view/nexus-hdf-viewer.html
+:hdf5-viewer: https://myhdf5.hdfgroup.org/

 include::user-manual_intro.adoc[]

+include::user-manual_usage.adoc[]
+
 ///////////////////////////////////////////////////////////////////////////
 // Look out for "method-specific" comments I've left before sections
 // containing information about one method in particular (e.g. PLD fab.)
--- a/docs/user-manual_usage.adoc
+++ b/docs/user-manual_usage.adoc
@@ -0,0 +1,51 @@
+== Using the software
+WARNING: This software requires Python 3.12 or later. +
+The module *venv* and the package manager *pip* are also required.
+
+=== Downloading the source code
+IMPORTANT: Currently ({revdate}), the source code is hosted on a private Gitea instance owned by {author}. +
+If the site is down for maintenance or temporarily unavailable, please contact the webmaster via mailto:{email}[e-mail].
+
+// TO-DO: add link to direct download of package
+The source code can be acquired directly via *git*, or downloaded from the official repository on link:{repo-url}[Gitea D'Amico^].
+
+[source,bash,subs="verbatim,attributes"]
+----
+git clone {repo-url}.git {software-name}
+cd {software-name} # enter directory
+ls
+  LICENSE    docs/     output/           src/
+  README.md  glossary  requirements.txt  tests/
+----
+
+Optionally, you can access the code in the development branch by executing:
+[source,bash]
+----
+git checkout dev
+----
+
+=== Preparing the environment
+{software-name} {revnumber} requires a total of six Python modules to be installed before it can run. Since installing Python modules system-wide is rarely a good idea, start by creating and activating a virtual environment.
+
+In the software folder, run:
+
+[source,bash]
+----
+# Calls venv module to create new Python virtual environment in .venv:
+python3 -m venv .venv
+# If the command is successful, a new .venv folder now exists:
+ls -d .venv
+  .venv
+# Activate venv:
+source .venv/bin/activate
+----
+
+.Most shells like Bash show very clearly when you're working inside a virtual environment.
+image::images/usage-venv.png[]
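
If the shell prompt gives no hint, the same two preconditions (a recent enough interpreter and an active virtual environment) can be checked from Python itself. This is a small sketch rather than part of {software-name}: the helper names are hypothetical, and the minimum version is simply the one stated at the top of this chapter.

```python
import sys

MINIMUM_PYTHON = (3, 12)  # taken from the requirement stated in this manual


def python_is_recent_enough(version=sys.version_info) -> bool:
    """Compare the (major, minor) pair against the documented minimum."""
    return tuple(version[:2]) >= MINIMUM_PYTHON


def inside_virtualenv() -> bool:
    """Inside a venv, sys.prefix points at the environment,
    while sys.base_prefix still points at the base installation."""
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("Python", ".".join(map(str, sys.version_info[:3])))
    print("version OK:", python_is_recent_enough())
    print("virtual environment active:", inside_virtualenv())
```

Saved to a file and run with the interpreter from the activated environment, the last line should report `True`; run with the system interpreter, it should report `False`.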
+
+At this point you're free to install the requirements through *pip*:
+
+
+
+---
+== Troubleshooting