diff --git a/jupyter/chained_requests.ipynb b/jupyter/chained_requests.ipynb new file mode 100644 index 0000000..c163ce7 --- /dev/null +++ b/jupyter/chained_requests.ipynb @@ -0,0 +1,244 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "a55330c2-1e65-42f4-bb3e-5a4298978771", + "metadata": {}, + "source": [ + "# Chained requests: fetching a resource referenced by another resource referenced by another resource...\n", + "An issue with our eLabFTW schema is that important data is scattered across many interconnected entries. For this reason the data requires **chained API requests** to resolve fully. This happens because resources on eLabFTW are linked to each other through their internal IDs (*elabid*), which, as previously stated, are not relevant to the project. For instance, to recover the value of key X from item A referenced on experiment B we need to:\n", + "\n", + "1. Fetch JSON data on experiment B through an HTTP request;\n", + "2. Parse the JSON to get the value of item A's elabid;\n", + "3. Send another HTTP request to the server to fetch item A's data (also in JSON);\n", + "4. Parse the JSON to get the value of key X;\n", + "5. In the event that key X is the elabid of yet another resource, repeat steps 3 to 5 as necessary.\n" + ] + }, + { + "cell_type": "markdown", + "id": "a1ff1232-174e-4e0e-901b-92750340d1bc", + "metadata": {}, + "source": [ + "## Levels of nesting\n", + "Suppose we want to fetch the chemical formula of the compound of the substrate of a certain sample, following the steps listed above. The string containing that information is found in the eLab entry for the *compound* the substrate is made of, which is linked to the entry for the *batch* from which the substrate is taken, which in turn is linked to the entry for the *sample* associated with the experiment.
Therefore we'll need to make four chained HTTP requests to the eLabFTW server (the initial one plus three more) just to get that one piece of information.\n", + "\n", + "```\n", + "──> : \"points to ... but doesn't include its data\"\n", + "Experiment (= deposition layer) ──> Sample ──> Substrate batch ──> Compound (Chemical formula)\n", + "```\n", + "\n", + "1. Make an HTTP request to fetch an **experiment's data** (equivalent to a single PLD deposition layer).\n", + "2. Parse the JSON to get the value of the *sample's elabid*.\n", + "3. Make an HTTP request to fetch the **sample's data**.\n", + "4. Parse the JSON to get the value of the *substrate batch's elabid*.\n", + "5. Make an HTTP request to fetch the **substrate batch's data**.\n", + "6. Parse the JSON to get the value of the *substrate's compound's elabid*.\n", + "7. Make an HTTP request to fetch the **compound's data**.\n", + "8. Parse the JSON to FINALLY get the value of the compound's chemical formula.\n", + "\n", + "> ℹ️ The chemical formula of the substrate is required by the NeXus standard for PLD depositions.\n", + "\n", + "The information we want to get is nested behind three additional HTTP requests. Every additional request required, other than the initial one made to fetch the experiment's data, shall be referred to as a **level of nesting** (or depth). If a piece of information is immediately available in the experiment's data, it means the information is not nested and is at zero levels of depth.\n", + "\n", + "Here are some examples of nested values required by the NeXus standard for PLD depositions.\n", + "\n", + "```\n", + "(0) Experiment (= deposition layer) ───> Deposition time, Repetition rate, etc.
(depth 0)\n", + "1    ├──> Sample\n", + " 2   │    └──> Substrate batch\n", + "  3  │         └──> Compound ───> Chemical formula (depth 3)\n", + "1    ├──> Target\n", + " 2   │    └──> Compound ───> Chemical formula (depth 2)\n", + "1    ├──> RHEED System ───> Title (name of, depth 1)\n", + "1    └──> Laser System ───> Title (name of, depth 1)\n", + "```\n", + "\n", + "### Nested data when starting from a sample\n", + "If we want to fully resolve all data on a deposition by starting from the entry for the sample (as opposed to starting from the experiments selected manually - which is more logical), the schema proposed above should be rearranged as follows:\n", + "\n", + "```\n", + "(0) Sample (eLabFTW Resource) ───> Title, STD-ID (depth 0)\n", + "1    ├──> Deposition layers (eLabFTW Experiments)\n", + " 2   │    ├──> Target\n", + "  3  │    │    └──> Compound ───> Chemical formula (depth 3)\n", + " 2   │    ├──> RHEED System ───> Title (name of, depth 2)\n", + " 2   │    └──> Laser System ───> Title (name of, depth 2)\n", + "1    └──> Substrate batch\n", + " 2        └──> Compound ───> Chemical formula (depth 2)\n", + "```\n", + "\n", + "### Computable data\n", + "Some values can be obtained as a result of basic arithmetic operations. For instance the number of pulses to produce a certain layer is not specified on eLabFTW but it can be easily obtained as the product of the *deposition time* (in seconds) and the *repetition rate* (in Hertz, assumed constant).\n", + "\n", + "Computable data shall not be considered equal to nested data; as a matter of fact - if the analysis starts from the experiment data - both *deposition time* and *repetition rate* of an experiment are level-zero information, as they can be taken directly from the experiment's JSON."
+ ] + }, + { + "cell_type": "markdown", + "id": "8d7caa68-6e64-4715-a3cc-41971a539eb3", + "metadata": {}, + "source": [ + "## Getting level 1 data\n", + "### Name of the RHEED system used from the experiment\n", + "Starting easy, let's write a function to fetch the name of the RHEED system used in Experiments 48 and 49 (`../tests/objects/experiment_48_elab.json`, `experiment_49_elab.json`). The information is nested at level-1 depth, which means we need one additional HTTP request to retrieve the JSON data for the \"RHEED System\" entry." + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "id": "3aa6992e-e613-413b-aa3d-801735741437", + "metadata": { + "scrolled": true + }, + "outputs": [ + { + "name": "stdin", + "output_type": "stream", + "text": [ + "Paste API key here: ········\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "For exp. 48: STAIB RHEED 30\n", + "For exp. 49: STAIB RHEED 30\n", + "SAME RHEED SYSTEM!\n" + ] + } + ], + "source": [ + "import os, json, requests\n", + "from getpass import getpass\n", + "\n", + "def get_rheed(filename):\n", + "    with open(os.path.join(\"../tests/objects\", filename)) as f:\n", + "        layer = json.load(f)\n", + "    extra = layer[\"metadata_decoded\"][\"extra_fields\"]\n", + "    rheed_id = extra.get(\"RHEED System\").get(\"value\")\n", + "    header = {\n", + "        \"Authorization\": apikey,\n", + "        \"Content-Type\": \"application/json\"\n", + "    }\n", + "    rheed_data = requests.get(\n", + "        headers = header,\n", + "        url = f\"https://elabftw.fisica.unina.it/api/v2/items/{rheed_id}\",\n", + "        verify=True\n", + "    )\n", + "    rheed_name = rheed_data.json().get(\"title\")\n", + "    return rheed_name\n", + "\n", + "apikey = getpass(\"Paste API key here: \")\n", + "rheed_48 = get_rheed(\"experiment_48_elab.json\")\n", + "rheed_49 = get_rheed(\"experiment_49_elab.json\")\n", + "print(f\"For exp. 48: {rheed_48}\")\n", + "print(f\"For exp.
49: {rheed_49}\")\n", + "# If equal print \"SAME RHEED SYSTEM!\"\n", + "if rheed_48 == rheed_49:\n", + "    print(\"SAME RHEED SYSTEM!\")" + ] + }, + { + "cell_type": "markdown", + "id": "67e2ce73-482f-4b74-a5ce-bf254fc1e12d", + "metadata": {}, + "source": [ + "### Fetch any experiment related to a certain sample\n", + "Another easy task is fetching every experiment entry related to a given sample. \\\n", + "Let's start with the sample whose elabid is *1077*, which at the time of writing is named \"Na-26-001 LMNO LAO\". The related experiments are Experiment 48 (layer 1) and Experiment 49 (layer 2). From these experiments we will only fetch *deposition time* and *repetition rate* to keep this notebook concise.\n", + "\n", + "#### How linked experiments are managed by eLabFTW\n", + "Linked experiments are similar to linked items: you can reference an eLabFTW experiment in a resource, but when you download the resource's data from the API endpoint you only get the experiment's elabid and **very little additional data**, like its title. In this situation the experiments linked to a sample contain information about its layers, which is at least level-1 nested since it requires at least one additional HTTP request."
+ ] + }, + { + "cell_type": "code", + "execution_count": 18, + "id": "504c6eec-967b-4886-8770-67b5df3ed5ef", + "metadata": {}, + "outputs": [ + { + "name": "stdin", + "output_type": "stream", + "text": [ + "Paste API key here: ········\n" + ] + }, + { + "data": { + "text/plain": [ + "[{'deposition_time': '65', 'repetition_rate': '1'},\n", + " {'deposition_time': '56', 'repetition_rate': '1'}]" + ] + }, + "execution_count": 18, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "import os, json, requests\n", + "from getpass import getpass\n", + "\n", + "def get_sample_layers_data(elabid):\n", + "    header = {\n", + "        \"Authorization\": apikey,\n", + "        \"Content-Type\": \"application/json\"\n", + "    }\n", + "    sample_data = requests.get(\n", + "        headers = header,\n", + "        url = f\"https://elabftw.fisica.unina.it/api/v2/items/{elabid}\",\n", + "        verify=True\n", + "    ).json()\n", + "    related_experiments = sample_data[\"related_experiments_links\"]\n", + "    result = []\n", + "    for exp in related_experiments:\n", + "        experiment_data = requests.get(\n", + "            headers = header,\n", + "            url = f\"https://elabftw.fisica.unina.it/api/v2/experiments/{exp.get('entityid')}\",\n", + "            verify=True\n", + "        ).json()\n", + "        extra = experiment_data[\"metadata_decoded\"][\"extra_fields\"]\n", + "        result.append(\n", + "            {\"deposition_time\": extra.get(\"Duration\").get(\"value\"),\n", + "            \"repetition_rate\": extra.get(\"Repetition rate\").get(\"value\")}\n", + "        )\n", + "    return result\n", + "\n", + "apikey = getpass(\"Paste API key here: \")\n", + "get_sample_layers_data(1077)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "046f7383-5830-462b-9ac0-2026e46c5697", + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, +
"file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.12.3" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +}