# How to build your own reader

Is your data format not supported yet? Don't worry, the following how-to will guide you through writing a reader for your own data.

## pynxtools-xps supports your format, but some groups and fields are different

Good! The basic functionality to read your data is already in place. Before you start writing your own reader, consider two options:
1) You can modify the default [config files](https://github.com/FAIRmat-NFDI/pynxtools-xps/tree/main/src/pynxtools_xps/config).
2) Consider opening a [pull request on the GitHub repository](https://github.com/FAIRmat-NFDI/pynxtools-xps/pulls) modifying the existing reader.

## You have a completely new data format

You will have to write a new sub-reader inside pynxtools-xps. There are multiple steps to get started:

### Development install

You should start with a development install of the package and its dependencies:

```shell
git clone https://github.com/FAIRmat-NFDI/pynxtools-xps.git \
    --branch main \
    --recursive pynxtools_xps
cd pynxtools_xps
python -m pip install --upgrade pip
python -m pip install -e .
python -m pip install -e ".[dev,consistency_with_pynxtools]"
```

There is also a [pre-commit hook](https://pre-commit.com/#intro) available
which formats the code and runs the linter before each commit.
It can be installed with
```shell
pre-commit install
```
from the root of this repository.

### Design strategy
The development process is modular so that new parsers can be added. The design logic is the following:
1. First, [`XpsDataFileParser`](https://github.com/FAIRmat-NFDI/pynxtools-xps/tree/main/src/pynxtools_xps/file_parser.py#L36) selects a sub-parser that can read the provided files, based on their file extensions, and calls the `parse_file` function of that sub-parser. In addition, it selects the matching config file from the `config` subfolder.
2. Afterwards, the NXmpes NXDL template is filled with the data held in `XpsDataFileParser`, using the [`config`](https://github.com/FAIRmat-NFDI/pynxtools-xps/tree/main/src/pynxtools_xps/config) file. Data that is not contained in the main files can be added through the ELN file (and must be added this way for fields that are required in NXmpes).
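
The first step can be illustrated with a minimal sketch of extension-based dispatch. This is not the actual `XpsDataFileParser` code: the class names `MyFormatParser` and `ExtensionDispatcher`, the `.myfmt` extension, the config file name, and the shape of the returned dictionary are all hypothetical and only serve to show the idea. Only the `parse_file` method name is taken from the description above.

```python
from pathlib import Path
from typing import Any, Dict, Tuple, Type


class MyFormatParser:
    """Hypothetical sub-parser for a made-up ``.myfmt`` file format."""

    # Hypothetical name of the config file that belongs to this format.
    config_file = "config_myfmt.json"

    def parse_file(self, file: str, **kwargs: Any) -> Dict[str, Any]:
        # A real sub-parser would read the file here and return its
        # data and metadata; this sketch only returns a placeholder.
        return {"source_file": file, "entries": []}


class ExtensionDispatcher:
    """Simplified stand-in for the dispatch step (not the real class)."""

    def __init__(self) -> None:
        # Map lower-case file extensions to sub-parser classes.
        self._parsers: Dict[str, Type[MyFormatParser]] = {
            ".myfmt": MyFormatParser,
        }

    def select_parser(self, file: str) -> MyFormatParser:
        ext = Path(file).suffix.lower()
        try:
            return self._parsers[ext]()
        except KeyError:
            raise ValueError(
                f"No sub-parser registered for extension '{ext}'"
            ) from None

    def parse(self, file: str) -> Tuple[Dict[str, Any], str]:
        # Pick the sub-parser, call its parse_file, and report
        # which config file goes with the parsed data.
        parser = self.select_parser(file)
        return parser.parse_file(file), parser.config_file


dispatcher = ExtensionDispatcher()
data, config = dispatcher.parse("measurement.myfmt")
```

In this sketch, supporting a new format would amount to writing one more sub-parser class with a `parse_file` method and registering its extension in the mapping. How registration works in pynxtools-xps itself should be checked in `file_parser.py`.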

### Write your reader
TODO!

### Test the software
A basic test suite written in [pytest](https://docs.pytest.org/en/stable/) is available and can be run as follows:
```shell
python -m pytest -sv tests
```
You should add test data and add your reader to the `test_params` in the `test_reader.py` script.

## Further details

[NXmpes](https://fairmat-nfdi.github.io/nexus_definitions/classes/contributed_definitions/NXmpes.html)

[NXxps](https://fairmat-nfdi.github.io/nexus_definitions/classes/contributed_definitions/NXxps.html)
