API Specification
=================
This is `Software Heritage <https://www.softwareheritage.org>`__'s
`SWORD
2.0 <http://swordapp.github.io/SWORDv2-Profile/SWORDProfile.html>`__
server implementation.
**S.W.O.R.D** (**S**\ imple **W**\ eb-Service **O**\ ffering
**R**\ epository **D**\ eposit) is an interoperability standard for
digital file deposit.
This implementation lets a client (a repository) push deposits of software
source code archives, with their associated metadata, to a server (the SWH
repository).
*Note:*

* In this document, the terms ``archive`` and ``software source code
  archive`` are used interchangeably.
* The supported archive formats are:

  * zip: common zip archive (no multi-disk zip files).
  * tar: tar archive, either uncompressed or compressed with one of the
    following algorithms: gzip (.tar.gz, .tgz), bzip2 (.tar.bz2), or lzma
    (.tar.lzma).
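
As an illustration of the format list above, a client could run a quick
file-name check before attempting a deposit. This is only a sketch: the helper
name ``has_supported_extension`` is hypothetical, and it assumes that the file
suffix reflects the actual archive format, which the server may verify
differently.

```python
# Suffixes taken from the list of supported archive formats above.
SUPPORTED_SUFFIXES = (
    ".zip",
    ".tar",
    ".tar.gz",
    ".tgz",
    ".tar.bz2",
    ".tar.lzma",
)


def has_supported_extension(filename: str) -> bool:
    """Return True if the file name ends with a supported archive suffix.

    Hypothetical client-side pre-check; it only looks at the name and
    never opens the file.
    """
    # str.endswith accepts a tuple and matches any of its elements.
    return filename.lower().endswith(SUPPORTED_SUFFIXES)
```

For instance, ``has_supported_extension("project-1.0.tar.gz")`` is true, while
a ``.rar`` or ``.tar.xz`` file name is rejected by this check.
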
Collection
----------
SWORD defines a ``collection`` concept. In SWH's case, this collection
refers to a group of deposits. A ``deposit`` consists of one or more software
source code archives associated with metadata.

By default, a client's collection is named after the client.
Limitations
-----------
* uploads are limited to 100 MiB
* no mediation
API overview
------------
API access is over HTTPS.
The API is protected through basic authentication.
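
The following sketch shows one way a client might prepare an authenticated
request using only the Python standard library. The function names and the
credentials used in the usage note are hypothetical. The code only builds the
request object and does not send anything.

```python
import base64
import urllib.request


def basic_auth_header(username: str, password: str) -> str:
    """Build the value of an HTTP Basic ``Authorization`` header."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def authenticated_request(
    url: str, username: str, password: str
) -> urllib.request.Request:
    """Prepare (but do not send) a request carrying Basic credentials."""
    # Basic credentials are only base64-encoded, not encrypted, so this
    # sketch refuses to attach them to a non-HTTPS URL.
    if not url.startswith("https://"):
        raise ValueError("credentials must only be sent over HTTPS")
    request = urllib.request.Request(url)
    request.add_header("Authorization", basic_auth_header(username, password))
    return request
```

Calling ``authenticated_request("https://deposit.softwareheritage.org/1/",
"user", "pass")`` returns a request object that can then be passed to
``urllib.request.urlopen``.
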
Endpoints
---------
The API endpoints are rooted at https://deposit.softwareheritage.org/1/.
Data is sent and received as XML (as specified in the SWORD 2.0
specification).
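
As a rough sketch, a client can resolve endpoint paths against the API root
and parse XML response bodies with the standard library. The relative path
``servicedocument/`` used in the usage note is an assumption made for
illustration; the actual paths are defined in the endpoint descriptions.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# Root of the API, as stated above.
API_ROOT = "https://deposit.softwareheritage.org/1/"


def endpoint_url(relative_path: str) -> str:
    """Resolve an endpoint path relative to the API root."""
    # A leading slash would make urljoin drop the "/1/" prefix, so strip it.
    return urljoin(API_ROOT, relative_path.lstrip("/"))


def parse_xml_body(body: bytes) -> ET.Element:
    """Parse an XML response body and return its root element."""
    return ET.fromstring(body)
```

With these helpers, ``endpoint_url("servicedocument/")`` resolves to a URL
under ``https://deposit.softwareheritage.org/1/``, and ``parse_xml_body``
turns the bytes of a response into an element tree that can be queried with
``find`` or ``findall``.
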
.. include:: endpoints/service-document.rst
.. include:: endpoints/collection.rst
.. include:: endpoints/update-media.rst
.. include:: endpoints/update-metadata.rst
.. include:: endpoints/status.rst
.. include:: endpoints/content.rst
Possible errors
---------------
* common errors:

  * 401 (unauthenticated) if a client does not provide credentials or
    provides wrong ones
  * 403 (forbidden) if a client tries to access a collection it does not own
  * 404 (not found) if a client tries to access an unknown collection
  * 404 (not found) if a client tries to access an unknown deposit
  * 415 (unsupported media type) if a wrong media type is provided to the
    endpoint

* archive/binary deposit:

  * 403 (forbidden) if the length of the archive exceeds the configured
    maximum size
  * 412 (precondition failed) if the provided length or hash does not match
    that of the archive
  * 415 (unsupported media type) if a wrong media type is provided

* multipart deposit:

  * 412 (precondition failed) if the provided md5 hash does not match that
    of the archive
  * 415 (unsupported media type) if a wrong media type is provided

* Atom entry deposit:

  * 400 (bad request) if the request's body is empty (for creation only)
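
Some of the errors listed above (403 for an oversized archive, 412 for a
length or hash mismatch) can be anticipated on the client side. The sketch
below computes an archive's length and MD5 digest from the file itself and
checks the size against the 100 MiB upload limit. The function names are
hypothetical, the limit is assumed to be inclusive, and how these values are
transmitted to the server is not covered here.

```python
import hashlib
from typing import Tuple

# Upload limit stated in this document; the server-side value is configurable.
MAX_UPLOAD_BYTES = 100 * 1024 * 1024  # 100 MiB


def archive_length_and_md5(path: str, chunk_size: int = 64 * 1024) -> Tuple[int, str]:
    """Return the size in bytes and the hexadecimal MD5 digest of a file."""
    md5 = hashlib.md5()
    length = 0
    with open(path, "rb") as stream:
        # Read in chunks so that large archives are never fully in memory.
        for chunk in iter(lambda: stream.read(chunk_size), b""):
            md5.update(chunk)
            length += len(chunk)
    return length, md5.hexdigest()


def check_before_upload(path: str) -> Tuple[int, str]:
    """Fail early if the archive exceeds the assumed upload limit."""
    length, md5sum = archive_length_and_md5(path)
    if length > MAX_UPLOAD_BYTES:
        raise ValueError(f"archive is {length} bytes, limit is {MAX_UPLOAD_BYTES}")
    return length, md5sum
```

Deriving both values from the file that is actually sent, rather than from
stored metadata, avoids declaring a length or hash that does not match the
uploaded archive.
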
Sources
-------
* `SWORD v2 specification
<http://swordapp.github.io/SWORDv2-Profile/SWORDProfile.html>`__
* `arXiv documentation <https://arxiv.org/help/submit_sword>`__
* `Dataverse example <http://guides.dataverse.org/en/4.3/api/sword.html>`__
* `SWORD used on HAL <https://api.archives-ouvertes.fr/docs/sword>`__
* `XML examples for CCSD <https://github.com/CCSDForge/HAL/tree/master/Sword>`__