---
title: "OAI-PMH API v2.0"
subtitle: Open Archives Initiative Protocol for Metadata Harvesting API, v2.0
date: 2024-07-17
---

::: {.callout-important}
## Superseded version of service and schema

As of **21 March 2025**, **v2.0** of the service and schema has been **superseded by version 2.1**. The service and data remain available for now, but we recommend switching to the newer version.
:::

::: {.callout-note}
The API is available here: [https://api.aiscr.cz/2.0/oai?verb=Identify](https://api.aiscr.cz/2.0/oai?verb=Identify){.external}.
User registration is available here: [https://amcr.aiscr.cz/accounts/register/](https://amcr.aiscr.cz/accounts/register/){.external}.
This guide is valid for **version 2.0** of the service and the schema.
:::

The OAI-PMH AMCR API provides metadata from the AMCR database using the *Open Archives Initiative Protocol for Metadata Harvesting* ([OAI-PMH](http://www.openarchives.org/OAI/openarchivesprotocol.html){.external}). The implementation supports two metadata standards:

- [Dublin Core](http://dublincore.org/){.external} -- AMCR basic metadata serialised using Dublin Core elements from the DCMI schema for unqualified Dublin Core according to the OAI-PMH specification;
- [AMCR XML](https://api.aiscr.cz/schema/amcr/2.0/amcr.xsd) -- native format with a complete representation of the whole AMCR database.

The API response is an XML file.

## Access

Access to some metadata is restricted by user role:

- **A** (anonymous) -- anyone on the internet accessing AMCR services,
- **B** (researcher) -- registered user,
- **C** (archaeologist) -- archaeologist from a licensed organisation,
- **D** (archivist) -- authorised AMCR archivist.

Registration to the system is available [here](https://amcr.aiscr.cz/accounts/register/){.external}.

Accessibility is automatically assessed for each record based on the user's role, organisational affiliation and the processing state of the record.
If a record or part of it is inaccessible, the corresponding element contains the string `HTTP/1.1 403 Forbidden`. The rules are applied as detailed in the following tables. The first table defines the basic rules for a record to be visible; the second lists the effects of limited accessibility defined at the level of the individual record.

```{r}
#| tbl-cap: "Basic rules for the record visibility"
read.csv("tabs/access1.csv", tryLogical = FALSE) |>
  dplyr::mutate(dplyr::across(dplyr::everything(), \(x) dplyr::if_else(x == TRUE, "✔", x)),
                dplyr::across(dplyr::everything(), \(x) dplyr::if_else(x == FALSE, "✘", x))) |>
  knitr::kable(align = "lcccc")
```

> *stav* -- processing state of the record, see [Documentation](https://amcr-help.aiscr.cz/o-systemu/procesy.html){.external} (*in Czech only*)
>
> *my record* -- record created by the user
>
> *my organisation record* -- record created by someone from the user's organisation

```{r}
#| tbl-cap: "Effects of the limited accessibility"
read.csv("tabs/access2.csv") |>
  knitr::kable(align = "ll", col.names = c("element with required role", "protected element"))
```

## Specification

### Schema

The schema (XSD) for this version of the service can be found here: [https://api.aiscr.cz/schema/amcr/2.0/](https://api.aiscr.cz/schema/amcr/2.0/). The schema follows the AMCR data structure as described [here](https://amcr-help.aiscr.cz/o-systemu/datovy-model.html){.external}. Main entities are represented as records in individual sets (see below), with all direct children listed in the hierarchical XML structure.

### Filters

#### Sets

Selective querying can be achieved using predefined sets. A description of the contents of each set is available at: [https://api.aiscr.cz/2.0/oai?verb=ListSets](https://api.aiscr.cz/2.0/oai?verb=ListSets){.external}.
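
The set identifiers can also be read programmatically from a `ListSets` response. The following is a minimal Python sketch, assuming the response uses the standard OAI-PMH 2.0 namespace and the `set`, `setSpec` and `setName` elements defined by the protocol. The helper name `parse_sets` and the abridged sample document are illustrative only and not part of the API.

```python
import xml.etree.ElementTree as ET

# Assumption: responses use the standard OAI-PMH 2.0 namespace.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def parse_sets(xml_text):
    """Map each setSpec to its setName in a ListSets response."""
    root = ET.fromstring(xml_text)
    return {
        set_el.findtext(OAI_NS + "setSpec"): set_el.findtext(OAI_NS + "setName")
        for set_el in root.iter(OAI_NS + "set")
    }


# Abridged, hand-written sample modelled on the ListSets response of this API.
SAMPLE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListSets>
    <set>
      <setSpec>projekt</setSpec>
      <setName>Projekty / Projects</setName>
    </set>
  </ListSets>
</OAI-PMH>
"""

print(parse_sets(SAMPLE))  # {'projekt': 'Projekty / Projects'}
```

The keys of the returned dictionary are the values accepted by the `set` argument.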
```{r}
#| tbl-cap: "Available sets"
read.csv("tabs/sets.csv") |>
  dplyr::mutate(vocabulary = dplyr::if_else(vocabulary, "✔", "✘")) |>
  knitr::kable(align = "llc")
```

#### Datestamps and deleted records

In `ListIdentifiers` and `ListRecords` requests, results can be filtered by datestamp using the optional `from` and `until` arguments. The datestamp reflects the last change to the record. The API also keeps track of deleted records that are no longer available in AMCR. Such records are still listed in the results, and their datestamp marks the date of deletion. Deleted records are marked with the `status="deleted"` attribute in the header element of the record (i.e. `<header status="deleted">`). Therefore, datestamps and the `from`/`until` arguments can be used to incrementally update previous harvests.

## Login and authorization

To access records with limited accessibility, it is necessary to log in using *Basic Access Authentication* when sending requests.

### cURL

Use the `-u` flag to pass your username and password with each request.

```zsh
curl -u '<username>:<password>' '<request URL>'
```

### Postman

In Postman, the username and password can be set on the *Authorization* tab. Select *Basic Auth* in the *TYPE* menu.

## Verbs

The OAI-PMH protocol defines several verbs that allow metadata harvesting. The specification is available [here](http://www.openarchives.org/OAI/openarchivesprotocol.html){.external}.

### Identify

Verb `Identify` is used to get information about the repository.

*Request:* [https://api.aiscr.cz/2.0/oai?verb=Identify](https://api.aiscr.cz/2.0/oai?verb=Identify){.external}

```zsh
curl 'https://api.aiscr.cz/2.0/oai?verb=Identify'
```

*Sample response:*

::: {.small}
```xml
<responseDate>2024-07-11T12:26:50Z</responseDate>
<request verb="Identify">https://api.aiscr.cz/oai</request>
<Identify>
  <repositoryName>Archaeological Map of the Czech Republic (AMCR)</repositoryName>
  <baseURL>https://api.aiscr.cz/oai</baseURL>
  <protocolVersion>2.0</protocolVersion>
  <adminEmail>info@amapa.cz</adminEmail>
  <earliestDatestamp>1990-01-01</earliestDatestamp>
  <deletedRecord>persistent</deletedRecord>
  <granularity>YYYY-MM-DDThh:mm:ssZ</granularity>
  Archeologická mapa České Republiky (AMCR)
  Archaeological Map of the Czech Republic (AMCR)
  [...]
  https://api.aiscr.cz/
  version 2.0.0
  [...]
</Identify>
```
:::

### ListMetadataFormats

Verb `ListMetadataFormats` returns the available metadata formats/standards.
*Request:* [https://api.aiscr.cz/2.0/oai?verb=ListMetadataFormats](https://api.aiscr.cz/2.0/oai?verb=ListMetadataFormats){.external}

```zsh
curl 'https://api.aiscr.cz/2.0/oai?verb=ListMetadataFormats'
```

*Sample response:*

::: {.small}
```xml
<responseDate>2024-07-11T12:46:31Z</responseDate>
<request verb="ListMetadataFormats">https://api.aiscr.cz/oai</request>
<ListMetadataFormats>
  <metadataFormat>
    <metadataPrefix>oai_dc</metadataPrefix>
    <schema>http://www.openarchives.org/OAI/2.0/oai_dc.xsd</schema>
    <metadataNamespace>http://www.openarchives.org/OAI/2.0/oai_dc/</metadataNamespace>
  </metadataFormat>
  <metadataFormat>
    <metadataPrefix>oai_amcr</metadataPrefix>
    <schema>https://api.aiscr.cz/schema/amcr/2.0/amcr.xsd</schema>
    <metadataNamespace>https://api.aiscr.cz/schema/amcr/2.0/</metadataNamespace>
  </metadataFormat>
</ListMetadataFormats>
```
:::

Supported metadata formats are:

- Dublin Core -- `metadataPrefix=oai_dc`, schema available [here](http://www.openarchives.org/OAI/2.0/oai_dc.xsd){.external},
- native AMCR *.xml* format -- `metadataPrefix=oai_amcr`, schema available [here](https://api.aiscr.cz/schema/amcr/2.0/).

### ListSets

Verb `ListSets` lists the available sets.

*Request:* [https://api.aiscr.cz/2.0/oai?verb=ListSets](https://api.aiscr.cz/2.0/oai?verb=ListSets){.external}

```zsh
curl 'https://api.aiscr.cz/2.0/oai?verb=ListSets'
```

*Sample response:*

::: {.small}
```xml
<responseDate>2024-07-11T13:28:01Z</responseDate>
<request verb="ListSets">https://api.aiscr.cz/oai</request>
<ListSets>
  <set>
    <setSpec>projekt</setSpec>
    <setName>Projekty / Projects</setName>
    <setDescription>
      Evidenční jednotky terénní činnosti badatelského nebo záchranného rázu evidované již ve fázi přípravy. Pro vymezení projektu je rozhodující podnět k výzkumu a provádějící subjekt (oprávněná organizace), lokalizace a příp. projektová dokumentace. Na projekt zpravidla navazuje jedna či (méně často) více terénních akcí či evidence samostatných nálezů.
      Records of field activities of a research or development-led nature recorded in the preparation phase. For the definition of the project, the motivation and the implementing body (authorised organisation), the location and, if applicable, the project documentation are decisive. The project is usually followed by one or (less frequently) more fieldwork, or the recording of individual finds.
    </setDescription>
  </set>
  [...]
</ListSets>
```
:::

### ListIdentifiers

Verb `ListIdentifiers` lists record headers.
The required argument is `metadataPrefix`; optional arguments allow filtering by predefined sets (`set`) and/or datestamps (`from`, `until`). Only the first page, with a limited number of records, is returned. To get the next page, the user has to submit the `resumptionToken` returned in the response.

*Request:* [https://api.aiscr.cz/2.0/oai?verb=ListIdentifiers&metadataPrefix=oai_amcr](https://api.aiscr.cz/2.0/oai?verb=ListIdentifiers&metadataPrefix=oai_amcr){.external}

```zsh
curl 'https://api.aiscr.cz/2.0/oai?verb=ListIdentifiers&metadataPrefix=oai_amcr'
```

*Sample response:*

::: {.small}
```xml
<responseDate>2024-07-11T12:51:03Z</responseDate>
<request verb="ListIdentifiers" metadataPrefix="oai_amcr">https://api.aiscr.cz/oai</request>
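<header>
  <identifier>https://api.aiscr.cz/id/C-202013149</identifier>
  <datestamp>2024-07-11T12:27:13.968Z</datestamp>
  <setSpec>projekt</setSpec>
</header>
<header>
  <identifier>https://api.aiscr.cz/id/P-0334-000377</identifier>
  <datestamp>2024-07-11T12:12:19.386Z</datestamp>
  <setSpec>pian</setSpec>
</header>
<header>
  <identifier>https://api.aiscr.cz/id/C-DL-200400209</identifier>
  <datestamp>2024-07-11T12:10:48.403Z</datestamp>
  <setSpec>dokument</setSpec>
</header>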
[...]
<resumptionToken completeListSize="[...]" cursor="[...]" expirationDate="[...]">BF0A0598B38E42E3FEB4E639B2911C90</resumptionToken>
```
:::

### ListRecords

Verb `ListRecords` lists records. The required argument is `metadataPrefix`; optional arguments allow filtering by predefined sets (`set`) and/or datestamps (`from`, `until`). Only the first page, with a limited number of records, is returned. To get the next page, the user has to submit the `resumptionToken` returned in the response.

*Request:* [https://api.aiscr.cz/2.0/oai?verb=ListRecords&metadataPrefix=oai_dc](https://api.aiscr.cz/2.0/oai?verb=ListRecords&metadataPrefix=oai_dc){.external}

```zsh
curl 'https://api.aiscr.cz/2.0/oai?verb=ListRecords&metadataPrefix=oai_dc'
```

*Sample response:*

::: {.small}
```xml
<responseDate>2024-07-11T13:25:02Z</responseDate>
<request verb="ListRecords" metadataPrefix="oai_dc">https://api.aiscr.cz/oai</request>
<header>
  <identifier>https://api.aiscr.cz/id/C-202013149</identifier>
  <datestamp>2024-07-11T12:27:13.968Z</datestamp>
  <setSpec>projekt</setSpec>
</header>
AMCR - projekt C-202013149 C-202013149 projekt Stav: 6 http://base_url/id/HES-000865 [...]
<resumptionToken completeListSize="[...]" cursor="[...]" expirationDate="[...]">BF0A0598B38E42E3FEB4E639B2911C90</resumptionToken>
```
:::

### GetRecord

Verb `GetRecord` returns an individual record. Required arguments are `metadataPrefix` and `identifier` (the unique identifier of the record, a URI). The unique identifier of each record is given in the record header, in the `<identifier>` element. The contents of the `<identifier>` element are derived from the unique identifier of the given record within AMCR (`M-FT-110598700` in the example below), prefixed with a namespace according to the API endpoint, i.e. the string `https://api.aiscr.cz/id/`. The resulting URL acts as an alias for `GetRecord` in the `oai_amcr` schema for the specified record and is automatically translated into the appropriate request. It can therefore be used to refer to specific records in the API.

*Request:* [https://api.aiscr.cz/2.0/oai?verb=GetRecord&identifier=https://api.aiscr.cz/id/M-FT-110598700&metadataPrefix=oai_amcr](https://api.aiscr.cz/2.0/oai?verb=GetRecord&identifier=https://api.aiscr.cz/id/M-FT-110598700&metadataPrefix=oai_amcr){.external}

```zsh
curl 'https://api.aiscr.cz/2.0/oai?verb=GetRecord&identifier=https://api.aiscr.cz/id/M-FT-110598700&metadataPrefix=oai_amcr'
```

*Sample response:*

::: {.small}
```xml
<responseDate>2024-07-15T09:04:59Z</responseDate>
<request verb="GetRecord" identifier="https://api.aiscr.cz/id/M-FT-110598700" metadataPrefix="oai_amcr">https://api.aiscr.cz/oai</request>
<header>
  <identifier>https://api.aiscr.cz/id/M-FT-110598700</identifier>
  <datestamp>2024-05-09T12:39:30.474Z</datestamp>
  <setSpec>dokument</setSpec>
</header>
HTTP/1.1 403 Forbidden
```
:::

## Arguments

Some requests (verbs) have required arguments (typically `metadataPrefix`, which specifies the metadata format), and some have optional arguments that allow filtering, pagination etc.

```{r}
#| tbl-cap: "Required and optional query arguments"
read.csv("tabs/params.csv") |>
  knitr::kable()
```

### Pagination

For `ListIdentifiers`, `ListRecords` and `ListSets` requests, only the first page of results is returned. To get the next page, the `resumptionToken` returned at the end of the response must be used as an argument in the following request. The `resumptionToken` argument should always be the last argument in the request.

*Request:* [https://api.aiscr.cz/2.0/oai?verb=ListRecords&metadataPrefix=oai_amcr](https://api.aiscr.cz/2.0/oai?verb=ListRecords&metadataPrefix=oai_amcr){.external}

```zsh
curl 'https://api.aiscr.cz/2.0/oai?verb=ListRecords&metadataPrefix=oai_amcr'
```

*Sample response:*

::: {.small}
```xml
<responseDate>2024-07-15T11:57:23Z</responseDate>
<request verb="ListRecords" metadataPrefix="oai_amcr">https://api.aiscr.cz/oai</request>
[...]
[...]
<resumptionToken completeListSize="[...]" cursor="[...]" expirationDate="[...]">BF0A0598B38E42E3FEB4E639B2911C90</resumptionToken>
```
:::

*Follow-up request with `resumptionToken`:* [https://api.aiscr.cz/2.0/oai?verb=ListRecords&resumptionToken=BF0A0598B38E42E3FEB4E639B2911C90](https://api.aiscr.cz/2.0/oai?verb=ListRecords&resumptionToken=BF0A0598B38E42E3FEB4E639B2911C90){.external}

```zsh
curl 'https://api.aiscr.cz/2.0/oai?verb=ListRecords&resumptionToken=BF0A0598B38E42E3FEB4E639B2911C90'
```

::: {.small}
```xml
<responseDate>2024-07-15T12:01:28Z</responseDate>
<request verb="ListRecords" resumptionToken="BF0A0598B38E42E3FEB4E639B2911C90">https://api.aiscr.cz/oai</request>
[...]
[...]
<resumptionToken completeListSize="[...]" cursor="[...]" expirationDate="[...]">7C6DD5774E7C5493819FF48540A24546</resumptionToken>
```
:::

Note the attributes of the `resumptionToken` element: `completeListSize` gives the total number of results, and `cursor` indicates the position of the first record of the given response within the complete list. Each resumption token also has an expiration date, after which it can no longer be used to request the next page. Because the data in the API are constantly updated, it is advisable to harvest all pages in one go to avoid inconsistencies in the results.

### Filters

Selective harvesting is possible using sets and record datestamps.

#### Sets

The available sets of records are returned by the `ListSets` verb. The contents of the `<setSpec>` element can be used as the `set` argument in `ListIdentifiers` and `ListRecords` requests.

*Request:* [https://api.aiscr.cz/2.0/oai?verb=ListIdentifiers&set=pian&metadataPrefix=oai_amcr](https://api.aiscr.cz/2.0/oai?verb=ListIdentifiers&set=pian&metadataPrefix=oai_amcr){.external}

```zsh
curl 'https://api.aiscr.cz/2.0/oai?verb=ListIdentifiers&set=pian&metadataPrefix=oai_amcr'
```

*Sample response:*

::: {.small}
```xml
<responseDate>2024-07-15T12:17:21Z</responseDate>
<request verb="ListIdentifiers" set="pian" metadataPrefix="oai_amcr">https://api.aiscr.cz/oai</request>
<header>
  <identifier>https://api.aiscr.cz/id/P-1223-101288</identifier>
  <datestamp>2024-07-15T11:13:45.237Z</datestamp>
  <setSpec>pian</setSpec>
</header>
[...]
<resumptionToken completeListSize="[...]" cursor="[...]" expirationDate="[...]">BF0A0598B38E42E3FEB4E639B2911C90</resumptionToken>
```
:::

#### Datestamps

The `from` and `until` arguments in `ListIdentifiers` and `ListRecords` requests allow filtering by the datestamp of the given record. Use timestamps in the ISO 8601 format, i.e. `YYYY-MM-DDThh:mm:ssZ`, or a shorter form such as `YYYY-MM-DD`.

*Request:* [https://api.aiscr.cz/2.0/oai?verb=ListIdentifiers&set=projekt&from=2023-04-21&until=2024-04-27&metadataPrefix=oai_amcr](https://api.aiscr.cz/2.0/oai?verb=ListIdentifiers&set=projekt&from=2023-04-21&until=2024-04-27&metadataPrefix=oai_amcr){.external}

```zsh
curl 'https://api.aiscr.cz/2.0/oai?verb=ListIdentifiers&set=projekt&from=2023-04-21&until=2024-04-27&metadataPrefix=oai_amcr'
```

*Sample response:*

::: {.small}
```xml
<responseDate>2024-07-15T12:20:21Z</responseDate>
<request verb="ListIdentifiers" set="projekt" from="2023-04-21" until="2024-04-27" metadataPrefix="oai_amcr">https://api.aiscr.cz/oai</request>
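<header>
  <identifier>https://api.aiscr.cz/id/M-202204394</identifier>
  <datestamp>2024-04-26T13:08:37.256Z</datestamp>
  <setSpec>projekt</setSpec>
</header>
[...]
<header>
  <identifier>https://api.aiscr.cz/id/C-202401944</identifier>
  <datestamp>2024-03-17T06:32:08.143Z</datestamp>
  <setSpec>projekt</setSpec>
</header>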
<resumptionToken completeListSize="[...]" cursor="[...]" expirationDate="[...]">BF0A0598B38E42E3FEB4E639B2911C90</resumptionToken>
```
:::
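
The set and datestamp filters can be combined with pagination into an incremental harvest. The following is a minimal Python sketch, not a reference client. It assumes that responses use the standard OAI-PMH 2.0 namespace and that an empty or missing `resumptionToken` marks the last page. The helper names `build_url`, `parse_page`, `fetch_http` and `harvest_identifiers` are hypothetical; `fetch` is injectable so the loop can be tried without network access.

```python
import base64
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

BASE_URL = "https://api.aiscr.cz/2.0/oai"
# Assumption: responses use the standard OAI-PMH 2.0 namespace.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def build_url(verb, params):
    """Build a request URL from a verb and a dict of arguments."""
    query = {"verb": verb, **params}
    return BASE_URL + "?" + urllib.parse.urlencode(query)


def parse_page(xml_text):
    """Return (headers, resumption_token) for one response page."""
    root = ET.fromstring(xml_text)
    headers = [
        {
            "identifier": header.findtext(OAI_NS + "identifier"),
            "datestamp": header.findtext(OAI_NS + "datestamp"),
            # Deleted records carry status="deleted" on <header>.
            "deleted": header.get("status") == "deleted",
        }
        for header in root.iter(OAI_NS + "header")
    ]
    token_el = root.find(".//" + OAI_NS + "resumptionToken")
    token = (token_el.text or "").strip() if token_el is not None else ""
    return headers, token or None


def fetch_http(url, username=None, password=None):
    """Fetch a URL as bytes, optionally with Basic Access Authentication."""
    request = urllib.request.Request(url)
    if username is not None:
        credentials = f"{username}:{password}".encode("utf-8")
        encoded = base64.b64encode(credentials).decode("ascii")
        request.add_header("Authorization", "Basic " + encoded)
    with urllib.request.urlopen(request) as response:
        return response.read()


def harvest_identifiers(params, fetch=fetch_http):
    """Yield record headers from every page of a ListIdentifiers harvest."""
    url = build_url("ListIdentifiers", params)
    while url:
        headers, token = parse_page(fetch(url))
        yield from headers
        # Follow-up requests carry only the verb and the resumptionToken.
        url = build_url("ListIdentifiers", {"resumptionToken": token}) if token else None
```

Calling `harvest_identifiers({"metadataPrefix": "oai_amcr", "set": "projekt", "from": "2024-04-27"})` would walk through all pages of headers changed since the given date. Entries with `deleted` set to `True` indicate records to remove from a local copy.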