Deposit harvester

The deposit harvester allows you to retrieve metadata records for content that you’ve registered. The metadata retrieved is in our UNIXREF output format, which delivers the exact metadata submitted in a deposit, including any citations registered. Members (or their designated third parties) may only retrieve their own metadata.

The harvester uses the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) to deliver the metadata. The verbs Identify, ListMetadataFormats, ListSets, ListIdentifiers, ListRecords, and GetRecord are supported.

Ownership and retrieval restrictions - who can retrieve records?

The deposit harvester will only retrieve records for the authorized owner of the metadata records. Metadata ownership is established by the DOI prefix(es) associated with a user’s account (learn more about transferring responsibility for DOIs). Many members have one prefix and one account, but some members have multiple prefixes. For example, Member A has been assigned account abcd, which is associated with prefixes 10.xxxx, 10.yyyy, and 10.zzzz. Member A can retrieve metadata owned by prefixes 10.xxxx, 10.yyyy, and 10.zzzz using their abcd account.

Ownership of DOIs and titles often moves from member to member, so a title-owning prefix will not always match the prefix of the DOIs attached to the title. Retrieval permission is granted to the current owner, not the original depositor. For example, Member B registers identifier 10.5555/jfo.33425. Ownership of the journal and all identifiers is transferred to Member A with prefix 10.50505. The DOI is now “owned” by prefix 10.50505, and only Member A may harvest the metadata record for that identifier.


The deposit harvester supports a hierarchy of sets. The hierarchy has three parts: <work-type>:<prefix>:<publication-id>. For example, the set J:10.12345:6789 will return metadata for a journal (J), with prefix 10.12345, and publication ID 6789. The set B will return all book metadata. The set S:10.12345 will return all the series metadata associated with the 10.12345 prefix.

The work-type designators are:

  • J for journals
  • B for books and book-like works (reports, conference proceedings, standards, dissertations)
  • S for non-journal series and series-like works.

If no set is specified, the set “J” is used.
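The set hierarchy described above can be sketched as a small helper that joins the parts with colons. This is only an illustration: the function name build_set_spec and its validation rules are assumptions, not part of the harvester itself.

```python
from typing import Optional

# Work-type designators listed above.
WORK_TYPES = {"J", "B", "S"}


def build_set_spec(work_type: str = "J",
                   prefix: Optional[str] = None,
                   publication_id: Optional[str] = None) -> str:
    """Join <work-type>:<prefix>:<publication-id>, omitting trailing parts.

    Hypothetical helper for illustration; defaults to "J" to mirror the
    documented default set.
    """
    if work_type not in WORK_TYPES:
        raise ValueError(f"unknown work type: {work_type!r}")
    if publication_id is not None and prefix is None:
        raise ValueError("a publication ID requires a prefix")

    parts = [work_type]
    if prefix is not None:
        parts.append(prefix)
    if publication_id is not None:
        parts.append(str(publication_id))
    return ":".join(parts)


print(build_set_spec("J", "10.12345", "6789"))  # J:10.12345:6789
print(build_set_spec("S", "10.12345"))          # S:10.12345
print(build_set_spec("B"))                      # B
```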

Example requests


Retrieve list of titles owned by the prefixes assigned to your account:


Retrieve data for a prefix:

Retrieve data for a single title: ID&usr=username&pwd=password


Retrieve data for a single DOI:

When using GetRecord, the <DOI> value should be URL encoded.
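As a minimal sketch of that encoding step, Python's standard urllib.parse.quote can percent-encode a DOI so that characters such as the slash survive inside a query string. The helper name encode_doi is an assumption made for illustration.

```python
from urllib.parse import quote


def encode_doi(doi: str) -> str:
    """Percent-encode a DOI for use as a URL query value.

    safe="" makes quote() encode "/" as well, which it would
    otherwise leave untouched.
    """
    return quote(doi, safe="")


print(encode_doi("10.5555/jfo.33425"))  # 10.5555%2Fjfo.33425
```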


Check the status of the deposit harvester (no account needed):


List available metadata formats (currently UNIXREF):

Request parameters

  • work-type: J for journals, B for book or conference proceeding titles, S for series
  • prefix: the owning prefix of the title being retrieved
  • title ID: the title identification number assigned by us. Title IDs are included in the ListSets response described above.
  • username and password: account details for the prefix/title being retrieved
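The parameters above could be assembled into a request URL roughly as follows. This is a sketch under stated assumptions: BASE_URL is a hypothetical placeholder rather than the real harvester address, the function name is invented for illustration, and the metadataPrefix value must be taken from the ListMetadataFormats response (metadataPrefix itself is a standard OAI-PMH argument for ListRecords). The usr and pwd parameter names follow the example request fragment shown earlier.

```python
from urllib.parse import urlencode

# Hypothetical placeholder; substitute the actual harvester endpoint.
BASE_URL = "https://harvester.example.org/DepositHarvester"


def build_list_records_url(work_type: str, prefix: str, title_id: str,
                           username: str, password: str,
                           metadata_prefix: str,
                           base_url: str = BASE_URL) -> str:
    """Build a ListRecords URL for a single title (illustrative only)."""
    params = {
        "verb": "ListRecords",
        # Assumption: use a value reported by ListMetadataFormats.
        "metadataPrefix": metadata_prefix,
        "set": f"{work_type}:{prefix}:{title_id}",
        "usr": username,
        "pwd": password,
    }
    # urlencode() percent-encodes each value, including the colons in the set.
    return f"{base_url}?{urlencode(params)}"


url = build_list_records_url("J", "10.12345", "6789",
                             "username", "password", "example-format")
print(url)
```

Building the query string with urlencode rather than string concatenation keeps special characters in passwords or set names from corrupting the request.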


Results conform to Crossref’s UNIXREF format and may contain the following root elements:

  • journal
  • book
  • conference
  • dissertation
  • report-paper
  • standard
  • sa_component
  • database

Using resumption tokens with the deposit harvester

Some OAI-PMH requests are too big to be retrieved in a single transaction. If a response contains a resumption token, you must make an additional request to retrieve the rest of the data. You must provide the account name and password with both the initial request and each subsequent resumption request. A resumption request without authentication details will fail. Learn more about resumption tokens.

Initial request

Request with resumption token
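The initial-request and resumption-request cycle can be sketched as a loop that keeps requesting pages until no resumption token is returned. This is an illustration, not a reference client: the function names, the base_url argument, and the injectable fetch callable are assumptions, while the resumptionToken element and its namespace come from the OAI-PMH 2.0 specification. Credentials are attached to every request, as required above.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Dict, Iterator
from urllib.parse import urlencode
from urllib.request import urlopen

# XML namespace defined by the OAI-PMH 2.0 specification.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def http_get(url: str) -> bytes:
    """Fetch a URL and return the raw response body."""
    with urlopen(url) as response:
        return response.read()


def harvest_pages(base_url: str, first_params: Dict[str, str],
                  username: str, password: str,
                  fetch: Callable[[str], bytes] = http_get) -> Iterator[bytes]:
    """Yield each response page, following resumption tokens.

    Hypothetical helper: base_url and first_params are supplied by the caller.
    """
    credentials = {"usr": username, "pwd": password}
    params = {**first_params, **credentials}
    while True:
        body = fetch(f"{base_url}?{urlencode(params)}")
        yield body
        token = ET.fromstring(body).find(f".//{OAI_NS}resumptionToken")
        # A missing or empty token marks the final page.
        if token is None or not (token.text or "").strip():
            break
        # Resumption requests carry only the verb, the token, and credentials.
        params = {
            "verb": first_params["verb"],
            "resumptionToken": token.text.strip(),
            **credentials,
        }
```

Passing fetch as an argument makes the loop easy to exercise with canned responses before pointing it at a live service.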

Page owner: Patrick Polischuk   |   Last updated 2020-April-08