Best practice license information

Encouraging registration of better license metadata with Crossref

The information below sets out guidance for publishers on registering license metadata with Crossref. This is to help academic institutions: they need to identify the articles written by their researchers, and also understand how they may use them – and to do so in an automated, machine-readable way.

Institutions need to know which article version may be exposed on an open repository, and from what date. It is no longer sufficient to describe in words how to calculate the embargo end date, for example by referring institutions to a general set of terms and conditions that applies to all of your content across its whole lifecycle. They need to know whether this version of this article can be exposed on their repository and, if so, from what specific date, and what repository readers can then do with the content they find there.

The Crossref schema contains all the fields you need to specify this unambiguously. By doing so, you can also be more confident that institutions will have the information they need to respect your terms and conditions.

How Crossref collects license information

A single Crossref DOI can contain metadata relating to multiple versions of a work: the accepted manuscript (AM), the version of record (VoR), or a version that is intended for text and data mining (TDM). Each of these versions can have its own license conditions attached to it. To reflect this, works in Crossref can have multiple license elements. Each license element can contain a URL to a license, the article version the license applies to, and the license start date. Together, these can describe nuanced license terms across different versions of the work.

An analysis of Crossref metadata by Jisc found that while 48% of journal articles published in 2017 had license information, the licenses most often referred to the text and data mining version of the work, and licenses were still being used inconsistently for the VoR or AM.

A major concern is that many publishers link to their general terms and conditions rather than to licenses that apply at specific times to specific versions of a work. For example, a publisher may set its policies out in a general terms and conditions page, and link to it in the license metadata like so:

<license_ref applies_to="vor" start_date="2019-01-01">

On the terms and conditions page, the publisher could spell out, for example, the license that applies to the VoR, the restrictions that apply to the AM during its embargo period, and details of how the AM may be used after its embargo period. A repository manager would then have to go through the terms and conditions, and manually calculate the embargo end date, in order to determine whether the work could be deposited to a repository. This is a prohibitively onerous process for institutions, and risks content being used outside the terms of publisher policies because of human error.

It would be helpful if publishers could instead set out specific licenses for each stage in each article’s lifecycle, for each of its versions. If the licensing terms for a version will change (for example, because it may be exposed on a repository after an embargo period), then a separate license should be used, with the ‘start_date’ attribute indicating when the new license comes into effect. Using start dates for this license information is best practice in general, as it can validate immediate open access, which is at the heart of many institutional and funder policies. This is set out in more detail in the examples below.
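As a rough illustration of one license per lifecycle stage, the following Python sketch builds a set of license elements with a start date on each. It is a minimal sketch, not Crossref's own tooling: the helper name `build_license_program` and the `publisher.example` URLs are hypothetical, and the fragment it prints would still need to be placed inside a complete deposit that validates against the Crossref schema.

```python
import xml.etree.ElementTree as ET

# Namespace used for Crossref's access-indicator (license) elements.
AI_NS = "http://www.crossref.org/AccessIndicators.xsd"
ET.register_namespace("ai", AI_NS)


def build_license_program(licenses):
    """Return a program element holding one license_ref per lifecycle stage.

    `licenses` is a list of (version, start_date, url) tuples, where version
    is "vor", "am" or "tdm" and start_date is an ISO date string.
    (Hypothetical helper written for this illustration.)
    """
    program = ET.Element(f"{{{AI_NS}}}program", {"name": "AccessIndicators"})
    for version, start_date, url in licenses:
        ref = ET.SubElement(
            program,
            f"{{{AI_NS}}}license_ref",
            {"applies_to": version, "start_date": start_date},
        )
        ref.text = url
    return program


# Hypothetical publisher URLs; only the Creative Commons URL is a real license.
stages = [
    ("vor", "2019-01-01", "https://publisher.example/licenses/vor"),
    ("am", "2019-01-01", "https://publisher.example/licenses/am-embargo"),
    ("am", "2019-07-01", "https://creativecommons.org/licenses/by-nc-nd/4.0/"),
]

xml_text = ET.tostring(build_license_program(stages), encoding="unicode")
print(xml_text)
```

The AM appears twice because its terms change: one license covers the embargo period, and a second, with a later start date, takes over when the embargo ends.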

Example: Green OA with Creative Commons license

In this example, a work is published on 1 January 2019. Under the publisher’s policy, the VoR is under access controls. The AM is under embargo for a six-month period and then becomes open access under a CC BY-NC-ND license.

Figure: Green OA with Creative Commons license

By using a Creative Commons license with a start date, the embargo end date can be unambiguously deduced from the metadata.
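To show how the embargo end date might be deduced by a machine, here is a minimal Python sketch that reads license elements like those described in this example. The XML fragment is an assumed reconstruction of the scenario (publication on 1 January 2019, six-month AM embargo), not a copy of the figure; the `publisher.example` URLs and the function name `embargo_end` are hypothetical.

```python
import xml.etree.ElementTree as ET
from datetime import date

AI_NS = "http://www.crossref.org/AccessIndicators.xsd"

# Assumption: any URL containing this string is treated as a Creative Commons license.
CC_MARKER = "creativecommons.org/licenses/"

# Assumed metadata for the green OA example; publisher URLs are hypothetical.
GREEN_OA_XML = """
<ai:program xmlns:ai="http://www.crossref.org/AccessIndicators.xsd" name="AccessIndicators">
  <ai:license_ref applies_to="vor" start_date="2019-01-01">https://publisher.example/licenses/vor</ai:license_ref>
  <ai:license_ref applies_to="am" start_date="2019-01-01">https://publisher.example/licenses/am-embargo</ai:license_ref>
  <ai:license_ref applies_to="am" start_date="2019-07-01">https://creativecommons.org/licenses/by-nc-nd/4.0/</ai:license_ref>
</ai:program>
"""


def embargo_end(xml_text, version="am"):
    """Return the earliest start date of a CC license for `version`, or None."""
    root = ET.fromstring(xml_text)
    starts = []
    for ref in root.iter(f"{{{AI_NS}}}license_ref"):
        url = (ref.text or "").strip()
        start = ref.get("start_date")
        if ref.get("applies_to") == version and CC_MARKER in url and start:
            starts.append(date.fromisoformat(start))
    return min(starts) if starts else None


am_open_from = embargo_end(GREEN_OA_XML)
vor_open_from = embargo_end(GREEN_OA_XML, version="vor")
print(am_open_from, vor_open_from)
```

With this metadata the AM is found to be open from 1 July 2019, while no open license is found for the VoR, which matches the policy described in the example.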

Example: Green OA with publisher-defined post-embargo license

Linking to a Creative Commons license is optimal whenever possible, as this is an unambiguously open license and so will be readily recognizable as identifying the post-embargo period. It is also a standard license, which makes it more easily machine-readable. However, if you need to define your own open license, you can instead link to that in the metadata along with the appropriate start date.

Figure: Green OA with publisher-defined post-embargo license

Repository managers will still be able to unambiguously distinguish works that can be made available after an embargo period, albeit involving a brief manual check, provided the license identifies itself explicitly as referring specifically to the post-embargo period. It would not be suitable to provide a single URL containing license terms for both the pre-embargo and post-embargo period, for example:

<license_ref applies_to="am" start_date="2019-01-01">general_terms</license_ref>

This would not allow institutions to unambiguously determine the embargo end date and license, and so should be avoided.
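The difference between the two approaches can be sketched from the institution's side. In this hypothetical Python example, a repository keeps a short list of publisher-defined license URLs that a manager has checked by hand and confirmed to be explicitly post-embargo. All URLs, the set `VERIFIED_POST_EMBARGO` and the function `am_release_date` are assumptions made for illustration.

```python
from datetime import date

# Hypothetical: publisher-defined licenses that a repository manager has
# read once and confirmed to refer specifically to the post-embargo period.
VERIFIED_POST_EMBARGO = {"https://publisher.example/licenses/am-post-embargo"}


def am_release_date(license_refs, verified=VERIFIED_POST_EMBARGO):
    """Return the date the AM may be exposed, or None if it cannot be determined."""
    dates = [
        date.fromisoformat(ref["start_date"])
        for ref in license_refs
        if ref["applies_to"] == "am" and ref["url"] in verified
    ]
    return min(dates) if dates else None


# Separate licenses for the embargo and post-embargo periods.
specific = [
    {"applies_to": "am", "start_date": "2019-01-01",
     "url": "https://publisher.example/licenses/am-embargo"},
    {"applies_to": "am", "start_date": "2019-07-01",
     "url": "https://publisher.example/licenses/am-post-embargo"},
]

# A single URL covering both periods: no embargo end date is recoverable.
general = [
    {"applies_to": "am", "start_date": "2019-01-01",
     "url": "https://publisher.example/general_terms"},
]

print(am_release_date(specific))
print(am_release_date(general))
```

The first call yields a release date because the post-embargo license has its own URL and start date. The second yields nothing, since the only date in the metadata is the publication date and the terms behind the URL would have to be interpreted by a person for every article.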

Example: Gold OA

In the case of gold OA, the licenses are simple: both the AM and the VoR have an open license (in this example, CC BY) that starts no later than the date of publication. The start date could optionally be omitted entirely, since the license terms will apply for the article’s lifetime.
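A consumer of this metadata could handle the optional start date along the following lines. This is a minimal Python sketch under stated assumptions: the record layout (a list of dictionaries) and the function `is_open_on` are hypothetical, and a license with no start date is assumed to apply for the article's whole lifetime, as described above.

```python
from datetime import date

# Assumption: any URL containing this string is treated as an open CC license.
CC_MARKER = "creativecommons.org/licenses/"


def is_open_on(license_refs, version, on_date):
    """Return True if `version` has a CC license in effect on `on_date`."""
    for ref in license_refs:
        if ref["applies_to"] != version or CC_MARKER not in ref["url"]:
            continue
        start = ref.get("start_date")
        # A missing start date is read as "applies from the outset".
        if start is None or date.fromisoformat(start) <= on_date:
            return True
    return False


# Gold OA: both versions are CC BY; the VoR entry omits the start date.
gold = [
    {"applies_to": "am", "start_date": "2019-01-01",
     "url": "https://creativecommons.org/licenses/by/4.0/"},
    {"applies_to": "vor",
     "url": "https://creativecommons.org/licenses/by/4.0/"},
]

publication_day = date(2019, 1, 1)
print(is_open_on(gold, "am", publication_day))
print(is_open_on(gold, "vor", publication_day))
```

Both versions are reported as open on the publication date, which is how immediate open access can be confirmed from the metadata alone.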

Figure: Gold OA licensing

Use cases

Having clear, unambiguous license metadata will help institutions use the content within your terms and conditions. For example, an institution could use Crossref to find works published by researchers at their organisation (provided you have also populated the affiliations of all the (co-)authors), and check programmatically for the presence and with-effect dates of any open license(s). This would show whether (and if so when) the work can be exposed on their repository.
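Such a programmatic check might look like the Python sketch below. It works on a single work record shaped like the JSON returned by the Crossref REST API. The field names used here (`license`, `URL`, `start`, `date-parts`, `content-version`) reflect that API as best understood at the time of writing and should be checked against the current API documentation. The sample record, its DOI and the function name `open_license_dates` are illustrative only, and no network request is made.

```python
from datetime import date

# Assumption: any URL containing this string is treated as an open CC license.
CC_MARKER = "creativecommons.org/licenses/"


def open_license_dates(work):
    """Map each content version to the earliest start date of a CC license."""
    result = {}
    for lic in work.get("license", []):
        if CC_MARKER not in lic.get("URL", ""):
            continue
        parts = lic.get("start", {}).get("date-parts", [[]])[0]
        if len(parts) != 3 or None in parts:
            continue  # skip licenses without a full year-month-day start date
        start = date(*parts)
        version = lic.get("content-version", "unspecified")
        if version not in result or start < result[version]:
            result[version] = start
    return result


# Illustrative record only; the DOI and publisher URLs are made up.
sample_work = {
    "DOI": "10.5555/example",
    "license": [
        {"URL": "https://publisher.example/licenses/vor",
         "start": {"date-parts": [[2019, 1, 1]]}, "content-version": "vor"},
        {"URL": "https://publisher.example/licenses/am-embargo",
         "start": {"date-parts": [[2019, 1, 1]]}, "content-version": "am"},
        {"URL": "https://creativecommons.org/licenses/by-nc-nd/4.0/",
         "start": {"date-parts": [[2019, 7, 1]]}, "content-version": "am"},
    ],
}

open_dates = open_license_dates(sample_work)
can_expose_am = "am" in open_dates and open_dates["am"] <= date(2019, 9, 1)
print(open_dates, can_expose_am)
```

Here the repository would learn that the AM can be exposed from 1 July 2019 and that no open license is registered for the VoR.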

How to populate your Crossref metadata with license information

There are multiple ways that publishers can add license information to the metadata they deposit/have deposited with Crossref:
- Add license information to your regular Crossref deposits
- Upload a resource deposit with only license information to populate existing metadata records
- Upload a .csv file with license information to populate existing metadata records
- Add license information to existing or new records using Metadata Manager. (Please note, this new tool is currently in beta.)

Please contact our technical support specialists with any questions about registering or retrieving license metadata.

Last Updated: 2019 June 16 by Rachael Lammey