Blog

2023 public data file now available with new and improved retrieval options

We have some exciting news for fans of big batches of metadata: this year’s public data file is now available. Like in years past, we’ve wrapped up all of our metadata records into a single download for those who want to get started using all Crossref metadata records. We’ve once again made this year’s public data file available via Academic Torrents, and in response to some feedback we’ve received from public data file users, we’ve taken a few additional steps to make accessing this 185 gb file a little easier.

2022 public data file of more than 134 million metadata records now available

In 2020 we released our first public data file, something we’ve turned into an annual affair supporting our commitment to the Principles of Open Scholarly Infrastructure (POSI). We’ve just posted the 2022 file, which can now be downloaded via torrent like in years past. We aim to publish these in the first quarter of each year, though as you may notice, we’re a little behind our intended schedule. The reason for this delay was that we wanted to make critical new metadata fields available, including resource URLs and titles with markup.

With a little help from your Crossref friends: Better metadata

We talk so much about more and better metadata that a reasonable question might be: what is Crossref doing to help? Members and their service partners do the heavy lifting to provide Crossref with metadata and we don’t change what is supplied to us. One reason we don’t is because members can and often do change their records (important note: updated records do not incur fees!). However, we do a fair amount of behind the scenes work to check and report on the metadata as well as to add context and relationships.

A ROR-some update to our API

Earlier this year, Ginny posted an exciting update on Crossref’s progress with adopting ROR, the Research Organization Registry for affiliations, announcing that we’d started the collection of ROR identifiers in our metadata input schema. 🦁 The capacity to accept ROR IDs to help reliably identify institutions is really important but the real value comes from their open availability alongside the other metadata registered with us, such as for publications like journal articles, book chapters, preprints, and for other objects such as grants.

New public data file: 120+ million metadata records

Jennifer Kemp

Jennifer Kemp – 2021 January 19

In MetadataCommunityAPIs

2020 wasn’t all bad. In April of last year, we released our first public data file. Though Crossref metadata is always openly available––and our board recently cemented this by voting to adopt the Principles of Open Scholarly Infrastructure (POSI)</agic––we’ve decided to release an updated file. This will provide a more efficient way to get such a large volume of records. The file (JSON records, 102.6GB) is now available, with thanks once again to Academic Torrents.

Come for a swim in our new pool of Education materials

After 20 years in operation, and as our system matures from experimental to foundational infrastructure, it’s time to review our documentation. Having a solid core of education materials about the why and the how of Crossref is essential in making participation possible, easy, and equitable. As our system has evolved, our membership has grown and diversified, and so have our tools - both for depositing metadata with Crossref, and for retrieving and making use of it.

Helping researchers identify content they can text mine

TL;DR Many organizations are doing what they can to aid in the response to the COVID-19 pandemic. Crossref members can make it easier for researchers to identify, locate, and access content for text mining. In order to do this, members must include elements in their metadata that: Point to the full text of the content. Indicate that the content is available under an open access license or that it is being made available for free (gratis).

Free public data file of 112+ million Crossref records

A lot of people have been using our public, open APIs to collect data that might be related to COVID-19. This is great and we encourage it. We also want to make it easier. To that end we have made a free data file of the public elements from Crossref’s 112.5 million metadata records. The file (65GB, in JSON format) is available via Academic Torrents here: https://0-doi-org.lib.rivier.edu/10.13003/83B2GP It is important to note that Crossref metadata is always openly available.

Crossref metadata for bibliometrics

Our paper, Crossref: the sustainable source of community-owned scholarly metadata, was recently published in Quantitative Science Studies (MIT Press). The paper describes the scholarly metadata collected and made available by Crossref, as well as its importance in the scholarly research ecosystem.

Using the Crossref REST API (with Open Ukrainian Citation Index)

Over the past few years, I’ve been really interested in seeing the breadth of uses that the research community is finding for the Crossref REST API. When we ran Crossref LIVE Kyiv in March 2019, Serhii Nazarovets joined us to present his plans for the Open Ukrainian Citation Index, an initiative he explains below. But first an introduction to Serhii and his colleague Tetiana Borysova. Serhii Nazarovets is a Deputy Director for Research at the State Scientific and Technical Library of Ukraine.