Earlier this year, Ginny posted an exciting update on Crossref’s progress with adopting ROR, the Research Organization Registry for affiliations, announcing that we’d started the collection of ROR identifiers in our metadata input schema. 🦁
The capacity to accept ROR IDs to help reliably identify institutions is really important but the real value comes from their open availability alongside the other metadata registered with us, such as for publications like journal articles, book chapters, preprints, and for other objects such as grants.
Event Data is our service to capture online mentions of Crossref records. We monitor data archives, Wikipedia, social media, blogs, news, and other sources. Our main focus has been on gathering data from external sources, however we know that there is a great deal of Crossref metadata that can be made available as events. Earlier this year we started adding relationship metadata, and over the last few months we have been working on bringing in citations between records.
Tl;dr: Metadata for the (currently 26,000) grants that have been registered by our funder members is now available via the REST API. This is quite a milestone in our program to include funding in Crossref infrastructure and a step forward in our mission to connect all.the.things. This post gives you all the queries you might need to satisfy your curiosity and start to see what’s possible with deeper analysis. So have the look and see what useful things you can discover.
Update on the outage of October 6th. In my blog post on October 6th, I promised an update on what caused the outage and what we are doing to avoid it happening again. This is that update.
Crossref hosts its services in a hybrid environment. Our original services are all hosted in a data center in Massachusetts, but we host new services with a cloud provider. We also have a few R&D systems hosted with Hetzner.
There are now two versions of iThenticate available. Most subscribers are on v1, and the instruction on this website explain how to set up and use v1. Some new subscribers can start using v2 from September 2021 - we’ll let new subscribers know if v2 is appropriate for them when they apply. You can find out more about iThenticate v2 on our blog.
To work out which version you’re on, take a look at the website address that you use to access iThenticate. If you go to ithenticate.com then you are using v1. If you a bespoke URL, https://crossref-[your member ID].turnitin.com/ then you are using v2.
To download a Similarity Report as a print-friendly .pdf document, click the print icon at the bottom left of the Document Viewer.
The .pdf created is based on the current view of the Similarity Report, so a version created while in Match Overview will create a .pdf with color-coded highlights.
Filters and exclusions in individual Similarity Reports (v1)
You can use filters and exclusions to remove certain elements from being checked for similarity, and help you focus on more significant matches. The functions for excluding material are approximate - they are not perfectly accurate. Take care when choosing what to exclude, as you may miss important matches. At folder level, all users can set filters and exclusions, and administrators can also set URL filters and phrase exclusions. These settings will apply to any documents within the folder. But you can also set filters and exclusions on an individual document, so they only apply to the Similarity Report for that specific document.
Start from the Document Viewer, and click the filters icon at the bottom of the sidebar to see the Filters & Settings menu.
The filters and exclusions options are:
Exclude quoted or bibliographic material: Click the check-box next to Exclude Quotes or Exclude Bibliography, then click Apply Changes at the bottom of the Filter & Settings sidebar.
Exclude small sources: Click the check-box for excluding by words or %, and enter a numerical value for sources to be excluded from this Similarity Report. To turn off excluding small sources, select Don’t exclude by size. Click Apply Changes at the bottom of the Filter & Settings sidebar. This setting will affect the All Sources view of the side panel.
Exclude small matches: Under Exclude matches that are less than, choose words, and enter the numerical value for match instances to be excluded from this Similarity Report. To turn off excluding small matches, select Don’t exclude. Click Apply Changes at the bottom of the Filter & Settings sidebar. This setting will affect the Match Overview view of the side panel.
Exclude sections: Under Exclude Sections, choose the sections you would like to exclude:
methods and materials (including variations)
iThenticate will exclude sections of the submitted document with headers containing the excluded words: ‘abstract’, ‘method and materials’, ‘methods’, ‘method’, ‘materials’, and ‘materials and methods’.
Exclude a match (v1)
If you decide that a match does not need to be flagged, you can exclude the source from the Similarity Report through Match Breakdown or All Sources. The Similarity Score will be recalculated, and may change the current percentage of the Similarity Report.
To access Match Breakdown from Match Overview, hover over the match for which you would like to view the underlying sources, and click the arrow icon.
In Match Breakdown, click Exclude Sources, and select the sources you would like to remove by selecting the check-box next to each, then click the Exclude button.
To exclude an entire source match from All Sources, select Exclude Sources, select the sources you would like to remove by selecting the check-box next to each, then click the Exclude button.
Excluded sources lis (v1)
The excluded sources list shows all sources excluded from the Similarity Report. To see the excluded sources list, click the excluded sources icon at the bottom of the sidebar.
Click the check-box next to any source you would like to re-include in the Similarity Report, and click the Restore button to include the source in the Similarity Report. To restore all of the sources that were excluded from the report, click the Restore All button. The Similarity Score will be recalculated.
The text-only report (v1)
Start in the Document Viewer, and click the Text-Only Report button at the bottom right to see the Similarity Report without document formatting. The report will stay in text-only view mode (even if you close and reopen it) until you click Document Viewer to return to that mode.
Along the top of the screen, the document information bar shows important details about the submitted document (including the date the report was processed, word count, the folder the document was submitted from, the number of matching documents found in the selected databases and the similarity index), and a menu bar with various options. Use the information bar drop-down to switch between uploaded documents in the same folder.
The menu bar beneath the information bar has a mode selection drop-down menu, options to exclude quotes, bibliography, small sources, and small matches, as well as options to print and download.
Choose a viewing mode from the mode drop-down menu:
Similarity Report (default) - this mode has a similar layout to the Document Viewer. You will see the document’s text on the left of the screen, with similarities highlighted. On the right are the sources, color-coded and listed from highest to lowest percentage of matching words. Only the top or best matches are shown - choose Content Tracking mode to see all underlying matches.
Content tracking mode lists all the matches between the submitted document and the databases. Regular updates means that there may be many matches from the same source, some of which may be partially or completely hidden due to the content appearing in a higher matched source. The sources that are the same will specify from where they were taken and when.
Summary report mode offers a simple, printable list of the matches found followed by the paper with the matching areas highlighted. It shows the sources first, with the document text below.
Largest matches mode shows the percentage of words that are a part of a matching text string (with some limited flexibility). In some cases, strings from the same source may overlap, in which case, the longer string in the largest match view will be displayed.
You have options to filter and exclude:
Exclude quoted or bibliographic material - click Exclude Quotes or Exclude Bibliography from the menu bar.
Exclude phrases - click enable this setting for a folder means that any submission made to that folder will exclude the phrases specified in the folder settings. If you would like to include these phrases in the report, click Do not Exclude Phrases in the menu bar.
Exclude a match - use this to exclude a source from the Similarity Report in either the Similarity Report or largest matches viewing modes. To exclude a match, view the report in Similarity Report or largest matches mode. Each source listed has an X icon to its right - click this to exclude the source. Any underlying source, if present, will replace the excluded source. Once a source has been excluded it can be re-included in the Similarity Report through the content tracking mode, which lists all sources with content matching that of the submission. In this view mode, excluded sources have a + icon to the right of their name - click this to re-include the source in the Similarity Report.
Exclude small sources and matches - click Exclude small sources or Exclude small matches in the menu bar.
Exclude small sources - To exclude a small source, enter a value into the word count or percentage field to set an exclusion threshold. Any source below the word court or match percentage threshold will be excluded from the record. Click Update to save the exclusion setting.
Exclude small matches - To exclude a small match, enter a value into the word count field to set an exclusion threshold. Any match below that threshold will be excluded from the report. Click Update to save the exclusion setting.
Making these changes may change the percentage of matching text found within the submission. Deselect an option to include it again.