Crossref’s Board votes to adopt the Principles of Open Scholarly Infrastructure
TL;DR
On November 11th 2020, the Crossref Board voted to adopt the “Principles of Open Scholarly Infrastructure” (POSI). POSI is a list of sixteen commitments that will now guide the board, staff, and Crossref’s development as an organisation into the future. It is an important public statement to make in Crossref’s twentieth anniversary year. Crossref has followed principles since its founding, and meets most of the POSI, but publicly committing to a codified and measurable set of principles is a big step. If 2019 was a reflective turning point, and mid-2020 was about Crossref committing to open scholarly infrastructure and collaboration, this is now announcing a very deliberate path. And we’re just a little bit giddy about it.
Here is a picture of me being “giddy.”
If you just want to see the principles that the board has endorsed, you can see them here:
https://doi.org/10.24343/C34W2H
But if you also want some background and want to understand some of the implications of Crossref adopting the principles, read on…
Warning - this is a long post.
Background and Origins
Some of you may be surprised that we’ve done this - simply because you always assumed we operated under these principles anyway. And we have. Mostly.
The “Principles of Open Scholarly Infrastructure” were largely inspired by a set of uncodified rules and norms that Crossref had been operating under for years. So how did we get to this circular situation where we are making a big announcement about adopting something we have largely been doing anyway?
Six years ago I met with Cameron Neylon and Jennifer Lin when they were still at PLOS and we decided that we wanted to write a blog post about…
Well, it doesn’t really matter.
We never finished writing that blog post because we got distracted by an issue we kept seeing: services that the scholarly community depended on were increasingly taking directions that seemed antithetical to the community’s interests.
We were concerned because the scholarly community was becoming increasingly distrustful of infrastructure services. We wondered if there were any practices that we could point to that might mitigate the risk of infrastructure being co-opted and that would help build trust. Fortunately, we had two great models to look at:
- Crossref, which had a set of informal rules and norms that it had followed since its founding (e.g., transparency of operations, being business-model neutral, one member one vote).
- ORCID, an organisation that was spun out of Crossref and which had adopted a written set of principles, based largely on codifying practices that they had seen at Crossref.
And so we wrote these practices up and added a few that we thought were missing. And we posted a different blog post to the one we had originally planned. It was titled “The Principles of Open Scholarly Infrastructures.” And the blog post became popular. And we did a bunch of talks about the Principles. And, much to our surprise, POSI has influenced the directions and policies of a number of organisations and initiatives since, including SPARC, Invest in Open Infrastructure, Open Data Institute, OA Switchboard, and others.
Elsewhere, community organizations and like-minded community members helped further develop the implementation of POSI through discussions at FORCE11 and through additional blog posts and books. Some, like Dryad and ROR, started to work to align their organizational structure to embrace POSI.
And this left Crossref in a strange position. Although we were largely the inspiration for these Principles, we ourselves had never codified and adopted them.
Motivations. Why Now?
Because it is the right thing to do for those that currently depend on Crossref
It is a healthy thing for the organisation to do. Adopting these principles strengthens Crossref’s governance. After twenty years, Crossref infrastructure has become critical to a broad segment of the community. As our membership profile changes, and as our broader stakeholder community expands, we need to explicitly evolve our governance to reflect stakeholders. And it would be irresponsible to continue to have our governance guided by a set of informal conventions. Particularly in the context of a global political period where we’ve seen the informal operating conventions and policy understandings of at least two major democracies ignored or discarded.
Because it could help make the creation of new, sustainable, open scholarly infrastructure easier and less expensive
There is a lot of new interest in open scholarly infrastructure. New infrastructure services and systems are being proposed almost every month. Many of them seek extensive advice and consulting from Crossref. A subset of these are incubated through Crossref. And a subset of these become Crossref services. Others are spun out as separate organisations (e.g., ORCID) or were specifically initiated as collaborations (e.g., ROR).
Our experience has been that the vast majority of work involved in these infrastructure projects was in establishing trust amongst the stakeholder community. We think that Crossref adopting the principles will help to address fundamental questions about accountability and sustainability that are inevitably raised when a new constituency approaches Crossref with an idea for collaborating on a new or existing infrastructure service. In short, adopting the principles will make future collaboration easier.
Adopting the Principles: Plus ça change
The Principles of Open Scholarly Infrastructure (POSI) proposes three areas that an Open Infrastructure organisation can address in order to garner the trust of the broader scholarly community: accountability (governance), funding (sustainability), and protection of community interests (insurance).
POSI proposes a set of concrete commitments that an organisation can make to build trust in each of these areas. There are 16 such commitments. Of these 16 commitments, Crossref is already completely or partially meeting the requirements of 15. And adopting the 16th commitment just formalises a direction Crossref has been heading toward for several years.
Critically, “adopting” POSI does not mean that we have to instantly meet all of the criteria. After all, when ORCID adopted its principles, it didn’t meet any of them. They were adopted to make a statement of intent. And they were publicly adopted so that the community could measure the organisation’s progress as well as to allow the community to detect if ORCID started to stray from its stated intentions.
Adopting the principles is akin to adopting a mission statement or a vision statement. It is an aspirational guide, not a description of the status quo.
Having said that, the principles are more concrete than a mission or vision statement, and this makes them easier to measure.
It is also important to note that the criteria are designed to balance each other. So, for example, one would not want to change the governance or business model to better support the mission if doing so would also threaten the sustainability of the organisation.
And finally, meeting a commitment is an ongoing process - it is not a one-off event. The organisation needs to keep measuring its performance against the principles in order to make sure that it has not inadvertently regressed.
Implications
Before adopting the principles, we did a candid self-audit to see which ones we thought we currently met and which ones we still needed to work on.
The three areas and sixteen commitments that are proposed in POSI are all designed to ensure that an infrastructure can not be co-opted by a particular party or interest group.
And the last area, “Insurance,” is the backstop that makes sure that, if some in the community feel that the infrastructure organisation has gone in a radically wrong direction, they can recreate the infrastructure as it was when they were comfortable with it, and they will not be hindered by practices or policies that lock them into the existing organisation.
This “insurance” is very much inspired by Crossref. Crossref itself was built, in part, to make sure that publishers were not locked into platforms and that journals and societies were not locked into publishers. Using the indirect Crossref DOI linking mechanism ensures that content can move between platforms and publishers without breaking vital citation links. Moving between platforms or publishers is never easy. And it isn’t cheap. But using Crossref DOIs for citation links at least makes it possible.
Crossref has an extra insurance level as well. It is built on the DOI and Handle infrastructure. If Crossref were to take a direction that some of its members found unacceptable, those members could join another DOI Registry agency more amenable to them. It wouldn’t be easy. It wouldn’t be cheap. But it would be possible.
And this knowledge helps keep Crossref grounded and attuned to the needs and concerns of its members. We know that our members are not “trapped” with us. We don’t take lightly the trust placed in us. And we know that there is trust still to build with various corners of our community. And it is this knowledge that helps keep us from developing the disdainful, take-it-or-leave-it, attitude that can be the cliché characteristic of infrastructure organisations.
So the fundamental, overarching goal of POSI is to set out principles that ensure that the stakeholders of an infrastructure organisation have a clear say in setting its agenda and priorities and that, in extremis, the stakeholders can leave and create an alternative infrastructure if the original organisation becomes unresponsive, hostile, or disappears.
As we look at how Crossref currently maps to the principles, please keep in mind three things:
- If we have marked something as green, that doesn’t mean we think we do this perfectly. It simply means that we already have internal processes that focus on this commitment and we have evidence that these processes have thus far been working.
- The fact that something is green and has “thus-far been working” does not mean that we should rest easy. We could regress. Our processes need to be able to detect and address regressions.
- The commitments are supposed to be balanced. So we don’t want to do something to turn something green if it has an irreversible impact on another commitment. So, for example, we should not address a shortfall in the contingency fund by generating revenue in a way that ultimately hurts Crossref’s mission.
The implication of the last point above is that it may take us some time to meet all of the commitments. But again, the community can measure our progress against meeting the commitments.
So how does Crossref currently meet POSI?
Governance
🟢 Coverage across the research enterprise.
🟢 Non-discriminatory membership
🟢 Transparent operations
🟢 Cannot lobby
🟢 Living will
🟢 Formal incentives to fulfil mission & wind-down
🔴 Stakeholder Governed
Sustainability
🟢 Time-limited funds are used only for time-limited activities.
🟢 Goal to generate surplus
🟡 Goal to create contingency fund to support operations for 12 months
🟢 Mission-consistent revenue generation
🟢 Revenue based on services, not data
Insurance
🟢 Available data (within constraints of privacy laws)
🟡 Patent non-assertion
🟡 Open source
🟡 Open data (within constraints of privacy laws)
Governance
If an infrastructure is successful and becomes critical to the community, we need to ensure it is not co-opted by particular interest groups. Similarly, we need to ensure that any organisation does not confuse serving itself with serving its stakeholders. How do we ensure that the system is run “humbly”, that it recognises it doesn’t have a right to exist beyond the support it provides for the community and that it plans accordingly? How do we ensure that the system remains responsive to the changing needs of the community?
– POSI
In the area of governance, Crossref clearly meets six of the seven criteria listed. We will discuss these first.
🟢 Coverage across the research enterprise
it is increasingly clear that research transcends disciplines, geography, institutions and stakeholders. The infrastructure that supports it needs to do the same.
– POSI
Crossref includes members who publish in the STM, HSS and Professional spheres. There are still some gaps in our coverage (e.g., monographs, law), but this is not through policy or lack of trying.
Crossref has members in 139 countries and has agreements with people in 150 countries. However, note that geographic diversity is not the same as language diversity. Although we have members in many countries, the vast majority of our registered content is still in English. This does not reflect the trends in research outputs. We still need to do a lot of work to support non-English publications and non-English speaking members. But we have already identified this as a priority and are working on a number of initiatives to better support research communication in languages other than English.
🟢 Non-discriminatory membership
we see the best option as an “opt-in” approach with a principle of non-discrimination where any stakeholder group may express an interest and should be welcome. The process of representation in day to day governance must also be inclusive with governance that reflects the demographics of the membership
– POSI
It is first worth noting that “non-discriminatory” does not mean that we cannot have standards, obligations, and rules that all members of Crossref have to adhere to. It simply means that said rules are clear and that we apply them uniformly.
Crossref has always had catholic membership criteria. Although we have historically defined ourselves as primarily a “publisher” organisation, we define “publisher” loosely as anybody who produces content that commonly references or is referenced by scholarly literature. Historically, this has included NGOs, IGOs, standards bodies, institutional archives, and professional publishers. More recently it has expanded to include preprint archives and funders.
The requirements for joining Crossref are few. We admit any applicant who:
- Agrees to the obligations of membership.
- Can pay the fees.
In practice we have historically had a policy of rejecting individuals as members. But even this is probably a pointless distinction as many of our members are “organisations” consisting of one person.
And fundamental to Crossref’s governance is that a member’s influence in the governance of Crossref is not tied to the level of financial investment they make in the organisation. All members have the same single vote. All board members have one vote.
Recently, we have also made two changes to our governance and election process. The first was to introduce contested elections for the board. The second was to ensure that board membership was proportionally balanced amongst the membership tiers. Even as recently as 2017, when the Board established a Governance Committee, the idea of weighting votes to membership tiers was roundly rejected - on principle.
This is not to say that we can relax on this point. For example, as more funders and institutions join Crossref, we will need to make sure that our governance reflects that. We talk about this more in the section on governance.
Some will also point out that our fees are themselves a form of discrimination as they can still be an insurmountable barrier to some in the community. We understand this and, without trying to make light of or dismiss the situation, we are also confident that we are constantly looking at ways to lower the barrier-to-entry for joining Crossref. Our fees have gone steadily down since we were founded and we are constantly reviewing them to try and make them more equitable. We have created a category of sponsoring organisations to defray the costs of membership. We collaborate closely with organisations like PKP to try and build tools and services that make participation in Crossref easier and less expensive.
🟢 Transparent operations
achieving trust in the selection of representatives to governance groups will be best achieved through transparent processes and operations in general (within the constraints of privacy laws).
– POSI
Crossref has transparent finances and a transparent governance process. Much of this is simply a byproduct of the regulations governing non-profits with tax exempt status in the US and our specific registration as a non-profit membership association in New York State.
Until fairly recently, the obvious exception to this was Crossref’s use of pre-picked slates in board elections, but we have since improved this with an open election process.
🟢 Cannot lobby
the community, not infrastructure organisations, should collectively drive regulatory change. An infrastructure organisation’s role is to provide a base for others to work on and should depend on its community to support the creation of a legislative environment that affects it
– POSI
Crossref has never lobbied. Partly this is a byproduct of our commitment to be business-model neutral as most lobbying efforts in the industry seem to center around promoting the views held by members who share a business model.
But also, Crossref has never lobbied on its own behalf. We have always relied on our members and the community to point out and promote Crossref if there is any area of legislative policy that the Crossref infrastructure could help with.
🟢 Living will
a powerful way to create trust is to publicly describe a plan addressing the condition under which an organisation would be wound down, how this would happen, and how any ongoing assets could be archived and preserved when passed to a successor organisation. Any such organisation would need to honour this same set of principles
– POSI
Crossref has two relationships that require us to set out plans for an orderly wind-down.
The first is a condition of our incorporation as a non-profit in the state of New York. This explicitly includes a provision that requires us to hand over our operations and responsibilities to a successor non profit organisation that has a similar constituency and mission. The NY State Attorney General reviews and approves any major changes to ensure this requirement is met.
The second is a condition of our being members of the DOI Foundation, which includes provisions for us to hand over management of DOIs to another registration agency should Crossref ever wind-down. It is worth noting that we have already seen this clause invoked for other registration agencies that have wound down and who have, as part of the DOI Foundation provisions, handed responsibility for their DOIs to Crossref.
This is not to say that we are perfect on this score. We do not, for example, have any single place that outlines the steps that would need to be taken in order to execute the requirements laid out by our obligations to the state of New York and the DOI Foundation.
🟢 Formal incentives to fulfil mission & wind-down
infrastructures exist for a specific purpose and that purpose can be radically simplified or even rendered unnecessary by technological or social change. If it is possible the organisation (and staff) should have direct incentives to deliver on the mission and wind down.
– POSI
Crossref has a track record of periodically reviewing our services and decommissioning those that are no longer needed - either because they have fulfilled their specific mission or because there is simply waning interest in them (arguably, the same thing).
Again, this is not to say we are perfect on this score. We also have, by our last count, about 30 specialised, overlapping APIs, many of which are used by just a handful of users. These have escaped our normal scrutiny because they never had the status of a formal service and had not been through our product management process.
But still, Crossref has long made it a habit to question its own existence. At virtually every board annual strategy meeting we ask the question “will technology X make Crossref unnecessary?” We need to continue with the attitude that the best thing we could do for our members is to make ourselves unnecessary.
🔴 Stakeholder Governed
a board-governed organisation drawn from the stakeholder community builds more confidence that the organisation will take decisions driven by community consensus and consideration of different interests.
– POSI
Overall, Crossref meets most of the Governance requirements with the notable exception of broader stakeholder involvement.
Of course, the key to this is how you define “stakeholder.”
Some may dispute this and argue that Crossref “stakeholders” are “publishers” because they are the parties that invested in creating Crossref.
But this narrow definition of “stakeholder” - focusing solely on those who have “invested” - is not widely held. In fact, common phrases like “stakeholder economy” and “stakeholder capitalism” describe the exact opposite - systems that don’t just focus on the “investor”, but which instead balance benefits to the investor with benefits to employees, the broader community, society, and the environment.
It is this latter, broader definition of “stakeholder” that is used in POSI.
And just in case anybody still thinks that people other than publishers don’t consider themselves “stakeholders” in the Crossref infrastructure, we simply point to this, recently tweeted by Brea Manuel, a researcher, in celebration of their publication in Nature Reviews Chemistry (read it, and learn how to recruit and retain a diverse workforce):
Sustainability
Financial sustainability is a key element of creating trust. “Trust” often elides multiple elements: intentions, resources, and checks and balances. An organisation that is both well meaning and has the right expertise will still not be trusted if it does not have sustainable resources to execute its mission. How do we ensure that an organisation has the resources to meet its obligations?
– POSI
In the area of sustainability, Crossref clearly meets four of the five criteria listed and is most of the way to meeting the fifth.
🟢 Time-limited funds are used only for time-limited activities
day to day operations should be supported by day to day sustainable revenue sources. Grant dependency for funding operations makes them fragile and more easily distracted from building core infrastructure.
– POSI
Crossref has never supported production activities based on grants. Indeed Crossref’s delivery on this point is what inspired the approach taken in this principle. This distinguishes Crossref from many grant-funded infrastructure initiatives which either barely stay afloat or disappear altogether. Even those that survive often do so by pursuing solutions that align with their funders’ interests over their users’ needs.
🟢 Goal to generate surplus
organisations which define sustainability based merely on recovering costs are brittle and stagnant. It is not enough to merely survive, it has to be able to adapt and change. To weather economic, social and technological volatility, they need financial resources beyond immediate operating costs.
– POSI
Crossref has always attempted to generate a surplus. Crossref has generated surpluses since 2002 - so for 18 years of its 20 year existence.
🟡 Goal to create contingency fund to support operations for 12 months
a high priority should be generating a contingency fund that can support a complete, orderly wind down (12 months in most cases). This fund should be separate from those allocated to covering operating risk and investment in development.
– POSI
Crossref currently has a contingency fund that would support operations for 9 months. Although this may be standard for the industry, it seems prudent to extend this in the case of infrastructure organisations, particularly when they are membership organisations. First, the very fact that something is infrastructure implies that the systemic effects of its failing ungracefully could have industry-wide repercussions. Second, the decision-making process of a membership organisation whose governance is voluntary is inherently slower. It has taken the Crossref Board 9 months, for example, just to discuss the ramifications of adopting POSI.
Given our recent financial performance, we expect Crossref could comfortably increase the contingency fund to support 12 months of operations within the next 2-3 years.
🟢 Mission-consistent revenue generation
potential revenue sources should be considered for consistency with the organisational mission and not run counter to the aims of the organisation.
– POSI
Crossref has a good track record of periodically reviewing our services and fees and adjusting them to better support Crossref’s mission. The role of the Membership & Fees Committee in advising the Board has been critical. The very first example of this was in the early days of Crossref when we dropped matching fees because they were disincentivising members from linking their references. Crossref was also quick to recognise that, in order to support global research and reach smaller publishers in lower income countries, we had to develop a sponsoring mechanism to help defray the costs and ameliorate the technical complexity of participating in Crossref. Most recently we have taken the decision to drop fees for Crossmark as it was clear they had become a barrier to our members distributing retraction and correction notifications in a machine actionable format.
🟢 Revenue based on services, not data
data related to the running of the research enterprise should be a community property. Appropriate revenue sources might include value-added services, consulting, API Service Level Agreements or membership fees
– POSI
Crossref does not charge for or resell its members’ data. Doing so would restrict dissemination and reduce the discoverability of our members’ content. Instead our revenue comes from a combination of membership fees and service fees. The DOI registration is a member service that generates the bulk of our revenue. But our SLA-backed APIs are becoming increasingly popular as members and others seek to integrate Crossref metadata into their production workflows and services.
Insurance
Even with the best possible governance structures, critical infrastructure can still be co opted by a subset of stakeholders or simply drift away from the needs of the community. Long term trust requires the community to believe it retains control. Here we can learn from Open Source practices. To ensure that the community can take control if necessary, the infrastructure must be “forkable.” The community could replicate the entire system if the organisation loses the support of stakeholders, despite all established checks and balances. Each crucial part then must be legally and technically capable of replication, including software systems and data. Forking carries a high cost, and in practice this would always remain challenging. But the ability of the community to recreate the infrastructure will create confidence in the system. The possibility of forking prompts all players to work well together, spurring a virtuous cycle. Acts that reduce the feasibility of forking then are strong signals that concerns should be raised. The following principles should ensure that, as a whole, the organisation in extremis is forkable.
– POSI
Crossref clearly meets two of the four Insurance requirements. And the remaining two can be met easily with some clarification and time.
The “governance” section of POSI is designed to ensure that an infrastructure organisation is beholden to the broader stakeholder community and that it can not be co-opted by a particular party or special interest. And the “sustainability” section of POSI is designed to ensure that the infrastructure organisation takes the financial steps to ensure it can weather sudden changes in the financial or technical environment. But the last section, “insurance” is designed to protect stakeholder interests in case either “governance” or “sustainability” fail.
The term “forkable” comes from the Open Source software community where it is used to indicate when a software community’s interests diverge and they decide to split a project into several projects, with each new project focusing on a particular sub-community’s interests.
One of the immediate worries that people have when they first hear of the concept of “forkability” is that it will encourage the creation of many variations of a project based on frivolous criteria. But this simply does not happen. Forking a project is never easy and takes a lot of effort. It is only done successfully when a critical mass of the community becomes unhappy with the direction a project is taking and is willing to take on the substantial burden of running an entirely separate project. Without such a critical mass, the fork just withers and has virtually no effect on the original project.
And the reason for this is simple: the mere knowledge that a project is “forkable” forces project maintainers to balance the interests of the community so that no sizable subgroup grows dissatisfied enough to fork the project.
Forkability encourages responsiveness to the community by making sure that the community is not “locked-in.”
Crossref itself was founded, in part, to prevent lock-in. Use of the DOI in linking citations makes it easier for publishers to move platforms, and for journals and societies to move between publishers.
And Crossref itself is architected in part to ensure that lock-in is not possible. Crossref is just one of several DOI registration agencies. Members unhappy with Crossref can move to another DOI registration agency and their citation links will continue to work. But there are things we could do to make this even easier.
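The protection against lock-in described here rests on indirection: citations point at a persistent identifier, and only the record behind that identifier changes when content moves. The following is a toy sketch of that idea, not a description of how the DOI or Handle systems are actually implemented. The `ToyResolver` class, the DOI, and the URLs are all hypothetical and exist only for illustration.

```python
# Toy model of identifier indirection (illustrative only; this is NOT
# how Crossref or the DOI/Handle infrastructure is implemented).
from dataclasses import dataclass, field


@dataclass
class ToyResolver:
    """Maps a persistent identifier to the current location of the content."""
    table: dict = field(default_factory=dict)

    def register(self, doi: str, url: str) -> None:
        # Record where the content lives when the identifier is first created.
        self.table[doi] = url

    def update(self, doi: str, new_url: str) -> None:
        # Content has moved: change the record, not the identifier.
        if doi not in self.table:
            raise KeyError(f"unknown DOI: {doi}")
        self.table[doi] = new_url

    def resolve(self, doi: str) -> str:
        return self.table[doi]


resolver = ToyResolver()
resolver.register("10.5555/example.1", "https://old-platform.example/articles/1")

# A citing article stores only the identifier, never the platform URL.
citation = "10.5555/example.1"
before_move = resolver.resolve(citation)

# The journal moves to a new platform; the citation itself is untouched.
resolver.update("10.5555/example.1", "https://new-platform.example/articles/1")
after_move = resolver.resolve(citation)

print(before_move, "->", after_move)
```

The citation string never changes, so a move between platforms or publishers costs one record update rather than a rewrite of every citing article.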
🟢 Available data (within constraints of privacy laws)
It is not enough that the data be made “open” if there is not a practical way to actually obtain it. Underlying data should be made easily available via periodic data dumps.
– POSI
Crossref provides public APIs that allow users to access Crossref metadata. We are planning to eventually release yearly public data files. We already did this once when we released a public data file in support of COVID-19 research. This in no way prevents the provision of data through paid Service Level Agreement tiers that provide guarantees of regularity, availability or reliability for those that need it. Existing Metadata Plus customers primarily use data that is available through the open API or existing dumps, but value additional services that support their use-cases.
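For readers who have not used the public API, here is a minimal sketch of looking up the metadata for a single DOI. It assumes the public REST API at `api.crossref.org`, its `/works/{doi}` route, the optional `mailto` query parameter, and a JSON response that wraps the record in a `message` field. The helper names `work_url` and `fetch_work`, the example DOI, and the email address are made up for illustration, so check the current API documentation before relying on any of this.

```python
# Minimal sketch of a metadata lookup against the public REST API.
# Assumptions: the https://api.crossref.org/works/{doi} route, the optional
# "mailto" query parameter, and a JSON envelope with a "message" field.
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen

API_BASE = "https://api.crossref.org/works/"


def work_url(doi, mailto=None):
    """Build the lookup URL for one DOI (hypothetical helper)."""
    url = API_BASE + quote(doi, safe="/")
    if mailto:
        # Identifying yourself is optional; the address here is a placeholder.
        url += "?" + urlencode({"mailto": mailto})
    return url


def fetch_work(doi, mailto=None, timeout=30):
    """Fetch and return the metadata record for one DOI (needs network access)."""
    with urlopen(work_url(doi, mailto), timeout=timeout) as response:
        return json.load(response)["message"]


url = work_url("10.5555/12345678", mailto="you@example.org")
print(url)
```

Only the URL is built here. Calling `fetch_work` performs a real request, so it is left to the reader to run.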
🟡 Patent non-assertion
The organisation should commit to a patent non-assertion covenant. The organisation may obtain patents to protect its own operations, but not use them to prevent the community from replicating the infrastructure.
– POSI
Crossref has never registered a patent. But the DOI Foundation, with significant support from Crossref, had to respond to (and then monitored) a set of patent applications that, had they succeeded, the DOI System would have infringed. The applications were filed more than 15 years ago and have not been successful, so they are not a current concern. As a result, the DOI Foundation adopted a patent policy in 2005 that covers all Registration Agencies and protects the DOI System. We may want to register protective patents in the future in order to enable us to defend ourselves against patent trolls.
The problem with patents is that they could be used by an organisation to prevent the infrastructure from being forked. One technique that has been used by major companies to assure communities that they will not be affected by patents is to make a patent non-assertion covenant. For example, IBM, Microsoft and Google have made non-assertion statements in order to assure the open source and standards communities that they participate in that they will not co-opt an open source project or open standard by asserting patents on code or processes they contribute.
Though Crossref has never registered a patent, issuing a patent non-assertion covenant would help assure stakeholders that we would not use patents in the future to prevent the community from forking the system.
🟡 Open source
All software required to run the infrastructure should be available under an open source license. This does not include other software that may be involved with running the organisation.
– POSI
All code for new initiatives since 2007 has been released under an open source MIT license. The legacy Content System code could be open sourced within 12-18 months with no extra effort.
If some Crossref stakeholders wanted to “fork” Crossref or leave for another DOI registration agency, their biggest hurdle would be trying to recreate the twenty years’ worth of rules and algorithms we use for processing and matching metadata. Without access to the source code of the system, it would be almost impossible to reverse engineer these.
Similarly, without access to the source code of our system, it is difficult to verify that Crossref is, indeed, non-discriminatory in the way it works with member content. It would be possible, for example, for Crossref to modify its matching algorithms to deliberately favour or deprecate some members’ content.
If we want to assure the community that we are managing our member metadata fairly and if we want to provide even better insurance to our members and the broader stakeholders, we should make all of our code open source.
The legacy so-called “CS” (content system) is in the process of being refactored. The only reason we cannot open source this immediately is that we still need to make some security changes to it. These security changes are being done as part of a current refactoring project and should be completed without any extra effort within 12-18 months. After that, we can open source the code.
🟡 Open data (within constraints of privacy laws)
For an infrastructure to be forked it will be necessary to replicate all relevant data. The CC0 waiver is best practice in making data legally available. Privacy and data protection laws will limit the extent to which this is possible.
– POSI
Achieving this simply requires us to clarify our copyright and license information. Doing so will not have any effect on the metadata registered in Crossref by our members.
First we should outline the current copyright status of a Crossref metadata record.
The fundamental issue is that what we colloquially call “Crossref metadata” is actually a mix of elements: some come from our members, some come from third parties and some come from Crossref itself. These elements, in turn, each have different copyright implications.
On top of this, Crossref has terms and conditions for its members and terms and conditions for specific services. These grant Crossref the right to do things with some classes of metadata and not do things with other classes of metadata - regardless of copyright.
Let’s start with the easiest case. Crossref already has two services with CC0 metadata:
- The Open Funder Registry
- Event Data
Obviously, the POSI open data provision would not change anything for either service.
The next easiest case is private data. Crossref collects PII (usernames, passwords, IP addresses, etc.). This would remain private. And we will continue to manage it in conformance with GDPR. It would not be affected by the open data provision of POSI.
Next let’s look at what most people probably think of as “Crossref metadata” - that is, the basic bibliographic metadata that Crossref has collected from its members since its founding (titles, authors, volumes, issues, etc.). For the record, this does not include abstracts.
Since 2000 Crossref has stated that it considers this basic bibliographic metadata to be “facts.” And under US law (Crossref is registered in the US) these facts are not subject to copyright at all. If this data is not subject to copyright at all, there is no way Crossref can “waive the copyright” under CC0. This metadata would not be affected at all under the open data provision of POSI.
More recently, some of our members have been submitting abstracts to Crossref. These are copyrighted. In the case of subscription publishers, the copyright usually belongs to the publisher. In the case of open access publishers, the copyright most often belongs to the authors. In both cases, Crossref cannot waive copyright under CC0 because the copyright is not ours to waive. However, we are allowed to redistribute the abstracts with our metadata because that is part of the terms and conditions we have with our members. We already have language that notes the distinct copyright status of the abstracts in our metadata, but, ideally, we should extend our schema to make that information available in a machine actionable form as well. In short, the copyright status of abstracts would not be affected at all by the open data provision of POSI.
Crossref also has its Reference Distribution Policy that the board adopted in 2017 - limited and closed references are not distributed by Crossref and this won’t change.
[EDIT 6th June 2022 - all references are now open by default with the March 2022 board vote to remove any restrictions on reference distribution].
And this leaves us with the one thing that would be affected by the open data provision of POSI: data that is created by Crossref itself as a byproduct of our services. By law, this data is under Crossref’s copyright unless we explicitly waive it. This data includes things like participation reports, conflict reports, member IDs and Cited-by counts (just the counts, not the references), as well as any aggregations of our otherwise uncopyrighted data that might, by virtue of being aggregated, be subject to sui generis database rights. At the moment, although we distribute this data freely and without restriction, we attach no explicit copyright statement to it. All we would be seeking to do is explicitly say that data generated by Crossref will be distributed CC0. Again, at first it would be enough to just specify this in human readable form, along with our other copyright information. But, eventually, we would want to include this information in machine actionable form in the metadata itself.
To summarise:
Metadata type | Example | Current Copyright | Change under POSI |
---|---|---|---|
Already CC0 | Open Funder Registry, Event Data | CC0 | None |
Private | Log files, user IDs | Private | None |
Bibliographic | Title, authors, volume, issue | Facts | None |
Closed references | | Facts - but no distribution under the reference distribution board policy from 2017 | None |
Limited references | | Facts - but no public distribution under the reference distribution board policy from 2017 | None |
Open references | | Facts | None |
Crossref-generated data | Participation data, reports, extracts | Copyright Crossref | CC0 |
[EDIT 6th June 2022 - all references are now open by default with the March 2022 board vote to remove any restrictions on reference distribution].
No member metadata will be affected by our adopting the open data provision of POSI. The only data that would be affected is data generated by Crossref itself.
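The summary above can also be restated as a small lookup, which hints at what a "machine actionable" version of this information might look like. This is only a sketch: the class, the key names and the structure are hypothetical and do not correspond to any existing Crossref schema. The abstracts entry is taken from the discussion above rather than from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CopyrightStatus:
    """Copyright status of one class of metadata, before and after POSI."""
    current: str
    under_posi: str

    @property
    def changes(self):
        # A class is affected only if its status differs once POSI is adopted.
        return self.current != self.under_posi


# Hypothetical keys; the values follow the summary table and the text above.
STATUS = {
    "already_cc0": CopyrightStatus("CC0", "CC0"),
    "private": CopyrightStatus("Private", "Private"),
    "bibliographic": CopyrightStatus("Facts", "Facts"),
    "abstracts": CopyrightStatus("Copyright publisher or authors",
                                 "Copyright publisher or authors"),
    "closed_references": CopyrightStatus("Facts (not distributed)",
                                         "Facts (not distributed)"),
    "limited_references": CopyrightStatus("Facts (no public distribution)",
                                          "Facts (no public distribution)"),
    "open_references": CopyrightStatus("Facts", "Facts"),
    "crossref_generated": CopyrightStatus("Copyright Crossref", "CC0"),
}


def affected_by_posi():
    """Return the metadata classes whose status would change under POSI."""
    return sorted(name for name, status in STATUS.items() if status.changes)


if __name__ == "__main__":
    print(affected_by_posi())
```

Run against this table, the only class reported as changing is the one for data generated by Crossref itself.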
However, the adoption of this principle would likely have an effect on our decisions about future services. For example, under this principle we would not launch any new services where the data was not freely reusable or the copyright of the data was not CC0.
Conclusion and Next steps
So again we face the paradox: we are announcing something that is simultaneously insignificant and important. It is insignificant in that we are simply saying that we will continue to do what we have largely been doing since Crossref was founded. But it is important because, in codifying what we have been doing, we are also confirming that these principles actually worked: they were essential to building the trust that allowed us to function over the past twenty years, and they will continue to be essential in the future, as we look to work with existing organisations to strengthen current infrastructures and with new stakeholders to develop new infrastructures.
So much of the work in building scholarly infrastructure is about building trust. We would love to see other organisations and services adopt POSI as well. Doing so would help us to collaborate more efficiently by allowing us to confirm from the outset that our fundamental values align. And having a set of verifiable commitments that we can point to will also help build the community’s trust in our respective organisations and services.
And this brings us to an important point. Although POSI might have been inspired by Crossref, POSI is not a “Crossref thang” and it never has been. The movement to create open scholarly infrastructures and to define and clarify the ground rules within which they operate has become a much broader community concern.
To this end, we’ve worked with some sibling infrastructure organisations—such as Dryad and ROR—as well as the original authors of POSI to create a website where we could host the list of principles independent of the original blog post and independent of any single organisation:
Minimally, this provides a place for anybody who wants to link to or cite POSI - either because they are endorsing them, or because they are simply discussing them.
If we see enough activity of this type, then the site could evolve to become a register of those organisations and services who have formally adopted POSI and a place where they can link to their self-assessments against the principles.
The community promoting, discussing and applying POSI has long since grown beyond the original authors of the POSI blog post. And it is also much larger than any single organisation. Our hope is that this website encourages that growth.
And, of course, in addition to the external outreach and coordination, Crossref still has internal work to do in addressing the outstanding issues that were raised in our own self-assessment above. We need to increase our contingency funds. We need to publish a patent non-assertion covenant. We need to open source our core software. And we need to clarify our metadata license information and make it explicit that Crossref waives copyright (using CC0) for any metadata generated by Crossref. And, finally, as Crossref expands and starts working with different stakeholders, we will need to adjust our governance and the composition of our board accordingly. We will, of course, post updates here as we make progress on addressing these areas.
2020 marked Crossref’s 20th birthday. What a grim year to have an anniversary. But we are, at least, ending it on a little bit of a high. We are delighted that the issue of open scholarly infrastructure has become so prominent in the community. And we are eager to help strengthen and extend this infrastructure. The decision by Crossref’s board to adopt POSI is the equivalent of Crossref finally adopting a written constitution. And it is a fitting launch to our next twenty years.