Dominika Tkaczyk – 2018 December 18
In my previous blog post, Matchmaker, matchmaker, make me a match, I compared four approaches for reference matching. The comparison was done using a dataset composed of automatically-generated reference strings. Now it’s time for the matching algorithms to face the real enemy: the unstructured reference strings deposited with Crossref by some members. Are the matching algorithms ready for this challenge? Which algorithm will prove worthy of becoming the guardian of the mighty citation network? Buckle up and enjoy our second matching battle!
Dominika Tkaczyk – 2018 November 12
Matching (or resolving) bibliographic references to target records in the collection is a crucial algorithm in the Crossref ecosystem. Automatic reference matching lets us discover citation relations in large document collections, calculate citation counts, H-indexes, impact factors, etc. At Crossref, we currently use a matching approach based on reference string parsing. Some time ago we realized there is a much simpler approach. And now it is finally battle time: which of the two approaches is better?
Dominika Tkaczyk – 2018 November 09
At Crossref Labs, we often come across interesting research questions and try to answer them by analyzing our data. Depending on the nature of the experiment, processing over 100M records might be time-consuming or even impossible. In those dark moments we turn to sampling and statistical tools. But what can we infer from only a sample of the data?
2019 January 17
2019 January 03
2018 December 18