ABOUT SCHOLEXPLORER: THE OPENAIRE SCHOLIX HUB

SCHOLIX

The goal of the Scholix initiative is to establish a high-level interoperability framework for exchanging information about the links between scholarly literature and data. It aims to enable an open information ecosystem for understanding systematically which data underpins literature and which literature references data. Scholix maintains an evolving set of Guidelines consisting of: (i) an information model (a conceptual definition of what a Scholix scholarly link is), (ii) a link metadata schema (the set of metadata fields representing a Scholix link), and (iii) corresponding XML and JSON schemas.

HOW TO EXPOSE SCHOLIX DATA FROM THIS HUB

Scholexplorer harvests scholarly links (i.e. it collects metadata records about links from public APIs) from Scholix-compatible and DataCite-compatible sources. The main sources collected by the service are listed in Table 1.

Table 1. Scholexplorer data sources. Legend: Hub (an aggregator of sources that exposes Scholix links); Source (a data source that exposes Scholix links).

Data Source	Type of source
DataCite from the DataCite OAI-PMH end point	Hub: a provider of links published by data repositories; links are between datasets (DOIs) and articles (DOIs, handles, URLs) or datasets (DOI)
CrossRef from the Event Data end point	Hub: a provider of links published by publishers; links are between articles (DOIs) and datasets (DOI, accession numbers, URLs)
Data repositories that are not yet DataCite members	Source: a publisher of links between datasets (no DOI) and articles
Thematic publishers, e.g. Europe PMC	Source: a publisher of links between articles and accession numbers
Dataset databases, e.g. ENA	Source: a publisher of links between accession numbers of sequences and articles

Data sources willing to include the links they publish or provide within Scholexplorer have two options:

Become Scholix-compliant and register to become a Scholexplorer data source;
Become a data source aggregated by any of the hubs above.

HOW TO ACCESS SCHOLIX DATA FROM THIS HUB

The Scholix Swagger API allows clients to run REST queries over the Scholexplorer index in order to fetch links matching given criteria. In the current version, clients can search for:

Links whose source object has a given PID or PID type;
Links whose source object has been published by a given data source ("data source as publisher");
Links that were collected from a given data source ("data source as provider").

The APIs are available from here. Queries return lists of links encoded as JSON Scholix records. The JSON and XML schemas and example records for Scholix links are available on GitHub.
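As a rough illustration of the three search criteria listed above, the following sketch assembles a query URL. The base URL and the query parameter names (`sourcePid`, `sourcePidType`, `sourcePublisher`, `linkProvider`) are placeholders assumed for this example only; the actual names are the ones documented in the Swagger API.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: replace with the URL given in the Swagger documentation.
BASE_URL = "https://example.org/scholix/links"


def build_link_query(source_pid=None, source_pid_type=None,
                     source_publisher=None, link_provider=None):
    """Return a query URL for links matching the given criteria.

    The query-string parameter names are assumptions made for
    illustration, not the service's documented names.
    """
    candidates = {
        "sourcePid": source_pid,              # PID of the source object
        "sourcePidType": source_pid_type,     # PID type of the source object
        "sourcePublisher": source_publisher,  # "data source as publisher"
        "linkProvider": link_provider,        # "data source as provider"
    }
    # Keep only the criteria the caller actually supplied.
    params = {name: value for name, value in candidates.items() if value is not None}
    if not params:
        raise ValueError("at least one search criterion is required")
    return f"{BASE_URL}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_link_query(source_pid="10.1234/abcd", source_pid_type="doi"))
```

Note that `urlencode` percent-encodes the slash in a DOI, which keeps the PID intact as a single query value.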

TERMS OF USE AND SLA

REST APIs: ScholeXplorer's REST APIs are free to use by any third-party service (no sign-up needed). Note that:

The service limits each query to around 10,000 paged results (pages hold 100 results by default and can be navigated via resumption token);
Since March 2018, a full JSON dump of the service has been made available on Zenodo.org every six months;
For unlimited access to the APIs, please contact the service administrators.
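The paging behaviour described above can be sketched as a client-side loop that follows resumption tokens until the results run out or the cap of roughly 10,000 results is reached. The response shape (a dict with a `links` list and an optional `resumptionToken`) and the `fetch_page` callable are assumptions made for this example; a real client would replace `fake_fetch` with an HTTP call to the API.

```python
MAX_RESULTS = 10_000  # approximate per-query cap stated in the terms of use


def iter_links(fetch_page, max_results=MAX_RESULTS):
    """Yield links page by page until no token is returned or the cap is hit.

    `fetch_page(token)` is a caller-supplied function (hypothetical) that
    returns a dict with a "links" list and an optional "resumptionToken".
    The first call receives token=None.
    """
    token = None
    yielded = 0
    while yielded < max_results:
        page = fetch_page(token)
        links = page.get("links", [])
        if not links:
            return  # an empty page means there is nothing more to read
        for link in links:
            if yielded >= max_results:
                return
            yield link
            yielded += 1
        token = page.get("resumptionToken")
        if not token:
            return  # last page reached


def fake_fetch(token, total=250, page_size=100):
    """Stand-in for the real API: serves integers in pages of `page_size`."""
    start = int(token or 0)
    end = min(start + page_size, total)
    next_token = str(end) if end < total else None
    return {"links": list(range(start, end)), "resumptionToken": next_token}


if __name__ == "__main__":
    print(len(list(iter_links(fake_fetch))))
```

Clients that need more than the capped result set are better served by the periodic JSON dump than by repeated paging.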

Metadata license is CC-BY: Scholix metadata records returned by the service can be freely re-used by commercial and non-commercial partners under the CC-BY license, i.e. as long as OpenAIRE ScholeXplorer is acknowledged as the content provider.

SLA: the service runs in production 24/7 within the OpenAIRE infrastructure, deployed at the data center facilities of the Interdisciplinary Centre for Mathematical and Computational Modelling (ICM).

FAQ

For any questions you may have about this service please contact Sandro La Bruzzo.

1. How do I expose Scholix links to Scholexplorer (or other services)? Does Scholix recommend an access protocol?

Scholix does not recommend any specific protocol for exposing links, although the use of a community standard is strongly encouraged. As for Scholexplorer, its harvesting layer bulk-collects links from external data sources via public APIs. OAI-PMH is preferable, but REST APIs are also accepted. Support for "incremental" harvesting is strongly encouraged (not mandatory), for example by allowing links to be searched by "last date of indexing": http://www.mydomain.eu/scholix?lastIndexingDate=yyyy-mm-dd. If Scholix links are returned with all metadata fields, including the optional ones, the APIs above are enough. If instead the links are limited to the mandatory fields, then a "resolution" API is also required: given the PID of an object, the API returns its full metadata record.
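A minimal sketch of incremental harvesting follows, seen from both sides. The endpoint is the placeholder URL from the example above; the per-record `lastIndexingDate` field and the function names are assumptions made for illustration, not part of the Scholix schema.

```python
from datetime import date
from urllib.parse import urlencode

# Placeholder endpoint taken from the example in the text above.
ENDPOINT = "http://www.mydomain.eu/scholix"


def incremental_url(last_harvest: date, endpoint: str = ENDPOINT) -> str:
    """Harvester side: URL asking for links indexed since the last harvest."""
    query = urlencode({"lastIndexingDate": last_harvest.isoformat()})
    return f"{endpoint}?{query}"


def links_indexed_since(links, since: date):
    """Data source side: keep links indexed on or after `since`.

    Each link is assumed to be a dict carrying a hypothetical
    "lastIndexingDate" field in yyyy-mm-dd form.
    """
    return [
        link for link in links
        if date.fromisoformat(link["lastIndexingDate"]) >= since
    ]


if __name__ == "__main__":
    print(incremental_url(date(2018, 3, 1)))
```

A harvester would remember the date of its last successful run and pass it on the next one, so that only new or updated links are transferred.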

2. Which PIDs are compatible with Scholexplorer?

Scholexplorer accepts links between any kind of persistent identifiers, including URLs. The major standard identifiers (e.g. DOIs, PDBs, PMCIDs, etc.) are also resolved so that Scholexplorer can include the complete record information, while URLs are in general not resolved, because the variability of the associated resolvers cannot be handled by one service. In principle, if your data source is Scholix-compliant and provides proprietary PIDs, you can include complete Scholix link records in Scholexplorer in two ways: (i) by exposing complete Scholix records, or (ii) by exposing minimally compliant records and making a resolver available, so that Scholexplorer can collect the complete records for those PIDs (note: your PIDs will have a specific type, which Scholexplorer associates with your resolver service).
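Option (ii) can be pictured as a lookup from PID type to resolver. The sketch below is only an illustration of that idea: the link layout (`source`/`target` dicts with `pid` and `pid_type` keys), the `record` key, and the example PID type `myrepo` are all hypothetical, and this is not a description of how Scholexplorer is implemented.

```python
def enrich_link(link, resolvers):
    """Return a copy of `link` with full records attached where possible.

    `resolvers` maps a PID type to a function that takes a PID and returns
    the full metadata record. Objects whose PID type has no registered
    resolver (e.g. plain URLs) are left as they are.
    """
    enriched = dict(link)
    for end in ("source", "target"):
        obj = dict(link[end])  # copy, so the input link is not modified
        resolve = resolvers.get(obj["pid_type"])
        if resolve is not None:
            obj["record"] = resolve(obj["pid"])
        enriched[end] = obj
    return enriched


if __name__ == "__main__":
    # Hypothetical resolver for a proprietary PID type called "myrepo".
    resolvers = {"myrepo": lambda pid: {"pid": pid, "title": f"Dataset {pid}"}}
    minimal_link = {
        "source": {"pid": "ds-42", "pid_type": "myrepo"},
        "target": {"pid": "http://example.org/article/7", "pid_type": "url"},
    }
    print(enrich_link(minimal_link, resolvers))
```

Keying resolvers by PID type mirrors the note above: each proprietary PID type is tied to exactly one resolver service.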