The FAIR side of Mendeley Data

Mendeley hosts a hack day aimed at making Mendeley Data datasets accessible through FAIR metadata

Earlier this year we launched Mendeley Data, an open data repository where researchers from all disciplines can deposit their datasets. Because we want to support all fields of science, we allow all file formats, and are flexible in the kinds of metadata researchers have to provide. However, we still want to ensure that it is easy for others to find the data, access the data, and work with the data.

That’s where FAIR comes in. FAIR stands for Findable, Accessible, Interoperable and Reusable. It is an approach to data that has been developed since January 2014 by a wide range of scientific and research data organisations, including the Dutch Techcentre for Life Sciences (DTL), and it is strongly supported by Elsevier, Mendeley and others.

In the FAIR Data approach, data should be:

Easy to find by both humans and computer systems, with metadata that allow the discovery of interesting datasets;
Stored for long term such that they can be easily accessed and/or downloaded with well-defined license and access conditions, whether at the level of metadata, or at the level of the actual data content;
Ready to be combined with other datasets by humans as well as computer systems;
Ready to be used for future research and to be processed further using computational methods.

Community organizations and funding agencies are starting to recognize the importance of data being FAIR; for example, the European Commission is providing researchers who receive funding through Horizon 2020 with FAIR data management guidelines.

Mendeley Data wants to support researchers in making their data available in a FAIR manner, so we’re delighted to be able to collaborate with the DTL, who are developing FAIR tools.


Hacking the data

Last Friday, developers from DTL joined the Mendeley Data developers for a Mendeley hack day. The goal was to extend the Mendeley Data API to expose FAIR metadata, so that researchers can discover datasets in Mendeley Data based on detailed metadata attributes.

The end goal is that a researcher using a FAIR-enabled tool can carry out a detailed search operation (for example search for datasets about a particular disease condition) and find relevant results from a range of repositories, including Mendeley Data.

Ultimately, to enable this we need to create an endpoint which exposes detailed metadata for our datasets. We knew this would be a tall order for our hack day, so we created a proof-of-concept endpoint which exposed this metadata for some static, hardcoded instances of collections and datasets.

This was enough to show the FAIR Data Point in action, starting off accessing Mendeley Data, and then drilling down into these example catalogues and from there finding the example datasets.
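To give a flavour of what drilling down through hardcoded metadata layers can look like, here is a minimal Python sketch. It is not the hack-day code itself: the paths, titles and function names are all invented for illustration, and a real FAIR Data Point would serve RDF over HTTP rather than Python dictionaries.

```python
# Hypothetical, hardcoded metadata layers: repository -> catalogue -> dataset.
# Every path and title here is an assumption made up for this example.
METADATA = {
    "/fdp": {
        "type": "repository",
        "title": "Mendeley Data",
        "children": ["/fdp/catalogue/example"],
    },
    "/fdp/catalogue/example": {
        "type": "catalogue",
        "title": "Example catalogue",
        "children": ["/fdp/dataset/example-1"],
    },
    "/fdp/dataset/example-1": {
        "type": "dataset",
        "title": "Example dataset",
        "children": [],
    },
}


def get_metadata(path):
    """Return the static metadata record for a path, standing in for an HTTP GET."""
    return METADATA[path]


def drill_down(path):
    """Walk from a starting layer down through its children, yielding (type, title)."""
    record = get_metadata(path)
    yield record["type"], record["title"]
    for child in record["children"]:
        yield from drill_down(child)


layers = list(drill_down("/fdp"))
for layer_type, title in layers:
    print(f"{layer_type}: {title}")
```

Starting at the repository level, a client only needs to follow the links in each record to reach the datasets underneath, which is the navigation pattern the demo showed.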

By the end of the hack day we had:

Mapped our datasets’ metadata to the FAIR metadata layers of the FAIR Data Point, including W3C’s DCAT spec;
Implemented the proof-of-concept FAIR Data Point-compatible endpoint providing metadata which can be consumed by FAIR-enabled tools;
Demoed the Mendeley Data FAIR Data Point in action, navigating through the layers of FAIR metadata including the data repository (Mendeley Data), catalogue, datasets and data files.
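As a rough illustration of the mapping step, the Python sketch below turns a dataset record into a JSON-LD document using DCAT and Dublin Core terms. The input field names (`name`, `license_url`, `files` and so on) and the example.org URLs are assumptions made for this example and do not reflect the actual Mendeley Data API; only the `dcat:` and `dct:` vocabulary terms come from the published specs.

```python
import json

# Namespace prefixes for W3C DCAT and Dublin Core terms.
CONTEXT = {
    "dcat": "http://www.w3.org/ns/dcat#",
    "dct": "http://purl.org/dc/terms/",
}


def to_dcat(record):
    """Map a (hypothetical) internal dataset record onto DCAT terms as JSON-LD."""
    return {
        "@context": CONTEXT,
        "@id": record["url"],
        "@type": "dcat:Dataset",
        "dct:title": record["name"],
        "dct:description": record["description"],
        "dcat:keyword": record["keywords"],
        "dct:license": {"@id": record["license_url"]},
        "dcat:distribution": [
            {
                "@type": "dcat:Distribution",
                "dct:title": f["filename"],
                "dcat:downloadURL": {"@id": f["url"]},
            }
            for f in record["files"]
        ],
    }


# Invented example record; none of these values describe a real dataset.
example_record = {
    "url": "https://example.org/datasets/example-1",
    "name": "Example dataset",
    "description": "A placeholder dataset used to illustrate the mapping.",
    "keywords": ["example", "FAIR"],
    "license_url": "https://creativecommons.org/licenses/by/4.0/",
    "files": [
        {"filename": "measurements.csv", "url": "https://example.org/files/measurements.csv"},
    ],
}

dcat_doc = to_dcat(example_record)
print(json.dumps(dcat_doc, indent=2))
```

Because the output uses shared vocabulary terms rather than repository-specific field names, a FAIR-enabled tool can read it without knowing anything about how the repository stores its metadata internally.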

The outcomes of the hack day were: a much better understanding of how to make our datasets available as FAIR resources, so they can be found, integrated and reused by researchers along with other FAIR datasets; and creation of an endpoint which is only a few steps away from being productionised and available to use by the community.

We really enjoyed working closely with Luiz, DTL’s CTO, and developers Rajaram and Kees to make concrete, tangible progress towards making Mendeley Data datasets more findable, accessible, interoperable and reusable!

Follow Mendeley on Twitter to hear when we launch this capability!

Mendeley Blog