Mendeley Data awarded Data Seal of Approval

On June 22, it was announced that Mendeley Data’s open research data repository won the Data Seal of Approval certification; this award confirms that the repository complies with the Data Seal of Approval guidelines, and is a trusted digital repository.

The 16 rigorous guidelines include guarantees of the “integrity and authenticity of the data” and “protection of the facility and its data, products, services, and users”. These guidelines also ensure that data can be easily cited.

When choosing a repository to deposit and share your data, it’s important to know that your data will be stored safely, and will be available, findable and accessible over the long-term. Choosing a certified repository is a way to ensure this. For this reason JISC, the UK’s publicly-funded research advice body, recommends selecting a certified repository to store your data.

The Data Seal of Approval certification highlights the value of the services provided by Mendeley Data, and Elsevier’s wider commitment to helping researchers make maximum use of their data, as well as store their vital data safely.

About Mendeley Data

Mendeley Data is a secure cloud-based repository where researchers can store data, ensuring it is easy to share, access and cite, wherever they are. Research data is published with a FORCE11-compliant citation and is backed up by DANS (Data Archiving and Networked Services) to ensure that it is safely archived.

Mendeley Data can be accessed at http://data.mendeley.com/.

Mendeley Data adopts Google Science Datasets standards

Mendeley Data is pleased to announce that we’ve adopted the new Google Science Datasets markup standard for datasets.

For the non-computer science buffs amongst us, this means we describe our datasets in a structured way recognised by Google – which helps Google to index our datasets, and makes them more readily available in their search results.

This also means Google could eventually show datasets in a special way within search results, perhaps by presenting a “rich snippet” for a dataset like the example for a research article below. This makes them more visible and easier to scan by readers.

An example of a “rich snippet” search result, in this case for a research article

 

This applies to all datasets posted so far, as well as any new datasets.

This is all part of our efforts to make the data you share as discoverable as possible by researchers, so that it can be valuable to the community and you can get credit for generating and sharing it.
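To give a rough idea of what describing a dataset "in a structured way" can look like, here is a minimal sketch in Python. It assumes the markup is expressed with the schema.org Dataset vocabulary as JSON-LD embedded in the dataset's web page; the function names, the DOI and all field values are made up for illustration and are not taken from Mendeley Data itself.

```python
import json


def dataset_jsonld(name, description, doi, license_url, creators):
    """Build a schema.org Dataset description as a JSON-LD dictionary."""
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "identifier": f"https://doi.org/{doi}",
        "license": license_url,
        "creator": [{"@type": "Person", "name": c} for c in creators],
    }


def as_script_tag(markup):
    """Wrap the JSON-LD in the script element that would sit in the page HTML."""
    body = json.dumps(markup, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


# Hypothetical dataset: every value below is a placeholder.
example = dataset_jsonld(
    name="Example glacier survey",
    description="Placeholder description of a hypothetical dataset.",
    doi="10.0000/example.1",
    license_url="https://creativecommons.org/licenses/by/4.0/",
    creators=["A. Researcher"],
)
print(as_script_tag(example))
```

A search engine that understands this vocabulary can read the name, identifier, licence and creators directly, rather than guessing them from the page layout.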

If you have any questions, thoughts or suggestions, we would love to hear from you.

 

Beyond ‘Download Science’: Or How to Not Drown in Data at the AGU

Topic-specific conferences are no longer focused on research alone; data sharing initiatives are now a major part of the discussions happening in research. With Mendeley Data and other data management initiatives, we are always looking to learn more, directly from the researchers. Anita DeWaard, Vice President of Data Research and Collaboration for Elsevier, shares how she learned that data repositories often struggle with similar issues, and how collaboration can help address those issues.

I attended my first AGU meeting in New Orleans last fall, with the intention to learn more about informatics, metadata and research data in the Earth and Planetary sciences. For a newbie, this meeting is an intimidating affair where over 25,000 scientists gather to discuss topics ranging from plate tectonics to probabilistic flood mapping, and from solar prominences to paleoclimatology.

Informatics and metadata played a huge role in the program sessions. The Earth and Space Science Informatics Section alone, for example, amounted to a staggering 1,200 talks and posters on that topic. And that by no means comprises the full extent of sessions about informatics and metadata: for instance, the Hydrology Section has not one but two sessions (with 10 – 20 papers each) on ‘Advances in Data Integration, Inverse Methods, and Data Valuation across a Range of Scales in Hydrogeophysics’, and the Public Affairs Section hosts ‘Care and Feeding of Physical Samples and Their Metadata’.

It is easy to feel overwhelmed. Yet once I stopped focusing on watching the endless streams of people moving up and down escalators to more and more rooms full of posters and talks (and once I finally retrieved my Flat White after the seemingly endless fleece-clad line at the Starbucks!) I learned that if you just jump in the stream and go with the flow, the AGU is really just a great ride.

I was involved in three events: a session about data discovery, one on Unique Identifiers, and the Data Rescue award, which Elsevier helped organize, together with IEDA, the Interdisciplinary Earth Data Alliance (http://www.iedadata.org/).

Data Repository issues; Or, how to come up with a means of survival

In the Data discovery session, we had 8 papers pertaining to searching for earth science data. Siri Jodha Khalsa and I are co-chairing a nascent group as part of the Research Data Alliance on this same topic, which is quite relevant to us in developing our DataSearch platform. It struck me how very comfortable and aware of various aspects of data retrieval the earth science community seems to be, compared to repositories in other domains, who are just starting to talk about this.

The data repositories that presented were struggling with similar issues: how to scale the masses of content that need to be uploaded, how to build tools that provide optimal relevancy ranking over heterogeneous and often distributed data collections, how to keep track of usage, provide useful recommendations, and offer personalisation services when most search engines do not ask for login details, all with a barebones staff and an organisation that is more often than not asked to come up with means for its own survival.

The end of download science

At the Poster session that evening, it was exciting to see the multitude of work being done pertaining to data discoverability. One of the most interesting concepts for me was in a poster by Viktor Pankratius from MIT, who developed a ‘computer-aided discovery system’ for detecting patterns, generating hypotheses, and even driving further data collection from a set of tools running in the cloud.

Pankratius predicted the ‘end of download science’: whereas in the past, (earth) scientists did most of their data-intensive work by downloading datasets from various locations, writing tools to parse, analyze and combine them, and publishing (only) their outcomes, Pankratius and many others are developing analysis tools that are native to the cloud, and are shared and made available together with the datasets for reuse.

Persistent Identifiers

On Thursday, I spoke at a session entitled “Persistent Identification, Publication, and Trustworthy Management of Research Resources”: two separate but related topics. The first three talks focused on trustworthiness. Persistent identifiers are a seemingly boring topic that, however, just got their own very groovy conference, PIDaPalooza (leave it to Geoff Bilder to groovify even the nerdiest of topics!).

One of the papers in that session (https://agu.confex.com/agu/fm16/meetingapp.cgi/Paper/173684) discussed a new RDA Initiative, Scholix, which uses DOIs for papers and datasets to enable a fully open linked data repository that connects researchers with their publications and published datasets. Scholix represents a very productive collaboration, spearheaded by the RDA Data Publishing Group, involving many parties including publishers (including Thomson Reuters, IEEE, Europe PMC and Elsevier), data centres (the Australian National Data Service, IEDA, ICPSR, CCDC, 3TU DataCenter, Pangaea and others) and aggregators and integrators (including CrossRef, DataCite and OpenAIRE).
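The idea of connecting papers and datasets through their DOIs can be pictured with a small sketch. This is not the Scholix schema or API: the `Link` record, its field names and the DOIs below are hypothetical, and the code only illustrates how a set of article-to-dataset links can be turned into an index that answers "which datasets belong to this paper?" and the reverse.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    """One article-to-dataset link (hypothetical structure, not the Scholix schema)."""
    article_doi: str
    dataset_doi: str
    provider: str  # who contributed the link, e.g. a publisher or data centre


def build_index(links):
    """Index links in both directions so either DOI can be used as the lookup key."""
    index = defaultdict(set)
    for link in links:
        index[link.article_doi].add(link.dataset_doi)
        index[link.dataset_doi].add(link.article_doi)
    return index


# Placeholder DOIs for illustration only.
links = [
    Link("10.0000/article.1", "10.0000/dataset.1", "example-publisher"),
    Link("10.0000/article.1", "10.0000/dataset.2", "example-datacentre"),
]
index = build_index(links)
print(sorted(index["10.0000/article.1"]))
```

Because the links are keyed on persistent identifiers rather than on titles or URLs, contributions from different publishers and data centres can be merged into the same index without ambiguity.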

Persistent identifiers combine with semantic technologies to enable a whole that is much more than the sum of its parts, and that surely points the way forward in science publishing; this allows, for instance, Mendeley Data users to directly address and compare different versions of a dataset (for some other examples see my slides here).

Celebrating the restoration of lost datasets

A further highlight was the International Data Rescue Award. This award, the third so far, is intended to reward and celebrate the usually thankless task of restoring datasets that would otherwise disappear or be unavailable. The award brings together (and aims to support the creation of) a community of very diverse researchers, who all have a passion for restoring data.

This year’s winners were from the University of Colorado in Boulder. Over a period of more than fifteen years, they rescued and made accessible the data at the Roger G. Barry Archive at the National Snow and Ice Data Center, which consists of a vast repository of materials, including over 20,000 prints, over 100,000 images on microfilm, 1,400 glass plates, 1,600 slides, over 100 cubic feet of manuscript material and over 8,000 ice charts. The materials are incredibly diverse, date from 1850 to the present day, and include, for instance, hand-written 19th century exploration diaries and observational data. Projects such as these tell us the incredible importance of data, especially in times like these, where so much is changing so quickly. Looking at the pictures in the Glacier Photograph Collection shows the incredible extent of glacial erosion between, for instance, 1941 and today, when an entire glacier has simply vanished, and offers a grim reminder of the extent to which global warming is affecting our world.

In short, there is a lot out there for all of us to learn from going to the AGU. Earth science is abuzz with data sharing initiatives: there are exciting new frontiers to explore, important lessons to be learned, and invaluable data to be saved.

The FAIR side of Mendeley Data

Mendeley hosts a Hack Day aimed at making Mendeley Datasets accessible by FAIR

Earlier this year we launched Mendeley Data, an open data repository where researchers from all disciplines can deposit their datasets. Because we want to support all fields of science, we allow all file formats, and are flexible in the kinds of metadata researchers have to provide. However, we still want to ensure that it is easy for others to find the data, access the data, and work with the data.

That’s where FAIR comes in. FAIR stands for Findable, Accessible, Interoperable and Reusable. It is an approach to data that has been developed since January 2014 by a wide range of scientific and research data organisations, including the Dutch Techcentre for Life Sciences (DTL), and is strongly supported by Elsevier, Mendeley and others.

In the FAIR Data approach, data should be:

  • Easy to find by both humans and computer systems, with metadata that allow the discovery of interesting datasets;
  • Stored for long term such that they can be easily accessed and/or downloaded with well-defined license and access conditions, whether at the level of metadata, or at the level of the actual data content;
  • Ready to be combined with other datasets by humans as well as computer systems;
  • Ready to be used for future research and to be processed further using computational methods.

Community organizations and funding agencies are starting to recognize the importance of data being FAIR; for example, the European Commission is providing researchers who receive funding through Horizon 2020 with FAIR data management guidelines.

Mendeley Data wants to support researchers making their data available in a FAIR manner and so we’re delighted to be able to collaborate with the DTL, who are developing FAIR tools.

Hacking the data

Last Friday, developers from DTL joined the Mendeley Data developers for a Mendeley hack day. The goal for the hack day was to extend the Mendeley Data API to expose FAIR metadata, which allows researchers to discover datasets in Mendeley Data based on detailed metadata attributes.

The end goal is that a researcher using a FAIR-enabled tool can carry out a detailed search operation (for example search for datasets about a particular disease condition) and find relevant results from a range of repositories, including Mendeley Data.

In order to enable this, ultimately, we need to create an endpoint which exposes detailed metadata for our datasets. We knew this would be a tall order for our hack day, so we created a proof-of-concept endpoint which exposed this metadata for some static/hardcoded instances of collections and datasets.

This was enough to show the FAIR Data Point in action, starting off accessing Mendeley Data, and then drilling down into these example catalogues and from there finding the example datasets.

By the end of the hack day we had:

  • Mapped our datasets’ metadata to the FAIR metadata layers of the FAIR Data Point, including W3C’s DCAT spec;
  • Implemented the proof-of-concept FAIR Data Point-compatible endpoint providing metadata which can be consumed by FAIR-enabled tools;
  • Demoed the Mendeley Data FAIR Data Point in action, navigating through the layers of FAIR metadata including the data repository (Mendeley Data), catalogue, datasets and data files.
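The mapping and drill-down described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the endpoint built at the hack day: the input record layout (`doi`, `title`, `keywords`, `files`) is hypothetical, the output only borrows a handful of DCAT and Dublin Core terms, and the repository layer of the FAIR Data Point is left out for brevity.

```python
def to_dcat_dataset(record):
    """Map a hypothetical repository record to a DCAT-style dataset description."""
    return {
        "@type": "dcat:Dataset",
        "dct:identifier": record["doi"],
        "dct:title": record["title"],
        "dcat:keyword": list(record.get("keywords", [])),
        "dcat:distribution": [
            {
                "@type": "dcat:Distribution",
                "dct:title": f["name"],
                "dcat:downloadURL": f["url"],
                "dct:license": record["license"],
            }
            for f in record.get("files", [])
        ],
    }


def to_dcat_catalog(title, records):
    """Group dataset descriptions under one catalogue layer."""
    return {
        "@type": "dcat:Catalog",
        "dct:title": title,
        "dcat:dataset": [to_dcat_dataset(r) for r in records],
    }


def find_datasets(catalog, keyword):
    """Drill down from a catalogue to the datasets tagged with a keyword."""
    return [d for d in catalog["dcat:dataset"] if keyword in d["dcat:keyword"]]


# Static example record, in the spirit of the hard-coded proof of concept.
record = {
    "doi": "10.0000/example.1",
    "title": "Example glacier survey",
    "license": "https://creativecommons.org/licenses/by/4.0/",
    "keywords": ["glacier", "climate"],
    "files": [{"name": "survey.csv", "url": "https://example.org/survey.csv"}],
}
catalog = to_dcat_catalog("Example catalogue", [record])
for dataset in find_datasets(catalog, "glacier"):
    print(dataset["dct:identifier"], dataset["dct:title"])
```

A FAIR-enabled tool would walk the same layers over HTTP, from catalogue to dataset to file, instead of over in-memory dictionaries.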

The outcomes of the hack day were: a much better understanding of how to make our datasets available as FAIR resources, so they can be found, integrated and reused by researchers along with other FAIR datasets; and creation of an endpoint which is only a few steps away from being productionised and available to use by the community.

We really enjoyed working closely with Luiz, DTL’s CTO, and developers Rajaram and Kees to concretely and tangibly make progress towards making Mendeley Data datasets more findable, accessible, interoperable and reusable!

Follow Mendeley on Twitter to hear when we launch this capability!

Mendeley Brainstorm: Climate Change – Too Little, Too Late?

Difficult decisions lie ahead if our planet is to avoid environmental catastrophe

2016 is set to be the hottest year on record. Rising sea levels have already forced out entire communities; melting permafrost may have unleashed an anthrax epidemic in Russia. In response, the United States and China have promised to curb their carbon emissions. However, is this a case of too little, too late? We are looking for the most well-thought-out answer to this question in up to 150 words: use the comment feature below the blog and please feel free to promote your research! The winner will receive an Amazon gift certificate worth £50 and a bag full of Mendeley items; the competition closes October 19.

2016: The Hottest Year on Record?

According to NASA and the United Nations, 2016 promises to be the hottest year on record. This past June was, according to the UN, the “14th month for record heat” on land and sea. This change represents a 1.3 degree Celsius increase over the temperatures of the pre-industrial era.

The consequences of climate change have already been severe. In August, the coastal village of Shishmaref, Alaska voted to relocate itself due to rising sea levels. Elevated temperatures have been linked to melting of the permafrost in Russia, which may have sparked an outbreak of anthrax. More extreme weather events and their follow-on consequences have been widely predicted.

The World Responds

At the recent G20 summit, the two nations which emit the most carbon, China and the United States, agreed to make significant reductions. In August, the Netherlands discussed banning petrol- and diesel-fuelled cars. President Obama also promised $40 million to island nations in order to help them cope with the effects of climate change.

Too Little, Too Late?

The nations of the world are finally grappling with the reality of climate change, but are these efforts too little, too late?  Tell us!

Try Mendeley Data!

Climatologists already use Mendeley Data to store their findings; it’s handy, easy to use and offers a broad variety of licensing schema so that your data can be distributed, embargoed and utilised in any way you choose.  It also interlocks with the wider Mendeley ecosystem for added convenience.  Visit http://data.mendeley.com

About Mendeley Brainstorms

Our Brainstorms are challenges so we can engage with you, our users, on the hottest topics in the world of research.  We look for the most in-depth and well thought through responses; the best response as judged by the Mendeley team will earn a prize.

Introducing Elsevier DataSearch

Elsevier takes the next step in making researchers’ lives easier with the new DataSearch engine. You can search for research data across numerous domains and various types, from a host of domain-specific and cross-domain data repositories. It’s available at https://datasearch.elsevier.com/ – please join our User Panel to help improve it!

More Focused Searching

Mass search engines are ubiquitous and useful; however, when it comes to specific information tailored to the needs of the modern researcher, a more focused application is required.  In response to this need, Elsevier has created DataSearch.  Drawing on reputable repositories of information across the internet, researchers can readily find the data sets they need to accelerate their work.

DataSearch offers a new and innovative approach. Most search engines don’t actively involve their users in making them better; we invite you, the user, to join our User Panel and advise how we can improve the results. We are looking for users in a variety of fields; no technical expertise is required (though it is welcome). In order to join us, visit https://datasearch.elsevier.com and click on the button marked “Join Our User Panel”. Please detail in your e-mail the following:

  • Your Name
  • Institution
  • Research Interests

We look forward to working with you and improving the research experience.

Putting data in the hands of researchers with Hivebench

Lab notebook tool Hivebench will be integrated with Mendeley to help researchers enrich and manage their data

Research data is the foundation on which scientific, technical, social and medical knowledge is built. That’s why enabling access to, sharing and reuse of data is tremendously valuable to everyone involved in advancing science.

Of course, making research data manageable for researchers and their colleagues is not always easy. Proper data management requires solutions that help researchers not just store, but also share, discover and re-use their data. That way, authors receive credit for their work while the wider research community benefits from discovering and using research data.

Using research data to its full potential requires consistency in the way it is collected and stored. Hivebench provides an essential first step in this process. It is a digital laboratory notebook that helps researchers prepare, conduct and analyze experiments, methods, and protocols in one place, saving them valuable time. Hivebench has thousands of registered users who position it in the center of their research process. Importantly, Hivebench allows researchers to link data and metadata without requiring them to change the way they work. This avoids making data collection feel like an administrative overhead.

On Wednesday 1 June, Elsevier acquired Hivebench to help further streamline the workflow of researchers – putting research data management at their fingertips. The added value of the integration lies in linking Hivebench with Elsevier’s existing Research Data Management portfolio of products and services. The research data that researchers have stored in the Hivebench notebook are linked to the Mendeley Data repository, which will be linked to Pure. This way, the research data is linked with metadata such as the DOI, the published article, controlled data versioning, and the methodology, which adds instant value to the datasets because they become far more suitable for reuse.

Researchers will benefit in a number of ways. Many funders these days require insight into the research design, process, and data sets. This becomes easier with the help of an electronic lab notebook. Research also shows that articles that are linked with their underlying data get cited more. In addition, well-described data sets can sometimes be more useful than an article itself. Sometimes when doing research, the number of articles to read and digest can be overwhelming – it can be hard to determine what to read and what not to read. Data can provide more information, provided of course that the right metadata are linked to it so the data sets are adequately described. And that’s exactly what we’re doing by linking Hivebench to Mendeley Data.

“Saving researchers time by providing them with a user-friendly way to store and manage their data has been our focus until now,” said Dr. Julien Thérier, CEO and founder of Shazino, the Lyon, France-based company that launched Hivebench. “But we knew that if we wanted to scale up our activities and create additional added value, our product would need to be integrated with a chain of tools that catered to the need of researchers to share and reuse data sets as well. We’ve been collaborating with Elsevier’s Mendeley for the past two years and already enable Hivebench users to export their results to Mendeley Data.”

The integration with Elsevier will enable Hivebench to make its services available to many more researchers, making sharing and reuse possible on an unprecedented scale – and unlocking the full potential of research data.