Store, Share and Find: Manage It All with Mendeley Data

You recently learned about how Elsevier’s Mendeley Research Network can help you stay updated on the latest trends and developments in your field. But there’s another tool within Mendeley that can give you peace of mind about the data you’ve already generated in your research. Mendeley Data is a free, secure cloud-based repository where you can store, share and find data, wherever you are. A vital part of the unified Mendeley ecosystem, Mendeley Data enables you to check whether data already exists for a new project you are working on, and to fulfil your funding mandate and data management plans with less time-consuming administrative overhead.

Seek and You Shall Find

When you start a new project, or apply for funding, you always check the latest research on your chosen topic and look into what has already been done. Why not take a look at existing data on the topic as well? With Mendeley Data Search you can find related data easily: with over nine million datasets indexed from over 30 repositories worldwide, there is a wealth of information available for you to preview and use to support your project. Your funder will also be impressed if you show that you’ve taken the time to ensure that you’re not duplicating efforts.

Get Credit for Sharing Your Data

An open science repository, Mendeley Data allows you to quickly and easily upload files of any type – up to 10GB per dataset. You can import your own folder structure, and your data is automatically tagged with subject classifications. Mendeley Data has received the widely recognized CoreTrustSeal certification, so you can be confident that your data will always be safe and accessible. Plus, your data is archived for as long as you need it by Data Archiving and Networked Services (DANS), the Netherlands-based institute for permanent access to digital research resources. Best of all, you retain complete control and copyright over the data, and choose the terms under which others may consume and reuse it.

Mendeley Data also supports versioning – making longitudinal studies easier to manage. All published versions of a dataset can be viewed and compared by clicking on the links in “Version” history.

There’s a vetting process to store data in Mendeley: each collection of research data files is checked by a qualified reviewer to ensure the content constitutes research data, is scientific in nature, and doesn’t solely contain a previously published research article. Datasets also may not contain executable files or archives that are unaccompanied by individually detailed file descriptions; copyrighted content (audio, video, images) to which you do not own the copyright; or sensitive information (such as HIPAA-protected patient details or birthdates).

Could the process be any more painless?

> Register/log in to Mendeley Data.
> Click “New dataset.”
> Upload data files.
> Add metadata (including Title, Description and Contributors) for the dataset.
> Save.
> Hit “Publish” (only when you’re absolutely ready for it to go public).

Each researcher’s dataset is discoverable because it is deeply indexed in Mendeley Data’s powerful search engine. In addition, it is marked up with standard schema.org metadata.
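To give a feel for what schema.org markup for a dataset can look like, here is a minimal Python sketch that builds a JSON-LD record of type `Dataset`. The property names (`name`, `description`, `identifier`, `creator`, `license`) come from the schema.org vocabulary, but the helper `dataset_jsonld`, the sample values and the placeholder DOI are all made up for illustration; the exact fields Mendeley Data emits are not described here.

```python
import json


def dataset_jsonld(name, description, doi, creators, license_url):
    """Build a schema.org Dataset description as a JSON-LD dictionary.

    Hypothetical helper for illustration; not Mendeley Data's actual output.
    """
    return {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": name,
        "description": description,
        # Express the DOI as a resolvable URL.
        "identifier": f"https://doi.org/{doi}",
        "creator": [{"@type": "Person", "name": c} for c in creators],
        "license": license_url,
    }


# All values below are invented placeholders.
record = dataset_jsonld(
    name="Example soil moisture readings",
    description="Hypothetical dataset used only to illustrate the markup.",
    doi="10.1234/example.1",
    creators=["A. Researcher"],
    license_url="https://creativecommons.org/licenses/by/4.0/",
)

# Serialised form, as it might be embedded in a dataset landing page.
markup = json.dumps(record, indent=2)
print(markup)
```

Markup of this kind is typically embedded in a page inside a `script` element of type `application/ld+json`, which is how search engines pick it up.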

Datasets in Mendeley Data are viewed and downloaded frequently – on average once per month. We also see that articles with accompanying datasets get cited more often.

Every dataset in Mendeley has a unique and permanent DataCite DOI (digital object identifier), which makes it much simpler for you, or other researchers, to locate and reference your data. When you publish your research, you can connect your paper to the cited dataset via the DOI and it will be indexed in OpenAIRE, the EU initiative aimed at improving the discovery and reuse of research publications and data.
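Because a DOI is permanent, turning one into a link is mechanical: the public doi.org proxy resolves any DOI appended to `https://doi.org/`. The sketch below shows one way to do that in Python. The function name `doi_url` and the example DOIs are placeholders invented for illustration, not real Mendeley Data identifiers.

```python
from urllib.parse import quote


def doi_url(doi):
    """Return a resolvable URL for a DOI using the public doi.org proxy."""
    doi = doi.strip()
    # Accept the common "doi:" prefix that often appears in citations.
    if doi.lower().startswith("doi:"):
        doi = doi[4:].strip()
    # Keep the slash between prefix and suffix; percent-encode unsafe characters.
    return "https://doi.org/" + quote(doi, safe="/")


# Placeholder DOI, made up for this example.
link = doi_url("doi:10.1234/example.1")
print(link)
```

Percent-encoding matters because DOI suffixes may contain characters such as `<` or `#` that are not safe in a URL.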

Share Your Data – Or Not

When you use Mendeley Data, you control who gets to use your data and when. You have the option to securely share your data with colleagues and co-authors before publication, or publish your data to the world when you’re ready to do so.

With many Elsevier journals, it’s possible to upload and store your dataset to Mendeley Data during the manuscript submission process. You can also send your data directly to the repository. In each case, your data can be linked to any associated journal article on Elsevier’s ScienceDirect, making it easy for readers to find and reuse.

Mendeley Data benefits not only you, but your institution. By saving time in searching, collecting and sharing data, it prevents re-work. Mendeley showcases institutional research outputs, boosting your reputation as well as that of your employer. With quick access to so much data, institutions are able to improve collaborations internally and externally.

Let Mendeley Manage What You Generate

It’s time to get more credit for your data. Mendeley Data has the power to make this happen – enabling your data to be citable, accessible and discoverable with optimal data management, so you can focus on your research. Isn’t that what really matters?

Get started with Mendeley Data

Mendeley Data awarded Data Seal of Approval

On June 22, it was announced that Mendeley Data’s open research data repository had been awarded the Data Seal of Approval certification. The award confirms that the repository complies with the Data Seal of Approval guidelines and is a trusted digital repository.

The 16 rigorous guidelines include guarantees of the “integrity and authenticity of the data” and “protection of the facility and its data, products, services, and users”. The guidelines also ensure that data can be easily cited.

When choosing a repository to deposit and share your data, it’s important to know that your data will be stored safely, and will be available, findable and accessible over the long term. Choosing a certified repository is a way to ensure this. For this reason JISC, the UK’s publicly funded research advice body, recommends selecting a certified repository to store your data.

The Data Seal of Approval certification highlights the value of the services provided by Mendeley Data, and Elsevier’s wider commitment to helping researchers make maximum use of their data, as well as store their vital data safely.

About Mendeley Data

Mendeley Data is a secure cloud-based repository where researchers can store data, ensuring it is easy to share, access and cite, wherever they are. Research data is published with a FORCE11-compliant citation; it is backed up by DANS (Data Archiving and Networked Services) to ensure that it is safely archived.

Mendeley Data can be accessed at http://data.mendeley.com/.

Mendeley at JCDL 2014

Image by Patrick Hochstenbach @hochstenbach

The Mendeley Data Science team have been busy attending some important events around the world. One of them has been JCDL 2014, the most prominent conference in the Digital Libraries arena. The conference looks at many of the problems we’re tackling at the moment, such as article recommendations and the best ways of automatically extracting information from research articles.

Maya Hristakeva, Senior Data Scientist at Mendeley, was particularly excited about the various approaches to topic modelling that were discussed at the event. “Topics were used as features for a diverse range of tasks, such as prediction of an author’s future citation counts, making personalised recommendations, search, author disambiguation, and creating more relevant citation networks, all features that make a direct impact to the research workflow on Mendeley.”

“We saw some really thought-provoking output come out of the JCDL’14 proceedings, such as Characterizing Scholar Popularity: A Case Study in the Computer Science Research Community,” explains Kris Jack, Chief Data Scientist at Mendeley. “Some of the interesting research questions raised included one by Gonçalves, G. D., Figueiredo, F., Almeida, J. M., & Gonçalves, M. A. (2014), which asked whether it is possible to represent the popularity of a researcher using the number of readers that they have.”

“It was also nice to see evidence in some of the papers presented that Mendeley readership is highly correlated with various measures of academic impact, such as h-index and publication venue importance,” says Mendeley Senior Data Scientist Phil Gooch.

Overall, this was a really valuable opportunity to connect with researchers who are working on similar problems to Mendeley, such as metadata extraction, recommendations, and citation/author/venue disambiguation. We’re now considering running an open challenge to focus this research into concrete output that could be of use in features for our users. If you have any ideas around that, do get in touch on Twitter with @_krisjack, @mayahhf and @Phil_Gooch.

Note: At Mendeley, we believe in dogfooding (it’s not as disgusting as it sounds, merely techy slang for using your own product to validate the qualities of that product…) so Maya, Kris and Phil took notes using Mendeley Desktop 🙂

Discussing the Future of Recommender Systems at RecSys2014


Maya and Kris from the Mendeley Data Science team have just returned from RecSys2014, the most important conference in the Recommender System world. RecSys is remarkable in that it attracts an equal number of participants from industry and academia, many of whom are at the forefront of innovation in their fields.

The team had a chance to exchange perspectives and experiences with various researchers, scholars and practitioners.

“To me, it was encouraging to see how top companies across the world are investing in recommenders, as they are shown to enhance customer satisfaction and bring real value to both users and companies,” says Mendeley Senior Data Scientist Maya Hristakeva. “LinkedIn reported that 50% of the connections made in their social network come from their follower recommender, while Netflix says that if they can stop 1% of users from cancelling their subscription then that’s worth $500M a year, which of course justifies the fact they are investing $150M/year in their content recommendation team, consisting of 300 people.”

But one of the advantages of such a hybrid event is that it did not shy away from addressing the broader issues, such as how to guard against creating a “filter bubble” effect, how to preserve users’ privacy, and how to optimise systems for what really matters (and how this can be effectively defined). Daniel Tunkelang, LinkedIn, and Xavier Amatriain, Netflix, moderated a panel on “Controversial Questions About Personalization”, tackling some of these topics head on. Hector Garcia Molina from Stanford University also put forward the view that we’ll increasingly see a convergence of recommendations, search and advertising, despite noticeable scepticism from the attendees.

Kris Jack, Chief Data Scientist at Mendeley, says one of the main messages that he took away from the conference was the importance of winning a user’s trust in the early stages of using a recommender system.

“The best systems have been shown to start off by providing recommendations that can quickly be evaluated by users as being useful before gradually introducing more novel recommendations. So in the case of helping researchers to find relevant articles to read, it’s probably best to start by recommending well known but important articles in their field, before recommending some less well known but very pertinent articles to their specific problem domain,” explains Kris. “Other important factors include reranking (the order in which recommendations should be shown), the UI design that can best support interaction with the recommender system, and the ways in which we can build context-aware recommendations.”

What do you think of the current recommendation features on Mendeley? Are there any particular ones that you’d like to see implemented? Would you like to join the team and work on making them even better? Let us know in the comments below, or Tweet the team directly @_krisjack, @mayahhf and @Phil_Gooch. If you’re interested in finding out more about what the Data Science Team is developing in that arena, you can also watch their Mendeley Open Day presentation here.

Mendeley moves into the cloud: It’s nice up here!

Mendeley Kite Cloud. Photo by Tom Atkinson @R3Digital (www.r3digital.co.uk)

Last week we took what might seem like a small step, but was in fact a giant leap, by moving mendeley.com into the cloud. Now you might be thinking “Mendeley is already cloud-based, what are you talking about?” It’s true that our users can access their papers, annotations and all other data on any device, so we’re very much a cloud platform. In the past, however, Mendeley’s own servers were not cloud-based, which meant that the process of maintaining, updating and developing the product was sometimes not as efficient as it could be.

It’s a problem that many start-ups face, especially as they scale up, since it’s expensive and time-consuming to overhaul your systems without causing significant disruption to your users*. That, however, is one of the advantages of having the support and resources of Elsevier, who are investing in the Mendeley infrastructure to make sure that we’re sustainable, scalable, and able to integrate with and develop tools and functionalities to meet researchers’ needs.

Having our data in the cloud means more reliability, speed and the ability to really make the whole development process more agile. That certainly means a happier Mendeley team, and we know it will help bring a better, faster-improving product for our community.

There was a real space-launch atmosphere as various Mendeley teams came together to work out the complex logistics of moving over 100 Terabytes of user data safely into the cloud, but it all went smoothly, thanks to the brave efforts of Robin Stephenson, James Rasell, Chris Barr, Callum Anderson, Kubilay Kara, James Gibbons and Merrick Barton (Jan was just basking in the atmosphere while feeling smug following the Germany-Brasil game).

Mendeley Control Room

We hope you like the improvements that this change will bring. We’re certainly excited about the future up here in the cloud!

* We did have a small amount of down time on Wednesday as the move happened, and apologies go to anybody who was inconvenienced.

Mendeley supports the FORCE11 Data Citation Principles

Mendeley was at the very first “Beyond the PDF” meeting in San Diego, which grew into FORCE11. We have been engaged with this community for almost as long as we have existed as a company. Though we aren’t in the group which drafted these principles and as yet have no formal stake in data management, we know personally and frequently interact with many of the people who are and do, so we think it’s important that we announce our support for their work.

The Data Citation Principles cover a wide range of issues related to data, including specific issues relevant to us, such as credit, attribution, research impact, unique identification, and access. After all, what good is a citation that fails to resolve to the cited object, for either the citing or cited entity, and what use is such a citation to a citation manager?

With our work as a leader in the altmetric community, we support researchers getting credit for all their work, not just that which is presented as a narrative publication. Looking at the broader research ecosystem, we can see that we must connect the whole provenance trail from the generation of the raw data to the publication of the figure to complete the cycle from reading and post-publication peer review to the generation of new hypotheses, protocols, and experiments. To this end, we’re also working on reproducible workflows with the Reproducibility Initiative, the importance of which was highlighted by a recent Nature editorial from the Director of the NIH and featured in today’s Elsevier Connect article from Genomics Data.

Congratulations to the FORCE11 team and the Data Citation Synthesis Working Group for taking this important step forward.

Mendeley Labs project turns heads at Web Science 2013

Head Start, a Mendeley Labs project, has been nominated for best poster by conference participants at Web Science 2013. Head Start is intended to facilitate and improve the process of literature search. The visualization aims at providing an overview of an academic field, based on Mendeley data.

You know the problem… when you’re first exploring a research area, it is very hard to get an overview of the field. First, you might enter some keywords into an academic search engine such as Google Scholar. Then, you might read through the top results and read their references, provided your institution has access or if they’re available from an open access journal. With time and patience, you build a mental model of the field. There are several drawbacks to this approach: it is very laborious and time-consuming, and it’s very hard to read papers in their order of importance or even to know if you’ve found all the most important papers.

Peter Kraker from the Know-Center at Graz University of Technology has taken on the challenge of overcoming these problems. During a research stay at Mendeley for the EU project TEAM, he developed Head Start in cooperation with the Data Science group led by Kris Jack. The application presents you with the main areas in an academic field, and lets you zoom into relevant publications within each area. This allows a researcher to do most of the exploration in a single user interface.

The overview is generated (almost) automatically using Mendeley’s data about readership of academic papers within a discipline. Readership co-occurrence is used as a measure of subject similarity. The more often two books are checked out of the library together, the more likely they’re on the same subject, and so with academic papers – the more often two papers occur in someone’s Mendeley library, the more likely they are to be on similar subjects. The documents are then grouped by subject area and displayed using D3.js, a JavaScript library for making interactive visualizations on the web, made popular by the New York Times graphics department.
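The co-occurrence idea can be sketched in a few lines of Python. This is only an illustration of counting how often two papers share a library: the function name `co_readership`, the paper IDs and the sample libraries are invented, and Head Start’s actual similarity measure, normalization and clustering steps are not described here.

```python
from collections import Counter
from itertools import combinations


def co_readership(libraries):
    """Count how often each pair of papers appears in the same library."""
    counts = Counter()
    for library in libraries:
        # De-duplicate and sort so each pair has one canonical ordering.
        for pair in combinations(sorted(set(library)), 2):
            counts[pair] += 1
    return counts


# Hypothetical reader libraries; the paper IDs are made up for illustration.
libraries = [
    ["paper_a", "paper_b", "paper_c"],
    ["paper_a", "paper_b"],
    ["paper_b", "paper_c"],
    ["paper_a", "paper_b", "paper_d"],
]

counts = co_readership(libraries)

# The pair with the highest count is treated as the most similar in subject.
most_similar_pair, shared_libraries = counts.most_common(1)[0]
print(most_similar_pair, shared_libraries)
```

Raw counts favour widely read papers, so a real system would probably normalize them (for example by each paper’s total readership) before grouping documents into subject areas.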

Peter will present Head Start at a webinar of the Web Science Trust Laboratories. The virtual presentation will take place on Wednesday, June 12 at 16:00 London time. Attendance is free; it just needs a simple registration following this link. More information is also available from this paper.

Please check out Peter’s demo and poster and let us know what you think!