Meet the Mendeley Data advisory board: Amy Neeser

In this series of interviews, we meet some of the members of the Mendeley Data advisory board and get their thoughts on the role of research data management (RDM), and how Mendeley Data can contribute to this.

Name: Amy Neeser

Job Title: Consulting and Outreach Lead at the University of California, Berkeley (UC Berkeley)

Bio: Amy is a data librarian working in Research IT. She coordinates the consulting efforts across the Data Management and Research Computing programs to offer a holistic approach to data and computation. She also facilitates their community, partnership, and outreach programs. She previously worked as the Research Data Management Program Manager at UC Berkeley, as Data Curation Librarian at the University of Michigan, and as a science librarian at the University of Minnesota.

What motivates you each morning?

I am passionate about research. I love that I get to enable world-changing research by helping Berkeley faculty, students, and staff address the challenges and opportunities associated with research data and computing.

What challenges do you want to see RDM fix?

There are two main things really. In terms of practicality, I would love to see RDM really focus on sensitive data needs. Currently this is often managed at an institutional level, but a nationwide or product-based solution that could address it would make a huge impact.

Secondly, I think RDM is vital for reproducibility. Technologies like containers and Jupyter Notebooks enable users to share not only their data but also the software, versions, and specs needed to analyze it. As these technologies and data management practices become more commonly used, it will be much easier to share and reproduce results!

What excites you most about Mendeley Data?

I like how the different modules and features available can easily interact with each other. And it’s practical, supporting the data management process.

I feel that Mendeley Data Repository can help institutions address the reproducibility crisis, and it can remove the need for institutions to create their own repositories at a local level.

What do you think the future holds for RDM?

I don’t think RDM can or should be “owned” by one unit or department, such as the library. It’s too big an area to be managed alone, and different players bring different expertise and experience. It calls for a combined effort.

A lot of the questions that I get are in the active phase of the research lifecycle and often include sensitive data. IT can help with these issues, but also needs the library’s expertise around the beginning (planning, finding) and end (publishing, sharing, preserving) of the research lifecycle to provide researchers with a holistic approach to their scholarship.

As more researchers from across domains use data and computational resources, I think IT must be closely aligned with the library and other important players on campus, such as the office of research.


Find out more about Mendeley Data here.

Meet other members of the Mendeley Data advisory board here.

Meet the Mendeley Data advisory board: David Groenewegen

In this series of interviews, we meet some of the members of the Mendeley Data advisory board and get their thoughts on the role of research data management (RDM), and how Mendeley Data can contribute to this.

Name: David Groenewegen

Job Title: Director, Research, Monash University Library

Bio: David Groenewegen is the Director, Research at Monash University Library. He is responsible for Library client services to the science, technology, engineering and medicine disciplines, as well as the contribution the Library makes to the University’s research activity.

David has wide-ranging experience working in the areas of electronic information provision and related technology. Before returning to Monash University Library in 2013 he spent four years as a Director of the Australian National Data Service, where he was involved with the development and implementation of data management solutions across the Australian university sector.

What motivates you each morning?

The thing I most love doing is trying to find ways to help our researchers do their job better, which in the library means giving them the tools, training and resources they need, at the time they need them, and in ways that simplify their lives rather than complicate them. I’ve been lucky to have the chance to try lots of new and cool things in my career, and I’m always looking for the next one.

What challenges do you want to see RDM fix?

I want things to become frictionless. I’d like to see software that’s smart enough to understand the subtleties of where data is stored and connect it with other software and processes throughout the research lifecycle. This would really help to overcome the messiness caused by having information all over the place.

What excites you most about Mendeley Data?

One valuable thing that Mendeley Data is trying to address is how to bring data together and manage it in a consistent, end-to-end way. But for me, the modular aspect of Mendeley Data is the most exciting part. You’re not locked into one solution; instead, you can plug different Mendeley Data modules into your own workflows – it’s the way universities like ours want to work.

What do you think the future holds for RDM?

The need for RDM is well known, but there are still a lot of people struggling with finding the most frictionless way of doing things. Bespoke software might appear to be the best solution, but often this won’t work fantastically well, as integrating new processes into existing workflows isn’t easy. RDM isn’t as simple as storing data in a repository. I’m seeing growing recognition of the need to curate data and package it up for later use, so that others can get a decent answer out of it. Most of the tools currently available don’t support this very well.

Following on from this, long-term curation and management of shared data is also a key area I’d like to see develop. What was considered a lot of data 10 years ago isn’t now, but it’s not feasible to keep buying more storage so that we can keep everything just in case. Improving metadata goes a long way towards addressing this, as it enables you to make quick decisions later on, but I’d also like to see new processes developed that help us identify when we no longer need to hold certain data.

Find out more about Mendeley Data here.

Meet other members of the Mendeley Data advisory board here.

Meet the Mendeley Data advisory board: Rebecca Koskela

Sharing research data has the potential to make research more reproducible and efficient. When developing Mendeley Data – an ecosystem that enables data to be stored, shared and re-used – we worked with a board of librarians and research leaders from across the research data management community.

In this series of interviews, we meet some of the members of the Mendeley Data advisory board and get their thoughts on the role of research data management (RDM), and how Mendeley Data can contribute to this.

Name: Rebecca Koskela

Job Title: Executive Director of DataONE at University of New Mexico

Bio: Rebecca Koskela is responsible for the day-to-day operation of DataONE—coordinating all technical, management, reporting, and budget issues.

Prior to her current position, Rebecca was the Life Sciences Informatics Manager for Alaska INBRE, and the Biostatistics and Epidemiology Core Manager for the Center for Alaska Native Health Research at the University of Alaska Fairbanks. In addition to her bioinformatics experience, Rebecca has over 25 years of experience in high performance computing, including positions at Sandia National Laboratories, Los Alamos National Laboratory, Cray Research and Intel.

What motivates you each morning?

In addition to duties at DataONE, I’m a volunteer for other projects, such as EarthCube and Research Data Alliance, which are also concerned with research data management. The collaboration with these other projects moves them all forward.

What challenges do you want to see RDM fix?

There are two main challenges that I’d like to see addressed more quickly. First, it’s great that more and more funding agencies are requiring data management plans, but I think we’re lagging in the development of tools to help people do the actual planning.

I also still see problems today around data discovery and the need for adequate documentation to re-use data. In 2010, we carried out a survey at DataONE which found that researchers had a limited understanding of metadata standards. Unfortunately, even with the emphasis on FAIR data, we still have a long way to go to highlight the significance of metadata.

What excites you most about Mendeley Data?

The thing that stands out to me the most about Mendeley Data is that, contrary to what people may think, Elsevier doesn’t own the data – it remains in the control of the researcher. I love that.

I also really like the fact that users can pick and choose which modules they’d like to use. This means that you can get started somewhere, and have the option to expand into other RDM tools when it suits you, instead of having to start using everything from the outset.

What value does Mendeley Data bring to the space?

Mendeley Data is all about education – it helps people learn what is meant by RDM, and then provides the tools to do it.

I also like the fact that you can manage different metadata standards with Mendeley Data. It’s a good quality product built on strong coding.

What do you think the future holds for RDM?

I hope that people will pay attention to the need for quality metadata. I’d like to see better tools being developed that will speed up change here.

I also think that education needs to play an important part in RDM – it should go hand-in-hand with tool creation. And I want to see some success stories that show how the added effort can really pay off.


Find out more about Mendeley Data here.

Meet other members of the Mendeley Data advisory board here.


Meet the team: Wouter Haak

Name: Wouter Haak
Job title: VP Research Data Management

Wouter is responsible for research data management at Elsevier, specifically the Mendeley Data platform. This is an open ecosystem of research data tools: a data repository, an electronic lab notebook, a data search tool, and a data project management tool. Aside from his work for Elsevier, Wouter is part of several open data community initiatives: for example, he co-chairs the RDA-WDS Scholix working group on data-article linking; he is part of the JISC Data2paper advisory board; and his group participates in the NIH Data Commons pilot project. It is all about the ‘R’ of FAIR data: focusing on data re-use.

Prior to Elsevier, Wouter worked in online product and strategy roles. He worked at eBay Classifieds (e.g. Marktplaats.nl and Kijiji.it) in roles varying from business development to overall responsibility for the classifieds businesses in Italy, France, Belgium and Turkey. He has also worked for the Boston Consulting Group.

When did you join Mendeley?

2016

What do you love most about your job?

I love speaking to researchers about their projects and visions. Going to universities and learning about the things they do, I’m proud that I can contribute a tiny piece to this amazing world.

What book did you most recently read?

I read the Cicero trilogy by Robert Harris. It’s amazing how something that takes place during the Roman Empire still feels relevant today. The main character is not Cicero but his slave, Tiro. Tiro – quietly working in the background – is actually the hero of the story.

What’s the one thing you want people to know about Mendeley?

That Mendeley is becoming more than a reference manager. I would like to see Mendeley grow into a daily virtual partner for researchers.

How would you explain your job to a stranger on a bus?

I help researchers and universities make better re-use of the data and measurements that they create.

What’s the most exciting part of your job?

In my direct team of about 50 people, I find it exciting that we have more than 10 nationalities. I have lost count and that is fun.

What keeps you awake at night?

Nothing keeps me awake at night. Having gone through raising young kids, I have learned that problems are best tackled during the day.

What’s the most interesting thing you’ve learned this week?

I learned that the European Open Science Cloud project is starting to have areas that are going to be very real and helpful for research overall. My plan is to see if we can contribute to this. Less so to the infrastructure but more likely on the ‘tools’ or ‘commons’ side.

Find out more about Mendeley Data

Find out more about all things Mendeley

Effective research data management with Mendeley Data

The science of tomorrow will require the data from today

All the information underpinning research articles offers value to other researchers: raw and processed data, protocols and methods, machine and environment settings, and scripts and algorithms. Sharing and using such research data can increase the impact, validity, reproducibility, efficiency, and transparency of research.

To unlock the true potential of research data, the Mendeley Data team believe that there is a need to move beyond solely making data available and find a dependable solution that enables data to be stored, shared and re-used. So we launched Mendeley Data. When collaborating with the research community to develop Mendeley Data, we followed four guiding data principles:

  1. Data needs to be discoverable
  2. Data needs to be comprehensible
  3. Researchers should be able to take ownership of their data
  4. Research data management (RDM) solutions need to be interoperable

Discover more about the four principles for unlocking the full potential of research data.

Empowering researchers to perform research data management

Open science benefits research and society, and drives research performance. Here are five things you need to know about RDM with Mendeley Data:

  1. Mendeley Data supports the entire lifecycle of research data: modules are specifically designed to utilize data to its fullest potential, simplifying and enhancing current ways of working
  2. Researchers own and control their data: you can choose to keep data private, or publish it under one of 16 open data licenses
  3. Mendeley Data is an open system: modules are designed to be used together, as standalones, or combined with other RDM solutions
  4. Mendeley Data can increase the exposure and impact of research: Mendeley Data Search indexes over 10 million datasets from more than 35 repositories
  5. We actively participate in the open data community: we are currently working on more than 20 projects globally

View an infographic on the five facts


Striving for superior data management for researchers

No one can solve RDM challenges alone, nor can one business unleash the full potential of research data sharing on its own. However, by following core data principles, and by continually evaluating and improving the RDM solutions built on our Mendeley Data platform, we hope to help researchers discover the value of their data.

Get started with Mendeley Data.

Find out more about all things Mendeley here

Mendeley Data: Introducing Folders

We’ve introduced Folders to make organising your data easier.

At Mendeley Data, the open research data repository, we’ve just launched folders to help dataset authors group and logically organise their research data files, in the same way they would organise files on their computer.

“It would be great if a folder structure would be applicable for datasets. For example, I would like to share data from a method comparison study. One folder for each dataset within this comparison would be most convenient.”


The folders feature was requested by our users through surveys and feedback. We will continue to listen to researchers in order to improve our service and add the features most relevant to our end users.

Authors are able to drag and drop to either create subfolders, or change the order of the folders, with any data files outside the folder structure ordered alphabetically. Click ‘Create Folder’ to start organising your files.

The process of uploading data, with the ability to click or drop any file type, will remain the same. For datasets that are already published, the ordering of files will not change. However, for datasets still in draft form, or when a new version is subsequently created, data files will be ordered alphabetically rather than in the order the dataset author had previously set.

1800 Journals Enable Data Sharing Through Mendeley Data

Use Mendeley Data to safely store, share and cite your research data.

You may have noticed that funding bodies and universities increasingly require you to share your research data at the end of your project. This often coincides with the time when you publish papers about your research. Therefore, journals are looking for ways to make it easier for you to share your data and comply with funder mandates. Mendeley Data can help with that.

Elsevier announced earlier this month that they are now implementing journal data guidelines for all their journals. This means that all journals will clearly explain whether you are expected to make your data available. More importantly, this means that all journals now provide the right infrastructure for data sharing.

For most journals this means that they will provide three options. First, it is possible to link to your data in a domain-specific data repository. Domain-specific repositories are often the best place for your data because they can ask for the information that is relevant in your field. However, in cases where there is no good domain-specific repository available, these journals enable you to share your data through Mendeley Data.

When you upload your data to Mendeley Data during the article submission process, a draft of your data will become available. Only you, the editor, and the reviewers have access to this draft. This gives editors and reviewers the opportunity to take a look and provide feedback, and you can still make changes to improve your dataset. By default, your dataset will only become publicly available when your article is published. If you want to analyze your data further before sharing it with the world, you can also set an embargo date so that the dataset becomes available at a later time.

In cases where you cannot share your data at all, you will have the option to make a data statement, explaining why your data is unavailable. Should you wish to make your data available at a later point in time, just go to data.mendeley.com and indicate that this dataset is linked to an article. We will make sure your article links back to your dataset to ensure it gets the attention it deserves.

Mendeley Data awarded Data Seal of Approval

On June 22, it was announced that Mendeley Data’s open research data repository won the Data Seal of Approval certification; this award confirms that the repository complies with the Data Seal of Approval guidelines, and is a trusted digital repository.

The 16 rigorous guidelines include guarantees of the “integrity and authenticity of the data” and “protection of the facility and its data, products, services, and users”. These guidelines also ensure that data can be easily cited.

When choosing a repository to deposit and share your data, it’s important to know that your data will be stored safely, and will be available, findable and accessible over the long-term. Choosing a certified repository is a way to ensure this. For this reason JISC, the UK’s publicly-funded research advice body, recommends selecting a certified repository to store your data.

The Data Seal of Approval certification highlights the value of the services provided by Mendeley Data, and Elsevier’s wider commitment to helping researchers make maximum use of their data, as well as store their vital data safely.

About Mendeley Data

Mendeley Data is a secure cloud-based repository where researchers can store data, ensuring it is easy to share, access and cite, wherever they are. Research data is published with a Force11-compliant citation, and it is backed up by DANS (Data Archiving and Networked Services) to ensure that it is safely archived.

Mendeley Data can be accessed at http://data.mendeley.com/.

Beyond ‘Download Science’: Or How to Not Drown in Data at the AGU

Topic-specific conferences are no longer just focused on research; data sharing initiatives are now a major part of the discussions happening in research. With Mendeley Data and other data management initiatives, we are always looking to learn more, directly from the researchers. Anita de Waard, Vice President of Data Research and Collaboration at Elsevier, shares how she learned that data repositories often struggle with similar issues, and how collaboration can help address them.


I attended my first AGU meeting in New Orleans last fall, with the intention to learn more about informatics, metadata and research data in the Earth and Planetary sciences. For a newbie, this meeting is an intimidating affair where over 25,000 scientists gather to discuss topics ranging from plate tectonics to probabilistic flood mapping, and from solar prominences to paleoclimatology.

Informatics and metadata played a huge role in the program sessions. The Earth and Space Science Informatics Section alone, for example, amounted to a staggering 1,200 talks and posters. And that by no means comprises the full extent of sessions about informatics and metadata: for instance, the Hydrology Section had not one but two sessions (with 10–20 papers each) on ‘Advances in Data Integration, Inverse Methods, and Data Valuation across a Range of Scales in Hydrogeophysics’, and the Public Affairs Section hosted ‘Care and Feeding of Physical Samples and Their Metadata’.

It is easy to feel overwhelmed. Yet once I stopped focusing on watching the endless streams of people moving up and down escalators to more and more rooms full of posters and talks (and once I finally retrieved my Flat White after the seemingly endless fleece-clad line at the Starbucks!) I learned that if you just jump in the stream and go with the flow, the AGU is really just a great ride.

I was involved in three events: a session about data discovery, one on Unique Identifiers, and the Data Rescue Award, which Elsevier helped organize together with IEDA, the Interdisciplinary Earth Data Alliance (http://www.iedadata.org/).

Data Repository issues; Or, how to come up with a means of survival

In the data discovery session, we had eight papers pertaining to searching for earth science data. Siri Jodha Khalsa and I are co-chairing a nascent Research Data Alliance group on this same topic, which is quite relevant to us in developing our DataSearch platform. It struck me how comfortable with, and aware of, the various aspects of data retrieval the earth science community seems to be, compared to repositories in other domains, which are just starting to talk about this.

The data repositories that presented were struggling with similar issues: how to scale to the masses of content that need to be uploaded; how to build tools that provide optimal relevancy ranking over heterogeneous and often distributed data collections; how to keep track of usage, provide useful recommendations, and offer personalisation services when most search engines do not ask for login details – all with a barebones staff and an organisation that is more often than not asked to come up with the means for its own survival.

The end of download science

At the Poster session that evening, it was exciting to see the multitude of work being done pertaining to data discoverability. One of the most interesting concepts for me was in a poster by Viktor Pankratius from MIT, who developed a ‘computer-aided discovery system’ for detecting patterns, generating hypotheses, and even driving further data collection from a set of tools running in the cloud.

Pankratius predicted the ‘end of download science’: whereas in the past, (earth) scientists did most of their data-intensive work by downloading datasets from various locations, writing tools to parse, analyze and combine them, and publishing (only) their outcomes, Pankratius and many others are now developing analysis tools that are native to the cloud and are shared and made available together with the datasets for reuse.

Persistent Identifiers

On Thursday, I spoke at a session entitled “Persistent Identification, Publication, and Trustworthy Management of Research Resources”: two separate but related topics. The first three talks focused on trustworthiness. Persistent identifiers are a seemingly boring topic that, however, just got their own very groovy conference, PIDapalooza (leave it to Geoff Bilder to groovify even the nerdiest of topics!).

One of the papers in that session (https://agu.confex.com/agu/fm16/meetingapp.cgi/Paper/173684) discussed a new RDA Initiative, Scholix, which uses DOIs for papers and datasets to enable a fully open linked data repository that connects researchers with their publications and published datasets. Scholix represents a very productive collaboration, spearheaded by the RDA Data Publishing Group, involving many parties including publishers (including Thomson Reuters, IEEE, Europe PMC and Elsevier), data centres (the Australian National Data Service, IEDA, ICPSR, CCDC, 3TU DataCenter, Pangaea and others) and aggregators and integrators (including CrossRef, DataCite and OpenAIRE).
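As a rough illustration of the kind of link Scholix exchanges, here is a minimal Scholix-style record sketched in Python. The field names follow the general shape of the Scholix information model, but this is a simplified sketch with hypothetical DOIs and a hypothetical link provider, not the full schema:

```python
# A simplified, Scholix-style link record connecting an article to a dataset.
# Field names approximate the Scholix information model; the DOIs and the
# link provider below are hypothetical placeholders.

article_dataset_link = {
    "RelationshipType": "References",
    "Source": {  # the article end of the link
        "Identifier": {"ID": "10.1000/example-article", "IDScheme": "doi"},
        "Type": "literature",
    },
    "Target": {  # the dataset end of the link
        "Identifier": {"ID": "10.17632/example-dataset.1", "IDScheme": "doi"},
        "Type": "dataset",
    },
    "LinkProvider": [{"Name": "Example Aggregator"}],
}

# A consumer can follow either end of the link by resolving its DOI.
print(article_dataset_link["Target"]["Identifier"]["ID"])
```

Because both ends of the link are expressed as persistent identifiers rather than plain URLs, the same record can be exchanged and resolved by any of the publishers, data centres and aggregators involved.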

Persistent identifiers combine with semantic technologies to enable a whole that is much more than the sum of its parts, which surely points the way forward in science publishing; it allows, for instance, Mendeley Data users to directly address and compare different versions of a dataset (for some other examples, see my slides here).
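Addressing dataset versions in this way relies on the DOI carrying a version suffix. The sketch below assumes a base-identifier-plus-version pattern (`<prefix>/<id>.<version>`) and uses hypothetical identifiers; it illustrates the idea rather than any repository's exact scheme:

```python
# Sketch: telling dataset versions apart via versioned DOIs.
# Assumes DOIs of the form "<prefix>/<id>.<version>"; the identifiers
# below are hypothetical.

def parse_versioned_doi(doi: str):
    """Split a versioned dataset DOI into (base identifier, version number)."""
    base, _, version = doi.rpartition(".")
    return base, int(version)

def is_newer(doi_a: str, doi_b: str) -> bool:
    """True if doi_a is a later version of the same dataset as doi_b."""
    base_a, version_a = parse_versioned_doi(doi_a)
    base_b, version_b = parse_versioned_doi(doi_b)
    return base_a == base_b and version_a > version_b

print(is_newer("10.17632/abc123.2", "10.17632/abc123.1"))  # True
```

Because each version gets its own resolvable identifier, a citation always points to the exact data a paper used, while the base identifier still groups all versions of the dataset together.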

Celebrating the restoration of lost datasets

A further highlight was the third International Data Rescue Award. The award is intended to reward and celebrate the usually thankless task of restoring datasets that would otherwise disappear or be unavailable. It brings together (and aims to support the creation of) a community of very diverse researchers who all share a passion for restoring data.

This year’s winners were from the University of Colorado in Boulder. Over a period of more than fifteen years, they rescued and made accessible the Roger G. Barry Archive at the National Snow and Ice Data Center, a vast repository of materials that includes over 20,000 prints, over 100,000 images on microfilm, 1,400 glass plates, 1,600 slides, over 100 cubic feet of manuscript material and over 8,000 ice charts. The materials are incredibly diverse, dating from 1850 to the present day, and include, for instance, hand-written 19th-century exploration diaries and observational data. Projects such as these show us the importance of data, especially in times like these, when so much is changing so quickly. Looking at the pictures in the Glacier Photograph Collection reveals the extent of glacial erosion between, for instance, 1941 and today, when an entire glacier has simply vanished – a grim reminder of how global warming is affecting our world.

In short, there is a lot out there for all of us to learn from going to the AGU. Earth science is abuzz with data sharing initiatives: there are exciting new frontiers to explore, important lessons to be learned, and invaluable data to be saved.