Introducing Elsevier DataSearch – Beta Two

Elsevier DataSearch has been updated and improved.

Introduction

Elsevier DataSearch (https://datasearch.elsevier.com) is a data search engine that allows scientists and researchers to search for many different data types and formats across a variety of domain-specific and cross-domain institutional data repositories and other data sources. Results display datasets in a unified way to help users find relevant and useful research data, and to let them quickly preview and assess data in situ before viewing it in the destination repository. By generating previews of the actual data inline (e.g., spreadsheets, images, interactive maps), DataSearch helps users scan through multiple potentially interesting datasets much faster. DataSearch indexes both metadata and data to facilitate the matching of queries to objects described in the research.

DataSearch is one of the complementary offerings in Elsevier’s Mendeley Data Platform for Institutions.

Beta Two

After the initial launch in June 2016, we gathered feedback from users to make iterative improvements to the search experience, especially around relevancy and ranking. Users can also facet by data type, data source, data source type and publication date. Development is under way to let users facet by subject classification, based on Elsevier’s OmniScience taxonomy.
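As a rough illustration of what faceting by data type, data source or publication date means in practice, here is a minimal Python sketch. It is not DataSearch's implementation: the records, field names (data_type, source, year) and the helper functions facet_counts and apply_facets are all hypothetical, made up for this example.

```python
from collections import Counter

# Hypothetical records; the field names are illustrative only and are
# not taken from DataSearch's actual index or schema.
RECORDS = [
    {"title": "Glacier photographs", "data_type": "image", "source": "Repository A", "year": 2015},
    {"title": "Ice chart archive", "data_type": "image", "source": "Repository B", "year": 2016},
    {"title": "Stream gauge readings", "data_type": "tabular", "source": "Repository A", "year": 2016},
]


def facet_counts(records, field):
    """Count how many records fall under each value of one facet field."""
    return Counter(record[field] for record in records)


def apply_facets(records, **selected):
    """Keep only the records that match every selected facet value."""
    return [
        record
        for record in records
        if all(record.get(field) == value for field, value in selected.items())
    ]


if __name__ == "__main__":
    # Counts shown next to each facet value in a results sidebar.
    print(facet_counts(RECORDS, "data_type"))
    # Narrowing the result list after a user picks two facet values.
    for hit in apply_facets(RECORDS, data_type="image", year=2016):
        print(hit["title"])
```

The counts are what a search interface would typically show beside each facet value, and apply_facets narrows the result list once a user has picked one or more values.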

Data sources covered by DataSearch now include:

Many more data sources will be added in the coming months, including life sciences repositories.

If you would like to have your institution’s data repository, local data and/or local active data indexed by DataSearch, please contact us at datasearch-support@elsevier.com.

APIs

DataSearch has a “Pull” API that allows users to embed DataSearch results and data previews in their applications. Development is in progress for a “Push” API that will soon allow any repository to push data directly to DataSearch to make it discoverable and previewable.

Beyond ‘Download Science’: Or How to Not Drown in Data at the AGU

Topic-specific conferences are no longer just focused on research; data sharing initiatives are now a major part of the discussions happening in research. With Mendeley Data and other data management initiatives, we are always looking to learn more, directly from the researchers. Anita DeWaard, Vice President of Data Research and Collaboration for Elsevier, shares how she learned that data repositories often struggle with similar issues, and how collaboration can help address those issues.

I attended my first AGU meeting in New Orleans last fall, with the intention to learn more about informatics, metadata and research data in the Earth and Planetary sciences. For a newbie, this meeting is an intimidating affair where over 25,000 scientists gather to discuss topics ranging from plate tectonics to probabilistic flood mapping, and from solar prominences to paleoclimatology.

Informatics and metadata played a huge role in the program sessions. The Earth and Space Science Informatics Section alone, for example, accounted for a staggering 1,200 talks and posters on that topic. And that by no means comprises the full extent of sessions about informatics and metadata: for instance, the Hydrology Section has not one but two sessions (with 10 – 20 papers each) on ‘Advances in Data Integration, Inverse Methods, and Data Valuation across a Range of Scales in Hydrogeophysics’, and the Public Affairs Section hosts ‘Care and Feeding of Physical Samples and Their Metadata’.

It is easy to feel overwhelmed. Yet once I stopped focusing on watching the endless streams of people moving up and down escalators to more and more rooms full of posters and talks (and once I finally retrieved my Flat White after the seemingly endless fleece-clad line at the Starbucks!) I learned that if you just jump in the stream and go with the flow, the AGU is really just a great ride.

I was involved in three events: a session about data discovery, one on Unique Identifiers, and the Data Rescue award, which Elsevier helped organize together with IEDA, the Interdisciplinary Earth Data Alliance (http://www.iedadata.org/).

Data Repository issues; Or, how to come up with a means of survival

In the data discovery session, we had eight papers pertaining to searching for earth science data. Siri Jodha Khalsa and I are co-chairing a nascent group on this same topic as part of the Research Data Alliance, which is quite relevant to us in developing our DataSearch platform. It struck me how very comfortable with and aware of various aspects of data retrieval the earth science community seems to be, compared to repositories in other domains, which are just starting to talk about this.

The data repositories that presented were struggling with similar issues: how to scale to the masses of content that need to be uploaded, how to build tools that provide optimal relevancy ranking over heterogeneous and often distributed data collections, how to keep track of usage, provide useful recommendations, and offer personalisation services when most search engines do not ask for login details, all with a barebones staff and an organisation that is more often than not asked to come up with means for its own survival.

The end of download science

At the Poster session that evening, it was exciting to see the multitude of work being done pertaining to data discoverability. One of the most interesting concepts for me was in a poster by Viktor Pankratius from MIT, who developed a ‘computer-aided discovery system’ for detecting patterns, generating hypotheses, and even driving further data collection from a set of tools running in the cloud.

Pankratius predicted the ‘end of download science’: whereas in the past, (earth) scientists did most of their data-intensive work by downloading datasets from various locations, writing tools to parse, analyze and combine them, and publishing (only) their outcomes, Pankratius and many others are developing analysis tools that are native to the cloud, and are shared and made available together with the datasets for reuse.

Persistent Identifiers

On Thursday, I spoke at a session entitled “Persistent Identification, Publication, and Trustworthy Management of Research Resources”: two separate but related topics. The first three talks focused on trustworthiness. Persistent identifiers are a seemingly boring topic that, however, just got its own very groovy conference, PIDaPalooza (leave it to Geoff Bilder to groovify even the nerdiest of topics!).

One of the papers in that session (https://agu.confex.com/agu/fm16/meetingapp.cgi/Paper/173684) discussed a new RDA Initiative, Scholix, which uses DOIs for papers and datasets to enable a fully open linked data repository that connects researchers with their publications and published datasets. Scholix represents a very productive collaboration, spearheaded by the RDA Data Publishing Group, involving many parties: publishers (including Thomson Reuters, IEEE, Europe PMC and Elsevier), data centres (the Australian National Data Service, IEDA, ICPSR, CCDC, 3TU DataCenter, Pangaea and others) and aggregators and integrators (including CrossRef, DataCite and OpenAIRE).

Persistent identifiers combine with semantic technologies to enable a whole that is much more than the sum of its parts, and that surely points the way forward in science publishing; it allows, for instance, Mendeley Data users to directly address and compare different versions of a dataset (for some other examples see my slides here).

Celebrating the restoration of lost datasets

A further highlight was the third International Data Rescue Award. This award is intended to reward and celebrate the usually thankless task of restoring datasets that would otherwise disappear or be unavailable. It brings together (and aims to support the creation of) a community of very diverse researchers, who all have a passion for restoring data.

This year’s winners were from the University of Colorado in Boulder. Over a period of more than fifteen years, they rescued and made accessible the data at the Roger G. Barry Archive at the National Snow and Ice Data Center, which consists of a vast repository of materials, including over 20,000 prints, over 100,000 images on microfilm, 1,400 glass plates, 1,600 slides, over 100 cubic feet of manuscript material and over 8,000 ice charts. The materials are incredibly diverse, date from 1850 to the present day, and include, for instance, hand-written 19th century exploration diaries and observational data. Projects such as these tell us the incredible importance of data, especially in times like these, where so much is changing so quickly. The pictures in the Glacier Photograph Collection show the incredible extent of glacial erosion between, for instance, 1941 and today, when an entire glacier has simply vanished, and offer a grim reminder of the extent to which global warming is affecting our world.

In short, there is a lot out there for all of us to learn from going to the AGU. Earth science is abuzz with data sharing initiatives: there are exciting new frontiers to explore, important lessons to be learned, and invaluable data to be saved.

Mendeley Brainstorm: Assistive Technology – Powerful and Pervasive

Thanks to assistive technologies, impaired no longer means disabled.

The Paralympic Games open on September 7th; they are a visible example of how powerful and pervasive assistive technology has become. This month, we’re asking: what is the most innovative assistive technology application you’ve seen?  We are looking for the most well thought out answer to this question in up to 150 words: use the comment feature below the blog and please feel free to promote your research!  The winner will receive an Amazon gift certificate worth $50 and a bag full of Mendeley items; competition closes September 28th.

Powerful and Pervasive Technologies

Assistive technologies are diminishing physical limitations.  During the Democratic National Convention in Philadelphia, the delegates were addressed by Rep. Tammy Duckworth of Illinois.  She strode to and from the podium, fully mobile, despite having lost her legs while serving in the military.

The forthcoming Paralympic Games are another powerful illustration that impaired does not mean disabled: competition is conducted at the highest level. New materials (such as carbon fibre) combined with engineering nous have created products such as the “Flex-Foot Cheetah”, which enable athletes to run who could not otherwise have walked. Other technologies compensate for the absence or impairment of senses.

For the Elderly Too

These technologies also assist the elderly. A “Smart Walker”, for example, can have a range of functionality including an “Advanced human–machine interface” in addition to providing physical support. (Martins et al., 2012, p. 555) One type of “Smart Walker” is the “SIMBIOSIS”: “This walker presents a multisensory biomechanical platform for predictive human–machine cooperation….the forces that are applied by the user on each forearm-support while walking are measured and the guidance information can be inferred. This turns out to be a natural and transparent interface that does not need previous training by the user.” (Martins et al., 2012, p. 558)

The Future?

It’s clear that assistive technology is enhancing lives, but what is the most innovative application you’ve encountered?  Tell us!

Try Elsevier DataSearch!

Partial results for DataSearch lookup for “Flex-foot Cheetah”

Note: much more information for researchers can be found via Elsevier DataSearch (https://datasearch.elsevier.com/). DataSearch works with reputable repositories across the Internet to help researchers readily find the data sets they need to accelerate their work. DataSearch offers a new and innovative approach. Most search engines don’t actively involve their users in making them better; we invite you, the user, to join our User Panel and advise how we can improve the results. We are looking for researchers in a variety of fields; no technical expertise is required (though it is welcomed). In order to join us, visit https://datasearch.elsevier.com and click on the button marked “Join Our User Panel”.

About Mendeley Brainstorms

Our Brainstorms are challenges so we can engage with you, our users, on the hottest topics in the world of research.  We look for the most in-depth and well thought through responses; the best response as judged by the Mendeley team will earn a prize.

References

Martins, M., Santos, C., Frizera-Neto, A. and Ceres, R. (2012). Assistive mobility devices focusing on Smart Walkers: Classification and review. Robotics and Autonomous Systems, 60(4), pp. 548-562.

Össur Americas. (2016) Flex-Foot Cheetah. [ONLINE] Available at: http://www.ossur.com/prosthetic-solutions/products/sport-solutions/cheetah. [Accessed 10 August 2016].

Introducing Elsevier DataSearch

Elsevier takes the next step in making researchers’ lives easier with the new DataSearch engine. You can search for research data across numerous domains and various types, from a host of domain-specific and cross-domain data repositories. It’s available at https://datasearch.elsevier.com/ – please join our User Panel to help improve it!

More Focused Searching

Mass search engines are ubiquitous and useful; however, when it comes to specific information tailored to the needs of the modern researcher, a more focused application is required.  In response to this need, Elsevier has created DataSearch.  Drawing on reputable repositories of information across the internet, researchers can readily find the data sets they need to accelerate their work.

DataSearch offers a new and innovative approach. Most search engines don’t actively involve their users in making them better; we invite you, the user, to join our User Panel and advise how we can improve the results. We are looking for users in a variety of fields; no technical expertise is required (though it is welcomed). In order to join us, visit https://datasearch.elsevier.com and click on the button marked “Join Our User Panel”. Please detail in your e-mail the following:

  • Your Name
  • Institution
  • Research Interests

We look forward to working with you and improving the research experience.