Mendeley at JCDL 2014

Image by Patrick Hochstenbach @hochstenbach

The Mendeley Data Science team have been busy attending some important events around the world. One of them has been JCDL 2014, the most prominent conference in the Digital Libraries arena. The conference looks at many of the problems we’re tackling at the moment, such as article recommendations and the best ways of automatically extracting information from research articles.

Maya Hristakeva, Senior Data Scientist at Mendeley, was particularly excited about the various approaches to topic modelling that were discussed at the event. “Topics were used as features for a diverse range of tasks, such as prediction of an author’s future citation counts, making personalised recommendations, search, author disambiguation, and creating more relevant citation networks, all features that make a direct impact to the research workflow on Mendeley.”

“We saw some really thought-provoking output come out of the JCDL14 proceedings, such as Characterizing Scholar Popularity: A Case Study in the Computer Science Research Community,” explains Kris Jack, Chief Data Scientist at Mendeley. “Some of the interesting research questions raised included one by Gonçalves, G. D., Figueiredo, F., Almeida, J. M., & Gonçalves, M. A. (2014), which asked whether it is possible to represent the popularity of a researcher using the number of readers that they have.”

“It was also nice to see evidence in some of the papers presented that Mendeley readership is highly correlated with various measures of academic impact, such as h-index and publication venue importance,” says Mendeley Senior Data Scientist Phil Gooch.

Overall, this was a really valuable opportunity to connect with researchers who are working on problems similar to those we face at Mendeley, such as metadata extraction, recommendations, and citation/author/venue disambiguation. We’re now thinking about running an open challenge to focus this research into concrete output that could be of use in features for our users. If you have any ideas around that, do get in touch on Twitter with @_krisjack, @mayahhf and @Phil_Gooch.

Note: At Mendeley, we believe in dogfooding (it’s not as disgusting as it sounds, merely techy slang for using your own product to validate the qualities of that product…) so Maya, Kris and Phil took notes using Mendeley Desktop 🙂

Discussing the Future of Recommender Systems at RecSys2014


Maya and Kris from the Mendeley Data Science team have just returned from RecSys2014, the most important conference in the Recommender System world. RecSys is remarkable in that it attracts an equal number of participants from industry and academia, many of whom are at the forefront of innovation in their fields.

The team had a chance to exchange perspectives and experiences with various researchers, scholars and practitioners.

“To me, it was encouraging to see how top companies across the world are investing in recommenders, as they are shown to enhance customer satisfaction and bring real value to both users and companies,” says Mendeley Senior Data Scientist Maya Hristakeva. “LinkedIn reported that 50% of the connections made in their social network come from their follower recommender, while Netflix says that if they can stop 1% of users from cancelling their subscription then that’s worth $500M a year, which of course justifies the fact they are investing $150M/year in their content recommendation team, consisting of 300 people.”

But one of the advantages of such a hybrid event is that it did not shy away from addressing the broader issues, such as how to guard against creating a “filter bubble” effect, how to preserve users’ privacy, and how to optimise systems for what really matters (and how this can be effectively defined). Daniel Tunkelang of LinkedIn and Xavier Amatriain of Netflix moderated a panel on “Controversial Questions About Personalization”, tackling some of these topics head on. Hector Garcia Molina from Stanford University also put forward the view that we’ll increasingly see a convergence of recommendations, search and advertising, despite noticeable scepticism from the attendees.

Kris Jack, Chief Data Scientist at Mendeley, says one of the main messages that he took away from the conference was the importance of winning a user’s trust in the early stages of using a recommender system.

“The best systems have been shown to start off by providing recommendations that can quickly be evaluated by users as being useful before gradually introducing more novel recommendations. So in the case of helping researchers to find relevant articles to read, it’s probably best to start by recommending well-known but important articles in their field, before recommending some less well-known but very pertinent articles to their specific problem domain,” explains Kris. “Other important factors include reranking (the order in which recommendations should be shown), the UI design that can best support interaction with the recommender system, and the ways in which we can build context-aware recommendations.”
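To make the "trust first, novelty later" idea a little more concrete, here is a minimal Python sketch of one way it could be expressed as a reranking step. It is not Mendeley's recommender or anything presented at RecSys: the `Candidate` fields, the `novelty_weight` ramp of 20 interactions and the two example articles are all made-up assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Candidate:
    """A hypothetical recommendation candidate; both scores are assumed to lie in 0..1."""
    title: str
    popularity: float  # how well known the article is in the field
    relevance: float   # how well it fits the user's specific problem domain


def novelty_weight(interactions: int, ramp: int = 20) -> float:
    """Share of the score given to niche relevance; grows linearly from 0 to 1.

    `ramp` (an assumed tuning knob) is the number of interactions after
    which the user is treated as fully trusting the system.
    """
    return min(max(interactions, 0) / ramp, 1.0)


def rerank(candidates: List[Candidate], interactions: int) -> List[Candidate]:
    """Order candidates by a blend of popularity and relevance.

    New users mostly see well-known articles; the blend shifts towards
    niche but pertinent articles as they interact more.
    """
    w = novelty_weight(interactions)

    def score(c: Candidate) -> float:
        return (1 - w) * c.popularity + w * c.relevance

    return sorted(candidates, key=score, reverse=True)


# Illustrative, invented data: one well-known article and one niche article.
articles = [
    Candidate("Field-defining survey", popularity=0.9, relevance=0.4),
    Candidate("Niche method paper", popularity=0.2, relevance=0.9),
]
new_user_order = [c.title for c in rerank(articles, interactions=0)]
veteran_order = [c.title for c in rerank(articles, interactions=40)]
```

In this toy setup a brand-new user sees the survey first, while a user with 40 recorded interactions sees the niche paper first. A real system would need to learn both scores and the ramp from data rather than hard-coding them.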

What do you think of the current recommendation features on Mendeley? Are there any particular ones that you’d like to see implemented? Would you like to join the team and work on making them even better? Let us know in the comments below, or tweet the team directly @_krisjack, @mayahhf and @Phil_Gooch. If you’re interested in finding out more about what the Data Science Team is developing in that arena, you can also watch their Mendeley Open Day presentation here.

Mendeley Mini-Conference on Recommender Systems

Last week, Mendeley hosted an all-day mini-conference on Academic-Industrial Collaborations for Recommender Systems. As we’re fast running out of space in our London office, we rented a nearby venue called Headrooms. With friendly staff looking after everyone’s needs and great start-up décor, we’ll definitely be coming back for future Mendeley events. In the morning and early afternoon we were treated to seven talks from a variety of speakers who shared their experiences of academic-industrial collaborations and recommender systems. We finished the afternoon by splitting into smaller groups to discuss the challenges involved in making such collaborations a success and sharing useful advice with one another. The day then finished, as all good days do, with a quick trip to the funkily named Giant Robot, to taste some of their excellent cocktails. Our Chief Data Scientist Kris Jack, who masterminded this great event, shares some of the day’s highlights:

Presentations

Seven presentations were delivered by our eight speakers, one of them being an entertaining double act. We tried to film as much of the event as we could so that we could share it with you, so click on the links below to watch the presentations!

First off, Jagadeesh Gorla began with a presentation entitled A Bi-directional Unified Model. Jagadeesh talked about the results presented in his WWW 2013 paper on group recommendations via Information Matching, a new probabilistic model based on ideas from the field of Information Retrieval. The model learns probabilities expressing the match between arbitrary user and item features, which makes it both flexible and powerful. He is currently working on developing an online implementation for deployment on an online gaming platform.

Our double act, Nikos Manouselis and Christoph Trattner, then followed with the intriguingly entitled presentation Je t’aime… moi non plus: reporting on the opportunities, expectations and challenges of a real academic-industrial collaboration. They gave an honest and candid reflection on their expectations for working together and on how some of their past experiences in other collaborations weren’t as successful as hoped. It was great material that fed into the discussions later in the day.

Heimo Gursch then gave some Thoughts on Access Control in Enterprise Recommender Systems. While his project is still in the early stages, he had quite a few experiences that he could share from working with industry partners from the perspective of an academic. He was working on designing a system that would allow employees in a company to effectively share their access control rights with one another rather than relying on a top-down authority to provide them. It’s also the first time that I’ve seen a presenter give his car keys to a member of the audience. I do hope that he got them back.

Maciej Dabrowski delivered an exciting presentation, Towards Near Real-Time Social Recommendations in an Enterprise. He and his team have been working on a cross-domain recommendation system that works in a federated manner. It exploits semantic data from linked data repositories to generate recommendations that span multiple domains.

Mark Levy, from our team here at Mendeley, then presented some of the work that he has been doing in a talk entitled Item Similarity Revisited. The presentation was filled with useful advice from an industrial perspective on what makes a good recommender system. He also explored the idea that simple algorithms may be more useful than complex ones in an industry setting, showing some impressive results to back it up.

Benjamin Habegger then took us on a rollercoaster ride exploring some of his successes and failures in his last startup, 109Lab: Feedback from a Start-up experience in Collaboration with Academia. He reflected on many of his experiences co-founding a start-up and on what he learned from the mistakes that were made. Although he worked with academia during the process, he wasn’t clear about the value that it actually brought.

Finally, Thomas Stone presented Venture Rounds, NDAs and Toolkits – experiences in Applying Recommender Systems to Venture Finance. Thomas had some nightmare experiences with NDAs during his PhD, so much so that he’s still unclear what he has the right to publish in his thesis. He also gave a nice introduction to PredictionIO, an open source machine learning server.

Discussion Groups

Once the presentations were given, everyone was invited to think about the challenges and difficulties that they had faced in working in academic-industry collaborations and to write down some topics on a flip chart. We then split into three groups and, using these topics as guidance, discussed the issues faced and presented some solutions.

A number of issues were identified including:

  • Prototypes vs production code: do the partners know what is expected from whom?
  • How to find the right partners
  • Access to data (e.g. NDA issues)
  • Evaluating systems
  • Best practices

After the three groups discussed the points, we all gathered back to share our thoughts and conclusions. In general, we all seemed to share similar problems in making academic-industry collaborations successful. We agreed that there should always be a clear set of expectations from the outset and that partners should know their roles. Communication lines should be kept open and the spirit of collaboration encouraged. What’s more, it can help to have members of the teams working together in the same physical location, even if it’s just for a brief period.

Working in academic-industrial collaborations is hugely rewarding but it can be tough.  Finding the right partners who understand each other’s goals and constraints is important from the outset.  We can all learn from one another but we need to put in some effort in order to enjoy the rewards.

I’d like to thank everyone who put in the effort to make the workshop a success and, as I follow up the several e-mails that I’ve got, hope to start some new and fruitful collaborations!