About a week ago, I had the pleasure of attending Science Hack Day with about 150 other scientists and hackers. It was an amazingly fun event with people from all over the world coming together to build cool, quirky, and otherwise awesome things over the span of a weekend. It’s a sort of high holy day for geeks like me, so I was especially thrilled that Mendeley was able to be a sponsor this year. I also enjoyed spending quality time with some of the PLoS developers and collaborating on a hack. Here are some of the highlights:
The most fun hack of the night was Syneseizure. This was an amusing play on the concept of synesthesia, the phenomenon of experiencing sensations via multiple senses at once, like hearing a sound associated with a shape or having a taste associated with a color. For this hack, the team took input from a webcam and mapped the color intensity into vibration intensity, which allowed the wearer of a special helmet to feel what they were looking at.
Other fun hacks included one that used mobile phone accelerometers as an earthquake detection system and a nice data mashup of medical coverage vs. mobile phone coverage, which suggested hotspots where access to medical information via mobile phones could be literally life-saving. The tastiest hack award had to go to the DNAquiri team, which made drinkable DNA extractions.
I spent some time adapting a sentiment analysis algorithm, the kind businesses use to see what people are saying about them online, to the scientific literature. For my hack, I extracted phrases expressing certainty or uncertainty from scientific papers at PLoS using their full-text search API. I was then able to give each paper a “certainty score” based on a very simplistic count: the number of phrases expressing certainty in a result minus the number of phrases expressing uncertainty. Then I looked up the number of readers for each paper via the Mendeley API and plotted the results. Perhaps unsurprisingly, I found that most scientific papers expressed confidence in their results, with more confident papers tending to have more readers on Mendeley. More surprisingly, I found that phrases expressing certainty are found slightly more often (p<∞) than statements indicating statistical significance.
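To make the counting concrete, here is a minimal sketch of that kind of score in Python. It is not the code from the hack: the phrase lists and the function names `count_phrases` and `certainty_score` are made up for illustration, and it works on a plain string rather than calling the PLoS or Mendeley APIs.

```python
import re

# Hypothetical phrase lists for illustration only; the phrases
# actually used in the hack are not listed in this post.
CERTAIN_PHRASES = ["clearly shows", "we demonstrate", "confirms that"]
UNCERTAIN_PHRASES = ["may suggest", "it is possible that", "remains unclear"]


def count_phrases(text, phrases):
    """Count case-insensitive, non-overlapping occurrences of each phrase."""
    lowered = text.lower()
    return sum(len(re.findall(re.escape(p), lowered)) for p in phrases)


def certainty_score(text):
    """Certainty phrases minus uncertainty phrases, as a single integer."""
    certain = count_phrases(text, CERTAIN_PHRASES)
    uncertain = count_phrases(text, UNCERTAIN_PHRASES)
    return certain - uncertain


sample = (
    "Our data clearly shows an effect. We demonstrate that it is robust, "
    "although the mechanism remains unclear."
)
print(certainty_score(sample))  # prints 1 (two certain phrases, one uncertain)
```

A positive score means the text leans confident, a negative one means it hedges more than it asserts. Plain substring counting like this ignores negation and context, which is part of why the results deserve some skepticism.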
The folks from PLoS in attendance did an analysis of publications by author gender, and if we had had more time, we would have merged the two projects to see whether the stereotype of the arrogant male professor has any basis in reality or whether successful female scientists use language just as assertive as their male counterparts.
Being a weekend exercise in frivolity, these results should certainly be taken with a healthy dose of skepticism, but it was lots of fun. My little hack does suggest that if all papers were open access, and as accessible in practice as PLoS papers, we could use modern literature analysis tools to help produce valuable insights about research.
What would you do if all research papers were as accessible as those published by PLoS?