Do you dream of creating the Blippy for Brightkite, or the Gowalla for GetGlue? No? Well, maybe you’re thinking beyond better ways to sell stuff to people and wanting to try something a little bigger. You wouldn’t be alone. Universities, governmental bodies, and companies have increasingly begun to make their data available to the public, and they want it to be used! All we need now is for smart developers to realize that there’s as much money, and considerably more fame, to be had in helping people find the next cure for cancer or spot public health issues as in spotting buy-one-get-one deals at the local store. Please join us on June 11th and 12th for Hack4Knowledge.
Changing research by opening up the world’s knowledge is what we’re all about at Mendeley, so we are proud to host Hack4Knowledge on June 11th and 12th. This two-day event will feature lightning talks and API presentations by Brainscape, Elsevier, GeoIQ, and more. Hacking runs for 28 hours straight until 2 PM EDT Sunday, when you’ll have the chance to demo what you’ve built and vote for your favorite hack. Food and entertainment will be provided. The event will run concurrently in the Mendeley offices in both London and New York, with live streaming video between the two. Hack on whatever you want, but please note that entries using the Mendeley API will be eligible to win the $10,001 Binary Battle, and all Binary Battle entrants get free AWS credits.
Sound like fun? Register here:
Here are some resources to get started:
The CDC has data on the health of Americans of all ages by age, gender, race/ethnicity, and geographic location.
Health Indicators Warehouse presents a unified API to thousands of datasets covering public health.
IBM’s CityForward.org has a list of publicly available data about cities and metropolitan areas around the world.
DataSF.org has datasets from the City and County of San Francisco.
Sciencemuseum.org.uk has an API for its exhibits.
Publicdata.eu does what it says on the tin.
Here’s a Quora thread listing 60+ popular data sources.
Google has a public data explorer, with many great examples.
Hilary Mason has made a bit.ly link bundle of interesting dataset links.
Pete Warden’s Data Science Toolkit is really useful. (See also Ryan Elmore’s R interface to DSTK.)
For cleaning messy data, may I recommend Google Refine and Stanford’s Data Wrangler?
3taps collects real-time data from sites like eBay, Craigslist, and Twitter.
Here’s a Mendeley group on Linked Open Data in Science.
The DataMarkets blog has written a great post about the emerging field of data markets, including a list of the major ones, such as Infochimps, Factual, Freebase, and, of course, Mendeley neighbors Timetric!