Where to Find Open Data

by Brett Lord-Castillo

Need a sample open data set? Look at these sites.
Open data is commonly used in education and training. You want to learn new tools or teach your skills to others, and for that you need sample data. Open data gives you access to a wide variety of data, including “big data” datasets, without licensing concerns. Just today, I was helping someone with the basic question, “Where can I find sample datasets?” and realized I should share my response to help further this specific use of open data.

As always, my answer is biased towards geospatial.

Socrata Communities

One of the first places to look is the Socrata Communities.
https://communities.socrata.com/
Most of these sets were generated during the 2013 National Day of Civic Hacking, so they may be a little dated, but there are still just over 1000 sample datasets across many locations and topics. The foundry API on top of most of these datasets can make them very interesting to work with.
Just go to “Browse datasets from all groups” to search through all the data available.
(And OpenDataSTL has one of the all-time most accessed datasets on this site.)
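
As a rough sketch of what working with that API can look like, the snippet below builds a query URL in the style of the Socrata Open Data API (SODA), where a dataset is exposed at /resource/<dataset-id>.json and filtered with parameters such as $where and $limit. The dataset id "abcd-1234" and the "year" column are placeholders I made up for illustration, and whether a given dataset on this site answers at that endpoint is an assumption you should check against the dataset's own API page.

```python
from urllib.parse import urlencode


def soda_query_url(domain, dataset_id, where=None, limit=5):
    """Build a SODA-style query URL for a dataset's JSON resource endpoint."""
    params = {"$limit": limit}
    if where:
        params["$where"] = where
    # Keep "$" unescaped so the query string stays readable.
    query = urlencode(params, safe="$")
    return f"https://{domain}/resource/{dataset_id}.json?{query}"


# "abcd-1234" is a hypothetical dataset id, not a real dataset.
url = soda_query_url(
    "communities.socrata.com", "abcd-1234", where="year = 2013", limit=5
)
print(url)
```

Paste the resulting URL into a browser, or fetch it with your HTTP client of choice, to get the matching rows back as JSON.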

Socrata Open Data Network

The Socrata Open Data Network uses the same Socrata portal hosting as Socrata Communities, but its data is regularly updated and comes from authoritative publishers.
https://data.opendatanetwork.com/
Participation is still relatively small, but you can expect long-term stability and regular updates from these portals.

CKAN and DKAN

CKAN is a self-hosted open source equivalent to Socrata (Socrata is open source as well). While Socrata is mostly in North America, CKAN covers the world. You can find a list of known CKAN portals at http://ckan.org/instances/.
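
Once you have picked a portal, CKAN's Action API gives you a consistent way to search it. Here is a minimal sketch that builds a package_search URL and pulls dataset titles out of the JSON response. The portal address "https://ckan.example.org" and the sample response are placeholders I wrote by hand for illustration, and individual portals may run different CKAN versions or restrict the API.

```python
import json
from urllib.parse import urlencode


def ckan_search_url(portal, query, rows=5):
    """Build a URL for the CKAN Action API package_search call."""
    params = {"q": query, "rows": rows}
    return f"{portal.rstrip('/')}/api/3/action/package_search?{urlencode(params)}"


def dataset_titles(response_text):
    """Return dataset titles from a package_search JSON response."""
    payload = json.loads(response_text)
    if not payload.get("success"):
        return []
    return [pkg.get("title", pkg.get("name")) for pkg in payload["result"]["results"]]


# Hand-written sample in the shape of a package_search response (not real data).
sample_response = json.dumps({
    "success": True,
    "result": {
        "count": 2,
        "results": [
            {"name": "parcels", "title": "Parcel Boundaries"},
            {"name": "wards", "title": "Ward Boundaries"},
        ],
    },
})

print(ckan_search_url("https://ckan.example.org/", "boundaries", rows=2))
print(dataset_titles(sample_response))
```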

There is also the Drupal DKAN Project, but unfortunately I am not aware of any comprehensive list of DKAN portals.

ArcGIS Open Data

ArcGIS Open Data is still in its early stages but has some interesting advantages over other types of portals.
https://opendata.arcgis.com/
It has by far the most complete geospatial hosting of any of these options. (So if you are looking for polygon data like jurisdictional boundaries or parcels, look here first.) More importantly, many local governments across the world have free access to ArcGIS Open Data as part of their ArcGIS for Desktop licensing. That means it integrates easily, at no additional cost, into the most common GIS software used by local government worldwide, so you can expect a lot of agencies to start using this portal.

plenar.io

Lastly, plenar.io is tied to Code for America and is looking like a site of the future alongside ArcGIS Open Data.
http://plenar.io/
What makes plenar.io especially interesting is its API into space-time slicing of data rather than just space slicing. With the support plenar.io receives from Code for America and its popularity with CfA Brigades lately, there will likely be a lot of community-assembled datasets appearing on plenar.io in the future.
At OpenDataSTL, we have plans in the works to move a large number of datasets over to plenar.io this year.
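
If the idea of space-time slicing is new to you, here is a small sketch of the concept: keep only the records that fall inside both a bounding box and a time window. This is plain Python over made-up records, not plenar.io's actual API, and the field names ("lon", "lat", "when") are my own assumptions for illustration.

```python
from datetime import datetime


def space_time_slice(records, bbox, start, end):
    """Keep records inside the bounding box and inside [start, end)."""
    min_lon, min_lat, max_lon, max_lat = bbox
    return [
        r for r in records
        if min_lon <= r["lon"] <= max_lon
        and min_lat <= r["lat"] <= max_lat
        and start <= r["when"] < end
    ]


# Made-up sample records, for illustration only.
records = [
    {"id": 1, "lon": -90.20, "lat": 38.63, "when": datetime(2014, 6, 1)},
    {"id": 2, "lon": -90.20, "lat": 38.63, "when": datetime(2015, 1, 7)},
    {"id": 3, "lon": -87.63, "lat": 41.88, "when": datetime(2014, 6, 1)},
]

# A rough box around the St. Louis area, sliced to calendar year 2014.
bbox = (-90.75, 38.40, -90.10, 38.90)
slice_2014 = space_time_slice(
    records, bbox, datetime(2014, 1, 1), datetime(2015, 1, 1)
)
print([r["id"] for r in slice_2014])
```

A space-only slice would keep records 1 and 2 here; adding the time window is what narrows it down to a single year of activity in one place.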

Contribute to this post

Do you have other recommendations or questions about this post? Feel free to contact us or send those to us via Issues on this repo.

Written on January 7, 2015

About The Author

Brett Lord-Castillo

Brett is a Geographic Information Systems Programmer for St. Louis County Emergency Management and a passionate advocate of the value of geography. Co-founder and former co-captain of OpenDataSTL, Brett focuses on creating connections between local government and the tech community. Brett frequently uses

http://opendatastl.github.io/