Open Data Day - DC Hackathon

If you aren’t stirred from bed in the small hours to learn data science, you might have missed that March 5th was International Open Data Day. There were hundreds of local events around the world; I was lucky enough to attend DC’s Open Data Day Hackathon. I met a bunch of great people doing noble things with data who taught me a crap-ton (scientific term) and also validated my love for data science and how much I’ve learned since beginning my journey almost two years ago. Here is a quick rundown of what I learned and some helpful links so that you can find out more, too. Since it was an open data event, everything was well documented on the hackathon hackpad.

Introduction to Open Data

Eric Mill gave a really nice overview of what JSON is and how to use APIs to access that JSON and, thus, the data a website is conveying. Though many APIs are open and documented, many are not. Eric gave some tips on how to access that data, too.

This session really opened my eyes to how to get at previously unusable data that was hidden in plain sight in the text of websites.
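To give a flavor of the idea, here is a minimal sketch in Python. It is not Eric's material, just my own illustration: the sample payload, its field names, and the helper names `parse_events` and `fetch_json` are all made up, and a real API will have its own URL and structure.

```python
import json
from urllib.request import urlopen

# Hypothetical payload standing in for what an API might return.
# The field names ("results", "name", "city") are assumptions for illustration.
SAMPLE_RESPONSE = """
{
  "results": [
    {"name": "Event A", "city": "Washington"},
    {"name": "Event B", "city": "Philadelphia"}
  ]
}
"""


def parse_events(raw_json):
    """Turn a JSON string into a list of (name, city) tuples."""
    payload = json.loads(raw_json)  # JSON text -> Python dicts and lists
    return [(event["name"], event["city"]) for event in payload["results"]]


def fetch_json(url):
    """Download a URL and decode the response body as JSON.

    The caller supplies the URL; no real endpoint is assumed here.
    """
    with urlopen(url) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    for name, city in parse_events(SAMPLE_RESPONSE):
        print(f"{name} ({city})")
```

Once the JSON is parsed, it is just ordinary dictionaries and lists, which is what makes data behind an API so much easier to work with than text scraped off a page.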

Data Science Primer

This was one of the highlights for me. A couple of NIST data scientists, Pri Oberoi and Star Ying, gave a presentation and walkthrough on how to use k-means clustering to identify groupings in your data. The data and Jupyter notebook are available on GitHub.

I will definitely be using this in my journey to better detect and remediate compromised user accounts at Comcast.
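For anyone who hasn't seen k-means before, here is a bare-bones sketch in plain Python. This is not the notebook from the session (that one lives on GitHub); the toy points and the function names are my own, chosen only to show the assign-then-update loop at the heart of the algorithm.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def k_means(points, k, iterations=100, seed=0):
    """Group points into k clusters; returns (centroids, clusters)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k randomly chosen points
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        new_centroids = [
            mean_point(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:  # nothing moved, so we have converged
            break
        centroids = new_centroids
    return centroids, clusters


# Toy data: two obvious blobs, invented for illustration.
POINTS = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]

if __name__ == "__main__":
    centroids, clusters = k_means(POINTS, k=2)
    for centroid, members in zip(centroids, clusters):
        print(centroid, members)
```

In real work you would more likely reach for a library implementation, but writing the loop out by hand makes it clear that the algorithm is just two steps repeated until the centroids stop moving.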

Hackathon

I joined a group that was working to use data science to identify opioid overuse. Though I didn’t add much (the group was filled with some really, really smart people), I was able to visualize the data using R and share some of those techniques with the team.

Intro to D3 Visualizations

The last session, and probably my favorite, was a tutorial on building out a D3 visualization. Chris Given walked a packed house through building a D3 viz step-by-step, giving some background on why things work the way they work and showing some great resources.

I am particularly proud of the results (though I only followed his instructions to build this).

Closing

I also attended two sessions about using the command line that totally demystified the shell prompt. All in all, it was a great two days! I will definitely be back next year (unless I can convince someone to do one in Philly).