Everyone in the world has a “how to” guide to data science… well, maybe not everyone - but there are a lot of “guides” out there. I get this question frequently, so I thought I would do my best to put together what have been my best resources for learning.
Personally, I learned statistics by getting my Master's in Applied Statistics at Villanova University - it took 2.5 years. I got my introduction to R by working through the Johns Hopkins University Data Science Specialization on Coursera. Similarly, for Python, I got an online introduction via DataCamp.
This was all bolstered by using these tools at work and in side projects. The repetition of working with them every day has made me more fluent.
Here are some resources that I’ve used or know of - I’ve tried to outline them and group them to the best of my ability. There are many more out there, and you may find some better or worse depending on your style.
- Johns Hopkins University Data Science Specialization on Coursera: As mentioned above,
this course gave me my start with R, RStudio, and Git.
- Kaggle: If you are as competitive as I
am, this site should get you going - the interactive kernels and social aspects of
this site make it a great place to see other people's data science in action. Plagiarism is
the greatest form of flattery (and the easiest way to learn - thanks, Stack Overflow).
- EdX - R Programming: I haven’t used EdX much, but there is a wealth of MOOCs here.
LEARNING STATISTICS & OTHER IMPORTANT MATH
- Khan Academy - Statistics: I
have used Khan Academy on multiple occasions for refreshers in Statistics and Linear Algebra.
The classes are interactive, manageable, and self-paced.
- Khan Academy - Linear Algebra
- Coursera - Statistics with R
- EdX - Data Analytics & Statistics courses
- Of course - higher education, as well.
BOOKS
- Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy - Cathy O’Neil: Cathy O’Neil
does a great job of outlining how data algorithms can have unintended negative consequences.
Anyone who builds a machine learning algorithm should read it.
- The Wall Street Journal Guide to Information Graphics: The Dos and Don’ts of Presenting Data, Facts, and Figures - Dona M. Wong:
I have this book on my desk as a reference. It's a quick read filled with easy-to-understand
rules and objectives for creating data visualizations. Analyzing data is hard - this
book teaches tips to build clear and informative visualizations that don’t take away
from the message.
- The Signal and the Noise: Why So Many Predictions Fail-but Some Don’t - Nate Silver: Nate
Silver is [in]famous for predicting elections. This book gets into the details of how he does that.
Super interesting for a guy increasingly interested in politics.
- How Not to Be Wrong: The Power of Mathematical Thinking - Jordan Ellenberg: Critical
thinking is crucial in data science and analytics. This book gives some great tips on
how to approach “facts” with the right mindset.
- Thinking, Fast and Slow - Daniel Kahneman: Currently on my list to read.
PODCASTS
- Hidden Brain: NPR podcast covering many topics. I find it super interesting. While not
distinctly data related, it frequently covers topics that have tangential importance to
being a good data scientist.
- Exponential View: Not primarily focused on data, but it very frequently covers
artificial intelligence and machine learning topics. I recommend the newsletter that
goes along with this podcast (link below).
- Not So Standard Deviations: Roger Peng and Hilary Parker host a podcast on all things data science.
- The Data Lab Podcast: Local [to Philly] data podcast interviewing local data
scientists. I find it reassuring to hear that my habits are often in line with
these people's, plus I’ve picked up many really great tidbits (like the Exponential
View).
- O’Reilly Data Show: I have attended the Strata data conference by O’Reilly. Much
like the conference, this podcast covers many relevant data themes.
- Data Skeptic: Another data podcast that covers many good data topics.
BLOGS & NEWSLETTERS
- Exponential View: Billed as a weekly “wondermissive”,
the author Azeem Azhar covers many topics relevant to data and the greater technology
economy. I truly look forward to getting this newsletter every Sunday morning.
- Twitter: I follow many great data people on Twitter and get a great deal of my
data news there.