Data Science Languages

David Crook walks through his data science workflow and discusses language choice:

So I’ve spent a while now looking at 3 competing languages and I did my best to give each one a fair shake. Those 3 languages were F#, Python and R. I have to say it was really close for a while because each language has its strengths and weaknesses. That said, I am moving forward with 2 languages and a very specific way I use each one. I wanted to outline this, because for me it has taken a very long time to learn all of the languages to the level that I have to discover this and I would hate for others to go through the same exercise.

Read on for his decision, as well as how you go from “here’s some raw data” to “here are some services to expose interesting results.”
