Asking The Right Question

Buck Woody argues that the hardest thing about data science is asking the right question:

When I started down the path of learning Data Science, I was nervous. I have to work hard at math – it’s a skill I love but one that does not come naturally to me. I was nervous because I thought the most daunting task I would face in Data Science waslearning all the algebra, statistics, and other maths I would need to do the job.

But I was wrong.

Math isn’t the hardest thing in Data Science. Actually, since it’s so mature, and documented, and well-known, it’s quite possibly the easiest thing to conquer in the skillset. No, the hardest thing about Data Science is asking the right question.

I’ll lodge a bit of a disagreement here.  I’m okay with the argument that asking the right question is the toughest part, but the math’s not particularly easy either…  Knowing when to use which distribution, which model, and which parameters requires a definite amount of skill.

Related Posts

Bayesian Average

Jelte Hoekstra has a fun post applying the Bayesian average to board game ratings: Maybe you want to explore the best boardgames but instead you find the top 100 filled with 10/10 scores. Experience many such false positives and you will lose faith in the rating system. Let’s be clear this isn’t exactly incidental either: […]

Read More

Calculating Relative Risk In T-SQL

Mala Mahadevan explains how to calculate relative risk using T-SQL: In this post we will explore a common statistical term – Relative Risk, otherwise called Risk Factor. Relative Risk is a term that is important to understand when you are doing comparative studies of two groups that are different in some specific way. The most […]

Read More


January 2016
« Dec Feb »