Mastering Tools

The folks at Sharp Sight Labs explain that future obsolescence of a tool does not mean you should not master it:

The heart of his critique is this: data science is changing very fast, and any tool that you learn will eventually become obsolete.

This is absolutely true.

Every tool has a shelf life.

Every. single. one.

Moreover, it’s possible that tools are going to become obsolete more rapidly than in the past, because the world has just entered a period of rapid technological change. We can’t be certain, but if we’re in a period of rapid technological change, it seems plausible that toolset-changes will become more frequent.

The thing I would tie it to is George Stigler’s paper on information theory.  There’s a cost of knowing—which the commenter notes—but there’s also a cost to search, given the assumption that you know where to look.  Being effective in any role, be it data scientist or anything else, involves understanding the marginal benefit of pieces of information.  This blog post gives you a concrete example of that in the realm of data science.

Related Posts

CSV Import Speeds With H2O

WenSui Liu benchmarks three CSV loading methods in R: The importFile() function in H2O is extremely efficient due to the parallel reading. The benchmark comparison below shows that it is comparable to the read.df() in SparkR and significantly faster than the generic read.csv(). I’d wonder if there are cases where this would vary significantly; regardless, […]

Read More

Linear Prediction Confidence Region Flare-Out

John Cook explains why the confidence region of a tracked object flares out instead of looking conical (or some other shape): Suppose you’re tracking some object based on its initial position x0 and initial velocity v0. The initial position and initial velocity are estimated from normal distributions with standard deviations σx and σv. (To keep […]

Read More


December 2016
« Nov Jan »