R Is Bad For You?

Bill Vorhies lays out a controversial argument:

I have been a practicing data scientist with an emphasis on predictive modeling for about 16 years.  I know enough R to be dangerous but when I want to build a model I reach for my SAS Enterprise Miner (could just as easily be SPSS, Rapid Miner or one of the other complete platforms).

The key issue is that I can clean, prep, transform, engineer features, select features, and run 10 or more model types simultaneously in less than 60 minutes (sometimes a lot less) and get back a nice display of the most accurate and robust model along with exportable code in my selection of languages.

The reason I can do that is because these advanced platforms now all have drag-and-drop visual workspaces into which I deploy and rapidly adjust each major element of the modeling process without ever touching a line of code.

I have almost exactly the opposite thought on the matter:  that drag-and-drop development is intolerably slow; I can drag and drop and connect and click and click and click for a while, or I can write a few lines of code.  Nevertheless, I think Bill’s post is well worth reading.

Related Posts

Visualizing Emergency Room Visits

Eugene Joh has a great blog post showing how to parse ICD-9 codes using regular expressions and then visualize the results as a treemap: It looks like there is a header/title at [1], numeric grouping  at [2] “1.\tINFECTIOUS AND PARASITIC DISEASES”,  subgrouping by ICD-9 code ranges, at [3] “Intestinal infectious diseases (001-009)” and then 3-digit ICD-9 […]

Read More

Subplots In Maps

Ilya Kashnitsky shows how to embed subplots within a map using ggplot2: So, with this map I want to show the location of more and less urbanized NUTS-2 regions of Europe. But I also want to show – with subplots – how I defined the three subregions of Europe (Eastern, Southern, and Western) and what […]

Read More

Leave a Reply

Your email address will not be published. Required fields are marked *


May 2017
« Apr