Exploratory Data Analysis In R

Laura Ellis walks us through some easy techniques for learning about our data using R:

DIM AND GLIMPSE

Next, we will run the dim function which displays the dimensions of the table. The output takes the form of row, column.

And then we run the glimpse function from the dplyr package. This will display a vertical preview of the dataset. It allows us to easily preview data type and sample data.

Spending some quality time doing EDA can save you in the long run, as it can help you get a feel for things like data quality, the distributions of variables, and completeness of data.

Related Posts

Sentiment Analysis with Spark on Qubole

Jonathan Day, et al, have a tutorial on using Qubole to build a sentiment analysis model: This post covers the use of Qubole, Zeppelin, PySpark, and H2O PySparkling to develop a sentiment analysis model capable of providing real-time alerts on customer product reviews. In particular, this model allows users to monitor any natural language text […]

Read More

Running Spark MLlib to Feed Power BI

Brad Llewellyn shows how you can take Spark MLlib results and feed them into Power BI: MLlib is one of the primary extensions of Spark, along with Spark SQL, Spark Streaming and GraphX.  It is a machine learning framework built from the ground up to be massively scalable and operate within Spark.  This makes it […]

Read More

Categories

November 2018
MTWTFSS
« Oct Dec »
 1234
567891011
12131415161718
19202122232425
2627282930