Multivariate Analysis In R

Kevin Feasel



Mala Mahadevan looks at using R to describe data sets with two explanatory variables:

From the plot we can see that type 3 trees have the smallest circumference while type 4 have the largest, with type 2 close to type 4. We can also see that type 1 trees have the thinnest dispersion of circumference while type 4 has the highest, closely followed by type 2.  We can also see that there are no significant outliers in this data.

Understanding whether variables are categorical or continuous is vital to understanding what you can and should do with them.

Related Posts

Defining Tidy Data

John Mount shares thoughts about the concept of tidy data: A question is: is such a data set “tidy”? The paper itself claims the above definitions are “Codd’s 3rd normal form.” So, no the above table is not “tidy” under that paper’s definition. The the winner’s date of birth is a fact about the winner […]

Read More

Visualizing Earthquake Data

Giorgio Garziano continues a series on analyzing earthquake data: This is the third part of our post series about the exploratory analysis of a publicly available dataset reporting earthquakes and similar events within a specific 30 days time span. In this post, we are going to show static, interactive and animated earthquakes maps of different flavors by […]

Read More


December 2016
« Nov Jan »