As I mentioned in my last post, I am currently in an exploratory phase with my data analytics project. Although I would love to dive in and do some cool predictive analytics or machine learning projects, I really need to continue learning as much about my data as possible before moving on to more advanced techniques.
My data exploration process has the following four steps:
1. Assess the data that I have at a high level
2. Determine how this data is relevant to the analytics project I want to undertake
3. Get a general overview of the data characteristics by calculating simple statistics
4. Understand the “middles” and the “ends” of my numeric data points
There’s some good stuff in here. I particularly appreciate Stacia’s consideration of data exploration as an iterative process.
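To make the last two steps of that list a bit more concrete, here is a minimal sketch in Python of what "simple statistics" plus the "middles" and "ends" might look like. It is not taken from Stacia's post: the `summarize` function and the sample values are hypothetical, and I am assuming "middles" means mean, median, and quartiles while "ends" means the minimum and maximum.

```python
import statistics


def summarize(values):
    """Return simple descriptive statistics for a list of numbers.

    Hypothetical helper for illustration only.
    """
    ordered = sorted(values)
    # Quartiles via the standard library (Python 3.8+); "inclusive"
    # treats the data as the whole population, not a sample of it.
    q1, _, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return {
        "count": len(ordered),
        # The "ends" of the data
        "min": ordered[0],
        "max": ordered[-1],
        # The "middles" of the data
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        "q1": q1,
        "q3": q3,
    }


# Made-up sample values, not from any real data set.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9]
summary = summarize(sample)
for name, value in summary.items():
    print(f"{name}: {value}")
```

A large gap between the mean and the median, or between the quartiles and the minimum or maximum, is usually a hint to go back and look at the raw values again, which fits the iterative approach described above.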