Relationships Between Numerical Features

Stacia Varga continues her exploratory data analysis series using hockey data:

Let’s start with something easy and understandable to analyze. If I put age on the horizontal axis and weight on the vertical axis. It’s a common practice to put an explanatory variable on the horizontal axis and a response variable on the vertical axis. In other words, I’m looking to see how an increase in age (explanation) affects – or not – weight (response) for all the hockey players in the current season, regardless of team.

If I put age on the horizontal axis – does this explain weight? Sort of – the combinations of age and weight have some groupings. It almost appears that there is a greater number of younger, heavier players than older, heavier players, but it’s hard to tell here how the age/weight combinations are distributed because I can’t see all the individual points.

Read the whole thing, while keeping in mind that correlation does not imply causation.

Related Posts

Where Machine Learning And Econometrics Collide

Dave Giles shares some thoughts on how machine learning and econometrics relate: What is Machine Learning (ML), and how does it differ from Statistics (and hence, implicitly, from Econometrics)? Those are big questions, but I think that they’re ones that econometricians should be thinking about. And if I were starting out in Econometrics today, I’d […]

Read More

Solving Naive Bayes By Hand

I have a post that requires math and is meaner toward the Buffalo Bills than I normally am: Trust the ProcessThere are three steps to the process of solving the simplest of Naive Bayes algorithms. They are:1. Find the probability of winning a game (that is, our prior probability).2. Find the probability of winning given each input variable: whether Josh Allen starts the game, whether the team is […]

Read More


May 2018
« Apr Jun »