Sales Predictions with Pandas

Megan Quinn shows how you can use Pandas and linear regression to predict sales figures:

Pandas is an open-source Python package that provides users with high-performing and flexible data structures. These structures are designed to make analyzing relational or labeled data both easy and intuitive. Pandas is one of the most popular and quintessential tools leveraged by data scientists when developing a machine learning model. The most crucial step in the machine learning process is not simply fitting a model to a given data set. Most of the model development process takes place in the pre-processing and data exploration phase. An accurate model requires good predictors and, in order to acquire them, the user must understand the raw data. Through Pandas’ numerous data wrangling and analysis tools, this important step can easily be achieved. The goal of this blog is to highlight some of the central and most commonly used tools in Pandas while illustrating their significance in model development. The data set used for this demo consists of a supermarket chain’s sales across multiple stores in a variety of cities. The sales data is broken down by items within the stores. The goal is to predict a certain item’s sale.

Click through for an example of the process, including data cleansing and feature extraction, data analysis, and modeling.

Related Posts

Logistic Regression Defaults and sklearn

Giovanni Lanzani shares some thoughts on scikit-learn defaults for Logistic Regression: If you read the post, you can see that the biggest problem with the choice is that, unless your data is regularized, you will train a model that probably under performs: you are unnecessarily penalizing it by making it learn less than what it […]

Read More

Python and R Data Reshaping

John Mount takes us through a couple of data shaping packages: The advantages of data_algebra and cdata are: – The user specifies their desired transform declaratively by example and in data. What one does is: work an example, and then write down what you want (we have a tutorial on this here).– The transform systems can print what a transform is going to […]

Read More

Categories

June 2019
MTWTFSS
« May Jul »
 12
3456789
10111213141516
17181920212223
24252627282930