Press "Enter" to skip to content

Category: Python

Measuring Advertising Effectiveness

Layla Yang and Hector Leano walk us through measuring how effective an advertising campaign was:

At a high level we are connecting a time series of regional sales to regional offline and online ad impressions over the trailing thirty days. By using ML to compare the different kinds of measurements (TV impressions or GRPs versus digital banner clicks versus social likes) across all regions, we then correlate the type of engagement to incremental regional sales in order to build attribution and forecasting models. The challenge comes in merging advertising KPIs  such as impressions, clicks, and page views from different data sources with different schemas (e.g., one source might use day parts to measure impressions while another uses exact time and date; location might be by zip code in one source and by metropolitan area in another).

As an example, we are using a SafeGraph rich dataset for foot traffic data to restaurants from the same chain. While we are using mocked offline store visits for this example, you can just as easily plug in offline and online sales data provided you have region and date included in your sales data. We will read in different locations’ in-store visit data, explore the data in PySpark and Spark SQL, and make the data clean, reliable and analytics ready for the ML task. For this example, the marketing team wants to find out which of the online media channels is the most effective channel to drive in-store visits.A

Click through for the article as well as notebooks.

Comments closed

Building a CRUD Application with Cloudera Operational DB and Flask

Shlomi Tubul puts together a proof of concept app:

In this blog, I will demonstrate how COD can easily be used as a backend system to store data and images for a simple web application. To build this application, we will be using Phoenix, one of the underlying components of COD, along with Flask. For storing images, we will be using an HBase (Apache Phoenix backend storage) capability called MOB (medium objects). MOB allows us to read/write values from 100k-10MB quickly. 

*For development ease of use, you can also use the Phoenix query server instead of COD. The query server is a small build of phoenix that is meant for development purposes only, and data is deleted in each build. 

Click through for the demo and for a link to the GitHub repo.

Comments closed

SQL Server R and Python Language Extensions Now Open Source

The SQL Server team has an announcement:

Previously, we announced a Java extensionToday, we are sharing that we are open sourcing the R and Python language extensions for SQL Server for both Windows and Linux on GitHub.

These extensions are the latest examples using an evolved programming language extensibility architecture which allows integration with a new type of language extension. This new architecture gives customers the freedom to bring their own runtime and execute programs using that runtime in SQL Server, while leveraging the existing security and governance that the SQL Server programming language extensibility architecture provides.

Very interesting.

Comments closed

Correlation and Predictive Power Score in Python

Abhinav Choudhary looks at two methods for understanding the relationship between variables:

dataframes while working in python which is supported by the pandas library. Pandas come with a function corr() which can be used in order to find relation amongst the various columns of the data frame. 
Syntax :DataFrame.corr() 
Returns:dataframe with value between -1 and 1 
For details and parameter about the function check out Link 
Let’s try this in action. 

Read on to see how it works, how to visualize results, and where Predictive Power Score can be a better option.

Comments closed

Implementing an LSTM Model with Python

Mrinal Walia takes us through the concept of Long Short Term Memory:

A simple Recurrent Neural Network has a very simple structure, that forms a chain of repeating modules of a neural network, with just a single activation function such as tanh layer, similarly LSTM too have a chain-like structure with repeating modules just like RNN but instead of a single Neural network layer in RNN, LSTM has four layers which are interacting in a very different way each performing its unique function in the network.

Read on for a good amount of theory followed by an example using Keras.

Comments closed

Generating Predictions with SQL Server ML Services

Jeffin Mathew walks us through SQL Server Machine Learning Services:

The purpose of this blog is to explore the process of running ML predictions on SQL server using Python. We are going to train and test the data to predict information about bike sharing for a specific year. We are going to be using the provided 2011 data and predict what 2012 will result in. The 2012 data already exists inside the dataset, so we will be able to compare the predicted to the actual amount.

For certain use cases—especially when the data already exists in SQL Server, and especially especially when you can use native scoring—Machine Learning Services does a great job.

Comments closed

Using Jupyter as an External Tool for Power BI Desktop

David Eldersveld continues a series on Power BI external tools:

Many people use Python with notebooks, so let’s take a look at one possible way to enable a Jupyter external tool for Power BI Desktop. The following stepwise approach begins with simply opening Jupyter. It then progresses to creating and opening a notebook that includes Power BI’s server and database arguments. Finally, it works its way toward downloading a notebook definition contained in a GitHub gist and connects to Power BI’s tabular model to start to make this approach more useful.

This post continues a series of posts related to Python and Power BI. The first three parts of this blog series introduced some possible uses for Python connected to a Power BI model, how to setup a basic Python external tool, and how to both use it with a virtual environment and connect to the Tabular Object Model.

This was a cool usage of Power BI’s external tool functionality and starts to give you an idea of how powerful it can be.

Comments closed

Getting Started with Jupyter Notebooks

Aveek Das takes us through the most popular name in notebooks:

In this article, I am going to explain what Jupyter Notebooks are and how to install the same on your machine. Further, I will demonstrate how to use these notebooks using Visual Studio Code and perform data analysis and other development activities. It is an open-source platform using which you can create and share documents that contain live code, equations, and visualizations along with the formatted text. Most importantly, these notebooks can be run on the web browser by just starting a server and using it. This open-source project is maintained by the team at Project Jupyter.

This is a fairly basic introduction to the topic, good if you have heard about notebooks but don’t know where to begin.

Comments closed

Methods and Functions in Python

Sairam Uppugundla distinguishes methods from functions in Python:

Function and method both look similar as they perform in an almost similar way, but the key difference is the concept of ‘Class and its Object’. Functions can be called only by its name, as it is defined independently. But methods cannot be called by its name only we need to invoke the class by reference of that class in which it is defined, that is, the method is defined within a class and hence they are dependent on that class.

Read on to see how to create each, as well as more details on types of functions.

Comments closed