Category: Python

Using the Open Source R or Python Runtime with Machine Learning Services

Niels Berglund walks us through using the open source extensibility framework to install R or Python:

When Java became a supported language in SQL Server 2019, Microsoft mentioned that communication between ExternalHost and the language extension should be based on an API, regardless of the external language. The API is the Extensibility Framework API for SQL Server. Having an API ensures simplicity and ease of use for the extension developer.

From the paragraph above, one can assume that Microsoft would like to see third-party development of language extensions. That assumption turned out to be accurate: as mentioned above, Microsoft open-sourced the Java language extension, together with the include files for the extension API, in September 2020! This means that anyone interested can now create a language extension for their own favorite language!

However, open sourcing the Java extension was not the only thing Microsoft did. They also created and open-sourced language extensions for R and Python!

Click through for more detail and a walkthrough on installation of Python.

Measuring Advertising Effectiveness

Layla Yang and Hector Leano walk us through measuring how effective an advertising campaign was:

At a high level we are connecting a time series of regional sales to regional offline and online ad impressions over the trailing thirty days. By using ML to compare the different kinds of measurements (TV impressions or GRPs versus digital banner clicks versus social likes) across all regions, we then correlate the type of engagement to incremental regional sales in order to build attribution and forecasting models. The challenge comes in merging advertising KPIs such as impressions, clicks, and page views from different data sources with different schemas (e.g., one source might use day parts to measure impressions while another uses exact time and date; location might be by zip code in one source and by metropolitan area in another).

As an example, we are using a SafeGraph rich dataset for foot traffic data to restaurants from the same chain. While we are using mocked offline store visits for this example, you can just as easily plug in offline and online sales data provided you have region and date included in your sales data. We will read in different locations’ in-store visit data, explore the data in PySpark and Spark SQL, and make the data clean, reliable and analytics ready for the ML task. For this example, the marketing team wants to find out which of the online media channels is the most effective channel to drive in-store visits.

Click through for the article as well as notebooks.
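
The schema mismatch described in the excerpt (zip code versus metropolitan area, exact timestamps versus coarser dates) can be illustrated with a small sketch. This is not the authors' code: they work in PySpark and Spark SQL, while this sketch uses pandas for brevity, and every table, column name, value, and the zip-to-metro lookup below is a made-up assumption.

```python
import pandas as pd

# Hypothetical source A: TV impressions keyed by zip code and exact timestamp.
tv = pd.DataFrame({
    "zip_code": ["10001", "10002", "60601"],
    "timestamp": pd.to_datetime(
        ["2020-10-01 08:15", "2020-10-01 19:40", "2020-10-01 12:05"]
    ),
    "impressions": [100, 150, 80],
})

# Hypothetical source B: banner clicks keyed by metropolitan area and date.
banner = pd.DataFrame({
    "metro_area": ["New York", "Chicago"],
    "date": pd.to_datetime(["2020-10-01", "2020-10-01"]),
    "clicks": [30, 12],
})

# Assumed lookup table that maps each zip code to a metropolitan area.
zip_to_metro = {"10001": "New York", "10002": "New York", "60601": "Chicago"}

# Bring source A onto source B's grain: metro area and calendar date.
tv["metro_area"] = tv["zip_code"].map(zip_to_metro)
tv["date"] = tv["timestamp"].dt.normalize()
tv_daily = tv.groupby(["metro_area", "date"], as_index=False)["impressions"].sum()

# With a shared (metro_area, date) key, the two sources can be merged.
combined = tv_daily.merge(banner, on=["metro_area", "date"], how="outer")
print(combined)
```

The outer join keeps regions and dates that appear in only one source, which makes gaps in coverage visible before any modeling starts.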

Building a CRUD Application with Cloudera Operational DB and Flask

Shlomi Tubul puts together a proof of concept app:

In this blog, I will demonstrate how COD can easily be used as a backend system to store data and images for a simple web application. To build this application, we will be using Phoenix, one of the underlying components of COD, along with Flask. For storing images, we will be using an HBase (Apache Phoenix backend storage) capability called MOB (medium objects). MOB allows us to read/write values from 100 KB to 10 MB quickly.

For development ease of use, you can also use the Phoenix query server instead of COD. The query server is a small build of Phoenix that is meant for development purposes only, and data is deleted in each build.

Click through for the demo and for a link to the GitHub repo.

SQL Server R and Python Language Extensions Now Open Source

The SQL Server team has an announcement:

Previously, we announced a Java extension. Today, we are sharing that we are open sourcing the R and Python language extensions for SQL Server for both Windows and Linux on GitHub.

These extensions are the latest examples using an evolved programming language extensibility architecture which allows integration with a new type of language extension. This new architecture gives customers the freedom to bring their own runtime and execute programs using that runtime in SQL Server, while leveraging the existing security and governance that the SQL Server programming language extensibility architecture provides.

Very interesting.

Correlation and Predictive Power Score in Python

Abhinav Choudhary looks at two methods for understanding the relationship between variables:

[...] dataframes while working in Python, which are supported by the pandas library. Pandas comes with a function, corr(), which can be used to find the relationship among the various columns of the data frame.
Syntax: DataFrame.corr()
Returns: a DataFrame with values between -1 and 1
For details on the function and its parameters, see the link in the original article.
Let’s try this in action.

Read on to see how it works, how to visualize results, and where Predictive Power Score can be a better option.
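
As a quick illustration of the corr() call described in the excerpt, here is a minimal sketch. The column names and values are invented for the example and do not come from the article.

```python
import pandas as pd

# Made-up data: y is an exact multiple of x, while z tends to fall as x rises.
df = pd.DataFrame({
    "x": [1, 2, 3, 4, 5],
    "y": [2, 4, 6, 8, 10],
    "z": [5, 3, 4, 1, 2],
})

# DataFrame.corr() computes pairwise Pearson correlation by default and
# returns a square DataFrame indexed by column name on both axes.
corr = df.corr()
print(corr)

# Individual coefficients can be looked up by label.
print(corr.loc["x", "y"])  # perfectly linear relationship, so 1.0
print(corr.loc["x", "z"])  # negative relationship, between -1 and 0
```

Pearson correlation only captures linear relationships, which is part of the motivation for the alternative the article discusses.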

Implementing an LSTM Model with Python

Mrinal Walia takes us through the concept of Long Short Term Memory:

A simple Recurrent Neural Network (RNN) has a very simple structure: a chain of repeating modules of a neural network, each with just a single activation layer such as tanh. An LSTM has a similar chain-like structure with repeating modules, but instead of the single neural network layer found in an RNN, each LSTM module has four layers that interact in a particular way, each performing its own function in the network.

Read on for a good amount of theory followed by an example using Keras.
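
To make the "four layers" concrete, here is a minimal NumPy sketch of a single step of the standard LSTM cell formulation. It is not the article's Keras example; the function name, the stacked weight layout, and the sizes are assumptions chosen for illustration.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def lstm_step(x, h_prev, c_prev, W, b):
    """Run one LSTM time step.

    W has shape (4 * hidden, hidden + inputs) and b has shape (4 * hidden,).
    The four row blocks of W are the four layers: forget gate, input gate,
    candidate values, and output gate (an assumed ordering for this sketch).
    """
    hidden = h_prev.shape[0]
    z = W @ np.concatenate([h_prev, x]) + b

    f = sigmoid(z[0:hidden])                # forget gate: what to drop from the cell state
    i = sigmoid(z[hidden:2 * hidden])       # input gate: how much new information to admit
    g = np.tanh(z[2 * hidden:3 * hidden])   # candidate values to add to the cell state
    o = sigmoid(z[3 * hidden:4 * hidden])   # output gate: what to expose as hidden state

    c = f * c_prev + i * g                  # updated cell state
    h = o * np.tanh(c)                      # updated hidden state
    return h, c


# Usage: hidden size 2, input size 3, small random weights.
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(8, 5))
b = np.zeros(8)
h, c = lstm_step(rng.normal(size=3), np.zeros(2), np.zeros(2), W, b)
print(h, c)
```

A plain RNN step would be a single tanh layer over the same concatenated input; the extra gates are what let the cell state carry information across many steps.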

Generating Predictions with SQL Server ML Services

Jeffin Mathew walks us through SQL Server Machine Learning Services:

The purpose of this blog is to explore the process of running ML predictions on SQL Server using Python. We are going to train and test the data to predict information about bike sharing for a specific year. We are going to be using the provided 2011 data and predict what 2012 will result in. The 2012 data already exists inside the dataset, so we will be able to compare the predicted to the actual amount.

For certain use cases—especially when the data already exists in SQL Server, and especially especially when you can use native scoring—Machine Learning Services does a great job.
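
The train-on-2011, predict-2012 approach from the excerpt can be sketched in plain Python before any SQL Server integration. This is not the author's code or dataset: the column names and values are invented toy data, and the model is a simple straight-line fit chosen only to show the year-based split and the predicted-versus-actual comparison.

```python
import numpy as np
import pandas as pd

# Made-up toy data standing in for the bike sharing dataset.
data = pd.DataFrame({
    "year": [2011] * 4 + [2012] * 4,
    "temperature": [5.0, 10.0, 15.0, 20.0, 8.0, 12.0, 18.0, 22.0],
})
# Assumption for the sketch: rentals depend linearly on temperature.
data["rentals"] = 10.0 * data["temperature"] + 5.0

# Split by year rather than at random: fit on 2011, evaluate on 2012.
train = data[data["year"] == 2011]
test = data[data["year"] == 2012].copy()

# Fit a straight line on the 2011 rows only.
slope, intercept = np.polyfit(train["temperature"], train["rentals"], deg=1)

# Predict 2012 and compare against the actual 2012 values.
test["predicted"] = slope * test["temperature"] + intercept
test["error"] = test["predicted"] - test["rentals"]
mean_abs_error = test["error"].abs().mean()
print(test[["temperature", "rentals", "predicted", "error"]])
print(mean_abs_error)
```

Splitting on the year column keeps future rows out of training, which mirrors how the model would be used once deployed.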
