Press "Enter" to skip to content

Category: Python

Using the Azure Architecture Icons

Steve Jones tries out some of the Azure Architecture Icons:

The icons are svg, so while they work in PowerPoint, adding them to something like this post in OpenLiveWriter doesn’t work. However, I could make a quick diagram and capture an image of it.

Not great, but it shows I can put icons on a page with arrows.

Going one step further, I’ve been digging into Diagrams by mingrammer lately. With it, you use Python to generate diagrams, and there are quite a few Azure icons in there, as well as AWS, on-prem, etc.

Here’s a quick example of what you can do, taken from an upcoming talk of mine:

There are some limitations based on the underlying library, such as how you can’t connect cluster to cluster—meaning I can’t draw a line from “Logging” to “Storage\Logs”; I have to draw it from a particular element (Loki) to a particular element (Elasticsearch). In a lot of traditional reference architecture diagrams, though, that isn’t a problem.

Comments closed

Python and Power BI Desktop

David Eldersveld continues a series on Python as an external tool in Power BI. Part 2 is about defining a proper JSON formatted file:

To see a new tool in the Power BI Desktop ribbon, you need to define a JSON file and place it in a specific folder on your workstation. The featured external tools Tabular Editor, DAX Studio, and ALM Toolkit have installers that take care of this step. Since you do not have a dedicated installer to place this file in the required directory, you need to manually define your own.

As long as the “enhanced metadata format” for the data model is enabled, and the JSON in your file is accurate, you should see your new tool in your ribbon after you re-open Power BI Desktop.

Part 3 shows off virtual environments in Python as well as how to connect to the Tabular Object Model:

The Tabular Object Model (TOM) library for .NET opens Power BI’s data model to external tools. Two pieces are required to allow Python to interface with .NET:
1) Pythonnet package for .NET CLR (pip install pythonnet)
2) Python-SSAS module (ssas_api.py placed in the same folder as the main script you’d like to run)

The python-ssas (ssas_api.py) Python module that facilitates the TOM connection is all the work of Josh Dimarsky–originally for querying and processing Analysis Services. I simply repurposed it for use with Power BI Desktop or the XMLA endpoint in Power BI Premium and extended it with some relevant examples. Everything relies on Josh’s Python module, which has functions to connect to TOM, run DAX queries, etc.

This does look pretty handy.

Comments closed

Python in Power BI Desktop

David Eldersveld dives into using Python as an external tool in Power BI:

Why use Python as an external “tool”? Even though Python isn’t a “tool” in the same sense as the “Big 3” community tools focused this month, I want to show how versatile the External Tools feature is. I also want to encourage people to use imagination and also explore how Power BI isn’t really as closed as some people think–at least the data model…

Some of these ideas are not exclusive to Python, but there’s enough variety in the Power BI and data science communities for people to possibly figure out if some of this might be useful within the context of their own environments, skills, and organizations.

David also follows up with a series of sample ideas.

Comments closed

Comparing Gradient Descent to the Normal Equation for Small Data Sets

Pushkara Sharma compares two techniques for regression:

In this article, we will see the actual difference between gradient descent and the normal equation in a practical approach. Most of the newbie machine learning enthusiasts learn about gradient descent during the linear regression and move further without even knowing about the most underestimated Normal Equation that is far less complex and provides very good results for small to medium size datasets.

If you are new to machine learning, or not familiar with a normal equation or gradient descent, don’t worry I’ll try my best to explain these in layman’s terms. So, I will start by explaining a little about the regression problem.

I was surprised by the results.

Comments closed

Comparing Koalas to PySpark

Tori Tompkins gives us an understanding of where Koalas fits in the Spark world:

One significant difference between Spark’s implementation of Dataframes and pandas is its immutability.

With Spark dataframes, you are unable to make changes to the existing object but rather create a brand new dataframe based on the old one. Pandas dataframes, however, allow you to edit the object in place. With Koalas, whilst still spark Dataframes under the hood, have kept the mutable syntax of pandas.

It does this by introducing this concept of an ‘Internal Frame’. This holds the spark immutable dataframe and manages the mapping between the Koalas column names and Spark column names. It also manages the Koalas index names to spark column name to replicate the index functionality in pandas (covered below). It acts as a bridge between Spark and Koalas by mimicking the pandas API with Spark. This Internal Frame replicates the mutable functionality of pandas by creating copies of the internal frame but appearing to be mutable.

Read the whole thing.

Comments closed

Delaying Loops in Python

Jon Fletcher gives us three methods to delay a Python loop:

In this blog post, I will run through 3 different ways to execute a delayed Python loop. In these examples, we will aim to run the loop once every minute.
To show the delay, we will print out the current datetime using the datetime module.

I think I’d be a bit concerned about all three potentially failing—option 1 can get out of sync if it takes more than 1 second to process inside the loop, option 2 might miss a minute, and option 3 can also sometimes miss. But it’s interesting to think about.

Comments closed

Stock Price Predictions with LSTM Models

Thenuja Shanthacumaran walks us through training a Long Short-Term Memory neural network model for predicting stock prices:

LSTM could not process a single data point. it needs a sequence of data for processing and able to store historical information. LSTM is an appropriate algorithm to make prediction and process based-on time-series data. It’s better to work on the regression problem.

The stock market has enormously historical data that varies with trade date, which is time-series data, but the LSTM model predicts future price of stock within a short-time period with higher accuracy when the dataset has a huge amount of data.

Click through for the process and a demo.

Comments closed

Installing TensorFlow and Keras for R on SQL Server 2019 ML Services

I have a post on using TensorFlow and Keras in R on SQL Server 2019 Machine Learning Services:

What I’m doing is building a new virtual environment named r-reticulate, which is what the reticulate package in R desires. Inside that virtual environment, I’m installing the latest versions of tensorflow-probabilitytensorflow , and keras. I had DLL loading problems with TensorFlow 2.1 on Windows, so if you run into those, the proper solution is to ensure that you have the appropriate Visual C++ redistributables installed on your server.

Then, I switched back to the base virtual environment and installed the same packages. My thinking here is that I’ll probably need them for other stuff as well (and don’t tell anybody, but I’m not very good with Python environments).

Please continue not to tell anybody that I’m not very good with Python environments. I tend to dump things in the base environment, forget which one I’m in, and all kinds of other bad practices. I think I’m secretly undermining myself in Python, but I don’t have enough proof yet.

Comments closed

Portfolio Optimization with SAS and Python

Sophia Rowland shows off the sastopypackage:

I started by declaring my parameters and sets, including my risk threshold, my stock portfolio, the expected return of my stock portfolio, and covariance matrix estimated using the shrinkage estimator of Ledoit and Wolf(2003). I will use these pieces of information in my objective function and constraints. Now I will need SWAT, sasoptpy, and my optimization model object.

Read on for a demo.

Comments closed

Avoiding Overfitting and Underfitting in Neural Networks

Manas Narkar provides some advice on optimizing neural network models:

Adding Dropout

Dropout is considered as one of the most effective regularization methods. Dropout is basically randomly zero-ing or dropping out features from your layer during the training process, or introducing some noise in the samples. The key thing to note is that this is only applied at training time. At test time, no values are dropped out. Instead, they are scaled. The typical dropout rate is between 0.2 to 0.5.

Click through for a demo on dropout, as well as coverage of several other techniques.

Comments closed