Author: Kevin Feasel

Residuals

Simon Jackson discusses the concept of residuals:

The general approach behind each of the examples that we’ll cover below is to:

  1. Fit a regression model to predict variable (Y).

  2. Obtain the predicted and residual values associated with each observation on (Y).

  3. Plot the actual and predicted values of (Y) so that they are distinguishable, but connected.

  4. Use the residuals to make an aesthetic adjustment (e.g. red colour when the residual is very high) to highlight points which are poorly predicted by the model.

The post is about 10% understanding what residuals are and 90% showing how to visualize them and spot major discrepancies.
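The first two steps and the flagging idea in step four can be sketched in a few lines. This is a minimal illustration, not the code from the linked post (which works in R): it assumes a single predictor, an ordinary least squares line, and a made-up rule that flags residuals more than two standard deviations from zero. The names `fit_line`, `residual_table`, and `threshold` are hypothetical.

```python
from statistics import mean, pstdev


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def residual_table(xs, ys, threshold=2.0):
    """Return actual, predicted, and residual values for each observation.

    An observation is flagged when its residual is more than `threshold`
    standard deviations from zero (an assumed rule, chosen for illustration).
    """
    intercept, slope = fit_line(xs, ys)
    predicted = [intercept + slope * x for x in xs]
    residuals = [y - p for y, p in zip(ys, predicted)]
    spread = pstdev(residuals)
    return [
        {
            "x": x,
            "actual": y,
            "predicted": p,
            "residual": r,
            "flagged": spread > 0 and abs(r) > threshold * spread,
        }
        for x, y, p, r in zip(xs, ys, predicted, residuals)
    ]


# Made-up data: y is roughly 2x, except for one poorly predicted point at x = 5.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2, 4, 6, 8, 20, 12, 14, 16]
rows = residual_table(xs, ys)
flagged_xs = [row["x"] for row in rows if row["flagged"]]
```

In a plot, the `flagged` field (or the residual itself) would drive the colour of each point, which is the aesthetic adjustment described in step four.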

Personalizing Power BI Dashboards

Avi Singh shares a few methods of allowing users to personalize their Power BI dashboards:

iv. Row Level Security

Row Level Security proved to be an effective approach for us to provide users a personalized view of their Dashboard & Reports based on the Organization they belonged to. The org hierarchy data was pulled directly from the Human Resource (HR) system, which allowed the Power BI Model to identify which user belonged to which department. In our sample data set, it looks as below.

Read the whole thing.
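As a rough illustration of the idea behind row-level security, the sketch below filters rows by the department a user belongs to. It is plain Python, not Power BI: in Power BI the filter is defined on a role as a DAX expression. The users, departments, and the `visible_rows` function are all hypothetical stand-ins for the HR-sourced hierarchy the excerpt describes.

```python
# Hypothetical user-to-department mapping, standing in for data pulled from HR.
USER_DEPARTMENT = {
    "alice@example.com": "Finance",
    "bob@example.com": "Sales",
}

# Hypothetical report rows; each row carries the department that owns it.
ROWS = [
    {"department": "Finance", "metric": "Budget variance"},
    {"department": "Sales", "metric": "Pipeline value"},
    {"department": "Sales", "metric": "Win rate"},
]


def visible_rows(user, rows, user_department):
    """Return only the rows belonging to the user's department.

    Unknown users see nothing, which mirrors a deny-by-default policy.
    """
    department = user_department.get(user)
    if department is None:
        return []
    return [row for row in rows if row["department"] == department]


bob_view = visible_rows("bob@example.com", ROWS, USER_DEPARTMENT)
stranger_view = visible_rows("eve@example.com", ROWS, USER_DEPARTMENT)
```

The point of the design is that every user opens the same report, and the filter decides what each one sees.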

Fork Bombs

Brent Ozar creates a fork bomb in SQL Server:

I’ve always found fork bombs funny because of their elegant simplicity, so I figured, why not build one in SQL Server?

In order to do it, I needed a way to spawn a self-replicating asynchronous process, so I built:

  1. A stored procedure

  2. That creates an Agent job

  3. That runs the stored procedure

I didn’t think it was possible.  I certainly didn’t think it would take a half-dozen lines of code.

Introduction To R

Allison Tharp takes a look at R:

RStudio has several ways to import data.  One of the easiest ways is to import via URL.  This link (https://data.montgomerycountymd.gov/api/views/6rqk-pdub/rows.csv?accessType=DOWNLOAD) gives us the salaries of all of the government employees for Montgomery County, MD in a CSV format.  To import this into RStudio, copy the URL and go to Tools -> Import Dataset -> From Web URL…

R and Python are both good languages to learn for data analysis.  I lean just a little bit toward R, but they’re both strong choices in this space.
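For comparison, here is a minimal sketch of the same import-from-URL step in Python, using only the standard library. The function names are hypothetical, and the sample columns are made up, because the excerpt does not list the dataset's actual columns.

```python
import csv
import io
from urllib.request import urlopen


def parse_csv(text):
    """Parse CSV text into a list of dictionaries keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def read_csv_from_url(url, encoding="utf-8"):
    """Download a CSV file and parse it (the encoding is an assumption)."""
    with urlopen(url) as response:
        return parse_csv(response.read().decode(encoding))


# Made-up sample with hypothetical column names, so the parsing step can be
# checked without a network connection.
sample = "name,salary\nA. Example,50000\nB. Example,62000\n"
records = parse_csv(sample)

# To load the real file, pass the URL quoted above:
# records = read_csv_from_url("https://data.montgomerycountymd.gov/api/views/6rqk-pdub/rows.csv?accessType=DOWNLOAD")
```

Every value comes back as a string, so numeric columns need converting before any analysis.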

Azure SQL Database Alerts With PowerShell

Mike Fal shows how to create Azure SQL Database alerts using PowerShell:

So let’s get down to brass tacks and actually create an alert. To do this, we need some info first:

  • The Resource Group we will create the alert in.

  • An Azure location where the alert will live.

  • An Azure SQL Database server and database we are creating the alert for.

  • What metric we will monitor and what is the threshold we will be checking.

  • (optional) An email to send an alert to.

Mike follows this up with code and shows it’s not scary at all to create these alerts from within PowerShell.

Getting Started With Azure ML

Koos van Strien gives a quick overview of Azure ML:

Before I started, I was already quite comfortable programming Python and did some R programming in the past. This turned out pretty handy, though not really needed to start off with – because starting with Azure ML, the data flow can be created much like BI specialists are used to in SSIS.

A good place to start for me was the Tutorial competition (Iris Petal Competition). It provides you with a pre-filled workspace with everything in place to train and test your first ML model:

I’d like to see Azure ML get more traction; I’m not optimistic that it will.

Not Catching Them All

Hanjo Odendaal explains clustering techniques using Pokemon:

To collect the data on all the first generation pokemon, I employ Hadley Wickham’s rvest package. I find it very intuitive, and it can handle all of my needs in collecting and extracting the data from a pokemon wiki. I will grab all the Pokemon up to Gen II, which constitutes 251 individuals. I did find the website structure a bit of a pain as each pokemon had very different looking web pages. But, with some manual hacking, I eventually got the data in a nice format.

This probably means a lot more to you if you grew up in front of a Game Boy, but there’s some good technique in here regardless.

Migrating To Azure SQL Data Warehouse

Rangarajan Srirangam has a detailed article on steps you should take when migrating a database to Azure SQL Data Warehouse:

This article focuses on migrating data to Azure SQL Data Warehouse with tips and techniques to help you achieve an efficient migration. Once you understand the steps involved in migration, you can practice them by following a running example of migrating a sample database to Azure SQL Data Warehouse.

Migrating your data to Azure SQL Data Warehouse involves a series of steps. These steps are executed in three logical stages: Preparation, Metadata migration and Data migration.

It’s a lengthy read, but well worth it.

JupyterLab

Serdar Yegulalp reports that Jupyter is getting a major facelift:

JupyterLab uses a web-based UI that’s akin to the tab-and-panel interface used in IDEs like Visual Studio or Eclipse. Notebooks, command-line consoles, code editors, language references, and many more items can be arranged in various combinations, powered by the PhosphorJS framework.

“The entire JupyterLab [project] is built as a collection of plugins that talk to kernels for code execution and that can communicate with one another,” the developers wrote. “We hope the community will develop many more plugins for new use cases that go far beyond the basic system.”

It looks like they’re making major changes to keep up with Zeppelin on the back end.  The biggest advantage Jupyter had for me over Zeppelin was its installation simplicity, so I hope they keep it just as easy as installing Anaconda and then loading JupyterLab.

Copying A File Using SQL Server

Slava Murygin makes me want to add a “wacky ideas” category with this one:

First, you have to read the file you want to copy into SQL Server. You have to choose a database to perform that action. It can be a test database, a new database created for the purpose, or even TempDB. There are only two requirements for the database:
– It must not be a production database;
– The database should have enough space to accommodate the file you want to copy.

The idea is that if the database engine’s service account can read a file that you lack permission to access, you can bulk load its contents into a table as a binary blob and then write that blob out to your local system using bcp.  Sure, it becomes your company’s most expensive file copy tool, but I love the mad ingeniousness behind it.
