Press "Enter" to skip to content

Author: Kevin Feasel

Power BI AutoML

Teo Lachev takes a look at AutoML in Power BI:

Let’s see how AutoML works based on what’s in the private preview (the usual disclaimer is that things will probably change). To start with, AutoML requires a dataflow (a note to Microsoft here is that AutoML will become more pervasive if it’s available in Power BI Desktop and it doesn’t require a premium capacity). In the private preview, AutoML requires the following steps. Presumably. the first (and most difficult step), preparing the dataset and cleansing the data is already done and available as a dataflow entity:

It looks like Microsoft’s taking what they learned from Azure ML and trying to port it over to Power BI.

Comments closed

The Pain of DST and the Lessened Pain with AT TIME ZONE

Bert Wagner shows how you can use AT TIME ZONE as of SQL Server 2016 to make dealing with Daylight Savings Time a little less painful:

The fallacy above is that I said our two datetime2’s are in UTC, but SQL Server doesn’t actually know this. The datetime2 (and datetime) datatype doesn’t allow for time zone offsets so SQL Server really doesn’t know what time zone the data is in.

Using AT TIME ZONE on a datetime2 without offset information causes SQL Server to “…[assume] that [the datetime] is in the target time zone”. That explains why the two datetime2s above, intended to be in UTC, are actually seen as Eastern Daylight Time by SQL Server.

Read the whole thing. Dates and times are a lot more difficult than they first appear. And then they turn out to be a lot more difficult than that.

Comments closed

Big SSAS News In SQL Server 2019 CTP 2.3

Chris Webb is excited about what’s in SQL Server 2019 CTP 2.3:

With the release of CTP 2.3 of SQL Server 2019 today there was big news for Analysis Services Tabular developers: Calculation Groups. You can read all about them in detail in this blog post:

https://blogs.msdn.microsoft.com/analysisservices/2019/03/01/whats-new-for-sql-server-2019-analysis-services-ctp-2-3/

In my opinion this is the most important new feature in DAX since… well, forever. It allows you to create a new type of calculation – which in most cases will be a time intelligence like a year-to-date or a previous period growth – that can be applied to multiple measures; basically the same thing that we have been doing in SSAS Multidimensional for years with the time utility/shell/date tool dimension technique. It’s certainly going to solve a lot of problems for a lot of SSAS Tabular implementations, many of which have hundreds or even thousands of measures for every combination of base measure and calculation type needed.

Click through for more of Chris’s thoughts and how calculation groups will make your life easier.

Comments closed

Failing A Powershell Step In SQL Agent

Stuart Moore shows us how we can get a SQL Agent job running a Powershell step to recognize failure:

You might want the same response to make sure your monitoring is letting you know when jobs fail. On that note, it would also be nice if you could raise an error or failure message in your PowerShell step and have that propagate back up to SQL Server

Unfortunately the usual scripting standbys of returning 0 or $false don’t work.

Stuart does have a solution, though, so read on to learn what it is.

Comments closed

Multi-Server Real-Time Scoring with R

David Smith points out a reference architecture for real-time scoring with R distributed through Kubernetes:

Let’s say you’ve developed a predictive model in R, and you want to embed predictions (scores) from that model into another application (like a mobile or Web app, or some automated service). If you expect a heavy load of requests, R running on a single server isn’t going to cut it: you’ll need some kind of distributed architecture with enough servers to handle the volume of requests in real time.

This reference architecture for real-time scoring with R, published in Microsoft Docs, describes a Kubernetes-based system to distribute the load to R sessions running in containers.

Looks interesting.

Comments closed

Pivot Tables and Grouping Sets With data.table in R

Jozef Hajnala shows how you can use data.table to perform fast pivoting and arbitrary grouping of data in R:

Data manipulation and aggregation is one of the classic tasks anyone working with data will come across. We of course can perform data transformation and aggregation with base R, but when speed and memory efficiency come into play, data.table is my package of choice.

In this post we will look at of the fresh and very useful functionality that came to data.table only last year – grouping sets, enabling us, for example, to create pivot table-like reports with sub-totals and grand total quickly and easily.

Grouping sets are also available in SQL dialects and tend to be something people tend not to be aware of. This is a shame because they’re quite powerful, and Jozef shows how powerful they can be in R.

Comments closed

Flowcharts in R

Anisa Dhana builds a sample flowchart with DiagrammeR:

After some search, I found that there are a few packages in R which allow making exemplary flowcharts. The one which I found easy to use was DiagrammeR. The advantage of this packages is that generate diagrams using code within R Markdown syntax.

The taped-glasses nerd in me wants to point out that flow charts use geometric shapes to show flow and that this is more properly labeled a graph (the examples are directed acyclic graphs), but hush, taped-glasses nerd self.

Comments closed

Analyzing Autosteer Data (Or Lack Thereof)

Elliot Williams has an interesting analysis of the NHTSA report on Tesla’s Autosteer capabilities:

But the NHTSA report went a step further. Based on the data that Tesla provided them, they noted that since the addition of Autosteer to Tesla’s confusingly named “Autopilot” suite of functions, the rate of crashes severe enough to deploy airbags declined by 40%. That’s a fantastic result.

Because it was so spectacular, a private company with a history of investigating automotive safety wanted to have a look at the data. The NHTSA refused because Tesla claimed that the data was a trade secret, so Quality Control Systems (QCS) filed a Freedom of Information Act lawsuit to get the data on which the report was based. Nearly two years later, QCS eventually won.

Looking into the data, QCS concluded that crashes may have actually increased by as much as 60% on the addition of Autosteer, or maybe not at all. 

This is a great exercise in statistical analysis and the problem of garbage in, garbage out.

Comments closed

Using the ALL Function in DAX

Reza Rad shows us how we can make use of the ALL function in a DAX query:

Among all the functions in DAX; the behavior of ALL function still seems mysterious for many. Many people, don’t use it at all and end up writing a very complicated calculation for a scenario that only one simple expression can do the same job. Some people, use it, but don’t exactly know how the function works, and get some unexpected results, and call it an error. In this article, I’m going to explain what ALL function is, how it can be used, and what are use cases of using such a function in DAX and Power BI. If you like to learn more about Power BI, read the Power BI book from Rookie to Rock Star.

Read on to see how the function behaves. Reza does a good job getting into the nuance of this function.

Comments closed