Wanted: Filtering Power BI Data Views

Rob Collie is rabble-rousing and building up a pitchfork mob:

I don’t know how we’ve managed to go this long without the ability to Filter in the Data View:

As simple as it sounds, this is the #1 thing I miss when I leave the Excel Power Pivot environment for Power BI Desktop.

Why do I miss it?  Debugging and Validating, primarily.  When I’m getting results from a measure that I don’t trust, it is super common for me to filter a table in Power Pivot so that I’m just looking at a specific set of rows.  And then I often filter by a second column, and then sometimes a third, in order to see if there are any unexpected surprises in the data.  Scrolling through thousands of rows looking with my eyes just doesn’t cut it.

This is positively crucial, at least for me, and creating a Table or Matrix visualization to perform the same task is such a high-friction alternative that I’m usually tempted to start over in Power Pivot.  Ugh.

In fact this is the overwhelming #1 reason why, if I’m starting a model from scratch, I try to do as much of the work as I can in Power Pivot before converting over to PBIX format.

Rob has provided a link to the Power BI Ideas forum, making it easy to vote this item up.

Better Grouping With dplyr

John Mount builds a function to improve upon the group-by to mutate model in dplyr:

The advantages of the shorthand are:

  • The analyst only has to specify the grouping column once.
  • The data (mtcars) enters the pipeline only once.
  • The analyst doesn’t have to start thinking about joins immediately.

Frankly I’ve never liked the shorthand. I feel it is a “magic extra” that a new user would have no way of anticipating from common use of group_by() and summarize(). I very much like the idea of wrapping this important common use case into a single verb. Adjoining “windowed” or group-calculated columns is a common and important step in analysis, and well worth having its own verb.

Below is our attempt at elevating this pattern into a packaged verb.

Click through for the script.  I’d like to see something like this make its way into dplyr.

Azure SQL Database Security Basics

Arun Sirpal explains some of the security features exposed in Azure SQL Database:

You do not really have to use all of them, this is down to you and your requirements but at least you have decisions to make. I will mention TDE (Transparent Data Encryption), I know a lot of people will opt for TDE in Azure. The big advantage of TDE in Azure over the earthed flavour is that Microsoft does a lot of the work for you, especially around the key management side of things. Also assuming your database is in a geo-replication partnership it will be protected by a different key on each of the servers. Microsoft will also rotate your certificate at least every 90 days, doing this with a local based SQL Server can be quite manual and fiddly (well I think it is).

Read the whole thing if you’re thinking of moving forward with Azure SQL Database, or if you already have a database up in Azure and haven’t checked the latest offerings yet.

Building Multiple Docker Containers At Once

Andrew Pruski shows how to use Docker compose to build multiple containers simultaneously:

As SQL Server people we’re only going to be interested in one application but that doesn’t mean we can’t use compose to our advantage.

What I’m going to do is go through the steps to spin up 5 containers running SQL Server, all listening on different ports with different sa passwords.

Bit of prep before we run any commands. I’m going to create a couple of folders on my C:\ drive that’ll hold the compose and dockerfiles: –

Andrew also explains a couple of common errors as he walks us through the process.

Annoying Date Formats

Randolph West shows that even The Best Date Format can deceive you under certain circumstances:

Look carefully. DATE and DATETIME2 are showing the date of 12 July 2017 as expected. Unfortunately, the DATETIME and SMALLDATETIME data types are showing a date of 7 December 2017.

That’s not good at all. It means that the ISO 8601 standard does not work the way we might expect it to. The reason is simple, if annoying: we need to add a time to the date to make it pass the ISO 8601 standard.

I don’t like the idea of having to write 20170713 instead of 2017-07-13, but that is the only date format in SQL Server that I’ve run across that will work with any language and culture settings.

Installing Python Support In SQL Server

Ginger Grant has a teaser for her upcoming 24 Hours of PASS talk:

The process for using Python in SQL Server is very similar to the previous process of installing R.  Microsoft renamed R Services to Machine Learning Services, and now allows both R and Python to be installed, as shown in the screen.  Microsoft’s version of Python uses Anaconda, which is an open source analytics platform created by Continuum. This is where Python differs from other open source languages, as Continuum is providing the version of Python as it contains data science components which are not included in the standard distribution of Python. Continuum also sells an enterprise version of Anaconda, with of course more features than come with the free version. It is important to remember the python environment as you will need select the same distribution when running Python code outside of SQL Server.

Read on to see how to install Python support in SQL Server 2017 and for a few links to tools.

Parameter Sniffing On Conditional Statements

Kendra Little explains that SQL Server will cache parameter values for invalid statements:

The first time that dbo.ReviewFlags is executed after the database comes online, it’s with an invalid parameter, like this:

  • EXEC dbo.ReviewFlags @Flag = null;
  • GO

This is caught by the IF block, hits the RAISERROR, and goes down to the THROW block, and the output is:

  • Msg 50000, Level 11, State 1, Procedure ReviewFlags, Line 8 [Batch Start Line 70]
  • @Flag must be a value between 1 and 5

But even though SQL Server didn’t execute the SELECT statement, it still compiled it. And it also cached the plan.

Read on to understand the trouble this can cause, as well as a few ways of solving the problem.  This is a special case of parameter sniffing problems, but the solutions are the same as in the general case.

STOP Date Formats

Dave Mason notes that the STOPAT date option when restoring a log backup is temperamental:

There’s nothing I see in the documentation regarding the format for “time“. But there are a couple of examples, including this one:

RESTORE LOG AdventureWorks
FROM AdventureWorksBackups
WITH FILE=4, NORECOVERY, STOPAT = 'Apr 15, 2020 12:00 AM';

That string looks suspiciously like a US English date format. I suspect that wouldn’t work for languages that don’t recognize “Apr” as a month. And what if the date is displayed in one of the many date formats used outside of the US? Lets find out!

Dave tried 21 different date formats; click through for the results.


July 2017
« Jun