Polybase External Data Source To Hadoop

I take a look at connecting to a Hadoop cluster for Polybase:

There are a couple of things I want to point out here.  First, the Type is HADOOP, one of the three types currently available:  HADOOP (for Hadoop, Azure SQL Data Warehouse, and Azure Blob Storage), SHARD_MAP_MANAGER (for sharded Azure SQL Database Elastic Database queries), and RDBMS (for cross-database Elastic Database queries on Azure SQL Database).

Second, the Location is my name node on port 8020.  If you’re curious about how we figure that one out, go to Ambari (which, for me, is http://sandbox.hortonworks.com:8080) and go to HDFS and then Configs.  In the Advanced tab, you can see the name node:

There are different options available for different sources, but this post is focused on Hadoop.

The Value Of Unused Indexes

Erik Darling provides a scenario in which an index which does not get used in an execution plan can nonetheless help query performance:

We can see an example of this with unique indexes and constraints, but another possibility is that the created index had better statistical information via the histogram. When you add an index, you get Fresh Hot Stats, whereas the index you were using could be many modifications behind current for various reasons. If you have a big table and don’t hit auto-update thresholds often, if you’re not manually updating statistics somehow, or if you’re running into ascending key weirdness. These are all sane potential reasons. One insane potential reason is if you have autocreate stats turned off, and the index you create is on a column that didn’t have a statistics object associated with it. But you’d see plan warnings about operators not having associated statistics.

Again, we’re going to focus on how ADDING an index your query doesn’t use can help. I found out the hard way that both unique indexes and constraints can cease being helpful to cardinality estimation when their statistics get out of date.

This is sort of like a triple bank shot solution:  even if it works that one time, there are easier ways to do it—and those ways are more likely to succeed to boot.

Comments And Performance

Aaron Bertrand looks at whether comments affect query performance:

Every once in a while, a conversation crops up where people are convinced that comments either do or don’t have an impact on performance.

In general, I will say that, no, comments do not impact performance, but there is always room for an “it depends” disclaimer.

I’m glad that there’s no appreciable difference.  Even if there were, good comments are valuable enough to make me not care about performance implications.  But fortunately, that’s not a trade-off I have to make.

Powershell Cmdlets For SSRS

Aaron Nelson reports that there are now Powershell cmdlets for SQL Server Reporting Services:

I have been testing these commands for several weeks and so far my favorite command is Write-RsFolderContent because it will allows you to write the .RDL & .RSD files from a directory on your machine to your SSRS folder. Like the whole thing. You don’t have to throw it into a loop or anything. Try it out!

This is a wonderful replacement for the old RSScripter app (of which I still have a copy squirreled away somewhere).

Biml-Based Audio

Kevin Feasel



Bill Fellows shows the most unique usage of Biml I’ve seen yet:

How often do you need to play audio while you’re compiling your Biml packages? Never? Really? Huh, just me then. Very well, chalk this blog post as one to show you that you really can do *anything* in Biml that you can do in C#.

When I first learned how I can play audio in .NET, I would hook the Windows Media Player dll and use that. The first thing I then did was create an SSIS package that had a script task which played the A-Team theme song while it ran. That was useless but a fun demo. Fast forward to using Biml and I could not for the life of me get the Windows Media Player to correctly embed in a Biml Script Task. I suspect it’s something to do with the COM bindings that Biml doesn’t yet support. Does this mean you shouldn’t use Biml – Hell no. It just means I’ve wandered far into a corner case that doesn’t yet have support.

Read on because it will make you a better person.

Power BI Streaming Datasets

Chris Webb shows how to use Streaming Datasets functionality in Power BI:

This query does the following:

  • Reads the data from the Sales table in the workbook

  • Converts the data to JSON (for some background on how it does this, see here)

  • Sends the data to the streaming dataset using Web.Contents() to make a POST request. See this post on how to make POST requests using Web.Contents() and this post on the technique I’m using to handle HTTP errors manually.

  • Returns a table containing a message saying whether the data was updated successfully or not, and the time of execution like so:

There are some limitations on what’s available now, but getting streaming data out to Power BI can make a near-real-time dashboard possible.


November 2016
« Oct Dec »