
Month: November 2016

Upcoming PolyBase Enhancements

James Serra knows how to get my interest:

PolyBase was first made available in Analytics Platform System in March 2013, and then in SQL Server 2016. The announcement at the PASS Summit was that by preview early next year, in addition to Hadoop and Azure blob storage, PolyBase will support Teradata, Oracle, SQL Server, and MongoDB in SQL Server 2016. And the Azure Data Lake Store will be supported in Azure SQL Data Warehouse PolyBase.

With SQL Server 2016, you can create a cluster of SQL Server instances to process large data sets from external data sources in a scale-out fashion for better query performance (see PolyBase scale-out groups).

I’m excited for the future of PolyBase and looking forward to vNext and vNext + 1 (for the stuff which they can’t possibly get done in time for vNext).


Timeline Visual

Devin Knight looks at a new Power BI custom visual:

  • The Timeline is similar to the native slicer in Power BI but has several more customizations available.

  • Not surprisingly, this visual can only accept date values.

  • If you need to adjust the start date of the Timeline based on your work's fiscal calendar, that is possible in the format settings.

This is a pretty nice visual, but when I tried to use it, I remember it feeling a little limiting, particularly around drilling into date slices.


STRING_SPLIT Results

Louis Davidson looks at a couple of edge cases with the STRING_SPLIT function in SQL Server 2016:

But what about the two versions of an empty value? '' (zero-length/empty string) and NULL. My NULL sense told me that the NULL one would return a single row with NULL, and the empty string would return a single empty string row. Of course, I was wrong, and it makes sense why (a row of NULL would be really annoying, especially if you want to use the output as an exclusion list, because A NOT IN (SET(B,NULL)) returns NULL, not TRUE, whenever A differs from B).

For example, say the output could include NULL. Even though the input value of A is not in the NOT IN list, no rows would be returned.
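The NOT IN behavior described here is standard SQL three-valued logic, so it can be reproduced outside SQL Server. What follows is a minimal sketch, assuming SQLite through Python's built-in sqlite3 module rather than T-SQL. The helper `rows_not_in` and the table name `exclusions` are hypothetical names made up for illustration.

```python
import sqlite3


def rows_not_in(value, exclusion_list):
    """Run SELECT value WHERE value NOT IN (exclusion list) and return the rows.

    Assumption: SQLite stands in for SQL Server; both follow the same
    three-valued logic for NOT IN when the list contains NULL.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE exclusions (b INTEGER)")
    conn.executemany(
        "INSERT INTO exclusions (b) VALUES (?)",
        [(b,) for b in exclusion_list],  # Python None is stored as SQL NULL
    )
    rows = conn.execute(
        "SELECT ? AS a WHERE ? NOT IN (SELECT b FROM exclusions)",
        (value, value),
    ).fetchall()
    conn.close()
    return rows


# 1 is not in (2, 3): the predicate is TRUE, so the row comes back.
print(rows_not_in(1, [2, 3]))
# 1 is not in (2, 3, NULL) either, but the predicate is NULL, so no row.
print(rows_not_in(1, [2, 3, None]))
```

The second call returns no rows even though 1 does not appear in the list, which is why a NULL row in a split result would be awkward for exclusion lists.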

Click through for more details.


Quickly Reloading Tables

Kenneth Fisher uses table partitioning to perform fast loads of data:

Now if this table is partitioned you’d use SWITCH and bring in a new partition.

For those that don’t know, when a table is partitioned, you can create a new empty partition, and a new empty table, load the table, make the table exactly match the partition (structure, check constraints, & indexes for example) and you can SWITCH it in. The SWITCH part is a metadata operation and is fast!

But what do you do if the table isn’t partitioned? Well, I was having a conversation with Andy Mallon and he reminded me of something.

Read on for the details.  The upshot is that you can take your time loading the second table and once you’re ready to swap out, it’s a quick metadata change.  That’s really useful for ETL scenarios.
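The general load-then-swap idea can be illustrated without SQL Server. This is a minimal sketch, assuming SQLite through Python's sqlite3 module and hypothetical table names (`sales`, `sales_staging`, `sales_old`). It swaps tables with renames inside one transaction, which is a stand-in technique and not the approach the linked post describes.

```python
import sqlite3


def reload_by_swap(conn, rows):
    """Fill a staging table at leisure, then swap it in with quick renames.

    Assumptions: `conn` was opened with isolation_level=None so that the
    explicit BEGIN/COMMIT below controls the transaction, and a table
    named `sales` already exists.
    """
    cur = conn.cursor()
    # Slow part: build and load the staging copy; `sales` stays readable.
    cur.execute("DROP TABLE IF EXISTS sales_staging")
    cur.execute(
        "CREATE TABLE sales_staging (id INTEGER PRIMARY KEY, amount REAL)"
    )
    cur.executemany(
        "INSERT INTO sales_staging (id, amount) VALUES (?, ?)", rows
    )
    # Fast part: swap the names in a single transaction, then drop the old copy.
    cur.execute("BEGIN")
    cur.execute("ALTER TABLE sales RENAME TO sales_old")
    cur.execute("ALTER TABLE sales_staging RENAME TO sales")
    cur.execute("DROP TABLE sales_old")
    cur.execute("COMMIT")


demo = sqlite3.connect(":memory:", isolation_level=None)
demo.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, amount REAL)")
demo.execute("INSERT INTO sales (id, amount) VALUES (1, 10.0)")
reload_by_swap(demo, [(1, 11.0), (2, 22.0)])
print(demo.execute("SELECT id, amount FROM sales ORDER BY id").fetchall())
```

The long-running load only touches the staging table, so the window in which readers could be affected is limited to the short rename transaction.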


High Availability On Linux

David Bermingham looks at high availability within SQL Server on Linux:

With Microsoft’s recent release of the first public preview of MS SQL Server running on Linux, I wondered what they would do for high availability. Knowing how tightly coupled AlwaysOn Availability Groups and Failover Clustering are to the Windows operating system, I was pretty certain they would not be options, and I was correct.

Well, the people over at LinuxClustering.Net answered my question on how to provide high availability failover clusters for MS SQL Server v.Next on Linux with this great Step by Step article.

The linked article is amazing.  It uses a piece of third-party software to perform clustering, so it’s not a free solution.  We’ll see if Microsoft is able to build in a full HA solution in the first version of Linux-supported SQL Server, but if not, it looks like there’s an alternative.


Thinking About Linux Internals

Anthony Nocentino speculates on the internals of SQL Server on Linux:

OK, so everyone wants to know how Microsoft did it…how they got SQL Server running on Linux. In this article, I’m going to try to figure out how.

There are a couple of approaches they could take…a direct port or some abstraction layer…A direct port would have been hard; basically any OS interaction would have had to be looked at, and that would have been time consuming and risk prone. Who comes along to save the day? Abstraction. The word you hear about a million times when you take Operating Systems classes in undergrad and grad computer science courses. 🙂

Anthony talks about picoprocesses, which causes me to say that containers (like Docker) are probably the most important administrative concept of the decade.  If you don’t fundamentally get the concept, learning it opens so many doors.


Modifying The Query Store

Grant Fritchey answers a query store question:

When I was presenting on this topic at the PASS Summit a few weeks ago, one great question came up (great question = answer is “I don’t know”). Well, I defaulted to an “I don’t know” answer, but my guess was, “No.” The question was: can you take a plan from one server, let’s say a test server, export it in some way, and then import it to production? In this manner, you ensure that a plan you like gets into production without having to clear the plan from cache & generate a plan by running the query.

Great idea.

Read on for the answer, as well as ways to manipulate query store data.
