Live Query Stats Versus Actual Execution Plans

Kendra Little compares and contrasts Live Query Statistics against actual execution plans:

Getting plan details isn’t free. The amount of impact depends on what the query is doing, but there’s a stiff overhead to collecting actual execution plans and to watching live query statistics.

These tools are great for reproing problems and testing outside of production, but don’t time query performance while you’re using them; you’ll get too much skew.

Live Query Statistics is one additional tool, but won’t replace actual execution plans.  At its best, it will make you think more about what’s going on with the system, whether row counts are what you’re expecting, and take account of which operators stream data through without blocking (such as nested loop joins) versus those which require all the data before continuing (sorts).

Power BI Auto-Installation

Simon Sabin uses PowerShell to install Power BI:

Having recently been rebuilding my machine, I finally decided to automate the process of installing the software I need.

This was a life saver as I was reinstalling a few times to try and figure out why I wasn’t getting sound on my external monitor. So I was gradually uninstalling everything until I found out that it was Hyper-V that was causing the problem.

The outcome meant I was installing PowerBI lots and had to automate it.

This looks like the first step toward a Chocolatey script.

Joining On NULL

Erik Darling has opened a can of worms here:

WITH ALL THE TROUBLE NULLS CAUSE…

You’d think people would be more inclined to avoid them. Slap a NOT NULL constraint and a default value on your column and call it a day. I’m perfectly fine with bizarro world canary values. If it’s an integer column, some really high (low?) negative number. If it’s date-based, why not have it be the lowest value your choice accommodates?

Check out the comments, definitely.  I don’t think it’s as clear-cut as Erik argues; the idea of NULL has been and will remain controversial because it’s a useful concept but one which requires explicit consideration.
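To make the trade-off concrete, here is a minimal sketch of what happens when you join on a nullable column versus a column that uses a canary value. It uses Python's built-in sqlite3 module as a stand-in for SQL Server (the `=` comparison treats NULL the same way under SQL Server's default ANSI_NULLS setting), and the table names and the sentinel value are hypothetical, picked only for illustration.

```python
import sqlite3

# Hypothetical canary value standing in for "unknown"; any value that can
# never be a real key would do.
SENTINEL = -2147483648

conn = sqlite3.connect(":memory:")
conn.executescript(f"""
    -- Nullable version: unknown keys are stored as NULL.
    CREATE TABLE left_nullable  (id INTEGER, k INTEGER);
    CREATE TABLE right_nullable (id INTEGER, k INTEGER);
    INSERT INTO left_nullable  VALUES (1, 10), (2, NULL);
    INSERT INTO right_nullable VALUES (1, 10), (2, NULL);

    -- Sentinel version: NOT NULL plus a default value.
    CREATE TABLE left_sentinel  (id INTEGER, k INTEGER NOT NULL DEFAULT {SENTINEL});
    CREATE TABLE right_sentinel (id INTEGER, k INTEGER NOT NULL DEFAULT {SENTINEL});
    INSERT INTO left_sentinel  (id, k) VALUES (1, 10);
    INSERT INTO left_sentinel  (id)    VALUES (2);
    INSERT INTO right_sentinel (id, k) VALUES (1, 10);
    INSERT INTO right_sentinel (id)    VALUES (2);
""")


def count_rows(sql):
    """Run a COUNT(*) query and return the single number it produces."""
    return conn.execute(sql).fetchone()[0]


# NULL = NULL evaluates to unknown, so the two "unknown" rows never pair up.
nullable_matches = count_rows(
    "SELECT COUNT(*) FROM left_nullable l JOIN right_nullable r ON l.k = r.k"
)

# The sentinel is an ordinary value, so the two "unknown" rows now match.
sentinel_matches = count_rows(
    "SELECT COUNT(*) FROM left_sentinel l JOIN right_sentinel r ON l.k = r.k"
)

print(nullable_matches, sentinel_matches)
```

The sentinel version returns one more row than the nullable version because the two "unknown" rows join to each other. Whether that extra match is correct depends on what "unknown" means in your data, which is a fair summary of why the argument never really ends.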

SQLPS Update

Chrissy LeMaire has an update on SQLPS and SQL Server Linux:

  • Microsoft is investigating options for a cross-platform lightweight SQL Management Studio GUI tool for Linux.

  • Microsoft is investigating open sourcing the SQL Server PowerShell provider and cmdlets, and that it “makes a lot of sense” and “aligns with what Microsoft has already done with our Azure PowerShell cmdlets on github.” This is being tracked by connect item 2442788.

  • Microsoft doesn’t have dates or more details to share for any of these items at this time and will keep the community updated on their progress as they continue to evaluate our plans based on customer feedback.

I’m most interested in the first of these points, but this is all interesting news.  Also check out her guest appearance on the PowerScripting Podcast.

Find Object Dependencies

Kevin Feasel

2016-03-18

T-SQL

Manoj Pandey has pulled out the code used in Management Studio to get dependencies:

And here is a very lengthy (~900 lines) T-SQL Code that I generated from SSMS & SQL Profiler to check the same Dependencies of a Table in SQL Server 2014. You can also create a Stored Procedure and apply the Table & Schema as parameters.

You can just replace the Table & Schema in the first 2 lines and execute the code to check table dependencies

You might be able to optimize this script, but it’s nice to have a starting point.

Nested Display Folders

Koen Verbeeck shows how to use nested display folders in Analysis Services and get Power BI to use them as well:

On the same day, I also found out it’s possible to nest display folders in SSAS. Which was even better, because I have a project with dozens of measures and this functionality really makes the difference. All you have to do is put backslashes to indicate where a new child folder begins.

This makes intuitive sense, so good on Microsoft for supporting this.
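As a rough illustration of the naming convention only (this is not an SSAS API, and the measure and folder names are made up), here is how backslash-separated display folder strings map onto a nested folder tree:

```python
def build_folder_tree(display_folders):
    """Turn {measure: 'Parent\\Child'} strings into a nested dict of folders.

    Measures are collected under the hypothetical '_measures' key of the
    folder they belong to.
    """
    tree = {}
    for measure, folder_path in display_folders.items():
        node = tree
        # Each backslash starts a new child folder.
        for folder in folder_path.split("\\"):
            node = node.setdefault(folder, {})
        node.setdefault("_measures", []).append(measure)
    return tree


# Hypothetical measures and display folder strings.
folder_tree = build_folder_tree({
    "Total Sales": "Sales",
    "Margin %": "Sales\\Ratios",
    "Headcount": "HR",
})
print(folder_tree)
```

Here "Margin %" ends up in a Ratios folder nested inside Sales, while "Total Sales" sits directly in Sales.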

Querying Active Directory From SQL Server

Ryan Adams shows us how to use OPENROWSET and OPENQUERY to connect to a domain controller and query Active Directory using LDAP:

In the code below, the first thing we do is enable Ad Hoc Distributed Queries so we can try out the OPENROWSET method.  The advantage to this method is not having a linked server and being able to call it directly out of TSQL.  Once we have that enabled we write our query and you’ll notice that we are essentially doing 2 queries.  The first query is the LDAP query inside the OPENROWSET function.  Once those results are returned we are using another query to get what we want from the result set.  Here is where I want you to stop and think about things.  If my LDAP query pulls back 50 attributes, or “columns” in SQL terms, and I tell it I only want 10 of them, what did I just do?  I brought back a ton of extra data over the wire for no reason because I’m not planning to use it.  What we should see here is that the columns on both SELECT statements are the same.  They do not, however, have to be in the same order.  The reason for that is because LDAP does not guarantee to return results in the same order every time.  The attribute or “column” order in your first SELECT statement determines the order of your final result set.  This gives you the opportunity to alias anything if you need to.

You can query LDAP using SELECT statements, but the syntax isn’t T-SQL, so in my case, it was a bit frustrating getting the data I wanted out of Active Directory because I was used to T-SQL niceties.  Nevertheless, this is a good way of pulling down AD data.

Power Pivot Or Power Query?

Avi Singh explains when to use Power Query versus Power Pivot:

On Power BI Desktop, you don’t even have a choice – the only route to connect to data is via the “Get Data/Power Query” interface. Which is A-Okay with me. Even with Excel, I now connect to ANY data using Power Query.

Use Power Query to fill all your Get Data needs

Yes, ANY data. Even if I could connect using Power Pivot to those data sources and did not need any transformation  – I still always use Power Query.

Power Query to get data, Power Pivot to model data.  Avi then gives a few examples of scenarios, explaining where each fits in.

Blocking Operators

Gail Shaw explains how operators like sort can reduce the actual row count:

A non-blocking operator is one that consumes and produces rows at the same time. Nested loop joins are non-blocking operators.

A blocking operator is one that requires that all rows from the input have been consumed before a single row can be produced. Sorts are blocking operators.

Some operators can be somewhere between the two, requiring a group of rows to be consumed before an output row can be produced. Stream aggregates are an example here.

Gail ends by explaining that this is why “Which way do you read an execution plan, right-to-left or left-to-right?” has a correct answer:  both ways.  This understanding of blocking, non-blocking, and partially-blocking operators will also help you optimize Integration Services data flows by making you think in terms of streams.
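If it helps to think in terms of streams, here is a small sketch using Python generators as stand-ins for plan operators. None of this is how SQL Server implements its operators; the function names are hypothetical and the only point is to count how many input rows each kind of operator has to read before it can hand back its first output row.

```python
from itertools import groupby


def counting_scan(rows, counter):
    """Feed rows downstream one at a time, counting each row that is read."""
    for row in rows:
        counter["read"] += 1
        yield row


def non_blocking(rows):
    """Pass each row through as soon as it arrives."""
    for row in rows:
        yield row * 10


def blocking_sort(rows):
    """A sort cannot emit anything until it has seen every input row."""
    yield from sorted(rows, reverse=True)


def stream_aggregate(rows):
    """Count rows per key; assumes the input is already ordered by key."""
    for key, group in groupby(rows):
        yield key, sum(1 for _ in group)


def rows_read_before_first_output(operator):
    """Pull one row from the operator and report how many inputs it consumed."""
    counter = {"read": 0}
    ordered_input = [1, 1, 2, 2, 3, 3]
    output = operator(counting_scan(ordered_input, counter))
    next(output)
    return counter["read"]


reads_non_blocking = rows_read_before_first_output(non_blocking)
reads_sort = rows_read_before_first_output(blocking_sort)
reads_aggregate = rows_read_before_first_output(stream_aggregate)
print(reads_non_blocking, reads_sort, reads_aggregate)
```

The pass-through operator needs one input row, the sort needs all six, and the aggregate lands in between: it reads the first group plus the one row that tells it the group has ended.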

Ragged Right Files

Sifiso Ndlovu walks us through ragged right formatted files in Integration Services:

The configuration of columns is perhaps a critical part of the entire ETL process as it helps us build mapping metadata for your ETL. In fact, regardless of whether or not SSIS/SSMS can detect delimiters, if you skip the Column Mapping section, your ETL will fail validation. In order to clarify how ragged right formatted files work, I have gone a step back and used Figure 4 to display a preview of our fictitious Fruits transaction dataset from Notepad++. It can already be seen from Notepad++ that the file only has a row delimiter in the form of CRLF.

Read the whole thing.
