Estimated Number Of Rows To Be Read

Rob Farley discusses a successful Connect item’s implementation:

I opened it up, and sure enough, no sign of that 7,276 value. It looks just the same as the estimated plan I just showed.

Getting plans out of the cache is where the estimated values come into their own. It’s not just that I’d prefer to not actually run potentially-expensive queries on customer databases. Querying the plan cache is one thing, but running queries to get the actuals – that’s a lot harder.

With SQL 2016 SP1 installed, thanks to that Connect item, I can now see the Estimated Number of Rows to be Read property in estimated plans, and in the plan cache. The operator tooltip shown here is taken from the cache, and I can easily see that Estimated property showing 7,276. This is shown from Management Studio because Plan Explorer doesn’t yet call out this property explicitly:

If you’re looking to use SQL Server 2016 SP1, read the whole thing; this will make query tuning without running those horribly expensive queries a bit easier.

Wanted: Database-Level Wait Stats

Arun Sirpal would like to see database-level wait stats:

Wait Stats is my “go-to” thing, when you want to dig into performance issues everyone knows you will probably end up using sys.dm_os_wait_stats. You cannot use this in the Azure world, you have to use a DMV that is scoped to the database level. I think this would be a nice idea to have with the “earth” based SQL Servers – the ability to return information about all the waits encountered by threads that executed at the database level.

The connect item can be found at this link:

I like this idea.

Wanted: Sorting Stacked Charts In Power BI

Melissa Connors would like to control stack order for a stacked chart in Power BI:

Currently, it’s displaying the stacked column in alphabetical order from the bottom up (coyote/red on the bottom, fox/black in the middle, and wolf/gray on the top). I want the ability to display it by the count for each type, not the name. In this example, I want the order of coyote, wolf, and fox. I believed that there must be a way to control the order within a stacked column, so I clicked all over the UI and fired up the Google.

Read the whole thing and vote for this Power BI idea if it’s interesting.

Wanted: Forced Parallelism

James Anderson would like a query hint to force parallelism:

I would like to see a hint that causes the optimiser to consider a parallel plan no matter the cost of the query. It’s possible to get this behaviour with trace flag 8649 but it’s unsupported by Microsoft and therefore unfit for production use.

I only tend to use query hints as a very last resort. It’s almost always better to allow the optimiser to make these decisions and continue to reevaluate these decisions as your data changes but sometimes they can be a get out of jail free card.

I’ve written about this before and so has Paul White. Paul created this connect item for something similar. Please vote it up if you think it could help with your query tuning.

Click through for the full argument, and then hit the Connect item if you agree.

Parsing JSON In R

Kevin Feasel



Tomaz Kastrun shows how to feed a JSON data set into R and turn that into a proper data frame:

JSON has very powerful statements for converting to and from JSON for storing into / from SQL Server engine (FOR JSON and JSON VALUE, etc).  And since it is gaining popularity for data exchange, I was curious to give it a try with R combination.

I will simply convert a system table into array using for json clause.

There’s an R library.  There’s always an R library.

String Or Binary Data

Lori Edwards wants to fix one of the most annoying error messages in SQL Server history:

The Connect item that I’m promoting today is Connect #339410 : Please fix the “String or binary data would be truncated” message to give the column name. This Connect item was opened in 2008 and has 1,328 upvotes. If we can create virtual realities, we should be able to add a column name to an error message. Help us Microsoft, you’re our only hope.

This error message is incredibly annoying, particularly because it seems the engine should know which record has an error and on which column that error is.  We finally got information on which row caused a primary or unique key constraint violation, and that was helpful; this would be at least as helpful.

Replication Support For Dropping Tables

Drew Furgiuele recovers from a case of the vapors after discovering a replication codebase update:

To enable this change though, there’s more to it than just “you need to be running this service pack level on your publisher.” It’s far, far more than that:

  • Your publisher needs to be running SQL Server 2016 SP1 or 2014 SP2. We’ve already seen that is a requirement based on the content of the KB article. Big whoop.
  • In addition, if you have the same article in more than one publication, you have to enable this feature on all of them or you’ll still get the error that a table is in replication and can’t be dropped.
  • Your subscribers don’t care about the SQL Server version (except when you have a data type that exists in the publisher that doesn’t at the subscriber. Then it cares. But for this change, it doesn’t).
  • Your distributor also needs to be running a version that supports this functionality.

I can’t make that last point any bolder; that’s not documented anywhere that I could find. I’ll explain why later, but for now, roll with it. In fact, let’s see what happens if we don’t upgrade our distributor as well. Time to go to the lab!

Read the whole thing, particularly the warning about how dropping a table on the publisher will not drop tables on the subscriber.

Reading Extended Event Data From Powershell

Dave Mason builds a Powershell script to parse Extended Events information:

Powershell takes center stage for this post. Previously, I showed how to handle a SQL Server Extended Event in C# by accessing the event_stream target. We can do the same thing in PowerShell. The code translates mostly line-for-line from C#. Check out the last post if you want the full back story. Otherwise, continue on for the script and some Posh-specific notes.

Read on for the code.

Connect Items Galore

Aaron Bertrand has a series of Connect items of interest:

There are so many active Connect items, it’s difficult to choose just one, but after spending some time looking at the generate_series function in PostgreSQL, the one at the top of my list right now is this suggestion from Erland Sommarskog (blog):

Connect #258733 : Add a built-in table of numbers

I think it would be cheap and easy for SQL Server to provide a defaut table of numbers, always in memory, just like any other catalog view or DMV. This thing can use compression now, regardless of edition, so even a table with 2 billion rows will only take 13 MB, and should be easy to populate either at start-up or on first use. I could list out all of the potential uses for a numbers table, but they’re fairly well documented in the following sources (and most use cases don’t need anywhere near 2 billion values, so maybe the range could be defined using a sys.configurations or database-level SCOPED CONFIGURATION setting)

Click through for additional Connect items, and please vote on any Connect items which catch your eye.

The Importance Of Auditing

Louis Davidson has a parable about database design and systems auditing:

This brings me to my data question. If an order is processed in a store, but the expected data is not created, did that order ever occur?

Very often, the staff of a business are very focused on pleasing the customer, making sure they get their product, but due to software limitations, may not end up not capturing information about every sale in a satisfactory manner. Most of the blame I have seen lies in software that doesn’t meet the requirements of a customer, making capturing desired details tedious to achieve when the process is in the norm. Very often the excuse programmers give is that too much work of the work to build a system would need to be done for the atypical cases, but requirements are requirements, and it is generally essential that every action that occurs in a business is captured as data.

Read on for more.  My conjoined twin case is, how much information do we have about why users give up?  For example, if you have a three-part form, how many users get through part one, part two, and part three?  There’s some natural level of attrition, but if you see an abnormally low follow-through rate, that might indicate a bug or major issue.  Auditing is hard work, as you have to hit both sides of the problem at the same time.


January 2017
« Dec Feb »