Debugging a Pivot

Ed Elliott takes us through problems with the PIVOT operator:

If you have a PIVOT query and it isn’t returning the data you expect, what can you do to troubleshoot it? The thing to do is to break it down into its constituent parts. First, let’s take a look at a query and see what we can do to help.

Click through for potential problems and their solutions.
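As a rough sketch of what “breaking it down” can look like (the table and column names here are invented for illustration), a PIVOT is shorthand for a grouped query with conditional aggregates, and rewriting it that way lets you test each piece on its own:

-- The PIVOT form (hypothetical dbo.Sales table)
SELECT p.[2018], p.[2019]
FROM (SELECT SaleYear, Amount FROM dbo.Sales) AS src
PIVOT (SUM(Amount) FOR SaleYear IN ([2018], [2019])) AS p;

-- The equivalent explicit form: run the inner SELECT alone,
-- then each CASE aggregate, to find where the data goes wrong
SELECT
    SUM(CASE WHEN SaleYear = 2018 THEN Amount END) AS [2018],
    SUM(CASE WHEN SaleYear = 2019 THEN Amount END) AS [2019]
FROM dbo.Sales;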

Avoiding DONE Tokens in Loops

Kevin Feasel

2019-05-23

T-SQL

Emanuele Meazzo shows one reason why loops can be so much slower in T-SQL:

Not everybody knows that SQL Server sends a DONE token to the client each time a SQL statement completes (so, everything except variable declarations); for the query above you can basically track it with Extended Events via the “SQL Statement Completed” event.
What happens in a loop? For each statement that completes, a token is sent, which means that for this loop containing 3 statements (the WHILE loop itself is a statement) 15 million tokens are sent to the client.

There can be a drastic difference, as Emanuele shows.
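To reproduce the effect yourself (the target and numbers tables below are hypothetical), compare a row-by-row loop, where every completed statement ships a DONE token back to the client, against a single set-based insert:

-- Loop: 3 statements complete per iteration, each sending a token
DECLARE @i int = 1;
WHILE @i <= 5000000
BEGIN
    INSERT dbo.Target (n) VALUES (@i);
    SET @i += 1;
END;

-- Set-based: one statement, one completion
INSERT dbo.Target (n)
SELECT n FROM dbo.Numbers WHERE n <= 5000000;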

Modifying XML in T-SQL

Max Vernon takes us through the .modify function:

Determining the property syntax when modifying XML values in SQL Server can be time consuming if you don’t work with XML regularly. SQL Server includes a very flexible XML subsystem, called XML_DML, or XML Data Manipulation Language. XML_DML can be used to easily and effectively update XML values in an xml-typed column or variable. This question on dba.stackexchange.com asked about using the .modify function to change the value of an element, which in turn prompted this post.

Read on for a number of examples.
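For a taste of the syntax (the variable and element names here are my own, not from the post), XML DML’s replace value of updates a single node picked out by an XQuery path:

DECLARE @x xml = N'<config><setting name="timeout">30</setting></config>';

-- replace value of must target exactly one node, hence the [1] ordinal
SET @x.modify('replace value of (/config/setting[@name="timeout"]/text())[1] with "60"');

SELECT @x;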

Considerations when Deleting Lots of Data

Kevin Feasel

2019-05-16

T-SQL

Ed Elliott takes us through things to think about before deleting a few million rows from a table:

Fragmentation
Fragmentation occurs when we delete from pages and there is still data surrounding our data. If we have 100 rows and delete every odd row, we would have perfect fragmentation, in that the data we keep now occupies twice the space it needs. If we delete rows 1-49, even though we remove the same number of rows, we don’t have any fragmentation, as the remaining data sits in a continuous block. Knowing how the data is stored on disk and how it will be deleted (is it the first x records, or every xth record?) is vital, so that we know whether, after the delete, we should also reorganise the indexes to remove the deleted records.

Ed has quality insights here, so check it out.
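A common companion pattern to Ed’s advice, sketched with placeholder table and index names: delete in batches to keep the transaction log manageable, then reorganise to compact the half-empty pages the deletes leave behind:

-- Batched delete: each batch commits separately
DECLARE @rows int = 1;
WHILE @rows > 0
BEGIN
    DELETE TOP (5000) FROM dbo.BigTable
    WHERE CreatedDate < '20180101';
    SET @rows = @@ROWCOUNT;
END;

-- Compact the fragmented pages afterwards
ALTER INDEX PK_BigTable ON dbo.BigTable REORGANIZE;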

Puzzling Through Older Problems

Kenneth Fisher shares a couple of interview puzzles:

The year is 2004. You’re taking a tech test as an interview for a SQL development job. They have a page in their application that displays up to 20 rows of information. They need a piece of code that will return the rows from a given page. Oh, and it may not always be 20 rows per page. You need to write a piece of code where they can pass in a page number and page size and get back results. So for example, if the page size is 20 and the page is 3 then you need to return back rows 41 to 60.

The answers aren’t on the page, but then again, that’s the point of a puzzle.
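No spoilers for the 2004-era solution, but for comparison, here is how trivially it falls out on SQL Server 2012 and later with OFFSET-FETCH (the table and key column are hypothetical):

DECLARE @PageNum int = 3, @PageSize int = 20;

SELECT ProductID, ProductName
FROM dbo.Products
ORDER BY ProductID
OFFSET (@PageNum - 1) * @PageSize ROWS
FETCH NEXT @PageSize ROWS ONLY;  -- page 3, size 20 => rows 41 to 60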

Testing Maximum Rows in Table Value Constructors

Solomon Rutzky dives into how big a table value constructor can be in terms of rows:

On 2019-05-08, a helpful individual, Michael B, commented on my answer saying that the 1000-row limit only existed when using a TVC as the VALUES clause of an INSERT statement. And, that there was no limit when using a TVC as a derived table.

Could this be true?

Solomon finds out. Click through and so can you.
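The two shapes being compared look like this (truncated to three rows here; the real tests use over a thousand):

-- As the VALUES clause of an INSERT: documented 1,000-row limit
INSERT dbo.Target (n)
VALUES (1), (2), (3); -- ... up to 1000 row constructors

-- As a derived table: the limit reportedly does not apply
SELECT v.n
FROM (VALUES (1), (2), (3)) AS v(n); -- ... many more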

T-SQL Tips Regarding Subqueries

Itzik Ben-Gan provides quality information on working with subqueries in SQL Server:

In this plan you see a Nested Loops (Left Semi Join) operator, with a scan of the clustered index on Customers as the outer input and a seek in the index on the customerid column in the Orders as the inner input. You also see an outer reference (correlated parameter) based on the custid column in Customers, and the seek predicate Orders.customerid = Customers.custid.

So why are you getting the plan in Figure 1 and not the one in Figure 2? If you haven’t figured it out yet, look closely at the definitions of both tables—specifically the column names—and at the column names used in the query. You will notice that the Customers table holds customer IDs in a column called custid, and that the Orders table holds customer IDs in a column called customerid. However, the code uses custid in both the outer and inner queries. 

Itzik covers three specific scenarios, all of which can cause trouble to database developers who haven’t been burned yet. And sometimes even those who have.
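The first trap in miniature, using the column names from Itzik’s example (everything else is illustrative): Orders has no custid column, so custid inside the subquery silently binds to the outer Customers.custid, the subquery is never empty for any customer, and the IN filter passes every row:

-- Customers(custid, ...), Orders(orderid, customerid, ...)
SELECT custid
FROM dbo.Customers
WHERE custid IN (SELECT custid FROM dbo.Orders);      -- bug: binds to Customers.custid

SELECT custid
FROM dbo.Customers
WHERE custid IN (SELECT customerid FROM dbo.Orders);  -- intended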

Splitting Arrays with OPENJSON

Kevin Feasel

2019-05-08

JSON, T-SQL

Dave Mason continues a journey into parsing JSON with T-SQL:

Starting with SQL Server 2016, Microsoft provided a STRING_SPLIT function. It is a table-valued function that splits a string into rows of substrings, based on a specified separator character. It’s been a welcome addition that we waited a long time for. It has one shortcoming, though: the order of the output rows is not guaranteed to match the order of the substrings in the input string.

Microsoft also provided support for parsing JSON data starting with SQL Server 2016. I discovered the OPENJSON function can be used to split strings, and it can also return the ordinal position of each substring from the original input string.

There are some limitations which you’d expect, namely around requirements for valid JSON.
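The essence of the technique: with no WITH schema, OPENJSON returns key/value/type columns, and for a JSON array the key column is each element’s zero-based ordinal. A plain delimited string has to be dressed up as a JSON array first, which is where the valid-JSON requirement bites (quotes or control characters in the data will break this naive conversion):

DECLARE @csv nvarchar(max) = N'alpha,beta,gamma';

SELECT [key] AS ordinal, [value] AS item
FROM OPENJSON(N'["' + REPLACE(@csv, N',', N'","') + N'"]');

-- ordinal  item
-- 0        alpha
-- 1        beta
-- 2        gamma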

Parsing JSON with T-SQL

Kevin Feasel

2019-05-06

JSON, T-SQL

Dave Mason has a primer on JSON parsing using T-SQL:

Microsoft added support for JSON data beginning with SQL Server 2016. JSON is an open-standard file format consisting of attribute–value pairs and array data types. It is commonly used to transmit data objects for asynchronous browser–server communication. But it is also used for storing unstructured data in files or NoSQL databases such as Microsoft Azure Cosmos DB. For most of us, SQL Server’s support for JSON probably means two things: we can convert relational data to JSON and vice versa. In this post, I’ll focus on converting JSON to relational data and share what I’ve learned from a recent experience.

I’ve been pleasantly surprised with the way JSON support works in SQL Server. It’s supported every complicated scenario I’ve had to deal with so far, including nesting, deciding with or without arrays for the outer element, quotes or no quotes around numbers, etc.
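As a quick illustration of the JSON-to-relational direction (the document shape here is my own invention), OPENJSON with an explicit WITH schema shreds a JSON array into typed columns, including nested properties via JSON paths:

DECLARE @json nvarchar(max) = N'[
  {"name": "widget", "qty": 3, "price": {"amount": 9.99}},
  {"name": "gadget", "qty": 7, "price": {"amount": 4.50}}
]';

SELECT j.name, j.qty, j.amount
FROM OPENJSON(@json)
WITH (
    name   nvarchar(50) '$.name',
    qty    int          '$.qty',
    amount decimal(9,2) '$.price.amount'  -- nested value via a JSON path
) AS j;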

Minimal Logging When Inserting into Heaps

Paul White gives us the lowdown on minimal logging when performing INSERT...SELECT operations into heap tables:

When inserting rows using INSERT...SELECT into a heap with no nonclustered indexes, the documentation universally states that such inserts will be minimally logged as long as a TABLOCK hint is present. This is reflected in the summary tables included in the Data Loading Performance Guide and the Tiger Team post. The summary rows for heap tables without indexes are the same in both documents (no changes for SQL Server 2016):

But it’s not quite that straightforward, as Paul shows. Read the whole thing.
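The shape of the load in question, for reference (table names are placeholders; the recovery model and other preconditions still matter, which is exactly what Paul digs into):

-- Heap target, no nonclustered indexes, TABLOCK hint:
-- the documented recipe for minimal logging
INSERT dbo.HeapTarget WITH (TABLOCK) (col1, col2)
SELECT col1, col2
FROM dbo.Source;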
