Press "Enter" to skip to content

Month: July 2017

Dynamic Unpivoting For Change Detection

Shane O’Neill has a script that dynamically unpivots a pair of rows and compares values column by column, storing the changes in XML:

Overall, the script is longer at nearly double the lines, but where it shines is when adding new columns.
To include new columns, just add them to the table; to exclude them, just add in a filter clause.

So, potentially, if every column in this table is to be tracked and we add columns all the way up to the 1,024-column limit, this code will not increase.
Old way: at least 6,144 lines.
New way: at least 2,048 lines.
Dynamic: no change.

Read on for that script.  Even though his developer ended up not using his solution, Shane has made it available for the rest of the world so that some day, someone else can have the maintenance nightmare of trying to root out a bug in the process.
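To give a flavor of the technique, here is a minimal sketch (not Shane’s actual script; the dbo.Customers and dbo.Customers_Staging tables and the CustomerID key are hypothetical stand-ins). It generates one row constructor per column from sys.columns and compares the paired rows with CROSS APPLY, emitting the differences as XML:

-- Build one (ColumnName, OldValue, NewValue) row constructor per column,
-- driven by sys.columns so new columns are picked up automatically.
DECLARE @cols nvarchar(max), @sql nvarchar(max);

SELECT @cols = STUFF(
    (SELECT N', (''' + c.name + N''', CONVERT(nvarchar(max), o.' + QUOTENAME(c.name)
          + N'), CONVERT(nvarchar(max), n.' + QUOTENAME(c.name) + N'))'
     FROM sys.columns AS c
     WHERE c.object_id = OBJECT_ID(N'dbo.Customers')
       AND c.name <> N'CustomerID'   -- the "filter clause": exclude columns here
     FOR XML PATH(''), TYPE).value('.', 'nvarchar(max)'), 1, 2, N'');

SET @sql = N'
SELECT ca.ColumnName, ca.OldValue, ca.NewValue
FROM dbo.Customers AS o
JOIN dbo.Customers_Staging AS n ON n.CustomerID = o.CustomerID
CROSS APPLY (VALUES ' + @cols + N') AS ca(ColumnName, OldValue, NewValue)
WHERE EXISTS (SELECT ca.OldValue EXCEPT SELECT ca.NewValue)  -- NULL-safe inequality
FOR XML PATH(''change''), ROOT(''changes'');';

EXEC sys.sp_executesql @sql;

Because the VALUES list is generated at run time, the comparison code never grows as columns are added.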


System Health XE

Kenneth Fisher describes what is in the system_health Extended Events session:

Per BOL you get the following information:

  • Errors with a severity of >= 20.

  • Memory related errors (Errors 17803, 701, 802, 8645, 8651, 8657 and 8902).

  • Non-yielding scheduler problems (Error 17883).

  • Deadlocks.

  • Sessions that have waited on locks for > 30 seconds.

  • Sessions waiting for a long time on preemptive waits (waits on external API calls).

Read on to learn more of the things this session contains as well as a couple ways you can access the data.
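For example (a standard pattern, not taken from Kenneth’s post), you can read the session’s ring buffer target directly, or shred specific events such as deadlock reports out of the event file target:

-- Read the system_health session's ring buffer target as XML.
SELECT CAST(st.target_data AS xml) AS TargetData
FROM sys.dm_xe_sessions AS s
JOIN sys.dm_xe_session_targets AS st
    ON st.event_session_address = s.address
WHERE s.name = N'system_health'
  AND st.target_name = N'ring_buffer';

-- The event file target usually holds more history; pull deadlock graphs from it.
SELECT CAST(f.event_data AS xml).query('(event/data/value/deadlock)[1]') AS DeadlockGraph
FROM sys.fn_xe_file_target_read_file(N'system_health*.xel', NULL, NULL, NULL) AS f
WHERE f.object_name = N'xml_deadlock_report';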


When You Need To Read Memory-Optimized Data From Disk

Ned Otter enumerates the scenarios in which SQL Server needs to read data from disk for memory-optimized tables:

Those who have studied In-Memory OLTP are aware that in the event of “database restart”, durable memory-optimized data must be streamed from disk to memory. But that’s not the only time data must be streamed, and the complete set of events that cause this is not intuitive. To be clear, if your database had to stream data back to memory, that means all your memory-optimized data was cleared from memory. The amount of time it takes to do this depends on:

  • the amount of data that must be streamed

  • the number of indexes that must be rebuilt

  • the number of containers in the memory-optimized database, and how many volumes they’re spread across

  • how many indexes must be recreated (SQL 2017 has a much faster index rebuild process, see below)

  • the number of LOB columns

  • BUCKET count being properly configured for HASH indexes

Read on for the list of scenarios that might cause a standalone SQL Server instance to need to stream data from disk into memory to re-hydrate memory-optimized tables and indexes.
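On the BUCKET_COUNT point, here is a minimal sketch (the table and row estimates are hypothetical) of declaring a hash index with an explicit bucket count and then sanity-checking it; a common rule of thumb is one to two times the expected number of distinct key values:

-- Requires a database with a memory-optimized filegroup.
CREATE TABLE dbo.SessionState
(
    SessionID int NOT NULL,
    Payload   varbinary(8000) NULL,
    CONSTRAINT PK_SessionState PRIMARY KEY NONCLUSTERED HASH (SessionID)
        WITH (BUCKET_COUNT = 2000000)   -- roughly 2x an expected 1M distinct keys
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);

-- After loading data, check for under- or over-sized buckets.
SELECT total_bucket_count, empty_bucket_count, avg_chain_length, max_chain_length
FROM sys.dm_db_xtp_hash_index_stats
WHERE object_id = OBJECT_ID(N'dbo.SessionState');

Long average chain lengths mean the bucket count is too low; a sea of empty buckets means it is too high.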


TempDB System Table Contention

Alexander Arvidsson diagnoses an interesting problem:

I ran this several times to see if there was a pattern to the madness, and it turned out there was. All waits were concentrated in database ID 2 – TEMPDB. Many people perk up by now and jump to the conclusion that this is your garden-variety SGAM/PFS contention – easily remedied with more TEMPDB files and a trace flag. But alas, this was further inside TEMPDB. The output from the query above gave me the exact page number, and plugging that into DBCC PAGE gave me the metadata object ID.

His conclusion is to reduce temp table usage and/or use memory-optimized tables.  We solved this problem by replacing temp tables with memory-optimized TVPs in our most frequently-used procedures.
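For reference, a memory-optimized TVP starts from a memory-optimized table type, something like this sketch (hypothetical names; the database needs a memory-optimized filegroup first):

CREATE TYPE dbo.OrderIDList AS TABLE
(
    OrderID int NOT NULL,
    INDEX ix_OrderID NONCLUSTERED (OrderID)   -- at least one index is required
)
WITH (MEMORY_OPTIMIZED = ON);
GO

-- Use it where you would otherwise create a #temp table:
DECLARE @Orders dbo.OrderIDList;
INSERT INTO @Orders (OrderID) VALUES (1), (2), (3);
SELECT o.OrderID FROM @Orders AS o;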


Avoid Ticks

Michael J. Swart shows you how to convert DATETIME2 values to Ticks:

A .NET tick is a duration of time lasting 0.1 microseconds. When you look at the Ticks property of DateTime, you’ll see that it represents the number of ticks since January 1st, 0001.
But why 0.1 microseconds? According to Stack Overflow user CodesInChaos, “ticks are simply the smallest power-of-ten that doesn’t cause an Int64 to overflow when representing the year 9999”.

Even though it’s an interesting idea, just use one of the datetime data types; that’s what they’re there for. I avoid ticks whenever I can.

I agree with Michael:  avoid using Ticks if you can.
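If you do need ticks from T-SQL, a sketch of the conversion (not necessarily Michael’s exact code; it assumes SQL Server 2016+’s DATEDIFF_BIG) looks like this. Counting nanoseconds from year 0001 would overflow a bigint, so count microseconds and multiply by 10:

-- One .NET tick = 100 nanoseconds = 0.1 microseconds, counted from 0001-01-01.
DECLARE @dt datetime2(7) = SYSUTCDATETIME();

SELECT DATEDIFF_BIG(MICROSECOND, CONVERT(datetime2, '0001-01-01'), @dt) * 10 AS Ticks;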


DAX’s ALL() Function

Matt Allington explains what the ALL() function is in DAX and when you might want to use it:

The ALL() function seems very simple on the surface; however, it has layers of complexity.  In its most simple usage, it is a function that simply returns a table (virtual or materialised).  The syntax for ALL() is as follows:

=ALL(TableOrColumn, [Column2], ..., [ColumnN])

ALL() will always return a table, not a value.  Because it is a table, you cannot put the result directly into a cell in a Pivot Table or a Matrix.  Think about it: you can’t put a table with (potentially) multiple columns and (potentially) multiple rows into a single cell in a visual – it won’t “fit”.

There’s a lot to ALL() and Matt does a great job explaining it.
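A classic use is removing filters to compute a percentage of the grand total; the Sales and Products tables below are illustrative, not from Matt’s post:

Total Sales = SUM ( Sales[Amount] )

% of All Products =
DIVIDE (
    [Total Sales],
    CALCULATE ( [Total Sales], ALL ( Products ) )
)

Because ALL ( Products ) returns the whole Products table with filters removed, the denominator stays fixed while the numerator respects the current filter context.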


Don’t Fear The Tidyverse

David Robinson explains why he prefers to explain the tidyverse version of R first rather than base R:

I’d summarize the two “competing” curricula as follows:

  • Base R first: teach syntax such as $ and [[]], loops and conditionals, data types (numeric, character, data frame, matrix), and built-in functions like ave and tapply. Possibly follow up by introducing dplyr or data.table as alternatives.
  • Tidyverse first: Start from scratch with the dplyr package for manipulating a data frame, and introduce others like ggplot2, tidyr and purrr shortly afterwards. Introduce the %>% operator from magrittr immediately, but skip syntax like [[]] and $ or leave them for late in the course. Keep a single-minded focus on data frames.

I’ve come to strongly prefer the “tidyverse first” educational approach. This isn’t a trivial decision, and this post is my attempt to summarize my opinions and arguments for this position. Overall, they mirror my opinions about ggplot2: packages like dplyr and tidyr are not “advanced”; they’re suitable as a first introduction to R.

I think this is the better position of the two, particularly for people who already have some experience with languages like SQL.
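To make the contrast concrete, here is the same group-wise aggregate written both ways (my example, using the built-in mtcars data set):

library(dplyr)

# Base R first: $ subsetting plus tapply
tapply(mtcars$mpg, mtcars$cyl, mean)

# Tidyverse first: a pipeline over a data frame
mtcars %>%
  group_by(cyl) %>%
  summarise(mean_mpg = mean(mpg))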


Using Temporal Tables For SCD2

I have a post on pain that I experienced with temporal tables:

This query succeeds but returns results we don’t really want:

[Image: ProductModelTemporalSameDate query results]

This brings back all 9 records tied to products 1 and 2 (because product 3 didn’t exist on July 2nd at 8 AM UTC). But it gives us the same start and end date, so that’s not right. What I really want to do is replace @InterestingTime with qsp’s DatePredictionMade, so let’s try that:

[Image: ProductModelTemporalInvalid query]

This returns a syntax error. It would appear that at the time FOR SYSTEM_TIME is resolved, QuantitySoldPrediction does not yet exist. This stops us dead in our tracks.

This is one of the two things I’d really like to change about temporal tables; the other thing (now that auto-retention is slated for release) is the ability to backfill data without turning off system versioning.
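To make the failure concrete, here is a sketch reconstructed from the post (the ProductModelID and Name columns are my guesses at the schema). FOR SYSTEM_TIME AS OF is happy with a literal or a variable, but not with a column from another table in the query:

-- Works: AS OF with a variable.
DECLARE @InterestingTime datetime2(7) = '2017-07-02 08:00:00';

SELECT pm.ProductModelID, pm.Name
FROM dbo.ProductModel FOR SYSTEM_TIME AS OF @InterestingTime AS pm;

-- Fails with a syntax error: AS OF cannot reference qsp's column.
SELECT qsp.ProductModelID, qsp.DatePredictionMade
FROM dbo.QuantitySoldPrediction AS qsp
    INNER JOIN dbo.ProductModel
        FOR SYSTEM_TIME AS OF qsp.DatePredictionMade AS pm
        ON pm.ProductModelID = qsp.ProductModelID;

One workaround is to join against the history explicitly and bound DatePredictionMade between the table’s period columns yourself.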


Checking For Instant File Initialization

Klaas Vandenberghe shows how to use PowerShell to determine whether Instant File Initialization is turned on:

Sometimes we want to apply a filter to an array or other collection of objects, but keep both the items that pass the filter and those that fail it. Instead of cycling twice through the collection, there’s a one-step method.

Instant File Initialization is a privilege assigned in the local security policy. Here’s some explanation by the MSSQL Tiger Team.
There’s a lot to tell about it, but I’m not going to do that here. Let’s just assume it’s a good thing to assign that privilege to the account under which the SQL Server service runs.

Klaas explains how to use PowerShell filtering with Where-Object and the Where method for people new to PowerShell, and then uses this to figure out if IFI is enabled.
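The one-step method is most likely the .Where() method’s 'Split' mode (available since PowerShell 4), which returns the passing and failing items as two collections in a single pass:

# Split a collection into items that pass the predicate and items that fail it.
$numbers = 1..10
$passed, $failed = $numbers.Where({ $_ % 2 -eq 0 }, 'Split')

$passed   # 2 4 6 8 10
$failed   # 1 3 5 7 9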


Microsoft JDBC Driver 6.2

Andrea Lam announces a new version of the JDBC Driver for SQL Server:

Performance improvements for Prepared Statements
Improved performance for Prepared Statements through caching (including prepared statement handle re-use). This behavior can be tuned using new properties to fit your application’s needs.

Azure Active Directory (AAD) support for Linux
Connect your Linux applications to Azure SQL Database using AAD authentication via username/password and access token methods.

Federal Information Processing Standard (FIPS) enabled Java virtual machines
The JDBC Driver can now be used on Java virtual machines (JVMs) that run in FIPS 140 compliance mode to meet federal standards and compliance.

Click through for more information, including a couple of interesting features like additional timeouts you can set.
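As a quick illustration of the new caching knobs (a sketch; the server, database, and credentials are placeholders):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class Jdbc62Demo {
    public static void main(String[] args) throws Exception {
        // statementPoolingCacheSize turns on the prepared statement handle cache;
        // disableStatementPooling must be false for the cache to take effect.
        String url = "jdbc:sqlserver://myserver:1433;databaseName=mydb;"
                   + "user=myuser;password=secret;"
                   + "disableStatementPooling=false;statementPoolingCacheSize=10;";

        try (Connection conn = DriverManager.getConnection(url);
             PreparedStatement ps = conn.prepareStatement(
                     "SELECT name FROM sys.tables WHERE name = ?")) {
            ps.setString(1, "MyTable");
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    System.out.println(rs.getString(1));
                }
            }
        }
    }
}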
