Press "Enter" to skip to content

Curated SQL Posts

Switching To Identities

James Anderson shows how to use a table switch to move data from a table without an identity column into one with an identity column:

The SWITCH statement can instantly ‘move’ data from one table to another table. It does this by updating some metadata to say that the new table is now the owner of the data instead of the old table. This is very useful, as there is no physical data movement to cause the stresses mentioned earlier. There are a lot of rules enforced by SQL Server before it will allow this to work. Essentially, each table must have the same columns with the same data types and NULL settings, they need to be in the same filegroup, and the new table must be empty. See here for a more detailed look at these rules.

If you can take downtime, this is pretty easy.  Otherwise, making sure the two tables stay in sync until the switchover occurs is the key problem to keep in mind.
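As a rough sketch of what's involved (table and column names here are hypothetical, not from James's post), the switch itself is a single ALTER TABLE statement, with a reseed afterward so the identity picks up where the data leaves off:

    -- Source has no identity; target does. Columns, types, NULL settings,
    -- and filegroup must match, and the target must be empty.
    CREATE TABLE dbo.Orders_Staging (OrderID int NOT NULL, Amount money NULL);
    CREATE TABLE dbo.Orders (OrderID int IDENTITY(1,1) NOT NULL, Amount money NULL);

    -- Metadata-only "move" of all rows into the identity table.
    ALTER TABLE dbo.Orders_Staging SWITCH TO dbo.Orders;

    -- Reseed so new inserts continue past the highest switched-in value.
    DBCC CHECKIDENT ('dbo.Orders', RESEED);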


System Database Restrictions

Dave Mason ran into a system database restriction regarding the number of files:

One thing I recently learned is that you can’t add files to [model]. I was hoping to impact the number of files a database has when it is created. It’s common for DBs to be created with just the standard mdf/ldf pair of files. Then the db is loaded with varying amounts of data. After a certain threshold, it makes more sense to have multiple data files, based on the environment. Sure, I could add files later and “balance” the data across the files. But it would be nice to *not* have to do this; some downtime may be required, and LOB data presents its own challenges.

There aren’t too many restrictions on the model database, but this is a biggie.
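Since [model] won't carry the extra files for you, the fallback is adding them after creation, which is exactly the "balance later" step Dave would rather avoid. A minimal sketch, with hypothetical database, file, and path names:

    ALTER DATABASE SalesDB
    ADD FILE
    (
        NAME = SalesDB_data2,
        FILENAME = 'D:\SQLData\SalesDB_data2.ndf',
        SIZE = 1GB,
        FILEGROWTH = 512MB
    );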


Plan Handles

Andy Mallon talks about plan handles (sql_handle):

If you’ve ever looked at sys.sysprocesses or sys.dm_exec_requests (or a number of other DMVs), you’ve noticed there is a column called “sql_handle” that contains some binary gobbledygook. Books Online gives the (un)helpful definition as “Hash map of the SQL text of the request.”

“Binary gobbledygook” is probably the best description of the plan handle available.
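The standard way to turn that gobbledygook into something readable is to pass the handle to sys.dm_exec_sql_text; a quick sketch:

    SELECT r.session_id,
           t.text AS sql_text
    FROM sys.dm_exec_requests AS r
    CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t;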


Finding Lost Customers

Meagan Longoria has lost customers and is trying to find them:

I was testing some calculations in my tabular model when I noticed that my Lost Customers calculation wasn’t working as expected. This was rather interesting to me, since the calculation I was using was from DAXPatterns.com. After some experimentation, I determined that the difference between my data situation and the situation that fit the DAX Pattern was that my customer dimension was a Type 2 Slowly Changing Dimension. That meant I couldn’t use the customer key to identify a unique customer, since each customer could have multiple customer keys (surrogate keys) with the same customer number (business key). Specifically, if a customer made a purchase in December 2015, then changed their name, then made a purchase in January and February, my calculation was counting them as lost because it was not recognizing that the customer with the new name and same customer number was actually the same customer.

She ends up with two solutions, each with different trade-offs.  Knowing as little DAX as I do, I find that looking at two different ways of solving this problem is great because it gives you more insight into the language’s semantics.
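If you want to see the underlying problem in relational terms, a hypothetical Type 2 customer dimension (table and column names assumed here, not taken from Meagan's model) shows how surrogate keys overcount customers:

    SELECT COUNT(DISTINCT CustomerKey)    AS SurrogateKeys,  -- one per row version
           COUNT(DISTINCT CustomerNumber) AS RealCustomers   -- one per actual customer
    FROM dbo.DimCustomer;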


MDX For Beginners

Dan King continues his series on MDX:

So obviously we brought back Internet Sales for all years captured in the cube. So now we need to understand the WHERE clause and how to use it properly in MDX.

The “WHERE” clause in MDX is used to define another slicer and limit the attributes from the defined slicer. The slicer used in the “WHERE” clause cannot be the same slicer used when defining your rows, which is why I use the date dimension. The MDX query in the full post depicts the WHERE clause and syntax.

I liked Dan’s introductory-level presentation on learning MDX, and this blog series is following along those same lines.
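For those coming from T-SQL, a rough relational analogy helps (hypothetical table and columns, and imperfect, since an MDX slicer sets evaluation context rather than filtering rows): the slicer acts much like a WHERE clause on a dimension that isn't already on rows or columns.

    SELECT ProductCategory,
           SUM(SalesAmount) AS InternetSales
    FROM dbo.FactInternetSales
    WHERE CalendarYear = 2013   -- plays the role of the MDX slicer
    GROUP BY ProductCategory;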


Locking Basics

Jason Brimhall has an introductory-level discussion of locking in SQL Server:

A fundamental component of SQL Server is locking and locks. Locks within SQL Server are critical to the proper functioning of the database and the integrity of the data within the database. The presence of locks does not inherently mean there is a problem. In no way should locking within SQL Server be considered a monster, though locks may oftentimes be misconstrued in that light.

This is an introductory-level discussion, so it doesn’t include optimistic concurrency or snapshot/RCSI, but if you’re unfamiliar with pessimistic concurrency, this is a good place to start.
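If you want to watch locks being taken while you read along, sys.dm_tran_locks is the usual starting point; a minimal sketch:

    SELECT request_session_id,
           resource_type,      -- OBJECT, PAGE, KEY, etc.
           request_mode,       -- S, X, IS, IX, ...
           request_status      -- GRANT, WAIT, or CONVERT
    FROM sys.dm_tran_locks
    WHERE request_session_id > 50;  -- skip most system sessions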


Service Account Password Changes Without Rebooting

Andrea Allred shows that you can change the password of a service account without restarting an instance:

Did you know that you can change the password on the SQL Service account that is running your SQL instance without a reboot or restart? Turns out this is true. We have a new round of password requirements, and it means that we need to change passwords on servers more often. But, since we need our servers up and reboots have to be heavily planned, we needed a solution that kept us from having to restart an instance after a password change. This lovely MSDN article explains all the details, but let me give you the Cliffs Notes.

This is helpful for those max uptime scenarios where even a momentary service restart requires planned downtime.
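To confirm which account each service runs under before and after the change, sys.dm_server_services is handy:

    SELECT servicename, service_account, status_desc
    FROM sys.dm_server_services;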


Azure SQL Database Wait Stats

Grant Fritchey discusses Azure SQL Database wait stats:

You’ll notice that these results are wildly different from those above. What we’re looking at is largely a server versus a database, but not completely. I mean that sys.dm_os_wait_stats is showing the waits for the instance on which my primary Azure SQL Database is currently running. Most of those waits are mine, but because it’s part of the management structure of Azure, sys.dm_os_wait_stats shows some information that’s not applicable, directly, to me. The “server” is not really that. It’s a logical container holding your database. There’s a lot more to it under the covers. To get the waits that are absolutely applicable to me and my databases, I have to go to sys.dm_db_wait_stats.

Azure SQL Database is going to behave a bit differently from on-premises SQL Server, so if you’ve got an Azure SQL Database, pay attention to those differences.
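A quick way to see the contrast Grant describes is to run both queries side by side on an Azure SQL Database:

    -- Instance-level view: includes waits from Azure's own plumbing.
    SELECT TOP (5) wait_type, wait_time_ms
    FROM sys.dm_os_wait_stats
    ORDER BY wait_time_ms DESC;

    -- Database-scoped view: only waits applicable to your database.
    SELECT TOP (5) wait_type, wait_time_ms
    FROM sys.dm_db_wait_stats
    ORDER BY wait_time_ms DESC;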


Columnstore Parallelism

Sunil Agarwal shows how clustered columnstore indexes take advantage of parallelism:

Do you need to be concerned that a delta rowgroup is scanned single-threaded? The answer is NO, for two reasons: (a) most columnstore indexes have very few delta rowgroups; (b) if you have multiple delta rowgroups, they can be scanned in parallel, with one thread per delta rowgroup.

I have a beef with (a), at least for SQL Server 2014, but that’s a story for another day.

Sunil has a follow-up post on parallel bulk import:

Recall that on rowstore tables (i.e., tables organized as rows, not as columnstore), SQL Server requires you to specify TABLOCK for parallel bulk import to get minimal logging and locking optimizations. One key difference for tables with a clustered columnstore index is that you don’t need TABLOCK to get locking/logging optimizations for bulk import. The reason for this difference in behavior is that each bulk import thread can load data exclusively into a columnstore rowgroup. If the batch size is < 102400, the data is imported into a delta rowgroup; otherwise, a new compressed rowgroup is created and the data is loaded into it. Let us take the following two interesting cases to show this bulk import behavior. Assume you are importing 4 data files, each with one bulk import thread, concurrently into a table with a clustered columnstore index.

The “no TABLOCK needed” behavior is an interesting difference from rowstore tables.
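A sketch of one of those concurrent import threads (file and table names here are hypothetical). The batch size is what decides delta store versus compressed rowgroup, per Sunil's 102400-row threshold:

    -- No TABLOCK hint needed on a clustered columnstore target.
    BULK INSERT dbo.FactSales
    FROM 'D:\Data\sales_part1.dat'
    WITH (BATCHSIZE = 1048576);  -- >= 102400, so rows land in a compressed rowgroup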


NOCOUNT Performance Differences

Aaron Bertrand looks into whether SET NOCOUNT ON provides a performance improvement:

For years, I had been operating under the impression that SET NOCOUNT ON; was a critical part of any performance strategy. This was based on observations I had made in, arguably, a different era, and that are less likely to manifest today.

Check out the comments as well. This is an interesting conundrum, as there’s a lot of ingrained knowledge that SET NOCOUNT ON is faster (and I admit that I thought I remembered it being the case when going through loops), but people have had limited success in coming up with a scenario in which it makes an appreciable difference.
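For reference, the setting in question is a one-liner at the top of a procedure; a minimal sketch with a hypothetical table, suppressing the per-statement "N rows affected" messages inside the loop:

    CREATE PROCEDURE dbo.LoopDemo
    AS
    BEGIN
        SET NOCOUNT ON;  -- suppress DONE_IN_PROC / rows-affected messages

        DECLARE @i int = 0;
        WHILE @i < 1000
        BEGIN
            UPDATE dbo.SomeTable SET Touched = 1 WHERE ID = @i;
            SET @i += 1;
        END;
    END;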
