But if we hover over that index seek, we can see in the tooltip that there’s a hidden predicate that is NOT a seek predicate. This is a hidden filter. And because this is SQL Server 2016, we can see “Number of Rows Read”: it had to read 9.3 million rows to count 1.9 million rows. It didn’t realize the 2006 rows were together; it checked all the females and examined the FakeBirthDateStamp column for each row.
Built-in functions can cause SARGability issues.
The SWITCH statement can instantly ‘move’ data from one table to another table. It does this by updating some metadata to say that the new table is now the owner of the data instead of the old table. This is very useful, as there is no physical data movement to cause the stresses mentioned earlier. There are a lot of rules enforced by SQL Server before it will allow this to work. Essentially, each table must have the same columns with the same data types and NULL settings, they need to be in the same filegroup, and the new table must be empty. See here for a more detailed look at these rules.
If you can take downtime, this is pretty easy. Otherwise, keeping the two tables in sync until the switchover occurs is a key problem to keep in mind.
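As a rough illustration of the three rules named in the excerpt, here is a minimal Python sketch of a pre-flight check. The `Column` and `Table` classes and the `switch_problems` function are hypothetical stand-ins for catalog metadata, not a SQL Server API, and SQL Server enforces more rules than the ones modeled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    # Hypothetical description of one column's definition.
    name: str
    data_type: str
    nullable: bool


@dataclass
class Table:
    # Hypothetical summary of the metadata the excerpt mentions.
    columns: tuple
    filegroup: str
    row_count: int


def switch_problems(source, target):
    """Return the excerpt's rules that this source/target pair violates."""
    problems = []
    if source.columns != target.columns:
        problems.append("columns, data types, or NULL settings differ")
    if source.filegroup != target.filegroup:
        problems.append("tables are in different filegroups")
    if target.row_count != 0:
        problems.append("target table is not empty")
    return problems


cols = (Column("Id", "int", False), Column("Name", "varchar(50)", True))
old_table = Table(cols, "PRIMARY", 1_000_000)
empty_copy = Table(cols, "PRIMARY", 0)
bad_copy = Table(cols, "ARCHIVE", 10)

ok_result = switch_problems(old_table, empty_copy)
bad_result = switch_problems(old_table, bad_copy)
```

An empty list means none of the modeled rules is violated; it does not guarantee that SQL Server will accept the switch.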
One thing I recently learned is that you can’t add files to [model]. I was hoping to impact the number of files a database has when it is created. It’s common for databases to be created with just the standard mdf/ldf pair of files. Then the database is loaded with varying amounts of data. After a certain threshold, it makes more sense to have multiple data files, based on the environment. Sure, I could add files later and “balance” the data across the files. But it would be nice to *not* have to do this; some downtime may be required, and LOB data presents its own challenges.
There aren’t too many restrictions on the model database, but this is a biggie.
If you’ve ever looked at sys.sysprocesses or sys.dm_exec_requests (or a number of other DMVs), you’ve noticed there is a column called “sql_handle” that contains some binary gobbledygook. Books Online gives the (un)helpful definition as “Hash map of the SQL text of the request.”
“Binary gobbledygook” is probably the best description of the plan handle available.
I was testing some calculations in my tabular model when I noticed that my Lost Customers calculation wasn’t working as expected. This was rather interesting to me since the calculation I was using was from DAXPatterns.com. After some experimentation, I determined that the difference between my data situation and the situation that fit the DAX Pattern was that my customer dimension was a Type 2 Slowly Changing Dimension. That meant I couldn’t use the customer key to identify a unique customer, since each customer could have multiple customer keys (surrogate keys) with the same customer number (business key). Specifically, if a customer made a purchase in December 2015, then changed their name, then made a purchase in January and February, my calculation was counting them as lost because it was not recognizing that the customer with the new name and same customer number was actually the same customer.
She ends up with two solutions, each with different trade-offs. Knowing as little DAX as I do, looking at two different ways of solving this problem is great because it gives you more insight into language semantics.
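To make the surrogate-key problem concrete, here is a small Python sketch (not DAX, and not either of the author’s two solutions). The dimension rows, sales rows, and the `lost_customers` helper are all made up for illustration; the scenario mirrors the December-then-name-change example from the excerpt.

```python
# Hypothetical Type 2 dimension: surrogate key -> business key (customer number).
customer_dim = {
    101: "C-1",  # original row for customer C-1
    102: "C-1",  # new row created after the name change
    201: "C-2",
}

# Hypothetical fact rows: (surrogate customer key, purchase month).
sales = [
    (101, "2015-12"),
    (102, "2016-01"),
    (102, "2016-02"),
    (201, "2015-12"),
]


def lost_customers(sales, prior_month, current_months, key=lambda k: k):
    """Customers who bought in prior_month but not in any of current_months.

    `key` maps a surrogate key to whatever identifies a customer.
    """
    prior = {key(k) for k, month in sales if month == prior_month}
    current = {key(k) for k, month in sales if month in current_months}
    return prior - current


current = {"2016-01", "2016-02"}

# Identifying customers by surrogate key wrongly reports customer 101 as lost.
by_surrogate = lost_customers(sales, "2015-12", current)

# Mapping to the business key first recognizes C-1 as the same customer.
by_business = lost_customers(sales, "2015-12", current, key=customer_dim.get)
```

Counting by surrogate key reports two lost customers, while counting by business key reports only C-2, which is the behavior the excerpt is after.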
So obviously we brought back Internet Sales for all years captured in the cube. So now we need to understand the WHERE clause and how to use it properly in MDX.
The “WHERE” clause in MDX is used to define another slicer and limit the attributes from the defined slicer. The slicer used in the “WHERE” clause cannot be the same slicer used when defining your rows, which is why I use the date dimension. The MDX query below depicts the WHERE clause and syntax:
I liked Dan’s introductory-level presentation on learning MDX, and this blog series is following along those same lines.
A fundamental component of SQL Server is locking and locks. Locks within SQL Server are critical to the proper functioning of the database and the integrity of the data within the database. The presence of locks does not inherently mean there is a problem. In no way should locking within SQL Server be considered a monster, though locks may oftentimes be misconstrued in that light.
This is an introductory-level discussion, so it doesn’t include optimistic concurrency or snapshot/RCSI, but if you’re unfamiliar with pessimistic concurrency, this is a good place to start.
Did you know that you can change the password on the SQL Service account that is running your SQL instance without a reboot or restart? Turns out this is true. We have a new round of password requirements and it means that we need to change passwords on servers more often. But, since we need our servers up and reboots have to be heavily planned, we needed a solution that kept us from having to restart an instance after a password change. This lovely MSDN article explains all the details, but let me give you the CliffsNotes.
This is helpful for those max uptime scenarios where even a momentary service restart requires planned downtime.
You’ll notice that these results are wildly different from those above. What we’re looking at is largely a server versus a database, but not completely. I mean that sys.dm_os_wait_stats is showing the waits for the instance on which my primary Azure SQL Database is currently running. Most of those waits are mine, but because it’s part of the management structure of Azure, sys.dm_os_wait_stats shows some information that’s not applicable, directly, to me. The “server” is not really that. It’s a logical container holding your database. There’s a lot more to it under the covers. To get the waits that are absolutely applicable to me and my databases, I have to go to sys.dm_db_wait_stats.
Azure SQL Database is going to behave a bit differently from on-premises SQL Server, so if you’ve got an Azure SQL Database, pay attention to those differences.
Do you need to be concerned that a delta rowgroup is scanned single-threaded? The answer is NO for two reasons: (a) most columnstore indexes have very few delta rowgroups, and (b) if you have multiple delta rowgroups, they can be scanned in parallel with one thread per delta rowgroup.
I have a beef with (a), at least for SQL Server 2014, but that’s a story for another day.
Recall that on rowstore tables (i.e. tables organized as rows, not as columnstore), SQL Server requires you to specify TABLOCK for parallel bulk import to get minimal logging and locking optimizations. One key difference for tables with a clustered columnstore index is that you don’t need TABLOCK to get locking/logging optimizations for bulk import. The reason for this difference in behavior is that each bulk import thread can load data exclusively into a columnstore rowgroup. If the batch size is less than 102,400 rows, the data is imported into a delta rowgroup; otherwise a new compressed rowgroup is created and the data is loaded into it. Let us take the following two interesting cases to show this bulk import behavior. Assume you are importing 4 data files, each with one bulk import thread, concurrently into a table with a clustered columnstore index.
The “you don’t need TABLOCK” point is interesting in comparison to rowstore tables.
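The batch-size rule in the excerpt can be sketched in a few lines of Python. This only models the threshold decision as described above; the `target_rowgroup` function and the four row counts are made-up illustrations, not the cases from the original post, and real loads involve more details than this.

```python
# Threshold from the excerpt: batches below this many rows land in a delta rowgroup.
DELTA_THRESHOLD = 102_400


def target_rowgroup(batch_rows):
    """Return which kind of rowgroup a bulk import batch is loaded into."""
    return "delta" if batch_rows < DELTA_THRESHOLD else "compressed"


# Hypothetical row counts for four data files, one bulk import thread each.
file_rows = [50_000, 102_399, 102_400, 500_000]
targets = [target_rowgroup(n) for n in file_rows]
```

Because each thread loads into its own rowgroup, the four imports in this sketch would not contend for the same target.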