Okay, this is getting out of hand. The query shouldn’t have to be this complicated.
Luckily I work with a guy named Chris. He’s amazing at what he does. He questions everything without being a nitpicker (there’s a difference). He read through the Mythbusters post and followed all the links in the comments. He asked whether gbn’s JFDI pattern wasn’t better here. So I implemented it just to see what that looked like:
I’ve ended up doing the same thing in a similar scenario. But as Aaron Bertrand notes in the comments, test your results because performance could end up being even worse than before.
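For readers who have not met the JFDI ("just do it") idea, here is a minimal sketch of its general shape: attempt the INSERT, and only fall back to an UPDATE if the key turns out to exist. This is not Chris's or gbn's code. It assumes a hypothetical `kv` table and uses SQLite through Python for portability, so it says nothing about how the pattern behaves or performs on SQL Server.

```python
import sqlite3


def upsert_jfdi(conn, key, value):
    """Try the INSERT first; fall back to UPDATE only on a key violation."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute("INSERT INTO kv (k, v) VALUES (?, ?)", (key, value))
        return "inserted"
    except sqlite3.IntegrityError:
        # The key already exists, so update the existing row instead.
        with conn:
            conn.execute("UPDATE kv SET v = ? WHERE k = ?", (value, key))
        return "updated"


# Hypothetical table, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kv (k INTEGER PRIMARY KEY, v TEXT)")

first = upsert_jfdi(conn, 1, "a")
second = upsert_jfdi(conn, 1, "b")
final_value = conn.execute("SELECT v FROM kv WHERE k = 1").fetchone()[0]
print(first, second, final_value)
```

The appeal is that there is no separate existence check ahead of the write. The cost is an exception on every collision, which is one reason to measure before adopting it.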
Enabling TDE does not protect your BACPAC files, just your database backups. If you are relying on TDE to protect your data at rest then allowing users to create BACPAC files will put you at risk. But no more risk than any other user choosing to run a SELECT statement and save the data somewhere (or perhaps just use PowerBI to open a connection and import to Excel).
TDE has a single, specific purpose. If you want something more stringent, SQL Server 2016 Always Encrypted might be an option.
Clearing out a full transaction log is a common problem. A quick search will find you dozens of forum entries and blog posts. Because of that I’m not going to talk about the correct methods of dealing with a transaction log full error. What I want to discuss is why you shouldn’t use the following method.
And Kenneth also hits the one legitimate use: dire emergency. If this is a normal part of some process (e.g., warehouse loading), bite the bullet and either live in Simple recovery mode (understanding the risks) or get the disk space to keep it in Full mode. Switching back and forth—especially if you aren’t taking full backups immediately after switching back—is a good way to get yourself burned.
The example in this post will be the well-known Hello World example in the context of SSAS, and I trust this will illustrate the possibilities of this technique well enough for you to apply your own solution to your own challenges.
If you’re at all familiar with CLR in the database engine, this looks to be the Analysis Services equivalent. Hopefully it doesn’t have the same “We can’t possibly use this!” taboo that CLR seems to have in the database engine world.
The app’s plan was cached the day before. But wait a second! My assumption was that it had recompiled this morning due to the updated stats.
5. Your log backups run every 30 minutes. I have yet to find a company with log backups running every 30 minutes who was actually OK with losing 30+ minutes of data. Maybe you are part of the company where it’s actually true, but if you’re not 100% sure, get someone to sign off on it. With an ink pen. Really.
Funnily enough, I’ve experienced exactly this, except the business side was flabbergasted that I wanted to take transaction log backups so quickly—they had a 24-hour RPO, so why bother with such frequent backups? I kept a straight face and explained that if I had my druthers, I’d take a transaction log backup every 1-3 minutes.
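The arithmetic behind "30+ minutes" is worth spelling out. The sketch below is a simplification under stated assumptions: it treats worst-case loss as one full backup interval plus the time the backup itself takes to finish, and both function names are hypothetical. Real exposure also depends on things like where the backup files are copied and how long that takes.

```python
def worst_case_loss_minutes(backup_interval_min, backup_duration_min=0.0):
    """Rough upper bound on lost work if the server dies just before
    the next log backup completes (simplified model, see lead-in)."""
    return backup_interval_min + backup_duration_min


def meets_rpo(backup_interval_min, rpo_min, backup_duration_min=0.0):
    """True when the simplified worst case stays within the agreed RPO."""
    return worst_case_loss_minutes(backup_interval_min, backup_duration_min) <= rpo_min


# A 30-minute schedule with an assumed 1-minute backup duration.
every_30 = worst_case_loss_minutes(30, 1)

# The same schedule against a 15-minute RPO, and a 3-minute schedule
# against the 24-hour RPO mentioned above.
tight_rpo_ok = meets_rpo(30, 15, 1)
loose_rpo_ok = meets_rpo(3, 24 * 60, 1)
print(every_30, tight_rpo_ok, loose_rpo_ok)
```

A 24-hour RPO is trivially met by frequent log backups; the point of taking them every few minutes is that the actual loss in a failure is far smaller than the number the business signed off on.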
The conclusion I’d take here is that CROSS APPLY ought to be a tool you keep in the front of your toolbox and use when you must execute a function for each row of a set of tables. This is one of the T-SQL techniques that I never learned early in my career (it wasn’t available), and I haven’t used much outside of looking for execution plans, but it’s a join capability I will certainly look to use in the future.
I’m one of the biggest fans of the APPLY operator out there—my favorite talk is based on it, even. But in this case, I’m going to say that writing “CROSS APPLY” really didn’t do anything here—times are similar enough that I’d be suspicious that the database engine is doing the same thing both times.
In a similar vein to last week’s blog post… I heard an interesting comment recently. “Change that Column != 2 to a Column > 2 or Column < 2 combination, it can use indexes better.”
Sounds like something that clearly needs testing!
Not shockingly, this did nothing to make the query run faster or use fewer resources. There are ways to rewrite queries to improve performance while maintaining the same result structure (a common example being rewriting a query that uses a cursor or WHILE loop to perform one set-based operation), but Gail’s point is vital: test your changes and make sure that if you’re saying it will perform better, it actually performs better.
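Before timing a rewrite like this, it helps to confirm that the two predicates really return the same rows, NULLs included. The sketch below checks only that logical equivalence. It assumes a hypothetical table `t` and runs on SQLite through Python, so it shows nothing about SQL Server's index usage or execution plans; that part still has to be tested on the real system.

```python
import sqlite3

# Hypothetical table with a NULL and duplicate values, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, col INTEGER)")
conn.executemany(
    "INSERT INTO t (col) VALUES (?)",
    [(1,), (2,), (3,), (None,), (2,), (5,)],
)

# Original form of the predicate.
not_equal = conn.execute(
    "SELECT id FROM t WHERE col != 2 ORDER BY id"
).fetchall()

# Suggested "index-friendly" rewrite.
range_pair = conn.execute(
    "SELECT id FROM t WHERE col > 2 OR col < 2 ORDER BY id"
).fetchall()

# Both forms exclude the rows where col = 2 and the row where col IS NULL.
print(not_equal == range_pair, not_equal)
```

Since the result sets match, any claim that one form is better has to rest on measured duration, reads, or plan shape.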
That is an enormous amount of data. What if you needed to sort that? What if you joined this to another table or view and a spool was required? What if it was a hash join and a memory grant was required? The demand that this seemingly innocuous statement placed on your server could be overwhelming.
The memory grant could create system variability that is very difficult to find. There is a thread on MSDN that I started which exposes what prompted this post. (The plan that was causing much of the problem is at this link.)
It’s important to keep in mind the good enough “big round figures” that SQL Server uses for row estimation when stats are unavailable (e.g., linked server to Hive or a CLR function like in the post). These estimates aren’t always correct, and there are edge cases like the one in the post in which the estimates will be radically wrong and begin to affect your server.
These are all things that may have been necessary under the old estimator, but are likely just tying the optimizer’s hands under the new one. This is a query that could have, and should have, been tested in their dev / staging / QA environments under the new cardinality estimator long before they flipped the switch in production, and probably could have gone through a series of tests where different combinations of those hints and options could have been removed. This is something for which that team can only blame themselves.
Also check out Aaron Morelli’s comment on the post.