The SQL Server query optimizer can find interesting ways to tackle seemingly simple operations that can be hard to optimize. Consider the following query on a table with two indexes, one on (a), the other on (b):

SELECT a, b FROM #data WHERE a <= 10 OR b <= 10000;
The basic problem is that we would really want to use both indexes in a single query.
We get to see a few different versions of the query as well as the execution plans which result.
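For reference, one classic manual rewrite splits the OR into two index-friendly predicates and combines the results. This is a sketch of the general technique only; the post itself walks through several variants and compares their plans:

-- Each branch can seek its own index; UNION (not UNION ALL)
-- removes the rows that satisfy both predicates, matching the
-- semantics of the original OR.
SELECT a, b FROM #data WHERE a <= 10
UNION
SELECT a, b FROM #data WHERE b <= 10000;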
I got asked a question about the OUTPUT clause recently and realized I didn’t remember the syntax. I’ve rarely used this, so I had to look it up and thought this would be a good basic post.
The idea with OUTPUT is that the data from the inserted and deleted tables can be output from the INSERT statement, outside of your triggers. It’s the same data the triggers see, but you can access it directly in the insert itself.
The format is
INSERT xxx OUTPUT yyyy INTO @zzz VALUES (or SELECT) mmmm
If I had one thing I could change about OUTPUT, I’d like to be able to output directly into variables for those cases in which I know I’m only going to get one result (or maybe I only care about one arbitrary result in a set).
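To make that format concrete, here’s a minimal sketch (the table, columns, and @NewRows variable are all hypothetical):

DECLARE @NewRows TABLE (Id INT, Name VARCHAR(50));

-- OUTPUT captures the rows from the inserted pseudo-table
-- into the table variable as part of the INSERT itself.
INSERT dbo.Customers (Name)
OUTPUT inserted.Id, inserted.Name INTO @NewRows (Id, Name)
VALUES ('Contoso');

SELECT Id, Name FROM @NewRows;

As the last paragraph notes, you cannot OUTPUT directly into scalar variables; the table variable plus a follow-up SELECT is the usual workaround, even when you know only one row is coming back.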
That diagram shows you roughly the last 260 minutes of server CPU usage, measured in percent.
As you can see, my SQL Server is mostly doing nothing; only during the writing of this blog post, in the last half hour or so, has it been heavily running test scripts to over-utilize my CPU, and even then it barely goes above 60% CPU (blue line).
The red line represents all processes other than SQL Server, so you can tell whether anything from outside SQL Server is impacting your performance.
Combined with Glenn Berry’s diagnostic queries, you could generate some quick analytics. I’d still use R for anything more than slightly complicated, but this is great for those environments in which you don’t have good alternative tooling.
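The source behind the diagram isn’t shown, but SQL Server keeps roughly this data (about one sample per minute for the last ~256 minutes) in the scheduler monitor ring buffer. A widely used query for it, assumed here and not necessarily the author’s, looks like this:

DECLARE @ts_now BIGINT =
    (SELECT cpu_ticks / (cpu_ticks / ms_ticks) FROM sys.dm_os_sys_info);

SELECT TOP (256)
       DATEADD(ms, -1 * (@ts_now - [timestamp]), GETDATE()) AS EventTime,
       SQLProcessUtilization AS SqlCpuPct,                           -- the blue line
       100 - SystemIdle - SQLProcessUtilization AS OtherProcessCpuPct -- the red line
FROM (
    SELECT record.value('(./Record/@id)[1]', 'int') AS record_id,
           record.value('(./Record/SchedulerMonitorEvent/SystemHealth/SystemIdle)[1]', 'int') AS SystemIdle,
           record.value('(./Record/SchedulerMonitorEvent/SystemHealth/ProcessUtilization)[1]', 'int') AS SQLProcessUtilization,
           [timestamp]
    FROM (
        SELECT [timestamp], CONVERT(XML, record) AS record
        FROM sys.dm_os_ring_buffers
        WHERE ring_buffer_type = N'RING_BUFFER_SCHEDULER_MONITOR'
          AND record LIKE N'%<SystemHealth>%'
    ) AS rb
) AS rec
ORDER BY record_id DESC;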
So why am I such a big fan of Service Broker and if it’s so great, why isn’t everybody using it? Let me start by telling you why I’m such a fan.
Asynchronous – The biggest benefit of Service Broker, or probably any messaging technology, is that it decouples long-running processing from the client application. A great example of what I mean by this is Amazon.com. When you place an order at Amazon, a whole series of backend processes are initiated. Your payment is processed, inventory is verified and updated, fulfillment and shipping are initiated, and so on, all of which ultimately ends with a box arriving on your doorstep. But the website doesn’t wait for all of that stuff to finish before it thanks you for your order. Your order information is added to a queue and the page returns. All that other stuff is handled asynchronously. And that’s what Service Broker lets you do in your database.
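To give a flavor of what that looks like in T-SQL, here’s a minimal sketch of enqueueing a message and returning immediately (object names are hypothetical; activation procedures, the RECEIVE side, error handling, and routing are all omitted):

CREATE MESSAGE TYPE [//Demo/OrderPlaced] VALIDATION = WELL_FORMED_XML;
CREATE CONTRACT [//Demo/OrderContract] ([//Demo/OrderPlaced] SENT BY INITIATOR);
CREATE QUEUE dbo.OrderInitiatorQueue;
CREATE QUEUE dbo.OrderTargetQueue;
CREATE SERVICE [//Demo/OrderInitiatorService] ON QUEUE dbo.OrderInitiatorQueue;
CREATE SERVICE [//Demo/OrderTargetService] ON QUEUE dbo.OrderTargetQueue ([//Demo/OrderContract]);
GO

-- The "place order" call just enqueues a message and returns;
-- a reader drains dbo.OrderTargetQueue asynchronously with RECEIVE.
DECLARE @handle UNIQUEIDENTIFIER;
BEGIN DIALOG CONVERSATION @handle
    FROM SERVICE [//Demo/OrderInitiatorService]
    TO SERVICE '//Demo/OrderTargetService'
    ON CONTRACT [//Demo/OrderContract]
    WITH ENCRYPTION = OFF;

SEND ON CONVERSATION @handle
    MESSAGE TYPE [//Demo/OrderPlaced] (N'<Order Id="42" />');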
I think the biggest impediment to adoption of Service Broker is that there was never a friendly UI. The same applied to Extended Events in 2008. Both involve a non-trivial amount of setup and maintenance, and the tooling just hasn’t been there for Service Broker. I know they’re still making (minor) improvements to the product, but if they wanted a big improvement, putting a friendly UI tie-in with Management Studio would go a long way.
My first encounter with full text indexes and degraded performance was related to an enhancement I made to an aspx page years ago. I wanted all of the search fields to use an AutoComplete AJAX extender to mimic the behavior you see when you type a few letters into the search field on Google.com or Bing.com. A traditional non-clustered index wasn’t sufficient for the “Location Address” field, so I settled on a full text index; it worked very well.
After some amount of time (I don’t remember how long), performance slowed considerably. I was surprised to find the full text index for “Location Address” had a large number of fragments. I wish I had kept some notes on my findings. I can’t remember how many fragments there were, but I’m thinking it was in the 15-20 range. If memory serves me, Orange Co., FL has about 400,000 physical location addresses. The underlying table had one row per location address. Knowing me, the indexed column was probably VARCHAR(100) or VARCHAR(128). This doesn’t seem like a huge amount of data, so I was surprised the full text searches were slow, even with 15-20 fragments. Reorganizing the related full text catalog made a world of difference. Performance improved drastically.
All indexes need maintenance. Dave has a script to help with full-text indexes.
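For reference, the reorganize itself is a one-line catalog operation (the catalog name here is hypothetical; Dave’s script adds the logic to find the catalogs that actually need it):

-- Merges the full text index fragments in the catalog into larger fragments.
ALTER FULLTEXT CATALOG LocationCatalog REORGANIZE;

-- Fragment counts can be checked per table in sys.fulltext_index_fragments:
SELECT table_id, COUNT(*) AS fragment_count
FROM sys.fulltext_index_fragments
GROUP BY table_id;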
Jos de Bruijn shares a couple scenarios in which In-Memory OLTP can improve performance—using memory-optimized table types and replacing certain types of temp tables with schema-only memory-optimized tables:
Tempdb can be a performance bottleneck for many applications. Workloads that intensively use table-valued parameters (TVPs), table variables and temp tables can cause contention on things like metadata and page allocation, and result in a lot of IO activity that you would rather avoid.
What if TVPs and temp tables could live just in memory, in the memory space of the user database? In-Memory OLTP can help! Memory-optimized table types and SCHEMA_ONLY memory-optimized tables can be used to replace traditional table types and traditional temp tables, bypassing tempdb completely, and providing additional performance improvements through memory-optimized data structures and data access methods.
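As a concrete sketch of both replacements (names are hypothetical, and the database needs a MEMORY_OPTIMIZED_DATA filegroup first):

-- A memory-optimized table type to replace a traditional table type for TVPs:
CREATE TYPE dbo.OrderIdList AS TABLE
(
    OrderId INT NOT NULL PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 1024)
)
WITH (MEMORY_OPTIMIZED = ON);
GO

-- A SCHEMA_ONLY memory-optimized table standing in for a temp table:
-- the schema is durable, but the data never touches tempdb or the log.
CREATE TABLE dbo.OrderStaging
(
    OrderId INT NOT NULL PRIMARY KEY NONCLUSTERED,
    Amount  MONEY NOT NULL
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_ONLY);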
I’ve used both of these techniques to good effect, but the harsh limitations in 2014 prevented me from doing as much with them as I wanted.
We are currently working on testing and publishing SQL Server Container Images that could speed up the process of getting started with SQL Server in Windows Containers significantly. Stay tuned for an update!
Windows getting into the Docker world is interesting.
Question: If the log is stamped with 0xC0’s instead of 0x00’s how is it a performance gain?
Many of the new hardware implementations detect patterns of 0x00s. The space is acquired and zeroes are written to stable media; then a background, hardware-based garbage collector reclaims the blocks.
This is a very interesting background article which shows an integration pain point between the database platform and the storage platform.
I’ve mapped suburbs to County because that was the lowest level I found in the data category list for geographic information. (Place and Address cannot be used for Filled Map at the time of writing this post.) And I got nothing! Not even a small area on the map. I then tried removing the district and using a suburb, region, country format with County as the data category, which didn’t help either.
I’ve found that I can map some locations based on Postal Code, as you see below. However, Postal Code is not always a good distinguishing field for a region, as multiple regions might share a postal code.
Filled maps have the potential to be powerful tools, but they aren’t perfect. Check out Reza’s post for the full scoop.
What follows is an overview of my experiments, which I have published to a GitHub repo. The “Examples” folder contains what I would term “simple learnings,” and “Full Scripts” holds scripts that, to a lesser or greater extent, do something “useful.” I’m also not suggesting that anything here is best practice, or that method A performs better than method B; I simply do not have the required size of data to make that call. My aim was to learn the language.
TLDR: Check out the script MovieLens09-CosineSimilarityFromCSVWithMax.usql for a U-SQL movie recommender.
U-SQL was introduced last year, but word of mouth about the language has been quite limited to date. I’ll be interested in seeing what other examples pop up over the next few months.