Curated SQL Posts

Maintaining Full-Text Indexes

Dave Mason talks about full-text index maintenance:

My first encounter with full text indexes and degraded performance was related to an enhancement I made to an aspx page years ago. I wanted all of the search fields to use an AutoComplete AJAX extender to mimic the behavior you see when you type a few letters into the search field on Google.com or Bing.com. A traditional non-clustered index wasn’t sufficient for the “Location Address” field, so I settled on a full text index–it worked very well.

After some amount of time (I don’t remember how long), performance slowed considerably. I was surprised to find the full text index for “Location Address” had a large number of fragments. I wish I had kept some notes on my findings. I can’t remember how many fragments there were, but I’m thinking it was in the 15-20 range. If memory serves me, Orange Co., FL has about 400,000 physical location addresses. The underlying table had one row per location address. Knowing me, the indexed column was probably VARCHAR(100) or VARCHAR(128). This doesn’t seem like a huge amount of data, so I was surprised the full text searches were slow, even with 15-20 fragments. Reorganizing the related full text catalog made a world of difference. Performance improved drastically.

All indexes need maintenance.  Dave has a script to help with full-text indexes.

Memory-Optimized Temp Objects

Jos de Bruijn shares a couple of scenarios in which In-Memory OLTP can improve performance, namely using memory-optimized table types and replacing certain types of temp tables with schema-only memory-optimized tables:

Tempdb can be a performance bottleneck for many applications. Workloads that intensively use table-valued parameters (TVPs), table variables and temp tables can cause contention on things like metadata and page allocation, and result in a lot of IO activity that you would rather avoid.

What if TVPs and temp tables could live just in memory, in the memory space of the user database? In-Memory OLTP can help! Memory-optimized table types and SCHEMA_ONLY memory-optimized tables can be used to replace traditional table types and traditional temp tables, bypassing tempdb completely, and providing additional performance improvements through memory-optimized data structures and data access methods.

I’ve used both of these techniques to good effect, but the harsh limitations in 2014 prevented me from doing as much with them as I wanted.

LDF Stamp Change

The CSS SQL Server engineering team points out that the LDF stamp has changed from 0x00 to 0xC0:

Question:  If the log is stamped with 0xC0’s instead of 0x00’s how is it a performance gain?

Many of the new hardware implementations detect patterns of 0x00’s. The space is acquired and zeros written to stable media, then a background, hardware-based garbage collector reclaims the blocks.

This is a very interesting background article which shows an integration pain point between the database platform and the storage platform.

Filled Maps In Power BI

Reza Rad digs into filled maps in Power BI:

I’ve mapped suburbs to County because that was the lowest level I’ve found in data category for geographic information (Place and Address cannot be used for Filled Map at the time of writing this post), and I got nothing! Not even a small area on the map. I then tried removing the district and using a suburb, region, country format with County as the data category, which didn’t help either.

I’ve found that I can map some locations based on Postal Code, as you see below. However, Postal Code is not always a good distinguishing field for a region, as multiple regions might share a postal code.

Filled maps have the potential to be powerful tools, but they aren’t perfect.  Check out Reza’s post for the full scoop.

U-SQL Movie Recommender

Dave Ballantyne introduces us to U-SQL via a movie recommender:

What follows is an overview of my experiments that I have published into a GitHub repo. The “Examples” folder holds what I would term “simple learnings” and “Full Scripts” are scripts that to a lesser or greater extent do something “useful”. I’m also not suggesting that anything here is “best practice” or that method A performs better than method B; I simply do not have the required size of data to make that call. My aim was to learn the language.

TLDR: Check out the script MovieLens09-CosineSimilarityFromCSVWithMax.usql for a U-SQL movie recommender.

U-SQL was introduced last year, but word of mouth about the language has been quite limited to date.  I’ll be interested in seeing what other examples pop up over the next few months.
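
For readers unfamiliar with the technique named in that script title, here is a minimal Python sketch of cosine similarity between movies. It is not Dave's U-SQL code. The function names, the `ratings` structure, and the sample data are all made up for illustration, and the sketch assumes each movie is represented as a mapping from user ID to rating.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity between two sparse rating vectors (user -> rating)."""
    shared_users = set(a) & set(b)
    dot = sum(a[u] * b[u] for u in shared_users)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an unrated movie is similar to nothing
    return dot / (norm_a * norm_b)


def most_similar(target, ratings, top_n=3):
    """Rank every other movie by its similarity to `target`, highest first."""
    scores = [
        (other, cosine_similarity(ratings[target], ratings[other]))
        for other in ratings
        if other != target
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_n]


# Hypothetical sample data: movie -> {user: rating}
ratings = {
    "Movie A": {"u1": 5, "u2": 4, "u3": 1},
    "Movie B": {"u1": 4, "u2": 5},
    "Movie C": {"u3": 5, "u4": 4},
}

recommendations = most_similar("Movie A", ratings)
```

With this sample data, "Movie B" ranks ahead of "Movie C" for "Movie A" because the same users rated both A and B highly.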

Local Aggregation

Niko Neugebauer investigates a new line in the Columnstore Index Scan execution plan tooltip, Actual Number of Locally Aggregated Rows:

There is a new line in the properties of the iterator, showing the number of locally aggregated rows and that number equals 619255, that should be exactly the number of rows that is missing from the arrow connecting 2 iterators:

Gives us our perfect 12627608 rows.
Eureka!
Is there any more information on this operation?
Indeed, just right-click on the Columnstore Index Scan and select its properties:

This is tied to some columnstore performance improvements in SQL Server 2016.

Reorganize Columnstore Indexes

I have a new script available to reorganize columnstore indexes:

Note that this script requires SQL Server 2016 (or later) because the database engine team made some great changes to columnstore indexes, allowing us to use REORGANIZE to clear out deleted rows and compact row groups together, as well as its previous job of marking open delta stores as available for compression.

The code is available as a Gist for now, at least until I decide what to do with it.  Comments are welcome, especially if I’m missing a major reorganize condition.

As mentioned, comments are welcome.

Generating Fixed-Width Files With Power Query

Chris Webb shows how to generate fixed-width files using Power Query inside Excel:

While it’s fairly common to need to load fixed-width files using Power Query (and there’s a nice walkthrough of how to do this here), occasionally you might want to use Power Query and Excel to create a fixed-width output for another system, or maybe to create some test data. You might not want to do it often but I can imagine that when/if Power Query is integrated into SSIS this will be a slightly less obscure requirement; at the very least, this post should show you how to use a couple of M functions that are under-documented.

I don’t see this being a particularly common request, but I guess I can see some scenario in which we’re loading data into a legacy system.
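
As a quick illustration of what fixed-width output looks like, here is a minimal Python sketch. It is not the M approach from Chris's post. The function name `to_fixed_width`, the column widths, and the sample rows are hypothetical, and the sketch assumes left-aligned, space-padded columns with overlong values truncated.

```python
def to_fixed_width(rows, widths):
    """Render rows as fixed-width text: each value is truncated to its
    column width, then right-padded with spaces to exactly that width."""
    lines = []
    for row in rows:
        cells = [
            str(value)[:width].ljust(width)
            for value, width in zip(row, widths)
        ]
        lines.append("".join(cells))
    return "\n".join(lines)


# Hypothetical sample data: (name, quantity) with widths of 5 and 4 characters
sample_rows = [("Ann", 42), ("Alexander", 7)]
output = to_fixed_width(sample_rows, [5, 4])
```

Every line of `output` is exactly nine characters long, which is what a legacy fixed-width loader would typically expect. A real target system might instead require right-aligned numbers or zero padding.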
