Press "Enter" to skip to content

Month: March 2016

Text Search

Anders Pedersen discusses one method he used to implement fast text search in SQL Server:

Looking into what was needed, I quickly realized there was a LOT of data, guess 50+ years of news broadcasts will do this.  Consider this was in the early 2000s, some innovation was needed from anything I had coded before.  Obviously LIKE searches was out of the question, full text search was not available.  So what to do?

Basically I decided to break down each broadcast to words into a separate table, the entire application fit in 2 tables: Story and Words.

This is a case in which thinking about the grain of data helps solve an otherwise-intractable problem.

Comments closed

Why Use Instant File Initialization?

Erin Stellato shows the benefits to using Instant File Initialization in terms of file modification and database restoration:

The time to zero out a file and write data is a function of sequential write performance on the drive(s) where the SQL Server data file(s) are located, when IFI is not enabled.  When IFI is enabled, creating or growing a data file is so fast that the time is not of significant consequence.  The time it takes to create or grow a value varies in seconds between 15K, SSD, flash, and magnetic storage when IFI is enabled.  However, if you do not enable IFI, there can be drastic differences in create, grow, and restore times depending on storage.

There’s a huge performance benefit with turning IFI on.

Comments closed

Updating Non-Updatable Columnstore Indexes

Niko Neugebauer shows how to load data into non-updated non-clustered columnstore indexes:

I decided to make a serious step back and write about something that is concerning the current (SQL Server 2014) and the elder version of SQL Server that supports Nonclustered Columnstore Indexes – (SQL Server 2012).
The Nonclustered Columnstore Indexes in SQL Server 2012 & 2014 are non-updatable, meaning that after they are built on the table, you cannot modify the table anymore – you can only read the data from it.
The common solutions for this problem are:
– Using Partitioning
– Disabling Columnstore, modifying the data and Rebuilding the Columnstore Index then (thus activating it)

Sounds easy, doesn’t it ?
Well, like with everything in the real life, there are a couple of quite important gotchas here.

The “non-updatable” part is why I ignored non-clustered columnstore indexes.  With SQL Server 2016, I’m going to take another look at them.  But if you’re living on 2012 or 2014 for a while, this is a good post to give you an idea of how to load those tables.

Comments closed

Compile SQL Project Files Using MSBuild

Richie Lee shows how to use MSBuild to deploy a SQL Server database project:

What is happening is that when a project is built in Visual Studio some targets that are external of the whole process that were installed as part of Visual Studio are used as part of the process. Obviously, when we run MSBuild via cmdline, we are not setting the parameter “VisualStudioVersion”, because this is pretty much a “headless” build. So the sqlproj file handles this by setting the parameter to 11.0, which is pretty horrible. Given that projects are created in Visual Studio, I would’ve felt that the version a project was created in would be baked in as the default, as opposed to just some random version number. At the very least,  a warning that a default version number has been set would come in useful, especially as this is 2016 now and the sqlproj files default to 2010 targets.

Fortunately, Richie has the answer to this issue.

Comments closed

Failed To Deploy Project In SSIS

Ginger Grant helps resolve “Failed to deploy project” errors (Error: 27203) in Integration Services:

The message Failed to deploy project isn’t very useful, but the rest of the message is. The operation_messages view lives in SSISDB, and the operation identifier number is how to determine what the error is. Run this query, using the number provided in the error message, which in this case is 173

I’d prefer it if we actually could see the error in the dialog, but at least there’s an explanation somewhere.

Comments closed

SQL Server On Linux

It’s coming:

Today I’m excited to announce our plans to bring SQL Server to Linux as well. This will enable SQL Server to deliver a consistent data platform across Windows Server and Linux, as well as on-premises and cloud. We are bringing the core relational database capabilities to preview today, and are targeting availability in mid-2017.

SQL Server on Linux will provide customers with even more flexibility in their data solution. One with mission-critical performance, industry-leading TCO, best-in-class security, and hybrid cloud innovations – like Stretch Database which lets customers access their data on-premises and in the cloud whenever they want at low cost – all built in.

I want Management Studio on Linux.  That’s the biggest thing keeping me tethered to Windows.

Comments closed

RPO And RTO Are Business Decisions

Grant Fritchey notes that backup frequency is a business decision, not just a technical decision:

RPO is a TLA for Recovery Point Objective. The easiest way to describe RPO is to ask, “In terms of time, how much data are we willing to lose?” The immediate answer is always going to be zero. Here is where we have to be honest. You won’t be able to guarantee zero data loss (yeah, there are probably ways to do this, but #entrylevel). Talk with the business. Most of the time, you’ll find that they’d actually be OK with 15 minutes, or maybe 5 minutes, or even an hour of lost data. It really varies, not only from business to business, but from database to database within the business (allow for this flexibility). You need to establish this number. RPO is going to help you figure out how to set up your backups, your recovery model, your logs and their backups. All that stuff that seemed so technical, it’s all based on this extremely important number that you’re going to work with the business to arrive at.

It’s good to have this talk with decision-makers, particularly when presented in menu form with a discussion of costs (price of disk space, particularly) and time.

Comments closed

Null Bytes In Text Strings

Jay Robinson has null bytes he wants to remove from Unicode strings:

As it turns out, when you have a character string in SQL Server that contains character 0x000, it really doesn’t know what to do with it most of the time, especially when you’re dealing with Unicode strings.

I did track down http://sqlsolace.blogspot.com/2014/07/function-dbostripunwantedcharacters.html, but I generally try to avoid calling UDF’s in my queries.

Jay’s got an answer which works, so check it out.  Also, I second the use of the #sqlhelp hashtag.  There’s a great community watching that hashtag.

Comments closed

Columnstore Index Reorganization

Sunil Agarwal has a couple of posts on columnstore index defragmentation in SQL Server 2016.

Part 1:

Let us now look at how you can use REORGANIZE command to defragment your columnstore index. Note, this command is only supported for clustered columnstore index (CCI) and nonclustered columnstore index for disk-based tables. In the example below, I create an empty table and then create a clustered columnstore index and finally I load 300k rows.  SQL Server 2016 loads data from staging table into CCI in parallel when you specify TABLOCK hint. The machine I ran this test on has 4 logical processors so the 300k rows got divided into 75k each between 4 threads. Since each thread was loading < 102400 rows, the columnstore index ends up with 4 delta rowgroups as shown below.

Part 2:

A compressed rowgroup is considered as fragmented when any of the following two conditions is met

  • Less than 1 million rows but the trim_reason ( please refer to https://msdn.microsoft.com/en-us/library/dn832030.aspx ) is other than DICTIONARY_SIZE. If the size of a compressed rowgroup is reduced because it has reached the maximum dictionary size, then it can’t be further reduced

  • It has nonzero deleted rows that exceeds a minimum threshold.

I just got finished with a first draft of a script to determine whether reorganizing a clustered columnstore index partition would be worthwhile, so this is great timing.  I hope to make my script available soon, after I incorporate Sunil’s heuristics.

Comments closed