Today I’m excited to announce our plans to bring SQL Server to Linux as well. This will enable SQL Server to deliver a consistent data platform across Windows Server and Linux, as well as on-premises and cloud. We are bringing the core relational database capabilities to preview today, and are targeting availability in mid-2017.
SQL Server on Linux will provide customers with even more flexibility in their data solution. One with mission-critical performance, industry-leading TCO, best-in-class security, and hybrid cloud innovations – like Stretch Database which lets customers access their data on-premises and in the cloud whenever they want at low cost – all built in.
I want Management Studio on Linux. That’s the biggest thing keeping me tethered to Windows.
RPO is a TLA for Recovery Point Objective. The easiest way to describe RPO is to ask, “In terms of time, how much data are we willing to lose?” The immediate answer is always going to be zero. Here is where we have to be honest. You won’t be able to guarantee zero data loss (yeah, there are probably ways to do this, but #entrylevel). Talk with the business. Most of the time, you’ll find that they’d actually be OK with 15 minutes, or maybe 5 minutes, or even an hour of lost data. It really varies, not only from business to business, but from database to database within the business (allow for this flexibility). You need to establish this number. RPO is going to help you figure out how to set up your backups, your recovery model, your logs and their backups. All that stuff that seemed so technical, it’s all based on this extremely important number that you’re going to work with the business to arrive at.
It’s good to have this talk with decision-makers, particularly when presented in menu form with a discussion of costs (price of disk space, particularly) and time.
As it turns out, when you have a character string in SQL Server that contains character 0x000, it really doesn’t know what to do with it most of the time, especially when you’re dealing with Unicode strings.
I did track down http://sqlsolace.blogspot.com/2014/07/function-dbostripunwantedcharacters.html, but I generally try to avoid calling UDF’s in my queries.
Jay’s got an answer which works, so check it out. Also, I second the use of the #sqlhelp hashtag. There’s a great community watching that hashtag.
Sunil Agarwal has a couple of posts on columnstore index defragmentation in SQL Server 2016.
Let us now look at how you can use REORGANIZE command to defragment your columnstore index. Note, this command is only supported for clustered columnstore index (CCI) and nonclustered columnstore index for disk-based tables. In the example below, I create an empty table and then create a clustered columnstore index and finally I load 300k rows. SQL Server 2016 loads data from staging table into CCI in parallel when you specify TABLOCK hint. The machine I ran this test on has 4 logical processors so the 300k rows got divided into 75k each between 4 threads. Since each thread was loading < 102400 rows, the columnstore index ends up with 4 delta rowgroups as shown below.
A compressed rowgroup is considered as fragmented when any of the following two conditions is met
Less than 1 million rows but the trim_reason ( please refer to https://msdn.microsoft.com/en-us/library/dn832030.aspx ) is other than DICTIONARY_SIZE. If the size of a compressed rowgroup is reduced because it has reached the maximum dictionary size, then it can’t be further reduced
It has nonzero deleted rows that exceeds a minimum threshold.
I just got finished with a first draft of a script to determine whether reorganizing a clustered columnstore index partition would be worthwhile, so this is great timing. I hope to make my script available soon, after I incorporate Sunil’s heuristics.
Testing SQL Server 2016 RC0 today, I see that the bug for index_usage_stats has been fixed in this release! When I generate scans, seeks, and updates against an index, running ALTER INDEX REBUILD no longer resets the information. I can still tell which indexes have been used and which have not since the database came online, just like we had in SQL Server 2008 R2 and prior.
Kendra has created a Connect item to fix a separate bug where rebuilding an index clears out the missing index recommendations for that table.
On my flight home, I spent some time putting some of the things from my presentation into a one-page cheat sheet. I’ll have these printed up and have them on-hand for next time…but if you want it now, you candownload the PDF and print one for yourself.
My recommendation is to attend Andy’s talk when Mike Hillwig and Brent Ozar are trolling him from the back.
If there’s an index on the column you’re grouping, then changing the collation is going to hurt a bit. Grouping could take advantage of a Stream Aggregate under our indexed collation, but changing the column is like throwing it away the index order (ORDER BY doesn’t get handled well by changing the collation) means a Hash is required. But comparing two query plans that both use Hash Match (Aggregate), one on a case-insensitive collation and one on a binary collation, then I found the latter was slightly faster. Not as drastic a change as searching, but still 10-30% better. One would run in about 12 seconds, and one in about 10.
Be sure to check out his comments for more details.