Press "Enter" to skip to content

Curated SQL Posts

SQL Server ODBC Driver 13.1 For Linux

Meet Bhagdev announces a new version of the SQL Server ODBC driver for Linux:

Added

  • BCP API support

    • You can use functions through the ODBC driver as described here on Linux.
  • Support for user-defined KeyStoreProvider for Always Encrypted

    • You can now user-defined/created AE Column Master Key keystore providers. Check out code samples and more information here.
  • Ubuntu 16.10 support

    • Developed a package Ubuntu 16.10 for an apt-get experience.
  • Dependency on the platform unixODBC Driver Manager instead of the custom unixODBC-utf16 Driver Manager

    • This avoids conflicts with applications/software that depends on the platform unixODBC Driver Manager.

No groundbreaking additions, but there are a couple nice fixes in the update.

Comments closed

Full-Text Search

Kendra Little gives the scoop on full-text indexing:

The “dirty little secret” about full-text search indexes is that they don’t help with ‘%blabla%’ predicates.

Well, it’s not a secret, it’s right there in the documentation.

A lot of us get the impression that full-text search is designed to handle “full wildcard” searches, probably just because of the name. “Full-Text Searches” sounds like it means “All The Searches”. But that’s not actually what it means.

Kendra’s take is a bit more optimistic than mine; I’m definitely more inclined to dump text out to a Lucene-based indexing system (like Solr or ElasticSearch), as they’ll typically perform faster and solve problems that full-text cannot.  Some of that may just be that I was never very good at full-text indexing, though.

Comments closed

When To Define Clustered Index Columns On Non-Clustered Indexes

Kim Tripp explains when to include a clustered index column on a non-clustered index column’s definition:

So, when SHOULD you explictly define the clustering key columns in a nonclustered index? When they ARE needed by the query.

This sounds rather simple but if the column is USED by the query then the index MUST have the column explicitly defined. Yes, I realize that SQL Server will add it… so it’s not necessary NOW but what if things change? (this is the main point!)

One of the more common cases I could think of is multi-part clustered indexes, like on a junction table.

Comments closed

Modifying SSMS Shortcuts

Slava Murygin shows how to update shortcuts in Management Studio:

A lot of people in the Internet complain about their version of SSMS “forgot” some hot-key combinations. The oldest complain I remember was about the most useful combination “Ctrl-R”.

The reason why SSMS “forgets” is within code sharing and reusability with other Microsoft development products.
If you have that problem, most probably I have (or had in the past) installed something else from Microsoft, such as Development Studio, Data Tools etc.

It can get annoying when another tool clobbers your expected shortcuts.

Comments closed

Azure Disk Encryption

Melissa Coates configures Azure Disk Encryption for an already-existing Azure VM:

As I discussed in my previous blog post, I opted to use Azure Disk Encryption for my virtual machines in Azure, rather than Storage Service Encryption. Azure Disk Encryption utilizes Bitlocker inside of the VM. Enabling Azure Disk Encryption involves these Azure services:

  • Azure Active Directory for a service principal
  • Azure Key Vault for a KEK (key encryption key) which wraps around the BEK (bitlocker encryption key)
  • Azure Virtual Machine (IaaS)

Following are 4 scripts which configures encryption for an existing VM. I initially had it all as one single script, but I purposely separated them. Now that they are modular, if you already have a Service Principal and/or a Key Vault, you can skip those steps. I have my ‘real’ version of these scripts stored in an ARM Visual Studio project (same logic, just with actual names for the Azure services). These PowerShell templates go along with other ARM templates to serve as source control for our Azure infrastructure.

The Powershell scripts are straightforward and clear, so check them out.

Comments closed

Dplyr Tutorial

Deepanshu Bhalla has a nice dplyr tutorial:

What is dplyr?
dplyr is a powerful R-package to manipulate, clean and summarize unstructured data. In short, it makes data exploration and data manipulation easy and fast in R.
 
What’s special about dplyr?

The package “dplyr” comprises many functions that perform mostly used data manipulation operations such as applying filter, selecting specific columns, sorting data, adding or deleting columns and aggregating data. Another most important advantage of this package is that it’s very easy to learn and use dplyr functions. Also easy to recall these functions. For example, filter() is used to filter rows.

dplyr is a core package when it comes to data cleansing in R, and the more of it you can internalize, the faster you’ll move in that language.

Comments closed

Apache Zeppelin 0.7.0

Vinay Shulka announces Apache Zeppelin 0.7.0:

SPARK IMPROVEMENTS

This release also adds support for Spark 2 including version Spark 2.1. Zeppelin now also links to Spark History Server UI from Zeppelin so users can more easily track Spark jobs. The Livy interpreter now supports specifying packages with the job.

SECURITY IMPROVEMENTS

The major security improvement in Zeppelin 0.7.0 is using Apache Knox’s LDAP Realm to connect to LDAP. Zeppelin home page now lists only the nodes to which the user is authorized to access. Zeppelin now also has the ability to support PAM based authentication.

The full list of improvements is available here

This visualization platform is growing up nicely.

Comments closed

Leading Wildcard Seek Triggers

Aaron Bertrand demonstrates the triggers you could use if you wanted to build leading wildcard seek tables:

In my last post, “One way to get an index seek for a leading wildcard,” I mentioned that you would need triggers to deal with maintaining the fragments I recommended. A couple of people have contacted me to ask if I could demonstrate those triggers.

To simplify from the previous post, let’s assume we have the following tables – a set of Companies, and then a CompanyNameFragments table that allows for pseudo-wildcard searching against any substring of the company name

Read on for the triggers.

Comments closed

IoT Versus Event Hub

James Serra clarifies the differences between Azure’s IoT Hub and its Event Hub:

The majority of the time, if the data is coming directly from the devices, either directly or via a field-based gateway, IoT Hub will be the more appropriate choice.  Event Hub will generally be the more appropriate choice if either the data will not be coming to Azure directly from the devices, but rather either cloud-to-cloud through another provider, intra-cloud, or if the data is already landing on-premise and needs to be streamed to the cloud from a small number of endpoints internally.  There are exceptions to both conditions, of course.

Both solutions offer very high throughput data ingestion and can handle tremendous streaming data volumes.  In fact, today, IoT Hub is primarily a set of additional services that wrap an underlying Event Hub.

Read on for more scenarios and limitations in each.  They definitely serve different use cases.

Comments closed

Understanding Deadlock Priority

Kenneth Fisher explains deadlock priority:

Everyone deals with deadlocks from time to time. But sometimes we need to control who’s the deadlock victim and who isn’t. For example, I’m doing a big delete on a table in a 24×7 environment, I can’t afford downtime to do it so I’m doing my delete in small chunks to reduce transaction size and blocking time. My delete needs to happen but I’m in no hurry and I really can’t afford to deadlock some other transaction. So how do I make sure?

Or on the other hand, I’m running an update that absolutely has to happen right now. It’s going to take a bit and I can’t afford the time for it to be started over. A deadlock would be a disaster. What do I do?

That’s where deadlock priority comes into play.

Click through for the explanation.

Comments closed