Press "Enter" to skip to content

Author: Kevin Feasel

Implementing SoundEx

Dror Helper shows how to implement SoundEx in C#:

It’s fairly easy to follow the steps of the algorithm (as defined by Wikipedia):

  1. Retain the first letter of the name and drop all other occurrences of a, e, I, o, u, y, h, w.

  2. Replace consonants with digits as follows (after the first letter):

    • b, f, p, v → 1

    • c, g, j, k, q, s, x, z → 2

    • d, t → 3

    • l → 4

    • m, n → 5

    • r → 6

  3. If two or more letters with the same number are adjacent in the original name (before step 1), only retain the first letter; also two letters with the same number separated by ‘h’ or ‘w’ are coded as a single number, whereas such letters separated by a vowel are coded twice. This rule also applies to the first letter.

  4. If you have too few letters in your word that you can’t assign three numbers, append with zeros until there are three numbers. If you have more than 3 letters, just retain the first 3 numbers.

SQL Server also supports SOUNDEX as a built-in function.

Comments closed

Waiting For SP1

Guy Glantser hates “wait for SP1” advice:

Historical Facts

Throughout my career I have never seen an RTM version that was substantially less stable then the following SP1. Sure, there were bugs and issues. Sometimes there were critical bugs and issues. But there were just as much bugs and issues in SP1 and in SP2, and so on. I haven’t conducted a thorough research, so I don’t have a statistical proof, but these are the facts, at least from my experience.

I’d add one more thing:  pre-release versions of SQL Server run in production as part of Microsoft TAP (older link, and I think RDP and TAP have merged together at this point, but I don’t have those inside details).  These are real companies with real workloads running pre-RTM versions of SQL Server.  I work for a company which is in the program, and we were running our data warehouse on CTP 3 and then RCs.  By the time RTM hits the shelves, there’s already been a good deal of burn-in.

Comments closed

Choose Your Edition

Grant Fritchey explains the differences between different editions of SQL Server:

SQL Server Enterprise Edition is the high end. Here is where you need to go to multi-terrabytes in size and you have massive transaction loads. You’re looking at very sophisticated availability and disaster recovery. Again, the name gives it away. You’re generally only going to this edition when you’re working at an enterprise level of scale and architecture. Since you’re just getting started, don’t worry about this.

My version of the story is, “Buy Enterprise Edition.  Don’t cheap out because you’ll regret it later.”  Grant’s version is much more thorough.

Comments closed

Using Diagnostics Trace With Power BI

Chris Webb uses Diagnostics.Trace to track process runtime in Power BI:

To sum up, the workflow for tuning your query is:

  • Make some changes to the LongQuery query that hopefully make it faster

  • Update the Trace Message parameter with some notes about which version of the LongQuery query it is that you’ll be testing

  • Click the Refresh Preview button for the Diagnostics query to test how long LongQuery now runs for

  • Refresh, or load, the query that reads the data from the trace logs so you can see how all of your changes have affected query execution times

I give it two months before the Power BI team releases a change to make this easier…

Comments closed

SQL Server Backup Buffers

SQLsasquatch (whose name I now know but I like the handle so much) has a few questions about backup buffers:

I wonder:
-If ‘max server memory’ wasn’t being overridden by a more reasonable target (because max server memory is 120 GB on a 32GB VM), would the behavior still be the same before reaching target?  I bet it would be.
-Is this behavior specific to not having reached target?  Or when reaching target would backup buffers be allocated, potentially triggering a shrink of the bpool, then returned to the OS afterward requiring the bpool to grow again?
-What other allocations are returned directly to the OS rather than given back to the memory manager to add to free memory?  I bet CLR does this, too.  Really large query plans?  Nah, I bet they go back into the memory manager’s free memory.
-Does this make a big deal?  I bet it could.  Especially if a system is prone to develop persistent foreign memory among its NUMA nodes.  In most cases, it probably wouldn’t matter.

Good questions, to which I have zero answers.

Comments closed

Launch Event Thoughts

Patrick Keisler sums up his thoughts on the Raleigh-Durham SQL Server 2016 Launch Discovery Day:

Your solution must be completed in 3 hours.

On paper this all sounds pretty easy, but in practice it was quite hard. I am no BI developer and the other members of my team did not have any expertise in that area either, but we still managed to create a solution and have fun doing so.

The first issue was had was how to combine our development work on the same database. This one was easy…just use Azure. In the span of about 30 minutes, I spun up a new Azure VM with SQL Server 2016 pre-installed, uploaded the database, setup logins, and opened the appropriate ports. I then gave my team members the URL and credentials so they each could connect from their laptops.

These are good thoughts, and I completely agree with the point that better data definition would have made for a better event.  Each one of the teams had to spend a lot of time cleaning up data, and I think that limited the teams’ ability to do really cool things.  I’d love to put something like this on again, but if that happens, I’m going to make sure we start with a good data model so people can do fun things on top of that rather than spend all their time scrubbing data (unless that’s the point of the exercise).

Comments closed

Row-Level Security With Reporting Services

Paul Turley discusses combining row-level security, SQL Server Reporting Services, and SQL Server Analysis Services:

In every data source connection string, you can add a simple expression that maps the current Windows username to the CUSTOMDATA property of the data source provider.  This works in SSRS embedded data sources, shared data sources, in a SharePoint Office Data Connecter (ODC) file and in a SharePoint BISM connection file.  In each case, the syntax should be the similar.  Here is my shared data source on the SSRS 2016 report server

This is pretty snazzy.  Paul goes into good detail on the topic, so read the whole thing.

Comments closed

Share Your Knowledge

Ginger Grant wants you to share what you know:

Recently I have been working with some new features of SQL Server 2016 and have had questions which blogs, TechNet and Stack Overflowprovided no answers on the internet. Fortunately, I have found people to help me resolve the answers. If you go searching for the same errors I had, you will find answers now, as I have posted them. If you have had a problem unique to the latest release of SQL Server, I hope you will take the time to post the question and the answer if you have it. I’m going to try to be better at answering forum questions, especially now I have learned a few interesting factoids. I am looking forward to the fact that next time when I go looking for an answer, thanks to all of us who have done the same, we can all help each other out. The next person who finds themselves in the same jam will thank you for talking them out of the tree.

And if you don’t already do so, blog.  And if I don’t know about your blog, tell me about it.

Comments closed

Warehouse History

Kennie Pontoppidan delves into various aspects of collecting and storing history in warehouses:

In T2 history we have the two attributes ValidFromDate and ValidToDate. We can choose two different strategies for updating the values of these: using system time (load time) or business time. If we use system time for the T2 splits, the data warehouse history is dependent on when we load data. This makes it impossible to reload data in the data warehouse without messing up the data history. If we allow our load ETL procedures to use timestamps for business time (when data was really valid) for T2 history, we get the opportunity to reload data. But the cost of this flexibility is a much more complicated design for T2 splits. We also need to keep track of this metadata on the source system attributes.

Part of a warehouse’s value is its ability to replay historical data, but you can only do that if you store the data correctly (and query it correctly!).

Comments closed

Azure Key Vault Connector Available

Rebecca Zhang notes that Azure Key Vault is now available to all:

When using these SQL encryption technologies, your data is encrypted with a symmetric key (called the database encryption key) stored in the database. Traditionally (without Azure Key Vault), a certificate that SQL Server manages would protect this data encryption key (DEK). With Azure Key Vault integration for SQL Server through the SQL Server Connector, you can protect the DEK with an asymmetric key that is stored in Azure Key Vault. This way, you can assume control over the key management, and have it be in a separate key management service outside of SQL Server.

The SQL Server Connector is especially useful for those using SQL Server-in-a-VM (IaaS) who want to leverage Azure Key Vault for managing their encryption keys. SQL IaaS is the simplest way to deploy and run SQL Server, and it is optimized for extending existing on-premises SQL Server applications to the cloud in a hybrid IaaS scenario, or supporting a migration scenario.

Read on for more details.

Comments closed