Press "Enter" to skip to content

Month: June 2016

Creating R Code

Ginger Grant introduces us to Microsoft R:

Microsoft has not one version of R, they have two but two. These two different versions are needed because they have two different purposes in mind. Microsoft R Open, is open source and fully R compatible and is faster than open source R because they rewrote a number of the algorithms to include multi-threaded math libraries. If you want to run R code on SQL Server, this is the not the version you want to use. You want to use the non-open source version designed to run on R Server, which is included with SQL Server 2016, Microsoft RRE Open. This version will run R code not only in memory but swap to disk, to create code which can access SQL Server data without needing to create a file, and can run code on the server from the client. The version of RRE Open which is included in SQL Server 2016 is 8.0.3.

She follows this up with a demo program to pull data from a SQL Server table and generate a histogram.  If you have zero R experience, there’s no time like the present to get started.

Comments closed

Alerting On Drastic Changes

Rob Collie has a post on using Power BI to spot outliers:

The basic idea here is “alert me if something has changed dramatically.”  If there’s a corner of my business that has spiked or crashed in a big way, I want to know.  If something has dramatically improved in a particular region, I may want to dive into that and see if it’s something we can replicate elsewhere.  And if something has fallen off a cliff, well, I need to know that for obvious reasons too.  And both kinds of dramatic change, positive and negative, can easily be obscured by overall aggregate values (so in some sense this is a similar theme to “Sara Problem?”)

So the first inclination is to evaluate distance from average performance.  And maybe that would be fine with high-volume situations, but when we’re subdividing our business into hundreds or perhaps thousands of micro-segments, we end up looking at smaller and smaller sample sizes, and “normal” variation can be legitimately more random than we expect.

This looks really cool.  If you read the comments, Rob notes that performance does break down at some point.  If you start hitting that point, I’d think about shifting this to R.

Comments closed

Implementing SoundEx

Dror Helper shows how to implement SoundEx in C#:

It’s fairly easy to follow the steps of the algorithm (as defined by Wikipedia):

  1. Retain the first letter of the name and drop all other occurrences of a, e, I, o, u, y, h, w.

  2. Replace consonants with digits as follows (after the first letter):

    • b, f, p, v → 1

    • c, g, j, k, q, s, x, z → 2

    • d, t → 3

    • l → 4

    • m, n → 5

    • r → 6

  3. If two or more letters with the same number are adjacent in the original name (before step 1), only retain the first letter; also two letters with the same number separated by ‘h’ or ‘w’ are coded as a single number, whereas such letters separated by a vowel are coded twice. This rule also applies to the first letter.

  4. If you have too few letters in your word that you can’t assign three numbers, append with zeros until there are three numbers. If you have more than 3 letters, just retain the first 3 numbers.

SQL Server also supports SOUNDEX as a built-in function.

Comments closed

Waiting For SP1

Guy Glantser hates “wait for SP1” advice:

Historical Facts

Throughout my career I have never seen an RTM version that was substantially less stable then the following SP1. Sure, there were bugs and issues. Sometimes there were critical bugs and issues. But there were just as much bugs and issues in SP1 and in SP2, and so on. I haven’t conducted a thorough research, so I don’t have a statistical proof, but these are the facts, at least from my experience.

I’d add one more thing:  pre-release versions of SQL Server run in production as part of Microsoft TAP (older link, and I think RDP and TAP have merged together at this point, but I don’t have those inside details).  These are real companies with real workloads running pre-RTM versions of SQL Server.  I work for a company which is in the program, and we were running our data warehouse on CTP 3 and then RCs.  By the time RTM hits the shelves, there’s already been a good deal of burn-in.

Comments closed

Choose Your Edition

Grant Fritchey explains the differences between different editions of SQL Server:

SQL Server Enterprise Edition is the high end. Here is where you need to go to multi-terrabytes in size and you have massive transaction loads. You’re looking at very sophisticated availability and disaster recovery. Again, the name gives it away. You’re generally only going to this edition when you’re working at an enterprise level of scale and architecture. Since you’re just getting started, don’t worry about this.

My version of the story is, “Buy Enterprise Edition.  Don’t cheap out because you’ll regret it later.”  Grant’s version is much more thorough.

Comments closed

Using Diagnostics Trace With Power BI

Chris Webb uses Diagnostics.Trace to track process runtime in Power BI:

To sum up, the workflow for tuning your query is:

  • Make some changes to the LongQuery query that hopefully make it faster

  • Update the Trace Message parameter with some notes about which version of the LongQuery query it is that you’ll be testing

  • Click the Refresh Preview button for the Diagnostics query to test how long LongQuery now runs for

  • Refresh, or load, the query that reads the data from the trace logs so you can see how all of your changes have affected query execution times

I give it two months before the Power BI team releases a change to make this easier…

Comments closed

SQL Server Backup Buffers

SQLsasquatch (whose name I now know but I like the handle so much) has a few questions about backup buffers:

I wonder:
-If ‘max server memory’ wasn’t being overridden by a more reasonable target (because max server memory is 120 GB on a 32GB VM), would the behavior still be the same before reaching target?  I bet it would be.
-Is this behavior specific to not having reached target?  Or when reaching target would backup buffers be allocated, potentially triggering a shrink of the bpool, then returned to the OS afterward requiring the bpool to grow again?
-What other allocations are returned directly to the OS rather than given back to the memory manager to add to free memory?  I bet CLR does this, too.  Really large query plans?  Nah, I bet they go back into the memory manager’s free memory.
-Does this make a big deal?  I bet it could.  Especially if a system is prone to develop persistent foreign memory among its NUMA nodes.  In most cases, it probably wouldn’t matter.

Good questions, to which I have zero answers.

Comments closed

Launch Event Thoughts

Patrick Keisler sums up his thoughts on the Raleigh-Durham SQL Server 2016 Launch Discovery Day:

Your solution must be completed in 3 hours.

On paper this all sounds pretty easy, but in practice it was quite hard. I am no BI developer and the other members of my team did not have any expertise in that area either, but we still managed to create a solution and have fun doing so.

The first issue was had was how to combine our development work on the same database. This one was easy…just use Azure. In the span of about 30 minutes, I spun up a new Azure VM with SQL Server 2016 pre-installed, uploaded the database, setup logins, and opened the appropriate ports. I then gave my team members the URL and credentials so they each could connect from their laptops.

These are good thoughts, and I completely agree with the point that better data definition would have made for a better event.  Each one of the teams had to spend a lot of time cleaning up data, and I think that limited the teams’ ability to do really cool things.  I’d love to put something like this on again, but if that happens, I’m going to make sure we start with a good data model so people can do fun things on top of that rather than spend all their time scrubbing data (unless that’s the point of the exercise).

Comments closed

Row-Level Security With Reporting Services

Paul Turley discusses combining row-level security, SQL Server Reporting Services, and SQL Server Analysis Services:

In every data source connection string, you can add a simple expression that maps the current Windows username to the CUSTOMDATA property of the data source provider.  This works in SSRS embedded data sources, shared data sources, in a SharePoint Office Data Connecter (ODC) file and in a SharePoint BISM connection file.  In each case, the syntax should be the similar.  Here is my shared data source on the SSRS 2016 report server

This is pretty snazzy.  Paul goes into good detail on the topic, so read the whole thing.

Comments closed

Share Your Knowledge

Ginger Grant wants you to share what you know:

Recently I have been working with some new features of SQL Server 2016 and have had questions which blogs, TechNet and Stack Overflowprovided no answers on the internet. Fortunately, I have found people to help me resolve the answers. If you go searching for the same errors I had, you will find answers now, as I have posted them. If you have had a problem unique to the latest release of SQL Server, I hope you will take the time to post the question and the answer if you have it. I’m going to try to be better at answering forum questions, especially now I have learned a few interesting factoids. I am looking forward to the fact that next time when I go looking for an answer, thanks to all of us who have done the same, we can all help each other out. The next person who finds themselves in the same jam will thank you for talking them out of the tree.

And if you don’t already do so, blog.  And if I don’t know about your blog, tell me about it.

Comments closed