Press "Enter" to skip to content

Curated SQL Posts

Fixing SQL Server R Services Installation Issues

Cody Konior notes that upgrading from CTP 3.0 to CTP 3.2 can cause SQL Server R Services to break:

If you were using CTP 3.0 and later ran an in-place upgrade to CTP 3.2 this will silently break R Services. Uninstalling and reinstalling the R component will not fix the problem, but it can be fixed. There are a few interrelated issues here so bear with me.

Hopefully you don’t run into this issue, but if you do, at least there’s a fix.

Comments closed

How Do Natively Compiled UDFs Perform?

Gail Shaw investigates natively compiled user-defined functions in SQL Server 2016:

When I saw that, the first question that came to mind is whether natively compiling a scalar function reduces the overhead when calling that function within another query. I’m not talking about data-accessing scalar UDFs, since natively compiled functions can only access in-memory tables, but functions that do simple manipulation of the parameters passed in. String formatting, for example, or date manipulation.

While not as harmful as data-accessing scalar UDFs, there’s still overhead as these are not inline functions, they’re called for each row in the resultset (as a look at the Stored Procedure Completed XE event would show), and the call to the function takes time. Admittedly not a lot of time, but when it’s on each row of a large resultset the total can be noticeable.

Read the whole thing and check out Gail’s method and conclusions.

Comments closed

Who Grew The Database?

Andy Galbraith helps us figure out who to blame for database growth:

I feel a little dirty writing about the Default Trace in the world of Extended Events, but I also know that many people simply don’t know how to use XEvents, and this can be faster if you already have it in your toolbox.  Also it will work back to SQL 2005 where XEvents were new in SQL 2008.

I have modified this several times to improve it – I started with a query from Tibor Karaszi (blog/@TiborKaraszi), modified it with some code from Jason Strate (blog/@StrateSQL), and then modified that myself for what is included and what is filtered.  There are links to both Tibor’s and Jason’s source material in the code below.

I usually just blame the BI team for database growth.

Comments closed

Planning SQL Server Migrations

Kendra Little has some great resources to get you started with a SQL Server migration:

Planning to move to new hardware for your SQL Server? Techniques like log shipping and database mirroring can be incredibly useful to make the change fast and painless– but you’ve got to pick the right techniques for your environment ahead of time, and know how to do a few things that aren’t in the GUI.

Some of this stuff could also feed into a disaster recovery plan.

Comments closed

Shredding XML

Tim Peters introduces us to shredding multi-level XML:

The below XML has data nested in different levels that requires the nodes method to join them together. The nodes method accepts a XML string and returns a rowset. That rowset can then be used with CROSS APPLY to effectively link your way down.

nodes (XQuery) as Table(Column)

The tabular format I need requires data from 3 different levels of this XML gob and I need to wade through 5 “tables” to get there.

Shredding XML is something you occasionally need to do.

Comments closed

What Is Business Intelligence?

Rolf Tesmer digs into the concept of BI:

Hunting the web for the general definition pulls up many one liners – and yes I guess everyone who is anyone will have a way to define it, and that definition is (or should) be based on their own experiences with building, deploying or supporting BI solutions.

If you are looking for a nice short collection of some of those definitions – and a further explanation of why you need BI – then this is a great post (http://www.jamesserra.com/archive/2013/03/why-you-need-business-intelligence/)

Rolf unpacks the definition and gives us some insight into the nature of Business Intelligence.

Comments closed

Options To Capture Changed Data

Koen Verbeeck looks at various ways of capturing changed data:

  • In some very rare cases, you can actually use change data capture or change tracking on the source system. If you get one of those features implemented, you’re golden. But most of the time you’re not, as a lot of administrators don’t like them because of potential performance impact.

Koen lists several options.  One additional option is to use triggers to capture changes in a queue table.  If you are dealing with SCD-1 changes (in which you do not need a full reckoning of history) or periodic SCD-2 (in which you keep history but are okay with smashing some changes together if they’re within a time period between ETL loads), loading IDs of changed records into a queue table is reasonably efficient and gets around trying to make sure everybody updates the modified date.  It has its own drawbacks, though, starting with it using triggers…

Comments closed

Case Statement Short-Circuiting

Richie Lee talks about using the CASE statement to short-circuit a logical expression:

The issue here is that SQL is a declarative language: unlike procedural languages, there is no guarantee on the ordering of the operations, because optimizers. And SQL Server decides to do something other than what we’d expect: it tries to evaluate the value “Apu” as a date. But by using a CASE expression we can force the optimizer to take the input and match it to the expression (in this case, when a value is a date then convert it to a date) before checking if the value is older than 7 days.

This does work most of the time, but there are exceptions, so as always, test your code.

Comments closed

Incremental Loading Using Datetime Columns

Reza Rad shows a pattern for implementing incremental loads using modified dates:

The idea behind this method is to store the latest ETL run time in a config or log table, and then in the next ETL run just load records from the source table that have modified (with their modified date greater than or equal to) after the latest ETL run datetime. This will create the change set for the data table. The change set might contains inserted, updated, or deleted records. to identify which change happened on the record you need to compare the change set with existing records and separate inserted, updated, and deleted records. This change set with the action on each record can be inserted into staging tables, and then be used to apply on the fact table based on appropriate action.

In my experience, the hardest part about this is making sure people update ModifiedTime when they update rows in the table.

Comments closed

Powershell Hashtables

Steve Jones shows us how to implement a hashtable in Powershell:

With that code, I could easily solve the puzzle. However I was struck by the various ways I work with the hash tables. I use braces, {}, to declare the table. I use brackets, [], to access elements and then parenthesis, (), when calling methods. All of that makes programming sense, but it’s something to keep in mind, especially as those three marks mean different things in Python.

Hashtables are easy to implement in Powershell and are extremely useful.

Comments closed