Press "Enter" to skip to content

Month: August 2017

The Pain Of Multi-Statement TVFs

Andy Mallon walks through a multi-statement table-valued function in Microsoft Dynamics CRM:

Look at all those table-valued function calls! Followed immediately by a really expensive hash match. My Spidey Sense started to tingle. What is fn_GetMaxPrivilegeDepthMask, and why is it being called 30 times? I bet this is a problem.

When you see “Table-valued function” as an operator in a plan, that actually means it’s a multi-statement table-valued function. If it were an inline table-valued function, it would get incorporated into the larger plan, and not be a black box.

Multi-statement table-valued functions are evil. Don’t use them. The Cardinality Estimator isn’t able to make accurate estimates. The Query Optimizer isn’t able to optimize them in the context of the larger query. From a performance perspective, they don’t scale.

Even though this TVF is an out-of-the-box piece of code from Dynamics CRM, my Spidey Sense tells me that it’s the problem.

That said, Joe Sack and team are working on making multi-statement TVFs faster in SQL Server 2017. Whether that will be enough to move the needle away from Andy’s excellent advice, we’ll have to wait and see.
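To make the distinction concrete, here is a minimal sketch (against a hypothetical dbo.Orders table, not the Dynamics CRM function) of the same logic written both ways. The multi-statement version shows up in a plan as an opaque Table-valued function operator; the inline version gets expanded into the calling query’s plan.

-- Multi-statement TVF: the body is a black box to the optimizer.
CREATE FUNCTION dbo.GetOrders_MultiStatement (@CustomerID int)
RETURNS @Result TABLE (OrderID int, OrderDate date)
AS
BEGIN
    INSERT INTO @Result (OrderID, OrderDate)
    SELECT OrderID, OrderDate
    FROM dbo.Orders
    WHERE CustomerID = @CustomerID;
    RETURN;
END;
GO

-- Inline TVF: a single RETURN (SELECT ...) that the optimizer folds into the calling plan.
CREATE FUNCTION dbo.GetOrders_Inline (@CustomerID int)
RETURNS TABLE
AS
RETURN
(
    SELECT OrderID, OrderDate
    FROM dbo.Orders
    WHERE CustomerID = @CustomerID
);
GO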


Using NLP To Find Similar Facebook Posts

The folks at Knoyd put together a word embedding example by scraping a Python Facebook group:

We are going to represent the content of a Facebook post using word embeddings and compare the transformed posts using word mover’s distance. The combination of the two has shown lower k-nearest-neighbor document classification error rates compared to other state-of-the-art techniques.

The advantage of word embeddings is that the words which have similar meanings but don’t have any letters in common will still have similar vectors (be close) in the embedded space (e.g. lion and tiger).

There’s a good high-level discussion of techniques in this post.


R Services Internal Communication Mechanisms

Niels Berglund continues his R Services internals series:

When browsing for the symbols, you can use this command: x /1 *!TCP*. By using the option /1 you’ll only see the names, and no addresses. On my machine that gives me quite a lot, but there are two entries that catch my eye: sqllang!Tcp::AcceptConnection and sqllang!Tcp::Close. So let us set breakpoints at those two symbols, and see what happens when we execute our code.

The result when executing the code is that we initially break at sqllang!Tcp::AcceptConnection, followed somewhat later by a break at sqllang!Tcp::Close. Cool, this seems to work – let us set some more breakpoints and try to figure out the flow of events.

The first half recapitulates his previous findings, and then he incorporates new information in the second half.


Palindromes In SQL

Lukas Eder rises to a challenge:

SQL is a really cool language. I can write really complex business logic with this logic programming language. I was again thrilled about SQL recently, at a customer site:

But whenever I tweet something like the above, the inevitable happened: I was nerd sniped. Oleg Šelajev from ZeroTurnaround challenged me to prove that SQL is so awesome:

Given a string, find all substrings from that string, which are palindromes. Challenge accepted! (For the moment, let’s forget about algorithmic complexity.)

His answer is in Postgres syntax, and a commenter includes Oracle syntax.  T-SQL is left as an exercise for the reader.
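For anyone wanting a head start on that exercise, here is one possible T-SQL sketch (not Lukas’s solution): generate every (start, length) pair with a tally, take the substring, and keep the ones that equal their own reverse.

DECLARE @s varchar(100) = 'ollottolle';   -- sample input string

WITH n AS
(
    SELECT TOP (LEN(@s)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS i
    FROM sys.all_objects
)
SELECT DISTINCT SUBSTRING(@s, starts.i, lens.i) AS palindrome
FROM n AS starts
CROSS JOIN n AS lens
WHERE starts.i + lens.i - 1 <= LEN(@s)   -- substring must stay inside the string
  AND lens.i >= 2                        -- skip trivial one-character palindromes
  AND SUBSTRING(@s, starts.i, lens.i) = REVERSE(SUBSTRING(@s, starts.i, lens.i));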


Automating Index Maintenance On Azure SQL DB

Arun Sirpal shows how to use Azure Automation to rebuild indexes on an Azure SQL Database:

The answer is via Azure Automation.

At a high level this is what I did.

  • Create an Automation Account.

  • Create a credential.

  • Create a PowerShell Runbook which has the code for index rebuilds.

  • Create a schedule and link it to the above.

  • Configure parameters within the schedule (if any).

  • Configure logging level (if desired).

Click through for the detailed steps.
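The runbook’s payload is ordinary index-maintenance T-SQL. Here is a minimal sketch (not Arun’s script) of the kind of statement generation such a runbook might execute against the database, rebuilding indexes above a fragmentation threshold:

DECLARE @sql nvarchar(max) = N'';

SELECT @sql += N'ALTER INDEX ' + QUOTENAME(i.name)
            + N' ON ' + QUOTENAME(s.name) + N'.' + QUOTENAME(o.name)
            + N' REBUILD WITH (ONLINE = ON);' + NCHAR(13) + NCHAR(10)
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ps
JOIN sys.indexes AS i
    ON ps.object_id = i.object_id AND ps.index_id = i.index_id
JOIN sys.objects AS o ON i.object_id = o.object_id
JOIN sys.schemas AS s ON o.schema_id = s.schema_id
WHERE ps.avg_fragmentation_in_percent > 30   -- arbitrary threshold, for illustration only
  AND i.index_id > 0;                        -- skip heaps

EXEC sys.sp_executesql @sql;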


CLR Strict Security Done Easier

Solomon Rutzky continues his CLR Strict Security series with an easier way of creating a secure assembly:

This solution is easier than Solution 1:

  • 8 steps instead of 22!

  • No extra Project

However, a very small amount of risk was added by overriding the default MSBuild workflow for SSDT. This risk can be eliminated if Microsoft provides a pre-defined Target for the appropriate event. Please upvote my suggestion to have this happen: Add MSBuild predefined Targets for “BeforeSqlBuild” and “BeforePublish” to SSDT SQL Server Data Projects.

ALSO: Even though we did not sign the assembly with a Strong Name Key, it is still probably a good idea to do that.

If you use CLR, this is worth the read.
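As general background (this is not Solomon’s solution from the post, which automates things within the SSDT build), the asymmetric-key route that his strong-name note alludes to looks roughly like the sketch below. Database, path, and object names are hypothetical.

USE [master];
GO
-- Create a key from the strong-name-signed DLL, create a login from that key,
-- and grant it UNSAFE ASSEMBLY so that "CLR strict security" trusts the assembly.
CREATE ASYMMETRIC KEY [MyClrKey]
    FROM EXECUTABLE FILE = N'C:\Assemblies\MyClrAssembly.dll';

CREATE LOGIN [MyClrLogin] FROM ASYMMETRIC KEY [MyClrKey];

GRANT UNSAFE ASSEMBLY TO [MyClrLogin];
GO
USE [MyDatabase];
GO
CREATE ASSEMBLY [MyClrAssembly]
    FROM N'C:\Assemblies\MyClrAssembly.dll'
    WITH PERMISSION_SET = SAFE;   -- under strict security, even SAFE assemblies need this trust
GO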


Another Reason To Avoid Shrinking Data Files

Frank Gill gives us a demo of how much log space it takes to shrink a database file:

Yesterday, I was running a health assessment for a client. They are running a weekly maintenance plan that is shrinking all of their data files. After I picked myself up off the floor, I searched the web for “Paul Randal shrink” and hit on Paul’s excellent post Why you should not shrink your data files. In the post, Paul (b|t) demonstrates the effect of DBCC SHRINKDATABASE on index fragmentation. After the demo script, Paul writes, “As well as introducing index fragmentation, data file shrink also generates a lot of I/O, uses a lot of CPU and generates *loads* (emphasis Paul’s) of transaction log.”

This led me to ask the question, “How much is *loads*?”. To find an answer, I made the following modification to Paul’s script:

Read on for the answer. There are legitimate reasons to shrink data files, but doing so comes at a very high cost.
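Frank’s exact modification is in the post, but a rough way to see the effect yourself is to capture used log space before and after the shrink, something like this sketch (the file name is hypothetical):

DECLARE @before bigint, @after bigint;

SELECT @before = used_log_space_in_bytes
FROM sys.dm_db_log_space_usage;

DBCC SHRINKFILE (N'MyDatabase_Data', 1024);   -- shrink the data file to a 1 GB target

SELECT @after = used_log_space_in_bytes
FROM sys.dm_db_log_space_usage;

-- Rough gauge only: checkpoints and log truncation can deflate the number.
SELECT (@after - @before) / 1024.0 / 1024.0 AS approx_log_generated_mb;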


New Extended Events In SQL Server 2017

Erik Darling unwraps some of his Christmas presents a bit early:

There are, as of RC2 being released, 194(!) new Events to Extend your mind with. Not all of them are interesting to me, and I haven’t had time to pry into all of the ones that are interesting just yet.

This is a rundown of the new Events with names or descriptions that I found interesting, and will try to spend some time with.

I can’t promise anything.

After all, getting some of these to fire is tougher than using a Debugger.

There are some interesting events here.
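If you want to poke through the list on your own instance, one quick way (not necessarily how Erik built his rundown) is to query the Extended Events metadata DMVs:

SELECT p.name AS package_name,
       o.name AS event_name,
       o.description
FROM sys.dm_xe_objects AS o
JOIN sys.dm_xe_packages AS p
    ON o.package_guid = p.guid
WHERE o.object_type = N'event'
ORDER BY p.name, o.name;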


Building Azure Resource Manager Templates

Ed Elliott gives a brief overview of Azure Resource Manager templates and puts together a sample template:

The ARM API deploys resources to Azure, but it doesn’t deploy code onto those resources. For example, you can use ARM to deploy a virtual machine with SQL Server already installed, but you can’t use ARM to deploy a database from an SSDT DacPac.

To save time when designing solutions, it is important to understand that the ARM API is used simply for resources, and that we need some other technology, such as DSC or PowerShell, to manage deployments onto that infrastructure once it is in place.

This is a nice overview of the topic, and because it’s Ed (who is much better about this than most), he goes into how to test before even getting into how to create.
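For reference, an ARM template is just a JSON document with a handful of well-known top-level sections; a minimal skeleton (not Ed’s sample) looks roughly like this, with the actual resource definitions going into the resources array:

{
  "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
  "contentVersion": "1.0.0.0",
  "parameters": { },
  "variables": { },
  "resources": [ ],
  "outputs": { }
}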


TensorFlow On The Pi

Pete Warden shows how to install TensorFlow on a Raspberry Pi:

It’s never been easy to get TensorFlow installed on a Pi though. I had created a makefile script that let you build the C++ part from scratch, but it took several hours to complete and didn’t support Python. Sam Abrahams, an external contributor, did an amazing job maintaining a Python pip wheel for major releases, but building it required you to add swap space on a USB device for your Pi, and took even longer to compile than the makefile approach. Snips managed to get TensorFlow cross-compiling for Rust, but it wasn’t clear how to apply this to other languages.

Plenty of people on the team are Pi enthusiasts, and happily Eugene Brevdo dived in to investigate how we could improve the situation. We knew we wanted to have something that could be run as part of TensorFlow’s Jenkins continuous integration system, which meant building a completely automatic solution that would run with no user intervention. Since having a Pi plugged into a machine to run something like the makefile build would be hard to maintain, we did try using a hosted server from Mythic Beasts. Eugene got the makefile build going after a few hiccups, but the Python version required more RAM than was available, and we couldn’t plug in a USB drive remotely!

Read the whole thing, even if just for the science experiment aspect.
