Kennie Pontoppidan talks about a new DBCC command:

This command enables you to take a copy of the schema and the statistics for an existing database into a new database. According to the documentation, DBCC CLONEDATABASE

  • Creates a new destination database that uses the same file layout as the source but with default file sizes as the model database.

  • Creates an internal snapshot of the source database.

  • Copies the system metadata from the source to the destination database.

  • Copies all schema for all objects from the source to the destination database.

  • Copies statistics for all indexes from the source to the destination database.

I knew there were ways of scripting out statistics, but DBCC CLONEDATABASE looks like a new and interesting beast.

Database Mail Requires TLS 1.0

Ryan Adams discovered that Database Mail cannot use TLS 1.2 at this time:

You may recall something called the POODLE attack that revealed a vulnerability in SSL 3.0 and TLS 1.0.  This particular server had SSL 3.0, TLS 1.0, and TLS 1.1 disabled in the registry.  Also note that TLS 1.2 was NOT disabled.  The server was running Windows 2012 R2.  These protocols were disabled to prevent the possibility of a POODLE attack.  If you are wondering how to disable these protocols on your servers then look at Microsoft Security Advisory 3009008.  To disable them for the whole OS scroll down to the Suggested Actions section and look under the heading “Disable SSL 3.0 in Windows For Server Software”.

I also want to note that the PCI Security Standards Council pushed back the date for getting off of SSL and TLS 1.0 to June 30th, 2018.  In addition to that, it should also be noted that Microsoft’s Schannel implementation of TLS 1.0 is patched against all known vulnerabilities.

The root cause is interesting:  it’s because Database Mail requires .NET Framework 3.5.  Ryan has more details, including a fix, so read on.

SSIS Deployment Models

Ginger Grant argues in favor of the project deployment model in SSIS:

Prior to SQL Server 2012, there was no project deployment. SSIS code was all deployed as packages. These packages could be stored within MSDB or they could be stored and run from the file system. In disorganized places like the one where I worked, they were deployed in both. Assuming nothing much changed since I left, they have all versions of SQL Server which were released prior to the day the new IT Director started in 2012. There was DTS on SQL Server 97, 2000 and SSIS on 2005, 2008 and 2008 R2. No reason to upgrade anything which still worked was their motto. When space was a problem, one could always go build another server. I think the LAN administrator was happiest when he was able to justify building a new server as he could spend hours shopping for parts on the internet and building the latest server.

I believe that, in the business, we call this a “nightmare scenario.”  Read on for ways in which a project model would help.

Introduction To R

Kevin Feasel



Paul Hernandez has an introduction to using the R client and RODBC to connect to SQL Server:

The first step is to load the RevoScaleR library. This is an amazing library that allows to create scalable and performant applications with R.

Then a connection string is defined, in my case using Windows Authentication. If you want to use SQL Server authentication the user name and password are needed.

We define a local folder as the compute context.

RxInSQLServer: generates a SQL Server compute context using SQL Server R Services –documentation

Sample query: I already prepared the dataset in the view, this is a best practice in order to reduce the size of the query in the R code and for me is also easier to maintain.

I think there’s a lot of value in learning R, regardless of whether you have “data analyst” in your role or job title.

MDS Installation Pre-Requisites

Cody Konior explains pre-requisites for installing Master Data Services for SQL Server 2016:

Microsoft’s list of MDS prerequisites doesn’t match their PowerShell installation script “sample”.

  • The sample installs the Application Server component which is not listed in the requirements.

  • The sample installs the Application Server NET 4.5 component even though it’s not in the requirements (it lists .NET Framework 4.5 Advanced Services, but a look through other versions of the OS would indicate that this is the plain .NET Framework 4.5 Features category; I suspect someone misread this as “Application Services”).

  • The sample installs additional “Application Development” components that are not listed in the requirements.

  • The sample doesn’t install the recommended Dynamic Content Compression component.

Basically, the sample needs updated and the documentation fixed.  Click through if you’re planning to use MDS.

Excel MDX Performance Improvements

Chris Webb notes performance improvements in Excel 2016 for PivotTables which connect to Analysis Services:

In that post, Microsoft point out that how much of a performance increase you get will depend on a number of factors. I guess they have to do this to manage expectations. However I’m going to come right out and say that these changes are probably the most important thing that has happened in the last five years for Analysis Services or Power Pivot query performance and if you are using Excel PivotTables with SSAS or Power Pivot for reporting, you should upgrade to Excel 2016 (you’ll need the Office 365 click-to-run version) just for these changes.

The key improvement that Microsoft have made here is to stop Excel bringing back unnecessary subtotal values in the MDX queries that it generates. This has been a gigantic problem for years and several people have blogged about it: I did here, and among other people Rui Quintino and Richard Lees have posts on the subject that are worth reading. A lot of my consultancy work involves SSAS and MDX performance tuning and I think I see around five to ten customers every year who are seriously affected by this – and I’m sure this is just the tip of the iceberg. Even if you don’t think you are affected, you may have users who are putting up with slow refresh times for Excel-based reports without your knowledge.

That’s a pretty big statement in bold.  Read the whole thing.

Data Frame Partial Caching

Kevin Feasel



Arijit Tarafdar shows how to capture partitions of a data frame in Spark, either horizontally or vertically:

In many Spark applications, performance benefit is obtained from caching the data if reused several times in the applications instead of reading them each time from persistent storage. However, there can be situations when the entire data cannot be cached in the cluster due to resource constraint in the cluster and/or the driver. In this blog we describe two schemes that can be used to partially cache the data by vertical and/or horizontal partitioning of the Distributed Data Frame (DDF) representing the data. Note that these schemes are application specific and are beneficial only if the cached part of the data is used multiple times in consecutive transformations or actions.

In the notebook we declare a Student case class with name, subject, major, school and year as members. The application is required to find out the number of students by name, subject, major, school and year.

Partitioning is an interesting idea for trying to speed up Spark performance by keeping everything in memory even when your entire data set is a bit too large.

Infrastructure Error

Suresh Kandoth explains the various causes of “Login-based server access validation failed with an infrastructure error” error messages:

The important piece of information from that message is the part about server access validation failed. So let’s examine that carefully. During the login process, the database engine has to perform several checks regarding the login and its various attributes before letting the application connect to the SQL Server instance. Out of that big list, related to this error condition, there are two checks to find out if this login is authorized to access this server instance. There are two permissions that come into play while performing these authorization checks:

– Does the login have the SERVER class permission named CONNECT SQL for this server instance?

– Does the login have the ENDPOINT class permission named CONNECT for the specific endpoint on which the client application established the connection?

Suresh lays out a number of scenarios and explains why they could cause this error to occur.

Continuous Integration

James Anderson is starting a series on continuous integration:

I had been using source control for years but it’s always felt like a tick box exercise that I was doing because I had to. I had never used it to review old versions to see where code went wrong or to quickly roll back changes if I decided I no longer wanted to go in a certain direction with the code. I never felt like I was getting anything back from using source control. Sometimes it takes a problem to arise for you to see the value of a solution.

In 2015 I started to inherit the code base for our internal maintenance database, the UtilityDB. This database is used to store performance metrics and to manage tasks such as index maintenance and backups. This database is installed on all of our instances.

This first post is an introduction to the series, and it looks like he’ll cover some heady topics.

Microsoft R Client

Kevin Feasel



Buck Woody discusses whether Microsoft R Client really is a client:

Enter the Microsoft R Client. It includes Microsoft R Open, and adds in some of the ScaleR functions, which makes processing data faster and more efficient. And again, it’s a full R environment – you can write and run code, right there on your desktop. But the important bit is that it can connect to a Microsoft R Server (MRS) by seting something called the “Compute Context“, which tells the R environment to run on a more powerful, scalable server environment, like you may be used to with SQL Server.

The naming is a bit of a head-scratcher, to be honest.


July 2018
« Jun