2016-11-16 – Curated SQL

SQL Server 2016 SP1

Published 2016-11-16 by Kevin Feasel

Parikshit Savjani notes that SQL Server 2016 SP1 is available:

The following table compares the list of features which were only available in Enterprise edition which are now enabled in Standard, Web, Express, and LocalDB editions with SQL Server 2016 SP1. This consistent programmability surface area allows developers and ISVs to develop and build applications leveraging the following features which can be deployed against any edition of SQL Server installed in the customer environment. The scale and high availability limits do not change, and remain as–is for lower editions as documented in this MSDN article.

This is huge. With SQL Server 2016 SP1, you can get data compression, In-Memory OLTP, partitioning, database snapshots, Polybase, Always Encrypted, and a lot more in Standard edition. If you’re on Standard Edition today, this is a must-upgrade—some of these have been Enterprise-only features for nearly a decade and they were a huge part of the appeal for paying for Enterprise. My question is, what are they going to announce to make people want to keep buying Enterprise Edition?

Preemptive Scheduling

Published 2016-11-16 by Kevin Feasel

Ewald Cress looks at preemptive scheduling:

Cooperative scheduling is a relay race: you simply don’t stop without passing over the baton. If you write code which reaches a point where it may have to wait to acquire a resource, this waiting behaviour must be implemented by registering your desire with the resource, and then passing over control to a sibling worker. Once the resource becomes available, it or its proxy lets the scheduler know that you aren’t waiting anymore, and in due course a sibling worker (as the outgoing bearer of the scheduler’s soul) will hand the baton back to you.

This is complicated stuff, and not something that just happens by accident. The textbook scenario for such cooperative waiting is the traditional storage engine’s asynchronous disk I/O behaviour, mediated by page latches. Notionally, if a page isn’t in buffer cache, you want to call some form of Read() method on a database file, a method which only returns once the page has been read from disk. The issue is that other useful work could be getting done during this wait.
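The baton-passing idea in the excerpt can be sketched with a toy round-robin loop. This is only an illustration built on Python generators, not how SQLOS is implemented; the names `worker` and `run` are made up for the example.

```python
from collections import deque


def worker(name, steps, log):
    """A hypothetical worker that does a little work, then yields the baton."""
    for i in range(steps):
        log.append(f"{name} step {i}")
        yield  # voluntarily hand control back instead of blocking


def run(workers):
    """Toy scheduler: exactly one worker runs at a time, in round-robin order."""
    runnable = deque(workers)
    while runnable:
        current = runnable.popleft()
        try:
            next(current)  # let the worker run until its next yield
        except StopIteration:
            continue  # worker finished; do not reschedule it
        runnable.append(current)  # back of the queue until its next turn


log = []
run([worker("A", 2, log), worker("B", 3, log)])
print(log)
# ['A step 0', 'B step 0', 'A step 1', 'B step 1', 'B step 2']
```

Nothing here is ever interrupted: a worker that never reached `yield` would hold the scheduler forever, which is why waits have to be written cooperatively.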

Read on for a detailed example looking at xp_cmdshell.

New MPP For Big Data

Published 2016-11-16 by Kevin Feasel

James Serra notes that there will be a Microsoft Professional Program for Big Data:

A few months back, Microsoft started the Microsoft Professional Program for Data Science (note the program name change from Microsoft Professional Degree to Microsoft Professional Program, or MPP). This is online learning via edX.org as a way to learn the skills and get the hands-on experience that a data science role requires. You may audit any courses, including the associated hands-on labs, for free. However, to receive credit towards completing the data science track in the Microsoft Professional Program, you must obtain a verified certificate for a small fee for each of the ten courses you successfully complete in the curriculum. The course schedule is presented in a suggested order, to guide you as you build your skills, but this order is only a suggestion. If you prefer, you may take them in a different order. You may also take them simultaneously or one at a time, so long as each course is completed within its specified session dates.

Look for it sometime next year.

Understanding The Cardinality Estimator

Published 2016-11-16 by Kevin Feasel

SQL Scotsman is working on a very interesting series on statistics and the different cardinality estimators. So far, this is a three-part series. Part one is an overview:

A few of those assumptions changed in the new SQL Server 2014/2016 CE, namely:

Independence becomes Correlation: In absence of existing multi-column statistics, the legacy CE views the distribution of data contained across different columns as uncorrelated with one another. This assumption of independence often does not reflect the reality of a typical SQL Server database schema, where implied correlations do actually exist. The new CE uses an increased correlation assumption for multiple predicates and an exponential back off algorithm to derive cardinality estimates.
Simple Join Containment becomes Base Join Containment: Under the legacy CE, the assumption is that non-join predicates are somehow correlated which is called “Simple Containment”. For the new Cardinality Estimator, these non-join predicates are assumed to be independent (called “Base Containment”), and so this can translate into a reduced row estimate for the join. At a high level, the new CE derives the join selectivity from base-table histograms without scaling down using the associated filter predicates. Instead the new CE computes join selectivity using base-table histograms before applying the selectivity of non-join filters.
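To make the first of those changes concrete, here is a rough sketch of the arithmetic behind the two assumptions for multiple predicates on one table. It assumes the commonly described form of exponential back-off (most selective predicate taken in full, the next three dampened by successive square roots); the function names and sample numbers are invented for illustration.

```python
import math


def legacy_estimate(table_rows, selectivities):
    """Independence assumption: multiply every predicate's selectivity."""
    return table_rows * math.prod(selectivities)


def backoff_estimate(table_rows, selectivities):
    """Exponential back-off: s1 * s2^(1/2) * s3^(1/4) * s4^(1/8),
    using only the four most selective predicates."""
    ordered = sorted(selectivities)[:4]
    combined = 1.0
    for i, s in enumerate(ordered):
        combined *= s ** (1 / 2 ** i)
    return table_rows * combined


rows = 1_000_000
sels = [0.1, 0.25]  # hypothetical selectivities of two filter predicates
print(legacy_estimate(rows, sels))   # roughly 25,000 rows
print(backoff_estimate(rows, sels))  # roughly 50,000 rows
```

With a single predicate the two functions agree; the gap only opens up as more predicates are combined.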

Part two looks at trace flag 9481:

When To Use Trace Flag 9481

Query Scope: You’ve moved (migrated/upgraded) to SQL Server 2014 / 2016, your databases are at compatibility level 120 / 130 and using the new CE, your workload is performing well overall but there are a few regressions where a small number of queries actually perform worse. Use Trace Flag 9481 on a per query basis as a temporary measure until you can tune / rewrite the query so it performs well without the hint.

Part three discusses database scoped configurations in SQL Server 2016:

The problem with lowering the database compatibility level is that you can’t leverage the new engine functionality available under the latest compatibility level.

This problem was solved in SQL Server 2016 with the introduction of Database Scoped Configurations which gives you the ability to make several database-level configuration changes for properties that were previously configured at the instance-level. In particular, the LEGACY_CARDINALITY_ESTIMATION database scoped configuration allows you to set the cardinality estimation model independent of the database compatibility level. This option allows you to leverage all new functionality provided with compatibility level 130 but still use the legacy CE in the odd chance that the latest CE causes severe query regressions across your workload.

The article on statistics is quite long for a blog post and a great read. I’m looking forward to reading more.

Processing Azure Analysis Services

Published 2016-11-16 by Kevin Feasel

Bill Anton shows how to process an Azure Analysis Services tabular model:

This post contains a list of various methods that can be used to process (i.e. load data into) an Azure AS tabular model. As you will see – not much has changed from the regular on-premises version (which is a very good thing as it softens the learning curve).

Read on if you’re looking at putting an Analysis Services model into Azure.

External Tables To Hadoop

Published 2016-11-16 by Kevin Feasel

I have a post looking at creating external tables in Polybase to hit a Hadoop folder:

The DATA_SOURCE and FILE_FORMAT options are easy: pick your external data source and external file format of choice.

The last major section deals with rejection. We’re going from a semi-structured system to a structured system, and sometimes there are bad rows in our data, as there are no strict checks of structure before inserting records. The Hadoop mindset is that there are two places in which you can perform data quality checks: in the original client (pushing data into HDFS) and in any clients reading data from HDFS. To make things simpler for us, the Polybase engine will outright reject any records which do not adhere to the quality standards you define when you create the table. For example, let’s say that we have an Age column for each of our players, and that each age is an integer. If the first row of our file has headers, then the first row will literally read “Age” and conversion to integer will fail. Polybase rejects this row (removing it from the result set stream) and increments a rejection counter. What happens next depends upon the reject options.
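The reject-and-count behavior described there can be mimicked in a few lines. This is a loose model only: `read_ages` and `max_rejects` are hypothetical names, and the rule that the read fails once the counter exceeds the limit is an assumption made for the sketch, not a statement of Polybase's exact threshold semantics.

```python
def read_ages(lines, max_rejects):
    """Convert each line to an integer age, dropping rows that fail to parse.

    Assumption for this sketch: the whole read fails once more than
    max_rejects rows have been rejected.
    """
    ages = []
    rejected = 0
    for line in lines:
        try:
            ages.append(int(line.strip()))
        except ValueError:
            rejected += 1  # bad row: drop it from the result stream and count it
            if rejected > max_rejects:
                raise RuntimeError(f"too many rejected rows ({rejected})")
    return ages, rejected


rows = ["Age", "27", "31", "n/a", "24"]  # header row plus one dirty value
ages, rejected = read_ages(rows, max_rejects=2)
print(ages, rejected)  # [27, 31, 24] 2
```

Calling the same function with `max_rejects=0` raises on the header row, which is the kind of outcome the reject options control.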

Creating an external table is pretty easy once you have the foundation prepared.

Growing New Speakers

Published 2016-11-16 by Kevin Feasel

Andy Yun hosted this month’s T-SQL Tuesday and it was a huge success:

Welcome to this month’s T-SQL Tuesday Round-Up! A few weeks ago, I sent out a call for bloggers and must say that I’m utterly blown away by the response. A whopping FORTY bloggers responded last week with contributions for Growing New Speakers! Four – zero! You people are all amazing!!!

There’s a lot to read here. If you’ve ever thought about speaking, give it a try; there are 40 people trying to convince you this month.

SQL Server In Containers

Published 2016-11-16 by Kevin Feasel

Andrew Pruski shows how to install Docker on Windows Server 2016 and pull down a SQL Express container:

But what about connecting remotely? This isn’t going to be much use if we can’t remotely connect!

Actually, connecting remotely is the same as connecting to a named instance. You just use the server’s IP address (not the container’s private IP) and the non-default port that we specified when creating the container (remember to allow access to the port in the firewall).
Easy, eh?

Containers are great, though I do have trouble wrapping my head around containerized databases and have had struggles getting containerized Hadoop to work the way I want.

DBCC OPTIMIZER_WHATIF

Published 2016-11-16 by Kevin Feasel

Derik Hammer shows how to use DBCC OPTIMIZER_WHATIF to get an idea of how your query would run with different hardware:

DBCC OPTIMIZER_WHATIF can be used to pull down your resources or augment them. Often the differences in the execution plans have to do with parallelism and memory grants. This is an example of an execution plan running on an underpowered development machine.

This is a good tool to help figure out what an execution plan probably would look like in production when your test environment is much smaller.

SQL Server R Service Users

Published 2016-11-16 by Kevin Feasel

John Pertell shows how to figure out which user account is running SQL Server R Services code:

You’re not running as yourself, even though that’s the account you signed into SSMS as.

You’re not running under the server account that SQL or SQL Launchpad run under.

You’re running as a new account created when you installed SQL R Service In Database for the purpose of running R code.

John also looks at a couple ways of showing which user is running this code and notes that this solves his file share issue.


Day: November 16, 2016
