2018-11-09 – Curated SQL

Kaggle-Maintained Data

Published 2018-11-09 by Kevin Feasel

Noah Daniels announces “Maintained by Kaggle” data sets:

The “Maintained by Kaggle” badge means that Kaggle is now and will continue to actively maintain that dataset. This includes regular updates to descriptions and metadata, quicker response rates in discussion, and accurate current data from the source. Our goal is to create seamless workflows that allow everyone to do data science on Kaggle and be confident in the data they work with.

They have several data sets available from different open data projects for several cities, as well as NOAA and the World Bank. If you’re looking for data sets to play with, this is a good option.

Faster Scalar Functions In SQL Server 2019

Published 2018-11-09 by Kevin Feasel

Brent Ozar looks at improvements the SQL Server team has made to scalar functions in SQL Server 2019:

My database has to be in 2019 compat mode to enable Froid, the function-inlining magic. Run the same query again, and the metrics are wildly different:

Runtime: 4 seconds
CPU time: 4 seconds
Logical reads: 3,247,991 (which still sounds bad, but bear with me)

My bias tells me that I still want to avoid scalar functions, but it’s no longer the automatic deal-killer it once was.

The Basics Of Kubernetes

Published 2018-11-09 by Kevin Feasel

Chris Adkin gives us a rundown on Kubernetes:

With the announcement of SQL Server 2019 big data clusters at Ignite, Kubernetes (often abbreviated to K8s) now stands front and center as part of Microsoft’s data platform vision. The obvious inference being that this is something that the Microsoft data platform community is going to show an increased interest in. The post aims to provide some context around:

why container orchestration is required
how Kubernetes is architected
the basics of working with Kubernetes
and why embracing open source software should be approached in an eyes wide open manner

Kubernetes is another technology that is worth learning now and is likely to be helpful down the line.

The Table Spool Operator In SQL Server

Published 2018-11-09 by Kevin Feasel

Hugo Kornelis digs into table spools:

The Table Spool operator is one of the four spool operators that SQL Server supports. It retains a copy of all data it reads in a worktable (in tempdb) and can then later return extra copies of these rows without having to call its child operators to produce them again. These copies can be made available in the same part of the execution plans, or in another part.

Table Spool is probably the most basic of the spool operators. The Index Spool operator is very similar to it, but indexes its data to allow it to return only a subset of the stored rows. The Row Count Spool operator is optimized for specific cases where the rows to be returned are empty. And the Window Spool operator is used to support the ROWS and RANGE specifications of windowing functions.

Typical use cases of a Table Spool are: to reproduce the same input multiple times without having to re-execute its child nodes (e.g. in the inner input of a Nested Loops); to make the same input available in multiple branches of an execution plan (e.g. in wide update plans); or to ensure that an original copy of the data is available after an insert, update, or delete operator changes the base data (“Halloween protection”).

Click through for a great deal more detail.
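
To make the first use case concrete, here is a minimal Python sketch of the idea behind a Table Spool: read the child once, keep a copy, and replay that copy on later requests. This is a toy model and not how SQL Server implements the operator; the names `TableSpool`, `scan_inner`, and `nested_loops` are hypothetical, and an in-memory list stands in for the tempdb worktable.

```python
from typing import Callable, Iterable, Iterator, List, Optional


class TableSpool:
    """Toy model of a spool: read the child once, then replay stored rows."""

    def __init__(self, child: Callable[[], Iterable[tuple]]) -> None:
        self._child = child  # hypothetical child operator
        # Assumption: a plain list stands in for the worktable in tempdb.
        self._worktable: Optional[List[tuple]] = None

    def rows(self) -> Iterator[tuple]:
        if self._worktable is None:
            # First request: call the child and keep a copy of every row.
            self._worktable = list(self._child())
        # Later requests replay the stored copy without calling the child.
        return iter(self._worktable)


child_calls = 0


def scan_inner() -> Iterator[tuple]:
    """Hypothetical child operator that counts how often it is executed."""
    global child_calls
    child_calls += 1
    yield from [(1, "a"), (2, "b")]


def nested_loops(outer: Iterable[int], spool: TableSpool) -> Iterator[tuple]:
    """Pair every outer row with every row the spool returns."""
    for outer_row in outer:
        for inner_row in spool.rows():
            yield (outer_row, *inner_row)


spool = TableSpool(scan_inner)
result = list(nested_loops([10, 20, 30], spool))
```

The outer input has three rows, so the inner side is requested three times, yet `scan_inner` runs only once; the other two passes are served from the stored copy. The sketch leaves out everything else Hugo covers, so click through for the real behavior.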

Accelerated Database Recovery In SQL Server 2019

Published 2018-11-09 by Kevin Feasel

Frank Gill notes an exciting new feature in SQL Server 2019:

“Any sufficiently advanced technology is indistinguishable from magic.” -Arthur C. Clarke

In this morning’s keynote session at PASS Summit 2018, public preview of a new feature in Azure SQL Database and SQL Server 2019 called Accelerated Database Recovery (ADR) was announced. This changes the way that SQL Server handles recovery of a SQL Server instance on start up.

This looks really good for large databases, where recovery can sometimes be measured in hours.

Azure Data Studio November Release

Published 2018-11-09 by Kevin Feasel

Alan Yu announces this month’s Azure Data Studio update:

In November’s version of the monthly release blog, the emphasis was on fixing customer issues and adding and improving existing extensions.

This includes:

Updates to the SQL Server 2019 Preview extension
Introducing the Paste the Plan extension
Introducing the High Color Queries extension
Improved Logging support
Bug fixes

Read on for the details. This product is getting closer and closer to a state where it can be a daily driver.
