The “Maintained by Kaggle” badge means that Kaggle actively maintains that dataset now and will continue to do so. This includes regular updates to descriptions and metadata, quicker response rates in discussion, and accurate current data from the source. Our goal is to create seamless workflows that allow everyone to do data science on Kaggle and be confident in the data they work with.
They have several data sets available from different open data projects for several cities, as well as NOAA and the World Bank. If you’re looking for data sets to play with, this is a good option.
My database has to be in 2019 compat mode to enable Froid, the function-inlining magic. Run the same query again, and the metrics are wildly different:
Runtime: 4 seconds
CPU time: 4 seconds
Logical reads: 3,247,991 (which still sounds bad, but bear with me)
My bias tells me that I still want to avoid scalar functions, but it’s no longer the automatic deal-killer it once was.
With the announcement of SQL Server 2019 big data clusters at Ignite, Kubernetes (often abbreviated to K8s) now stands front and center as part of Microsoft’s data platform vision. The obvious inference is that the Microsoft data platform community is going to show an increased interest in it. The post aims to provide some context around:
why container orchestration is required
how Kubernetes is architected
the basics of working with Kubernetes
and why embracing open source software should be approached in an eyes wide open manner
Kubernetes is another technology which is useful to learn and can be helpful down the line.
The Table Spool operator is one of the four spool operators that SQL Server supports. It retains a copy of all data it reads in a worktable (in tempdb) and can then later return extra copies of these rows without having to call its child operators to produce them again. These copies can be made available in the same part of the execution plans, or in another part.
Table Spool is probably the most basic of the spool operators. The Index Spool operator is very similar to it, but indexes its data to allow it to return only a subset of the stored rows. The Row Count Spool operator is optimized for specific cases where the rows to be returned are empty. And the Window Spool operator is used to support the RANGE specifications of windowing functions.
Typical use cases of a Table Spool are: to reproduce the same input multiple times without having to re-execute its child nodes (e.g. in the inner input of a Nested Loops); to make the same input available in multiple branches of an execution plan (e.g. in wide update plans); or to ensure that an original copy of the data is available after an insert, update, or delete operator changes the base data (“Halloween protection”).
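To make the "read once, replay many times" idea concrete, here is a minimal Python sketch of the concept. It is not how SQL Server implements the operator: the `TableSpool` class, the `expensive_child` generator, and the in-memory list standing in for the tempdb worktable are all assumptions made for illustration, and the sketch stores everything up front on the first request for simplicity.

```python
from typing import Callable, Iterable, Iterator, List, Optional


class TableSpool:
    """Toy model of a spool: runs its child once, then replays stored rows."""

    def __init__(self, child: Callable[[], Iterable[tuple]]) -> None:
        self._child = child  # hypothetical factory producing the child's rows
        self._worktable: Optional[List[tuple]] = None  # stands in for the tempdb worktable
        self.child_executions = 0  # how many times the child was actually run

    def rows(self) -> Iterator[tuple]:
        if self._worktable is None:
            # First request: run the child and keep a copy of every row.
            self.child_executions += 1
            self._worktable = list(self._child())
        # Every request, including the first, is served from the stored copy.
        return iter(self._worktable)


def expensive_child() -> Iterator[tuple]:
    """Hypothetical child operator; imagine a costly scan or join here."""
    for i in range(3):
        yield (i, i * i)


spool = TableSpool(expensive_child)

# Simulate the inner input of a Nested Loops join being requested three times.
passes = [list(spool.rows()) for _ in range(3)]
```

All three passes see the same rows, but `spool.child_executions` stays at 1, which is the whole point of the first use case above: the child subtree is not re-executed for each outer row.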
Click through for a great deal more detail.
“Any sufficiently advanced technology is indistinguishable from magic.” -Arthur C. Clarke
In this morning’s keynote session at PASS Summit 2018, the public preview of a new feature in Azure SQL Database and SQL Server 2019 called Accelerated Database Recovery (ADR) was announced. This changes the way that SQL Server handles recovery of a SQL Server instance on startup.
This looks really good for large databases, where recovery can sometimes be measured in hours.
In November’s version of the monthly release blog, the emphasis was on fixing customer issues and adding and improving existing extensions.
Introducing the Paste the Plan extension
Introducing the High Color Queries extension
Improved Logging support
Read on for the details. This product is getting closer and closer to a state where it can be a daily driver.