Press "Enter" to skip to content

Curated SQL Posts

Moving SQL Server Data File Locations on Linux

Nisarg Upadhyay wants to move files around in SQL Server on Linux:

In this article, I am going to explain how we can move SQL Server database files to another location on Ubuntu 18.04. For the demonstration, I have installed Ubuntu 18.04 and SQL Server 2019 on Linux on my workstation. You can read SQL Server 2019 on Linux with Ubuntu to understand the step-by-step installation process of SQL Server 2019 on Linux. We will move the database files of the AdventureWorks2019 and WideWorldImporters databases.

Click through for the process. It’s really similar to Windows in this respect. And, well, in most respects.
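
The general shape of the move, for anyone who wants a preview, is a metadata change plus an OS-level file move. This is only a rough sketch (the logical file name and target directory below are hypothetical; the article walks through the real paths on Ubuntu):

-- Point SQL Server at the new location (metadata change only).
ALTER DATABASE AdventureWorks2019
    MODIFY FILE (NAME = AdventureWorks2019, FILENAME = '/var/opt/mssql/userdata/AdventureWorks2019.mdf');

-- Take the database offline before moving the physical file.
ALTER DATABASE AdventureWorks2019 SET OFFLINE;

-- At the shell: move the file and keep the mssql account as owner, e.g.
--   sudo mv /var/opt/mssql/data/AdventureWorks2019.mdf /var/opt/mssql/userdata/
--   sudo chown mssql:mssql /var/opt/mssql/userdata/AdventureWorks2019.mdf

-- Bring it back online; sys.master_files should now report the new path.
ALTER DATABASE AdventureWorks2019 SET ONLINE;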

Number of Rows Automatically Sampled versus Table Size

Matthew McGiffen does the math:

I mentioned in my previous post about manually updating statistics that you can specify whether they’re updated using a full scan, or you can specify an amount of data to sample, either a percentage of the table size, or a fixed number of rows. You can also choose not to specify this, and SQL Server will decide for you whether to do a full scan, or to sample a certain amount of data.

I thought it would be interesting to look at the sample sizes SQL Server will choose to use, depending on the amount of data in your table.

Click through for the result of Matthew’s analysis.
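
For reference, the options Matthew describes look like this, and sys.dm_db_stats_properties reports how many rows actually got read (the table and statistic names here are placeholders):

-- Let SQL Server decide how much to sample.
UPDATE STATISTICS dbo.SomeTable SomeStat;

-- Or be explicit.
UPDATE STATISTICS dbo.SomeTable SomeStat WITH FULLSCAN;
UPDATE STATISTICS dbo.SomeTable SomeStat WITH SAMPLE 10 PERCENT;
UPDATE STATISTICS dbo.SomeTable SomeStat WITH SAMPLE 50000 ROWS;

-- Compare total rows to rows sampled for each statistic on the table.
SELECT s.name, sp.rows, sp.rows_sampled, sp.last_updated
FROM sys.stats AS s
CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
WHERE s.object_id = OBJECT_ID('dbo.SomeTable');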

Table Variable Deferred Compilation

Deepthi Goguri takes us through a fairly new feature in SQL Server:

With SQL Server 2017 and earlier versions, SQL Server always estimates one row for a table variable, as the table variable's data gets inserted at runtime, so the optimizer doesn't know how many rows to expect coming out of it. Due to this bad estimation, the performance of the queries is affected.

Click through to see how this has changed in SQL Server 2019.
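
A quick way to see the change for yourself, assuming a database you can set to compatibility level 150 and an AdventureWorks-style table to load from (names below are illustrative):

ALTER DATABASE CURRENT SET COMPATIBILITY_LEVEL = 150;  -- 140 gives the old one-row estimate
GO

DECLARE @Orders TABLE (OrderID int NOT NULL PRIMARY KEY, CustomerID int NOT NULL);

INSERT INTO @Orders (OrderID, CustomerID)
SELECT SalesOrderID, CustomerID
FROM Sales.SalesOrderHeader;

-- Under compatibility level 150, compilation of this statement is deferred until
-- @Orders has been populated, so the estimate reflects the actual row count instead of 1.
SELECT o.CustomerID, COUNT(*) AS Orders
FROM @Orders AS o
GROUP BY o.CustomerID;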

Survival Analysis Notebooks

Dan Morris, et al., walk us through a survival analysis scenario:

In contrast to other methods that may seem similar on the surface, such as linear regression, survival analysis takes censoring into account. Censoring occurs when the start and/or end of a measured value is unknown. For example, suppose our historical data includes records for the two customers below. In the case of customer A, we know the precise duration of the subscription because the customer churned in December 2020. For customer B, we know that the contract started four months ago and is still active, but we do not know how much longer they will be a customer. This is an example of right censoring because we do not yet know the end date for the measured value. Right censoring is what we most commonly see with this form of analysis.

Click through for an intro as well as a half-dozen notebooks.
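
If you're wondering how that censoring information typically reaches a model, it usually boils down to a duration plus an "event observed" flag per customer. A minimal sketch in SQL, against a hypothetical subscriptions table:

SELECT
    CustomerID,
    DATEDIFF(MONTH, SubscriptionStart, COALESCE(SubscriptionEnd, GETDATE())) AS DurationMonths,
    CASE WHEN SubscriptionEnd IS NOT NULL THEN 1 ELSE 0 END AS Churned  -- 0 = right-censored (still active)
FROM dbo.Subscriptions;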

Temporal Tables and Execution Plans

Hugo Kornelis starts a new sub-series within an existing series:

Welcome to part sixteen of the plansplaining series, the first of a few posts about how temporal tables affect execution plans. In this post, I'll build on the more generic discussion of data modifications in the previous four posts. Later posts will look at data retrieval and some specific scenarios.

Hugo hits the highlights of temporal tables and how they handle insertion, deletion, and updating scenarios.
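
If you haven't played with temporal tables yet, the basic shape is below (table names are made up); every update or delete against the current table also writes the prior row version into the history table, which is exactly what makes the modification plans more interesting:

CREATE TABLE dbo.Product
(
    ProductID int NOT NULL CONSTRAINT PK_Product PRIMARY KEY,
    Name nvarchar(100) NOT NULL,
    Price decimal(10, 2) NOT NULL,
    ValidFrom datetime2 GENERATED ALWAYS AS ROW START NOT NULL,
    ValidTo datetime2 GENERATED ALWAYS AS ROW END NOT NULL,
    PERIOD FOR SYSTEM_TIME (ValidFrom, ValidTo)
)
WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.ProductHistory));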

Avoiding Division by Zero

Chad Callihan has a few methods for us to avoid dividing by zero:

In the real world, everyone knows that if you divide by zero a wormhole will open up and swallow the universe. In SQL Server, it’s not good, but it’s not nearly as dramatic. I encountered the following error this week and thought it would make a good topic:

Msg 8134, Level 16, State 1, Line 6
Divide by zero error encountered.

There are multiple ways to handle this error message in SQL Server and some are better than others. Let’s take a look at a few.

Click through for those methods and try to keep this universe existent—it’s where I keep all my stuff.
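
Not to spoil the post, but a couple of the usual approaches look like this; which one is "right" depends on whether a NULL, a default value, or something else makes sense for your query:

DECLARE @Numerator int = 10, @Denominator int = 0;

-- NULLIF turns a zero divisor into NULL, so the division returns NULL instead of erroring.
SELECT @Numerator / NULLIF(@Denominator, 0) AS NullIfResult;

-- Wrap with ISNULL (or COALESCE) if a default such as 0 is more useful than NULL.
SELECT ISNULL(@Numerator / NULLIF(@Denominator, 0), 0) AS DefaultedResult;

-- A CASE expression spells out the same intent more verbosely.
SELECT CASE WHEN @Denominator = 0 THEN 0 ELSE @Numerator / @Denominator END AS CaseResult;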

Updates to Message Keys in ksqlDB

Victoria Xia announces an improvement to ksqlDB:

One of the most highly requested enhancements to ksqlDB is here! Apache Kafka® messages may contain data in message keys as well as message values. Until now, ksqlDB could only read limited kinds of data from the key position. ksqlDB’s latest release—ksqlDB 0.15—adds support for many more types of data in message keys, including message keys with multiple columns. Users of Confluent Cloud ksqlDB already have access to these new features, as Confluent Cloud always runs the latest release of ksqlDB.

Read on for more information on this, as well as some of the ramifications of this change.
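
For context, key columns in ksqlDB are the ones declared with KEY in a stream or table definition; what 0.15 expands is the kinds of data that can live in that position. A hypothetical stream for illustration (topic and column names invented here, and the version-specific details belong to the release notes):

CREATE STREAM orders (
    order_id BIGINT KEY,   -- read from the Kafka message key rather than the value
    item VARCHAR,
    quantity INT
) WITH (
    KAFKA_TOPIC = 'orders',
    VALUE_FORMAT = 'JSON'
);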

From SQL to Functional

Shel Burkow has the start of an interesting series:

Neither loops nor branches were used in any of the T-SQL work, but that doesn’t mean they aren’t there. They are – visible underneath in the query plans, which describe the process steps (operators and data flow) the SQL Server database engine follows to produce results. A salient example of looping is the Nested Loops operator used in the T-SQL rewrite query plan to match each outer row to an inner row, i.e. perform the INTERSECT. This style of coding, in which the logic of the computation is expressed rather than the lower-level control flow detail, is declarative programming, and is a major theme of this series. (The latter being imperative programming.)

This is a concept I’m really big on: I think functional programming languages are easier for data platform developers to pick up than object-oriented or imperative languages, because SQL and functional languages are both declarative, so many of the programming metaphors for one apply to the other.
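
To make the declarative point concrete (a generic sketch, not Shel's actual example): the query below only states that we want the IDs present in both sets, and it is the optimizer that decides whether a Nested Loops join or some other physical operator does the matching underneath.

SELECT CustomerID FROM dbo.Customers2020
INTERSECT
SELECT CustomerID FROM dbo.Customers2021;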
