November 2023 – Page 4

Key Constraints in Databricks Unity Catalog

Published 2023-11-24 by Kevin Feasel

I’ve been building lakehouses using Databricks Unity catalog for a couple of clients. Overall, I like the technology, but there are a few things to get used to. This includes the fact that primary key and foreign key constraints are informational only and not enforced.

If you come from a relational database background, this unenforced constraint may bother you a bit as you may be used to enforcing it to help with referential integrity.

Read on to see what is available and why it can nonetheless be useful in some circumstances.

Comments closed

E-Mail Alerts when a Database is Offline

Published 2023-11-24 by Kevin Feasel

Volker Bachmann gets an e-mail:

As the second article in the dbatools Quickies series, I would like to introduce another mail script that sends an email when databases on the servers examined are offline.

Unfortunately, it happens every now and then that databases are briefly taken offline but then forgotten. Here we receive weekly emails with all databases that are offline and where we can then follow up. If no database is offline, we will still receive an email with a short note.

Click through for the article in English and German, as well as a dbatools script and some additional remarks from Volker.

Comments closed

T-SQL Tuesday 168 Round-Up

Published 2023-11-24 by Kevin Feasel

Steve Jones lagged a bit:

I didn’t get much of a chance to check out the posts as I was at the PASS Data Community Summit, but I came home and started to work through them.

This was the 8th one I’ve hosted, which makes sense as I’ve taken over managing the party from Adam Machanic and there have been a few places I’ve had to fill in for missing hosts. In any case, here’s the roundup. I’m going in order of the comments as I see them on the blog.

Click through for this month’s list of entrants.

Comments closed

Fast-Track Optimization and Window Functions

Published 2023-11-24 by Kevin Feasel

Hugo Kornelis digs into a performance improvement the SQL Server optimizer has for window functions:

This is part twenty-four of the plansplaining series. In the previous part, I explained the execution plans for basic window functions, with and without a window frame. Especially the latter group performed quite poorly in the examples. So let’s now look at an optimization that SQL Server can apply to most cases, that prevents this rather bad scaling.

Click through to see what this is, how it works, and when the optimizer is able to use it.

Comments closed

Failed to Update Replica Status Due to Exception 35222

Published 2023-11-24 by Kevin Feasel

Sean Gallardy troubleshoots a problem:

Read Scale Availability Groups can be pretty useful in the right places and for the right things and were a latest feature update for AGs until Contained AGs came along in 2022. Read Scale AGs don’t integrate with clustering of any type and they behave somewhat as mirroring used to where there is no real coordination of resources and it is up to the administrators to make the proper judgement calls or automate whatever possible scenarios they deem important.

Read on to see what this error means and why it’s less of a problem than it first appears.

Comments closed

Schema Validation with Kusto Databases

Published 2023-11-24 by Kevin Feasel

Vincent-Philippe Lauzon tests the schema:

Kusto allows you to very quickly get productive. You can setup an ingestion pipeline in minutes that will ingest Terabytes (TBs) of data per day.

Like any piece of code, your database schema is as good as the intent you convey when you wrote it. But over time, the intent diffuses and different priorities, authors and just plain miscommunication can diminish the quality of your code.

Read on to see what it does and the benefits it provides.

Comments closed

I/O Freeze in SQL Server 2022 T-SQL Snapshot Backups

Published 2023-11-24 by Kevin Feasel

Anthony Nocentino has a public service announcement:

SQL Server 2022 introduces a new feature to enable application-consistent snapshot backups. TSQL Snapshot Backups enable the SQL Server to control the database quiesce without external tools. Using TSQL Snapshot backups enables instantaneous restores, independent of the size of data, for a database, group, or server backups, including point-in-time recovery.

When you use this feature, it freezes I/O. You’ll see a record like this in your error log when you execute the command ALTER DATABASE TestDB1 SET SUSPEND_FOR_SNAPSHOT_BACKUP = ON. This blog post will show you that the I/O freeze is just for write operations. You can continue to read from the database during this frozen state.

Read on to understand what’s going on, what “frozen” really means, and why this is a huge improvement over the classic behavior of the volume snapshot service.

Comments closed

Practical Applications of Functional Programming Techniques in R

Published 2023-11-22 by Kevin Feasel

Anirban Shaw continues a series on functional programming:

Functional programming in R is more than just a trendy buzzword; it’s a powerful approach that can dramatically simplify and enhance your data analysis tasks. In this section, we’ll explore real-world examples of common data analysis problems solved using functional programming in R, comparing them to traditional imperative methods. We’ll also highlight the conciseness and readability of functional code, demonstrating why it’s a game-changer for data professionals.

I think there’s a lot of value in understanding the functional programming approach for a few reasons:

Functional programming makes more intuitive sense to data platform developers who are raised on “set-based” operations and thinking over columns rather than rows.
Functional programming allows for ease of parallel computation. This is why Spark heavily emphasizes functional programming and why the code base itself is written in a functional programming language.
Functional code is often terser than equivalent imperative code.
Functional code allows you to build more “bottom-up,” starting with small functions and composing them rather than thinking in a “top-down” approach of classes and interactions between classes. For people who are not full-time application developers, the ease of writing a quick function is huge, which is part of why shell scripting languages like Powershell and Bash emphasize this approach as well.

H/T R-Bloggers.

Comments closed

Exponential Regression in R

Published 2023-11-22 by Kevin Feasel

Steven Sanderson understands the power of compound interest:

Before we jump into the code, let’s quickly grasp the concept of exponential regression. In simple terms, it’s a statistical method used to model relationships where the rate of change of a variable is proportional to its current state. Think of scenarios like population growth, viral spread, or even financial investments.

Read on to see how you can perform a regression in this kind of scenario.

Comments closed

Surrogate Keys and Logical Data Models

Published 2023-11-22 by Kevin Feasel

I wrap up a series on database normalization:

This video serves as a coda, covering one topic I did not include in the main series: do surrogate keys belong in the logical data model?

The short answer is, “no.” The longer answer is, “no and here’s why.”

Comments closed

M	T	W	T	F	S	S
		1	2	3	4	5
6	7	8	9	10	11	12
13	14	15	16	17	18	19
20	21	22	23	24	25	26
27	28	29	30

Month: November 2023