2025-05-15 – Curated SQL

Extending caret for Spatial Machine Learning

Published 2025-05-15 by Kevin Feasel

This document shows how to apply caret to spatial modelling, using the prediction of air temperature in Spain as an example. We use measurements of air temperature, which are available only at specific locations in Spain, to create a spatially continuous map of air temperature. To do so, machine-learning models are trained to learn the relationship between spatially continuous predictors and air temperature.

When using machine-learning methods with spatial data, we need to account for issues such as spatial autocorrelation, as well as extrapolation when predicting for regions that are far away from the training data. Several methods have been developed to deal with these issues. In this document, we will show how to combine the machine-learning workflow of caret with packages designed for machine learning with spatial data. We use blockCV::cv_spatial() and CAST::knndm() for spatial cross-validation, and CAST::aoa() to mask areas of extrapolation. We use sf and terra for processing vector and raster data, respectively.
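To make the idea of spatial cross-validation concrete, here is a minimal Python sketch of block-based fold assignment. It illustrates the general concept only and is not how blockCV::cv_spatial() or CAST::knndm() work internally; the function names, the square grid, and the station coordinates are all assumptions made for the example.

```python
from collections import defaultdict


def spatial_block_folds(points, block_size, k):
    """Assign each (x, y) point to one of k folds by square spatial block.

    Points in the same block always share a fold, so a held-out fold is
    spatially separated from the training data instead of interleaved with it.
    """
    # Map each grid block to the indices of the points that fall inside it.
    blocks = defaultdict(list)
    for i, (x, y) in enumerate(points):
        blocks[(int(x // block_size), int(y // block_size))].append(i)

    # Deal the blocks out to folds in a fixed order so the result is reproducible.
    folds = [0] * len(points)
    for n, key in enumerate(sorted(blocks)):
        for i in blocks[key]:
            folds[i] = n % k
    return folds


def train_test_indices(folds, k):
    """Yield (train, test) index lists, one pair per fold."""
    for fold in range(k):
        test = [i for i, f in enumerate(folds) if f == fold]
        train = [i for i, f in enumerate(folds) if f != fold]
        yield train, test


# Hypothetical station coordinates in arbitrary map units.
stations = [(0.5, 0.5), (0.7, 0.2), (5.1, 0.3), (5.9, 0.8), (0.2, 5.5), (5.5, 5.5)]
folds = spatial_block_folds(stations, block_size=5.0, k=2)
splits = list(train_test_indices(folds, 2))
```

Because neighbouring stations end up in the same fold, the error measured on a held-out fold is closer to what the model would achieve in an area without nearby training data than a purely random split would suggest.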

Click through to see how it all works. H/T R-Bloggers.

Setting up Physical Streaming Replication in PostgreSQL

Published 2025-05-15 by Kevin Feasel

Umair Shahid pushes the contents of the write-ahead log to another machine:

Physical streaming replication in PostgreSQL allows you to maintain a live copy of your database on a standby server, which continuously receives updates from the primary server’s WAL (Write-Ahead Log). This standby (or hot standby) can handle read-only queries and be quickly promoted to primary in case of failover, providing high availability and disaster recovery.

In this guide, I will walk through provisioning a primary PostgreSQL 16 server and a standby server on Linux, configuring them for streaming replication, and verifying that everything works. I assume you are an experienced engineer familiar with Linux, but new to PostgreSQL replication, so I will keep it friendly and straightforward.
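One way to check that a standby is keeping up is to compare WAL positions on the two servers. The Python sketch below is an illustration rather than part of Umair's guide: it assumes you have already read an LSN from each server (for example with pg_current_wal_lsn() on the primary and pg_last_wal_replay_lsn() on the standby), and the helper names and sample values are hypothetical.

```python
def lsn_to_int(lsn: str) -> int:
    """Convert a PostgreSQL LSN such as '0/3000060' to a byte position.

    An LSN is printed as two hexadecimal numbers separated by a slash:
    the high 32 bits and the low 32 bits of a 64-bit WAL position.
    """
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def replication_lag_bytes(primary_lsn: str, standby_lsn: str) -> int:
    """Return how many bytes of WAL the standby still has to replay."""
    return lsn_to_int(primary_lsn) - lsn_to_int(standby_lsn)


# Hypothetical LSN values copied from the two servers.
lag = replication_lag_bytes("0/3000148", "0/3000060")
print(lag)
```

PostgreSQL can do the same subtraction server-side with pg_wal_lsn_diff(); the sketch only shows what that number means.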

Click through for the process.

Writing DAX Query Outputs to Lakehouse Tables

Published 2025-05-15 by Kevin Feasel

Gilbert Quevauvilliers does a bit of writing:

In this blog post I am going to explain how to use a Python notebook with the Semantic Link module to run a DAX query and write the output to a Lakehouse table.

I will show you how to install a Python library and then use it within my Python notebook.
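One small step that often comes up between running the query and writing the table is tidying column names. The Python sketch below is an assumption-laden illustration, not code from Gilbert's post: it assumes the DAX result columns arrive named like `Table[Column]` or `[Measure Name]` and that the target table is easier to work with when names are plain identifiers. The helper name is hypothetical.

```python
import re


def clean_column_name(name: str) -> str:
    """Turn a DAX-style column name into a plain identifier.

    Every run of characters other than letters, digits, and underscores
    becomes a single underscore, and leading or trailing underscores are
    dropped.
    """
    return re.sub(r"[^0-9A-Za-z_]+", "_", name).strip("_")


# Hypothetical column names of the kind a DAX query might return.
raw_columns = ["Product[Color]", "[Total Sales]", "Date[Calendar Year]"]
clean_columns = [clean_column_name(c) for c in raw_columns]
print(clean_columns)
```

In a notebook you would apply the cleaned names to the query result before saving it; the calls that run the DAX query and write the table are left out here because they only work inside a Fabric session.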

Read on for a quick primer on Semantic Link Labs, followed by the meat of the article.

Administration Tips for SQL Agent Jobs

Published 2025-05-15 by Kevin Feasel

Chad Callihan shares a pair of tips:

These aren’t the most technical SQL Server Agent job topics, but two that came to mind were managing who jobs belong to and making sure the right steps are followed when jobs are removed.

Click through for more thoughts on both of these topics.

Last Page Insert Contention

Published 2025-05-15 by Kevin Feasel

Haripriya Naidu is trying to slam a lot of transactions through the same door:

When operations wait to acquire a latch on a page, you’ll see a wait type called PAGELATCH. A latch is a lightweight lock. PAGELATCH waits typically occur in TempDB on pages like PFS, GAM, SGAM, or system object pages.

Normally, you won’t see PAGELATCH waits in user databases because user objects don’t usually experience the same level of concurrent inserts/updates/deletes as temp tables do.

But there is one case where this can happen: when many concurrent transactions try to insert into the last page of a table, they all compete for a latch on that page. This results in last page insert contention.
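A toy model in Python can show why ever-increasing keys concentrate inserts on one page. This is a deliberately simplified sketch, not SQL Server's storage engine: it assumes rows are packed in key order at a fixed number of rows per page, and the key values and function names are made up for illustration.

```python
from collections import Counter


def target_pages(keys, rows_per_page):
    """Return the page each key would land on in a toy ordered index."""
    return [key // rows_per_page for key in keys]


def hottest_page_load(keys, rows_per_page):
    """Return how many of the concurrent inserts hit the busiest page."""
    return max(Counter(target_pages(keys, rows_per_page)).values())


ROWS_PER_PAGE = 100

# Eight sessions inserting the next eight values of an ever-increasing key.
sequential_keys = list(range(1000, 1008))

# Eight sessions inserting keys scattered across the existing key range.
scattered_keys = [12, 250, 431, 587, 642, 799, 805, 963]

sequential_load = hottest_page_load(sequential_keys, ROWS_PER_PAGE)
scattered_load = hottest_page_load(scattered_keys, ROWS_PER_PAGE)
print(sequential_load, scattered_load)
```

In the sequential case all eight inserts target the same page and would have to take turns on its latch, while the scattered keys each land on a different page.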

Read on to see when this happens, as well as a demonstration of it. Haripriya then uses a bit of functionality that is available in recent versions of SQL Server to resolve the issue.

Managing SQL Agent Jobs with DBADash

Published 2025-05-15 by Kevin Feasel

David Wiseman shows off an open-source product:

For T-SQL Tuesday #186, Andy Levy asks, “How do you manage and/or monitor your SQL Server Agent jobs?”

This is a great opportunity for me to discuss how DBA Dash can help monitor SQL Agent jobs. DBA Dash is a free and open-source monitoring tool for SQL Server, created by me. It’s used to monitor thousands of SQL Server instances within Trimble alone, and it’s gaining popularity in the SQL Server community.

Read on to see how the product can help if you have a series of SQL Agent jobs.

Comparing Data Importation Modes in Fabric Semantic Models

Published 2025-05-15 by Kevin Feasel

Marco Russo has a guide:

When I presented “Choosing Between Import Mode, Direct Lake, and Composite Models” at Fabric Conf 2025 in Las Vegas, the room overflowed, and the session was not recorded. I promised to publish the material once the new Direct Lake + Import composite model became available. This post follows the structure of that (now re‑recorded) session.

I prepared a recap for this blog post, but I suggest you watch the full video!

Check out the video and Marco’s guidance.

Managing SQL Agent Jobs in a Large Environment

Published 2025-05-15 by Kevin Feasel

Steve Jones shares some tips:

I used to work in a fairly large enterprise (5,000+ people, 500+ production SQL instances) with a small staff. It was 2-3 of us to manage all these systems, as well as respond to questions/queries/issues with dev/test systems. As a result, we depended heavily on SQL Agent.

We decided on a few principles which helped us manage jobs, with a (slow) refactoring of the existing jobs people randomly created with no standards. A few of the things we did are listed below. This isn’t exhaustive, but these are the main things I remember.

Read on for Steve’s list.

Day: May 15, 2025