Press "Enter" to skip to content

Day: April 15, 2025

choroplethr 4.0.0 Now in CRAN

Ari Lamstein has an announcement:

With this version, I have transferred the maintenance of choroplethr to Zhaochen He, an economics professor at Christopher Newport University. Zhao addressed the issues that led to choroplethr being archived from CRAN in February. Please join me in thanking Zhao for his contribution!

Click through for the updates, as well as what Ari views as the current challenges for the project as he hands the project over Zhaochen He. H/T R-Bloggers.

Leave a Comment

Vector Indexing in PostgreSQL

Hans-Jürgen Schönig builds some indexes on vector data:

In the previous post we have imported a fairly large data set containing Wikipedia data, which we downloaded using pgai. However, importing all this data is not enough because we also need to keep an eye on efficiency. Therefore, it is important to understand that indexing is the key to success.

Click through for an overview of what options are available for indexing vectors in PostgreSQL, as well as the trade-offs and disk space requirements.

Leave a Comment

Comparing Varieties of Statistics in SQL Server

Kendra Little gets the smorgasbord:

Statistics in SQL Server are simple in theory: they help the optimizer estimate how many rows a query might return.

In practice? Things get weird fast. Especially when you start filtering on multiple columns, or wondering why the optimizer thinks millions of rows are coming back when you know it’s more like a few hundred thousand.

In this post, I’ll walk through examples using single-column, multi-column, and filtered statistics—and show where estimates go off the rails, when they get back on track, and why that doesn’t always mean you need to update everything with FULLSCAN.

Read on for a review of the three types of statistics. Admittedly, I’ve never had much luck with filtered statistics improving the performance of queries. If I were to speculate, I’d say that they’re good for a very specific type of problem that maybe I just don’t run into that often.

Leave a Comment

Purview DLP Updates

Yael Biss has an announcement:

Microsoft Purview’s Data Loss Prevention (DLP) policies for Fabric now supports Fabric KQL and Mirrored DBs!

Purview DLP policies help organizations to improve their data security posture and comply with governmental and industry regulations. Security teams use DLP policies to automatically detect upload of sensitive information to Microsoft 365 applications like SharePoint and Exchange, and to Fabric’s semantic models and lakehouses.

And another one:

In today’s fast-paced data-driven world, enterprises are building more sophisticated data platforms to gain insights and drive innovation. Microsoft Fabric Lakehouses combine the scale of a data lake with the management finesse of a data warehouse – delivering unified analytics in an ever-evolving business landscape. But with great data comes great responsibility. Protecting sensitive information and ensuring regulatory compliance is paramount. That’s where Data Loss Prevention (DLP) policies with restricted access come into play.

Click through to see what this preview currently offers.

Leave a Comment

Swap-and-Drop for Partition Management

Rich Benner deals with a troublesome partition:

What are stubborn partitions in SQL Server and how do you delete them? This was an interesting issue I recently had to deal with on a client site that I thought our readers might find interesting.

The tables in use here are partitioned. The partition field is based upon a date field and we have a partition per month. There is a monthly maintenance job which creates our new partitions. The job should also delete the oldest partitions. This job has been failing to delete an old partition as the data file contained within is not empty. It’s stubborn!

If we try to remove this file we get the error “The File cannot be removed because it is not empty,” as you can see:

Read on for some diagnosis of the problem, as well as the solution Rich developed.

Leave a Comment

A Required Privilege Is Not Held by the Client

Rebecca Lewis runs into a permissions error:

I received an email from a customer yesterday regarding their Replication, which began failing with this error after Windows updates were applied:

Message Replication-Replication Transaction-Log Reader Subsystem: agent servername-xxx2 failed. Executed as user: domainname\svcaccount. A required privilege is not held by the client. The step failed.

Slightly dummied, but the important content is in red.  What does that mean?  ‘A required privilege is not held by the client’… he didn’t change anything, I didn’t change anything – why is Replication suddenly failing with permissions problems?

Click through for the answer.

Leave a Comment

Data Conversion via Generative AI

Grant Fritchey rearranges some data:

The DM-32 is a Digital Mobile Radio (DMR) as well as an analog radio. You can follow the link to understand all that DMR represents when talking radios. I want to focus on the fact that you have to program the behaviors into a DMR radio. While the end result is identical for every DMR radio, how you get there, the programming software, is radically different for every single radio (unless you get a radio that supports open source OpenGD77, yeah, playing radio involves open source as well). Which means, if I have more than one DMR radio (I’m currently at 7, and no, I don’t have a problem, shut up) I have more than one Customer Programming Software (CPS) that is completely different from other CPS formats. Now, I like to set up my radios similarly. After all, the local repeaters, my hotspot, and the Talkgroups I want to use are all common. Since every CPS is different, you can’t just export from one and import to the next. However, I had the idea of using AI for data conversion. Let’s see how that works.

Click through for the scenario as well as Grant’s results. Grant’s results were pretty successful for a data mapping operation, though choice of model and simplicity of the input and output examples are important to generate the Python code.

Leave a Comment

Using fabric-cicd with GitHub Actions

Kevin Chant doesn’t limit us to Azure DevOps:

In this post I want to show how you can operationalize fabric-cicd to work with Microsoft Fabric and GitHub Actions. Since I got asked if this post was available whilst I was helping at the ask the experts panel during the Microsoft Fabric Community Conference.

Just so that everybody is aware, fabric-cicd is a Python library that allows you to perform CI/CD of various Microsoft Fabric items into Microsoft Fabric workspaces. At this moment in time there is a limited number of supported item types. However, that list is increasing.

Click through for a high-level diagram and the process, including the code Kevin used in the GitHub Actions workflow.

Leave a Comment