Author: Kevin Feasel

Heap-Only Tuples in Postgres

Umair Shahid explains the benefit of Heap-Only Tuples in PostgreSQL:

Heap-only tuples, also known as HOT, are PostgreSQL’s answer to the update query performance issues caused by MVCC. These tuples allow PostgreSQL to mark a row as “dead” and physically reuse the space it occupies in the table. This process eliminates the need to keep multiple versions of the same row, reducing I/O and improving query performance.

Read on to see how these compare to the normal MVCC process in Postgres, as well as cases when you should and should not use them.
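
To make the idea concrete, here is a toy sketch in Python of when an update can take the HOT path. It is not PostgreSQL's implementation. It only models the two conditions the PostgreSQL documentation gives for a HOT update: the update must not change any indexed column, and the page holding the old row version must have room for the new one. The names ToyTable, page_capacity, overflow, and index_entries are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTable:
    """Toy model (an assumption, not PostgreSQL code): one heap page, one index."""

    indexed_column: str = "email"                  # the only indexed column
    page_capacity: int = 4                         # row versions that fit on the page
    page: list = field(default_factory=list)       # row versions stored on the page
    overflow: list = field(default_factory=list)   # versions that landed elsewhere
    index_entries: int = 0                         # index entries written so far

    def insert(self, row: dict) -> None:
        # A brand-new row always needs its own index entry.
        self.page.append(dict(row))
        self.index_entries += 1

    def update(self, slot: int, changes: dict) -> str:
        old = self.page[slot]
        new = {**old, **changes}
        index_key_changed = new[self.indexed_column] != old[self.indexed_column]
        fits_on_page = len(self.page) < self.page_capacity

        # MVCC: the new version is written next to the old one, never over it.
        if fits_on_page:
            self.page.append(new)
        else:
            self.overflow.append(new)

        if fits_on_page and not index_key_changed:
            # HOT path: the old version chains to the new one on the same page,
            # so no new index entry is written.
            return "HOT"

        # Regular path: the index must gain an entry for the new version.
        self.index_entries += 1
        return "regular"


if __name__ == "__main__":
    table = ToyTable()
    table.insert({"id": 1, "email": "a@example.com", "visits": 0})
    print(table.update(0, {"visits": 1}))               # non-indexed column, page has room
    print(table.update(1, {"email": "b@example.com"}))  # indexed column changed
    print(table.index_entries)
```

The point of the sketch is the index_entries counter: updates on the HOT path leave it unchanged, which is where the saved index maintenance comes from. Real PostgreSQL also prunes dead versions within a page, which this model leaves out.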

An Overview of Microsoft Fabric Domains

Reza Rad provides an overview:

Microsoft Fabric introduced a new concept called Domains. Domains are more than just a separation of Fabric data items. They come with a whole lot of security, administration, and governance features, which brings the concept of data mesh into the world of data analytics using Microsoft Fabric. Domains are logical categorizations inside the OneLake. In this article and video, I will explain what domains are in Microsoft Fabric, why they are important, and their associated features and configurations.

Click through for both a video on the topic and a lengthy article.

CI/CD for Synapse Serverless SQL Pool with SqlPackage and Azure DevOps

Rui Cunha has a tutorial for us:

Azure Synapse Analytics Serverless SQL is a query service mostly used over the data in your data lake, for data discovery, transformation, and exploration purposes. It is, therefore, normal to find in a Synapse Serverless SQL pool many objects referencing external locations, using disparate external data sources, authentication mechanisms, file formats, etc. In the context of CI/CD, where automated processes are responsible for propagating the database code across environments, one can take advantage of database-oriented tools like SSDT and the SqlPackage CLI, ensuring that this code conforms to the targeted resources.

In this article I will demonstrate how you can take advantage of these tools when implementing CI/CD for the Azure Synapse Serverless SQL engine. We will leverage SQL projects in SSDT to define our objects and implement deploy-time variables (SQLCMD variables). Through CI/CD pipelines, we will build the SQL project to a dacpac artifact, which enables us to deploy the database objects one or many times with automation.

Click through for the demonstration.

The Search for Extended Events Information

Grant Fritchey stays on the first page:

Here’s their paraphrased (probably badly) story:

“I was working with an organization just a few weeks back. They found that Trace was truncating the text on some queries they were trying to track. I asked them if they had tried using Extended Events. They responded: What’s that? After explaining it to them, they went away for an hour or so and came back to me saying that had fixed the problem.”

We all smiled and chuckled. But then it struck me. This wasn’t a case of someone who simply had a lot more experience and understanding of Profiler/Trace, so they preferred to use it. They had literally never heard of Extended Events.

Why?

This led Grant to perform some search engine shenanigans, and what he found was curious. A few points about search engines, though:

  • Search engine results will differ based on your location (IP address) and whether you are signed in or not. Google is particularly selective about this stuff. It might also affect Bing, but let’s face it: if you’re using Bing to search for anything other than images, you’ve already resigned yourself to failure.
  • In my case, a search for “extended events” (without quotation marks) did show quite a few pages which I’d consider reasonable for the topic: a Microsoft Learn quickstart article on using extended events, Brent Ozar’s extended events material, a SQL Shack article on the topic, etc. A good number of these links are content from the past 5 years, as well.
  • Grant mentions the “page 1” effect in search engines, and he’s absolutely right. The vast majority of people performing a search never leave the first page of results. This is part of why Google went to an infinite scrolling approach rather than showing explicit numbered pages.

Initial Thoughts on the Microsoft Fabric Data Science Experience

Tori Tompkins shares some thoughts:

Fabric is Microsoft’s recently announced SaaS all-in-one analytics platform. It brings together Azure Data Factory, Azure Synapse Analytics and Power BI into a single cohesive platform without the overhead of setting up resources, maintenance, and configuration. Fabric wouldn’t be an end-to-end data analytics platform without data science, so in this blog we will explore the data science and machine learning capabilities of Microsoft Fabric and assess where the platform fits in the competitive data science landscape.

Click through for Tori’s overview of where Fabric does a good job in its preview and where it currently falls short.

Trying out Azure Geo-Replication

Etienne Lopes continues a series on Azure SQL DB HA/DR:

So, first of all, what is Active Geo-Replication?

“Active geo-replication is a feature that lets you create a continuously synchronized readable secondary database for a primary database. The readable secondary database may be in the same Azure region as the primary, or, more commonly, in a different region. This kind of readable secondary database is also known as a geo-secondary or geo-replica.”

Read on to learn more about the topic, including how to set it up and ways to try it out.

Error Trapping with Extended Events

Chad Callihan wants to know what’s going wrong:

Many times, when I’m using Extended Events, I’m filtering on an Id or procedure name in query text and tracking what’s hitting a database. In these cases, I’m assuming queries are going to complete successfully. Did you know Extended Events can be useful when you’re expecting things to go wrong? Let’s look at how we can use Extended Events to track down error messages.

You’ll get a fair amount of noise on a busy server (especially dev servers where people are writing queries and trying things out), but this is a great technique for seeing if something has gone wrong and we need to look into it before a customer reaches out.

Thoughts on Testing Stored Procedures

Erik Darling has a plan:

While most of it isn’t a security concern, though it may be if you’re using Row Level Security, Dynamic Data Masking, or Encrypted Columns, you should try executing it as other users to make sure access is correct.

When I’m writing stored procedures for myself or for clients, here’s what I do.

Click through for Erik’s guidance. The premise is couched in security testing, though much of this is functionality and performance testing. Regardless, it’s a good plan.

Generating Tables from Files in Microsoft Fabric via Notebook

Dennes Torres performs a bit of ELT:

When Microsoft Fabric was born, the only method to convert files to tables was using notebooks. Nowadays we have an easy-to-use UI feature for the conversion.

As I explained in the article about lakehouse and ETL, there are some scenarios where we still need to use notebooks for the conversion. One of these scenarios is when we need table partitioning.

Let’s go step by step through how to use notebooks and table partitioning.

Click through to see how it all works.
