Month: May 2023

Tracking Configuration-Based Performance Differences in Postgres

Ryan Lambert shows off a Postgres extension:

This is my entry for PgSQL Phriday #008. It’s Saturday, so I guess this is a day late! This month’s topic, chosen by Michael from pgMustard, is on the excellent pg_stat_statements extension. When I saw Michael was the host this month I knew he’d pick a topic I would want to contribute on! Michael’s post for his own topic provides helpful queries and good reminders about changes to columns between Postgres version 12 and 13.

In this post I show one way I like using pg_stat_statements: tracking the impact of configuration changes to a specific workload. I used a contrived change to configuration to quickly make an obvious impact.

Read on for the example.
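
As a rough illustration of the idea (not necessarily how the linked post does it), the comparison can be sketched in Python. The sketch assumes two snapshots of pg_stat_statements were captured for the same workload, one before and one after a configuration change, with statistics reset (for example via pg_stat_statements_reset()) before each run. The StatementStats class, the mean_time_change helper, and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StatementStats:
    """One row of a hypothetical pg_stat_statements snapshot."""
    queryid: int
    calls: int
    total_exec_time: float  # milliseconds; this column was total_time before Postgres 13


def mean_time_change(before, after):
    """Return {queryid: percent change in mean execution time per call}.

    Only statements present in both snapshots with at least one call are compared.
    """
    before_by_id = {row.queryid: row for row in before}
    changes = {}
    for row in after:
        old = before_by_id.get(row.queryid)
        if old is None or old.calls == 0 or row.calls == 0:
            continue
        old_mean = old.total_exec_time / old.calls
        new_mean = row.total_exec_time / row.calls
        if old_mean == 0:
            continue  # avoid dividing by zero for statements with no measured time
        changes[row.queryid] = (new_mean - old_mean) / old_mean * 100.0
    return changes


# Made-up numbers, purely for illustration.
baseline = [StatementStats(101, 50, 500.0), StatementStats(102, 10, 40.0)]
after_change = [StatementStats(101, 50, 1500.0), StatementStats(102, 10, 40.0)]

changes = mean_time_change(baseline, after_change)
for queryid, pct in sorted(changes.items()):
    print(f"queryid {queryid}: {pct:+.1f}% mean execution time")
```

Comparing per-call means rather than totals keeps the result meaningful even if the two runs executed a statement a different number of times.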

Scanning for Startup Procedures in SQL Server

Steve Steadman reminds us of a SQL Server feature:

The Scan For Startup Procs feature in SQL Server allows you to specify a list of stored procedures that will be automatically executed whenever the database engine starts. This can be useful in certain scenarios, such as when you want to perform tasks such as restoring a database or performing maintenance tasks when the database engine starts.

“Scan for startup procs” is a configuration option in Microsoft SQL Server that determines whether the server should scan for and execute stored procedures that are marked as “startup procedures” when the server starts up.

I’ve used this to good effect in the past, but there is a fundamental problem with this approach: it’s easy to forget about these, potentially leading to a difficult search for why some action took place. If you only let sysadmins add or change startup stored procedures, then I’d consider this just as little a security risk as xp_cmdshell: if the attacker already has sysadmin, the attacker can simply enable the feature, so there’s no real value to denying yourself the capability if it makes sense in your environment.

Data Mesh Q&A

Jean-Georges Perrin hosts another Q&A:

As part of the Data Mesh Learning Community, Eric Broda invited Laveena Kewlani, Kruthika Potlapally, and me to discuss the implementation of Data Mesh at PayPal. As expected, the session went longer than scheduled, and some questions remained open. As with the previous Q&A sessions ([#1] and [#2]), here is an attempt to answer them.

Click through for the questions, as well as the answers.

Using a Map in shiny

Steven Sanderson plots a course:

The code is used to create a Shiny app that allows the user to search for a type of amenity (such as a pharmacy) in a particular city, state, and country, and then display the results on a map. Here is a step-by-step explanation of how the code works.

Click through for notes, the code, and an example of the app in operation.

Slowly-Changing Dimensions in the Serverless SQL Pool

Lilliam Leme is building a serverless warehouse:

As organizations continue to collect and store large volumes of data in their data lakes, managing this data effectively becomes increasingly important. One key aspect of this is implementing Slow Change Dimension type 2, which allows organizations to track historical data by creating multiple records for a given natural key in the dimensional tables with separate surrogate keys and/or different version numbers. In this blog post we will address the following scenario: a customer wants to implement Slow Change Dimension type 2 on top of their data lake.

For this example, we will use Serverless SQL Pool to demonstrate how this can be done. Additionally, in the next post, we will explore how the same approach can be used with Spark.

This turns out to be more work than a classic SQL Server-based solution because the serverless SQL pool is read-only, save for CETAS statements.
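
To make the Type 2 mechanics described in the excerpt concrete, here is a minimal in-memory sketch in Python. It is not the approach from the linked post: DimRow, apply_scd2, the column names, and the sample customer data are all hypothetical, and the sketch tracks history with both a surrogate key and a version number.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class DimRow:
    surrogate_key: int
    natural_key: str
    attributes: dict
    version: int
    valid_from: date
    valid_to: Optional[date] = None  # None marks the current version


def apply_scd2(dimension, natural_key, attributes, as_of):
    """Apply one incoming record to the dimension using Type 2 rules."""
    current = next(
        (r for r in dimension if r.natural_key == natural_key and r.valid_to is None),
        None,
    )
    if current is not None and current.attributes == attributes:
        return dimension  # nothing changed, so the current version stays as-is
    if current is not None:
        current.valid_to = as_of  # close out the old version
    next_key = max((r.surrogate_key for r in dimension), default=0) + 1
    next_version = current.version + 1 if current is not None else 1
    dimension.append(DimRow(next_key, natural_key, dict(attributes), next_version, as_of))
    return dimension


dim = []
apply_scd2(dim, "CUST-1", {"city": "Lisbon"}, date(2023, 1, 1))
apply_scd2(dim, "CUST-1", {"city": "Porto"}, date(2023, 5, 1))
apply_scd2(dim, "CUST-1", {"city": "Porto"}, date(2023, 5, 2))  # no change, no new row

for row in dim:
    print(row)
```

Note that the sketch closes out the old version by updating it in place, which is exactly the step a read-only engine cannot perform directly.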

Azure SQL Updates for May 2023

Anna Hoffman gives us the latest news:

Let’s start with Azure SQL Managed Instance, which had several general availability (GA) announcements in April. First, the GA of Link feature for Azure SQL Managed Instance for SQL Server 2016 and 2019 happened. This capability allows you to set up near real-time replication between a SQL Server and SQL MI. You can use this link for scale, migration, read-only workloads, etc. To learn more, review the announcement blog. The team also announced the GA of CETAS. This stands for Create External Table As Select, which essentially means you can create an external table while in parallel exporting the results of a SELECT statement. This has been a customer ask and you can learn how to take advantage of it here.

Read on to learn more about what’s new with the rest of the Azure SQL landscape, and some things happening in the community.

Adding a UTC Time Zone Indicator to a Date in SQL Server

Bill Fellows fights with the language:

It seems so easy, I was building json in SQL Server and the date format for the API specified it needed to have 3 millisecond digits and the zulu timezone signifier. Easy peasy, lemon squeezey, that is ISO8601 with time zone Z format code 127

SELECT CONVERT(char(24), GETDATE(), 127) AS waitAMinute;

Running that query yields something like 2023-05-02T10:47:18.850

Almost there but where’s my Z? Hmmm, maybe it’s because I need to put this into UTC?

SELECT CONVERT(char(24), GETUTCDATE(), 127) AS SwingAndAMiss;

Running that query yields something like 2023-05-02T15:47:18.850

It’s in UTC but still no timezone indicator.

Read on for several attempts and what finally did the trick.
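
For reference, the target format in the excerpt (three millisecond digits plus a trailing Z) can be illustrated outside SQL Server with a short Python sketch. This only demonstrates the desired output; it is not the T-SQL solution from the linked post. The to_zulu helper is hypothetical, and the UTC-5 offset is an assumption inferred from the two sample timestamps above.

```python
from datetime import datetime, timedelta, timezone


def to_zulu(moment):
    """Format an aware datetime as ISO 8601 in UTC with milliseconds and a Z suffix."""
    utc = moment.astimezone(timezone.utc)
    # isoformat(timespec="milliseconds") ends in +00:00 for UTC; swap that for Z.
    return utc.isoformat(timespec="milliseconds").replace("+00:00", "Z")


# Assumed local offset of UTC-5, matching the 10:47 and 15:47 samples in the excerpt.
local = datetime(2023, 5, 2, 10, 47, 18, 850000, tzinfo=timezone(timedelta(hours=-5)))
print(to_zulu(local))  # 2023-05-02T15:47:18.850Z
```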

CETAS in SQL Server 2022

Eric Rouach shows off a nice extension to T-SQL in SQL Server 2022:

Create External Table As Select or “CETAS” has finally become available on SQL Server with the release of the 2022 version.

After a short setup, we can create files in various formats containing any query’s result set. The created file(s) must be kept on an Azure storage solution, e.g. Azure Blob Storage.

The process also creates an external table reflecting the updated file’s content.

We’ve been able to do this in Azure Synapse Analytics dedicated and serverless SQL pools for a while, so it’s good to be able to create an external table from a SELECT query on-premises, especially considering that it’s the only way we have left to write to external sources using PolyBase.
