Press "Enter" to skip to content

Curated SQL Posts

SQL Server Compilation Time and Storage

Kendra Little explains how storage can affect query compilation time:

Up till now, I’ve thought of compilation time in SQL Server as being dependent only on CPU resources– not something that requires fast storage to be speedy. But that’s not quite right.

Slow storage can result in periodic long compile time in SQL Server. And long compile time not only extends the runtime for the query, it can also result in blocking with waits for compile locks.

Click through for more details, as well as a video by Erik Darling on compile-time locks.

Comments closed

Choosing between Add-Type and New-Object

Patrick Gruenauer contrasts two options in Powershell:

Predefined .NET classes: PowerShell makes certain predefined .NET classes directly available without you having to load them with “Add-Type”.

You can simply use “New-Object, to create instances of these classes. This includes many commonly used classes such as “System.String”, “System.DateTime”, “System.IO.FileInfo”

Read on for a few examples of this, as well as when you would want to use Add-Type instead.

Comments closed

Using AI Skills as Cell Magics in Microsoft Fabric Notebooks

Sandeep Pawar takes a look at a new preview capability:

The public preview of AI Skills in Microsoft Fabric was announced yesterday. AI Skills allows Fabric developers to create their own GenAI experience using data in the lakehouse. Unlike Copilot, which is an AI assistant, AI Skills lets users build a validated Q&A application that queries lakehouse data by converting natural language questions into T-SQL queries. It’s only available in paid F64+ SKUs. You can watch the below video for Copilot, AI Skills and Gen AI experiences in Fabric:

Read on for more details on how it works.

Comments closed

Finding String Patterns in R

Steven Sanderson goes looking for patterns:

Welcome to another exciting blog post where we walk into the world of R programming. Today, we’re going to explore how to check if a string contains specific characters using three different approaches: base R, stringr, and stringi. Whether you’re a beginner or an experienced R user, this guide will should be of some use and provide you with some practical examples.

Read on for those three examples.

Comments closed

Building Real-Time Dashboards from Lakehouse Data in Microsoft Fabric

Dennes Torres gets around a limitation:

Real-Time dashboards are a great feature in Real Time Intelligence experience to monitor our data. However, by default it’s made to work only with Kusto Databases. The options to create a real time dashboard or to define its data source only accept Kusto Databases.

What if we would like to see in real time the information we have in a lakehouse as well? Let’s discover a solution for this.

Read on for the solution.

Comments closed

The Internals of DATETIME2

Chad Baldwin digs in:

I noticed in sys.column_store_segments the min_data_id and max_data_id columns store very large bigint values in the segments for datetime2 columns. After doing a bit more googling and tinkering, I found for bit/tinyint/smallint/int/bigint it stores the min/max of the actual values rather than dictionary lookup values. So I assume it’s likely doing the same for date/time/datetime/datetime2 and storing some sort of bigint representation of the actual value.

This post is going to focus on datetime2(7) datatypes mainly because that’s what I was dealing with. Though I’m sure it wouldn’t be much work to figure out the other types.

Click through to learn more about the datatype and see how this wraps into a discussion of temporal table cleanup and columnstore indexes.

Comments closed

Blob Storage Account Lifecycle Maintenance

Andy Brownsword deletes some files but wants to keep other files:

A hierarchy of directories which contain files. That’s how we typically think about file storage. That’s not quite the same everywhere. In Blob Storage a file can appear to be in a directory, but when it’s removed so is the directory.

This can occur when using Lifecycle Management to help purge legacy blobs, which can be unexpected. Let’s look at a way we can help remediate this.

One important thing to remember about Azure blob storage accounts and S3 buckets is that there’s really no concept of a directory structure. It’s all keys, where your key might be dir1/dir2/dir3/file.txt. This is a bit different for Azure Data Lake Storage Gen2 and its notion of hierarchical namespaces (i.e., folders). But Andy does walk through some of the consequences of this and how to work with lifecycle maintenance policies to delete only certain sets of files.

Comments closed

Chat with Your Own Data in Streamlit and Azure Open AI

I have a new video:

In this video, I show how we can make a GPT-4 deployment aware of our own custom data, without needing to fine-tune the model. I talk about meta prompts and the Retrieval Augmented Generation (RAG) pattern, and then show how you can set this up using Azure AI Search and Azure OpenAI. Then, I bring it back to Streamlit and give users the option between chatting with a generic GPT-4 deployment and chatting over custom data.

I try to make my videos 10 minutes in length. They usually end up at 15-18 minutes. This one clocks in at more than 30 minutes and there’s very little fluff.

Comments closed