Curated SQL – Page 250 – A Fine Slice Of SQL Server

Choosing between Add-Type and New-Object

Published 2024-08-12 by Kevin Feasel

Patrick Gruenauer contrasts two options in PowerShell:

Predefined .NET classes: PowerShell makes certain predefined .NET classes directly available without you having to load them with “Add-Type”.

You can simply use “New-Object” to create instances of these classes. This includes many commonly used classes such as “System.String”, “System.DateTime”, and “System.IO.FileInfo”.

Read on for a few examples of this, as well as when you would want to use Add-Type instead.


Using AI Skills as Cell Magics in Microsoft Fabric Notebooks

Published 2024-08-09 by Kevin Feasel

Sandeep Pawar takes a look at a new preview capability:

The public preview of AI Skills in Microsoft Fabric was announced yesterday. AI Skills allows Fabric developers to create their own GenAI experience using data in the lakehouse. Unlike Copilot, which is an AI assistant, AI Skills lets users build a validated Q&A application that queries lakehouse data by converting natural language questions into T-SQL queries. It’s only available in paid F64+ SKUs. You can watch the below video for Copilot, AI Skills and Gen AI experiences in Fabric:

Read on for more details on how it works.


Finding String Patterns in R

Published 2024-08-09 by Kevin Feasel

Steven Sanderson goes looking for patterns:

Welcome to another exciting blog post where we walk into the world of R programming. Today, we’re going to explore how to check if a string contains specific characters using three different approaches: base R, stringr, and stringi. Whether you’re a beginner or an experienced R user, this guide should be of some use and provide you with some practical examples.

Read on for those three examples.


Building Real-Time Dashboards from Lakehouse Data in Microsoft Fabric

Published 2024-08-09 by Kevin Feasel

Dennes Torres gets around a limitation:

Real-Time dashboards are a great feature in the Real-Time Intelligence experience for monitoring our data. However, by default they are made to work only with Kusto Databases. The options to create a real-time dashboard or to define its data source only accept Kusto Databases.

What if we would like to see in real time the information we have in a lakehouse as well? Let’s discover a solution for this.

Read on for the solution.


Creating a Month Calendar in Power BI with the Matrix Visual

Published 2024-08-09 by Kevin Feasel

Erik Svensen knows what day it is:

Last week I posted on LinkedIn an example of how we can utilize the matrix as a month calendar slicer.

It got a lot of attention, so why not share an example file where you can see how I built it?

Erik has put the sample file in a GitHub repo, so click through to check that out.


The Internals of DATETIME2

Published 2024-08-09 by Kevin Feasel

Chad Baldwin digs in:

I noticed in sys.column_store_segments the min_data_id and max_data_id columns store very large bigint values in the segments for datetime2 columns. After doing a bit more googling and tinkering, I found for bit/tinyint/smallint/int/bigint it stores the min/max of the actual values rather than dictionary lookup values. So I assume it’s likely doing the same for date/time/datetime/datetime2 and storing some sort of bigint representation of the actual value.

This post is going to focus on datetime2(7) datatypes mainly because that’s what I was dealing with. Though I’m sure it wouldn’t be much work to figure out the other types.

Click through to learn more about the datatype and see how this wraps into a discussion of temporal table cleanup and columnstore indexes.
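To make the idea of "some sort of bigint representation" concrete, here is a minimal sketch in Python. It is an assumption for illustration only, not the encoding SQL Server uses in columnstore segments (the linked post covers that): it counts 100-nanosecond ticks since 0001-01-01, which are the documented precision and lower bound of datetime2(7). The names to_ticks and from_ticks are hypothetical.

```python
from datetime import datetime, timedelta

# datetime2(7) resolves to 100 ns, so one second holds 10,000,000 ticks.
TICKS_PER_SECOND = 10_000_000
TICKS_PER_DAY = 24 * 60 * 60 * TICKS_PER_SECOND


def to_ticks(value: datetime) -> int:
    """Hypothetical encoding: 100 ns ticks elapsed since 0001-01-01 00:00:00."""
    days = value.toordinal() - 1  # toordinal() is 1 for 0001-01-01
    seconds = value.hour * 3600 + value.minute * 60 + value.second
    ticks = seconds * TICKS_PER_SECOND + value.microsecond * 10
    return days * TICKS_PER_DAY + ticks


def from_ticks(ticks: int) -> datetime:
    """Reverse the hypothetical encoding (Python keeps microseconds only)."""
    days, remainder = divmod(ticks, TICKS_PER_DAY)
    seconds, fraction = divmod(remainder, TICKS_PER_SECOND)
    return datetime.fromordinal(days + 1) + timedelta(
        seconds=seconds, microseconds=fraction // 10
    )


sample = datetime(2024, 8, 9, 12, 34, 56, 789012)
encoded = to_ticks(sample)
print(encoded, from_ticks(encoded) == sample)
```

An encoding like this preserves ordering, so a minimum and maximum per segment remain meaningful, and even the largest supported date fits comfortably in a signed 64-bit integer.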


Blob Storage Account Lifecycle Maintenance

Published 2024-08-09 by Kevin Feasel

Andy Brownsword deletes some files but wants to keep other files:

A hierarchy of directories which contain files. That’s how we typically think about file storage. That’s not quite the same everywhere. In Blob Storage a file can appear to be in a directory, but when it’s removed so is the directory.

This can occur when using Lifecycle Management to help purge legacy blobs, which can be unexpected. Let’s look at a way we can help remediate this.

One important thing to remember about Azure blob storage accounts and S3 buckets is that there’s really no concept of a directory structure. It’s all keys, where your key might be dir1/dir2/dir3/file.txt. This is a bit different for Azure Data Lake Storage Gen2 and its notion of hierarchical namespaces (i.e., folders). But Andy does walk through some of the consequences of this and how to work with lifecycle maintenance policies to delete only certain sets of files.
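The "it's all keys" point can be sketched with a toy model in Python. This is not the Azure or S3 SDK; the helper names virtual_directories and delete_by_prefix are hypothetical, and the second key is made up for illustration. Directories here are nothing more than prefixes inferred from the keys, so removing the last blob under a prefix makes the "directory" vanish too.

```python
# Toy model of a flat blob namespace: every blob is just a key, and
# "directories" are inferred from the "/" separators inside those keys.

def virtual_directories(keys):
    """Return the set of directory-like prefixes implied by the blob keys."""
    dirs = set()
    for key in keys:
        parts = key.split("/")[:-1]  # drop the file name
        for depth in range(1, len(parts) + 1):
            dirs.add("/".join(parts[:depth]) + "/")
    return dirs


def delete_by_prefix(keys, prefix):
    """Hypothetical purge rule: drop every blob whose key starts with prefix."""
    return {key for key in keys if not key.startswith(prefix)}


blobs = {
    "dir1/dir2/dir3/file.txt",
    "dir1/keep/report.csv",  # made-up second blob that should survive
}

before = virtual_directories(blobs)
blobs = delete_by_prefix(blobs, "dir1/dir2/")
after = virtual_directories(blobs)

# The purged prefixes disappear because no remaining key implies them.
print(sorted(before - after))  # ['dir1/dir2/', 'dir1/dir2/dir3/']
```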


Chat with Your Own Data in Streamlit and Azure Open AI

Published 2024-08-07 by Kevin Feasel

I have a new video:

In this video, I show how we can make a GPT-4 deployment aware of our own custom data, without needing to fine-tune the model. I talk about meta prompts and the Retrieval Augmented Generation (RAG) pattern, and then show how you can set this up using Azure AI Search and Azure OpenAI. Then, I bring it back to Streamlit and give users the option between chatting with a generic GPT-4 deployment and chatting over custom data.

I try to make my videos 10 minutes in length. They usually end up at 15-18 minutes. This one clocks in at more than 30 minutes and there’s very little fluff.
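For readers new to the Retrieval Augmented Generation pattern, here is a minimal sketch of its shape in Python. It is a toy under stated assumptions: retrieval is naive word overlap standing in for a search service, no model is called, and the function names, sample documents, and prompt wording are all hypothetical rather than taken from the video.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into a set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve(question, documents, top_k=1):
    """Rank documents by word overlap with the question (stand-in for search)."""
    question_words = tokenize(question)
    ranked = sorted(
        documents,
        key=lambda doc: len(question_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question, documents, top_k=1):
    """Augment the user question with retrieved context before generation."""
    context = "\n".join(retrieve(question, documents, top_k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Made-up custom data for illustration.
documents = [
    "The cafeteria opens at 8 AM on weekdays.",
    "Parking permits renew every January.",
]

prompt = build_prompt("When does the cafeteria open?", documents)
print(prompt)
```

The resulting prompt is what would be sent to the deployed model, which is how the model becomes aware of custom data without fine-tuning.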


Tips for Hyperparameter Tuning

Published 2024-08-07 by Kevin Feasel

Bala Priya C shares some tips and techniques:

If you’re familiar with machine learning, you know that the training process allows the model to learn the optimal values for the parameters—or model coefficients—that characterize it. But machine learning models also have a set of hyperparameters whose values you should specify when training the model. So how do you find the optimal values for these hyperparameters?

You can use hyperparameter tuning to find the best values for the hyperparameters. By systematically adjusting hyperparameters, you can optimize your models to achieve the best possible results.

This tutorial provides practical tips for effective hyperparameter tuning—starting from building a baseline model to using advanced techniques like Bayesian optimization. Whether you’re new to hyperparameter tuning or looking to refine your approach, these tips will help you build better machine learning models. Let’s get started.

Read on for those techniques. Incidentally, one of my “Old man yells at clouds” takes is that I dislike the existence of hyperparameters and consider them a modeling failure, essentially telling the implementer to do part of the researcher’s work. Knowing that they are necessary to work with for so many algorithms, there’s nothing to do but learn how to work with them effectively, but there’s a feel of outsourcing the hard work to users that I don’t like about the process. For that reason, I have extra respect for algorithms that neither need nor offer hyperparameters.
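As a small illustration of the baseline-then-tune workflow, here is a sketch in pure Python. It uses plain grid search, not the Bayesian optimization mentioned in the excerpt, on a one-parameter linear model with an L2 penalty. The data, the grid of penalty values, and the function names are all made up for the example.

```python
def fit_slope(xs, ys, penalty):
    """Closed-form slope for y = w * x with an L2 penalty on w."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + penalty)


def mean_squared_error(xs, ys, slope):
    """Average squared prediction error for the fitted slope."""
    return sum((slope * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Made-up training and validation data.
train_x, train_y = [1, 2, 3, 4], [2.2, 3.9, 6.1, 8.3]
valid_x, valid_y = [5, 6], [10.0, 12.0]

# Baseline: no penalty at all. The model learns the slope (a parameter),
# while the penalty is a hyperparameter we have to choose ourselves.
baseline_slope = fit_slope(train_x, train_y, penalty=0.0)
baseline_score = mean_squared_error(valid_x, valid_y, baseline_slope)

# Grid search: try each candidate penalty and keep the best validation score.
grid = [0.0, 0.1, 0.5, 1.0, 5.0]
results = {
    penalty: mean_squared_error(
        valid_x, valid_y, fit_slope(train_x, train_y, penalty)
    )
    for penalty in grid
}
best_penalty = min(results, key=results.get)

print(f"baseline MSE: {baseline_score:.4f}")
print(f"best penalty: {best_penalty}, MSE: {results[best_penalty]:.4f}")
```

Scoring on held-out validation data rather than the training data is the design choice that matters here; otherwise the search would always favor the least constrained model.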


Managed Private Endpoints and Trusted Workspace Access for All

Published 2024-08-07 by Kevin Feasel

Wolfgang Strasser is very pleased with a recent announcement:

In times of data breaches and millions of customer entries breached, the security of your data platform is one of the things you need to consider upfront, preferably in all your data solutions.

When Microsoft Fabric was announced, connecting to other parts of your already secured data platform in Azure was not possible. The options to (securely) connect Fabric to other parts of your Azure platform were not available initially.

Read on to learn more about Managed Private Endpoints and Trusted Workspace Access, the initial problem with them both, and how Microsoft has definitely improved things recently.


