Day: August 9, 2024

Using AI Skills as Cell Magics in Microsoft Fabric Notebooks

Sandeep Pawar takes a look at a new preview capability:

The public preview of AI Skills in Microsoft Fabric was announced yesterday. AI Skills allows Fabric developers to create their own GenAI experience using data in the lakehouse. Unlike Copilot, which is an AI assistant, AI Skills lets users build a validated Q&A application that queries lakehouse data by converting natural language questions into T-SQL queries. It’s only available in paid F64+ SKUs. You can watch the below video for Copilot, AI Skills and Gen AI experiences in Fabric:

Read on for more details on how it works.

Finding String Patterns in R

Steven Sanderson goes looking for patterns:

Welcome to another exciting blog post where we walk into the world of R programming. Today, we’re going to explore how to check if a string contains specific characters using three different approaches: base R, stringr, and stringi. Whether you’re a beginner or an experienced R user, this guide should be of some use and provide you with some practical examples.

Read on for those three examples.

Building Real-Time Dashboards from Lakehouse Data in Microsoft Fabric

Dennes Torres gets around a limitation:

Real-Time dashboards are a great feature in the Real-Time Intelligence experience to monitor our data. However, by default it’s made to work only with Kusto Databases. The options to create a real-time dashboard or to define its data source only accept Kusto Databases.

What if we would like to see in real time the information we have in a lakehouse as well? Let’s discover a solution for this.

Read on for the solution.

The Internals of DATETIME2

Chad Baldwin digs in:

I noticed in sys.column_store_segments the min_data_id and max_data_id columns store very large bigint values in the segments for datetime2 columns. After doing a bit more googling and tinkering, I found for bit/tinyint/smallint/int/bigint it stores the min/max of the actual values rather than dictionary lookup values. So I assume it’s likely doing the same for date/time/datetime/datetime2 and storing some sort of bigint representation of the actual value.

This post is going to focus on datetime2(7) datatypes mainly because that’s what I was dealing with. Though I’m sure it wouldn’t be much work to figure out the other types.

Click through to learn more about the datatype and see how this wraps into a discussion of temporal table cleanup and columnstore indexes.

Blob Storage Account Lifecycle Maintenance

Andy Brownsword deletes some files but wants to keep others:

A hierarchy of directories which contain files. That’s how we typically think about file storage. That’s not quite the same everywhere. In Blob Storage a file can appear to be in a directory, but when it’s removed so is the directory.

This can occur when using Lifecycle Management to help purge legacy blobs, which can be unexpected. Let’s look at a way we can help remediate this.

One important thing to remember about Azure blob storage accounts and S3 buckets is that there’s really no concept of a directory structure. It’s all keys, where your key might be dir1/dir2/dir3/file.txt. Azure Data Lake Storage Gen2 is a bit different, as its hierarchical namespace provides actual folders. Andy walks through some of the consequences of the flat model and how to work with lifecycle management policies to delete only certain sets of files.
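To make the "it's all keys" point concrete, here is a minimal Python sketch of a flat key store. It does not call any Azure or S3 API; the `list_directories` helper and the sample keys are hypothetical, made up purely for illustration. It shows that a "directory" is only something inferred from key prefixes, so it vanishes once the last key under it is deleted.

```python
def list_directories(keys, prefix="", delimiter="/"):
    """Return the virtual 'directories' directly under prefix.

    Hypothetical helper: directories are inferred from the keys alone,
    mimicking how a flat key store presents a folder-like view.
    """
    dirs = set()
    for key in keys:
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        # Only keys with a further delimiter imply a sub-"directory".
        if delimiter in rest:
            dirs.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
    return sorted(dirs)


# Sample keys (assumed for illustration); there are no directory objects.
blobs = {"dir1/dir2/dir3/file.txt", "dir1/other.txt"}

before = list_directories(blobs, "dir1/")

# Deleting the only blob under dir1/dir2/ removes the "directory" as well.
blobs.discard("dir1/dir2/dir3/file.txt")
after = list_directories(blobs, "dir1/")

print(before)
print(after)
```

With a hierarchical namespace, by contrast, a folder is an object in its own right and can exist while empty, which is why the behavior differs there.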