Day: September 4, 2024

Sampling without Replacement and Unequal Probabilities

Peter Ellis finds interesting results with sampling in R:

A week ago I was surprised to read on Thomas Lumley’s Biased and Inefficient blog that when using R’s sample() function without replacement and with unequal probabilities of individual units being sampled:

“What R currently has is sequential sampling: if you give it a set of priorities w it will sample an element with probability proportional to w from the population, remove it from the population, then sample with probability proportional to w from the remaining elements, and so on. This is useful, but a lot of people don’t realise that the probability of element i being sampled is not proportional to w_i”

Read on for a demonstration. H/T R-Bloggers.
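
To get a feel for the effect before clicking through, here is a minimal Python sketch (not Peter's R code, and not how R implements sample() internally) that enumerates every ordered draw under the sequential scheme described in the quote and adds up the exact chance that each element ends up in the sample. The weights 1, 2, 3 and the sample size of 2 are made up for illustration.

```python
from fractions import Fraction
from itertools import permutations


def inclusion_probabilities(weights, n):
    """Exact inclusion probabilities for sequential weighted sampling
    without replacement: draw proportional to weight, remove, repeat."""
    totals = [Fraction(0)] * len(weights)
    # Every ordered sample of size n is one possible sequence of draws.
    for order in permutations(range(len(weights)), n):
        remaining = sum(weights)
        p = Fraction(1)
        for i in order:
            p *= Fraction(weights[i], remaining)  # chance of drawing i next
            remaining -= weights[i]               # i leaves the population
        for i in order:
            totals[i] += p
    return totals


weights = [1, 2, 3]  # illustrative weights, not from the post
n = 2
probs = inclusion_probabilities(weights, n)
# What "proportional to w_i" would require: n * w_i / sum(w)
targets = [Fraction(n * w, sum(weights)) for w in weights]

for w, p, t in zip(weights, probs, targets):
    print(f"w={w}: sequential={p}, proportional target={t}")
```

With these toy numbers, the lightest element is included more often than a proportional scheme would allow (5/12 versus 1/3) and the heaviest less often (17/20 versus 1), which is the gap the quote is warning about.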

Building an App to Use Fabric AI Skills Locally

Sandeep Pawar takes us on-premises:

If you are a regular reader of this blog, you probably know I have been testing Fabric AI Skills extensively. I have written three blogs so far on various ways the AI Skills endpoint can be used. The feature is still in preview but I am excited to see how it can be used to create new solutions as it matures.

I was curious to test if the AI Skills endpoint can be used locally and in other applications. This will open many opportunities to integrate it in different tools, inside and outside of Fabric ecosystem. So, I built an app using Gradio to make API calls to the endpoint and show the results in a local browser along with interactive plots.

Click through for a link to the code and some instructions on how to build it yourself.

Core Concepts of Vector Databases

Brendan Tierney continues a series on vector databases:

In this post on Vector Databases, I’ll look at the main components:

  • Vector Embedding Models. What they do and what they create.
  • Vectors. What they represent, and why they have different sizes.
  • Vector Search. An overview of what a Vector Search will do. A more detailed version of this is in a separate post.
  • Vector Search Process. It’s a multi-step process and some care is needed.

Read on for more about these terms and ideas.
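
As a rough illustration of the vector search idea in that list, here is a small Python sketch that ranks stored vectors by cosine similarity to a query vector. It is a brute-force toy, not how any particular vector database does it: the three-dimensional "embeddings" are invented for the example, where a real embedding model would produce much longer vectors, and real systems typically lean on approximate indexes rather than scanning everything.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def vector_search(query, store, k=2):
    """Return the keys of the k stored vectors most similar to the query."""
    scored = [(cosine_similarity(query, vec), key) for key, vec in store.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [key for _, key in scored[:k]]


# Hypothetical toy embeddings, made up for this sketch.
store = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.2, 0.1],
    "car": [0.0, 0.1, 0.9],
}
query = [1.0, 0.0, 0.0]

nearest = vector_search(query, store, k=2)
print(nearest)
```

The query points almost straight along the first axis, so the two animal vectors rank ahead of "car", whose weight sits almost entirely on the third axis.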

Connecting Snowflake to Microsoft Fabric

Stephanie Bruno makes a connection:

If you’re new to Snowflake and you need to mirror a Snowflake database in Microsoft Fabric, where do you begin? The steps are straightforward enough, but when trying something new, I often get tripped up by the basics. In this case, the configuration screen for mirroring. The documentation tells us to simply enter the server and warehouse, and provides some helpful information on where to find the details, but I prefer step by step instructions with pictures. If you do, too, then this post is for you.

Click through for a walkthrough.

Missing Columns in the Extended Events Live Data Explorer

Grant Fritchey explains a UI oddity:

Let me be extremely clear up front, this is not my original work. I saw this post on and I wanted to share and promote it. Nice work FevziKartal.

The rest of this post is just me replicating work already done by others. I just want to see it in action.

Read on for the example and what happens when you don’t have any events in the live data explorer.

Thoughts on T-SQL Snapshot Backups

Anthony Nocentino has a two-parter for us. First up is an overview of how T-SQL snapshot backups work:

Traditional SQL Server backups can struggle with large databases, resulting in longer backup times and resource contention. T-SQL Snapshot Backup, a new feature in SQL Server 2022, addresses these challenges by allowing storage-based snapshots to be coordinated through T-SQL. This feature delivers faster, more efficient backups, especially for large-scale environments with the most aggressive of recovery objectives.

The follow-up is for point-in-time recovery from a T-SQL snapshot backup:

In this post, the second in our series, I will guide you through using the new T-SQL Snapshot Backup feature in SQL Server 2022 to take a snapshot backup and perform point-in-time database restores using a snapshot backup as the base of the restore. We will explore how to manage storage-level operations, such as cloning snapshots and executing an instantaneous point-in-time restore of a database from the snapshot with minimal impact on your infrastructure. Additionally, I will demonstrate a PowerShell script that utilizes dbatools and the PureStoragePowerShellSDK2 modules to automate the process.

Check out both posts and be on the lookout for subsequent entries in the series.
