Kevin Feasel – Page 181

The Most Important Tool for a Data Detective

Published 2024-10-09 by Kevin Feasel

Andy Yun wants you to use your earholes:

The All Powerful…

… Question. That is what I now believe is the most important tool for a Data Detective.

Asking Questions Effectively

This nuance involves HOW you ask a question. Some of this involves knowing your audience. Is this the right place or the right time? Sometimes there comes a point where asking questions is just counter-productive because your audience has no interest in answering. And it also means you need to make sure you’re asking the correct audience in the first place.

Asking questions is difficult, so instead, I just strawman my way to success.

Comments closed

The Power of Computed Columns

Published 2024-10-09 by Kevin Feasel

Andy Brownsword speeds up a query:

Bad code smells can run through a system, and one instance can quickly spread when code is recycled. Simon recently looked at a non-sargable example and was able to fix it by using an index and refactoring the query.

I wanted to consider an alternative approach if we saw the same issue repeated consistently. We don’t want to fix each instance, we want a single fix. We’ll solve this with indexed computed column.

We can index computed columns to help resolve deterministic (i.e. won’t change, no GETDATE() allowed) clauses. Let’s get started.

Read on to learn more. This is a powerful approach to the problem of needing to perform some sort of data transformation before filtering your data.

Comments closed

Building a Data Detective Toolkit

Published 2024-10-09 by Kevin Feasel

Deb Melkin talks tools:

Happy T-SQL Tuesday! I wasn’t really sure I’d be able to crank something out for this one but somehow I managed to squeeze it in. Tim Mitchell ( b ) is hosting and he has a great topic for us: What’s in our Data Detective toolkit?

I love this topic for so many reasons. Partly because I feel like I’m asked to look at so many projects where I’m dropped in and asked to figure things out, usually performance related but occasionally new functionality or features. But as I’m asked to do this fairly often, I may have to see if Data Detective can be my new title… hmm…

Being a Data Detective in a film noir. On the one hand, that sounds like a really neat idea. On the other hand, things usually don’t turn out so well for the detective.

Comments closed

Referencing a Microsoft Fabric ML Model from another Workspace

Published 2024-10-08 by Kevin Feasel

Sandeep Pawar crosses workspaces:

I have written a couple of blogs about working with ML models in Microsoft Fabric. Creating experiments and logging and scoring models in Fabric is very easy, thanks to the built-in MLflow integration. However, the Fabric Data Science experience has one limitation. There are no model endpoints yet, and you cannot load a model from another workspace because the model URI, unlike in Databricks, does not reference a workspace. If you use MLFlowTransformer as shown in this blog, only the model from the workspace where the notebook is hosted is loaded. However, there is a workaround.

Read on for that workaround, as well as the core limitation associated with it.

Comments closed

Grouping Rows in R

Published 2024-10-08 by Kevin Feasel

Steven Sanderson needs a GROUP BY clause:

Combining rows with the same column values is a fundamental task in data analysis and manipulation, especially when handling large datasets. This guide is tailored for beginner R programmers looking to efficiently merge rows using Base R, the dplyr package, and the data.table package. By the end of this guide, you will be able to seamlessly aggregate data in R, enhancing your data analysis capabilities.

Click through for several code examples.

Comments closed

Reduced Auto-Pause Delay for Azure SQL DB Serverless

Published 2024-10-08 by Kevin Feasel

Morgan Oslake goes to sleep sooner:

Azure SQL Database serverless automatically scales compute based on workload demand and bills for compute used per second. In the General Purpose tier, serverless also provides an option to automatically pause the database during idle usage periods when only storage related costs are billed. When workload activity returns, the database is automatically resumed.

Customers choosing to enable auto-pausing can specify the auto-pause delay as part of the serverless configuration. The auto-pause delay is the length of time the database must be idle before auto-pausing. The lower the auto-pause delay and the more frequently auto-pausing occurs, the greater the potential compute cost savings.

Read on for the update in minimum auto-pause time.

Comments closed

Fixing Missing SQL Agent Jobs Post-Migration

Published 2024-10-08 by Kevin Feasel

Lee Markum is looking for that lost shaker of SQL Agent jobs:

I’ve been doing migrations fairly continuously for the past 18 months. PowerShell has been my primary mechanism for many parts of the process, including copying jobs from the source SQL Server to the target. That has worked almost without incident each time. However, recently, an app team noticed that there were SQL Server Agent jobs missing on their new 2022 SQL Servers. Because the first couple of missing jobs also existed on their Development environment, they were able to recreate those jobs in production. They naturally expressed concern that other jobs may be missing.

Read on for Lee’s process, including the solution.

Comments closed

Announcements from the European Fabric Community Conference

Published 2024-10-08 by Kevin Feasel

James Serra brings tidings:

A TON of new features announcements at the European Microsoft Fabric Community Conference help last week. The full list is here, and I wanted to list my favorite announcements from that list:

Access Databricks Unity Catalog tables from Fabric (public preview): You can now access Databricks Unity Catalog tables directly from Fabric. In Fabric, you can now create a new data item called “Mirrored Azure Databricks Catalog”. When creating this item, you simply provide your Azure Databricks workspace URL and select the catalog you want to make available in Fabric. Rather than making a copy of the data, Fabric creates a shortcut for every table in the selected catalog. It also keeps the Fabric data item in sync. So, if a table is added or removed from UC, the change is automatically reflected in Fabric. Once your Azure Databricks Catalog item is created, it behaves the same as any other item in Fabric. Seamlessly access tables through the SQL endpoint, utilize Spark with Fabric notebooks and take full advantage of Direct La

Read on for the rest of what James found exciting.

Comments closed

A Parent-Child Relationship in Fabric Data Factory

Published 2024-10-08 by Kevin Feasel

Andy Leonard takes us through a design pattern:

The Microsoft Fabric Team released a new Invoke Pipeline Activity in September 2024. This post describes one way to implement a parent-child design pattern using the new activity. In this post, we will:

Create a new workspace

Build a child pipeline

Build a parent pipeline

Test

Click through for the instructions.

Comments closed

An Overview of k Nearest Neighbors

Published 2024-10-07 by Kevin Feasel

Harris Amjad explains a common algorithm for classification:

It so happens that given the hype of Machine Learning (ML) and especially Large Language Models these days, there is a considerable proportion of those who wish to understand how these systems work from scratch. Unfortunately, more often than not, the interest fades away quickly as learners jump to complicated algorithms like neural networks and transformers first, without giving heed to traditional ML algorithms that paved the foundation for these advanced algorithms in the first place. In this tip, we will introduce and implement the K-Nearest Neighbors model in Python. Although it is quite old, it remains very popular due to its simplicity and intuitiveness.

Click through to learn more about this algorithm, including an implementation from scratch in Python.

Comments closed

M	T	W	T	F	S	S
1	2	3	4	5	6	7
8	9	10	11	12	13	14
15	16	17	18	19	20	21
22	23	24	25	26	27	28
29	30	31

Author: Kevin Feasel

The All Powerful…

Asking Questions Effectively