
Day: April 3, 2024

Normalizing Data in R

Steven Sanderson says, act normal:

Data normalization is a crucial preprocessing step in data analysis and machine learning workflows. It helps in standardizing the scale of numeric features, ensuring fair treatment to all variables regardless of their magnitude. In this tutorial, we’ll explore how to normalize data in R using practical examples and step-by-step explanations.

Read on for a definition of what this means and how you can do it.
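
For a quick feel for the two most common flavors of normalization, here is a minimal sketch in Python rather than R. It is not taken from the linked post; the function names and the sample data are my own assumptions for illustration. Min-max scaling squeezes values into the range 0 to 1, while z-score standardization centers them on 0 with a standard deviation of 1 (using the sample standard deviation, the same convention as R's `sd()`).

```python
from statistics import mean, stdev


def min_max_scale(values):
    """Rescale values linearly so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: there is no spread to scale, so return zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def z_score(values):
    """Center values on their mean and divide by the sample standard deviation."""
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation (n - 1 denominator)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical sample data, purely for illustration.
heights_cm = [150.0, 160.0, 170.0, 180.0, 190.0]

scaled = min_max_scale(heights_cm)
standardized = z_score(heights_cm)

print(scaled)
print(standardized)
```

Min-max scaling is sensitive to outliers because a single extreme value sets the range, so z-scores are often the safer default when the data has a long tail.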


Finding Last Access Dates for SQL Server

David Fowler checks the calendar:

Your boss walks up to you one morning and says, “Hey, I wanna list of all of our databases and when they were last accessed”.

If you’ve got some sort of auditing switched on or a trace or xevent catching this sort of info you might be ok, but I’m betting you don’t have any of that. That’s cool, it’s not something that I tend to monitor as standard either.

But if you’re not monitoring it, is there any way that you can get at that info?

Read on for one way to estimate it, though I believe automated jobs would skew that result if the underlying question is, “When did a human last view that database?”


Finding Row Counts in SQL Server

Kevin Wilkie breaks out the abacus:

Today, I was working with SQL Server to get row counts from several tables so I thought I’d be smart and work with some functions in SQL Server to make it smarter / easier.

Now, if I am truly only getting “straight” row counts from these tables, I would be able to create a query like the below that would provide the answers with no problem:

Read on for the normal approach, as well as a more complicated approach that some business logic requirements made necessary.


Date Calculation Bug in Power Query ODBC Code

Meagan Longoria files a report:

I was working on an imported Power BI semantic model, adding some fiscal year calculations to my date table. The date table was sourced from a view in Databricks Unity Catalog. I didn’t have access to add more fields to the view, so I was adding the fields in Power Query first, with plans to request they be added to the view in the future. I got some unexpected results, which turned into a bug being logged for the ODBC code for Power Query.

If you are only analyzing data in the last 20 years, you won’t see this bug. But if you are doing long-term analysis including years before 2000, you might just run into it.

Read on to see the bug, how you can replicate it, and three workarounds you can use to avoid it.


The Value of Mirroring in Microsoft Fabric

Nikola Ilic talks mirroring:

First things first. Before I show you how to leverage this feature in Microsoft Fabric, let’s first explain the feature itself.

But, before we explain the feature itself, we need to go one step back and examine the key logic behind the Microsoft Fabric workloads, so that you understand the full context of the Mirroring importance.

With that context in hand, you get an idea of why mirroring matters so much to the Microsoft Fabric experience.


Dealing with Parameter Sniffing using Multiple Execution Plans

Andy Brownsword deals with statistical skew in the data:

Dynamic SQL has many uses and one of these can help us fix Parameter Sniffing issues. Here we’ll look at how it can be used to generate multiple execution plans for the same query.

Parameter sniffing is a common issue. Even for simple queries we can run into suboptimal plans being produced. There are multiple ways we can use Dynamic SQL to solve this challenge. Here we’ll demonstrate one technique: Comment Injection.

My one note about a good post (other than that you should read it) is that parameter sniffing is not itself a bad thing. 95%+ of the time, it’s a great thing. It’s that last 5% or so that gives it a bad name.
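
To make the comment injection idea concrete, here is a rough sketch of my own in Python, not Andy's code. It assumes the plan cache matches on the exact statement text, so two statements that differ only in a leading comment get separate cached plans. The table name, column name, parameter name, bucket labels, and threshold are all hypothetical.

```python
# Fixed set of labels: the injected comment never contains user input,
# so the technique does not open the door to SQL injection.
PLAN_BUCKETS = ("small", "large")


def choose_bucket(expected_rows: int, threshold: int = 1000) -> str:
    """Pick a plan bucket from a rough estimate of how many rows will match."""
    return "large" if expected_rows >= threshold else "small"


def build_query(expected_rows: int, threshold: int = 1000) -> str:
    """Return parameterized SQL text whose leading comment varies by bucket.

    The query logic is identical in every bucket; only the comment differs.
    The value itself is still passed as a parameter (@CustomerID), never
    concatenated into the text.
    """
    bucket = choose_bucket(expected_rows, threshold)
    assert bucket in PLAN_BUCKETS
    return (
        f"/* plan_bucket: {bucket} */ "
        "SELECT OrderID, OrderDate "
        "FROM dbo.Orders "  # hypothetical table
        "WHERE CustomerID = @CustomerID;"
    )


small_sql = build_query(expected_rows=5)
large_sql = build_query(expected_rows=50_000)

print(small_sql)
print(large_sql)
```

Keeping the number of buckets small matters here: every distinct comment is another plan in the cache, so two or three buckets is a trade-off between plan quality and cache bloat.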
