Curated SQL Posts

Data Pre-Processing in R

Amieroh Abrahams cleans up some data:

As data scientists, we often find ourselves immersed in a vast sea of data, trying to extract valuable insights and hidden patterns. However, before we embark on the journey of data analysis and modeling, we must first navigate the crucial steps of data cleaning and preprocessing. In this blog post, we will explore the significance of data cleaning and preprocessing in data science workflows and provide practical tips and techniques to handle missing data, outliers, and data inconsistencies effectively.

Read on for several tactics which can help you clean up your data.
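
As a rough illustration of two of the tactics mentioned above (handling missing data and flagging outliers), here is a minimal sketch. It is written in Python rather than R and is not taken from Amieroh's post. The function names, the median-imputation choice, the 1.5 × IQR rule, and the sample readings are all assumptions made for the example.

```python
from statistics import median, quantiles


def impute_missing(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def iqr_outliers(values, k=1.5):
    """Return the values that fall outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical sensor readings with two gaps and one suspicious spike.
readings = [10.0, 12.0, None, 11.0, 13.0, 95.0, None, 12.0]
cleaned = impute_missing(readings)
outliers = iqr_outliers(cleaned)
```

Median imputation is used here because a single extreme value shifts it far less than it would shift the mean. Whether to drop, cap, or keep the flagged values depends on the data set.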

Naming Artifacts in Microsoft Fabric

Johnny Winter shares some advice:

With Fabric being a unified platform, the worlds of Power BI Developer and Data Engineer collide. So is a solid naming convention a good idea?

At Advancing Analytics, we say yes.

In fact, given the breadth of the platform and the variety of artifacts available for use in Fabric, it becomes even more important to have a strategy to be able to organise these items and make them quick and easy to identify.

Read on to see what Johnny recommends.

Private Endpoints and Azure SQL Managed Instance

Zoran Rilak begins a new series:

Last week we announced the general availability (GA) of private endpoints for Azure SQL Managed Instance. Today, we bring you examples of private endpoints in practical scenarios, starting from the basics and building to the more complex ones to follow in the second installment of this mini-series.

In this post, we’ll cover the following scenarios:

  1. Accessing SQL MI from another virtual network
  2. A more secure kind of public access
  3. Accessing SQL MI from your premises
  4. Making SQL MI available to managed Azure services

Click through to see these four scenarios at the architecture diagram level.

ALTER TABLE SWITCH and Errors 4907, 4908, and 4912

Eitan Blumin works out some problems:

When it comes to managing tables and indexes in SQL Server, the ALTER TABLE SWITCH statement is a powerful tool for “moving” data swiftly between tables. However, this convenience can sometimes be met with frustrating roadblocks, such as errors 4907 and 4908.

These errors may be confusing about their underlying cause, particularly when the source and target tables have identical partitions, including in non-clustered indexes.

Read on to see what these error messages mean and how you can correct them.

Fallback Fonts in Power BI and Deneb Visuals

Meagan Longoria gets a request:

This week, I was working with a client who requested I use the Segoe UI font in their Power BI report. The report contained a mix of core visuals and Deneb visuals. I changed the fonts on the visuals to Segoe UI and published the report. But my client reported back that they were seeing serif fonts in some visuals. I couldn’t replicate this on my machine while viewing the report in a web browser or in Power BI Desktop.

Read on to see what the problem was, as well as the workaround.

Creating Curves in R

Steven Sanderson draws a curve:

In the vast world of R programming, there are numerous functions that provide powerful capabilities for data visualization and analysis. One such function that often goes underappreciated is the curve() function. This neat little function allows us to plot mathematical functions and explore their behavior. In this blog post, we will dive into the syntax of the curve() function, provide a couple of examples to demonstrate its usage, and encourage readers to try it on their own.

Click through for several examples.
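
For readers who want a feel for what curve() does before clicking through, here is a minimal sketch of the underlying idea: evaluate a function at evenly spaced points over an interval and collect the coordinates for plotting. It is written in Python rather than R and is not Steven's code. The helper name sample_curve is hypothetical, and the default of 101 points mirrors the default n in R's curve().

```python
import math


def sample_curve(fn, start, stop, n=101):
    """Evaluate fn at n evenly spaced points across [start, stop]."""
    if n < 2:
        raise ValueError("n must be at least 2")
    step = (stop - start) / (n - 1)
    xs = [start + i * step for i in range(n)]
    return xs, [fn(x) for x in xs]


# Sample sin(x) at five points between 0 and pi.
xs, ys = sample_curve(math.sin, 0.0, math.pi, n=5)
```

The resulting xs and ys lists can be handed to any plotting library. In R, curve() performs the sampling and the drawing in a single call.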

Comparing GROUPBY and SUMMARIZE in DAX

Marco Russo and Alberto Ferrari make a comparison:

DAX offers a rich set of functions, some of which overlap in their functionalities. Among the many, two functions perform grouping: SUMMARIZE and GROUPBY. These are not the only two: SUMMARIZECOLUMNS and GROUPCROSSAPPLY perform similar operations. However, the article is about SUMMARIZE and GROUPBY, as the other functions have many more functionalities, so a comparison would be unfair.

To make a long story short: GROUPBY should be used to group by local columns, columns created on the fly by DAX functions. SUMMARIZE should be used to group by model and query columns. Be mindful that both functions support both scenarios: both functions can group by model and local columns. However, using the wrong function translates into a strong decrease in performance.

Read on for a detailed explanation.

Decrypting Stored Procedures in SQL Server

Steve Jones breaks the connection:

I had a client that was struggling with some encrypted stored procedures. They needed to decrypt them, which I know is a pain in the #@$%@#$@#$#@. I had to do this one. This post shows how I sent them some code to do this.

Note, SQL Compare 15 does this easier and simpler. If you own it, I’d use that instead. A future post will show how easy that is.

Stored procedure encryption is one of the more annoying features in SQL Server. The idea was that, if you wanted to prevent end users from reading your code, you could encrypt the procedures. But SQL Server needs to decrypt the procedures in order to run them, and this has to work on restored backups as well, so the decryption keys need to be available to that SQL Server instance. The infrastructure differs enough from how Microsoft eventually implemented Transparent Data Encryption that breaking these procedures turns out to be trivial, as Steve shows.

I didn’t know that SQL Compare did decryption. The couple of times I needed to do this, I had used a standalone tool which was released in the 2005 timeframe, so it’s good to see something still supported which does this.

Cross-Database and Cross-Cluster ADX Joins in Power BI

Dany Hoter makes a connection:

You may have more than one ADX database and probably more than one ADX cluster.

In some cases, you want to join tables or functions from more than one database/cluster.

In this article you’ll see how to make sure that such joins are folded and sent to the ADX backend instead of executing at the level of the Power Query mashup engine.

Everything mentioned here is applicable to Azure Data Explorer, Synapse Data Explorer, and Fabric RTA.

Read on for the two examples.

Deploying Resource Governor with Minimal Blocking

Michael J. Swart doesn’t want to wait (or cause anyone else to):

Just like sp_configure, Resource Governor is configured in two steps. The first step is to specify the configuration you want, the second step is to ALTER RESOURCE GOVERNOR RECONFIGURE.
But unlike sp_configure which has a “config_value” column and a “run_value” column, there’s no single view that makes it easy to determine what values are configured, and what values are in use. It turns out that the catalog views are the configured values and the dynamic management views are the current values in use:

Read on for a variety of scripts to help configure Resource Governor.
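
The configured-versus-in-use distinction Michael describes can be sketched as a simple comparison. The snippet below is a Python illustration, not one of his scripts. It assumes the configured values (from the catalog views) and the in-use values (from the dynamic management views) have already been fetched into dictionaries. The helper name pending_changes and the sample values are hypothetical.

```python
def pending_changes(configured, in_use):
    """Return {setting: (configured, in_use)} for settings whose values differ."""
    keys = sorted(set(configured) | set(in_use))
    return {
        key: (configured.get(key), in_use.get(key))
        for key in keys
        if configured.get(key) != in_use.get(key)
    }


# Hypothetical workload group settings: max_dop has been changed,
# but ALTER RESOURCE GOVERNOR RECONFIGURE has not been run yet.
configured = {"max_dop": 4, "request_max_memory_grant_percent": 25}
in_use = {"max_dop": 8, "request_max_memory_grant_percent": 25}
diff = pending_changes(configured, in_use)
```

An empty result would suggest that nothing is waiting on a RECONFIGURE. Any entries show the configured value first and the value currently in use second.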
