Author: Kevin Feasel

Common Data Transformations in Microsoft Fabric

Nikola Ilic takes us through several data transformations:

In the lakehouse, for example, you can transform the data by using PySpark, but also Spark SQL, which is VERY similar to Microsoft’s dialect of SQL, called Transact-SQL (or T-SQL, abbreviated). In the warehouse, you can apply transformations using T-SQL, but Python is also an option by leveraging a special pyodbc library. Finally, in the KQL database, you can run both KQL and T-SQL statements. As you may rightly assume, the lines are blurred, and sometimes the path is not 100% clear.

Therefore, in this article, I’ll explore five common data transformations and how to perform each one using three Fabric languages: PySpark, T-SQL, and KQL.

Click through for those transformations, such as extracting date parts, fixing casing, and pivoting data.
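For a rough feel of what those three transformations do, here is a minimal sketch in plain Python. It is a stand-in rather than the PySpark, T-SQL, or KQL code from Nikola's article, and the sample rows, column names, and function names are all hypothetical.

```python
from collections import defaultdict
from datetime import date

# Hypothetical sample rows; not taken from the article.
rows = [
    {"customer": "aLiCe smith", "order_date": date(2024, 1, 15), "amount": 100.0},
    {"customer": "BOB JONES", "order_date": date(2024, 2, 3), "amount": 40.0},
    {"customer": "alice SMITH", "order_date": date(2024, 2, 20), "amount": 60.0},
]


def extract_date_parts(d):
    """Return the year, month, and ISO weekday (Monday = 1) of a date."""
    return {"year": d.year, "month": d.month, "weekday": d.isoweekday()}


def fix_casing(name):
    """Normalize inconsistent casing to title case."""
    return name.title()


def pivot_amount_by_month(records):
    """Pivot long rows into one entry per customer with a total per month."""
    pivoted = defaultdict(dict)
    for r in records:
        customer = fix_casing(r["customer"])
        month = r["order_date"].month
        pivoted[customer][month] = pivoted[customer].get(month, 0.0) + r["amount"]
    return {customer: dict(months) for customer, months in pivoted.items()}
```

Calling `pivot_amount_by_month(rows)` groups the two differently cased "alice smith" rows under one customer, which is why the casing fix runs before the pivot.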

Comments closed

Building a Rubik’s Cube in Power Apps

Jon Vöge builds an app:

This time however, the Power Hour led me to come fully clean to my colleagues, by attempting to build a Rubik’s Cube emulator within a Canvas Power App.

For this week’s blog, I’ll take you for a brief tour of its inner workings, and share the code with you for yourself to play around with.

Click through for the code and explanation.

In the meantime, I’ll share a family secret on how we solve Rubik’s cubes. We remove the wrong stickers and swap them with the correct ones. Boom, problem solved. Also, this is getting the most coveted category I have to offer on Curated SQL, so good on Jon.


Self-Intersecting Quadrilaterals in R

Jerry Tuttle talks shapes:

A quadrilateral is a polygon having four sides, four angles, and four vertices. A polygon means that the figure is a closed shape, meaning the last line segment connects back to the first one, effectively enclosing an area.

We usually think of quadrilaterals as squares, rectangles, parallelograms, trapezoids, rhombuses, or kites. (I was impressed that my four-year-old granddaughter knew the last one, although she called it a diamond!) It could also be irregularly shaped with no name.

However, a polygon may intersect itself. 

Click through for a demonstration of a self-intersecting quadrilateral, including the R code you can use to try it out yourself.
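Jerry's code is in R. As a separate illustration, here is a minimal Python sketch of one way to test whether four vertices, taken in drawing order, form a self-intersecting quadrilateral: check whether either pair of opposite sides crosses. The function names are my own, and the test only counts proper crossings (it ignores sides that merely touch or overlap).

```python
def orientation(p, q, r):
    """Sign of the cross product (q - p) x (r - p).

    Returns 1 for a counter-clockwise turn, -1 for clockwise, 0 for collinear.
    """
    cross = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return (cross > 0) - (cross < 0)


def segments_cross(a, b, c, d):
    """True when segments ab and cd cross at a single interior point."""
    return (
        orientation(a, b, c) * orientation(a, b, d) < 0
        and orientation(c, d, a) * orientation(c, d, b) < 0
    )


def is_self_intersecting(quad):
    """Check the two pairs of opposite sides: AB with CD, and BC with DA."""
    a, b, c, d = quad
    return segments_cross(a, b, c, d) or segments_cross(b, c, d, a)


square = [(0, 0), (1, 0), (1, 1), (0, 1)]
bow_tie = [(0, 0), (1, 1), (1, 0), (0, 1)]  # same points, different order
```

The square and the bow tie use the same four points. Only the order in which they are connected changes, and that is enough to make sides AB and CD cross.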


Building Custom PowerPoint Decks in R

Theo Roe tries out a package:

From a purely design perspective, Quarto’s standard PowerPoint output falls short. It is limited to seven layout options, with the most complex being “Two Content.” The {officer} R package offers a powerful alternative for those seeking full control and customisation.

Click through to see how it works, as well as a hit list of limitations you might run into along the way.


Combining DISTINCT and UNION

Louis Davidson gives it the college try:

When I was perusing my LinkedIn feed the other day, I came across this thread about using SELECT *. In one of the replies, Aaron Cutshall noted that: “Another real performance killer is SELECT DISTINCT especially when combined with UNION. I have a whole list of commonly used hidden performance killers!”

To which started my brain thinking… What does happen when you use these together? And when you use UNION on a set with non-distinct rows, what happens? So for the next few hours I started writing.

Read on for Louis’s findings.
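If you want to see the result-set side of the question before reading Louis's post, here is a minimal sketch using Python's built-in sqlite3 module. It assumes SQLite rather than SQL Server, the tables `a` and `b` are hypothetical, and it says nothing about query plans or performance, which is what the post digs into.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (val INTEGER);
    CREATE TABLE b (val INTEGER);
    INSERT INTO a VALUES (1), (1), (2);
    INSERT INTO b VALUES (2), (3), (3);
""")


def run(sql):
    """Run a query and return its values sorted, so row order does not matter."""
    return sorted(row[0] for row in conn.execute(sql))


# UNION on its own already removes duplicates from the combined result.
union_plain = run("SELECT val FROM a UNION SELECT val FROM b")

# Adding DISTINCT to each branch does not change the final rows.
union_distinct = run("SELECT DISTINCT val FROM a UNION SELECT DISTINCT val FROM b")

# UNION ALL keeps every row from both inputs.
union_all = run("SELECT val FROM a UNION ALL SELECT val FROM b")
```

Here `union_plain` and `union_distinct` hold the same rows, so in this sketch the extra DISTINCT adds no new filtering. Whether it costs anything in SQL Server is a separate question.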


Performance Testing ZSTD Compression for SQL Server Backups

Andy Yun tries out some backup compression:

SQL Server 2025 Public Preview is not even a week old, but I’m impressed with another new capability that was released – a new backup compression algorithm: ZSTD. This one came as a surprise, despite being part of Private Preview, as it was only released with Public Preview.

Click through for Andy’s findings. It’s just one database that is not representative of normal SQL Server databases, but it’s an interesting data point that we can use.


Trying out Microsoft Fabric Data Agents

Wolfgang Strasser gives a generative AI solution built into Microsoft Fabric a try:

Today, I wanted to give the new Fabric Data Agents a try. According to the documentation, a Fabric Data Agent is defined as follows:

Data agent in Microsoft Fabric is a new Microsoft Fabric feature that allows you to build your own conversational Q&A systems using generative AI. A Fabric data agent makes data insights more accessible and actionable for everyone in your organization. With a Fabric data agent, your team can have conversations, with plain English-language questions, about the data that your organization stored in Fabric OneLake and then receive relevant answers. This way, even people without technical expertise in AI or a deep understanding of the data structure can receive precise and context-rich answers.

Let’s give it a try and build our first Data Agent.

Click through for the prerequisites, the setup process, and how everything looked for Wolfgang.


sqlcmd in SQL Server 2025 and Certificate Chain Not Trusted

Vlad Drumea points out a new thing to keep an eye on:

SQL Server 2025 provides ODBC sqlcmd version 17 which enforces an encrypted connection.

If you’re trying to use it to connect to instances that don’t have a CA-signed certificate or where TLS encryption was never properly configured, sqlcmd will throw the famous “certificate chain not trusted” error message:

Sqlcmd: Error: Microsoft ODBC Driver 18 for SQL Server : SSL Provider: The certificate chain was issued by an authority that is not trusted.
Sqlcmd: Error: Microsoft ODBC Driver 18 for SQL Server : Client unable to establish connection.

The proper answer to this is to get trusted certificates. The workaround is what Vlad describes, so click through for that.


Building an ML-Friendly Data Lake with Apache Iceberg

Anant Kumar designs a data lake:

As companies collect massive amounts of data to fuel their artificial intelligence and machine learning initiatives, finding the right data architecture for storing, managing, and accessing such data is crucial. Traditional data storage practices are likely to fall short to meet the scale, variety, and velocity required by modern AI/ML workflows. Apache Iceberg steps in as a strong open-source table format to build solid and efficient data lakes for AI and ML.

Click through for a primer on Iceberg, how to set up a fairly simple data lake, and some functionality that can help in model training.


Preventing Injection Attacks in Shiny

Arthur Breant shares some advice:

Code injection is a common security vulnerability that involves injecting malicious code into a page or application. This code is then executed, creating the security breach. There are several ways to inject code into an application, and Shiny is unfortunately not immune to these risks.

Click through for a quick overview of the three most common types of injection attack. There’s nothing special about Shiny here—any system that executes code based on user input is potentially vulnerable to injection attacks—so it is good to keep these tips in mind. H/T R-Bloggers.
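To make the general point concrete outside of Shiny, here is a minimal sketch of SQL injection, one well-known kind of injection attack, using Python's built-in sqlite3 module. It is not Arthur's code, and the `users` table and function names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (name TEXT, secret TEXT);
    INSERT INTO users VALUES ('alice', 'a-secret'), ('bob', 'b-secret');
""")


def lookup_unsafe(name):
    """Vulnerable: user input is pasted straight into the SQL text."""
    sql = f"SELECT secret FROM users WHERE name = '{name}'"
    return [row[0] for row in conn.execute(sql)]


def lookup_safe(name):
    """Safer: the input is bound as a parameter and never parsed as SQL."""
    sql = "SELECT secret FROM users WHERE name = ?"
    return [row[0] for row in conn.execute(sql, (name,))]


# Input crafted to close the quoted string and add an always-true condition.
malicious = "nobody' OR '1'='1"
```

With the malicious input, `lookup_unsafe` returns every secret in the table, while `lookup_safe` treats the whole string as a name and finds nothing.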
