Press "Enter" to skip to content

Author: Kevin Feasel

Performing Linear Regression in Power BI

Marco Russo and Alberto Ferrari build a regression:

LINEST and LINESTX are two DAX functions that calculate a linear regression by using the Least Squares method. Both functions return multiple values, represented in a table that has a single row and one column for each of the values returned.

LINEST gets column references as arguments, whereas LINESTX explicitly iterates over the table provided in the first argument and executes the other arguments in a row context. Internally, LINEST invokes LINESTX and provides to it the table that contains the column references specified in the LINEST arguments. This article describes the more generic function LINESTX.

This is pretty neat for adding a visual element, though I’d probably include an R or Python visual and do the regression in there, myself.

Comments closed

Switch Statements and Expressions in C#

Hasan Savran points out the overloaded nature of switch in C# 8 and later:

It works great but the break and the case syntaxes are getting duplicated, new switch syntax gets rid of the case, and the break statements. Here how this example looks like using the new switch syntax.

Click through for Hasan’s demo. Basically, this is the difference between a statement and an expression. C#’s switch keyword has historically been a statement: given some input, perform an action but do not return an output. Performing an action within the function is known as a side effect and it adds some mental overhead to the way we process things, especially as your methods get more complex and you have to keep track of more things in your mind at once.

By contrast, Hasan’s second example is switch as an expression, which is more in the F# style and an example of why I like to joke about how what you’ll find in C# vNext is what you got in F# two versions ago. An expression is an operation which takes an input and returns an output without performing any actions causing side effects along the way. This makes expressions easier to diagram and conceptualize than statements, though statements offer more flexibility, especially when you do want to take radically different actions depending on some given input.

Comments closed

Licensing for Microsoft Fabric

Reza Rad explains how licensing of Microsoft Fabric will work:

To understand the licensing for Microsoft Fabric, You first need to understand the Capacity structure. In Fabric, there are three important sections that the content can be organized into those; Tenant, Capacity, and Workspace.

Tenant is the most fundamental part of the structure of Fabric. Each domain can have one or multiple tenants.

The capacity is the substructure under the tenant. You can have one or multiple capacities in each tenant. Each capacity is a pool of resources that can be used for Microsoft Fabric services. There are different SKUs for different levels of resources. I’ll explain the pricing and SKUs shortly after.

Inside capacities, you will have workspaces. Workspaces are sharing units that will be used for developers and users. For example, you will create Lakehouse, Data Pipeline, and Dataflow inside a workspace, and you can share them with the rest of the developer team. A workspace is assigned to a capacity. However, you can have more than one capacity associated with one workspace. The screenshot below shows how Tenant, Capaicy, and Workspace work together.

Read on to understand at what level billing occurs, what the options are, and what it means. My gut is saying that F8 is probably the lowest acceptable tier for a real company’s production environment and F2 is more for dev environments or people trying things out. But we’ll know more, I think, in the next few months as people try things out.

Comments closed

Finding All Implicit Warnings via Query Store

Jose Manuel Jurado Diaz has a script for us:

During our last session in SQL Data Saturday, we received a question about if it is possible to know all the conversion implicit captured by Query Data Store. In the following example, I would like to share with you an example how to capture this considering among of SQL Antipatterns. 

Basically, in sys.query_store_plan  we found the column called query_plan that contains the text of the execution plan. With this information plus other Query Data Store DMVs we could see the information required. 

Read on to see how you can shred out implicit conversions from the Query Store plans.

Comments closed

Configuring Compliance in Microsoft Fabric

Kevin Chant checks a box:

Compliance is a very important aspect when working for data. Especially when you must work to standards like PCI-DSS. With this in mind I looked into the compliance story for Microsoft Fabric.

By the end of this post, you will have a better idea of how to test configuring compliance for Microsoft Fabric. Along the way I share plenty of links.

Read on for step-by-step instructions, as well as those links.

Comments closed

An Overview of Always Encrypted

Matthew McGiffen describes a product:

Always Encrypted was a new encryption feature added to SQL Server with the 2016 version of the product. Initially it was just available in enterprise edition, but from SQL Server 2016, SP1 was made available in standard edition also.

Unlike TDE which encrypts the whole database, Always Encrypted is a form of column encryption that means you choose which columns of data you want to encrypt. The “Always” part of Always Encrypted refers to the fact that data is encrypted at rest, in memory, and as it is transmitted across the network. That means that it provides the highest level of protection possible for your data.

Read on to learn more about what makes it different from other forms of encryption in SQL Server and the way this feature works.

Comments closed

Order, Sort, and Rank in R

Steven Sanderson compares three functions in R:

In the realm of data analysis and programming, organizing and sorting data efficiently is crucial. In R, a programming language renowned for its data manipulation capabilities, we have three powerful functions at our disposal: order()sort(), and rank(). In this blog post, we will delve into the intricacies of these functions, explore their applications, and understand their parameters. These R functions are all used to sort data, however, they each have different purposes and use different methods to sort the data.

Coming at R from a SQL background, the idea of order() and sort() behaving so differently was strange at first, especially because the verbs are synonymous and it’s the noun form of “order” which we get back, rather than performing the ordering.

Comments closed

Thoughts on Postgres File Layout and Migration

Dian Fay shares some advice:

I’ve used several migration frameworks in my time. Most have been variations on a common theme dating back lo these past fifteen-twenty years: an ordered directory of SQL scripts with an in-database registry table recording those which have been executed. The good ones checksum each script and validate them every run to make sure nobody’s trying to change the old files out from under you. But I’ve run into three so far, and used two in production, that do something different. Each revolves around a central idea that sets it apart and makes developing and deploying changes easier, faster, or better-organized than its competition — provided you’re able to work within the assumptions and constraints that idea implies.

Read on for thoughts about three tools: sqitch, graphile-migrate, and migra.

Comments closed

Postgres Change Management Rollbacks

Grant Fritchey explains why stateful systems are difficult to roll back:

The invitation this month for #PGSqlPhriday comes from Dian Fay. The topic is pretty simple, database change management. Now, I may have, once or twice, spoken about database change management, database DevOpsautomating deployments, and all that sort of thing. Maybe. Once or twice.

OK. This is my topic.

I’ve got some great examples on taking changes from the schema on your PostgreSQL databases and then deploying them. All the technical stuff you could want. However, I don’t want to talk about that today. Instead, I want to talk about something really important, the concept of rollbacks when it comes to database deployments.

I completely agree with Grant’s description of the pain and his recommendation. With stateful systems, roll forward, not backward.

Comments closed

Thoughts on Fabric OneLake

Teo Lachev shares some thoughts:

In a previous post, I shared my overall impression of Fabric. In this post, I’ll continue exploring Fabric, this time sharing my thoughts on OneLake. If you need a quick intro to Fabric OneLake, the Josh Caplan’s “Build 2023: Eliminate data silos with OneLake, the OneDrive for Data” presentation provides a great overview of OneLake, its capabilities, and the vision behind it from a Microsoft perspective. If you prefer a shorter narrative, you can find it in the “Microsoft OneLake in Fabric, the OneDrive for data” post. As always, we are all learning and constructive criticism would be appreciated if I missed or misinterpreted something.

I think some of Teo’s criticism comes from the idea that OneLake should also mean one lakehouse or one data lake, but the abstraction is one level higher than that. I would like to see some of Teo’s ideas make it into GA, though.

Comments closed