
Day: April 24, 2023

R in 10 Minutes

Holger von Jouanne-Diedrich gives us a quick primer on R:

R is a powerful programming language and environment for statistical computing and graphics. In this post, we will provide a quick introduction to R using the famous iris dataset.

We will cover loading data, exploring the dataset, basic data manipulation, and plotting. By the end, you should have a good understanding of how to get started with R, so read on!

Click through for the intro.


Diagramming a Finite State Machine with Mermaid.JS

Matt Eland defeats the boss:

A year or two ago I built a small game prototype that featured a boss fight with a crab monster that was powered by a finite state machine. This monster waited for the player to enter its arena, then descended from the ceiling, roared a challenge, and began fighting the player.

The monster was only damageable after it finished descending. Taking enough damage would make the monster react in pain before it could attack again. Hurting the monster enough caused it to die.

Read on to see how you can model this information in a finite state machine and, from there, how to visualize it with the Mermaid library. I have used Mermaid in the past and can certainly recommend it if you need to generate diagrams programmatically.
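
As a rough illustration of the idea (not Matt's code), here is a minimal Python sketch of the crab monster as a transition table, along with a helper that emits Mermaid state diagram text. The state names, event names, and the `step` and `to_mermaid` helpers are all hypothetical, chosen to match the behavior described in the excerpt.

```python
from enum import Enum, auto


class State(Enum):
    WAITING = auto()
    DESCENDING = auto()
    ROARING = auto()
    FIGHTING = auto()
    HURT = auto()
    DEAD = auto()


# (current state, event) -> next state.
# Event names are assumptions made for this sketch, not taken from the post.
TRANSITIONS = {
    (State.WAITING, "player_entered"): State.DESCENDING,
    (State.DESCENDING, "landed"): State.ROARING,
    (State.ROARING, "roar_finished"): State.FIGHTING,
    (State.FIGHTING, "took_heavy_damage"): State.HURT,
    (State.HURT, "recovered"): State.FIGHTING,
    (State.FIGHTING, "health_depleted"): State.DEAD,
    (State.HURT, "health_depleted"): State.DEAD,
}


def step(state, event):
    """Return the next state; events with no matching transition are ignored."""
    return TRANSITIONS.get((state, event), state)


def to_mermaid(transitions):
    """Render the transition table as Mermaid state diagram text."""
    lines = ["stateDiagram-v2", "    [*] --> WAITING"]
    for (source, event), target in transitions.items():
        lines.append(f"    {source.name} --> {target.name}: {event}")
    lines.append("    DEAD --> [*]")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_mermaid(TRANSITIONS))
```

Because damage events have no entry for the waiting or descending states, the monster simply stays in its current state if it is hit too early, which is one simple way to model "only damageable after it finished descending."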


Unrolling Multiple Arrays in Azure Data Factory

Mark Kromer puts us in disarray:

ADF and Synapse data flows have a Flatten transformation to make it easy to unroll an array as part of your data transformation pipelines. We’ve updated the Flatten transformation to now allow for multiple arrays that can be unrolled in a single transformation step. This will make your ETL jobs much simpler with fewer transformation steps.

Click through for screenshots showing how to use this feature.


Using Dynamic Format Strings for Measures in Power BI

Meagan Longoria shows off a new preview feature:

The April 2023 release of Power BI Desktop introduced a new preview feature called dynamic format strings for measures. This allows us to return values with different formats from the same measure. Previously, we needed to create calculation groups (usually by using Tabular Editor) to accomplish this. But now it is built in to Power BI Desktop.

Read on to learn good use cases for this feature, as well as a few important notes on operation and limitations.


Load Testing in Power BI

Chris Webb gives us the why and the how:

If you’re about to put a big Power BI project into production, have you planned to do any load testing? If not, stop whatever you’re doing and read this!

In my job on the Power BI CAT team at Microsoft it’s common for us to be called in to deal with poorly performing reports. Often, these performance problems only become apparent after the reports have gone live: performance was fine when it was just one developer testing it, but as soon as multiple end-users start running reports they complain about slowness and timeouts. At this point it’s much harder to change or fix anything because time is short, everyone’s panicking and blaming each other, and the users are trying to do their jobs with a solution that isn’t fit for purpose. Of course if the developers had done some load testing then these problems would have been picked up much earlier.

With that in mind, Chris explains some of the things we can do to help with load testing in Power BI.


Finding Special Characters in PowerShell

Patrick Gruenauer looks for special characters:

Sometimes special characters are a nuisance. If you are trying to create some user accounts in on-premise or cloud environments, you should avoid special characters in usernames. In this blog post I will show how to find these special characters.

Click through for a regular expression-based approach, which also allows you to exclude special but not special enough characters.
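
To give a feel for the technique, here is a minimal sketch in Python rather than PowerShell, so it is not Patrick's script. It assumes that only ASCII letters, digits, dots, hyphens, and underscores are acceptable in a username; that allow-list, the sample usernames, and the `find_special_characters` helper are hypothetical.

```python
import re

# Assumption: anything outside this allow-list counts as a "special" character.
# Add characters to the class to exclude ones you consider harmless.
DISALLOWED = re.compile(r"[^A-Za-z0-9._-]")


def find_special_characters(username):
    """Return the distinct disallowed characters in a username, sorted."""
    return sorted(set(DISALLOWED.findall(username)))


# Hypothetical sample data for illustration only.
usernames = ["j.smith", "m_o'neil", "anna lee", "dev#ops!"]

# Map each offending username to the characters that triggered the match.
flagged = {
    name: find_special_characters(name)
    for name in usernames
    if DISALLOWED.search(name)
}

if __name__ == "__main__":
    for name, chars in flagged.items():
        print(f"{name!r} contains: {chars}")
```

Using a negated character class keeps the rule readable: the pattern lists what is allowed, and everything else is reported.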


Snowflake Data Governance

Enrique Lopez de Lara shares a few ways that Snowflake allows us to protect data in its system:

The role hierarchy in the previous section defines what can be done on different objects and by whom. However, it doesn’t restrict which records within a table a user can see or which values should be masked within a column. That’s where the data governance policies in this section come into play.

All data governance policies and tags are stored in the PROD_DB_GOV database under three schemas: MASKING, ROWACCESS and TAGS. Putting all the policies and tags in a single database allows us to centralize them and better restrict access to them. Please note that only the GOV_ADMIN role has read/write permissions on it.

These are, for the most part, very similar to what we’re used to in relational databases: application and system roles, row-level security, and data classification.
