Press "Enter" to skip to content

Day: December 16, 2022

File Seeding with dbt

Ust Oldfield sneaks in a file:

File seeding is a useful way of maintaining and deploying static, or very slowly changing, reference data for a data model to use repeatably and reliably across environments, whilst benefitting from source control.

If you aren’t in the Databricks world, this also feels like a job for DVC.

Comments closed

Sharing Results between Notebooks with MSSparkUtils

Liliam Leme provides an answer to a common Synapse Spark pool question:

I’ve been reviewing customer questions centered around “Have I tried using MSSparkUtils to solve the problem?”

One of the questions asked was how to share results between notebooks. Every time you hit “run” in a notebook, it starts a new Spark cluster which means that each notebook would be using different sessions. Making it impossible to share results between executions of notebooks. MSSparkUtils offers a solution to handle this exact scenario. 

Read on to see what MSSparkUtils is and how it helps in this case.

Comments closed

Becoming Familiar with MLflow

Tomaz Kastrun continues an advent series on Azure ML:

MLFlow is an open-source framework for registering, managing and tracking machine learning models. It is multiplatform, bringing consistent model training and model consumption across different platforms. This means, that training a model locally and uploading it to Azure or training a model on remote compute instances and downloading it, is a great feature for MLflow.

You can use MLflow with Azure CLI, Azure Python SDK or in the studio and it will deliver a consistent experience (note, some functionalities are limited to the language).

Click through for a quick overview of MLflow.

Comments closed

Running Power BI Report Server

Reza Rad stays on-premises:

Power BI is not only a cloud-based reporting technology. Due to the demand for some businesses to have their data and reporting solutions on-premises, Power BI also has the option to be deployed fully on-premises. Power BI on-premises hosting is called Power BI Report Server. This post concerns using Power BI in a fully on-premises solution with Power BI Report Server.

This post will teach you everything you need about the on-premises world of Power BI. You will learn how to install Power BI Report Server, learn all requirements and configurations for the Power BI Report Server to work correctly, and see all the pros and cons of this solution. At the end of this post, you will be able to decide if Power BI on-premises is the right choice for you, and if it is, then you will be able to set a Power BI on-premises solution up and running easily.

I used Power BI Report Server for a few years. My short version is that it’s really useful if you aren’t allowed to use Power BI Online (as was my case) but if you know what’s in the Online version, you’ll see just how much you’re missing out on.

Comments closed

Checking if Cross-Database Ownership Chaining is On

Tom Collins performs a check:

Cross-database ownership chaining is a SQL Server  security feature allowing database users  access to other database objects hosted on the same SQL server instance, in the case where database  users don’t have access granted explicitly

Tom shows us whether it is on as well as how to enable it. I’d recommend not enabling it at all and using module signing instead.

Comments closed

Fixing VertiPaq Analyzer Dictionary Size Errors

Marco Russo troubleshoots an issue:

There are cases where the dictionary size reported by VertiPaq Analyzer (used by DAX Studio, Bravo for Power BI, and Tabular Editor 3) does not correspond to the actual memory required by the dictionary. However, the number reported is technically correct because it represents the memory currently allocated for the dictionary. The issue is that – after a refresh – this memory amount is larger than the actual memory required for hash-encoded columns.

Read on to learn what the consequences are and how you can resolve this in Power BI Desktop as well as in Analysis Services.

Comments closed