Press "Enter" to skip to content

Day: December 21, 2022

Structuring Azure ML Projects and Using the Terminal

Tomaz Kastrun nears the end of the Azure ML advent. Day 20 covers package requirements and other niceties:

When creating notebooks, it is always good practice to include the dependencies, whether that is a particular version of a package, a separate script file, or an installation requirement.

Selecting an environment or kernel can be an issue if it is not correctly initialized with the code. You can also check the available kernels with a bit of simple Python code.
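The post shows the actual snippet; as a rough sketch (assuming the jupyter_client package, which ships with Jupyter, is available on the compute instance), listing the registered kernels from Python might look like this:

```python
# Sketch only: list the Jupyter kernels registered on this machine.
# Assumes the jupyter_client package is installed.
from jupyter_client.kernelspec import KernelSpecManager

specs = KernelSpecManager().get_all_specs()
for name, info in specs.items():
    # display_name is what the notebook UI shows in the kernel picker
    print(name, "->", info["spec"]["display_name"])
```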

Day 21 looks at the Azure CLI and running code from within a compute instance terminal:

Using the Azure CLI can help you progress faster, automate repetitive tasks, and even use Git integration for faster and better collaboration.

We created a YAML file on Day 20, and we can also use it with the Azure CLI to create an environment.
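As a hedged illustration rather than the exact command from the post, creating an environment from a YAML definition with the Azure ML CLI v2 extension looks roughly like this (the file, resource group, and workspace names are placeholders):

```bash
# Sketch only: create an Azure ML environment from a YAML definition
# using the "ml" CLI v2 extension. All names below are placeholders.
az ml environment create \
    --file environment.yml \
    --resource-group my-resource-group \
    --workspace-name my-workspace
```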


High-Level Thoughts on Migration

Marc Lelijveld thinks about migrations:

Over the last half year, I have been involved in many large migration projects from other BI tools to Power BI, all with a different setup: from Tableau, MicroStrategy, and Analysis Services to Power BI. In this blog I will shine a light on the experiences I gained during these migrations and share some do’s and don’ts. I start with why you would even migrate in the first place, after which we dive into where to start and, of course, how. This blog is focused more on the process than on the technical how-to. At the same time, it describes the work I did over the past months and therefore is a small recap of the past half year.

Read on to understand your “why,” at least when it comes to migrations.


The Value (and Cost) of DATETRUNC

Brent Ozar points out the ups and downs of DATETRUNC():

The first one, passing in a specific start & end date, gets the best plan, runs the most quickly, and does the fewest logical reads (4,299). It’s a winner by every possible measure except ease of writing the query. When SQL Server is handed a specific start date, it can seek to that specific part of the index and read only the rows that match.

DATETRUNC and YEAR both produce much less efficient plans. They scan the entire index (19,918 pages), reading every single row in the table, and run the function against every row, burning more CPU.

SQL Server’s thought process is, and has always been, “I have no idea what’s the first date that would produce YEAR(2017). There’s just no way I could possibly guess that. I might as well read every date since the dawn of time.”
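To make the difference concrete, here is a hedged sketch of the three predicate styles against a hypothetical dbo.Sales table with an index on OrderDate (not the exact demo from the post):

```sql
-- Hypothetical dbo.Sales table with an index on OrderDate.

-- Sargable: the optimizer can seek straight to the 2017 range.
SELECT COUNT(*) FROM dbo.Sales
WHERE OrderDate >= '20170101' AND OrderDate < '20180101';

-- Not sargable: the function runs against every row, forcing a scan.
SELECT COUNT(*) FROM dbo.Sales
WHERE DATETRUNC(year, OrderDate) = '20170101';

-- Also not sargable, for the same reason.
SELECT COUNT(*) FROM dbo.Sales
WHERE YEAR(OrderDate) = 2017;
```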

Read on for the upshot.


Breaking the World with auto_explain

Ryan Lambert gets a lot of explanation:

Postgres has a handy module called auto_explain. The auto_explain module lives up to its name: it runs EXPLAIN automatically for you. The intent of this module is to automatically provide information useful for troubleshooting your slow queries as they happen. This post outlines a pitfall I recently discovered with auto_explain. Luckily for us, it’s an easy thing to avoid.

I discovered this by running CREATE EXTENSION postgis; and watching it run for quite a while before failing with an out of disk space error. That is not my typical experience with a simple CREATE EXTENSION command!
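For context, and not necessarily the configuration from the post, auto_explain is typically enabled along these lines; note that an aggressive threshold like log_min_duration = 0 logs a plan for every statement, which can produce a surprisingly large volume of log output:

```sql
-- Sketch: enable auto_explain for the current session (requires privileges
-- to load the module and set its parameters).
LOAD 'auto_explain';
SET auto_explain.log_min_duration = 0;        -- 0 = log a plan for every statement
SET auto_explain.log_analyze = on;            -- include actual timings and row counts
SET auto_explain.log_nested_statements = on;  -- also log statements inside functions
```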

Read on to learn what happened and how you can prevent making a similar mistake.


Capturing Event Hubs Data in Delta Lake Format with Stream Analytics

Xu Jiang announces a public preview:

The Stream Analytics no-code editor is a drag-and-drop design tool that helps customers develop Stream Analytics jobs without writing a single line of code. The experience provides a canvas that allows you to connect to input sources to quickly see your streaming data. Then you can transform and preview it before writing to your destination of choice in Azure. To learn more, see No-code stream processing through Azure Stream Analytics | Microsoft Learn.

Read on to see how you can capture and process data into Delta Lake format via their designer.


Time Intelligence Templates in Bravo for Power BI

Marco Russo and Alberto Ferrari try out some templates:

Thanks to Bravo for Power BI, creating a Date table and applying time intelligence calculations to existing model measures has never been easier. With a few clicks, the Power BI model gets the required updates, and you can further modify the code generated.

Bravo provides several ready-to-use templates based on the Time Intelligence patterns published on the DAX Patterns website. However, the pattern may not provide all the features required. There could be columns and measures you want to remove, or you might need additional columns or time intelligence calculations that are not part of the template.
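Purely for illustration, with placeholder names rather than Bravo's actual generated code, a typical time intelligence measure of the kind these templates produce looks something like this:

```dax
-- Hypothetical example: a year-to-date variation of an existing
-- [Sales Amount] measure, using a 'Date' table marked as a date table.
Sales Amount YTD :=
CALCULATE (
    [Sales Amount],
    DATESYTD ( 'Date'[Date] )
)
```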

Read on to see two ways you could resolve this.
