Learning about GitHub Actions

Published 2024-03-20 by Kevin Feasel

In this video, we dig into GitHub’s process for executing code: GitHub Actions workflows. We’ll learn what Actions and workflows are, how we can create them from scratch, and how to incorporate Actions from the GitHub Marketplace into our own workflows.

Along the way, we build a simple workflow together. I’ll have more videos coming up that expand on GitHub Actions and show you more of what you can do with them.

Changing the Timeout of a Spark Session in Microsoft Fabric

Published 2024-03-20 by Kevin Feasel

Koen Verbeeck doesn’t have time to wait:

You might know the feeling: you’re writing code in a Notebook in Microsoft Fabric and suddenly you have to leave your workstation for a while. Someone rang the doorbell (you’re working from home and you get some parcels delivered), or you took a coffee break with some colleagues. When you return to your notebook, the Spark session has timed out and when you run a cell, you have to wait for the damn thing to restart again. The agony, waiting for 2-3 minutes for the session to start, and only after that the actual code can start running.

Read on to see how you can set the timeout to a custom value, assuming you’re okay with paying for the Spark cluster to sit around until it times out.

Exposing Kafka Data in Iceberg using Tableflow

Published 2024-03-20 by Kevin Feasel

Marc Selwan announces a new product:

We’re excited to talk about our vision for Tableflow, which makes it push-button simple to take Apache Kafka® data and feed it directly into your data lake, warehouse, or analytics engine as Apache Iceberg® tables. Making operational data accessible to the analytical world is traditionally a complex, expensive, and brittle process and we believe we can do better to unify the operational and analytical estates.

Tableflow removes all this erroneous, duplicative work and helps convert Kafka topics and associated schemas to Iceberg tables in one click. This is central to Confluent’s vision to build the world’s leading data streaming platform that fuels any operational and analytical workload with real-time data products.

It looks like this is currently in early access, but you can see where Confluent intends to take the product.

The Proper Use of Views and Inline UDFs

Published 2024-03-20 by Kevin Feasel

Erik Darling plays tic-tac-toe:

The problem is really the stuff that people stick into views. They’re sort of like a junk drawer for data. Someone builds a view that returns a correct set of results, which becomes a source of truth. Then someone else comes along and uses that view in another view, because they know it returns the correct results, and so on and so on. Worse, views tend to do a bunch of data massaging, left joining and coalescing and substringing and replacing and case expressioning and converting things to other things. The bottom line is that views are as bad as you make them.

The end result is a trash monster with a query plan that can only be viewed in full from deep space.

Read on to learn the use cases for views and inline UDFs, as well as a few important notes regarding performance of each. Views are like mogwai: they’re fine as long as you never get them wet and never let them eat after midnight. The problem is, far too many companies are apparently the business equivalent of all-you-can-eat buffets at water parks.

Inline user-defined functions are like patenting a device that lets you shoot yourself in both feet with one pull of the trigger. Which, if I understand things correctly, means you’ll need a Form 4 for each inline UDF.

Dropping Objects in SQL Server and Snowflake

Published 2024-03-20 by Kevin Feasel

Kevin Wilkie gets the drop on us:

When you’re working between SQL Server and Snowflake, there can be a lot of crossover that may make you forget what system you’re working in. Sometimes it’s close, but not close enough.

Today, let’s go over something that should be rather simple – removing old objects that we shouldn’t need any longer.

Read on to see how the two data platform technologies differ in this regard.

Migrating from Power BI to Microsoft Fabric

Published 2024-03-20 by Kevin Feasel

Paul Turley gives us an overview:

Fabric is here but what does that mean if you are using Power BI? What do you need to know and what, if anything, will you need to change if you are a Power BI report designer, developer or BI solution architect? What parts of Fabric should you use now and how do you plan for the near-term future? As I write this in March of 2024, I’m at the Microsoft MVP Summit at the Microsoft campus in Redmond, Washington this week learning about what the product teams will be working on over the next year or so. Fabric is center stage in every conversation and session. To say that Fabric has moved my cheese would be a gross understatement. I’ve been working with data and reporting solutions for about 30 years and have seen many products come and go. Everything I knew about working with databases, data warehouses, transforming and reporting on data has changed recently BUT it doesn’t mean that everyone using Power BI must stop what they are doing and adapt to these changes. The core product is unchanged. Power BI still works as it always has.

Read on to learn more about Paul’s thesis and how the world changes with Microsoft Fabric.

What’s New in SSMS 20

Published 2024-03-20 by Kevin Feasel

Erin Stellato gives us the skinny:

We expect that the first two posts, combined with the release notes and the new Connect with SQL Server Management Studio page, provide the details you need about the changes in SSMS 20 GA. As such, the focus of this post is the roadmap for SSMS. Our roadmap is heavily influenced by the evolving capabilities of SQL Server and Azure SQL, and feedback from SSMS users. We’re currently collecting general feedback at https://aka.ms/sqlfeedback, and feedback on Copilot in SSMS at https://aka.ms/ssms-copilot-feedback. Please comment and upvote on items that you would like to see in SSMS!

With SSMS 20 now being generally available, you can download it and try it out in your own environment. Erin quells any fears that Microsoft is abandoning SSMS and covers some of the big-ticket items on the roadmap.

Taking a Billion Taxi Rides with DuckDB

Published 2024-03-19 by Kevin Feasel

Mark Litwintschik tries out DuckDB:

DuckDB is an in-process database. Rather than relying on a server of its own, it’s used as a client. The client can work with data in memory, within DuckDB’s internal file format, database servers from other software developers and cloud storage services such as AWS S3.

This choice to not centralise DuckDB’s data within its own server, paired with being distributed as a single binary, makes installing and working with DuckDB much less complex than, say, standing up a Hadoop cluster.

The project isn’t aimed at very large datasets. Despite this, its ergonomics are enticing enough and it does so much to reduce engineering time that workarounds are worth considering. The rising popularity of analysis-ready, cloud-optimised Parquet files is removing the need for substantial hardware when dealing with datasets in the 100s of GBs or larger.

Read on to learn more about DuckDB, how it differs from SQLite, and a bit of nuttiness around how far you can push an in-memory database.

Restorable Dropped Databases Naming in Azure SQL DB

Published 2024-03-19 by Kevin Feasel

Tanayankar Chakraborty asks, what’s in a name?:

An issue was reported recently where the customer complained that in the cost analysis report of their Azure SQL DBs, the DB name appears with a comma (,) and a number appended. While they agreed with the DB name in the report, they didn’t understand the number after the comma and its significance. This is what the cost analysis report looks like:

Click through for a redacted version of the report, showing an example of the database in question, as well as an explanation of what this number means.

Creating Dynamic Moving Averages with Visual Calcs and Numeric Parameters

Published 2024-03-19 by Kevin Feasel

Erik Svensen builds a dynamic moving average:

With the introduction of visual calculations in the February 2024 release of Power BI Desktop (https://powerbi.microsoft.com/en-us/blog/visual-calculations-preview/), we get some new possibilities for adding calculations on an individual visual, and some new functions give us some exciting options.

One example could be to use the MOVINGAVERAGE function and combine it with a numeric range parameter to make it dynamic.

Click through for a video and a description of how to do it.
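
For readers who want the general idea outside of Power BI, here is a minimal Python sketch of a trailing moving average whose window size is a parameter. It is not Erik's solution and not how MOVINGAVERAGE is implemented. The function name `moving_average`, the sample data, and the choice to average over fewer rows at the start of the series are all assumptions made for illustration.

```python
from collections import deque
from typing import Iterable, List


def moving_average(values: Iterable[float], window: int) -> List[float]:
    """Trailing moving average (hypothetical helper, not a Power BI API).

    Assumption: the first rows average whatever values are available,
    so the output has the same length as the input.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    recent = deque(maxlen=window)  # keeps only the latest `window` values
    running_total = 0.0
    averages = []
    for value in values:
        if len(recent) == window:
            running_total -= recent[0]  # oldest value is about to be evicted
        recent.append(value)
        running_total += value
        averages.append(running_total / len(recent))
    return averages


monthly_sales = [10, 20, 30, 40, 50]  # made-up data
for window_size in (2, 3):  # stands in for the numeric range parameter
    print(window_size, moving_average(monthly_sales, window_size))
```

Changing `window_size` plays the role of the numeric range parameter: the same series is smoothed more or less aggressively depending on the value picked.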

Author: Kevin Feasel