
Curated SQL Posts

Benchmarking Power BI Local Data Import Speed

Eugene Meidinger has all the data he needs on his desktop:

The chart above shows the number of seconds it took to load X million rows of data from a given data source, according to a profiler trace and Phil Seamark’s Refresh visualizer. Parquet is a clear winner by far, with MS Access surprisingly coming in second. Sadly the 2 GB file limit stops Access from becoming the big data format of the future.

Part of the reason I wanted to do these tests is that people on Reddit often complain that their refresh is slow and their CPU is maxed out. This is almost always a sign that they are importing oodles and oodles of CSV files. I've recommended trying Parquet instead of CSV, but it's nice to have concrete proof that it's a better file source.

Read on for the chart. Also, don’t tell his accountants about the gaming laptop. It’s 100% for work purposes, just like my desktop PC. Only work, nothing else, IRS. The high-end GPU is for AI work. And the big screen is for doing big business.
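If you want to reproduce the CSV-versus-Parquet gap on your own machine, here's a minimal Python sketch using pandas (pyarrow needed for the Parquet side). The row count and column mix are arbitrary stand-ins, not Eugene's actual benchmark harness:

    import time
    import numpy as np
    import pandas as pd

    # Build a throwaway 5-million-row frame as stand-in benchmark data.
    rows = 5_000_000
    df = pd.DataFrame({
        "id": np.arange(rows),
        "amount": np.random.rand(rows),
        "category": np.random.choice(["a", "b", "c"], rows),
    })

    df.to_csv("test.csv", index=False)
    df.to_parquet("test.parquet")  # requires pyarrow (or fastparquet)

    # Time a full load of each format.
    start = time.perf_counter()
    pd.read_csv("test.csv")
    print(f"CSV:     {time.perf_counter() - start:.2f}s")

    start = time.perf_counter()
    pd.read_parquet("test.parquet")
    print(f"Parquet: {time.perf_counter() - start:.2f}s")

Parquet's advantage is structural: it's a compressed, columnar, binary format, so loading it skips the text parsing that pegs the CPU during a big CSV refresh.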


Determining Power BI Report Fields in Use

Meagan Longoria performs a search:

Have you ever wondered where a certain field is used in a report? Or maybe you need an easy way to find broken field references in a report? Certain 3rd-party tools such as Measure Killer and Power BI Helper (not updated recently) have helped us with this task in the past. But now we can perform this task with a notebook in Fabric!

This is made possible by the Semantic Link Labs Python library. Please note that PBIR format is still in preview at the time of publishing this blog post, so use it at your own risk. Also, this works only on reports published to the Power BI service. Since this notebook is not making any changes to the report, I feel it’s pretty safe to run, but do remember that it uses CUs on your Fabric capacity while you run it.

Read on to see how it works.
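For flavor, here's roughly what the notebook cell looks like. Fair warning: I'm writing the semantic-link-labs calls from memory, so treat the class and method names as assumptions and check Meagan's post and the library docs for the real ones. The report and workspace names below are made up:

    # Fabric notebook cell; requires the semantic-link-labs package.
    # Class and method names are approximate - verify against the docs.
    %pip install semantic-link-labs

    from sempy_labs.report import ReportWrapper

    # Hypothetical report and workspace names; the report must be published
    # to the service in PBIR format (still in preview).
    report = ReportWrapper(report="Sales Report", workspace="My Workspace")

    # Returns one row per field reference: page, visual, table, and the
    # column or measure used.
    fields = report.list_semantic_model_objects()
    display(fields)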


Sending Data from Power Automate to Microsoft Fabric

Chris Webb uses Eventstreams:

Fabric’s Real-Time Intelligence features are, for me, the most interesting things to learn about in the platform. I’m not going to pretend to be an expert in them – far from it – but they are quite easy to use and they open up some interesting possibilities for low-code/no-code people like me. The other day I was wondering if it was possible to send events and data from Power Automate to Fabric using Eventstreams and it turns out it is quite easy to do.

Read on to see just how easy it is. And there's a good question from a reader about using other languages, such as PowerShell. Turns out the answer is yes.
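Since an Eventstream's custom endpoint source hands you an Event Hubs-compatible connection string, any language with an Event Hubs client can do the same thing. Here's a hedged Python sketch using the azure-eventhub package; the connection string and event hub name are placeholders you'd copy from the Eventstream's custom endpoint details:

    import json
    from azure.eventhub import EventHubProducerClient, EventData

    # Placeholders: copy both values from the Eventstream's custom endpoint.
    CONN_STR = "Endpoint=sb://...;SharedAccessKeyName=...;SharedAccessKey=..."
    EVENT_HUB = "es-xxxxxxxx"

    producer = EventHubProducerClient.from_connection_string(
        CONN_STR, eventhub_name=EVENT_HUB
    )

    # Send one JSON event; Eventstream ingests it like any other source.
    batch = producer.create_batch()
    batch.add(EventData(json.dumps({"source": "script", "value": 42})))
    producer.send_batch(batch)
    producer.close()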


First Impressions of SSMS 21

Chad Callihan has some thoughts:

Some exciting news recently: SQL Server Management Studio 21 is available in preview. Head over here to download it and experiment with the latest. I've worked with it a little bit so far and hope to play around more over the Thanksgiving break.

Here are my early thoughts on what I’ve seen so far.

I've linked to a few articles on what's new in SSMS 21, but Chad points out a couple of things I haven't seen anyone else mention yet.


Temp Tables in SSIS Data Sources

Andy Brownsword disappears in a flash:

When handling data, we can make use of temporary tables to aid with separation or performance. However, they don't always play nice with Integration Services packages.

If we set a source to call a procedure returning the contents of a temporary table, we'll see an error like the one below:

Read on for three options. It’s been a while, but I vaguely recall that you can use global temp tables (such as ##Results) and it will work, as those persist and are available to all sessions so long as there is some open session using them.
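To make that cross-session behavior concrete, here's a quick two-connection sketch with pyodbc (connection string and data are made up). The ##Results table created on the first connection stays queryable from the second for as long as the first session remains open, which is also what lets SSIS's separate validation connection see it:

    import pyodbc

    # Placeholder connection string.
    CONN_STR = (
        "DRIVER={ODBC Driver 18 for SQL Server};SERVER=localhost;"
        "DATABASE=tempdb;Trusted_Connection=yes;TrustServerCertificate=yes;"
    )

    # Session 1 creates and fills a *global* temp table (## prefix).
    conn1 = pyodbc.connect(CONN_STR, autocommit=True)
    conn1.execute("CREATE TABLE ##Results (Id int, Val varchar(20));")
    conn1.execute("INSERT INTO ##Results VALUES (1, 'visible');")

    # Session 2 is a separate connection, yet it can read the table because
    # conn1 is still open; a local #temp table would fail here.
    conn2 = pyodbc.connect(CONN_STR, autocommit=True)
    print(conn2.execute("SELECT * FROM ##Results;").fetchall())

    conn1.close()  # ##Results drops once the last session using it closes
    conn2.close()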


Table Cloning in Snowflake

Kevin Wilkie creates a clone:

In this coding scenario, I am copying everything from TableA and pushing it into a new table called TableB in the same database and schema.

If you check the size of the database before and after you clone a table, it will be the same size – no matter the size of TableA. This is because, at this point in time, TableB exists only as a “pointer” to the data that constitutes TableA. It is not until something changes in one of the tables – say, adding a row to TableA – that it stops being a “pointer” and the changed data is physically constituted.

Read on to learn more about how this works.
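The statement itself is a one-liner. Here's a minimal sketch through the Snowflake Python connector, with placeholder credentials and object names:

    import snowflake.connector

    # Placeholder credentials; substitute your own account details.
    conn = snowflake.connector.connect(
        account="my_account", user="my_user", password="...",
        warehouse="MY_WH", database="MY_DB", schema="MY_SCHEMA",
    )

    # Zero-copy clone: TableB starts out referencing TableA's existing
    # micro-partitions, so nothing is physically copied at creation time.
    conn.cursor().execute("CREATE TABLE TableB CLONE TableA")

    conn.close()

The copy-on-write happens at the micro-partition level: only partitions that subsequently change get physically rewritten, which is the "pointer" behavior Kevin describes.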


Handling a Consumer Fetch Request in Kafka

Multiple Confluent employees (who apparently don’t get to have names this time around) wrap up a series:

It’s been a long time coming, but we’ve finally arrived at the fourth and final installment of our blog series. In this series, we’ve been peeling back the layers of Apache Kafka® to get a deeper understanding of how best to interact with the cluster using producer and consumer clients.

Read on for the final part, as well as links to previous parts if you missed them.
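If you want to watch fetch behavior from the client side while reading, here's a bare-bones consumer using the confluent-kafka Python package; broker address and topic are placeholders. fetch.min.bytes and fetch.wait.max.ms (librdkafka's spelling of the Java client's fetch.max.wait.ms) are the two settings that most directly shape how the broker answers each fetch request:

    from confluent_kafka import Consumer

    consumer = Consumer({
        "bootstrap.servers": "localhost:9092",  # placeholder broker
        "group.id": "fetch-demo",
        "auto.offset.reset": "earliest",
        # Each fetch waits for at least 1 KB of data or 500 ms, whichever
        # comes first: the basic latency-versus-batching trade-off.
        "fetch.min.bytes": 1024,
        "fetch.wait.max.ms": 500,
    })
    consumer.subscribe(["my-topic"])  # placeholder topic

    try:
        while True:
            msg = consumer.poll(1.0)  # drives fetch requests under the hood
            if msg is None:
                continue
            if msg.error():
                print(msg.error())
                continue
            print(msg.topic(), msg.partition(), msg.offset(), msg.value())
    finally:
        consumer.close()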


Comparing Azure Kubernetes Service and Container Apps

Gaurav Shukla makes a comparison:

Hello Readers!! Welcome to the new blog!! AKS vs ACA: which is best for cloud migration? When migrating an application to the cloud, choosing the right platform is crucial to ensure scalability, cost-effectiveness, and ease of management. Two of the prominent services offered by Azure for running containerized applications are Azure Kubernetes Service (AKS) and Azure Container Apps (ACA). Both are excellent choices, but their use cases, complexity, and operational overhead differ significantly. This blog will provide a detailed comparison of AKS and ACA, helping you decide which is the best approach for your cloud migration.

Read on for an overview of each service and a nice table outlining the differences.


The Challenge of Major Version Upgrades in PostgreSQL

Peter Eisentraut lays out the explanation:

Upgrades between PostgreSQL major versions are famously annoying. You can’t just install the server binaries and restart, because the format of the data directory is incompatible.

Why is that? Why can’t we just keep the data format compatible?

Perhaps surprisingly, the data format is actually mostly compatible, but not completely. There are just a few things missing that are very hard to solve.

Perhaps I'm not as sympathetic as I should be to the core developers, but other RDBMS platforms have a direct upgrade path from version to version, so it's hardly insurmountable.
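One concrete piece of that incompatibility is easy to see: every data directory carries a PG_VERSION file recording the major version it was initialized with, and server binaries from a different major version will refuse to start against it. A trivial Python check, with a placeholder data directory path:

    from pathlib import Path

    # Placeholder path to a PostgreSQL data directory.
    pgdata = Path("/var/lib/postgresql/16/main")

    # PG_VERSION holds the major version, e.g. "16". A PostgreSQL 17 binary
    # pointed at this directory will refuse to start, which is why a major
    # upgrade needs pg_upgrade or a dump-and-restore.
    on_disk = (pgdata / "PG_VERSION").read_text().strip()
    print(f"data directory initialized for PostgreSQL {on_disk}")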
