Curated SQL Posts

Working with .NET Polyglot Notebooks

Matt Eland installs Polyglot Notebooks in VS Code:

Polyglot Notebooks is a powerful new interactive notebook technology that lets you run experiments in your editor, mix together code and rich documentation, and write code in a variety of languages to accomplish your tasks.

This article will guide you through the setup process to get Polyglot Notebooks running on your machine so you can do local data science and analytics notebooks using dotnet.

In case you’re curious, my installation has the ability to create notebooks in F#, C#, HTML, JavaScript, KQL, Markdown, Mermaid, PowerShell, and SQL. I’m not positive how many of those come with the extension itself and how many are additional kernels I got from installing other extensions.

Connecting Power BI to ADX via Private Endpoint

Dany Hoter keeps it all on the Azure backbone:

The PBI developer creating datasets and reports needs to connect to the ADX cluster using Power BI Desktop.

To establish such a connection, the user’s IP address should be allowed access to the private endpoint.

The access should be tested using Kusto Web Explorer (KWE) to make sure that the cluster can be reached.

If KWE can connect, Power BI Desktop should also connect successfully, and a report using the cluster in DirectQuery or import mode can be created.

That’s the goal, and Dany shows us the way to do it.

The Value of PostgreSQL in Azure

Grant Fritchey does some explaining:

I’ve had people come up to me and say “PostgreSQL is open source and therefore license free. Why on earth would I put PostgreSQL in Azure?”

Honestly, I think that’s a very fair question. The shortest possible answer is, of course, you don’t have to. You can host your own PostgreSQL instances on local hardware, or build out VMs in Azure and put PostgreSQL out there, some other VM host, or maybe in Kubernetes containers, I mean, yeah, you have tons of options. So why PostgreSQL in Azure, and specifically, I mean the Platform as a Service offering? Let’s talk about it.

The biggest issue I’ve historically had with PostgreSQL or MySQL platform-as-a-service offerings in Azure is that Microsoft is always behind the release curve. With PostgreSQL, it’s not so bad—flexible server offers version 14.7, which is one major version behind Postgres itself (15) but at least the latest minor version. They’ve caught up on MySQL, but for a while, they were way behind.

Building an About Topic Help File in PowerShell

Robert Cain teaches us about teaching others about something:

In my previous post, Fun With PowerShell – Authoring Help, I covered how to author comment based help for your functions.

In addition to help for your functions, it’s also possible to write about_ help. PowerShell includes many about topics covering PowerShell itself.

These about topics are designed to provide further information for your users, information that may not fit into the confines of a function’s help. These texts can be as long as you need.

I will say that the PowerShell team nailed it with the way they implemented help.

Optimizing Text Search in DAX

Marco Russo and Alberto Ferrari prime the pump:

When you import a table in Power BI, all the strings contained in a text column are stored in a dictionary, which improves the compression and provides excellent query performance when there is a filter with an exact match for the column value. However, reports that apply complex filters on a text column may have performance issues when the dictionary has a large number of values: depending on many other variables, a column with a few thousand unique values might already present a bottleneck, and this is definitely an issue when there are hundreds of thousands of unique strings in a column.

In October 2022, there was an internal optimization in Power BI that improved the performance of these searches by creating an internal index. Chris Webb described this optimization in his article, Text search performance in Power BI. In this article, we explore how to evaluate whether the optimization is applied and how to measure any performance improvements. As usual, everything comes at a price: creating the index has a cost that you will see applied to the first query hitting the column. We will also see how to detect this event and the existing limitations for this optimization.

Click through for their deep dive into the process. The final answer reminds me of the warehousing world, where you might pre-run some important queries to get those pages into the buffer pool and available for later reports.

Speeding Up a Slow Kafka Consumer with Parallelism

Paul Brebner continues a series on Kafka consumers:

In Part 1 of this series, we had a look at Kafka concurrency and throughput work, recapped some earlier approaches I used to improve Kafka performance, and introduced the Kafka Parallel Consumer and supported ordering options (Partition, Key, and Unordered). In this second part we continue our investigations with some example code, a trace of a “slow consumer” example, how to achieve 1 million TPS in theory, some experimental results, what else we know about the Kafka Parallel Consumer, and finally, whether you should use it in production.

Read on to see what Paul has to say about the topic.

Converting Letters to Pre-Smartphone Numeric Codes

Tomaz Kastrun rolls back the clocks:

What a nightmare it was to type a short message on these keypads. So, imagine writing Hello on this keypad: you had to press 4433555555666 to get the letters “hello”.

44 = h
33 = e
555 = l
555 = l
666 = o

So creating a converter would be a great way to troll your friends.

This seems like as good a reason as any to create such a function. Click through to see how to do it in R.
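
The linked post builds its converter in R. As a rough illustration of the same idea, here is a minimal Python sketch. The names KEYPAD, LETTER_TO_CODE, and to_keypad are made up for this example, and it assumes the standard multi-tap letter layout on keys 2 through 9; it is not Tomaz's implementation.

```python
# Multi-tap layout of a classic phone keypad: each digit key and its letters.
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}

# Invert the layout: each letter maps to its digit, repeated once per
# position of the letter on the key (h is second on key 4, so "44").
LETTER_TO_CODE = {
    letter: digit * (position + 1)
    for digit, letters in KEYPAD.items()
    for position, letter in enumerate(letters)
}


def to_keypad(text: str) -> str:
    """Convert letters to multi-tap digit runs; other characters are dropped."""
    return "".join(LETTER_TO_CODE.get(ch, "") for ch in text.lower())


print(to_keypad("Hello"))  # 4433555555666
```

Like the example in the quote, this simply concatenates the key presses, so two consecutive letters on the same key (the double l) run together. A real handset needed a pause between them, and a reversible encoding would need a separator.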

pg_stat_statements and Public Sentiment

Andreas Scherbaum polls the audience:

For anyone who doesn’t know, I’m running a weekly interview series with people from the PostgreSQL community. It’s called “PostgreSQL Person of the Week“. One of the questions in the default set I give everyone is:

What is your favorite PostgreSQL extension?

And guess what the answer is: by far everyone’s favorite is pg_stat_statements!

Read on to learn a bit more about what the extension does, why people like it, and what other extensions interviewees prefer.

Tracking Configuration-Based Performance Differences in Postgres

Ryan Lambert shows off a Postgres extension:

This is my entry for PgSQL Phriday #008. It’s Saturday, so I guess this is a day late! This month’s topic, chosen by Michael from pgMustard, is on the excellent pg_stat_statements extension. When I saw Michael was the host this month I knew he’d pick a topic I would want to contribute on! Michael’s post for his own topic provides helpful queries and good reminders about changes to columns between Postgres versions 12 and 13.

In this post I show one way I like using pg_stat_statements: tracking the impact of configuration changes to a specific workload. I used a contrived change to configuration to quickly make an obvious impact.

Read on for the example.
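
One way to picture this kind of comparison, which is not necessarily the approach Ryan takes, is to snapshot the cumulative counters before and after each workload run and compare the differences. The Python sketch below assumes the snapshots have already been read from pg_stat_statements into dictionaries keyed by queryid, using the calls and total_exec_time columns (the Postgres 13+ column name). StatRow and mean_time_between are hypothetical names, and the numbers are invented purely for illustration.

```python
from typing import Dict, NamedTuple


class StatRow(NamedTuple):
    calls: int              # cumulative number of executions
    total_exec_time: float  # cumulative execution time in milliseconds


def mean_time_between(
    before: Dict[int, StatRow], after: Dict[int, StatRow]
) -> Dict[int, float]:
    """Mean execution time (ms) per queryid for calls made between two snapshots."""
    result = {}
    for queryid, end in after.items():
        # A query that first appears in the later snapshot starts from zero.
        start = before.get(queryid, StatRow(0, 0.0))
        new_calls = end.calls - start.calls
        if new_calls > 0:
            elapsed = end.total_exec_time - start.total_exec_time
            result[queryid] = elapsed / new_calls
    return result


# Invented snapshots: the same query (queryid 101) measured under two settings.
old_config = mean_time_between({101: StatRow(10, 50.0)}, {101: StatRow(20, 150.0)})
new_config = mean_time_between({101: StatRow(20, 150.0)}, {101: StatRow(30, 450.0)})
print(old_config[101], new_config[101])  # 10.0 30.0
```

Working from differences between snapshots avoids having to reset the statistics between runs, at the cost of keeping the earlier snapshot around.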
