Press "Enter" to skip to content

Curated SQL Posts

Breaking the World with auto_explain

Ryan Lambert gets a lot of explanation:

Postgres has a handy module called auto_explain. The auto_explain module lives up to its name: it runs EXPLAIN automatically for you. The intent for this module is to automatically provide information useful for troubleshooting about your slow queries as they happen. This post outlines a pitfall I recently discovered with auto_explain. Luckily for us, it’s an easy thing to avoid.

I discovered this by running CREATE EXTENSION postgis; and watching it run for quite a while before failing with an out of disk space error. That is not my typical experience with a simple CREATE EXTENSION command!

Read on to learn what happened and how you can prevent making a similar mistake.

Comments closed

Capturing Event Hubs Data in Delta Lake Format with Stream Analytics

Xu Jiang announces a public preview:

The Stream Analytics no-code editor is a drag and drop design tool that helps customers to develop the Stream Analytics jobs without writing a single line of code. The experience provides a canvas that allows you to connect to input sources to quickly see your streaming data. Then you can transform and preview it before writing to your destination of choice in Azure. To learn more, see No-code stream processing through Azure Stream Analytics | Microsoft Learn.

Read on to see how you can capture and process data into Delta Lake format via their designer.

Comments closed

Time Intelligence Templates in Bravo for Power BI

Marco Russo and Alberto Ferrari try out some templates:

Thanks to Bravo for Power BI, creating a Date table and applying time intelligence calculations to existing model measures has never been easier. With a few clicks, the Power BI model gets the required updates, and you can further modify the code generated.

Bravo provides several ready-to-use templates based on the Time Intelligence patterns published on the DAX Patterns website. However, the pattern may not provide all the features required. There could be columns and measures you want to remove, or you might need additional columns or time intelligence calculations that are not part of the template.

Read on to see two ways you could resolve this.

Comments closed

CountIf in KQL

Robert Cain continues a series on KQL:

In my previous post, Fun With KQL – DCountIf, we saw how you could apply a filter to the data directly within the dcountif function.

You may have been thinking gosh, it sure would be nice if we could do that within the count function! (Note, you can read more about count in my Fun With KQL – Count blog post.)

Well fear not intrepid Kusto coder, there is just such a function to fit your needs: countif.

As always, Robert has examples for us, so check those out.

Comments closed

Statistical Analysis in Azure ML

Tomaz Kastrun continues an advent of Azure ML. Day 18 takes us through feature exploration:

Azure Machine Learning is also a great tool to do ordinary statistical analysis, graph plotting and everything that goes along.

Let’s get an open dataset, that is available on UCI Machine Learning repository and import it in the pandas dataframe.

Day 19 picks up with feature engineering:

Yesterday we have shown, that statistical analysis and all bolts and whistles can be done super simple in Azure machine learning. Today we will continue with feature engineering and modelling.

So, what is feature engineering? Is a general process and can involve both feature construction: adding new features from the existing data, and feature selection: choosing only the most important features for improving model performance, reducing data dimensionality, doing log-transformation, removing outliers, to do scaling (normalisation, standardisation), imputations, general transformation (and others, as polynomial), variable creation, variable extraction and so on.

Comments closed

Coloring Tree Branches in R

Ali Oghabian marks up a tree:

After running Hierarchical clustering we can cut the result binary tree at a certain depth or request that it be cut in a manner that would result a certain number of clusters. Here, I request that the resulted binary tree be cut in away that would result to 2 sample clusters. Furthermore, I convert the resulted tree to a “dendogram” object and colour the branches and the labels of the tree to visualize the 2 clusters. One can use color_branches and color_labels functions to cut and colour the trees.

Read on for a demonstration. H/T R-Bloggers.

Comments closed

Building Sparklines in Excel

Chris Webb tries out a QuickChart feature:

Last week I presented at the Nashville Modern Excel user group and before my session started Will Harvey, the organiser, did a quick run-through of what’s new in the Power Platform. He highlighted a service that piqued my interest called QuickChart which has a new Power Automate connector. QuickChart have an API that you can pass your data to and which will return an image of that data in a chart; they also have a free tier so you can play with it easily. 

Read on for a demonstration, though do note that if you wanted to do it in practice, this is a paid product.

Comments closed

Advent of Code in Postgres

Ryan Booz has some videos for us:

As I go, I’ll upload my finished code to my repo and (hopefully) record a video for each day explaining my approach. I’ve finished through day 8 (as of December 14, 2022), but haven’t finished all of the videos yet. Again, my goal is to enjoy the challenge and learn, not try to be the first one done or get on a leaderboard somewhere. This also means it will probably take into January 2023 to finish the puzzles and record videos.

Enough explanation! Let’s talk about the first five puzzles, including links to the videos.

Read on for days 1-5 of the challenge.

Comments closed

Auto-Drop Stats in SQL Server 2022

Dennes Torres takes out the trash:

Auto-Create statistics is a well-known feature critical for the database performance. When one statistic is auto created, changing the table structure is not blocked by the presence of the statistic. An auto-created statistics is also dropped automatically when a schema change happens.

On the other hand, if the statistic is created by the user, any schema change will be blocked by the presence of the statistic.

The Auto-Drop setting on a statistic is a new SQL Server 2022 feature to change this behaviour.

Click through for a demonstration.

Comments closed

Azure SQL Database Performance Roundup

Reitse Eskens shares the goods:

In the past 9 blogs, I’ve shown you all sorts of Azure SQL database solutions and gave them a little run for their money. I’ve tested a lot and written about them. This blog will be about the summation of the data and my views on the combined graphs. At the end I’ll wrap it up with my way of working when a new project starts.

But before I kick off, a little Christmas present. What I didn’t do, until now, is give you access to more raw data. Now is the moment to give you more raw number to play around with for yourself and do your own analysis. Fun as it might be, I’d highly encourage you to use my sheets as a jumping point and adapt them for your own workloads. You can find the two Excel files via the link for the scripts.

This is a post I’d been waiting for, as it covers the comparisons between tiers directly, rather than inferring it from the various posts.

Comments closed