Press "Enter" to skip to content

Curated SQL Posts

Static versus Dynamic Partitioning in Hive

The Hadoop in Real World team explains the difference between two partitioning strategies:

The difference between static and dynamic partitioning only exists when the partition is being created based on how the partitions are added to the table. Once the partitions are created, the tables won’t have any difference like static and dynamic partitions. All partitions are treated and one and the same.

Click through for the difference.

Comments closed

Improving Managed Instance Load Performance

Niko Neugebauer provides some tips on improving data load performance when using SQL Managed Instances in the General Purpose tier:

In this blog post we shall consider some of the strategies for improving data loading performance in Azure SQL Managed Instance. These strategies apply to the repeatable ETL processes, meaning that if there a problem, data loading process can be repeated without loss of any bit of data or its consistency. Of course, these strategies do require a prepared design for the possibility of the repetition, but this is a basic requirement for any data loading process.

There are many great ways of ensuring high performance data loading and this blog post does not pretend to be exhaustive an ultimate resource, but it will provide a couple of known paths that can drastically improve the performance of the Log Throughput in Azure SQL MI.

Click through for some tips, including some which make sense on-prem and a couple which are specific to platform-as-a-service offerings.

Comments closed

The Benefits of Indirect Checkpoints

Aaron Bertrand explains why it’s probably a good idea to enable indirect checkpoints:

At Stack Overflow, the upgrade to SQL Server 2019 was not as smooth as expected. It introduced long recovery times and high CPU on a secondary, during synchronization activity after failover, patching, and network connectivity testing. This symptom wasn’t present during similar activities and workloads under SQL Server 2017.

Aaron points out that if you have databased created in older versions of SQL Server, enabling indirect checkpoints can provide a performance boost to certain activities around log writing., including backup times.

Comments closed

Wide Non-Clustered Indexes as Kinda-Sorta Clustered Indexes

Grant Fritchey gets our hopes up:

Everyone knows that you only get a single clustered index, right? Wouldn’t it be great though if you could have two clustered indexes?

Well, you can. Sort of. Let’s talk about it.

Click through to see what Grant means. This is a thing that I’ve done occasionally, though much more often, I’ve ripped it out because Database Tuning Advisor suggested it and a credulous user took DTA at its word.

Comments closed

The Power BI Icon Map Custom Visual

Alice Drummond shows off a custom visual in Power BI:

Working across the environmental industry – it’s fair to say that pretty much ALL of DiscoverEI’s Power BI reports have a map! And if you thought that MapBox was good, well you’re going to LOVE the Icon Map custom visual in Power BI – which allows you to display interactive polygons, points, lines and of course – icons, all in the one visual…plus so much more!

We’ve been using the Icon Map custom visual in Power BI for the last couple of years and it’s safe to say that it is hands down our favourite mapping visual for Power BI! This visual is created for free for the community by James Dales. James is always making updates and enhancements to the visual so the best place to get the latest version and some helpful instructions and tips on how to use it is from his dedicated website: https://icon-map.com/index.html. And while you’re there remember to shout James a coffee or 10 to say thanks for creating such a fantastic mapping visual

Click through for a few examples of the visual and be sure to check it out in the link above.

Comments closed

CORRESPONDING and ANSI SQL

Lukas Eder looks at a rarely-implemented keyword in SQL:

I recently stumbled upon a standard SQL feature that was implemented, to my surprise, in HSQLDB. The keyword is CORRESPONDING, and it can be used with all set operations, including UNIONINTERSECT, and EXCEPT.

Click through to see what it does. Be sure to check out the comments, where Joe Celko pops in to provide some additional historical context to explain why you won’t find this keyword in many implementations of the standard..

Comments closed

Configuring a Debezium Connector for Event Hub Streaming

Niels Berglund continues a series:

This series came about as I in the post How to Use Kafka Client with Azure Event Hubs, somewhat foolishly said:

An interesting point here is that it is not only your Kafka applications that can publish to Event Hubs but any application that uses Kafka Client 1.0+, like Kafka Connect connectors!

I wrote the above without testing it myself, so when I was called out on it, I started researching (read “Googling”) to see if it was possible. The result of the “Googling” didn’t give a 100% answer, so I decided to try it out, and this series is the result.

In the first post, – as mentioned – we configured Kafka Connect to connect into Event Hubs. In this post, we look at configuring the Debezium connector.

Click through and enjoy the fruits of Berglund’s Folly—which, as far as it goes, I’d still rate Seward’s Folly higher but this one’s pretty good too.

Comments closed

Automating Azure SQL DB Maintenance with Data Factory

Hiten Bhavsar works around the lack of SQL Agent:

As we know, it’s crucial that you run Database maintenance regularly in order to keep your database performance up with the latest statistics and healthy indexes, here we provide another way to schedule this job using Azure Data Factory; this can be done on a scheduled time interval weekly/bi-weekly/monthly.

Click through for the process and try not to think too hard about this secretly being maintenance plans all over again.

Comments closed

Auto-Failover Groups for Azure SQL Hyperscale

Melody Zacharias fills us in on a recent announcement:

On January 5th they announced, auto-failover groups for Azure SQL Hyperscale are now available in preview. Auto-failover groups is a feature that allows you to manage the failover and replication of a group of databases on a server or managed instance from one region to another region in Azure. This can be done manually or in conjunction with a user-defined policy. 

Click through for more information on how it all works.

Comments closed