Press "Enter" to skip to content

Month: January 2022

Swap an Identity Column with a Sequence

Greg Larsen looks at sequences:

Microsoft introduced the sequence number objects starting with SQL Server 2012. A sequence object generates sequence numbers based on starting and increment values, similar to an identity column, but it has additional features. Over time, you might find that the additional benefits of a sequence number have you wanting to replace an identity column with a sequence number. This article demonstrates two options for replacing an identity column with a sequence number.

When it came out, I thought I was going to use sequences a quite often. In practice, I’ve used it a few times, but IDENTITY is so much easier to type and I’ve rarely needed sequence generators. That said, the times I have needed them, I definitely appreciate their existence.

Comments closed

Power BI Model Documenter v2

Marc Lelijveld announces a new version of the Power BI Model Documenter:

Back in 2020, I released the first version of the Power BI external tool to document your Power BI data model. Since then, I wrote a lot about this topic, such as why adding descriptions to everything is important and various releases of the model documenter.

Users encountered various challenges with the initial release of the tool. I tried to help everyone to my best knowledge, but some issues kept coming back. Over the past period, I worked together with my colleague Ton Swart to solve all these challenges in a new updated version!

Read on to see what has changed.

Comments closed

Finding Substrings in a String with T-SQL

Kevin Wilkie avoids a regex:

Continuing on with our series from last time – see here if you somehow missed it – let’s have some more fun with the different functions we can use with strings.

This time, let’s focus on looking for different items we can use to find a string within a string.

With T-SQL not natively supporting regular expressions—though you can use a CLR module to do this—click through to see what Kevin uses.

Comments closed

2022 Data Professional Salary Survey

Brent Ozar wraps up another year of surveying:

Every year, I run a salary survey for folks in the database industry. This year, I was especially curious to see the results to find out whether salaries went up. Anecdotally, I’ve seen a lot of people jumping ship to new companies due to the Great Resignation – but what does the data actually show? Let’s find out.

Click through to grab a copy of the survey and get analyzing.

Comments closed

Serializing Clustered Columnstore Index Deletes

Aaron Bertrand hit a wall:

At Stack Overflow, we have some tables using clustered columnstore indexes, and these work great for the majority of our workload. But we recently came across a situation where “perfect storms” — multiple processes all trying to delete from the same CCI — would overwhelm the CPU as they all went widely parallel and fought to complete their operation. Here’s what it looked like in SolarWinds® SQL Sentry®:

It looks bad. Click through to understand why and what Aaron & co did to prevent this issue. I typically have used queue tables on the other end of columnstore indexes: as a method for ensuring that we insert 1024*1024 rows at a time. This was particularly important in the 2016 days, as we had a problem in which trickle-loading a columnstore index would cause massive numbers of rowgroups with dozens of rows, though that issue was subsequently fixed.

Comments closed

Enhancing Color Photographs via Generative Adversarial Networks

Neil Saunders re-colorizes photographs:

When I’m not at the computer writing R code, I can often be found at the computer processing photographs. Or at the computer browsing Twitter, which is how I came across Stuart Humphryes, a digital artist who enhances autochromes. Autochromes are early colour photographs, generated using a process patented by the Lumière brothers in 1903. You can find and download many examples of them online. Stuart uses a variety of software tools to clean, enhance and balance the colours, resulting in bright vivid images that often have a contemporary feel, whilst at the same time retaining the somewhat “dreamy” quality of the original.

Having read that one of his tools uses neural networks, I was keen to discover how easy it is to achieve something similar using freely-available software found online. The answer is “quite easy” – although achieving results as good as Stuart’s is somewhat more difficult. Here’s how I went about it.

Click through for the process and some really nice-looking post-production photographs.

Comments closed

Using a Kafka Client with Azure Event Hubs

Niels Berglund takes us through one way to work with Azure Event Hubs:

This blog post came by, by accident, lol. A couple of weeks ago, I started to prepare for a webinar: Analyze Billions of Rows of Data in Real-Time Using Azure Data Explorer. One of the demos in that webinar is about ingesting data from Apache Kafka into Azure Data Explorer. When prepping, I noticed that my Confluent Cloud Kafka cluster didn’t exist anymore, so I had to come up with a workaround. That workaround was to use Azure Event Hubs instead of Kafka.

Since I already had the code to publish data to Kafka and knew that you could use the Kafka client to publish to Event Hubs, I thought I’d test it out. I did run into some minor snags along the way, so I thought I’d write a blog post about it. Then, at least, I have something to go back to. This post also looks at how to set up an Event Hubs cluster.

Click through to see it in action.

Comments closed

Air-Gapping Backups

Chad Callihan notes an increasingly dangerous problem:

I’m guessing that we’ve all dealt with a computer virus or malware at some point in our lives. Dealing with these issues can range from annoying to absolutely devastating.

Even at their worst, viruses or malware are not typically on the same level as a ransomware attack. Sure, you may hate not being able to access your games or documents on your laptop due to a virus. But when a company is attacked with ransomware there are more financial complications.

Ransomware has picked up steam in recent years, making the idea of keeping backups air-gapped more appealing.

I chatted with a consulting client of mine yesterday about this very thing. One of their customers was hit by a major ransomware attack which put them behind for an entire year (and would have been considerably worse if they hadn’t been using my client’s software, as it has both cloud and on-prem components, so they could safely fall back to the cloud side while they rebuilt their entire on-prem infrastructure from scratch). Recent ransomware will not only encrypt the files on local disk but will also try to figure out where backups get stored, access that location using the backup service account, and encrypt those backup files as well.

Comments closed

Implementing GREATEST in SQL Server 2019

Ronen Ariely is on a mission to be the greatest:

The function GREATEST returns the maximum value from a list of one or more expressions. It returns the data type with the highest precedence from the set of types passed to the function.

This function was added to Azure SQL. At this time, it is supported in Azure SQL Database, Azure SQL Managed Instance and Azure Synapse Analytics serverless. 

Unfortunately, it not yet supported on SQL Server on premises and synapse dedicated sql pool.

Click through for a pair of alternative constructs while we wait for GREATEST on-premises.

Comments closed

Getting Started with KQL

Steve Jones starts learning about the Kusto Query Language:

I saw an episode of Data Exposed with my good friend, Hamish Watson. He talked about KQL (Kusto Query Language) being the next query language you need to learn. I was skeptical of the title, but I decided to give this a try.

In the episode, Hamish points out a cheat sheet from Microsoft, which I thought was a good resource. However, while watching the video, I browsed over to the demo site Microsoft has at https://aka.ms/lademo. You need an Azure account to log in, but this is a demo site where you can query some Log Analytics data. The new query window below is what appears when you go here:

If you’re already familiar with the way Splunk’s filtering language works, KQL follows from it. It’s a worthwhile language for Azure-based administrators to know, as it’s the most powerful way to get data out of Log Analytics.

Comments closed