Press "Enter" to skip to content

Curated SQL Posts

Multiple Code Panes in R Studio

Tomaz Kastrun has good news for us:

On R studio home page, make sure to download the version 2021.09 Preview (as of writing of this blogpost, this is still in preview) and install this version on your client machine (supported windows machine, MacOS and Linux).

Once installation is completed, head to global options (Tools->Global options) and select Pane Layout. You will have a new set of buttons available (Add Column; Remove Column). With Add column an additional pane will be added to layout.

It’s not as convenient as the right-click –> “Split horizontally” or “Split vertically” that we get in tools like SSMS and VS Code, but I’m happy to see this change in R Studio.

Comments closed

Replication Error 20084 on SQL Server 2019

I ran into a weird issue:

Iwas helping out with a SQL Server upgrade recently, going from 2016 to 2019. We ran into a problem when trying to run replmerg.exe for a merge replication subscription. Specifically, we were getting error code 20084, which means that the replication process couldn’t connect to one of the instances. Interestingly, the process couldn’t connect to the local instance, and the failure was immediate—that is, within a couple of milliseconds. There was nothing in the management logs on either the distributor server or the subscriber server which indicated a problem. We were able to connect both sides together just fine—from the subscriber, we could connect to the distributor, and from the distributor, we could connect to the subscriber.

Click through for what error code 20084 typically means, as well as what turned out to be the problem here.

Comments closed

Swap an Identity Column with a Sequence

Greg Larsen looks at sequences:

Microsoft introduced the sequence number objects starting with SQL Server 2012. A sequence object generates sequence numbers based on starting and increment values, similar to an identity column, but it has additional features. Over time, you might find that the additional benefits of a sequence number have you wanting to replace an identity column with a sequence number. This article demonstrates two options for replacing an identity column with a sequence number.

When it came out, I thought I was going to use sequences a quite often. In practice, I’ve used it a few times, but IDENTITY is so much easier to type and I’ve rarely needed sequence generators. That said, the times I have needed them, I definitely appreciate their existence.

Comments closed

Power BI Model Documenter v2

Marc Lelijveld announces a new version of the Power BI Model Documenter:

Back in 2020, I released the first version of the Power BI external tool to document your Power BI data model. Since then, I wrote a lot about this topic, such as why adding descriptions to everything is important and various releases of the model documenter.

Users encountered various challenges with the initial release of the tool. I tried to help everyone to my best knowledge, but some issues kept coming back. Over the past period, I worked together with my colleague Ton Swart to solve all these challenges in a new updated version!

Read on to see what has changed.

Comments closed

Finding Substrings in a String with T-SQL

Kevin Wilkie avoids a regex:

Continuing on with our series from last time – see here if you somehow missed it – let’s have some more fun with the different functions we can use with strings.

This time, let’s focus on looking for different items we can use to find a string within a string.

With T-SQL not natively supporting regular expressions—though you can use a CLR module to do this—click through to see what Kevin uses.

Comments closed

2022 Data Professional Salary Survey

Brent Ozar wraps up another year of surveying:

Every year, I run a salary survey for folks in the database industry. This year, I was especially curious to see the results to find out whether salaries went up. Anecdotally, I’ve seen a lot of people jumping ship to new companies due to the Great Resignation – but what does the data actually show? Let’s find out.

Click through to grab a copy of the survey and get analyzing.

Comments closed

Serializing Clustered Columnstore Index Deletes

Aaron Bertrand hit a wall:

At Stack Overflow, we have some tables using clustered columnstore indexes, and these work great for the majority of our workload. But we recently came across a situation where “perfect storms” — multiple processes all trying to delete from the same CCI — would overwhelm the CPU as they all went widely parallel and fought to complete their operation. Here’s what it looked like in SolarWinds® SQL Sentry®:

It looks bad. Click through to understand why and what Aaron & co did to prevent this issue. I typically have used queue tables on the other end of columnstore indexes: as a method for ensuring that we insert 1024*1024 rows at a time. This was particularly important in the 2016 days, as we had a problem in which trickle-loading a columnstore index would cause massive numbers of rowgroups with dozens of rows, though that issue was subsequently fixed.

Comments closed

Enhancing Color Photographs via Generative Adversarial Networks

Neil Saunders re-colorizes photographs:

When I’m not at the computer writing R code, I can often be found at the computer processing photographs. Or at the computer browsing Twitter, which is how I came across Stuart Humphryes, a digital artist who enhances autochromes. Autochromes are early colour photographs, generated using a process patented by the Lumière brothers in 1903. You can find and download many examples of them online. Stuart uses a variety of software tools to clean, enhance and balance the colours, resulting in bright vivid images that often have a contemporary feel, whilst at the same time retaining the somewhat “dreamy” quality of the original.

Having read that one of his tools uses neural networks, I was keen to discover how easy it is to achieve something similar using freely-available software found online. The answer is “quite easy” – although achieving results as good as Stuart’s is somewhat more difficult. Here’s how I went about it.

Click through for the process and some really nice-looking post-production photographs.

Comments closed

Using a Kafka Client with Azure Event Hubs

Niels Berglund takes us through one way to work with Azure Event Hubs:

This blog post came by, by accident, lol. A couple of weeks ago, I started to prepare for a webinar: Analyze Billions of Rows of Data in Real-Time Using Azure Data Explorer. One of the demos in that webinar is about ingesting data from Apache Kafka into Azure Data Explorer. When prepping, I noticed that my Confluent Cloud Kafka cluster didn’t exist anymore, so I had to come up with a workaround. That workaround was to use Azure Event Hubs instead of Kafka.

Since I already had the code to publish data to Kafka and knew that you could use the Kafka client to publish to Event Hubs, I thought I’d test it out. I did run into some minor snags along the way, so I thought I’d write a blog post about it. Then, at least, I have something to go back to. This post also looks at how to set up an Event Hubs cluster.

Click through to see it in action.

Comments closed

Air-Gapping Backups

Chad Callihan notes an increasingly dangerous problem:

I’m guessing that we’ve all dealt with a computer virus or malware at some point in our lives. Dealing with these issues can range from annoying to absolutely devastating.

Even at their worst, viruses or malware are not typically on the same level as a ransomware attack. Sure, you may hate not being able to access your games or documents on your laptop due to a virus. But when a company is attacked with ransomware there are more financial complications.

Ransomware has picked up steam in recent years, making the idea of keeping backups air-gapped more appealing.

I chatted with a consulting client of mine yesterday about this very thing. One of their customers was hit by a major ransomware attack which put them behind for an entire year (and would have been considerably worse if they hadn’t been using my client’s software, as it has both cloud and on-prem components, so they could safely fall back to the cloud side while they rebuilt their entire on-prem infrastructure from scratch). Recent ransomware will not only encrypt the files on local disk but will also try to figure out where backups get stored, access that location using the backup service account, and encrypt those backup files as well.

Comments closed