
Curated SQL Posts

SQL Server Truncating Numbers to Asterisks

Bert Wagner points out that some numeric types handle overflow in a weird way:

Why does SQL Server sometimes error when converting a number into a string, but other times succeeds and returns an asterisk?

I don’t know.

The best (and logical) answer I could find online is from Robert Sheldon, who attributes it to poor error handling practices, “…before error handling got a more reputable foothold.”

This makes it important to check your results. I imagine that there’s somebody who relies upon this exact functionality, but it’s pretty weird.


The Fickleness of Batch Mode on Rowstore

Erik Darling points out how difficult it can sometimes be to get batch mode processing on rowstore tables:

I’m excited about this feature. I’m not being negative, here. I just want you, dear reader, to have reasonable expectations about it.

This isn’t a post about it making a query slower, but I do have some demos of that happening. I want to show you an example of it not kicking in when it probably should. I’m going to use an Extended Events session that I first read about on Dmitry Pilugin’s blog here. It’ll look something like this.

Read on for a demonstration of the point.


TRY_PARSE and NaN

Slava Murygin finds a nasty bug in SQL Server:

The database in trouble has a table with a FLOAT column. Its front-end application verifies users' input and inserts the data into that column using the TRY_PARSE function.
The developer’s intention was that any “Not-a-Numeric” or “Out-of-Range” values will be automatically converted to NULL and it will be for user’s discretion to verify and fix these values.

However, one of the application users was very educated and, instead of an empty space, NULL, or any other bad non-numeric value, supplied the value “NaN” for empty cells, which simply stands for “Not a Number”.
That action caused a database corruption!

Click through for a demo which you should not repeat on a work server.


DBLog: CDC for MySQL and Postgres

Andreas Andreakis and Ioannis Papapanagiotou announce a new change data capture tool for open source databases:

In databases like MySQL and PostgreSQL, transaction logs are the source of CDC events. As transaction logs typically have limited retention, they aren’t guaranteed to contain the full history of changes. Therefore, dumps are needed to capture the full state of a source. There are several open source CDC projects, often using the same underlying libraries, database APIs, and protocols. Nonetheless, we found a number of limitations that could not satisfy our requirements e.g. stalling the processing of log events until a dump is complete, missing ability to trigger dumps on demand, or implementations that block write traffic by using table locks.

This motivated the development of DBLog, which offers log and dump processing under a generic framework. In order to be supported, a database is required to fulfill a set of features that are commonly available in systems like MySQL, PostgreSQL, MariaDB, and others.

It looks like DBLog is not open source just yet, but that’s forthcoming.


Functional Java

Rishi Khandelwal lays out imperative versus functional Java with several examples:

As a Java developer, you may have wondered: should I move to the functional programming paradigm? What benefits does it provide? People are talking about it everywhere. So let’s give it a try once and then you can decide whether you should go to the functional paradigm or not.

We will see the functional programming features one by one with code examples and compare them with the imperative way of Java programming.

The snide part of me says “Hey, look, Java’s almost caught up to C# 3.0!” But that’s pushing it a little far. I think these functional pieces improve the language similarly to how they did C#, but if I were a regular Java developer, I’d probably look to Scala or Kotlin instead (says the guy who won’t shut up already about F#).
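
To make the imperative-versus-functional contrast concrete, here is a minimal sketch. It is not taken from Rishi's post: the class name, method names, and sample data are all hypothetical, and it assumes Java 9 or later for List.of. Both methods keep names of at least a given length and upper-case them, first with an explicit loop and then with a stream pipeline.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical example class; the names here are illustrative only.
class ImperativeVsFunctional {

    // Imperative style: an explicit loop that mutates an accumulator list.
    static List<String> longNamesImperative(List<String> names, int minLength) {
        List<String> result = new ArrayList<>();
        for (String name : names) {
            if (name.length() >= minLength) {
                result.add(name.toUpperCase());
            }
        }
        return result;
    }

    // Functional style: a stream pipeline that filters, maps, and collects.
    static List<String> longNamesFunctional(List<String> names, int minLength) {
        return names.stream()
                .filter(name -> name.length() >= minLength)
                .map(String::toUpperCase)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> names = List.of("Ada", "Grace", "Linus", "Barbara");
        System.out.println(longNamesImperative(names, 5)); // [GRACE, LINUS, BARBARA]
        System.out.println(longNamesFunctional(names, 5)); // [GRACE, LINUS, BARBARA]
    }
}
```

The two versions return the same list; the difference is that the stream version states the filter and the transformation as separate steps and never exposes a mutable intermediate list to the caller.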


Changing Synapse Analytics Resource Classes with Azure Automation

John McCormack wins today’s edition of Cloud Bingo:

I was approached by a data warehouse developer who wanted to be able to take advantage of the feature in Azure Synapse Analytics (formerly Azure SQL Data Warehouse) called ‘Resource classes’. Resource classes allow users in certain roles to have access to extra compute resources for their queries. This can increase their max memory per query (although there is an overall trade-off on concurrency). Unfortunately, due to this being managed via roles, the person looking to change resource classes needs to be in the db_owner role so they can manage database role membership. As I didn’t want to add the developer to the db_owner role, a different solution was required.

John gives us a nice answer to this problem. Click through for the script.


Compression and Decompression with SQL Server

Randolph West asks a pair of questions:

On Twitter recently, I asked:

Does anyone I know use the COMPRESS and DECOMPRESS features in T-SQL?

To those who replied in the affirmative, I asked:

What made you decide on this as opposed to ROW or PAGE compression?

Read on to learn the responses. In my case, I’ve seen COMPRESS and DECOMPRESS used in two places. First, compressing large product descriptions (large enough to go to LOB). Second, I use it to compress binary models created via SQL Server Machine Learning Services. Some of those models compress quite nicely.
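
COMPRESS is documented as using the GZIP algorithm, so a rough feel for why long, repetitive text shrinks well can be had outside the database. The following is only a sketch in Java using the standard java.util.zip classes, not T-SQL; the class name, method names, and sample text are hypothetical, and it makes no claim about byte-for-byte compatibility with what SQL Server stores.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class showing a GZIP round trip on text.
class GzipRoundTrip {

    // Compress UTF-8 text into a GZIP byte array.
    static byte[] compress(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The stream is closed at this point, so the GZIP trailer has been written.
        return buffer.toByteArray();
    }

    // Decompress a GZIP byte array back into UTF-8 text.
    static String decompress(byte[] compressed) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] chunk = new byte[4096];
            int read;
            while ((read = gzip.read(chunk)) != -1) {
                out.write(chunk, 0, read);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Made-up, highly repetitive "product description" text.
        StringBuilder description = new StringBuilder();
        for (int i = 0; i < 200; i++) {
            description.append("Durable, lightweight, and easy to clean. ");
        }
        String original = description.toString();
        byte[] compressed = compress(original);

        System.out.println("Original bytes:   " + original.getBytes(StandardCharsets.UTF_8).length);
        System.out.println("Compressed bytes: " + compressed.length);
        System.out.println("Round trip OK:    " + original.equals(decompress(compressed)));
    }
}
```

Real product descriptions are less repetitive than this sample, so actual ratios will be lower; the point is only that the saving comes from redundancy within a single value.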


Managing On-Prem Data Gateways

Jamie Wick has some advice for managing Azure on-premises data gateways:

When organizations become aware of the On-premises Data Gateway application, there can be an explosion in the number of gateways that are connected to the tenant. Some people will invariably install a ‘temporary’ gateway on their local computer while they investigate/develop a new PowerBI report, PowerApp or Azure Automation process. All of these ‘temporary’ gateway instances can create problems when the systems that use them are moved into production and remain connected to the ‘temporary’ gateways.

In a previous post, I wrote about how to install and use an On-premises Data Gateway to connect on-site data with Azure-based apps (PowerBI, PowerApps, etc). Now we’re going to look at how to curtail the sprawl of gateways that may be occurring and how to remove any existing gateways that are deemed to be unnecessary.

Sprawl happens, but Jamie shows you how to limit it.


Recommendations for Implementing Azure Data Factory

Paul Andrew has a nice set of recommendations you should follow when configuring Azure Data Factory:

Building on our understanding of generic datasets, a good Data Factory should include (where possible) generic pipelines, these are driven from metadata to simplify (as a minimum) data ingestion operations. Typically I use an Azure SQLDB to house my metadata with stored procedures that get called via Lookup activities to return everything a pipeline needs to know.

This metadata driven approach means deployments to Data Factory for new data sources are greatly reduced and only adding new values to a database table is required. The pipeline itself doesn’t need to be complicated. Copying CSV files from a local file server to Data Lake Storage could be done with just three activities, shown below.

There are several good recommendations here; read the whole thing.


SSMS Regular Expressions

Tim Mitchell looks at regular expressions in SQL Server Management Studio:

Regular expressions (or simply regex for short) have long been used by system administrators and data professionals for searching and manipulating text. Regular expressions allow the user to find, replace, and manipulate text based on the pattern they define in the expression. While every text editor allows simple search-and-replace capabilities, regex allows for searching for partial matches, using wildcards, and even integrating special characters (such as newlines and tabs) into the search or replacement text.

Regular expressions have been a part of SSMS for as long as I can remember, and make the process of pattern-based SQL code search much easier. In this tip, I’ll show you a couple of brief examples of the use of regular expressions for working with SQL code in Management Studio.

Regular expressions have been in the product for a long time, but the set of available regular expressions changed when SSMS moved over to the Visual Studio shell. And in some ways (particularly around capture groups), that was a change for the worse.
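
For readers who have not used capture groups, here is a minimal sketch of the idea. It uses Java's java.util.regex rather than the SSMS find-and-replace dialog, so the exact syntax in SSMS may differ by version; the class name, pattern, and sample query are hypothetical. The pattern captures the row count after TOP, and the replacement reuses it as $1 to add parentheses.

```java
import java.util.regex.Pattern;

// Hypothetical example class; the pattern and query text are illustrative only.
class CaptureGroupReplace {

    // Case-insensitive match of "SELECT TOP <digits>"; group 1 captures the digits.
    private static final Pattern TOP_N =
            Pattern.compile("(?i)SELECT\\s+TOP\\s+(\\d+)");

    // Rewrite "SELECT TOP n" as "SELECT TOP (n)", reusing the captured number via $1.
    static String parenthesizeTop(String sql) {
        return TOP_N.matcher(sql).replaceAll("SELECT TOP ($1)");
    }

    public static void main(String[] args) {
        System.out.println(parenthesizeTop("select top 10 * from dbo.Orders;"));
        // SELECT TOP (10) * from dbo.Orders;
    }
}
```

A query that already reads TOP (10) is left alone, because the pattern requires a digit immediately after the whitespace following TOP.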
