Press "Enter" to skip to content

Curated SQL Posts

The Unique Properties of DateTimeOffset

Rob Farley analyzes a special data type:

And as I have a unique index on this, it won’t let me insert 00:30 in UTC+11, because 00:00 in UTC+10:30 is already there. It tells me “Msg 2627, Level 14, State 1, Line 19. Violation of PRIMARY KEY constraint ‘pkTimesOffset’. Cannot insert duplicate key in object ‘dbo.TimesOffsets’. The duplicate key value is (2021-01-01 00:30:00.0000000 +11:00).”

My general rule is to store everything in SQL Server as UTC. If I did not do this, I would very strongly advocate for using DateTimeOffsets regardless of the extra data length. I’ve experienced the pain of mismatched date and time details one too many times for that.

Fun bonus fact: the same applies to .NET as well. If I control the system, I’m using DateTime.UtcNow for everything. If not, I’m leaning heavily toward DateTimeOffset by default. Again, too many times have I experienced that source system X has times marked in Pacific Standard Time pushing data to a server in Eastern Standard Time, and then mixing in a server based in Central Standard Time and having people confused because “the times are wrong.”

Comments closed

A Review of Numerical Data Types

Lina Kovacheva reviews SQL Server’s numerical data types:

SQL Server provides two dedicated data types for storing monetary values. You can think of MONEY and SMALLMONEY as DECIMAL – Money is effectively the same as DECIMAL(19,4) while SMALLMONEY is effectively the same as DECIMAL(10,4). If you are planning to use MONEY you should have in mind that performing division and multiplication can cause rounding errors that result in the unintentional loss of precision. The cause of the problem is that MONEY only saves information up to the 4th decimal place and if your multiplication or division results in an integer that goes to the 5th decimal place or more, MONEY will round it off, causing an accuracy error.

Click through for an analysis of these data types.

Comments closed

SQL Server Data Types: Bit vs the World

Kevin Chant is a fan of the bit type:

I decided to tweak it a bit for this post, to provide a humorous comparison between the bit data type and others that are available in SQL Server. In reality, this won’t cover every single one.

By the end of this post, you will some see pitfalls to using certain data types and some tips on how to avoid them. Plus, you will find out which data type is my least favourite to use.

Click through for a “haha-just-serious” take on an underappreciated datatype which nonetheless can’t decide if it’s a boolean or not.

Comments closed

Approximate Count Distinct

Deepthi Goguri looks at the APPROX_COUNT_DISTINCT() function in SQL Server 2019:

In the previous post, we learned about Scalar UDF Inliningfeature. In this post, let’s focus on the Approximate QP with Approx_Count_Distinct feature introduced as a part of SQL Server 2019 Intelligent Query Processing features. This feature is useful to get the approximate count of distinct values just like the  Count distinct function to get the distinct number of records but this new feature will take less amount of CPU and memory to process the query.

Click through for more info. The algorithm itself is also pretty interesting.

Comments closed

Spring Cleaning Shiny Projects

Mirai Solutions has some tips on cleaning up Shiny apps:

How to apply the spring cleaning principles and advanced programming to your Shiny App.

1. Deep breeze and allocate some time

Do not avoid spring cleaning simply because you don’t know where to start from. Prioritize some time for the task and get inspired by our following points.

Click through for advice on tools and processes to make this code easier to understand. H/T R-Bloggers

Comments closed

Hash Distributions, Shuffling, and Data Types

Reiss McSporran explains an issue in Azure Synapse Analytics dedicated SQL pools:

Oh dear. Shuffle. On the surface, shuffle is not necessarily a bad thing and may even be expected, but not in this scenario. So what is shuffle and why can it be bad?

Shuffle occurs when a part of a distributed table is moved to a different node during query execution. To do this a hash value is computed using the join columns, the node is then found that has that hash value and the row is then sent to that node for processing. If the tables you’re joining on have different hash distributions, you’d expect this to happen, (hence why it’s not always a bad thing!), but in this case I’ve used the same value to hash both of my tables, so what’s gone wrong?

Click through to see what’s gone wrong.

Comments closed

Capturing Extended Properties when a Table is Dropped

Thomas Rushton attempts to preserve extended property history:

Custom extended properties can be a useful tool when creating a self-documenting database. However, in an environment where tables are routinely dropped and recreated by software/systems not within your control, it may be desirable to have these stored so they can be repopulated when the table is recreated.

Read on to learn about the results of each test. Also read on for a use of the term “faff” in its appropriate context.

Comments closed

Power BI: Parquet Files and Streamed Binary Values Error

Chris Webb explains an error:

If you’re using the new Parquet connector in Power BI there’s a chance you will run into the following error:

Parameter.Error: Parquet.Document cannot be used with streamed binary values.
Details:
[Binary]

This isn’t a bug or anything that can be fixed, so it’s important to understand why it occurs and what you can do about it.

Click through for the explanation.

Comments closed

The Importance of Configuring SQL Server Agent Alerts

Chad Callihan has a story in three acts:

Once upon a time, I was having a relaxing morning…until I received an e-mail alert about an Error Number 825 on a database. The error stated:

‘D:\sql_dat\DatabaseName.mdf_MSSQL_DBCC47’ at offset 0x00000004b9e000 succeeded after failing 1 time(s) with error: 121(The semaphore timeout period has expired.). Additional messages in the SQL Server error log and system event log may provide more detail. This error condition threatens database integrity and must be corrected. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.

That’s not how you want your day to go. Fortunately, the database was not one that needed to be online 24/7 so I felt I could relax a little bit while starting to investigate. At this point, the issue seemed to be with one database.

Click through to see if everyone lived happily ever after.

Comments closed