One thing I see fairly often (and am occasionally guilty of myself) is using COUNT(DISTINCT) and DISTINCT interchangeably to get an idea of the number of unique values in a column.
While they will sometimes give you the same results, the two methods are not always interchangeable; it depends on the data.
Click through for examples.
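The classic divergence involves NULLs: SELECT DISTINCT returns NULL as its own row, while COUNT(DISTINCT) ignores NULLs entirely. A minimal sketch against a hypothetical temp table:

```sql
-- Hypothetical table to illustrate the difference; NULLs are the usual culprit.
CREATE TABLE #Colors (Color VARCHAR(10));
INSERT INTO #Colors (Color) VALUES ('Red'), ('Blue'), ('Blue'), (NULL);

SELECT DISTINCT Color FROM #Colors;         -- 3 rows: Red, Blue, NULL
SELECT COUNT(DISTINCT Color) FROM #Colors;  -- 2: COUNT(DISTINCT) skips NULLs
```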
Or if you are Rob Volk (@sql_r on Twitter), and you want to create an annoying database on your best frenemy’s SQL Server that includes brackets in the name, like:
This [database] Is Awesome
You will need to do:
CREATE DATABASE [This [database]] Is Awesome];
I’m not saying you should do that, but I’m also not saying you shouldn’t.
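If you ever need to build such a name programmatically, QUOTENAME does the bracket-doubling for you (only closing brackets need escaping):

```sql
SELECT QUOTENAME('This [database] Is Awesome');
-- Returns: [This [database]] Is Awesome]
```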
I noted that one of the columns failed to convert VARCHAR to DECIMAL.
The error message is below, and it’s usually fairly easy to sort:
Error converting data type varchar to numeric
Normally, I’d use ISNUMERIC to identify the rows whose value in that column could not be converted to a number. Then I could find the offending value and replace or exclude it, as required.
However, on this occasion, ISNUMERIC failed to identify any values as being non-numeric.
Click through to see why Jen got this result.
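As a general note, ISNUMERIC returns 1 for anything convertible to *some* numeric type — including values like '1e4', '$', or '.' — which is not the same as being convertible to DECIMAL. On SQL Server 2012 and later, TRY_CONVERT asks the exact question you want. A sketch with hypothetical table and column names:

```sql
-- TRY_CONVERT returns NULL on conversion failure, so this finds exactly
-- the values that would break a cast to DECIMAL (hypothetical names).
SELECT SomeColumn
FROM dbo.SomeTable
WHERE TRY_CONVERT(DECIMAL(18, 2), SomeColumn) IS NULL
  AND SomeColumn IS NOT NULL;
```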
If you don’t read the rest of this setup, I want you to take away one thing.
Always qualify your columns with their table names (or aliases) when more than one table is involved in the query!
This post is made primarily with SQL Server in mind, but this behavior is actually ANSI SQL and can be replicated in PostgreSQL, MySQL, and Oracle.
Jon’s example is a case where perfectly valid ANSI SQL logic (which is why you can replicate this across platforms and why it’s not a bug) leads to an unexpected result.
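I don’t know Jon’s exact repro, but the canonical example of this scoping behavior looks something like this (hypothetical tables, where dbo.Staging has no OrderID column):

```sql
-- The subquery's OrderID silently resolves to the OUTER table's Orders.OrderID
-- (perfectly valid ANSI scoping), so the IN predicate is true for every row
-- and this deletes the whole table.
DELETE FROM dbo.Orders
WHERE OrderID IN (SELECT OrderID FROM dbo.Staging);

-- Qualifying the column turns the mistake into a compile-time error instead:
DELETE FROM dbo.Orders
WHERE OrderID IN (SELECT s.OrderID FROM dbo.Staging AS s);  -- invalid column name
```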
Sometimes you need to find the previous value in a column. Easy enough: the LAG window function (available since SQL Server 2012) makes this a breeze. But what if the previous value cannot be null? You can pass a default, but what we actually need is the previous value that was not null, even if it is a few rows back. That makes it a bit harder. T-SQL guru Itzik Ben-Gan has written about the solution to this problem: The Last non NULL Puzzle. It’s a bit of a tricky solution.
Click through for the magic words and if you’re on the SQL Server side, upvote this issue to get that functionality in SQL Server too.
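For flavor, one T-SQL workaround (a variant of the grouping trick, not necessarily Itzik’s exact published solution, which uses a faster binary-concatenation approach) looks like this, against a hypothetical table dbo.T with columns id and val:

```sql
-- Step 1: carry forward the id of the most recent row where val was not null.
-- Step 2: within each resulting group there is exactly one non-null val, so
--         MAX over the partition retrieves the "last non-null" value.
SELECT id, val,
       MAX(val) OVER (PARTITION BY grp) AS last_non_null_val
FROM (
    SELECT id, val,
           MAX(CASE WHEN val IS NOT NULL THEN id END)
               OVER (ORDER BY id ROWS UNBOUNDED PRECEDING) AS grp
    FROM dbo.T
) AS x;
```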
I am telling you personally that I hate the use of DISTINCT.
DISTINCT is used by people who are not sure about their data set, their SELECT statement, or their JOINs.
Whenever I get a query with DISTINCT, I immediately suggest removing it.
I agree with this sentiment about 85% of the time. There are cases where I know I am working with data at a finer grain than I need and the counts aren’t important. But just tossing a DISTINCT on a query to stop it from repeating rows is the wrong approach: figure out why that repetition happens and fix it.
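A common shape of the problem, sketched with hypothetical tables: a one-to-many join fans out the rows, and DISTINCT papers over it when the query never needed the join at all.

```sql
-- One customer, many orders: the join repeats each customer per order,
-- and DISTINCT is bolted on to hide the fan-out.
SELECT DISTINCT c.CustomerID, c.Name
FROM dbo.Customers AS c
JOIN dbo.Orders AS o ON o.CustomerID = c.CustomerID;

-- Fixing the actual problem: we only wanted customers who have orders.
SELECT c.CustomerID, c.Name
FROM dbo.Customers AS c
WHERE EXISTS (SELECT 1 FROM dbo.Orders AS o WHERE o.CustomerID = c.CustomerID);
```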
So, when specifying a “new_reseed_value”, the possible scenarios covered are:
1. Rows exist
2. No rows due to none inserted since the table was created
3. No rows due to all rows being removed with TRUNCATE TABLE
What’s missing? The following scenario:
No rows due to all rows being removed with DELETE
Click through to see how DBCC CHECKIDENT behaves differently depending upon the scenario.
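A quick sketch of the documented difference, using a hypothetical table: after a TRUNCATE (or on a never-populated table), the next insert uses new_reseed_value itself, while after a DELETE it uses new_reseed_value plus the increment.

```sql
CREATE TABLE dbo.Widgets (id INT IDENTITY(1, 1), name VARCHAR(50));
INSERT INTO dbo.Widgets (name) VALUES ('a'), ('b');

DELETE FROM dbo.Widgets;                      -- emptied via DELETE
DBCC CHECKIDENT ('dbo.Widgets', RESEED, 10);
INSERT INTO dbo.Widgets (name) VALUES ('c');  -- id = 11 (reseed + increment)

TRUNCATE TABLE dbo.Widgets;                   -- emptied via TRUNCATE
DBCC CHECKIDENT ('dbo.Widgets', RESEED, 10);
INSERT INTO dbo.Widgets (name) VALUES ('d');  -- id = 10 (reseed value itself)
```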
— so, before SWITCHOFFSET existed, …
SELECT SWITCHOFFSET(SYSDATETIMEOFFSET(), '-05:00') AS [EST the easy way],
       TODATETIMEOFFSET(DATEADD(HOUR, -5, SYSDATETIMEOFFSET()), '-05:00') AS [EST the hard way];
— so, thinking of a DATETIMEOFFSET data type as a complex object
— with many different parts: year, month, day, hour, time zone, etc.
— it looks like SWITCHOFFSET changes two things: time zone and hour
This was an interesting video. I typically think entirely in UTC and let the calling application convert to time zones as needed, but if that’s not an option for you, knowing about SWITCHOFFSET() is valuable.
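If you’re on SQL Server 2016 or later, AT TIME ZONE is also worth knowing: unlike a hard-coded offset, it accounts for daylight saving time.

```sql
-- AT TIME ZONE uses the Windows time zone catalog, so '-05:00' vs '-04:00'
-- is handled for you across DST transitions.
SELECT SYSDATETIMEOFFSET() AT TIME ZONE 'Eastern Standard Time' AS [Eastern with DST],
       SWITCHOFFSET(SYSDATETIMEOFFSET(), '-05:00')              AS [fixed -05:00 offset];
```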
That’s right. I defined an expression – a concatenation of two fields – and used the alias of that expression in another expression. Furthermore, I used the alias of the second expression in the WHERE clause. Gasp.
My workaround is to use CROSS APPLY and define calculations in blocks there. This doesn’t work for aggregation operations, but in cases like Koen’s example, it does simplify the WHERE clauses. This is a nicer solution, though.
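For reference, the CROSS APPLY pattern I mean looks something like this (hypothetical table and column names): each APPLY block names an expression, later blocks and the WHERE clause can reuse it, and the calculation is written exactly once.

```sql
SELECT o.Quantity,
       o.Price,
       calc1.LineTotal,
       calc2.LineTotalWithTax
FROM dbo.Orders AS o
CROSS APPLY (SELECT o.Quantity * o.Price        AS LineTotal)        AS calc1
CROSS APPLY (SELECT calc1.LineTotal * 1.21      AS LineTotalWithTax) AS calc2
WHERE calc2.LineTotalWithTax > 100.0;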
Oracle has the most sophisticated implementation, which supports both the ordered set aggregate function, and the window function version:
– Aggregate function:
PERCENTILE_DISC (0.5) WITHIN GROUP (ORDER BY x)
– Window function:
PERCENTILE_DISC (0.5) WITHIN GROUP (ORDER BY x) OVER (PARTITION BY y)
But there are ways to calculate PERCENTILE_DISC() using a couple of window functions, so read the whole thing.
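For contrast with the Oracle syntax above: SQL Server has only the windowed form of PERCENTILE_DISC, so getting one median per group takes a DISTINCT (or a wrapper query) on top of it. A sketch with hypothetical names:

```sql
-- No aggregate form exists in T-SQL, so the window result repeats per row
-- and DISTINCT collapses it back to one row per group.
SELECT DISTINCT y,
       PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY x)
           OVER (PARTITION BY y) AS median_x
FROM dbo.T;
```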