Helper Predicates And Multi-Column Filters

Rob Farley has an interesting post on optimizing a lookup when you have separate date and time columns:

Here we see a Seek Predicate that looks for OrderDate values between two values that have been worked out elsewhere in the plan, but creating a range in which the right values must exist. This isn’t >= 20110805 00:00 and < 20110806 00:00 (which is what I would’ve made it), it’s something else. The value for start of this range must be smaller than 20110805 00:00, because it’s >, not >=. All we can really say is that when someone within Microsoft implemented how the QO should respond to this kind of predicate, they gave it enough information to come up with what I call a “helper predicate.”

Now, I would love Microsoft to make more functions sargable, but that particular request was Closed long before they retired Connect.

But maybe what I mean is for them to make more helper predicates.

The problem with helper predicates is that they almost certainly read more rows than you want. But it’s still way better than looking through the whole index.

Read the whole thing.

Using DATETIMEOFFSET

Randolph West continues his date and time data type series:

DATETIMEOFFSET works the same way as the DATETIME2 data type, except that it is also time zone aware. It is formatted as 'YYYY-MM-DD HH:mm:ss[.nnnnnnn][{+|-}hh:mm]'.

Got all that? YYYY represents a four-digit year, MM is a two-digit month between 1 and 12, DD is a two-digit day between 1 and 31 depending on the month, HH represents a two-digit hour between 0 and 23, mm is the minutes between 0 and 59, while ss is the number of seconds between 0 and 59. Once again, n represents between zero and seven decimal places in a fraction of a second.

The main difference from DATETIME2 is the time zone offset at the end, which is the number of hours and minutes as an offset from UTC time.

Read on for more.  I generally don’t use this date type much, preferring to stick with DATETIME2 and saving data as UTC.

Time Zone Conversion With M

Cedric Charlier shows how to perform time zone conversions with the M language in Power Query:

Everything is fine … except if I share my code with someone from another time zone. The function DateTimeZone.ToLocal is relying on regional settings and in that case my conversion should always be from UTC to “Brussels time”.

I didn’t find any other way to ensure that I’m always converting from UTC to “Brussels time” than implementing the conversion by myself. That’s the goal of the following function

Looks like there may not be a nice “convert to a different time zone” here like lubridate::with_tz() does in R.

DATETIME2 In SQL Server

Randolph West continues his SQL Server date & time data types series:

SQL Server 2008 introduced new data types to handle dates and times in a more intelligent way than the DATETIME and SMALLDATETIME types that we looked at previously.

This week, we look at the DATETIME2 data type. I’m not the first person to think that this was probably not the best name for a data type, but here we are, a decade later.

DATETIME2 is, at its heart, a combination of the DATE and TIME data types we covered in previous weeks. DATE is 3 bytes long and TIME is between 3 and 5 bytes long depending on accuracy. This of course means that DATETIME2 can be anything from 6 to 8 bytes in length.

Nowadays, if you want to store a date plus time, this should be your default, not DATETIME.

The TIME Data Type

Randolph West covers the TIME data type in SQL Server:

This week, we look at the TIME data type. It is formatted as HH:mm:ss.fffffff, where HH is hours between 0 and 23, mmis minutes between 0 and 59, ss is seconds between 0 and 59, and f represents 0 or more fractional seconds, up to a maximum of seven decimal places.

With a maximum length of 5 bytes, TIME can store a value with a granularity of up to 100 nanoseconds.

I tend not to use TIME very often.  It’s useful if you need it, but I rarely find myself needing a dateless time.

The Date Data Type

Randolph West continues his dates and times series:

QL Server 2008 introduced new data types to handle dates and times in a more intelligent way than the previous DATETIME and SMALLDATETIME types that we looked at previously.

The first one we look at this week is DATE. Whereas DATETIME uses eight bytes and SMALLDATETIME uses four bytes  to store their values, DATE only needs a slender three bytes to store any date value between 0001-01-01 and 9999-12-31inclusive.

The DATE data type was a fantastic addition to SQL Server 2008.

Invalid Dates And Power BI

Melissa Coates notes a discrepancy between the Desktop and Service versions of Power BI:

Last week I got involved with a customer issue. A refresh of the data imported to a PBIX always works in Power BI Desktop, but the refresh operation intermittently fails in the Power BI Service. Their workaround had been to refresh the PBIX in Desktop and re-upload the file to the Service. This post is about finding and fixing the root cause of the issue – this is as of March 2018, so this behavior may very well change in the future.

Turns out, the source of the problem was that the customer’s Open Orders table can contain invalid dates – not all rows, just some rows. Since Open Orders data can fluctuate, that explains why it presented as an intermittent refresh issue. Here’s a simple mockup that shows one row which contains an invalid date:

At this point, we have two open questions:
(1) What is causing the refresh error?
(2) Why is the refresh behavior different in the Service than the Desktop tool?

Read on for the explanation of the difference, as well as a fix to prevent refresh errors due to invalid dates.

Calculating The End Of The Month

Bob Pusateri gives us a few techniques for calculating the last day of a particular month:

Months are funny. Unlike other parts of a date, they vary in length:

  • The last second of a minute is always 59.
  • The last minute of a hour is always 59.
  • The last hour of a day is always 23.

But the last day of a month? Well that depends on what month it is. And the year matters too because a leap year means February gets an extra day.

Click through for several techniques, including the knuckle technique for advanced practitioners.  But what if I need to calculate the end of a lunar month?

Don’t Use SMALLDATETIME

Randolph West argues against using the SMALLDATETIME data type:

But let’s say you don’t need that kind of accuracy and are happy with a granularity to the nearest minute. Maybe you’re storing time cards and don’t think it’s necessary to store seconds. As discussed in the Fundamentals series, you really want to choose the most appropriate datatype for your data.

Enter SMALLDATETIME, which rounds up or down to the nearest minute. The seconds value for any SMALLDATETIME is 00. Values of 29.999 seconds or higher are automatically rounded up to the nearest minute, while values of 29.998 seconds or lower are rounded down.

Read on to see Randolph’s explanation of why he recommends against using SMALLDATETIME.

The DATETIME Type In SQL Server

Randolph West gets into the DATETIME data type:

DATETIME is an eight-byte datatype which stores both a date and time in one column, with an accuracy of three milliseconds. As we’ll see though, the distribution of this granularity may not be exactly what we’d expect.

Valid DATETIME values are January 1, 1753 00:00:00.000, through December 31, 9999 23:59:59.997. On older databases designed prior to SQL Server 2008, because there was no explicit support for date values, it was sometimes customary to leave off the time portion of a DATETIME value, and have it default to midnight on that morning. So for example today would be stored as February 21, 2018 00:00:00.000.

If you’re not particularly familiar with SQL Server data types, this is detailed enough information to get you going and to explain exactly why you shouldn’t use DATETIME anymore…

Categories

April 2018
MTWTFSS
« Mar  
 1
2345678
9101112131415
16171819202122
23242526272829
30