
Category: Dates and Numbers

Invalid Dates And Power BI

Melissa Coates notes a discrepancy between the Desktop and Service versions of Power BI:

Last week I got involved with a customer issue. A refresh of the data imported to a PBIX always works in Power BI Desktop, but the refresh operation intermittently fails in the Power BI Service. Their workaround had been to refresh the PBIX in Desktop and re-upload the file to the Service. This post is about finding and fixing the root cause of the issue – this is as of March 2018, so this behavior may very well change in the future.

Turns out, the source of the problem was that the customer’s Open Orders table can contain invalid dates – not all rows, just some rows. Since Open Orders data can fluctuate, that explains why it presented as an intermittent refresh issue. Here’s a simple mockup that shows one row which contains an invalid date:

At this point, we have two open questions:
(1) What is causing the refresh error?
(2) Why is the refresh behavior different in the Service than the Desktop tool?

Read on for the explanation of the difference, as well as a fix to prevent refresh errors due to invalid dates.
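If the source data lives in SQL Server, one quick way to hunt for the offending rows before they ever reach Power BI is TRY_CONVERT, which returns NULL instead of raising an error when a value won't convert. This is my own sketch with hypothetical table and column names, not the fix from Melissa's post:

-- Flag rows whose OrderDate value cannot be converted to a valid date
-- (dbo.OpenOrders and its columns are hypothetical)
SELECT OrderID, OrderDate
FROM dbo.OpenOrders
WHERE OrderDate IS NOT NULL
  AND TRY_CONVERT(date, OrderDate) IS NULL;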


Calculating The End Of The Month

Bob Pusateri gives us a few techniques for calculating the last day of a particular month:

Months are funny. Unlike other parts of a date, they vary in length:

  • The last second of a minute is always 59.
  • The last minute of an hour is always 59.
  • The last hour of a day is always 23.

But the last day of a month? Well that depends on what month it is. And the year matters too because a leap year means February gets an extra day.

Click through for several techniques, including the knuckle technique for advanced practitioners.  But what if I need to calculate the end of a lunar month?
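For reference, here are two common T-SQL approaches (not necessarily the ones Bob demonstrates); EOMONTH assumes SQL Server 2012 or later:

DECLARE @d date = '2020-02-10';

SELECT
    EOMONTH(@d) AS EndOfMonth_EOMONTH,                                                   -- 2020-02-29 (leap year handled for free)
    DATEADD(DAY, -1, DATEADD(MONTH, DATEDIFF(MONTH, 0, @d) + 1, 0)) AS EndOfMonth_DateMath; -- first day of next month, minus one day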


Don’t Use SMALLDATETIME

Randolph West argues against using the SMALLDATETIME data type:

But let’s say you don’t need that kind of accuracy and are happy with a granularity to the nearest minute. Maybe you’re storing time cards and don’t think it’s necessary to store seconds. As discussed in the Fundamentals series, you really want to choose the most appropriate datatype for your data.

Enter SMALLDATETIME, which rounds up or down to the nearest minute. The seconds value for any SMALLDATETIME is 00. Values of 29.999 seconds or higher are automatically rounded up to the nearest minute, while values of 29.998 seconds or lower are rounded down.

Read on to see Randolph’s explanation of why he recommends against using SMALLDATETIME.
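The rounding behavior is easy to demonstrate with a quick repro (my own, not from Randolph's post):

SELECT
    CAST('2018-03-07 10:30:29.998' AS smalldatetime) AS RoundedDown, -- 2018-03-07 10:30:00
    CAST('2018-03-07 10:30:29.999' AS smalldatetime) AS RoundedUp;   -- 2018-03-07 10:31:00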


The DATETIME Type In SQL Server

Randolph West gets into the DATETIME data type:

DATETIME is an eight-byte datatype which stores both a date and time in one column, with an accuracy of three milliseconds. As we’ll see though, the distribution of this granularity may not be exactly what we’d expect.

Valid DATETIME values are January 1, 1753 00:00:00.000, through December 31, 9999 23:59:59.997. On older databases designed prior to SQL Server 2008, because there was no explicit support for date values, it was sometimes customary to leave off the time portion of a DATETIME value, and have it default to midnight on that morning. So for example today would be stored as February 21, 2018 00:00:00.000.

If you’re not particularly familiar with SQL Server data types, this is detailed enough information to get you going and to explain exactly why you shouldn’t use DATETIME anymore…
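That uneven distribution of granularity shows up in a quick repro (mine, not Randolph's): DATETIME values are stored in increments of .000, .003, and .007 seconds, so the final milliseconds get rounded:

SELECT
    CAST('2018-02-21 23:59:59.995' AS datetime) AS Input995, -- stored as 23:59:59.997
    CAST('2018-02-21 23:59:59.998' AS datetime) AS Input998, -- stored as 23:59:59.997
    CAST('2018-02-21 23:59:59.999' AS datetime) AS Input999; -- rolls over to 2018-02-22 00:00:00.000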


Converting Int To Time

Bill Fellows has a pop quiz for us:

Given the following DDL

CREATE TABLE dbo.IntToTime
( CREATE_TIME int
);

What will be the result of issuing the following command?

ALTER TABLE dbo.IntToTime ALTER COLUMN CREATE_TIME time NULL;

Clearly, if I’m asking, it’s not what you might expect.

Click through if you have not memorized your implicit conversion tables.


Using Date Types In Warehouses

Koen Verbeeck argues that date keys in warehouses should be actual date types:

The worst by far is the string representation, as there is no actual check on the contents. It can literally contain anything. And is '01/02/2018' the first of February 2018 (like any sane person would read it, because days come before months), or the 2nd of January? So if you have to store dates in your data warehouse, avoid strings at all costs. No excuses.

The integer representation – e.g. 20171208 – is really popular. If I recall Kimball correctly, he said it’s the one exception where you can use smart keys, aka surrogate keys that have a meaning embedded into them. I used them for quite some time, but I believe I have found a better alternative: using the actual date data type.

I bounce back and forth, but I’m sympathetic to Koen’s argument, which you can read by clicking through.
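As a sketch of the trade-off (my own illustration, not from Koen's post), compare an integer smart key with an actual DATE key:

-- Integer smart key: compact and readable, but nothing stops 20171399 from sneaking in
CREATE TABLE dbo.DimDate_IntKey
(
    DateKey int NOT NULL PRIMARY KEY   -- e.g. 20171208
);

-- DATE key: the engine rejects invalid values, and date arithmetic works directly
CREATE TABLE dbo.DimDate_DateKey
(
    DateKey date NOT NULL PRIMARY KEY  -- e.g. '2017-12-08'
);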


Dealing With Dates In R

Mathew McLean shows how to convert strings to dates using a couple of well-known packages and introduces flipTime:

The package flipTime provides utilities for working with time series and date-time data. The package can be installed from GitHub using

require(devtools)
install_github("Displayr/flipTime")

I will discuss only two functions from the package in this post, AsDate() and AsDateTime(). These are used for the conversion of date and date-time strings, respectively. They build on the convenience and speed of lubridate, while adding functionality that makes them easier to use. The functions are smart about identifying the proper format to use, so the user doesn't need to specify the format(s) as inputs. At the same time, both AsDate() and AsDateTime() are careful not to convert strings to dates when they are not formatted as dates, and they will warn the user when the dates are in an ambiguous format.

Check it out.


Quick Date Formatting With CONVERT

Dave Mason lists the common date formats available with the CONVERT function:

Displaying dates and times with different formats in TSQL is a task I run into quite a bit. I used to visit this page so many times, I’m surprised it doesn’t have a “Welcome back, Dave!” banner on it at the top.  After umpteen million times, I decided it was time to be more efficient. I created this query that’s come in handy numerous times. I considered dumping it into a view, but I’ve found it’s nice to copy/paste the CONVERT statement (directly from a script) and replace CURRENT_TIMESTAMP with whatever column I want to have formatted.

Click through for the script and sample output.
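A small sample of the style codes involved (not Dave's full script) looks like this:

SELECT
    CONVERT(varchar(30), CURRENT_TIMESTAMP, 101) AS Style101, -- 03/21/2018 (US)
    CONVERT(varchar(30), CURRENT_TIMESTAMP, 103) AS Style103, -- 21/03/2018 (British/French)
    CONVERT(varchar(30), CURRENT_TIMESTAMP, 112) AS Style112, -- 20180321 (ISO basic)
    CONVERT(varchar(30), CURRENT_TIMESTAMP, 120) AS Style120, -- 2018-03-21 14:05:00
    CONVERT(varchar(30), CURRENT_TIMESTAMP, 126) AS Style126; -- 2018-03-21T14:05:00.123 (ISO 8601)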


Using DATEADD Instead Of DATEDIFF

Michael J. Swart points out a bit of trickery with DATEDIFF:

I assumed that the DATEDIFF function I wrote worked this way: Subtract the two dates to get a timespan value and then return the number of seconds (rounded somehow) in that timespan.

But that’s not how it works. The docs for DATEDIFF say:

“Returns the count (signed integer) of the specified datepart boundaries crossed between the specified startdate and enddate.”

There’s no rounding involved. It just counts the ticks on the clock that are heard during a given timespan.

Read the whole thing.
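The boundary-counting behavior is easy to see with a pair of examples (my own, not Michael's):

SELECT
    DATEDIFF(YEAR, '20171231', '20180101') AS OneDayApart,      -- returns 1: a year boundary was crossed
    DATEDIFF(YEAR, '20170101', '20171231') AS AlmostAYearApart; -- returns 0: no year boundary crossed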


When AT TIME ZONE Is Too Slow

Robert Davis troubleshoots a performance problem relating to time zones:

Time Zones were definitely being a drag today. I got an email from one of the developers at work asking about the performance difference between 2 queries. The only difference between the 2 queries is that one of them uses the AT TIME ZONE clause that was added in SQL Server 2016. I have not played around with this particular clause, but we do store quite a bit of data in the datetimeoffset data type. In the table in the developer’s queries, the data is all stored in the Eastern time zone, but they are considering storing it in additional time zones and will want to be able to display it in the Eastern time zone even if not stored that way. Thus, AT TIME ZONE.

When the developer was testing the conversion function, he noticed that the query slowed waaaayyyyy down when he added AT TIME ZONE. Before adding AT TIME ZONE to the query, STATISTICS TIME for the query was: CPU time: 145549 ms, elapsed time: 21693 ms. It returned 8,996 rows, but if I removed the DISTINCT, it returned over 72M rows. That’s a lot of clams … er, data.

Read on for the rest of the story, including Robert’s solution.  Also check out his Connect item related to this.
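For context, the clause in question looks like this (a generic SQL Server 2016+ example, not the developer's actual query):

DECLARE @LocalTime datetime2 = '2018-03-08 09:30:00';

SELECT
    @LocalTime AT TIME ZONE 'Eastern Standard Time' AS AsEastern,                         -- attaches the Eastern offset
    @LocalTime AT TIME ZONE 'Eastern Standard Time' AT TIME ZONE 'UTC' AS ConvertedToUtc; -- converts that value to UTC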
