Press "Enter" to skip to content

Category: Dates and Numbers

DATE_BUCKET() Now GA in Fabric Data Warehouse

Jovan Popovic makes an announcement:

We have introduced a new DATE_BUCKET() function in Fabric Data Warehouse SQL language that makes reporting and analytics even easier.

In this blog post, you’ll discover how it simplifies time-based reporting and makes grouping dates effortless.

My experience is that DATE_BUCKET() takes a bit of getting used to, as it is not an intuitive function. That said, it can be really powerful for dealing with time series data. It is also available in SQL Server, as of SQL Server 2022.
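For a sense of how it reads in practice, here is a minimal sketch; the dbo.Sales table and its columns are made up for the example, and it assumes SaleDate is a date column:

-- Group sales into two-week buckets anchored on Monday, 2024-01-01
SELECT
	DATE_BUCKET(WEEK, 2, s.SaleDate, CAST('2024-01-01' AS date)) AS SaleBucket,
	SUM(s.Amount) AS TotalSales
FROM dbo.Sales s
GROUP BY DATE_BUCKET(WEEK, 2, s.SaleDate, CAST('2024-01-01' AS date))
ORDER BY SaleBucket;

Each SaleDate lands in a fixed two-week bucket counted from the origin date, which is exactly the sort of consistent interval you want when grouping time series data.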


Gaps in Identity Columns

Brent Ozar explains why there can be gaps in identity columns:

And you use that identity number for invoice numbers, or customer numbers, or whatever, and you expect that every single number will be taken up. For example, your accounting team might say, “We see order ID 1, 3, and 4 – what happened to order ID #2? Did someone take cash money from a customer, print up an invoice for them, and then delete the invoice and pocket the cash?”

Well, that might have happened. But there are all kinds of reasons why we’d have a gap in identities. One of the most common is failed or rolled-back transactions. To illustrate it, let’s start a transaction in one window:

I have a talk on applying forensic accounting techniques using SQL and Python (as well as an older version using R) and this is one of the things I bring up. In cases where you absolutely need contiguous numbers, the best I can do for you is no identity column and a stored procedure that runs at the SERIALIZABLE transaction isolation level, using an app lock to prevent anybody else from calling the stored procedure concurrently, taking a table lock on the relevant table prior to doing any real work, and hard blocking everybody else until your transaction either succeeds or fails. And I’m not even 100% sure on that if you have enough concurrency to matter.
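To make that concrete, here is a rough sketch of what such a procedure could look like. The dbo.Invoices table, its columns, and the lock resource name are all hypothetical, and this is an outline of the pattern rather than production-ready code:

CREATE OR ALTER PROCEDURE dbo.CreateInvoice
	@CustomerID int,
	@Amount decimal(19, 4)
AS
BEGIN
	SET NOCOUNT ON;
	SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;

	BEGIN TRANSACTION;

	-- App lock tied to the transaction: only one caller at a time gets past this point
	EXEC sys.sp_getapplock
		@Resource = 'InvoiceNumberGenerator',
		@LockMode = 'Exclusive',
		@LockOwner = 'Transaction';

	DECLARE @NextInvoiceNumber int;

	-- Exclusive table lock held until commit or rollback
	SELECT @NextInvoiceNumber = COALESCE(MAX(InvoiceNumber), 0) + 1
	FROM dbo.Invoices WITH (TABLOCKX, HOLDLOCK);

	INSERT INTO dbo.Invoices (InvoiceNumber, CustomerID, Amount)
	VALUES (@NextInvoiceNumber, @CustomerID, @Amount);

	COMMIT TRANSACTION;
END;

The app lock and the table lock release only when the transaction commits or rolls back, which is what guarantees contiguous numbers and also what makes this pattern so hostile to concurrency.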


Using TimescaleDB in Postgres

Adron Hall put a database in his database:

I’ve been using TimescaleDB for time-series data on and off for a while now. I recently fired up Postgres.app for local development. It’s one of the cleanest ways to get PostgreSQL running on macOS, and adding TimescaleDB is surprisingly straightforward once you know where to look.

Time-series data is everywhere—sensor readings, application metrics, user events, IoT data. Regular PostgreSQL can handle it, but once you’re dealing with millions of rows, you’ll notice queries slowing down. TimescaleDB solves this by turning your time-series tables into hypertables that automatically partition by time, compress old data, and optimize queries. The best part? It’s still PostgreSQL, so all your existing tools and SQL knowledge work exactly the same.

Read on for a demo of how it all works.
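As a rough sketch of the basic flow, assuming the extension is already installed on the server (the sensor_readings table and its columns are invented for illustration, and the create_hypertable call may vary a bit across TimescaleDB versions):

-- Enable the extension in the current database
CREATE EXTENSION IF NOT EXISTS timescaledb;

-- A plain PostgreSQL table holding time-series data
CREATE TABLE sensor_readings (
	reading_time  timestamptz NOT NULL,
	sensor_id     integer     NOT NULL,
	temperature   double precision
);

-- Convert it into a hypertable partitioned by time
SELECT create_hypertable('sensor_readings', 'reading_time');

From there, queries against sensor_readings are ordinary SQL; the partitioning happens behind the scenes.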


Calendar-Based Time Intelligence and DirectQuery Performance

Chris Webb hits the Turbo button on his PC:

Calendar-based time intelligence (see here for the announcement and here for Marco and Alberto’s more in-depth article) is at least the second-most exciting thing to happen in DAX in the last few months: it makes many types of time intelligence calculation much easier to implement. But as far as I know only Reid Havens, in this video, has mentioned the performance impact of using this new feature and that was for Import mode. So I wondered: do these benefits also apply to DirectQuery mode? The answer is on balance yes but it’s not clear-cut.

Click through to see what Chris found.


Consider What You Count

Hans-Jürgen Schönig ran out of fingers:

The purpose of this post is not to share some fancy SQL techniques, some super cool magic AI tool that does whatever new hot thing or some crazy coding technique one can use in PostgreSQL. This is all about two simple things that are often forgotten or just neglected. We are of course talking about “awareness” and “clarity”.

Yes, it is that simple. If you don’t know what you have or if you don’t know what you want, all data is worthless (in a best case scenario) or even dangerous. 

The concept for discussion here is a simple count: how many customers do you have? But even fairly simple questions like this can cause difficulty in answering because of business complexities that we need to model in our databases. The problem is that the expedient answer may not be the correct one. This is also a key reason why we end up with debates in meetings about which value of revenue to use.
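To see why, consider that several reasonable-looking queries can all claim to be “the customer count” (table and column names here are invented for the example):

-- Every row, including test accounts and long-gone customers
SELECT COUNT(*) AS total_rows FROM customers;

-- One count per distinct person, if an e-mail address identifies a person
SELECT COUNT(DISTINCT email) AS distinct_people FROM customers;

-- Only customers who are currently active
SELECT COUNT(*) AS active_customers FROM customers WHERE status = 'active';

All three are trivial to write; which one is right depends entirely on what the business means by “customer.”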


Custom Calendars in Power BI

Kenneth Omorodion needs a calendar dimension:

Before September 2025, there was a complex workaround to create time intelligence calculations in DAX catered to calendar types other than the standard Gregorian calendar. With the Power BI September 2025 feature updates (still in preview at the time of writing), it is now readily possible to define custom calendar-based time intelligence in Power BI, such as Shifted Gregorian, ISO, and retail calendars, in the data model and then use the new extended DAX functions against these calendars.

The new feature eliminates the need for complex workarounds and ensures cleaner and more accurate reporting for organizations. This tip will explain the different calendars used in time intelligence reporting and how to define them based on the new calendar-based time intelligence capability in Power BI.

Read on for several examples of how this works.


Time Zone Conversions in SQL Server

Louis Davidson knows what time it is:

Recently a topic came up at work, and I remember seeing it on Reddit, so you know, that definitely is a sign that I had to write a blog on it.

For a lot of people, we typically standardize on a single time zone for our life, and in turn our data. Maybe it is the time zone of our headquarters, our house, and sometimes it is the standard time zone UTC. It is often suggested that every organization should use UTC, and that is a great suggestion for many things.

However…if you store all your times as UTC, this becomes a nightmare for some kinds of reporting.

Click through for a primer on the capabilities of the DATETIMEOFFSET data type and some of the functionality you can use with it. And Louis hits on one important note around daylight saving time changes: the transition dates have changed over the years. Another is that what’s in the Windows or Linux time zone database isn’t always historically accurate. For example, in 1974 and 1975, the US extended DST and began it in January and February, respectively. But if you use Louis’s query, it claims DST started in April, which would have followed the 1966-1973 and 1976-2006 patterns but is historically inaccurate. You’ll find these sorts of things for a variety of countries because daylight saving time adherence is kind of wild.
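As a small taste of that functionality (the values are arbitrary, and 'Eastern Standard Time' is the Windows time zone identifier, which covers both EST and EDT):

-- Store times in UTC, then convert for reporting with AT TIME ZONE
DECLARE @utc datetime2(0) = '2025-07-04 18:30:00';

SELECT
	@utc AS utc_time,
	(@utc AT TIME ZONE 'UTC') AT TIME ZONE 'Eastern Standard Time' AS eastern_local_time;

The first AT TIME ZONE stamps the value as UTC (turning it into a DATETIMEOFFSET), and the second converts it to the target time zone, with daylight saving rules applied from the Windows time zone database.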


Calendar-Based Time Intelligence in DAX

Marco Russo and Alberto Ferrari grab a calendar:

Since its first release in 2010, DAX has had a set of time intelligence functions to simplify calculations like year-to-date, year-over-year, and so on. However, the calculations only supported the Gregorian calendar, without addressing similar requirements for other calendars, such as the 4-4-5, ISO, and many other non-Gregorian calendars. With the classic time intelligence, the columns of the Date table were unknown to the time intelligence functions, with the only exception of the date column in the Date table, typically Date[Date].

Click through to see what Marco and Alberto have come up with.


Date Intervals in PostgreSQL Window Functions

Hubert Lubaczewski solves a problem:

Since I can’t copy/paste the text, I’ll try to write what I remember:

Given a table sessions, with columns user_id, login_time, and country_id, list all cases where a single account logged in to the system from more than one country within a 2-hour time frame.

The idea behind it is that it would be a tool to find hacked accounts, based on the idea that you generally can’t change country within 2 hours. Which is somewhat true.

The solution in the blog post suggested joining the sessions table with itself, using some inequality condition. I think we can do better…

Click through for a solution that works for PostgreSQL but not SQL Server because the latter doesn’t offer date and time intervals on window function frames.
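For reference, the interval-based frame in PostgreSQL looks something like the following. This is a rough sketch of the idea rather than Hubert’s exact query, and it assumes country_id is a simple comparable type like an integer:

-- Flag logins where more than one country appears for the same user
-- within the preceding two hours
SELECT *
FROM (
	SELECT
		user_id,
		login_time,
		country_id,
		MIN(country_id) OVER w AS min_country,
		MAX(country_id) OVER w AS max_country
	FROM sessions
	WINDOW w AS (
		PARTITION BY user_id
		ORDER BY login_time
		RANGE BETWEEN INTERVAL '2 hours' PRECEDING AND CURRENT ROW
	)
) s
WHERE min_country <> max_country;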

To do this in SQL Server, I’d probably use LAG() and get the prior value of country ID and the prior login time. Something like the following query, though I didn’t run detailed performance checks.

WITH records AS
(
	SELECT
		s.user_id,
		s.login_time,
		s.country_id,
		-- Prior login for the same user, ordered by login time
		LAG(s.login_time) OVER (PARTITION BY s.user_id ORDER BY s.login_time) AS prior_login_time,
		LAG(s.country_id) OVER (PARTITION BY s.user_id ORDER BY s.login_time) AS prior_country_id
	FROM sessions s
)
SELECT *
FROM records r
WHERE
	r.prior_country_id <> r.country_id
	-- DATEADD keeps the two-hour comparison exact; DATEDIFF(HOUR, ...) counts hour
	-- boundaries crossed, so it can let through gaps of almost three hours
	AND r.login_time <= DATEADD(HOUR, 2, r.prior_login_time);

Generating Calendar Dates in T-SQL

Rick Dobson writes some code:

Sometimes it is convenient to generate calendar dates, such as for Fridays across multiple years. This tip demonstrates two different solutions for this kind of task. The demonstrations implement recursive CTEs and chained CTEs, as well as a stored procedure with both input and output parameters. These two demonstrations are described so that you can incorporate either or both into your own calendar date generation projects.

Honestly, this is where a good calendar table (or date dimension) would be extremely handy.
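For contrast, here is about the simplest recursive CTE version of the idea (not Rick’s code; the date range is arbitrary), followed by what the same question looks like against a hypothetical dbo.Calendar table:

-- All Fridays in 2025 via a recursive CTE (2025-01-03 is the first Friday of the year)
WITH fridays AS
(
	SELECT CAST('2025-01-03' AS date) AS friday_date
	UNION ALL
	SELECT DATEADD(DAY, 7, f.friday_date)
	FROM fridays f
	WHERE DATEADD(DAY, 7, f.friday_date) < '2026-01-01'
)
SELECT f.friday_date
FROM fridays f;

-- With a calendar table, it is just a filter
SELECT c.CalendarDate
FROM dbo.Calendar c
WHERE
	c.DayName = 'Friday'
	AND c.CalendarYear = 2025;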
