Press "Enter" to skip to content

Category: Dates and Numbers

Date Intervals in PostgreSQL Window Functions

Hubert Lubaczewski solves a problem:

Since I can’t copy paste the text, I’ll try to write what I remember:

Given table sessions, with columns: user_id, login_time, and country_id, list all cases where single account logged to the system from more than one country within 2 hour time frame.

The idea behind is that it would be a tool to find hacked account, based on idea that you generally can’t change country within 2 hours. Which is somewhat true.

Solution in the blogpost suggested joining sessions table with itself, using some inequality condition. I think we can do better…

Click through for a solution that works for PostgreSQL but not SQL Server because the latter doesn’t offer date and time intervals on window function frames.

To do this in SQL Server, I’d probably use LAG() and get the prior value of country ID and the prior login time. Something like the following query, though I didn’t run detailed performance checks.

WITH records AS
(
	SELECT
		s.user_id,
		s.login_time,
		s.country_id,
		LAG(s.login_time) OVER (PARTITION BY s.user_id ORDER BY s.login_time) AS prior_login_time,
		LAG(s.country_id) OVER (PARTITION BY s.user_id ORDER BY s.login_time) AS prior_country_id
	FROM sessions s
)
SELECT *
FROM records r
WHERE
	r.prior_country_id <> r.country_id
	AND DATEDIFF(HOUR, r.prior_login_time, r.login_time) <= 2;
Leave a Comment

Generating Calendar Dates in T-SQL

Rick Dobson writes some code:

Sometimes it is convenient to generate calendar dates, such as for Fridays across multiple years. This tip demonstrates two different solutions for this kind of task. The demonstrations implement recursive CTEs, chained CTEs, as well as a stored procedure with both input and output parameters. These two different demonstrations are described so that you can incorporate either or both into your own calendar date generation projects.

Honestly, this is where a good calendar table (or date dimension) would be extremely handy.

Leave a Comment

Creating a Day-Level Slicer in Power BI

Ben Richardson knows what day it is:

Whether you’re tracking daily sales, monitoring customer behaviour, or managing service operations:

Day-level slicers in Power BI give you the precision your reports need.

Power BI offers great flexibility for filtering by year, quarter, or month.

But when daily trends drive your decisions, default settings just aren’t enough.

Read on to see how you can take the normal slicer and make it a bit better for working with dates. And, of course, it also makes sense to have a date dimension to help out with these sorts of things.

Comments closed

Time Series Feature Engineering in Pandas

Matthew Mayo knows that time is a flat circle:

Feature engineering is one of the most important steps when it comes to building effective machine learning models, and this is no less important when dealing with time-series data. By being able to create meaningful features from temporal data, you can unlock predictive power that is unavailable when applied to raw timestamps alone.

Fortunately for us all, Pandas offers a powerful and flexible set of operations for manipulating and creating time-series features.

Click through for seven things you can do in Pandas to extend or work with time series data.

Comments closed

Date and Time Data Types in MySQL and PostgreSQL

Aisha Bukar compares and contrasts:

MySQL and PostgreSQL offer several data types that can be used for handling dates and times. These data types provide the tools to store and manage information like dates of a particular event, timestamps, and even time durations. While they both share some similarities on how they handle date and time, there are key differences in how they handle precision, time zones, and date/time calculations.

Getting date and time data right is key for keeping databases accurate and useful. In this article, we will compare how MySQL and PostgreSQL handle date and time data, their differences, strengths, and which one might work better for your needs. By the end, you’ll have a clearer idea of which database to choose for managing date and time information.

Click through to learn about the two platforms.

Comments closed

Backfilling Data in TimescaleDB

Semab Tariq takes us through a problem:

Backfilling data into a TimescaleDB hypertable in production can be very tricky, especially when automated processes like compression policies are involved. From past experience, we have seen that if backfill operations aren’t handled properly, they can interfere with these automated tasks, sometimes causing them to stop working altogether. 

This blog covers a safer and more reliable approach to backfilling hypertables, along with best practices to prevent disruptions to compression and other background processes.

Read on for several tips. Backfills can be challenging in any database, but time-scale databases like TimescaleDB introduce their own unique issues.

Comments closed

End of Month Testing in Powershell

Andy Levy checks if this is the end of the month:

This is one of those blog posts you write so that 2 years later, you can look it up to remind yourself how to do something.

I found myself needing to figure out if “today” was the end of the month in PowerShell. In T-SQL, this is easy, as we have the EOMONTH() function. But PowerShell (the .NET System.DateTime struct) doesn’t have the same thing.

Read on for the solution Andy came up with.

Comments closed

Integer Conversion and Rounding in SQL Server

Steve Jones points out a bit of rounding math:

Imagine that I have someone enter a value for the number of hours to include in a report. I enter 5 and the report divides this in half to go back 2.5 hours and forward 2.5 hours. I run this code at the top of my code block:

Click through for Steve’s example. This ultimately has to do with integer division. If you run the following code, you’ll still get 2 as the result:

SELECT CAST(5.99 / 2) AS INT;

This is because SQL Server discards the decimal during integer casting. DATEADD() simply works with the end result, post-cast.

Comments closed

Handling SQL Agent Dates and Durations

Andy Mallon disparages some Microsoft intern’s summer of 1996 project:

SQL Agent’s schema is older than me. It handles dates, times, and durations like it’s 1980 by using integers instead of date/time data types. My buddy Aaron Bertrand talks more about Dating Responsibly so that you can have a good datetime with your own database.

I was writing a query to pull recent job failures from SQL Agent’s msdb job history, and knew that I didn’t want to deal with the wonky date/time formats. Specifically, I was querying msdb.dbo.sysjobhistory to find the Start Time, End Time, and Duration of job runs that failed. If you aren’t familiar with that table, you can look at it over in the docs.

Andy does point out the built-in function but then explains why a separate function is superior. Andy also happens to furnish that function, so check it out.

Comments closed

Multi-Measure Calculations in Relational Databases

Greg Low describes a common business problem:

But while food wholesale systems will need to deal with quantities like I described in that post, they often have another layer of complexity. Items are often sold by:

  • Quantity
  • Weight
  • Quantity and Weight

This is an interesting look at how the domain can drive what a proper solution looks like. It also seems like a good use case for 6th normal form, with unit quantity and unit weight tables to prevent NULL from cropping up.

Comments closed