Press "Enter" to skip to content

Category: Dates and Numbers

Integer Conversion and Rounding in SQL Server

Steve Jones points out a bit of rounding math:

Imagine that I have someone enter a value for the number of hours to include in a report. I enter 5 and the report divides this in half to go back 2.5 hours and forward 2.5 hours. I run this code at the top of my code block:

Click through for Steve’s example. This ultimately has to do with integer division. If you run the following code, you’ll still get 2 as the result:

SELECT CAST(5.99 / 2 AS INT);

This is because SQL Server truncates the decimal portion when casting to an integer. DATEADD() simply works with the end result, post-cast.
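To illustrate (a quick sketch, not Steve's exact code), the following shows how integer division and the cast both land on 2, and that DATEADD() only ever sees that truncated value:

SELECT 5 / 2 AS IntegerDivision,             -- 2: both operands are integers
       5.0 / 2 AS DecimalDivision,           -- 2.500000: one decimal operand avoids truncation
       CAST(5.99 / 2 AS INT) AS CastResult;  -- 2: 2.995 is truncated by the cast

-- DATEADD() only sees the truncated integer, so both expressions go back 2 hours:
SELECT DATEADD(HOUR, -(5 / 2), GETDATE()) AS IntegerDivisionStart,
       DATEADD(HOUR, -CAST(5.99 / 2 AS INT), GETDATE()) AS CastStart;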


Handling SQL Agent Dates and Durations

Andy Mallon disparages some Microsoft intern’s summer of 1996 project:

SQL Agent’s schema is older than me. It handles dates, times, and durations like it’s 1980 by using integers instead of date/time data types. My buddy Aaron Bertrand talks more about Dating Responsibly so that you can have a good datetime with your own database.

I was writing a query to pull recent job failures from SQL Agent’s msdb job history, and knew that I didn’t want to deal with the wonky date/time formats. Specifically, I was querying msdb.dbo.sysjobhistory to find the Start Time, End Time, and Duration of job runs that failed. If you aren’t familiar with that table, you can look at it over in the docs.

Andy does point out the built-in function but then explains why a separate function is superior. Andy also happens to furnish that function, so check it out.
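For reference, here is a rough sketch of what querying sysjobhistory with the built-in (undocumented) msdb.dbo.agent_datetime() helper looks like; this is an assumed example rather than Andy's code:

SELECT j.name AS job_name,
       h.run_date,   -- integer, e.g. 20250106
       h.run_time,   -- integer, e.g. 133000 (1:30:00 PM)
       msdb.dbo.agent_datetime(h.run_date, h.run_time) AS start_time
FROM msdb.dbo.sysjobhistory AS h
    INNER JOIN msdb.dbo.sysjobs AS j
        ON j.job_id = h.job_id
WHERE h.run_status = 0;  -- 0 = Failed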


Multi-Measure Calculations in Relational Databases

Greg Low describes a common business problem:

But while food wholesale systems will need to deal with quantities like I described in that post, they often have another layer of complexity. Items are often sold by:

  • Quantity
  • Weight
  • Quantity and Weight

This is an interesting look at how the domain can drive what a proper solution looks like. It also seems like a good use case for 6th normal form, with unit quantity and unit weight tables to prevent NULL from cropping up.
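As a rough sketch of that idea (table and column names here are hypothetical, not from Greg's post), each optional measure gets its own narrow table rather than a nullable column:

CREATE TABLE dbo.Item
(
    ItemID int NOT NULL PRIMARY KEY,
    ItemName nvarchar(100) NOT NULL
);

CREATE TABLE dbo.ItemUnitQuantity
(
    ItemID int NOT NULL PRIMARY KEY REFERENCES dbo.Item(ItemID),
    UnitQuantity decimal(18, 3) NOT NULL
);

CREATE TABLE dbo.ItemUnitWeight
(
    ItemID int NOT NULL PRIMARY KEY REFERENCES dbo.Item(ItemID),
    UnitWeightKg decimal(18, 3) NOT NULL
);

-- An item sold by both measures gets a row in both side tables; an item sold by
-- only one measure gets a row in just that table, and no NULLs appear anywhere.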


Converting Excel Dates and Times to SQL

Kristyna Ferris marks the date:

Hey data friends! This one comes from my personal vault (aka backlog of drafts I’ve been needing to write up) and is a really simple code that I always forget how to do. So, to save us all some ChatGPT-ing, here’s my tried-and-true way of converting Excel Date & Time fields to a true Date & Time in SQL.

Click through for an example of the process.
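For the general idea, here is a common way to do the conversion in T-SQL; this is an assumed sketch rather than necessarily Kristyna's exact approach. Excel stores a date/time as a floating-point serial number: whole days since 1899-12-30, with the time of day as the fractional part.

DECLARE @ExcelSerial float = 45567.75;  -- hypothetical Excel date/time value

SELECT DATEADD(SECOND,
               CAST(ROUND((@ExcelSerial - FLOOR(@ExcelSerial)) * 86400, 0) AS int),  -- fractional day -> seconds
               DATEADD(DAY, CAST(FLOOR(@ExcelSerial) AS int), '18991230')            -- whole days since the Excel epoch
       ) AS ConvertedDateTime;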


The Power of Rounding

Denny Cherry makes a change:

So, I ran across a problem with QuickBooks that involves some of the most basic math that we were all taught in elementary school: how to round numbers properly. You’d think that a company that makes accounting and invoicing software for a living would understand how rounding of numbers works. But based on the last hour of having to edit the data that gets sent to QuickBooks from our internal system, you’d be wrong.

Denny’s example is $3.18497736, rounded to four decimal places, so the result should be either $3.1849 or $3.1850. Denny expects $3.1850; QuickBooks gives $3.1849.

In this case, Denny’s right. The part that confuses people is banker’s rounding, which rounds to the nearest even digit when the value sits exactly on the midpoint. For example, $3.18495 rounded to four places after the decimal would be $3.1850, whereas $3.18485 would round to $3.1848.

.NET uses banker’s rounding by default, which can confuse people unfamiliar with the concept. SQL Server, meanwhile, rounds the way that Denny expects: 5 or higher rounds up, 0-4 rounds down.
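A quick way to see the SQL Server side of this (an assumed demo, not Denny's code):

SELECT ROUND(3.18497736, 4) AS DennysValue,   -- 3.18500000: above the midpoint, so it rounds up
       ROUND(3.18495, 4) AS MidpointCase,     -- 3.18500: an exact midpoint also rounds up (away from zero)
       ROUND(3.18485, 4) AS AnotherMidpoint;  -- 3.18490: still up, no banker's rounding here, unlike .NET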


Blank Dates and DAX

Marco Russo and Alberto Ferrari are blanking on us:

Handling missing dates in a semantic model can be challenging, especially when working with DAX time intelligence functions. Dates might be missing for various reasons: incomplete data entry, system errors, special placeholder values like 0000, or dates set far in the future. We will see that using a blank is the best way to manage missing dates, even though you should pay attention to DAX conditional expressions operating on those dates. We will also consider how to hide these blanks in a Power BI report if their presence is not desired in charts and slicers.

Read on to learn more.


GiST Indexes and Range Queries in PostgreSQL

Lee Asher can’t be limited to a single point:

Our Part I query used the following WHERE clause:

WHERE tsrange(o.start_time, o.end_time) && tsrange(p.enter, p.leave)

The “tsrange()” functions return timestamp ranges. But overlap queries aren’t limited to timestamps; they can be constructed from integers and floating-point values too. Imagine an arbitrage database that tracks the minimum and maximum price paid for a commodity.

Read on for examples of other types of ranges, preventing range intersection, and more.
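For a flavor of what that looks like, here is a hypothetical numeric-range sketch in PostgreSQL (the table and columns are made up for illustration): an exclusion constraint keeps ranges from intersecting, and the same && operator handles overlap queries.

CREATE EXTENSION IF NOT EXISTS btree_gist;  -- needed to mix = and && in one exclusion constraint

CREATE TABLE commodity_price_band
(
    commodity_id int NOT NULL,
    price_band   numrange NOT NULL,
    EXCLUDE USING gist (commodity_id WITH =, price_band WITH &&)  -- no overlapping bands per commodity
);

-- Which bands overlap the price range 10.00 to 12.50?
SELECT *
FROM commodity_price_band
WHERE price_band && numrange(10.00, 12.50);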


Converting SQL Audit FileTime to DateTime Format

Patrick Keisler helps a customer:

One of my customers recently wanted to rename each of the SQL audit files with the datetime stamp of when it was created. I explained to them that the filename already contains a datetime stamp. While it does not look like a typical timestamp, it is based on the Windows Filetime data structure, which is a 64-bit value representing the number of 100-nanosecond intervals since January 1, 1601 (UTC). Nonetheless, they still wanted a traditional datetime stamp in the file name.

Read on to see how. I can understand the displeasure in adding redundancy to a filename, though I also understand the reasoning from the customer’s point of view: FileTime isn’t human-readable in any meaningful way.
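For the general idea (a minimal sketch, not necessarily Patrick's approach): a FILETIME counts 100-nanosecond ticks since 1601-01-01 UTC, so you can split it into whole days plus a millisecond remainder to stay inside DATEADD()'s int limits.

DECLARE @FileTime bigint = 133500000000000000;  -- hypothetical FILETIME value

DECLARE @Days bigint = @FileTime / 864000000000;                    -- 100-ns ticks per day
DECLARE @Milliseconds bigint = (@FileTime % 864000000000) / 10000;  -- leftover ticks, as milliseconds

SELECT DATEADD(MILLISECOND, CAST(@Milliseconds AS int),
               DATEADD(DAY, CAST(@Days AS int), CAST('16010101' AS datetime2(3)))) AS FileTimeUtc;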


Using Week-Based Calendars in Power BI

Marco Russo and Alberto Ferrari work in weeks:

Weekly calendars are common in manufacturing, retail, and any business that is sensitive to weekends or to the number of working days. For example, the scenario described in this article uses the number of pageviews on a website from 2019 to 2024, with data available until September 3, 2024. The website analyzed has a clear weekly trend, with slower traffic over the weekend, as shown in the following line chart with a daily granularity. It seems like a business website. A sports website would probably display the opposite trend.

Read on to see some of the challenges around week-based calendars. There’s a reason I have a “Dates and Numbers” category on Curated SQL and it’s exactly for things like this: some of the most common things we as humans work with are extremely complex and fraught with exceptions, including calendars.


Power BI Data Type Optimization

Nikola Ilic shows how important it can be to choose the right data types:

For demo purposes, I’ll be using a fact table that contains the data about chats performed by a customer support department of the fictitious company Customer First. This table includes approximately 9 million rows, which is not considered a large table in the context of Power BI and analytical workloads. For the sake of simplicity, let’s pretend that our model consists of only this single table. Finally, a semantic model is configured as an Import mode model. If you want to learn how your data is stored in Power BI, I suggest you start by reading this article first.

Data was loaded into Power BI from the underlying data source (SQL Server database) as-is, without any additional optimizations applied.

Nikola walks through the process of finding the most expensive columns in terms of data size and using the least precise acceptable value. One other thing that I commonly see is identity columns or other keys on fact tables. Those are very rarely necessary, because the point of a fact table is typically to aggregate it in some fashion. And these keys are unique (by design), meaning they won’t compress very well and will take up a lot of space. Looking at Nikola’s example, my next question would be, knowing that the name of the table is factChat, does chatID tie to some chat dimension? If not, is it actually necessary for reporting? Again, if not, that could shave off another 60 MB or so from the data model.
