Press "Enter" to skip to content

Day: February 15, 2024

Working with Date Sequences in R

Steven Sanderson isn’t satisfied with a single date:

In the world of data analysis and manipulation, working with dates is a common and crucial task. Whether you’re analyzing financial data, tracking trends over time, or forecasting future events, understanding how to generate date sequences efficiently is essential. In this blog post, we’ll explore three powerful R packages—lubridate, timetk, and base R—that make working with dates a breeze. By the end of this guide, you’ll be equipped with the knowledge to generate date sequences effortlessly and efficiently in R.

Click through for several ways to generate date sequences, including weekly sequences.

Comments closed

Weirdness with Aggregation

Erik Darling digs into a problem. Part 1 sets up the scenario:

Here’s the query plan, which yes, you’re reading correctly, runs for ~23 seconds, fully joining both tables prior to doing the final aggregation.

I’m showing you a little extra here, because there are missing index requests that the optimizer asks for, but we’ll talk about those in tomorrow’s post.

The wait stats for this query, since it’s running in Batch Mode, are predictably HT-related.

Part 2 covers those missing indexes:

I’ve taken a small bit of artistic license with them.

The crappy thing is… They really do not help and in some cases things get substantially worse.

Maybe it’s because it’s early and I’m trying to compile things in my head rather than actually trying it out, but it seems like a combo of CTE + CROSS APPLY or a pair of CROSS APPLY statements could work better (especially with a good index), assuming that join doesn’t need to be in place. Given the query as it is, with two MAX() aggregations and no GROUP BY clause, that could be an avenue for improvement, though one I have not actually tested. Nonetheless, read both of Erik’s posts.

Comments closed

Data Loading with BCP

Peter Schott describes a recent bit of messiness:

However, at the time this popped up, my most recent “ticket” was a separate request. I’d been chatting with a client who had mentioned that they were closing an account for one of the SaaS apps they use. The vendor would provide DDL and extract files for import into their own system, but only after the account was closed. We chatted back and forth about some ideas for them to load the data into their own Azure SQL DB instance. At one point, he asked if I’d want to just do it for a small consulting fee. We chatted a bit more and he realized that he really didn’t want to do it.

Read on for the rest of the story. BCP is powerful but always felt finicky to me. Either that or I wasn’t very good at using it. Either could be the case.

Comments closed

Azure SQL DB Serverless for Hyperscale now GA

Morgan Oslake has an announcement:

Optimizing resource allocation to achieve performance goals while controlling costs can be a challenging balance to strike especially for database workloads with complex usage patterns.  Azure SQL Database serverless provides a solution to help address these challenges, but until now the general availability of serverless has only been available in the General Purpose tier.  However, many workloads that can benefit from serverless may require greater performance and scale along with other capabilities unique to the Hyperscale tier.

We are pleased to announce the general availability of serverless auto-scaling for Hyperscale in Azure SQL Database.  The benefits of serverless and Hyperscale now come together into a single database solution.

Read on to see what this means for you and how it can change the billing strategy around Hyperscale.

Comments closed

Modeling I/O Utilization with Resource Governor

Michael J. Swart does some modeling:

How do we predict whether it’s safe to put workloads from two servers onto one?

We use Availability Groups to create readable secondary replicas (which I’ll call mirrors). The mirrors are used to offload reporting workloads. The mirrors are mostly bound by IOPS and the primaries are mostly bound by CPU, so I wondered “Is there any wiggle room that lets us consolidate these servers?”

Can we point the reporting workloads (queries) at the primary replica safely?

Read on for the answers to these questions.

Comments closed

Identifying Old OLEDB and ODBC Drivers on Machines

Lucas Kartawidjaja goes on a quest:

The vulnerabilities are affecting Microsoft ODBC Driver 17 and 18, as well as OLE DB Driver 18 and 19. For more information and also download location for the security update/ hotfix can be found on the following page: Update: Hotfixes released for ODBC and OLE DB drivers for SQL Server

We do an automated security scanning tool that would flag the systems (servers, desktops, latptops, etc.) that haven’t been patched. So we can quickly identify the systems that need to be patch and patched those systems quickly.

For this post, I was wondering if there is a quick way to identify Microsoft ODBC and OLE DB drivers that are being installed on the systems. 

Click through to see what Lucas came up with.

Comments closed

Viewing DAX in Microsoft Fabric with SemPy

Kevin Chant talks about a recent issue:

Recently I have been helping others get up to speed with Microsoft Fabric. Which includes going through some Power BI topics.

One issue that came up was how to show them the DAX used for a measure within a Power BI report that had been published to Microsoft Fabric. To link working with measures in Power BI Desktop with working in Microsoft Fabric.

Kevin shows the normal way of doing this, as well as an alternative using the SemPy library.

Comments closed