
Category: T-SQL Tuesday

Building a Data Detective Toolkit

Deb Melkin talks tools:

Happy T-SQL Tuesday! I wasn’t really sure I’d be able to crank something out for this one but somehow I managed to squeeze it in. Tim Mitchell is hosting and he has a great topic for us: What’s in our Data Detective toolkit?

I love this topic for so many reasons, partly because I’m so often dropped into projects and asked to figure things out: usually performance related, but occasionally new functionality or features. Since I’m asked to do this fairly often, I may have to see if Data Detective can be my new title… hmm…

On the one hand, being a Data Detective in a film noir sounds like a really neat idea. On the other hand, things usually don’t turn out so well for the detective.


Generating a Multi-Aggregate Pivot in Spark

Richard Swinbank troubleshoots an issue:

I’m using a stream watermark to handle late arriving data – basically, my watermark enables the stream to accept data arriving up to 10 seconds late… and that’s where the problem shows up.

When I run this streaming query – in Azure Databricks I can do this simply with display(df_pivot) – I receive the error:

AnalysisException: Detected pattern of possible ‘correctness’ issue due to global watermark. The query contains stateful operation which can emit rows older than the current watermark plus allowed late record delay, which are “late rows” in downstream stateful operations and these rows can be discarded. Please refer the programming guide doc for more details. If you understand the possible risk of correctness issue and still need to run the query, you can disable this check by setting the config `spark.sql.streaming.statefulOperator.checkCorrectness.enabled` to false.

Read on to learn more about the scenario, the issue, and the solution.
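The core watermark idea in the quoted excerpt – accept events up to 10 seconds behind the newest event time seen – can be sketched outside of Spark. This is a toy model in plain Python, not Spark’s actual implementation; the class and method names are mine:

```python
from datetime import datetime, timedelta

class Watermark:
    """Toy model of a streaming watermark: track the maximum event time
    seen so far and accept events no more than `delay` behind it."""
    def __init__(self, delay_seconds=10):
        self.delay = timedelta(seconds=delay_seconds)
        self.max_event_time = None

    def accept(self, event_time):
        # Advance the high-water mark if this event is the newest so far.
        if self.max_event_time is None or event_time > self.max_event_time:
            self.max_event_time = event_time
        # Events older than (max event time - delay) are too late and dropped.
        return event_time >= self.max_event_time - self.delay

wm = Watermark(10)
t0 = datetime(2024, 1, 1, 12, 0, 0)
print(wm.accept(t0))                          # on time -> True
print(wm.accept(t0 - timedelta(seconds=5)))   # 5s late -> True
print(wm.accept(t0 - timedelta(seconds=15)))  # 15s late -> False
```

The “correctness” warning in the error message arises because, when stateful operators are chained, rows one operator legitimately emits can already be behind a downstream operator’s watermark and get silently discarded.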


Query Start Times in Query Store

Hugo Kornelis describes an issue:

I was hired by a customer who had a very annoying issue with the daily data load of their data warehouse. The volume of data to be loaded is high and they were already struggling to finish the load before business opens. But that was not their biggest issue. The biggest problem, the real pain point that they hired me for, is that at unpredictable moments, the load would run much longer than normal, pushing it well into business hours. They wanted me to find out what caused those irregular delays, and find a way to stop them from happening.

Read on to learn more about the issue itself, as well as a discrepancy in what Query Store showed. Hugo also points out that the quick-and-easy solution may not be the right solution.


T-SQL Tuesday 177 Roundup

Mala Mahadevan gives us the low-down on database code management:

I was privileged to host yet another T-SQL Tuesday, for the month of August, 2024. My topic was on managing database code. I was worried about getting responses, given that most of the community is no longer on X/Twitter and my invite didn’t seem to get much attention. I was wrong. People seem to keep track of this by other means and the response was excellent. The summary is below.

Read on for 11 takes on the topic.


T-SQL Tuesday 175 Round-Up

Andy Leonard rounds ’em up:

It’s time to celebrate and confess.

I was honored to host the June 2024 edition of T-SQL Tuesday – #175! That’s the celebration.

Confession: I had GoDaddy add a firewall back in February and it worked well. Too well, in fact! A friend reached out to let me know comments on the blog post – titled T-SQL Tuesday #175: Old Tech, New Tech, Bold Tech, Blue Tech – were returning a nasty ACCESS DENIED message.

It was a bit of a short month in terms of turnout, but click through for Andy’s summary.


T-SQL Tuesday 174 Round-Up

I do a thing:

I thought about doing this in the normal Curated SQL style, where I grab a graf from each post. But instead, you get the second-best Curated SQL format: the hodge-podge bulleted list, but still in present tense and with a touch of commentary and the occasional rant.

Really, it should be “I did a thing” but I can’t go having that past tense nonsense here, now can I?

Thank you again to everyone who contributed. It was a lot of fun reading through all of these.


Durability and Hekaton

Rob Farley ponders a pair of potential performance improvements and their effects on durability:

Durability in SQL is handled by making sure that data changes are written to disk before the calling application is informed that the transaction is complete. We don’t walk out of a shop with our goods before the cashier has confirmed that the credit card payment has worked. If we did, and the payment bounced, the cashier would be calling us back pretty quickly. In SQL this is about confirming that the transaction log entry has been written, and it’s why you shouldn’t use disks with write-cache for databases.

And yet the in-memory features of SQL Server, commonly called “Hekaton,” handle transactions without writing to disk first. The durability is delayed. This month, Todd Kleinhans invites us to write about Hekaton.

In-Memory OLTP is one of those features that I wish worked better for most use cases or didn’t have as many limitations around only working within the context of a single database. In practice, instead of using In-Memory OLTP for most tables, you’re usually better off just jamming more RAM on the box and limiting how many scans of large tables flush your buffer pool.
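The log-before-acknowledge contract Rob describes can be sketched in a few lines of Python. This is purely illustrative (the function and file names are mine): append the log record, force it to disk, and only then acknowledge. Delayed durability is essentially acknowledging before the fsync.

```python
import os
import tempfile

def durable_commit(log_path, record):
    """Append a transaction record and fsync before acknowledging.
    The caller only hears 'committed' once the bytes are on disk --
    the same contract as the SQL Server transaction log."""
    with open(log_path, "a") as log:
        log.write(record + "\n")
        log.flush()
        os.fsync(log.fileno())  # force the OS to persist the write
    return "committed"          # acknowledge only after the fsync

log = os.path.join(tempfile.mkdtemp(), "tx.log")
print(durable_commit(log, "UPDATE accounts SET balance = balance - 50"))
```

Skip the `os.fsync` call and throughput improves, but a crash can lose transactions the application was already told had committed – the trade-off delayed durability makes explicit.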


The Most Recent Issues You’ve Closed

Brent Ozar wraps up this month’s T-SQL Tuesday:

So when I’m meeting a new team and learning what they do, I’ve found it helpful to ask, “What specifically was the last issue you closed?” Note that I don’t ask, “What are you working on now?” because that tends to lead to long-term projects that people want to do, but not necessarily what they’re paid to do. If you ask them about the last specific task they checked off, that’s usually related to something the company demands that they do because it’s urgent. It leads to fun discoveries about what people think they do, versus why managers really keep them around on the payroll.

Click through for this month’s list of respondents.


Troubleshooting a Slow Deletion

Aaron Bertrand has an admission:

Before looking at the code path, the query, or the execution plan, I didn’t even believe the application would regularly perform a hard delete. Teams typically soft delete “expensive” things that are ever-growing (e.g., change an IsActive column from 1 to 0). Deleting a user is bound to be expensive, because there are usually many inbound foreign keys that have to be validated for the delete to succeed. Also, every index has to be updated as part of the operation. On top of that, there are often triggers that fire on delete.

While I know that we do sometimes soft delete users, the engineer assured me that the application does, in some cases, hard delete users.

Click through for the full story and a minor bout of self-petard-hoisting. I’m as guilty as anyone else of jumping to conclusions, and this is a good reminder to go through the process even when you think you know the answer.
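The contrast Aaron draws can be demonstrated with a minimal schema in SQLite (the tables and columns here are hypothetical, chosen to mirror his `IsActive` example): a soft delete is a single flag update, while a hard delete forces every inbound foreign key to be validated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, "
             "IsActive INTEGER DEFAULT 1)")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, "
             "user_id INTEGER REFERENCES users(id))")
conn.execute("INSERT INTO users (id, name) VALUES (1, 'alice')")
conn.execute("INSERT INTO orders (id, user_id) VALUES (10, 1)")

# Soft delete: flip the flag; no foreign keys need to be checked.
conn.execute("UPDATE users SET IsActive = 0 WHERE id = 1")

# Hard delete: blocked because an order still references the user.
try:
    conn.execute("DELETE FROM users WHERE id = 1")
except sqlite3.IntegrityError as e:
    print("hard delete blocked:", e)
```

In SQL Server the hard delete would also touch every index on the table and fire any delete triggers, which is why it tends to be the expensive path.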
