Press "Enter" to skip to content

Month: January 2025

The Use Case for Memory-Optimized tempdb Metadata

Ben Johnston takes us through a scenario:

I recently had an interesting production SQL Server issue that turned out to be very easy to fix. The fix doesn’t fit every workload or server, but for the limited use cases described below, it’s a simple configuration change. The general best practice for server level configurations is to leave things at default, unless there is a specific reason for changing the settings. This will outline a use case for using memory-optimized tempdb metadata.

This covers a very specific scenario and workload. You won’t use this on a typical server, which is why it isn’t a default setting. You will see this for very specific server workloads, those with many transactions and high temp table usage. Most systems can better use the memory for the regular workload instead of optimizing tempdb metadata, so don’t use this as a default setting for your servers.

Click through for the scenario.

Comments closed

Cloning a Table in Snowflake

Kevin Wilkie creates a clone:

Recently, I was asked to compare data from a table in Snowflake with the same table’s data from a few hours before. As always, no one had thought about copying the data into another table before they started working with it. They just remembered an approximate time they started working with the table — 10 AM — and the table’s name — Public.WorkingTable. So, let’s see what we can do, shall we?

Read on for the process, as well as circumstances in which cloning might fail.

Comments closed

Postgres 18 Backend I/O Statistics

Bertrand Drouvot looks at some new statistics that will be available in Postgres 18:

It means that, thanks to the new pg_stat_get_backend_io() function, we can see the I/O activity in each backend (in addition to the overall aggregate still available through the pg_stat_io view).

Click through for some examples of what you can see and how it works. This seems like it’d be pretty helpful in a clustering scenario to make sure that you’re getting the load balancing that you expect.

Comments closed

Calculating Inter-Quartile Range and Z Score in T-SQL

Sebastiao Pereira hunts for outliers:

Outliers can significantly distort statistical analysis and lead to incorrect conclusions when interpreting data. In this article, we will look at how to find outliers in SQL Server using various T-SQL queries. Understanding how to find outliers in SQL is crucial for accurate data analysis.

Sebastiao uses PERCENTILE_CONT() in this demonstration. That works fine for relatively small tables, though it does not scale well at all. Once you’re in the millions of records, it gets slow. From there, my joke is that, if you have 100 million or more records, you can start a query with PERCENTILE_CONT() on one instance. Meanwhile, on a separate instance, as soon as you kick off that query, go install SQL Server ML Services, configure it, check out a tutorial on R or Python, figure out how you can calculate the inter-quartile range in that language, learn how ML Services works, and you’ll still get the answer before your first query finishes.

If you’re using SQL Server 2022, there is a new APPROX_PERCENTILE_CONT() that is orders of magnitude faster as you get increasingly large datasets. It’s also accurate to within 1.33% (on each side of the correct answer) within a 99% confidence. The way the query works is a bit different, though, because the approximation is a nested set function using a WITHIN GROUP() clause, whereas PERCENTILE_CONT() is a window function that uses an OVER() clause. That means it’s not quite as easy as slapping “APPROX_” to the start of the query, but because Sebastiao uses WITHIN GROUP in the T-SQL, it’s pretty close: PERCENTILE_CONT(0.25) WITHIN GROUP (ORDER BY [ObsValue]) OVER() AS Q1 becomes APPROX_PERCENTILE_CONT(0.25) WITHIN GROUP (ORDER BY [ObsValue]) AS Q1 or something like that–I’m compiling in the browser here.

Comments closed

Reading Data from Deleted Columns

Michael J. Swart checks the typewriter ribbon:

It’s hard to destroy data. Even when a column is dropped, the data is still physically there on the data page. We can use the undocumented/unsupported command DBCC PAGE to look at it.

This is tied in with how we can drop a column in SQL Server and have it not take a very long time: because when we drop the column, we’re just modifying the metadata associated with the table and telling SQL Server to ignore this bit here. Do read the whole thing, and also check out a fun comment from Paul White.

Comments closed

Setting RCSI on a Database in an Availability Group

Haripriya Naidu makes a change:

I was working on modifying isolation level of database from Read Committed to Read Committed Snapshot(RCSI) and had to get exclusive access to the database. After letting the application team know about it and having stopped their processes, I tried to set the database to SINGLE_USER but it errored out.

It turns out that you cannot set a database to single user mode if it is in an availability group or part of database mirroring. Nonetheless, there is still a way to make this change. Read on to learn more.

Comments closed

Sending E-mails in SSRS over Ports other than 25

Eitan Blumin fixes an error:

After configuring an Email subscription, the subscription result shows: “Failure sending mail: One or more errors occurred.“. In this blog post I will share how I investigated and resolved one such failure.

My first step in troubleshooting this error was to query from the ExecutionLog3 view inside the ReportServer database. I normally do this to check if and why a report subscription has failed its run.

However, in this case all the log records showed success (rsSuccess status):

Read on to see what this indicates, how Eitan was able to troubleshoot the problem, and the ultimate fix.

Comments closed

Transposing Data Frames in R

Steven Sanderson does a switcharoo:

Data manipulation is a crucial skill in R programming, and one common operation is transposing data frames – converting rows to columns and vice versa. Whether you’re cleaning data for analysis, preparing datasets for visualization, or restructuring information for machine learning models, understanding how to transpose data frames efficiently is essential. This comprehensive guide will walk you through various methods to transpose data frames in R, complete with practical examples and best practices.

Read on for a few approaches to the problem.

Comments closed

Generating an Example of SQL Injection

Vlad Drumea points out that SQL injection is still a thing:

SQL injection, also refereed to as SQLi, is a security vulnerability that allows attackers to modify the queries that an application makes to its underlying database.
This type of vulnerability can allow attackers to interact with data that they are not normally able to access, including data belonging to other application users.

For at least a decade, injection attacks were either number one or number two on OWASP’s top 10 list. It dropped all the way to number 3 in 2021. We’ll see how it looks with the 2025 OWASP top 10 application security vulnerabilities list, but in the meantime, check out Vlad’s post.

Comments closed