Table Swaps With Triggers

Jay Robinson walks through the process of making a breaking change to a large, active table with limited downtime:

I can only recall one time in the past several years (at least a decade) that I’ve found triggers to be useful. It involves data migration.

The problem: You have a massive, high-activity table. Let’s call this Table_A. You need to make significant changes to it. For example, the clustered index needs to change. How do you accomplish this?

I’ve used a similar process with good success in the past.

The Costs And Benefits Of Triggers

Rob Farley hits on my biggest complaint about triggers:

But the problem isn’t what they do, it’s that you didn’t know it was there. Just like asbestos.

People complain about nested triggers, about triggers that are overly complex, about triggers which do too many things… but stored procedures have all these faults too. It’s just that you knew that the stored procedure was there, and you didn’t know that the trigger was there.

I wish that triggers were more visible inside the tools (Management Studio / Operations Studio / Visual Studio), right alongside stored procedures and procedural functions (those ones that use BEGIN & END and are bad, rather than inline functions). They’re code and should be treated as such.

Hiding triggers under tables makes it easy to forget about them, at least until you get some unexpected results.

The Benefits Of DML Triggers

Kendra Little tells us a tale of trigger value:

When I first edged my way into a Junior DBA-ish role, I worked with a complex application with many large databases. Customers loaded and configured data into a (mostly) OLTP style database, and then database was replicated to multiple other systems — some to publish data to and adserving platform, and some to transform the data for reporting.

Triggers were used extensively in these downstream systems to detect changes. It went like this:

  • Transactional replication publications were set up on the OLTP  (ish) database.

  • Transactional replication subscriptions were set up on each downstream server. A dedicated database was used for replication articles on each instance.

  • After replication was initialized, DML triggers were created on each article in the subscriber database. For each modification, the trigger would insert a narrow row into a “delta” table related to that article.

  • The “delta” tables were in their own schema, and contained row identifiers, timestamp columns, and custom indexes for processing. This enabled batches to be efficiently pulled from these tables for processing into the related system.

  • Cleanup processes periodically pulled processed rows out of the delta tables in the background (and indexes were designed to prevent the cleanup process from fighting with data processing jobs and inserts)

Read the whole thing.  There are some things that triggers can do easily which would be difficult to handle otherwise, but they can also be dangerous in the wrong hands.

T-SQL Tuesday Roundup

Wayne Sheffield is back with a T-SQL Tuesday roundup for August:

I asked for the topic to be on:

Tell me about a time when you ran up against your own brick wall, and how you worked it out or dealt with it.

We had 17 folks take this topic and run with it. As a teaser, my favorite line in all of these is:

My feeling is that if you’re not hitting brick walls you’re probably not pushing yourself or SQL Server hard enough!

With 17 responses, there’s a good bit of reading to do.

Tips For Troubleshooting Code Problems

Bert Wagner shares some techniques he uses to troubleshoot code:

1. Rubber Duck Debugging

The first thing I usually do when I hit a wall like this is talk myself through the problem again.

This technique usually works well for me and is equivalent to those times when you ask  someone for help but realize the solution while explaining the problem to them.

To save yourself embarrassment (and to let your coworkers keep working uninterrupted), people often substitute an inanimate object, like a rubber duck, instead of a coworker to try and work out the problem on their own.

Alas, in this case explaining the problem to myself didn’t help, so I moved on to the next technique.

This one works more often than you might expect, and is a big part of the value behind pair programming.

T-SQL Tuesday 104 Roundup

Bert Wagner reviews the entries for T-SQL Tuesday 104:

This month’s T-SQL Tuesday topic asked “What code would you hate to live without?” Turns out you like using script and code to automate boring, repetitive, and error-prone tasks.

Thank you to everyone who participated; I was nervous that July holidays and summer vacations would stunt turnout, however we wound up with 42 posts!

Watch tsqltuesday.com for next month’s topic and consider signing up to host.

Read on for the 42 submissions.

Comparing System Metadata Between SQL Server Versions

Aaron Bertrand shows how he finds hidden features in new SQL Server builds:

One of the areas I like to focus on is new features in SQL Server. Under both MVP and Microsoft Partner programs, I get to see a lot of builds of SQL Server that don’t make it to the public, and documentation for these builds is typically sparse. In order to get a head start on testing things out, I often need to explore on my own. And so I wrote some scripts for that, which I’ve talked about in previous blog posts:

When I install a new version of SQL Server (be it a cumulative update, the final service pack for a major version, or the first CTP of vNext), there are two steps:

  1. Create a linked server to the build that came before it

  2. Create local synonyms referencing the important catalog views in the linked server

It’s a good way to get a glimpse at which features devs are currently working on but haven’t enabled yet.

The Problems With NOLOCK

Rob Farley demonstrates the downside of the READ UNCOMMITTED isolation level:

I’m going to create a table and insert exactly 1 million rows. This particular table will be a clustered index, and will contain 1 million GUIDs.

Next I prove that there a million rows.

Now without inserting or deleting any rows, I’m going to shuffle them.

And if while this is happening, I count the rows in a different session, I have to wait for that query to finish.

Read on to see what happens when someone gets the idea of running the select query with NOLOCK.

Building A Calendar Table

Louis Davidson has an example of a calendar table in SQL Server:

The solution is part of my calendar/date dimension code, and it is used to do relative positioning over date periods. For example, say you have the need to get data from the 10 days. You can definitely use a simple between to filter the rows, and a bunch of date functions to group by year, month, etc., generally all of the “normal” groupings. But using a calendar table allows you to prebuild a set of date calculations that make the standard values easier to get, and non-standard groupings possible. The technique I will cover makes moving around in the groupings more easily accessible. Like if you want data from the last 3 complete months. The query to do this isn’t rocket science, but it isn’t exactly straightforward either.

For the example, I will use the calendar table that I have on my website here: http://drsql.org/code in the download SimpleDateDimensionCreateAndLoad, and will load it with data up until 2020. Here is that structure:

Read on for examples of usage.  This is an example where thinking relationally differs from thinking procedurally—imagining date ranges as pre-calculated sets isn’t intuitive to procedural developers, but it can give a big performance boost.

Checking File Sizes In SQL Server

Andy Mallon looks back at a contribution by Junior DBA Andy, this one on checking file sizes:

This is every DBA’s favorite game. Figuring out what DMV contains the data you want. It turns out there are two places that database file size info is maintained. Each database has sys.database_files which has information for that database. The master database also has sys.master_files, which contains information for every database.

Using sys.master_files seems like it would be the obvious choice: everything in one view in master is going to be easier to query than hitting a view in a bunch of different databases. Alas, there’s a minor snag. For tempdb, sys.master_files has the initial file size, but not the current file size. This is valuable information, but doesn’t answer the use cases we set out above. I want to know the current file sizes. Thankfully, sys.database_files has the correct current file sizes for tempdb, so we can use that.

Using sys.database_files seems like it’s going to be the right move for us then. Alas, this isn’t quite perfect either. With Log Shipping, and Availability Group secondaries, if you’ve moved data files to a different location, sys.database_files will contain the location of the files on the primary database. Thankfully, sys.master_files has the correct local file locations for user databases, so we can use that.

Ugh, so it looks like the answer is “both”… we’ll need to use sys.database_files for tempdb, and sys.master_files for everything else.

Click through for the script, including Andy’s critical reflection on how Past Andy has failed him.

Categories

August 2019
MTWTFSS
« Jul  
 1234
567891011
12131415161718
19202122232425
262728293031