The initial email didn’t have a lot of details, so I started asking questions to understand what version was being installed, the environment configuration, etc. Turns out this was a two-node Windows Server Failover Cluster (WSFC) with multiple SQL Server 2012 instances installed, and one of the instances was still running on the node this person was trying to patch. To be clear: the two nodes were SRV1 and SRV2; PROD-A and PROD-B were running on SRV1, and PROD-C was running on SRV2. This person was trying to install the cumulative update on SRV2.
The behavior Erin describes is a little bit crazy, but at least there’s a good explanation and a way to solve the issue.
MIND YOUR COMPATIBILITY LEVEL
When going to 2014 (as of today, 2016’s RTM hasn’t been announced yet), you’ll have to decide whether or not the new cardinality estimator suits you. There’s no cut-and-dried answer; you’ll have to test it against your workload. If you’d like some of the more modern SQL features added to your arsenal, you can bump yourself up to compatibility level 110 (the SQL Server 2012 level) to get the majority of them.
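As a sketch, the compatibility level is a per-database setting (the database name here is hypothetical). On SQL Server 2014, level 110 keeps the legacy cardinality estimator, while 120 switches to the new one:

```sql
-- Check the current compatibility level (YourDatabase is a placeholder name).
SELECT name, compatibility_level
FROM sys.databases
WHERE name = N'YourDatabase';

-- 110 = SQL Server 2012 behavior (legacy cardinality estimator);
-- 120 = SQL Server 2014 behavior (new cardinality estimator).
ALTER DATABASE [YourDatabase] SET COMPATIBILITY_LEVEL = 110;
```

Test the same workload under both settings before committing either way.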
The interesting survey would be, among people who still have SQL 2005 installations, how many will move as a result of Microsoft declaring end-of-life for 2005. My expectation is a fairly low percentage—by this point, I figure at least a strong minority of 2005 instances are around for regulatory or compliance reasons (e.g., some piece of regulated software was certified only for 2005).
You can think of page compression as doing data deduplication within a page. If there is some value repeated in multiple spots on a page, then page compression can store the repetitive value only once, and save some space.
Page compression is actually a combination of three different compression algorithms, applied in this order:
1) Row compression
2) Prefix compression
3) Dictionary compression
Page compression is my go-to compression option, typically. There are some cases in which it doesn’t work well, so check beforehand (start with sp_estimate_data_compression_savings), but I’ve had good luck with page compression.
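A minimal sketch of that workflow, with a hypothetical table name, is: estimate first, then rebuild with page compression if the numbers look good.

```sql
-- Estimate the space savings before committing (dbo.SalesOrderDetail is a
-- placeholder table name).
EXEC sp_estimate_data_compression_savings
    @schema_name      = 'dbo',
    @object_name      = 'SalesOrderDetail',
    @index_id         = NULL,   -- NULL = all indexes on the table
    @partition_number = NULL,   -- NULL = all partitions
    @data_compression = 'PAGE';

-- If the estimate justifies it, rebuild the heap/clustered index compressed.
ALTER TABLE dbo.SalesOrderDetail REBUILD WITH (DATA_COMPRESSION = PAGE);
```

Note the rebuild is an offline, size-of-data operation by default, so schedule it accordingly.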
Now, you have a meaningful list of wait statistics that will tell you exactly why, if not where, your server is running slow. Unfortunately, these waits still need to be interpreted. If you read further on Paul’s blog, you’ll see he has a number of waits and their causes documented. That’s your best bet to start understanding what’s happening on your system (although, I hear, Paul might be creating a more complete database of wait stats. I’ll update this blog post should that become available).
Wait stats are fantastic tools for figuring out your server resource limitations given the server’s historic query workload. They’re the first place to look when experiencing server pains.
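A simplified version of that kind of wait-stats query looks like this; the exclusion list here is a small sample of benign "background" waits (Paul's published script filters many more):

```sql
-- Top cumulative waits since the last service restart (or stats clear),
-- skipping a few common benign wait types.
SELECT TOP (10)
    wait_type,
    waiting_tasks_count,
    wait_time_ms / 1000.0 AS wait_time_seconds
FROM sys.dm_os_wait_stats
WHERE wait_type NOT IN (
    N'LAZYWRITER_SLEEP', N'SLEEP_TASK', N'SQLTRACE_BUFFER_FLUSH',
    N'BROKER_TASK_STOP', N'CLR_AUTO_EVENT', N'REQUEST_FOR_DEADLOCK_SEARCH')
ORDER BY wait_time_ms DESC;
```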
IT professionals (and amateurs), it’s time we had a chat. It’s time to stop dragging and dropping (or copying and pasting) files between servers and/or workstations.
It’s clumsy. It’s childish. It uses memory on the server.
Oh, and there’s a really easy tool to copy files built into Windows – Robocopy.
The syntax is pretty easy and robocopy handles small files well. Check out Nic Cain’s comment, though, if you’re going to copy large files in production.
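For illustration, a typical invocation might look like the following (the share paths and log location are hypothetical):

```
robocopy \\SRV1\Backups \\SRV2\Backups *.bak /Z /R:3 /W:5 /LOG:C:\Temp\robocopy.log
```

`/Z` enables restartable mode (useful if the network hiccups mid-copy), `/R` and `/W` cap the retry count and wait time, and `/LOG` writes a log file you can review afterward.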
This post investigated two potential workarounds to either buy you time before changing your existing IDENTITY column, or abandoning IDENTITY altogether right now in favor of a SEQUENCE. If neither of these workarounds is acceptable to you, please watch for part 4, where we’ll tackle this problem head-on.
This is your weekly reminder to plan for appropriate data sizes.
As you can see, Page Life Expectancy (PLE) on this graph dips, gradually climbs, and then dips again. With a collection every five minutes you may not catch the exact peak – all you know is that the PLE was 50,000 at 12:55am and then only 100 at 1:00am on 03/13. It may have climbed higher than that before it dipped, but by 1:00am it had dipped down to around 100 (coincidentally at 1am the CheckDB job had kicked off on a large database).
If you really need to know (in this example) exactly how high PLE gets before it dips, exactly how low it dips, or at what specific time it peaks or valleys, you need to actively watch or set up a collector with a more frequent collection interval. You will find that in most cases this absolute value isn’t important – it is sufficient to know that a certain item peaks/valleys in a certain five-minute interval, or that during a certain five-minute interval (“The server was slow last night at 3am”) a value was in an acceptable/unacceptable range.
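The PLE value itself is exposed through a DMV, so a scheduled collector can be as simple as inserting the result of a query like this into a history table on whatever interval you choose:

```sql
-- Current Page Life Expectancy, in seconds. The object_name filter matches
-- both default and named instances ("Buffer Manager" vs. "MSSQL$X:Buffer Manager").
SELECT [object_name],
       counter_name,
       cntr_value AS ple_seconds
FROM sys.dm_os_performance_counters
WHERE counter_name = N'Page life expectancy'
  AND [object_name] LIKE N'%Buffer Manager%';
```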
Andy also gives us a set of counters he uses by default and how to set up automated counter collection. Left to the reader is integrating that into an administrator’s workflow.
The most critical task for a SQL Server DBA is to ensure that your databases can be restored in the event of the loss of a server, for whatever reason: disk crash, fire in the server room, tribble invasion, whatever. To do this, not only do you have to back up your databases, you also have to test restores! Create a database and restore the backups of your production DB to it. It’s the safest way to make sure that everything works. This test restore can be automated to run every night, but that’s outside the scope of what I want to talk about right now.
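A minimal test restore might look like the following; the database name, backup path, and logical/physical file names are all placeholders you’d replace with your own:

```sql
-- Restore the production backup under a throwaway name on a test drive.
RESTORE DATABASE ProdDB_RestoreTest
FROM DISK = N'\\BackupShare\ProdDB\ProdDB_Full.bak'
WITH MOVE N'ProdDB_Data' TO N'D:\TestRestore\ProdDB_RestoreTest.mdf',
     MOVE N'ProdDB_Log'  TO N'L:\TestRestore\ProdDB_RestoreTest.ldf',
     REPLACE, STATS = 10;

-- A restore that completes isn't the whole story; verify consistency too.
DBCC CHECKDB (ProdDB_RestoreTest) WITH NO_INFOMSGS;
```

Running CHECKDB against the restored copy also offloads corruption checking from the production server.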
There are lots of places problems can creep in; this is just one part of how you’ll need to monitor systems. This is how I’ve done things for a number of years, and thus far it has served me well.
Depending upon your instance count, average database size, maintenance windows, etc. etc. etc., some of these things may or may not work, but the principle is the same: protect the data, and automate your processes to protect that data. This is a good article to read for ideas, and then from there dig into other administrative blog posts, videos, and books to become better versed in the tools and techniques available to protect your data.
Fortunately we can find queries with high CPU time using the sys.dm_exec_query_stats DMV. This DMV, introduced in SQL Server 2005, keeps performance statistics for cached query plans, allowing us to find the queries and query plans doing the most harm to our system.
Glenn Berry’s fantastic set of diagnostic queries also includes a couple for finding CPU consumers.
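A representative query of this kind (a sketch, not Glenn’s exact script) ranks cached statements by total worker time:

```sql
-- Top CPU consumers among cached plans, with the offending statement text.
SELECT TOP (10)
    qs.total_worker_time / 1000 AS total_cpu_ms,
    qs.execution_count,
    qs.total_worker_time / qs.execution_count / 1000 AS avg_cpu_ms,
    SUBSTRING(st.text, (qs.statement_start_offset / 2) + 1,
        ((CASE qs.statement_end_offset
              WHEN -1 THEN DATALENGTH(st.text)
              ELSE qs.statement_end_offset
          END - qs.statement_start_offset) / 2) + 1) AS statement_text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY qs.total_worker_time DESC;
```

Keep in mind this only covers plans still in cache; a restart or memory pressure resets the picture.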
Summary: We will need to drop and re-create any indexes, clustered or not, that reference the IDENTITY column – in the key or the INCLUDE. If the IDENTITY column is part of the clustered index, this means all indexes, since they will all reference the clustering key by definition. And disabling them isn’t enough.
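To scope the work, you can enumerate every index touching the column first (table and column names below are hypothetical):

```sql
-- Every index that references the IDENTITY column, whether in the key
-- or as an INCLUDEd column.
SELECT i.name AS index_name,
       i.type_desc,
       ic.is_included_column
FROM sys.indexes AS i
JOIN sys.index_columns AS ic
    ON ic.object_id = i.object_id
   AND ic.index_id  = i.index_id
JOIN sys.columns AS c
    ON c.object_id = ic.object_id
   AND c.column_id = ic.column_id
WHERE i.object_id = OBJECT_ID(N'dbo.Orders')
  AND c.name = N'OrderID';
```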
Getting column sizes right at the beginning is your best bet. Stay tuned for other alternatives.