Category: Administration

Debugging a Production Failure

Roel Hogervorst diagnoses trouble:

When you are in panic mode you focus on what is right in front of you and make suboptimal decisions. Here are some I have made.

Read on for a couple of stories as well as a practical implementation of debugging as an OODA loop. Something that Sean McCown mentioned before has always stuck with me: it’s amazing just how few people know how to troubleshoot issues. Our inclination seems to be one of two things: jump to a conclusion from the first piece of evidence (usually just a flimsy error message) or immediately give up.

Comments closed

Using Buffer Pool Extension in SQL Server

Chad Callihan looks at buffer pool extension:

Perhaps you started out with X amount of memory when your SQL Server was brought online and over time, with additional load and activity on that SQL Server, users are not quite getting the type of performance they’re used to getting. Sure, you can buy more memory. What if that’s not an option?

If you’re running low on memory and need a little boost, enabling buffer pool extension can take advantage of an SSD as an “extension” for the buffer pool.

This is one of those interesting features that probably helps a small number of customers but isn’t generally useful. That’s because even with SSD performance improvements, memory is still a couple of orders of magnitude faster, so as long as you have the ability to add RAM, doing so brings much better performance.

Installing Prometheus Exporter for Windows Clients

Jamie Wick exports some data:

Prometheus is an open-source monitoring solution that our Linux team has been using for several years. More recently, we began using it for our Windows-based servers too. (I’ll post a writeup about Prometheus in the future)

One of the obstacles to implementing Prometheus monitoring on our Windows servers was finding and installing an agent. We ultimately decided to use the windows_exporter agent available in the Prometheus Community on GitHub. The exporter is free to use under an MIT license and supports an extensive list of WMI metrics that are grouped into Collectors.

Read on for more info, including ways to avoid common errors.
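Once an exporter is running, its metrics are served as plain text in the Prometheus exposition format. As a rough sketch of what a consumer of that text might do, the following parses a few sample lines into name, labels, and value. The sample text, including the metric names and numbers, is an illustrative assumption and not output captured from a real server, and the parser is deliberately simplified (it ignores timestamps and escaped quotes in label values).

```python
import re

# Illustrative sample only: metric names and values are assumptions,
# not output captured from a real server.
SAMPLE = """\
# HELP windows_os_physical_memory_free_bytes Free physical memory
# TYPE windows_os_physical_memory_free_bytes gauge
windows_os_physical_memory_free_bytes 4.2e+09
windows_logical_disk_free_bytes{volume="C:"} 1.5e+10
windows_logical_disk_free_bytes{volume="D:"} 8e+10
"""

# metric_name, optional {label="value",...} block, then the sample value.
LINE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)'
    r'(?:\{(?P<labels>.*)\})?'
    r'\s+(?P<value>\S+)'
)
LABEL = re.compile(r'(\w+)="([^"]*)"')


def parse_metrics(text):
    """Return a list of (name, labels_dict, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and # HELP / # TYPE comments.
        if not line or line.startswith("#"):
            continue
        match = LINE.match(line)
        if match is None:
            continue
        labels = dict(LABEL.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


for name, labels, value in parse_metrics(SAMPLE):
    print(name, labels, value)
```

In practice Prometheus itself scrapes and parses this text, so a hand-rolled parser like this is mostly useful for quick spot checks that an exporter is reporting what you expect.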

SQL Audit for STIG Compliance

Tracy Boggiano has proof of existence:

Recently I spent months of my life working on STIG and CIS compliance at my job, and one of those tasks was setting up SQL Audit for STIG. Now, that might seem like a trivial task; after all, don’t you just have to create an audit and audit specification and let it run? If only it were that easy. Some of the specifications can have a significant performance impact on your system depending on the type of activity happening, and if you happen to be lucky enough to have monitoring software set up, you will be logging even more data that doesn’t make sense to log. In addition, on my system we are using SQL replication, and that activity, due to volume, doesn’t make sense to log. So, let’s walk through my setup and how I got there, the how I got there being the most important part so you can figure out how to use filters to set up a SQL audit that doesn’t kill your performance.

Read on for the audit specification and server audit scripts, as well as some details on how to read from server audits.

Azure Delete Locks

Denny Cherry has some advice:

When I’m working in a client’s Azure environment, and they don’t have a delete lock on their production environment, I always work on getting them to have one.

This doesn’t always play nicely with everything in Azure, so read on for Denny’s advice when working with Azure Migrate.

Monitoring Kubernetes in Production

Samir Behara provides some guidance:

Kubernetes is an open-source container orchestration system for automating the deployment and management of containerized applications. Kubernetes provides capabilities like service discovery, horizontal autoscaling, and load balancing, while ensuring that application configurations are declarative and that systems are self-healing.

In this article, I will explain how to monitor your Kubernetes cluster and implement automated health checks, and discuss the various monitoring tools available.

Read on for some thoughts. In addition to Samir’s links and ideas, I’d also throw in some tools like Rancher to make management a little easier.
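To make the health check idea concrete, here is a minimal sketch of the application side: a tiny HTTP handler that a Kubernetes liveness or readiness probe could call. The `/healthz` and `/readyz` paths, the `STATE` flag, and port 8080 are assumptions chosen for illustration; they do not come from Samir’s article, and a real service would pick its own paths and readiness logic.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical readiness flag: a real service would flip this to True
# after warm-up work such as opening database connections.
STATE = {"ready": False}


def probe_status(path, state):
    """Return the HTTP status code to send for a probe path."""
    if path == "/healthz":
        # Liveness: the process is up and able to answer requests.
        return 200
    if path == "/readyz":
        # Readiness: only accept traffic once warm-up has finished.
        return 200 if state["ready"] else 503
    return 404


class ProbeHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(probe_status(self.path, STATE))
        self.end_headers()


def serve(port=8080):
    """Block forever, answering probe requests on the given port."""
    HTTPServer(("", port), ProbeHandler).serve_forever()
```

Keeping the status logic in a plain function separate from the handler makes it easy to test without starting a server. An HTTP probe treats a 2xx or 3xx response as success, so returning 503 until warm-up completes keeps traffic away from a pod that is not yet ready.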
