
Category: Administration

Soft-NUMA Doesn’t Limit MAXDOP

Lonny Niederstadt tests whether soft-NUMA forces MAXDOP = 1:

I mentioned that I was planning to set up a soft-NUMA node for each vcpu on a 16 vcpu VM, to evenly distribute incoming connections and thus DOP 1 queries over vcpus.  Thomas Kejser et al used this strategy to good effect in “The Data Loading Performance Guide”, which used SQL Server 2008 as a base.
https://technet.microsoft.com/en-us/library/dd425070(v=sql.100).aspx

My conversation partner cautioned me that leaving this soft-NUMA configuration in place after the specialized workload would result in DOP 1 queries whether I wanted them or not.  The claim was, effectively, a parallel query plan generated by a connection within a soft-NUMA node would have its MAXDOP restricted by the scheduler count (if lower than other MAXDOP contributing factors).  Though I wasn’t able to test at the time, I was skeptical: I’d always thought that soft-NUMA was consequential to connection placement, but not to MAXDOP nor to where parallel query workers would be assigned.

I’m back home now… time to test!!

Read on for the test.


Picking Azure VM Sizes

Glenn Berry helps us pick the right-sized Azure VM for a SQL Server installation:

A common issue with Azure VM sizing for SQL Server has been the fact that you were often forced to select a VM size that had far more virtual CPU cores than you needed or wanted in order to have enough memory and storage performance to support your workload, which increased your monthly licensing cost.

Luckily, Microsoft has recently made the decision process a little easier for SQL Server with a new series of Azure VMs that use some particular VM sizes (DS, ES, GS, and MS), but reduce the vCPU count to one quarter or one half of the original VM size, while maintaining the same memory, storage and I/O bandwidth. These new VM sizes have a suffix that specifies the number of active vCPUs to make them easier to identify.

For example, a Standard_DS14v2 Azure VM would have 16 vCPUs, 112GB of RAM, and support up to 51,200 IOPS or 768MB/sec of sequential throughput (according to Microsoft). A new Standard_DS14-8v2 Azure VM would only have 8 vCPUs, with the same memory capacity and disk performance as the Standard_DS14v2, which would reduce your SQL Server licensing cost per year by 50%. Both of these Azure VM SKUs would have the same ACU score of 160.
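
The licensing arithmetic in that example can be checked with a short sketch. It assumes that licensing cost scales linearly with the number of active vCPUs; the function name is hypothetical, and only the vCPU counts come from the quoted text.

```python
def licensing_reduction(full_vcpus: int, constrained_vcpus: int) -> float:
    """Fractional reduction in licensing cost, assuming cost is proportional
    to the number of active vCPUs (an assumption of this sketch)."""
    if full_vcpus <= 0 or not 0 < constrained_vcpus <= full_vcpus:
        raise ValueError("vCPU counts must be positive, with constrained <= full")
    return 1 - constrained_vcpus / full_vcpus


# Standard_DS14v2 (16 vCPUs) versus Standard_DS14-8v2 (8 vCPUs)
print(f"{licensing_reduction(16, 8):.0%}")
```

By the same reasoning, a size constrained to one quarter of the original vCPU count would cut the per-core licensing portion by three quarters.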

Glenn is, as always, a font of useful information.  Go read the whole thing.


Error Handling On SQL With Linux

Anthony Nocentino explains Linux error codes and systemd behavior for SQL on Linux:

Now in the output above, you’ll notice a bolded line. In there, you can see that systemd[1] receives a return code from SQL Server of status=1/FAILURE.  Systemd[1] is the parent process to sqlservr; in fact, it’s the parent to all processes on our system. It receives the exit code and systemd immediately initiates a restart of the service due to the configuration we have for our mssql-server systemd unit.
What’s interesting is that this happens even on a normal shutdown. But that simply doesn’t make sense: clean exits should return 0. It’s my understanding that the SHUTDOWN command will cause the database engine to shut down cleanly.
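
To make the exit-code point concrete, here is a minimal Python sketch of how a parent process sees a child's exit code and how a restart-on-failure rule might react to it. The `restart_on_failure` function is a hypothetical, heavily simplified stand-in for a service manager's policy; the quoted text does not say which restart setting the unit uses, so that part is an assumption.

```python
import subprocess
import sys


def exit_code_of(python_snippet: str) -> int:
    """Run a short Python snippet as a child process and return its exit code."""
    result = subprocess.run([sys.executable, "-c", python_snippet])
    return result.returncode


def restart_on_failure(exit_code: int) -> bool:
    """Simplified, assumed policy: restart only when the exit code is non-zero.
    Real service managers also look at signals, timeouts, and configured
    exit statuses, which this sketch ignores."""
    return exit_code != 0


clean = exit_code_of("import sys; sys.exit(0)")
failed = exit_code_of("import sys; sys.exit(1)")
print(clean, restart_on_failure(clean))
print(failed, restart_on_failure(failed))
```

Under a rule like this, a process that returns 1 on what should be a clean shutdown looks identical to a crash, which is why the service comes right back up.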

On the development side, there aren’t many differences between SQL on Linux versus SQL on Windows (aside from things which haven’t yet made the move); on the administration side, there are some interesting differences.


Stress Testing SQL Server

Jes Borland shows how to use ostress to perform load testing against a SQL Server instance:

Ostress allows you to specify one file, or a folder that contains multiple files, to run. You can also specify a number of connections to be made to the database, to simulate multiple users or applications running the same query. Each connection can then run the file one or more times.

The next thing you’ll need is one or more .sql files that the tool will run.

To run a load test, you’ll open RML cmd prompt and enter your command.
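
The connections-times-iterations model described above can be sketched in Python. This is not ostress and does not call it; `run_load` and `fake_query` are hypothetical names, and the workload is a stand-in that only counts executions rather than touching a database.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def run_load(workload: Callable[[], None], connections: int, iterations: int) -> int:
    """Run `workload` `iterations` times on each of `connections` worker
    threads and return the total number of executions."""

    def one_connection() -> int:
        for _ in range(iterations):
            workload()
        return iterations

    with ThreadPoolExecutor(max_workers=connections) as pool:
        futures = [pool.submit(one_connection) for _ in range(connections)]
        return sum(f.result() for f in futures)


# Stand-in workload: records each execution instead of running a .sql file.
executions = []
lock = threading.Lock()


def fake_query() -> None:
    with lock:
        executions.append(1)


total = run_load(fake_query, connections=4, iterations=3)
print(total, len(executions))
```

Four simulated connections running the workload three times each yields twelve executions in total, which is the same multiplication to keep in mind when sizing a real test.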

Ostress isn’t as nice as a replayable trace for generating production loads, but it’s an easy method to stress test a server.


Automating Cache Cleanup

Tracy Boggiano has a process to automate cleaning up different caches in SQL Server:

First, we need to create a table to store our information on the caches we would like to clear on an automated basis and populate it with values.

For example, we clear SQL Plans if more than 10,000 Adhoc or Prepared plans take up 5 GB of memory, if the number of Single Used Plans is greater than 10,000, or if the memory used for Adhoc or Prepared plans is more than 50% of memory.  We clear the Transactions cache if it is more than 2 GB and Lock Manager : Node 0 if it is more than 2 GB.
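
One reading of those thresholds, sketched as Python decision functions. The class, function, and field names are hypothetical, the quoted sentence is ambiguous about how the 10,000-plan and 5 GB conditions combine, and this is not the script from the linked post.

```python
from dataclasses import dataclass


@dataclass
class PlanCacheStats:
    adhoc_prepared_gb: float   # memory used by Adhoc and Prepared plans, in GB
    single_use_plans: int      # number of plans that were used only once
    adhoc_prepared_pct: float  # Adhoc/Prepared plan memory as a % of memory


def should_clear_sql_plans(stats: PlanCacheStats,
                           max_gb: float = 5.0,
                           max_single_use: int = 10_000,
                           max_pct: float = 50.0) -> bool:
    """Assumed interpretation: clear when any one threshold is exceeded."""
    return (stats.adhoc_prepared_gb > max_gb
            or stats.single_use_plans > max_single_use
            or stats.adhoc_prepared_pct > max_pct)


def should_clear_by_size(cache_gb: float, max_gb: float = 2.0) -> bool:
    """Size-only rule, as described for Transactions and Lock Manager : Node 0."""
    return cache_gb > max_gb


print(should_clear_sql_plans(PlanCacheStats(1.0, 12_000, 10.0)))
print(should_clear_by_size(2.5))
```

Keeping the thresholds as parameters mirrors the idea of storing them in a table, so each cache can have its own limits without changing the logic.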

Read on for the script.


Query Store Capture Modes

Arun Sirpal notes an important difference in the default Query Store settings for SQL Server 2017 versus Azure SQL Database:

So just remember the only difference when analyzing settings is the difference in Query Store Capture Mode. For Azure it is set to AUTO whereas with locally installed SQL Servers it is set to ALL.

What does this mean? ALL means that it is set to capture all queries but AUTO means infrequent queries and queries with insignificant cost are ignored. Thresholds for execution count, compile and runtime duration are internally determined.
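
A toy Python model of the difference between the two modes. The real thresholds are internal and not stated here, so `MIN_EXECUTIONS` and `MIN_TOTAL_DURATION_MS` are made-up placeholder values, and the rule that a query is kept when it is either frequent enough or costly enough is an assumption of this sketch.

```python
from enum import Enum


class CaptureMode(Enum):
    ALL = "ALL"
    AUTO = "AUTO"


# Hypothetical placeholders: the actual thresholds are internally determined.
MIN_EXECUTIONS = 3
MIN_TOTAL_DURATION_MS = 100.0


def is_captured(mode: CaptureMode, execution_count: int,
                total_duration_ms: float) -> bool:
    """ALL captures every query; AUTO skips queries that are both infrequent
    and cheap under the assumed thresholds above."""
    if mode is CaptureMode.ALL:
        return True
    return (execution_count >= MIN_EXECUTIONS
            or total_duration_ms >= MIN_TOTAL_DURATION_MS)


print(is_captured(CaptureMode.ALL, 1, 0.5))
print(is_captured(CaptureMode.AUTO, 1, 0.5))
```

The practical effect is that a one-off, trivial query shows up under ALL but may never appear under AUTO.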

Read on to learn more, including how to change these settings.


External Memory Pressure With SQL On Linux

Anthony Nocentino explains how SQL Server on Linux reacts to memory pressure:

We can use tools like ps, top and htop to look at our virtual and physical memory allocations. We can also look in the /proc virtual file system for our process and look at the status file. In here we’ll find the point in time status of a process, and most importantly the types of memory allocations for a process. We’ll get granular data on the virtual memory allocations and also the resident set size of the process. Here are the interesting values in the status file we’re going to focus on today.

  • VmSize – total current virtual address space of the process

  • VmRSS – total amount of physical memory currently allocated to the process

  • VmSwap – total amount of virtual memory currently paged out to the swap file (disk)
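
Those three fields are easy to pull out programmatically. The sketch below parses status-file text in Python; the function names are hypothetical and the sample values are invented for illustration, not taken from a real sqlservr process.

```python
from pathlib import Path
from typing import Dict

FIELDS = ("VmSize", "VmRSS", "VmSwap")


def parse_status(text: str) -> Dict[str, int]:
    """Extract the selected Vm* fields (reported in kB) from the contents
    of a /proc/<pid>/status file."""
    values: Dict[str, int] = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if sep and key in FIELDS:
            values[key] = int(rest.split()[0])  # the number precedes the "kB" unit
    return values


def read_status(pid: str = "self") -> Dict[str, int]:
    """Read and parse the status file for a process (Linux only)."""
    return parse_status(Path(f"/proc/{pid}/status").read_text())


# Invented sample content, for illustration only.
sample = (
    "Name:\tsqlservr\n"
    "VmSize:\t 9876543 kB\n"
    "VmRSS:\t  654321 kB\n"
    "VmSwap:\t       0 kB\n"
)
print(parse_status(sample))
```

On a Linux host, calling `read_status` with the process ID of sqlservr (as a string) would return the same three fields for the live process, which makes it simple to sample them over time while applying memory pressure.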

The differences are going to be interesting for people to troubleshoot later, particularly if you look at SOS_SCHEDULER_YIELD and give a knee-jerk reaction that the problem is with CPU.


When The Maximum Workspace Memory Isn’t The Internal Pool Maximum

Lonny Niederstadt answers the call from someone who needs the combination of Perfmon and DMV data:

When is a maximum not really the maximum?
When it’s a maximum for an explicitly or implicitly modified default.
Whether “the definitive documentation” says so or not.

Yesterday on Twitter #sqlhelp this question came up.


Aha! I thought to myself.  For this I am purposed! To show how Perfmon and DMV data tie out!

Read on for the simple form of the answer, followed by the complication which makes life interesting.


Tracking Long-Running Queries

Ryan Booz walks us through tracking long-running queries with sp_whoisactive:

This solution runs sp_WhoIsActive every minute and saves the output into a global temp table. From there, I look for any processes that have been running for more than the low threshold setting. Any of the processes that have not been identified and stored previously get logged and output to an HTML table, and an email alert is sent.

Next, I take a second look at the table for anything that’s been running longer than the high threshold.  If a second email alert has not been sent for these processes, we output the same data and send the email. If two alerts have already been sent for these processes, I don’t do anything else at the moment. One of the next updates to this script will send an alert to our DevOps notification system for anything running longer than some final threshold (or maybe just the high threshold).
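
The two-pass, at-most-two-alerts logic described above can be sketched in Python. This is not the linked script: the function name, the session-ID keys, and the threshold values in the usage lines are all hypothetical, and the HTML output and email sending are left out.

```python
from typing import Dict, List, Tuple


def alerts_to_send(running: Dict[int, float], alerts_sent: Dict[int, int],
                   low_threshold_s: float,
                   high_threshold_s: float) -> List[Tuple[int, str]]:
    """Decide which alerts to send for one polling cycle.

    running:     session id -> elapsed seconds for currently running queries
    alerts_sent: session id -> number of alerts already sent (updated in place)
    """
    to_send: List[Tuple[int, str]] = []
    # First pass: anything past the low threshold that has not been logged yet.
    for session_id, elapsed in running.items():
        if elapsed > low_threshold_s and session_id not in alerts_sent:
            alerts_sent[session_id] = 1
            to_send.append((session_id, "low"))
    # Second pass: anything past the high threshold without a second alert.
    for session_id, elapsed in running.items():
        if elapsed > high_threshold_s and alerts_sent.get(session_id, 0) < 2:
            alerts_sent[session_id] = 2
            to_send.append((session_id, "high"))
    return to_send


sent: Dict[int, int] = {}
first = alerts_to_send({51: 400.0}, sent, 300.0, 900.0)
repeat = alerts_to_send({51: 460.0}, sent, 300.0, 900.0)
later = alerts_to_send({51: 1000.0}, sent, 300.0, 900.0)
final = alerts_to_send({51: 1060.0}, sent, 300.0, 900.0)
print(first, repeat, later, final)
```

Because the state records how many alerts each session has received, polling every minute produces at most two messages per long-running query rather than one per poll.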

I particularly like this part about not re-alerting over and over for a long-running query.  It’s a relatively minor part of the whole solution, but it gets annoying watching the same e-mail come in every 5 minutes, especially if there’s nothing you can (or at least want to) do about the cause.
