Category: Administration

Get-CmsHosts is a function I wrote as part of a custom PowerShell module we maintain internally at my employer. It is simple to use, but is the base of most automation projects I work on.

Simple Example
PS> Get-CmsHosts -SqlInstance 'srv-' -CmsInstance srv-mycms-01
This example will connect to srv-mycms-01 and return a distinct list of instance host names registered with that CMS server that start with the string srv-. This output can then be piped to other commands:

Read on for more examples and details, and then grab the script at the end of Mark’s post.

Comments closed

Invalid Class Error Trying To Access WMI Class

Published 2017-09-11 by Kevin Feasel

Claudio Silva troubleshoots an error which gives the user a red herring:

This can return more than one line with different ComputerManagement (like ComputerManagement10). It depends on the versions you have installed on the host. The number “10” refers to the SQL Server 2008.
Now I can uncomment the last command and run it. The result is:

Get-CimInstance : Invalid class
At line:1 char:1
+ Get-CimInstance -CimSession $CIMsession -Namespace $(“rootMicrosoftSQLServerC …
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : MetadataError: (:) [Get-CimInstance], CimException
+ FullyQualifiedErrorId : HRESULT 0x80041010,Microsoft.Management.Infrastructure.CimCmdlets.GetCimInstanceCommand
+ PSComputerName : HOST001

Ok, a different error message. Let’s dig in it. I logged in on the host and confirmed that I have a SQL Server 2008 R2 instance installed. This means that I’m not accessing a lower version than 2005 like the initial warning message was suggesting.

Read the whole thing.

Comments closed

Troubleshooting Ambari Server

Published 2017-09-06 by Kevin Feasel

Jay SenSharma has an interesting article on troubleshooting Ambari Server:

When we notice that the ambari server is responding slow then we should look first the following details first:

1). The number of hosts added to the ambari cluster. So that accordingly we can tune the ambari agent thread pools.

2). The number of concurrent users (or the view users) who access the ambari server at a time. Sothat accordingly we can tune the ambari thread pools.

3). The age of the ambari cluster. If the ambari server is too old then the possibility is that some of the operational logs and the alert histories will be consuming a large amount of the Database which might be causing ambari DB queries to respond slow.

4). The Ambari Database health and it’s geographic location from the ambari server, to isolate if there are any network delays.

5). Ambari server memory related tuning parameters to see if the ambari heap is set correctly.

6). For ambari UI slowness we should check the network proxy issues to see if there are any network proxies added between client the ambari server machine Or the network slowness.

7). If the ambari users are synced with the AD or external LDAP and if the communication between server and the AD/LDAP is good.

8). Also the resource availability on the ambari host like the available free memory and if any other service/component running on ambari server is consuming more Memory/CPU/IO.

There is a lot of detail here, including quite a few checks to run.

Comments closed

Dealing With Trace Flags In The Registry

Published 2017-09-06 by Kevin Feasel

Wayne Sheffield shows us how to configure SQL Server trace flags within T-SQL using registry access commands:

In a recent post, I introduced you to how to work with the registry directly from within SQL Server. Continuing this theme, this post provides an example situation where you would do so.

In this example, we will want to configure SQL Server to enable a few trace flags (TF) when SQL Server starts. Specifically, let’s set trace flags 1117 and 1118 so that they are enabled when SQL Server starts up, and enable them now. Additionally, we have trace flags 1204 and 1222 now enabled on some servers, and we want to disable those (since we have the deadlocks being captured in the system health XE, we don’t need them in the error log also). We also don’t want to force a restart of the SQL Server services.

I’ve always felt a little icky about writing to the registry from SQL Server, but Wayne shows how to do it right.

Comments closed

Creating Minidumps In SQL Server

Published 2017-09-04 by Kevin Feasel

Joe Obbish shows how to read the call stack by creating a minidump:

We can use minidumps to generate small files that contain SQL Server call stacks. Note that if you aren’t careful you can end up writing the full contents of memory to a dump file. I believe that writing the file is a single-threaded process, so this can take hours and hundreds of GBs on a real server. Also SQL Server is frozen while the dump is happening, so don’t do it in production.

Dumps are most useful when you want to examine a long running, very simple query or a query that seems to be stuck at a certain point. They aren’t useful to see a full call stack of a non-simple query or to see something that happens just once during execution, like a specific task during query compilation. All that you can get is a snapshot and the snapshot may not contain the information that you’re looking for.

There are multiple ways to generate and read dump files. For this blog post I’ll be using sqldumper.exe and WinDbg.

Click through to read the whole thing.

Comments closed

Kafka Cruise Control

Published 2017-09-01 by Kevin Feasel

Jiangjie Qin announces Cruise Control, an automated workload management system for Kafka:

Intelligent automation is critical under these circumstances, which is why we developed Cruise Control: a general-purpose system that continually monitors our clusters and automatically adjusts the resources allocated to them to meet pre-defined performance goals. In essence, users specify goals, Cruise Control monitors for violations of these goals, analyzes the existing workload on the cluster, and automatically executes administrative operations to satisfy those goals. You can see a video here about Cruise Control at the Stream Processing Meet Up last fall.

Today we are pleased to announce that we have open sourced Cruise Control and it is now available on Github. In this post, we’ll describe Cruise Control’s uses both generally and at LinkedIn, its architecture, and some unique challenges we faced when creating it. For further details about Kafka terminology used throughout this post, this reference can be a helpful guide.

This isn’t a monitoring tool per se, but rather a resource balancing tool. And it’s now freely available to all.

Comments closed

Monitoring SQL Agent Job Failures

Published 2017-08-31 by Kevin Feasel

Mark Wilkinson shows how to set up a SQL Agent job failure monitoring solution:

Since we are storing the date the records are added to the table, this query will always return the latest set of failures. This is a simple example, but the possibilities are endless:

Send the results of this query via database mail
Join with dbo.sysjobs and dbo.syscategories, alerting on different thresholds per job category
Extend the TOP (1) to include multiple capture periods and alert on average failures per capture

Check it out. This is particularly helpful if you get blasted with thousands of error messages per minute because somebody made a bunch of untested changes and broke every job in your environment and caused the mail server to throttle your account for a multi-day period. Not that this has ever happened to me, of course…

Comments closed

Explaining Max Server Memory

Published 2017-08-31 by Kevin Feasel

Randolph West explains SQL Server memory limits in various editions:

For the sake of this post, I am talking specifically about SQL Server 2012 and higher.

Let’s break this down.

On all editions of SQL Server (Enterprise, Standard, Web, Express), there is a configuration setting called Max Server Memory. This value should not be left at the default 2.1 petabytes, because at that value SQL Server will use all the memory you allow it (plus a little bit more if it needs to), and once it has it, it doesn’t release it back to the operating system without a fight.

Read on to see what Max Server Memory actually includes, as well as additional limitations on editions other than Enterprise.

Comments closed

Sizing Memory-Optimized Workloads

Published 2017-08-29 by Kevin Feasel

Prashanth Purnananda gives us a few notes regarding memory-optimized table sizes:

Recovering database with memory-optimized tables involves hydrating the contents of checkpoint files (data/delta files) into memory and then replaying the tail of the log (see this link for more details). One of the important difference between disk based tables and memory-optimized store is frequency of checkpoints. Automatic checkpointing for in-memory tables occurs every 1.5GB of log records unlike traditional or indirect checkpoints (where checkpointing is done more often) leading to longer tail of log for in-memory tables. The 1.5 GB log flush is chosen to strike the right balance between flooding the IO subsystem with too many small inefficient IO operations and too few large IOPs. In most scenarios observed by our CSS teams, long recovery times for memory optimized databases is caused by the long tail of log which needs to be recovered for in-memory tables in the database. For these scenarios, running a manual checkpoint before a restart can reduce recovery times as manual checkpoint forces the checkpoint for memory optimized tables in addition to disk based tables.

If you’re looking at creating memory-optimized tables, these are important administrative notes.

Comments closed

Thinking About Virtual Log Files

Published 2017-08-24 by Kevin Feasel

Monica Rathbun has a reminder that Virtual Log Files can be troublesome in excess:

What causes High VLFs?

As transactions force growth of the log file, inappropriate log file sizing or auto-growth settings can cause a high number of VLFs to occur. Each growth event adds VLFs to the log file. The more often you grow in conjunction with smaller growth segments, the more VLFs your transaction log will have.

Example

If you grow your log by the default 1 MB you may end up with thousands of VLFs as opposed to growing by 1GB increments. MSDN does a great job on explaining how a transaction logs work for a deeper dive I recommend reading it.

Read on to see how many VLFs your databases have, as well as how to reduce the number should it grow excessive.

Comments closed

M	T	W	T	F	S	S
						1
2	3	4	5	6	7	8
9	10	11	12	13	14	15
16	17	18	19	20	21	22
23	24	25	26	27	28	29
30	31