Finding Abnormally Long-Running Jobs

Lori Brown has a script to find jobs running longer than their 30-day average:

I needed to update some of our long running job monitoring code to improve it from the version that we have right now. I like this version because it uses msdb.dbo.syssessions (https://msdn.microsoft.com/en-us/library/ms175016.aspx) to validate that a job is actually running. I also wanted to know the percent difference between the current run duration versus an average duration per job from the past 30 days. I decided to place the calculated average into a table variable and then join on it to get my results. I also used the IIF function (https://msdn.microsoft.com/en-us/library/hh213574.aspx) to help me avoid a divide by zero error that comes up when the average duration equals 0.

One thing which could cut down on false positives would be to calculate the standard deviation as well.  I wouldn’t automatically assume that job executions were normally distributed, but if you look at things more than one standard deviation away from the mean, it should remove noise of jobs which are just a little over the average but not in dangerous territory.

SQL Agent Alerts

David Alcock has a script to create SQL Agent alerts for common errors:

These alerts cover a range of errors from potential IO subsystem problems to failed logins, all of which are things a DBA needs to know about, and quickly too.
As well as error notifications you can set up alerts to cover performance conditions. The final statement in the script below sets up an alert that triggers when Page Life Expectancy drops below 1000. In all honesty I don’t set up these performance alerts that often but I wanted to show you the kind of thing that is possible and would be handy if you don’t have any third party monitoring.

He follows this up with a post on appropriate response:

But what do I mean by sensible? Typically I see a number of problems with alerting setups; either alerts are inadequate and don’t cover the necessary errors (or there are none at all) but I also see the notifications to alerts not being set up correctly meaning problems go backwards and forwards delaying any fixes.
The other problem I see is an over provision of alerts. This usually is because one or more other monitoring systems have been deployed and error notifications have been duplicated as a result. Imagine having an operational tool like System Centre, some SQL monitoring software and native alerting all pinging the same message to the one recipient mailbox. Now on top of that let’s say the alerts have not been configured correctly so information emails are being issued every second. It’s a scary thought but it is easy to see how a critical error might be missed in this scenario.

If you don’t have automatic alerts for high-severity errors, this is an easy way of gaining insight into the problems your server is experiencing.

Visualizing SQL Server Agent Jobs

Daniel Hutmacher shows how to visualize SQL Server Agent job runtimes using spatial data types:

If all you have is a hammer, everything will eventually start looking like a nail. This is generally known as Maslow’s hammer and refers to the fact that you use the tools you know to solve any problem, regardless if that’s what the problem actually needs. With that said, I frequently need a way to visualize the load distribution of scheduled jobs over a day or week, but I could never be bothered to set up a web server, learn a procedural programming language or build custom visualizations in PowerBI.

So here’s how to do that without leaving Management Studio.

Click through for discussion and link to the code.

Alert On SQL Jobs Missing Schedules

Brian Hansen wraps up a three-part series on scheduled job alerts:

The first two parts of this series addressed the general approach that I use in an SSIS script task to discover and alert on missed SQL Agent jobs. With apologies for the delay in producing this final post in the series, here I bring these approaches together and present the complete package.

To create the SSIS, start with an empty SSIS package and add a data flow task. In the task, add the following transformations.

Regardless of how you do it, knowing when jobs fail is important enough to build some infrastructure around answering this question.

SQL Agent Job Durations As TimeSpans

Rob Sewell shows us how to convert the SQL Agent job duration into a .NET TimeSpan using Powershell:

The first job took 15 hours 41 minutes  53 seconds, the second 1 minute 25 seconds, the third 21 seconds. This makes it quite tricky to calculate the duration in a suitable datatype. In T-SQL people use scripts like the following from MSSQLTips.com

((run_duration/10000*3600 + (run_duration/100)%100*60 + run_duration%100 + 31 ) / 60)  as ‘RunDurationMinutes’

I wish that some version of SQL Server would fix this “clever” duration.  We’ve had the time datatype since 2008; at least add a new column with run duration as a time value if you’re that concerned with backwards compatibility.

Fork Bombs

Brent Ozar creates a fork bomb in SQL Server:

I’ve always found fork bombs funny because of their elegant simplicity, so I figured, why not build one in SQL Server?

In order to do it, I needed a way to spawn a self-replicating asynchronous process, so I built:

  1. A stored procedure

  2. That creates an Agent job

  3. That runs the stored procedure

I didn’t think it was possible.  I certainly didn’t think it would take a half-dozen lines of code.

Email Delay

Dave Mason looks at the Delay Between Responses option in SQL Agent alerts:

Fortunately, I had set up a SQL Agent Alert for errors with Severity Level 17, which emailed me and several coworkers to alert us to the problem. But this was unfortunate too. Every one of those alert occurrences sent an email. Sure, it’s nice to know when there’s a problem, but a thousand or more emails is most certainly overkill. After addressing the transaction log issue, the alert emails kept coming. This query told me there were still a few thousand unsent emails:

Getting a message that something is down is important.  Getting several thousand messages is counterproductive.

Notifying Different Sets Of Operators

Jason Clements has an interesting solution to the problem of user notification:

The other day I was asked How to notify multiple operators using database mail for failed jobs and a different operators for successes.

First I looked at the operator email addresses field [email protected];[email protected]….etc is not helping as there is a limit on the characters in the email name entry of operator and we still have the problem we need different groups for success and failures.

It makes perfect sense, but is non-trivial.  I like it.

Conditional Alerting

Dave Mason revs up SQL Server alerts using tokens and conditional responses:

There are three tokens within the T-SQL (highlighted in yellow above): A-MSG, DATE, and TIME. SQL server replaces these three tokens as follows:

 

  • A-MSG: Message text. If the job is run by an alert, the message text value automatically replaces this token in the job step.
  • DATE: Current date (in YYYYMMDD format).
  • TIME: Current time (in HHMMSS format).

See the MSDN documentation for a list of tokens and their descriptions.

 

This is a great way of being smarter with alerts.  Your SQL Server instance has a lot of information at the ready, so get familiar with what’s up for offer.

Credentials And Proxies

Kenneth Fisher shows how to use credentials and proxies to run external objects (like SSIS packages and Powershell scripts) through the SQL Server Agent:

There are purposes for credentials other than a proxy, but for our purposes you are just going to enter an AD username and password. Just to be even more clear, this is an AD/Windows user. Not a sql server login.

In Object Explorer: ServerName -> Security -> Right click on Credentials and select New Credential -> Fill in the Name, Identity and Password fields.

Kenneth’s getting fancy with animated GIFs, and gives us a good walkthrough of this aspect of SQL Agent security.

Categories

September 2017
MTWTFSS
« Aug  
 123
45678910
11121314151617
18192021222324
252627282930