Monitoring SQL Agent Job Failures

Mark Wilkinson shows how to set up a SQL Agent job failure monitoring solution:

Since we are storing the date the records are added to the table, this query will always return the latest set of failures. This is a simple example, but the possibilities are endless:

  • Send the results of this query via database mail

  • Join with dbo.sysjobs and dbo.syscategories, alerting on different thresholds per job category

  • Extend the TOP (1) to include multiple capture periods and alert on average failures per capture

Check it out.  This is particularly helpful if you get blasted with thousands of error messages per minute because somebody made a bunch of untested changes and broke every job in your environment and caused the mail server to throttle your account for a multi-day period.  Not that this has ever happened to me, of course…

Related Posts

New(ish) VLF Status: 4

Paul Randal points out a new VLF status which can appear if you’re using an Availability Group: At least since I started working on the SQL Server team (just after 7.0 shipped) and since then there have only been two VLF status codes: 0 = the VLF is not active (i.e. it can be (re)activated […]

Read More

What sys.dm_exec_query_stats Can Miss

Matthew McGiffen takes us through a scenario where sys.dm_exec_query_stats did not give a complete view of what was running on SQL Server: I got less than 50 rows back so figured I had everything covered, but the total elapsed time across all the queries was less than ten minutes, I knew the server hadn’t been […]

Read More

Categories

August 2017
MTWTFSS
« Jul Sep »
 123456
78910111213
14151617181920
21222324252627
28293031