Handling Rogue Queries In Spark

Alicja Luszczak, et al, introduce the Query Watchdog:

The previous query would cause problems on many different systems, regardless of whether you’re using Databricks or another data warehousing tool. Luckily, as an user of Databricks, this customer has a feature available that can help solve this problem called the Query Watchdog.

Note: Query Watchdog is available on clusters created with version 2.1-db3 and greater.

A Query Watchdog is a simple process that checks whether or not a given query is creating too many output rows for the number of input rows at a task level. We can set a property to control this and in this example we will use a ratio of 1000 (which is the default).

It looks like this is an all-or-nothing process, but a very interesting start.

Related Posts

Automatically Fix Those VLFs

Tracy Boggiano has a script which will fix log files with high virtual log file counts: First part of the process if to capture the info from DBCC LOGINFO or if you are ready for 2017 the new dmv sys.dm_db_log_stats into a table you can read later to know how many VLFs exist in your […]

Read More

Schema Comparison With Visual Studio

Arun Sirpal shows off the Schema Compare functionality within Visual Studio: A very common requirement which can be satisfied by various tools. Personally I like using Visual Studio 2017 Community Edition and I thought I would do a quick overview of it. First thing, you can find the download from this link: https://www.visualstudio.com/downloads/  and once installed (making […]

Read More

Categories

April 2017
MTWTFSS
« Mar May »
 12
3456789
10111213141516
17181920212223
24252627282930