Handling Rogue Queries In Spark

Alicja Luszczak, et al., introduce the Query Watchdog:

The previous query would cause problems on many different systems, regardless of whether you’re using Databricks or another data warehousing tool. Luckily, as a user of Databricks, this customer has a feature available that can help solve this problem: the Query Watchdog.

Note: Query Watchdog is available on clusters created with version 2.1-db3 and greater.

A Query Watchdog is a simple process that checks whether or not a given query is creating too many output rows for the number of input rows at a task level. We can set a property to control this and in this example we will use a ratio of 1000 (which is the default).
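To make the ratio check concrete, here is a minimal Python sketch of the idea as described above. It is not Databricks' implementation: the names `TaskStats` and `exceeds_output_ratio` are made up for illustration, and treating a task with zero input rows as if it had one is my own assumption. On Databricks itself, the properties involved are, as far as I know, `spark.databricks.queryWatchdog.enabled` and `spark.databricks.queryWatchdog.outputRatioThreshold`; check the current documentation before relying on those names.

```python
from dataclasses import dataclass


@dataclass
class TaskStats:
    """Hypothetical per-task row counts (illustrative, not a Spark type)."""
    input_rows: int
    output_rows: int


def exceeds_output_ratio(task: TaskStats, threshold: int = 1000) -> bool:
    """Return True when a task emits more than `threshold` output rows
    per input row.

    Assumption: a task with zero input rows is treated as having one,
    so the comparison stays well defined.
    """
    effective_input = max(task.input_rows, 1)
    return task.output_rows > threshold * effective_input


# A skewed join that explodes 10 input rows into 50,000 output rows
# would be flagged; a task that roughly preserves row counts would not.
exploding = TaskStats(input_rows=10, output_rows=50_000)
well_behaved = TaskStats(input_rows=10, output_rows=12)

flagged = [exceeds_output_ratio(t) for t in (exploding, well_behaved)]
```

Because the check is per task, a single badly skewed partition is enough to trip it even when the query as a whole looks reasonable.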

It looks like this is an all-or-nothing process, but it is a very interesting start.
