Handling Rogue Queries In Spark

Alicja Luszczak, et al, introduce the Query Watchdog:

The previous query would cause problems on many different systems, regardless of whether you’re using Databricks or another data warehousing tool. Luckily, as an user of Databricks, this customer has a feature available that can help solve this problem called the Query Watchdog.

Note: Query Watchdog is available on clusters created with version 2.1-db3 and greater.

A Query Watchdog is a simple process that checks whether or not a given query is creating too many output rows for the number of input rows at a task level. We can set a property to control this and in this example we will use a ratio of 1000 (which is the default).

It looks like this is an all-or-nothing process, but a very interesting start.

Related Posts

Testing TDE Performance

Eduardo Pivaral tests the performance of a database with Transparent Data Encryption versus that same database without encryption: Transparent data encryption (TDE) helps you to secure your data at rest, this means the data files and related backups are encrypted, securing your data in case your media is stolen. This technology works by implementing real-time I/O […]

Read More

External Memory Pressure In SQL Server 2019 On Linux

Anthony Nocentino walks us through memory pressure in SQL Server on Linux: Now in SQL Server 2017 with that 7GB program running would cause Linux to need to make room in physical memory for this process. Linux does this by swapping least recently used pages from memory out to disk. So under external memory pressure, let’s look […]

Read More

Categories

April 2017
MTWTFSS
« Mar May »
 12
3456789
10111213141516
17181920212223
24252627282930