Lior Gavish shows off a useful operator in Apache Airflow:
But what happens when Airflow testing doesn’t catch all of your bad data? What if “unknown unknown” data quality issues fall through the cracks and affect your Airflow jobs?
One helpful but underutilized solution is to leverage the Airflow ShortCircuitOperator to create data circuit breakers to prevent bad data from flowing across your data pipelines.
Data circuit breakers are powerful, but as with most data quality tactics, the nuances of how they are implemented are critical. Otherwise, you can make a bad problem worse.
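To make the idea concrete, here is a minimal sketch of what a circuit breaker built on ShortCircuitOperator might look like. It is my own illustration and not the code from Lior's post. It assumes Airflow 2.4 or later (for the `schedule` argument and these import paths). The task names, thresholds, and hard-coded metrics are all hypothetical. When the callable returns a falsy value, Airflow skips the downstream tasks, which is what "trips" the breaker.

```python
from datetime import datetime


def batch_is_healthy(row_count, null_fraction, min_rows=1_000, max_null_fraction=0.05):
    """Return True when the batch passes both checks, so the pipeline keeps flowing."""
    # Thresholds are illustrative assumptions, not recommendations.
    return row_count >= min_rows and null_fraction <= max_null_fraction


def check_latest_batch():
    # Hypothetical stand-in: a real check would query the warehouse or a
    # data quality tool for these metrics instead of hard-coding them.
    metrics = {"row_count": 12_500, "null_fraction": 0.01}
    return batch_is_healthy(**metrics)


def build_dag():
    # Airflow is imported lazily so the check functions above can be
    # unit-tested on a machine that does not have Airflow installed.
    from airflow import DAG
    from airflow.operators.empty import EmptyOperator
    from airflow.operators.python import ShortCircuitOperator

    with DAG(
        dag_id="circuit_breaker_sketch",  # hypothetical DAG name
        start_date=datetime(2024, 1, 1),
        schedule=None,
        catchup=False,
    ) as dag:
        load = EmptyOperator(task_id="load_raw_data")
        # A falsy return value from the callable skips everything downstream.
        breaker = ShortCircuitOperator(
            task_id="data_circuit_breaker",
            python_callable=check_latest_batch,
        )
        publish = EmptyOperator(task_id="publish_to_consumers")
        load >> breaker >> publish
    return dag
```

In an actual DAG file you would assign `dag = build_dag()` at module level so the scheduler can discover it. Keeping the health check in a plain function makes the thresholds easy to test on their own, which matters given the warning above that a badly tuned breaker can make things worse.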
Read on to learn more about the operator and how you can use it. The code block images are a bit fuzzy but still readable, and they might be a little clearer in the original post.