Landon Robinson continues the Spark Starter Guide:
Having is similar to filtering (filter(), where(), or where in a SQL clause), but the use cases differ slightly. While filtering allows you to apply conditions on your non-aggregated columns to limit the result set, Having allows you to apply conditions on aggregate functions / columns instead.
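The distinction can be sketched without a Spark cluster. The example below is not from the guide: it uses Python's built-in sqlite3 module as a stand-in SQL engine, and the sales table, its columns, and the thresholds are hypothetical. It runs one query that filters rows with WHERE before grouping and one that filters groups with HAVING after aggregation.

```python
import sqlite3

# Hypothetical sample data: one row per sale (illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [
        ("east", 50), ("east", 70),
        ("west", 30), ("west", 20),
        ("north", 200),
        ("south", 45), ("south", 10),
    ],
)

# WHERE: the condition is checked against each individual row
# BEFORE grouping, so small sales are dropped and the rest are summed.
where_rows = conn.execute(
    """
    SELECT region, SUM(amount) AS total
    FROM sales
    WHERE amount > 40
    GROUP BY region
    ORDER BY region
    """
).fetchall()

# HAVING: the condition is checked against each group's aggregate
# AFTER grouping, so whole regions are kept or dropped by their total.
having_rows = conn.execute(
    """
    SELECT region, SUM(amount) AS total
    FROM sales
    GROUP BY region
    HAVING SUM(amount) > 100
    ORDER BY region
    """
).fetchall()

print("WHERE :", where_rows)
print("HAVING:", having_rows)
```

With this data, the WHERE query still returns south (its single sale of 45 passes the row-level test), while the HAVING query drops south because its total of 55 falls below the threshold. In the PySpark DataFrame API, the usual counterpart to HAVING is a filter applied after the aggregation, along the lines of df.groupBy("region").agg(F.sum("amount").alias("total")).filter("total > 100"), assuming F is pyspark.sql.functions.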
Read on for examples in Spark SQL, both as a SQL query and Scala/Python function calls.