With LLAP enabled, Spark reads from HDFS go directly through LLAP. Besides conferring all of the aforementioned benefits on Spark, LLAP is also a natural place to enforce fine grain security policies. The only other capability required is a centralized authorization system. This need is met by Apache Ranger. Apache Ranger provides centralized authorization and audit services for many components that run on Yarn or rely on data from HDFS. Ranger allows authoring of security policies for: – HDFS – Yarn – Hive (Spark with LLAP) – HBase – Kafka – Storm – Solr – Atlas – Knox Each of the above services integrate with Ranger via a plugin that pulls the latest security policies, caches them, and then applies them at run time.
Read on for more details.