Connecting Hadoop To LDAP

Giovanni Lanzani walks through some of the difficulties of getting LDAP working with Hadoop:

This section could probably could have much less workarounds if I’d knew more about LDAP.

But I’m a data scientist at heart and I want to get things done.

If you ever dealt with Hadoop, you know that there are a bunch of non-interactive users, i.e. users who are not supposed to login, such as hdfs, spark, hadoop, etc. These users are important to have. However the groups with the same name are also important to have. For example when using airflow and launching a spark job, the log folders will be created under the airflow user, in the spark group.

LDAP, however, doesn’t allow you, to my knowledge, to have overlapping user/groups, as Unix does.

The way I solved it was to create, in LDAP, the spark_user (or hdfs_user or …) to work around this limitation.

Also, Giovanni apparently lives in an interesting neighborhood.

Related Posts

Kafka Streams Basics

Anuj Saxena walks through Kafka Streams and provides a quick example: The features provided by Kafka Streams: Highly scalable, elastic, distributed, and fault-tolerant application. Stateful and stateless processing. Event-time processing with windowing, joins, and aggregations. We can use the already-defined most common transformation operation using Kafka Streams DSL or the lower-level processor API, which allow us […]

Read More

R For Apache Impala

Ian Cook describes implyr, an R interface for Apache Impala: dplyr provides a grammar of data manipulation, consisting of set of verbs (including mutate(), select(), filter(), summarise(), and arrange()) that can be used together to perform common data manipulation tasks. The implyr package helps dplyr translate this grammar into Impala-compatible SQL commands. This gives R users access to Impala’s scale and speed […]

Read More

Leave a Reply

Your email address will not be published. Required fields are marked *

Categories

July 2017
MTWTFSS
« Jun  
 12
3456789
10111213141516
17181920212223
24252627282930
31