Talking To Secure Hadoop Clusters

Mubashir Kazia shows how to connect to a secured Hadoop cluster using Active Directory:

The primary form of strong authentication used on a secure cluster is Kerberos. Kerberos supports credentials delegation where a server process to which a user has authenticated, can perform actions on behalf of the user. This involves the server process accessing databases or other web services as the authenticated user. Historically the form of delegation that was supported by Kerberos is now called “full delegation”. In this type of delegation, the Ticket Granting Ticket (TGT) of the user is made available to the server process and server can then authenticate to any service where the user has been granted authorization. Until recently most Kerberos Key Distribution Center(KDC)s other than Active Directory supported only this form of delegation. Also Java until Java 7 supported only this form of delegation. Starting with Java 8, Java now supports Kerberos constrained delegation (S4U2Proxy), where if the KDC supports it, it is possible to specify which particular services the server process can be delegated access to.

Hadoop within its security framework has implemented impersonation or proxy support that is independent of Kerberos delegation. With Hadoop impersonation support you can assign certain accounts proxy privileges where the proxy accounts can access Hadoop resources or run jobs on behalf of other users. We can restrict proxy privileges granted to a proxy account to act on behalf of only certain users who are members of certain groups and/or only for connections originating from certain hosts. However we can’t restrict the proxy privileges to only certain services within the cluster.

What we are discussing in this article is how to setup Kerberos constrained delegation and access a secure cluster. The example here involves Apache Tomcat, however you can easily extend this to other Java Application Servers.

This is a good article showing specific details on using Kerberos in applications connecting to Hadoop.

Related Posts

It’s All ETL (Or ELT) In The End

Robin Moffatt notes that ETL (and ELT) doesn’t go away in a streaming world: In the past we used ETL techniques purely within the data-warehousing and analytic space. But, if one considers why and what ETL is doing, it is actually a lot more applicable as a broader concept. Extract: Data is available from a source system Transform: We […]

Read More

Permissions Needed To View Permissions

Kenneth Fisher shows us what we can do to grant a low-privilege user the ability to view permissions for other users: Every now and again you’ll have a user that needs to be able to see what permissions other users have. Not change them, just look at them. In the cases I’ve seen it’s usually […]

Read More

Categories

August 2017
MTWTFSS
« Jul Sep »
 123456
78910111213
14151617181920
21222324252627
28293031