HDF 3.2 Updates

Dinesh Chandrasekhar walks us through some of the updates to Hortonworks Data Flow version 3.2:

Kerberos keytab isolation
Kerberos keytabs can now be isolated at a per principal level. This allows for users in a multi-tenant environment to safely be able to reference specific keytabs and principals. This ensures that just because a user has access to a HDFS keytab they will not have access to all of the HDFS principals. This provides a more granular control so that users are limited to only the principals they require.

Kafka 1.1.1 Support
In HDF 3.2, Kafka has been upgraded from 1.0.0 to 1.1.1. Key features and improvements have been added with respect to security and governance. In addition to these bug fixes, an important new feature was added to capture producer and topic metrics at partition level without instrumenting or configuring interceptors on the clients. This provides a non-invasive approach to capture important metrics for producers without refactoring/modifying your existing Kafka clients

Hive 3 support
Apache NiFi now supports Hive 3 running on HDP 3.0. This support ensures better performance for Hive streaming to HDP, Hive streaming to S3, and the ability to write directly to ORC from NiFi without first converting your datasets to Avro. Writing directly to ORC for better Hive query performance is accomplished by using the NiFi PutORC processor. With HDF 3.2, a few other processors related to HBase and HDFS have also been updated and enhanced.

Looks like there are some good updates to this version.

Related Posts

What’s New In Ambari 2.7

Paul Codding and Kat Petre share some of the new features in Ambari 2.7: With this release, we wanted to make Ambari more enjoyable to use every day, simplify finding and using our API, and unblock teams managing very large clusters.  Here is a preview of a few features we’re excited to share with you:Revamped […]

Read More

Working With Images In Spark 2.4

Tomas Nykodym and Weichen Xu give us an update on working with images in the most recent version of Apache Spark: An image data source addresses many of these problems by providing the standard representation you can code against and abstracts from the details of a particular image representation.Apache Spark 2.3 provided the ImageSchema.readImages API (see Microsoft’s post […]

Read More

Categories

August 2018
MTWTFSS
« Jul Sep »
 12345
6789101112
13141516171819
20212223242526
2728293031