Hortonworks Data Analytics Studio

Kevin Feasel

2018-09-14

Hadoop

Will Xu and Syed Mahmood announce Hortonworks Data Analytics Studio:

DAS leverages open-source technologies such as Apache Hive to share and extend the value of a modern data architecture in heterogeneous environments. It helps infrastructure administrators manage and optimize the performance of their Hive workloads by delivering visibility into query patterns and storage hotspots. DAS improves performance by uncovering inhibitors to query speed as well as providing recommendations to improve its efficiency.

In the past, Hive view did not provide full auto-complete capability during authoring time. We’ve addressed this shortcoming in DAS. This is not a trivial task especially on large databases, however through a number of caching optimizations we were able to make it work smoothly even with thousands of tables.

This product feels more like Management Studio or SQL Operations Studio than prior Hive UIs.  That’s definitely a good thing.

Related Posts

When Not to Use Spark

Ramandeep Kaur gives us several cases when it makes sense not to use Apache Spark: There can be use cases where Spark would be the inevitable choice. Spark considered being an excellent tool for use cases like ETL of a large amount of a dataset, analyzing a large set of data files, Machine learning, and […]

Read More

Hyperparameter Tuning with MLflow

Joseph Bradley shows how you can perform hyperparameter tuning of an MLlib model with MLflow: Apache Spark MLlib users often tune hyperparameters using MLlib’s built-in tools CrossValidator and TrainValidationSplit.  These use grid search to try out a user-specified set of hyperparameter values; see the Spark docs on tuning for more info. Databricks Runtime 5.3 and 5.3 ML and above support […]

Read More

Categories

September 2018
MTWTFSS
« Aug Oct »
 12
3456789
10111213141516
17181920212223
24252627282930