Presto On HDInsight

Ashish Thapliyal shows how to install Presto on an HDInsight cluster:

What is Presto?

Presto is a distributed SQL query engine optimized for ad-hoc analysis at interactive speed. It supports standard ANSI SQL, including complex queries, aggregations, joins, and window functions. Presto is becoming popular SQL interactive query engine that has grabbed the attention and mind-share in Big data communities.

What are the key advantages of Presto?

1- It’s very fast – Presto was designed and written from the ground up for interactive analytics and approaches the speed of commercial data warehouses.

2- Presto can query data where it lives – Presto supports many data sources via the number of connectors that community has built. You can query HDFS , Hive, Azure Storage or data stored in SQL Server , My SQL , CosmosDB or Cassandra etc.

You can install Presto in one simple step with HDInsight Script Action feature

Read on for instructions and showing how to connect this to other Azure products like CosmosDB and Azure SQL Database.

Related Posts

Hortonworks Data Platform 3.0 Released

Saumitra Buragohain, et al, announce the newest version of the Hortonworks Data Platform: Highlighted Apache Hive features include: Workload management for LLAP:  You can assign resource pools within LLAP pool and allocate resources on a per user or per group basis. This enables support for large multi-tenant deployments. ACID v2 and ACID on by default:  We are […]

Read More

New Features In Public Preview On Azure SQL Database

Microsoft has a round of announcements for public previews on Azure SQL Database.  First up is Kevin Farlee announcing approximate count distinct: The new APPROX_COUNT_DISTINCT aggregate function returns the approximate number of unique non-null values in a group. This function is designed for use in big data scenarios and is optimized for the following conditions: Access of […]

Read More

Categories