Lipwig

Kevin Feasel

2016-06-07

Hadoop

Peter Coates shows how to make Hive EXPLAIN plans a lot prettier:

As you probably know, if you prepend the word EXPLAIN to your SQL query and then run it, Hive prints out a text description of the query plan. This lets you explore the effects such variations as code changes, the use of analyze, turning on/off the cost-based optimizer (CBO), and so on. It’s an essential tool for optimizing Hive.

The output of EXPLAIN is far from pretty, but fortunately, a simple pipeline of Linux commands can give you a slick graphical rendition like the one below.

I’m going to have to keep this in mind.

Related Posts

Getting Started With Zeppelin

Sangeeta Gulia shows us how to get started building notebooks with Apache Zeppelin on top of Spark: There are 3 interpreter modes available in Zeppelin. 1) Shared Mode In Shared mode, a SparkContext and a Scala REPL is being shared among all interpreters in the group. So every Note will be sharing single SparkContext and single […]

Read More

How Per-Second AWS Billing Helps With Data Processing

Prakash Chockalingam explains how AWS per-second billing can make resource allocation easier: Because of the hourly increments in billing, users spend a lot of time playing a giant game of Tetris with their big data workloads — figuring out how to pack jobs to use every minute of the compute hour. Examples: If a job […]

Read More

Categories

June 2016
MTWTFSS
« May Jul »
 12345
6789101112
13141516171819
20212223242526
27282930