User-Defined Functions In Hive

Kevin Feasel

2016-06-17

Hadoop

Tim Spann talks about user-defined functions in Hive:

When you start using Hive you may miss some of the functions you are used to from Oracle, MySQL or elsewhere. Or you might just want a profanity filter. Whatever the case you can browse our list below for a large selection of UDF libraries. You can also use the pointers listed to write your own.

The Brickhouse Collection of UDFs from Klout includes functions for collapsing multiple rows into one, generating top K lists, a distributed cache, bloom counters, JSON functions, and HBase tools.

Coming from a SQL Server background, UDFs might be something you instinctively avoid (or at least that’s the case with me).  In practice, though, they’re a really good addition to the product.

Related Posts

A Simple Example With Spark And Kafka

Gary Dusbabek has a nice example showing how to build a simple application with Spark and Kafka: This is a hands-on tutorial that can be followed along by anyone with programming experience. If your programming skills are rusty, or you are technically minded but new to programming, we have done our best to make this […]

Read More

Talking To Secure Hadoop Clusters

Mubashir Kazia shows how to connect to a secured Hadoop cluster using Active Directory: The primary form of strong authentication used on a secure cluster is Kerberos. Kerberos supports credentials delegation where a server process to which a user has authenticated, can perform actions on behalf of the user. This involves the server process accessing […]

Read More

Categories

June 2016
MTWTFSS
« May Jul »
 12345
6789101112
13141516171819
20212223242526
27282930