Beginning With Amazon Athena

Jen Underwood looks at the basics behind Amazon Athena:

Today early adopters of Amazon Athena are using it for big data analytics pipeline projects along with Kinesis streaming data and other Amazon data sources.

Athena is serverless parallel query pay-per-use service. There is no infrastructure to set up or manage. It scales automatically and can handle large datasets or complex distributed queries.

The easy way of thinking about Athena is that it’s ElasticMapReduce (a pay-as-you-go Hadoop cluster) without the ceremony of administering or spinning up the cluster.

Related Posts

Using Azure Blob Storage Archive Tier For Archival Data

Bob Pusateri shows us how to configure Azure Blob Storage Archive Tier: Two of the products I use extensively for this purpose are Amazon Glacier and, more recently, Microsoft Azure Blob Storage Archive Tier. As happy as I’ve been with Amazon Glacier since its introduction in 2012, I always hoped Microsoft would offer a similar service. My wish […]

Read More

Replicating Data In HDFS Between Clusters

Murali Ramasami and Niru Anisetti have an article showing how to use the Hortonworks Data Lifecycle Manager to set up replication between two Hadoop clusters: Data Lifecycle Manager (DLM) delivers on the promise of location-agnostic, secure replication by encapsulating and copying data seamlessly across physical private storage and public cloud environments. This empowers businesses to […]

Read More

Categories

January 2017
MTWTFSS
« Dec Feb »
 1
2345678
9101112131415
16171819202122
23242526272829
3031