Beginning With Amazon Athena

Jen Underwood looks at the basics behind Amazon Athena:

Today early adopters of Amazon Athena are using it for big data analytics pipeline projects along with Kinesis streaming data and other Amazon data sources.

Athena is serverless parallel query pay-per-use service. There is no infrastructure to set up or manage. It scales automatically and can handle large datasets or complex distributed queries.

The easy way of thinking about Athena is that it’s ElasticMapReduce (a pay-as-you-go Hadoop cluster) without the ceremony of administering or spinning up the cluster.

Related Posts

Corrupting Managed Instances

Brent Ozar has found a bug with Azure SQL Database Managed Instances: Corruption happens. It’s just a fact of life – storage is gonna fail. Microsoft’s SLAs for storage only give you 3-4 9’s, and there’s nothing in there about never losing your data. Nothing against Azure, either – I’ve lost entire VMs in AWS due to […]

Read More

Data Lake Permissions

Melissa Coates has started a multi-part series on Azure Data Lake permissions.  She’s put up the first three parts already.  Part 1 covers the types of permissions available as well as some official documentation: (1) RBAC permissions to the ADLS account itself, for the purpose of managing the resource. RBAC = Role-based access control. RBAC are […]

Read More


January 2017
« Dec Feb »