R On Athena

Kevin Feasel


Cloud, Hadoop, R

Gopal Wunnava shows how to run R scripts using Amazon Athena as a data source:

Next, you’ll practice interactively querying Athena from R for analytics and visualization. For this purpose, you’ll use GDELT, a publicly available dataset hosted on S3.

Create a table in Athena from R using the GDELT dataset. This step can also be performed from the AWS management console as illustrated in the blog post “Amazon Athena – Interactive SQL Queries for Data in Amazon S3.”

This is an interesting use case for Athena.

Related Posts

Logistic Regression In R

Steph Locke has a presentation on performing logistic regression using R: Logistic regressions are a great tool for predicting outcomes that are categorical. They use a transformation function based on probability to perform a linear regression. This makes them easy to interpret and implement in other systems. Logistic regressions can be used to perform a classification […]

Read More

Feature Improvements In Microsoft R Server 9.1

David Smith gives us a nice roundup of feature improvements in Microsoft R Server 9.1: Interoperability between Microsoft R Server and sparklyr. You can now use RStudio’s sparklyr package in tandem with Microsoft R Server in a single Spark session New machine learning models in Hadoop and Spark. The new machine learning functions introduced with Version 9.0 […]

Read More


March 2017
« Feb Apr »