Creating A Linked Server To Amazon Athena

Kevin Feasel

2017-12-18

Cloud

Maria Zakourdaev shows that you can create a linked server connection in SQL Server to query data using Amazon Athena:

I will show you today how you can use Management Studio or any stored procedure to query the data, stored in a csv file, located on S3 storage. I am using CSV file format as an example here, columnar PARQUET gives much better performance.

I am going to:

1. Put a simple CSV file on S3 storage

2. Create External table in Athena service over the data file bucket

3. Create linked server to Athena inside SQL Server

4. Use OPENQUERY to query the data.

Athena service is built on the top of Presto, distributed SQL engine and also uses Apache Hive to create, alter and drop tables. You can run ANSI SQL statements in the Athena query editor, launching it from the AWS web services UI. You can use complex joins, window functions and many other great SQL language features. Using Athena eliminates need for ETL because it projects your schema on the data files at the time of the query.

Standard linked server warnings apply, but sometimes you need to bridge a couple technologies.

Related Posts

Jupyter Notebooks In Azure

Steve Jones looks at using Jupyter Notebooks in Azure: There’s a new feature in Azure, and I stumbled on it when someone posted a link on Twitter. Apologies, I can’t remember who, but I did click on the Azure Notebooks link and was intrigued. I’ve gotten Jupyter notebooks running on my local laptop, but these are often […]

Read More

Uploading Data Sets To Azure ML From R

Leila Etaati continues her series on the Azure ML R package by showing how to upload a data set: There is a function in AzureML package name “workspace” that creates a reference to an AzureML Studio workspace by getting the authentication token and workspace id as below: 1 ws <– workspace( id , auth  ) to […]

Read More

Categories

December 2017
MTWTFSS
« Nov Jan »
 123
45678910
11121314151617
18192021222324
25262728293031