Today I will show you how to use Management Studio or any stored procedure to query data stored in a CSV file on S3 storage. I am using the CSV file format as an example here, although columnar Parquet gives much better performance.
I am going to:
1. Put a simple CSV file on S3 storage
2. Create an external table in the Athena service over the bucket that holds the data file
3. Create a linked server to Athena inside SQL Server
4. Use OPENQUERY to query the data.
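The steps above can be sketched roughly as follows. This is only an outline under stated assumptions: the bucket `s3://my-example-bucket/employees/`, the database and table `sampledb.employees`, its columns, the ODBC DSN `AthenaDSN`, and the linked server name `ATHENA` are all hypothetical placeholders. It also assumes an Athena ODBC driver is already installed on the SQL Server machine and that the `sampledb` database exists in Athena.

```sql
-- Step 2 (run in the Athena query editor): external table over the CSV files.
-- Bucket, database, table and column names are hypothetical placeholders.
CREATE EXTERNAL TABLE IF NOT EXISTS sampledb.employees (
    id        INT,
    name      STRING,
    hire_date STRING
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION 's3://my-example-bucket/employees/'
TBLPROPERTIES ('skip.header.line.count' = '1');

-- Step 3 (run in SQL Server): linked server through the OLE DB provider for ODBC.
-- Assumes a system DSN named AthenaDSN already points at Athena.
EXEC master.dbo.sp_addlinkedserver
    @server     = N'ATHENA',
    @srvproduct = N'',
    @provider   = N'MSDASQL',
    @datasrc    = N'AthenaDSN';

-- Step 4 (run in SQL Server): the inner query text is sent to Athena as-is.
SELECT *
FROM OPENQUERY(ATHENA, 'SELECT id, name, hire_date FROM sampledb.employees LIMIT 10');
```

Note that the first statement is Athena DDL and the other two are T-SQL, so they run in different places. The query text inside `OPENQUERY` must be written in Athena's SQL dialect, not T-SQL, which is why it uses `LIMIT` rather than `TOP`.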
The Athena service is built on top of Presto, a distributed SQL engine, and also uses Apache Hive to create, alter and drop tables. You can run ANSI SQL statements in the Athena query editor, which you launch from the AWS console. You can use complex joins, window functions and many other great SQL language features. Using Athena eliminates the need for ETL because it projects your schema onto the data files at query time.
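As a small illustration of the kind of statement the Athena query editor accepts, here is a window-function query. It assumes a hypothetical table `sampledb.employees` with columns `id`, `name` and `hire_date`, where `hire_date` holds ISO-formatted dates stored as strings; none of these names come from a real dataset.

```sql
-- Run in the Athena query editor.
-- sampledb.employees and its columns are hypothetical placeholders.
SELECT id,
       name,
       hire_date,
       -- Number the rows from the earliest hire date to the latest.
       row_number() OVER (ORDER BY hire_date) AS hire_order
FROM sampledb.employees;
```

Because the schema is applied when the query runs, the CSV files behind the table do not have to be loaded or transformed first.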
The standard linked server warnings apply, but sometimes you need to bridge a couple of technologies.