U-SQL

Ginger Grant has a quick intro on U-SQL:

In my previous series on Stream Analytics, I wrote some U-SQL. That U-SQL didn’t look much different than Ansi-SQL, which is sort of the point of porting the functionality to a different yet familiar language. Another application which heavily uses U-SQL is Azure Data Lake. Data Lake stores its data in HDInsight, but you don’t need to write hive to query the data, as U-SQL will do it. Like Hive, U-SQL can be used to create a schema on top of some data, and then query it.

For example, to write a query on this csv file stored in a Data Lake, I would need to create the data definition for the data, then I could easily write a statement to query it.

I’m interested in seeing how much adoption we see in this language.

Related Posts

Setting Up SparklyR In Azure

David Smith shows how you can spin up a Spark cluster in Azure and install SparklyR on top of it: The SparklyR package from RStudio provides a high-level interface to Spark from R. This means you can create R objects that point to data frames stored in the Spark cluster and apply some familiar R paradigms (like dplyr) […]

Read More

Comparing Data Lake Job Runs

Yanan Cai shows how to compare stats on different executions of a job: Troubleshooting issues in recurring job is a time-consuming task. It starts with searching through the Job Browser to find instances of a recurring job and identifying both baseline and anomalous performance. This is followed by multi-way comparisons between job instances to figure out what […]

Read More

Categories

August 2016
MTWTFSS
« Jul Sep »
1234567
891011121314
15161718192021
22232425262728
293031