JSON Parsing In U-SQL

Kevin Feasel

2016-10-07

JSON, U-SQL

Ginger Grant pulls out everybody’s favorite .NET JSON parser:

In USQL there are built-in extractors for parsing text, comma delimited or tab delimined files. Once again, parsing JSON becomes problematic. There is a solution built into USQL, write some C# code to extend it or use someone else’s C# code to extend USQL. Since I wanted to parse JSON, fortunately there are libraries available on github containing the information required to do it. Download the github package and open up the Microsoft.Analytics.Samples project in Visual Studio. When I did this the first time, there was a problem loading the Newtonsoft.Json reference, so I right clicked on the references and downloaded the missing parts again. Build the solution and check out the code in the directory …Examples\DataFormats\Microsoft.Analytics.Samples.Formats\bin\Debug\ . There will be two DLLs, Microsoft.Analytics.Samples.Formats.dll and Newtonsoft.Json.dll. These dlls then need to be registered in Data Lake Analytics and locally if you chose to run your USQL locally. As at some point the goal is to run from within Data Lake analytics, you will need to copy both of these dlls to the data lake. I created a folder for the dlls called Assemblies, and ran this command

It’s funny how often that library comes up…  Click through to see how to use it with U-SQL jobs.

Related Posts

Splitting Arrays with OPENJSON

Kevin Feasel

2019-05-08

JSON, T-SQL

Dave Mason continues a journey into parsing JSON with T-SQL: Starting with SQL Server 2016, Microsoft provided a STRING_SPLIT function. It is a table-valued function that splits a string into rows of substrings, based on a specified separator character. It’s been a welcome addition that we waited a long time for. It has one shortcoming, though: […]

Read More

Parsing JSON with T-SQL

Kevin Feasel

2019-05-06

JSON, T-SQL

Dave Mason has a primer on JSON parsing using T-SQL: Microsoft added support for JSON data beginning with SQL Server 2016. JSON is an open-standard file format consisting of attribute–value pairs and array data types. It is commonly used to transmit data objects for asynchronous browser–server communication. But it is also used for storing unstructured data […]

Read More

Categories

October 2016
MTWTFSS
« Sep Nov »
 12
3456789
10111213141516
17181920212223
24252627282930
31