Image Processing In U-SQL

Kevin Feasel



Rukmani Gopalan and Apostolos Lerios show how to perform image processing using U-SQL:

We have published C# libraries that supply UDOs and UDFs for processing images with U-SQL in our GitHub site. In this section, we introduce these UDOs and UDFs and, in the next section, we use them within a U-SQL walkthrough to operate on images.

The basic flow behind processing images in U-SQL has three stages:

  1. Use the custom UDO extractor ImageExtractor to read a (JPEG or non-JPEG) image file and return the image data as a byte[] column value which contains the same exact image as the file in an (always) JPEG representation. Please note that there is a current limitation in U-SQL that a row cannot exceed a size of 4 MB, so you will run into issues if your image size is greater than 4 MB.

  2. Use the image processing UDFs to manipulate this byte[] (the UDFs support JPEG and non-JPEG representations within this byte[] despite the previous step always producing a JPEG representation). For example, one UDF extracts metadata from an image to produce textual or numeric data. More interesting UDFs derive an output image from an input image; that output represents the visually transformed input (e.g. rotated or scaled/resized), also stored as a byte[] containing an (always) JPEG representation of the output.

  3. Use the custom UDO outputter ImageOutputter to write each byte[] to a JPEG image file so that we can view the output images of the aforementioned UDFs.
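
To make the three stages concrete, here is a minimal U-SQL sketch of an extract, transform, output script. Only ImageExtractor and ImageOutputter are named in the excerpt above. The assembly name Images, the Images namespace, the UDF ImageOps.scaleImageTo with its arguments, and both file paths are assumptions for illustration, so check the published libraries for the real names and signatures.

```usql
// Sketch only: the assembly name, namespace, UDF name and arguments, and all
// paths below are assumptions, not the confirmed API of the published libraries.
REFERENCE ASSEMBLY Images;

// Stage 1: read one image file into a single byte[] column (JPEG representation).
// The source file must fit within the 4 MB row limit.
@image_data =
    EXTRACT image_data byte[]
    FROM @"/Samples/Data/Images/input.jpg"
    USING new Images.ImageExtractor();

// Stage 2: apply a UDF to the byte[]; here a hypothetical resize to 150x150.
@scaled_image =
    SELECT Images.ImageOps.scaleImageTo(image_data, 150, 150) AS thumbnail_image
    FROM @image_data;

// Stage 3: write the resulting byte[] back out as a JPEG file.
OUTPUT @scaled_image
TO @"/output/images/input_thumbnail.jpg"
USING new Images.ImageOutputter();
```

Since every stage passes the image as an ordinary byte[] column, the middle SELECT can be swapped for any other UDF call, such as a metadata extraction that returns text, without touching the extractor or outputter.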

The major value proposition to me for U-SQL is “doing stuff SQL can’t do very well.”  This is one of those cases.

