Against Visual Programming Languages

Ian Hellstrom has a critique of visual programming languages for data engineers:

Anyone with a software development background who has ever dealt with visual ETL tools may have marvelled at the lack of proper version control and diff tools that go with it. Some tools come with their own built-in VCS, while others allow you to use any or no VCS at all. The difficulty lies in the fact that the visual representation is often stored as an XML (or JSON) file. So, if a box is moved by 1 pixel, the file is different. You could argue that it’s indeed different because the layout is different, but you could equally make the case that the logic has not changed. This argument is moot though: it is technically possible to ensure that the tool auto-aligns blocks and routes/colours arrows, very much like yEd does (via menu items). Some users may not be happy with the reduced control over the way the flow looks, but others may rejoice that version control has become usable.

ETL (and ORM) tools often auto-generate code that is not particularly tuned for the data source in question. I have encountered many odd nested loops where simple hash joins would have been more appropriate if only the predicates had been pushed down properly (and if only the tool had evaluated blocks lazily). Aggregations and timestamp-based filters are also often a cause for performance issues. Again, performance is technically solvable, so this may be a valid argument against visual tools in data engineering now but perhaps not tomorrow.

This is a good argument against VPLs, although there are a couple of good arguments for VPLs, including how it’s easier to see if the overall architecture of a flow looks correct.  In the end, I like the compromise that Biml offers Integration Services developers:  write code but visualize results.

Related Posts

Package References And ssisUnit

Bartosz Ratajczyk shows us a few methods for setting package references in ssisUnit: When you set the packages’ references in the ssisUnit tests you have four options for the source (StoragePath) of the package: Filesystem – references the package in the filesystem – either within a project or standalone MSDB – package stored in the msdb database Package store – […]

Read More

Bug When Importing Packages In BimlExpress 2018

Kevin Feasel

2018-06-27

Biml, Bugs

Andy Leonard reports a bug as well as a temporary workaround for BimlExpress 2018: I had no sooner published my blog post about the coolness of Biml 2018 when I encountered a bug trying to use one of the features I really like – converting SSIS packages to Biml using (FREE!) BimlExpress 2018. My first response was, “Durnit! […]

Read More

Categories

September 2016
MTWTFSS
« Aug Oct »
 1234
567891011
12131415161718
19202122232425
2627282930