I am convinced they got tired of me saying, “Well, you’ll need SSIS if you want to read Excel…” Probably not, though. Either way… score! :{>
Read on to see how easy it is to select Excel as a file format for your Blob Storage data.
Mark Kromer meets a challenge:
Today, we don’t have an option in data flows in ADF to include headers when your transformations result in zero rows. But you can build the logic to handle this scenario. So, until we add a checkbox feature to include headers, you can use this technique below to achieve this.
Click through for the explanation, as well as a completed version you can use as your own.
Paul Andrew has an update to the Azure Data Factory Procedural Framework:
The primary goal of this release was to implement email alerting within the existing processing framework, using existing metadata-driven practices to deliver this in an easy-to-control, flexible and granular way. That said, the following statements have been met in terms of alerting capabilities and design.
Read on for the full change list.
In part three of this series I looked at functional tests for ADF pipelines: verifying, in isolation, that pipelines are “doing things right”. In this post I’ll be testing isolated pipelines to check that they’re “doing the right things” – this is one description of a unit test. In a general-purpose programming language, unit tests might be used to verify that an individual line of code is executed, or that it has a particular effect. In Azure Data Factory, the smallest unit of development – a “line of code” – is a pipeline activity. I will be writing tests to verify that specific activities are executed (or not) and to inspect their results.
There’s a fair bit involved in this sort of test.
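To give a feel for the shape of such a check, here is a minimal sketch using the Azure Data Factory .NET SDK (Microsoft.Azure.Management.DataFactory) to confirm that a named activity succeeded within a given pipeline run. This is my own illustration rather than necessarily the approach the series takes, and the resource names are placeholders:

using System;
using System.Linq;
using Microsoft.Azure.Management.DataFactory;
using Microsoft.Azure.Management.DataFactory.Models;

public static class PipelineRunChecks
{
    // Returns true if the named activity completed successfully within the given pipeline run.
    // client: an already-authenticated DataFactoryManagementClient (auth setup omitted for brevity).
    public static bool ActivitySucceeded(
        DataFactoryManagementClient client,
        string resourceGroup,
        string factoryName,
        string pipelineRunId,
        string activityName)
    {
        // Look at activity runs from the last day; adjust the window to suit the test.
        var filter = new RunFilterParameters(DateTime.UtcNow.AddDays(-1), DateTime.UtcNow);
        var activityRuns = client.ActivityRuns.QueryByPipelineRun(
            resourceGroup, factoryName, pipelineRunId, filter);

        return activityRuns.Value.Any(a =>
            a.ActivityName == activityName && a.Status == "Succeeded");
    }
}

A unit test then boils down to triggering the pipeline, waiting for the run to finish, and asserting on checks like this one.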
Adrian Hills walks us through the SqlBulkCopy class:
Ever been in a situation where rumblings of “process X is too slow” suddenly build into a super-high priority ball of urgency when that next step up in data volume hits? Yeah, that can be fun. No, really, it can be fun because we have strategies to sort this stuff out, right?
In this blog post, I’m going to talk about one particular piece of functionality—SqlBulkCopy—that can help you with bulk data loading. If I had to single out my favorite .NET class, SqlBulkCopy would be at the top of the list. My goal is to introduce you to this class so that maybe it can become a part of your tool belt, too.
Click through to see how it works. If you’re familiar with SSIS, you’re already familiar with the concept if not the specifics.
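For a quick taste of the API, here is a minimal sketch; the destination table name and the shape of the DataTable are invented for illustration, and the linked post covers the options in far more depth:

using System.Data;
using System.Data.SqlClient; // Microsoft.Data.SqlClient also ships a SqlBulkCopy

public static class BulkLoader
{
    // Streams the rows of an in-memory DataTable into a SQL Server table.
    public static void Load(DataTable rows, string connectionString)
    {
        using (var bulkCopy = new SqlBulkCopy(connectionString))
        {
            bulkCopy.DestinationTableName = "dbo.StagingOrders"; // hypothetical target table
            bulkCopy.BatchSize = 5000;        // send rows in batches rather than all at once
            bulkCopy.BulkCopyTimeout = 60;    // seconds
            bulkCopy.WriteToServer(rows);
        }
    }
}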
Ed Elliott shares several approaches for executing Databricks code from Azure Data Factory:
.Net
No post I write about Spark is going to miss off .Net. If you are writing C# or F#, then you can’t use a notebook anyway. If you do this then you will also include the .Net driver JAR file (weird statement to make) and call that.
Read on for all of the techniques.
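To put the .NET point in context: with the .NET for Apache Spark bindings (Microsoft.Spark), the C# you ship to Databricks looks something like the sketch below. The file path and column name are invented, and this is only meant to show the flavor, not Ed's exact setup:

using Microsoft.Spark.Sql;

public class SampleSparkJob
{
    public static void Main(string[] args)
    {
        // On Databricks this runs via spark-submit alongside the microsoft-spark
        // worker JAR mentioned in the quote above.
        SparkSession spark = SparkSession.Builder()
            .AppName("adf-databricks-dotnet-demo")
            .GetOrCreate();

        DataFrame df = spark.Read()
            .Option("header", "true")
            .Csv("/mnt/raw/sales.csv"); // hypothetical mount path

        df.GroupBy("Region").Count().Show();

        spark.Stop();
    }
}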
Arun Sirpal takes us through the process of moving data from Databricks into Azure SQL Database:
Recently I got to a stage where I leveraged Databricks to the best of my ability to join a couple of CSV files together, play around with some aggregations, and then output it back to a different mount point (based on Azure Storage) as a parquet file. I decided that I actually wanted to move this data into Azure SQL DB, which you may want to do one day.
This isn’t just dropping files into Blob Storage and picking them up, but rather a direct integration.
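Whatever connector you use, the write itself boils down to a DataFrame write with a SQL endpoint as the target. Purely as a rough sketch (shown here with the generic JDBC format and .NET for Apache Spark; the server, database, table and credentials are placeholders, and Arun's post has the real details and may well use a different connector):

using Microsoft.Spark.Sql;

public static class SqlDbWriter
{
    // Writes a DataFrame out to an Azure SQL Database table over JDBC.
    public static void WriteToAzureSql(DataFrame df)
    {
        df.Write()
          .Format("jdbc")
          .Option("url", "jdbc:sqlserver://myserver.database.windows.net:1433;database=mydb") // placeholder
          .Option("dbtable", "dbo.ParquetLanding")   // hypothetical target table
          .Option("user", "sqladmin")                // placeholder credentials
          .Option("password", "<secret>")
          .Mode("overwrite")
          .Save();
    }
}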
Paul Andrew explains how to become a factory mogul:
The obvious and easy reason for having multiple Data Factories could be that you simply want to separate your business processes. Maybe they all have separate data delivery requirements and it just makes management of data flows easier to handle. For example:
– Sales
– Finance
– HR
They could have different data delivery deadlines, they process on different schedules and don’t share any underlying connections.
You may also have multiple projects underway that mean you want to keep teams isolated.
But that’s not the only reason, so click through to learn several other reasons why you might have multiple Azure Data Factory instances running.
Hadi Fadlallah walks us through the HDFS file source and destination components:
To test these components, we will create an SSIS package and add three connection managers:
1. Hadoop Connection Manager: to connect with the Hadoop cluster (check the previous article)
2. OLE DB Connection Manager: to connect to SQL Server instance where AdventureWorks2017 database is stored
3. Flat File Connection Manager: we will use it to export data from the HDFS Source.
I wonder if they ever fixed the 4K screen resolution problem (kind of tells you how often I use SSIS anymore…). That was one of the things which made these components unusable for me on any modern screen.
Rayis Imayev shows how we can find values in a group using Azure Data Factory:
However, only this use case for the OR function with two conditions could be possible: or(equals(variables(‘var1’), ‘A’), equals(variables(‘var1’), ‘B’)) – a limit of two conditions
But what if we had the ability to check whether a particular variable/parameter/other ADF object value belongs to a range of values (an array of values), similar to what we can do with the IN operator in SQL? That would definitely solve our problem and remove the limitation on the number of logical conditions to check.
Click through for the answer.
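As a hint at where this lands: the ADF expression language has a contains() function that accepts an array as its first argument, so a check along these lines (my sketch, with placeholder values) behaves like a SQL IN:

contains(createArray('A', 'B', 'C'), variables('var1'))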