Press "Enter" to skip to content

Building Data Pipelines with Apache NiFi

The Hadoop in Real World team takes a look at Apache NiFi:

NiFi is an easy to use tool which prefers configuration over coding.

However, NiFi is not limited to data ingestion only. NiFi can also perform data provenance, data cleaning, schema evolution, data aggregation, transformation, scheduling jobs and many others. We will discuss these in more detail in some other blog very soon with a real world data flow pipeline. 

Hence, we can say NiFi is a highly automated framework used for gathering, transporting, maintaining and aggregating data of various types from various sources to destination in a data flow pipeline.

Click through for an example with instructions. The feeling is pretty close to Informatica or SQL Server Integration Services, so if you’re an old hand at one of those, you’ll get into this pretty easily.