Satish Bomma uses Apache NiFi to perform change data capture on a MySQL database:
The main things to configure is DBCPConnection Pool and Maximum-value Columns
Please choose this to be the date-time stamp column that could be a cumulative change-management column
This is the only limitation with this processor as it is not a true CDC and relies on one column. If the data is reloaded into the column with older data the data will not be replicated into HDFS or any other destination.
This processor does not rely on Transactional logs or redo logs like Attunity or Oracle Goldengate. For a complete solution for CDC please use Attunity or Oracle Goldengate solutions.
That last paragraph in the snippet is key: it’s not a true replacement for CDC-friendly products. It is, however, a good example for showing how to use NiFi to connect to a relational database and pump data out of it.