Apache Falcon is a framework for managing data life cycle in Hadoop clusters. It establishes relationship between various data and processing elements on a Hadoop environment, and also provides feed management services such as feed retention, replications across clusters, archival etc.
Let us first discuss how to setup Apache Falcon. Run the below given command to download git repository of Falcon:
Command: git clone https://git-wip-us.apache.org/repos/asf/falcon.git falcon
Falcon comes as part of the Hortonworks Data Platform; Cloudera has its own alternative.