Backup And Recovery With Hadoop

Tim Spann explains how to perform backup/recovery operations and disaster recovery using Hadoop:

You can mirror datasets with Falcon. Mirroring is a very useful option for enterprises and is well-documented. This is something that you may want to get validated by a third party. See the following resources:

Tim shows several recovery options, making it useful reading if you use Hadoop as a source system for anything (or if you can’t afford it to be down for a 2-3 day period as you recover data).

Related Posts

Auto ML With SQL Server 2019 Big Data Clusters

Marco Inchiosa has a model scenario for using Big Data Clusters to scale out a machine learning problem: H2O provides popular open source software for data science and machine learning on big data, including Apache SparkTM¬†integration. It provides two open source python AutoML classes: h2o.automl.H2OAutoML and pysparkling.ml.H2OAutoML. Both APIs use the same underlying algorithm implementations, […]

Read More

Erasure Coding In Hadoop

Guy Shilo explains erasure coding, a new feature in Hadoop 3: The benefits are, of course, space-saving, and for large files also improved performance (blocks striped across datanodes can be read in parallel, and less blocks are written because there is no x3 replication). The larger the file the more notable is the performance gain. […]

Read More

Categories

March 2017
MTWTFSS
« Feb Apr »
 12345
6789101112
13141516171819
20212223242526
2728293031