Over the past few years at Databricks, we’ve seen a new data management paradigm that emerged independently across many customers and use cases: the lakehouse. In this post we describe this new paradigm and its advantages over previous approaches.
The Data Lake’s Aristotelian counterpart is the Data Swamp. I’m working on a similar comp for the Data Lakehouse (Data Swampboat? Data Swamphouse is too easy), but in the meantime, that one person who goes and slaughters your application’s performance by butchering the data in your Data Lakehouse? That’s a Data Jason.