Ben Lorica, et al., have a new metaphor to try out:
Over the past few years at Databricks, we’ve seen a new data management paradigm that emerged independently across many customers and use cases: the lakehouse. In this post we describe this new paradigm and its advantages over previous approaches.
The Data Lake’s Aristotelian counterpart is the Data Swamp. I’m working on a similar comp for the Data Lakehouse (Data Swampboat? Data Swamphouse is too easy), but in the meantime, that one person who goes and slaughters your application’s performance by butchering the data in your Data Lakehouse? That’s a Data Jason.
Lol, good post. Thanks for sharing. I am contemplating using a system to clean up my invoice data. Have you used any of this software before? If so, what's your experience/thoughts? I am looking at several options (https://www.bisok.com/data-science-workbench/data-cleansing-tools/ is one) right now but am certainly not an expert. Thanks.