Matei Zaharia, et al., announce an interesting idea:
We are excited to announce data cleanrooms for the Lakehouse, allowing businesses to easily collaborate with their customers and partners on any cloud in a privacy-safe way. Participants in the data cleanrooms can share and join their existing data, and run complex workloads in any language – Python, R, SQL, Java, and Scala – on the data while maintaining data privacy.
With the demand for external data greater than ever, organizations are looking for ways to securely exchange their data and consume external data to foster data-driven innovations. Historically, organizations have leveraged data sharing solutions to share data with their partners and relied on mutual trust to preserve data privacy. But the organizations relinquish control over the data once it is shared and have little to no visibility into how data is consumed by their partners across various platforms. This exposes potential data misuse and data privacy breaches. With stringent data privacy regulations, it is imperative for organizations to have control and visibility into how their sensitive data is consumed. As a result, organizations need a secure, controlled and private way to collaborate on data, and this is where data cleanrooms come into the picture.
Read on to learn more about how this all works. It’s definitely a lot better than sending off a bunch of CSVs…