I recently came across an insightful research paper titled “Moving Fast With Broken Data” by Shreya Shankar, Labib Fawaz, Karl Gyllstrom, and Aditya G. Parameswaran from UC Berkeley and Meta. The paper addresses the significant issue of data corruption in machine learning (ML) pipelines, which often leads to decreased model accuracy. The authors present an automatic data validation system implemented at Meta that aims to solve this problem.
Sounds like I have some beach reading.
Ed. Note: He’s kidding, right?
Ed. 2 Note: About going to the beach maybe.
Ed. & Ed. 2 Note: HAHAHAHAHAH.
Yeah, I hired Statler and Waldorf as my editors. Worst, er, best decision of my life.