Provenance In Distributed Systems

Jessica Kerr discusses methods for determining data lineage, particularly in distributed systems:

Can you take a piece of data in your system and say what version of code put it in there, based on what messages from other systems? and what information a human viewed before triggering an action?

Me neither.

Why is this acceptable? (because we’re used to it.)
We could make this possible. We could trace the provenance of data. And at the same time, mostly-solve one of the challenges of distributed systems.

This is an interesting essay; read the whole thing.

Related Posts

Finding Dependency Clusters

Michael J. Swart performs cluster analysis with tables: That ball of mush in the middle is hard to look at, but the smaller disconnected bits aren’t! Just like Ben, I want to work on those smaller pieces too! And just like the lonely tables we looked at last week, these small isolated components are also good […]

Read More

Big Data Often Isn’t

Arnon Rotem-gal-oz argues that “big data” is often a misnomer: I couldn’t find numbers from Google but others say that by 2017 Google processed over 20PB a day (not to mention answering 40K search queries/second) so Google is definitely in the big data game. The numbers go down fast after that, even for companies who are […]

Read More

Categories

September 2016
MTWTFSS
« Aug Oct »
 1234
567891011
12131415161718
19202122232425
2627282930