Lambda Architecture

Sebastiao Correia discusses Lambda architecture:

The batch layer stores all the data with no constraint on the schema. The schema-on-read is built in the batch views in the serving layer. Creating schema-on-read views requires algorithms to parse the data from the batch layer and convert them in a readable way. This allows input data to freely evolve as there is no constraint on their structure. But then, the algorithm that builds the view is responsible to manage the structural change in order to still deliver the same view as expected. 

This shows a coupling between the data and the algorithms used for serving the data. Focusing on data quality is therefore not enough and we may ask the question of the algorithm quality. As the system lives and evolves, the algorithms may become more and more complex. These algorithms must not be regarded as black boxes, but a clear understanding of what they are doing is important if we want to have a good data governance. Moreover, during the batch view creation, data quality transformations could be done so as to provide data of better quality to the consumer of the views.

Lambda is an interesting architectural concept, as it tries to solve the age-old “fast or accurate?” problem with “both.”  Get your fast estimates streamed through a speed layer, but your accurate, slow calculations handled through the serving layer.  Definitely check out this article.

Related Posts

Microservices With Kafka Streams

Ben Stopford walks us through a microservices architecture built on top of Kafka: So we can use the Kafka Streams API to piece together complex business systems as a collection of asynchronously executing, event-driven services. The differentiator here is the API itself, which is far richer than, say, the Kafka Producer or Consumer. It makes […]

Read More

Thinking About Slowly Degrading Page Performance

Ritesh Maheshwari talks about how LinkedIn deals with performance regressions: Looking at the chart above, where the dotted red line is a reference point to show where we started the year, notice how site speed improvements tend to be significant and noticeable, as they are optimization-driven. Degradations, however, can generally be of any “amount,” as […]

Read More

Categories

June 2016
MTWTFSS
« May Jul »
 12345
6789101112
13141516171819
20212223242526
27282930