This view of microservices shares much in common with object-oriented programming: encapsulated data access and mutable state change are both achieved via synchronous calls, the web of such calls among services forming a graph of dependencies. Programmers can and should enjoy a lively debate about OO’s merits and drawbacks for organizing code within a single memory and process space. However, when the object-oriented analogy is extended to distributed systems, many problems arise: latency which grows with the depth of the dependency graph, temporal liveness coupling, cascading failures, complex and inconsistent read-time orchestration, data storage proliferation and fragmentation, and extreme difficulty in reasoning about the state of the system at any point in time.
Luckily, another programming style analogy better fits the distributed case: functional programming. Functional programming describes behavior not in terms of in-place mutation of objects, but in terms of the immutable input and output values of pure functions. Such functions may be organized to create a dataflow graph such that when the computation pipeline receives a new input value, all downstream intermediate and final values are reactively computed. The introduction of such input values into this reactive dataflow pipeline forms a logical clock that we can use to reason consistently about the state of the system as of a particular input event, especially if the sequence of input, intermediate, and output values is stored on a durable, immutable log.
It’s an interesting analogy.
A major drawback of modulo hashing is that the size of the cache pool needs to be stable over time. Changing the size of the cache pool will cause most cache keys to hash to a new server. Even though the values are still in the cache, if the key is distributed to a different server, the lookup will be a miss. That makes changing the size of the cache pool—to make it larger or for maintenance—an expensive and inefficient operation, as performance will suffer under tons of spurious cache misses.
For instance, if you have a pool of 4 hosts, a key that hashes to
500will be stored on pool member
500 % 4 == 0, while a key that hashes to
1299will be stored on pool member
1299 % 4 == 3. If you grow your cache by adding a fifth host, the cache pool calculated for each key may change. The key that hashed to
500will still be found on pool member
500 % 5 == 0, but the key that hashed to
1299be on pool member
1299 % 5 == 4. Until the new pool member is warmed up, your cache hit rate will suffer, as the cache data will suddenly be on the ‘wrong’ host. In some cases, pool changes can cause more than half of your cached data to be assigned to a different host, slashing the efficiency of the cache temporarily. In the case of going from 4 to 5 hosts, only 20% of cache keys will be on the same host as before!
It’s interesting reading.
So we can use the Kafka Streams API to piece together complex business systems as a collection of asynchronously executing, event-driven services. The differentiator here is the API itself, which is far richer than, say, the Kafka Producer or Consumer. It makes code more readable, provides reusable implementations of common patterns like joins, aggregates, and filters and wraps the whole ecosystem with a transparent level of correctness.
Systems built in this way, in the real world, come in a variety of guises. They can be fine grained and fast executing, completing in the context of an HTTP request, or complex and long-running, manipulating the stream of events that map a whole company’s business flow. This post focusses on the former, building up a real-world example of a simple order management system that executes within the context of a HTTP request, and is entirely built with Kafka Streams. Each service is a small function, with well-defined inputs and outputs. As we build this ecosystem up, we will encounter problems such as blending streams and tables, reading our own writes, and managing consistency in a distributed and autonomous environment.
This post stays high-level and covers a lot of ground. I’m wishy-washy on the idea of microservices, but if you are going to do them, it’s better to do them right.
Looking at the chart above, where the dotted red line is a reference point to show where we started the year, notice how site speed improvements tend to be significant and noticeable, as they are optimization-driven. Degradations, however, can generally be of any “amount,” as they happen for various reasons. LinkedIn’s page-serving pipeline has many moving parts. We deploy code multiple times per day, operate a micro-service architecture with hundreds of services, and infrastructure upgrades are frequent. A slowdown in any of these components can cause degradations.
While large degradations can be caught using A/B testing, canary analysis, or anomaly detection, small ones tend to leak to production. Thus, performance of a page has a tendency to always degrade over time.
This led to having the centralized Performance Team focus on identifying these leaks, called “site speed regressions,” and to craft tools and processes to fix them.
It’s an interesting principle. I could see this principle work for tracking database performance degradation as well.
Once you have the adjacency matrix, you perform some straightforward matrix calculations to calculate a vector of Hub scores and a vector of Authority scores as follows:
- Sum across the columns and normalize, this becomes your Hub vector
- Multiply the Hub vector element-wise across the adjacency matrix
- Sum down the rows and normalize, this becomes your Authority vector
- Multiply the Authority vector element-wise down the the adjacency matrix
An important thing to note is that the algorithm is iterative: you perform the steps above until eventually you reach convergence—that is, the vectors stop changing—and you’re done. For our purposes, we just pick a set number of iterations, execute them, and then accept the results from that point. We’re mostly interested in the top entries, and those tend to stabilize pretty quickly.
This is an architectural-level post, so there’s no code but there is a useful discussion of the algorithm.
Overdelivery occurs when free ads are shown for out-of-budget advertisers. This reduces opportunities for advertisers with available budget to have their products and services discovered by potential customers.
Overdelivery is a difficult problem to solve for two reason:
Real-time spend data: Information about ad impressions needs to be fed back into the system within seconds in order to shut down out-of-budget campaigns.
Predictive spend: Fast, historical spend data isn’t enough. The system needs to be able to predict spend that might occur in the future and slow down campaigns close to reaching their budget. That’s because an inserted ad could remain available to be acted on by a user. This makes the spend information difficult to accurately measure in a short timeframe. Such a natural delay is inevitable, and the only thing we can be sure of is the ad insertion event.
This is a very interesting architectural overview.
The essence of this architecture is that it uses Kafka as an intermediary between the various data sources from which feature data is collected, the model building environment where the model is fit, and the production application that serves predictions.
Feature data is pulled into Kafka from the various apps and databases that host it. This data is used to build models. The environment for this will vary based on the skills and preferred toolset of the team. The model building could be a data warehouse, a big data environment like Spark or Hadoop, or a simple server running python scripts. The model can be published where the production app that gets the same model parameters can apply it to incoming examples (perhaps using Kafka Streams to help index the feature data for easy usage on demand). The production app can either receive data from Kafka as a pipeline or even be a Kafka Streams application itself.
This is approximately 80% of my interests wrapped up in one post, so of course I’m going to read it…
In SQL Server, scaling out reads (i.e. using Active secondary replicas via AlwaysOn Availability Groups) is a lot easier than scaling out writes. So what are your options when you have a tremendous amount of writes that scaling up will not handle, no matter how big your server is? There are a number of options that allow you to write to many servers (instead of writing to one master server) that I’ll call distributed writes. Here are some ideas:
Peer-to-Peer transactional replication (or Multi-master replication) with SQL Server. See Peer-to-Peer – Transactional Replication
Merge replication in SQL Server. See Merge Replication
Create a messaging and queuing application in SQL Server Service Broker where all writes are placed on the queue and sent to different servers
Read on for more options and some additional thoughts around Cosmos DB. My first inclination would be to put Kafka in front of a distributed write system, but that’s my bias.
The basic idea of a trigram search is quite simple:
- Persist three-character substrings (trigrams) of the target data.
- Split the search term(s) into trigrams.
- Match search trigrams against the stored trigrams (equality search)
- Intersect the qualified rows to find strings that match all trigrams
- Apply the original search filter to the much-reduced intersection
We will work through an example to see exactly how this all works, and what the trade-offs are.
A must-read. N-grams in SQL Server is an example of a non-obvious data architecture which performs much better than the obvious alternative, at least when the conditions are right.
These are all sources of what we call published content. This is content that has been written, edited, and that is considered ready for public consumption.
On the other side we have a wide range of services and applications that need access to this published content — there are search engines, personalization services, feed generators, as well as all the different front-end applications, like the website and the native apps. Whenever an asset is published, it should be made available to all these systems with very low latency — this is news, after all — and without data loss.
This article describes a new approach we developed to solving this problem, based on a log-based architecture powered by Apache KafkaTM. We call it the Publishing Pipeline. The focus of the article will be on back-end systems. Specifically, we will cover how Kafka is used for storing all the articles ever published by The New York Times, and how Kafka and the Streams API is used to feed published content in real-time to the various applications and systems that make it available to our readers. The new architecture is summarized in the diagram below, and we will deep-dive into the architecture in the remainder of this article.
This is a nice write-up of a real-world use case for Kafka.