Predicting Advertising Budgets With Kafka Streams

Boyang Chen explains how Pinterest uses Kafka Streams to reduce advertising overdelivery:

Overdelivery occurs when free ads are shown for out-of-budget advertisers. This reduces opportunities for advertisers with available budget to have their products and services discovered by potential customers.

Overdelivery is a difficult problem to solve for two reason:

  1. Real-time spend data: Information about ad impressions needs to be fed back into the system within seconds in order to shut down out-of-budget campaigns.

  2. Predictive spend: Fast, historical spend data isn’t enough. The system needs to be able to predict spend that might occur in the future and slow down campaigns close to reaching their budget. That’s because an inserted ad could remain available to be acted on by a user. This makes the spend information difficult to accurately measure in a short timeframe. Such a natural delay is inevitable, and the only thing we can be sure of is the ad insertion event.

This is a very interesting architectural overview.

Related Posts

Page Ranking With Kafka Streams

Hunter Kelly walks through a page ranking algorithm: Once you have the adjacency matrix, you perform some straightforward matrix calculations to calculate a vector of Hub scores and a vector of Authority scores as follows: Sum across the columns and normalize, this becomes your Hub vector Multiply the Hub vector element-wise across the adjacency matrix […]

Read More

Stateful Processing In Spark Streaming

Bill Chambers and Jules Damji look at a couple of stateful scenarios within Spark Streaming: No streaming events are free of duplicate entries. Dropping duplicate entries in record-at-a-time systems is imperative—and often a cumbersome operation for a couple of reasons. First, you’ll have to process small or large batches of records at time to discard […]

Read More

1 Comment

  • Anuj Agarwal on 2017-10-14

    Hi Curated SQL Team,

    My name is Anuj Agarwal. I’m Founder of Feedspot.

    I would like to personally congratulate you as your blog Curated SQL has been selected by our panelist as one of the Top 40 Hadoop Blogs on the web.

    https://blog.feedspot.com/hadoop_blogs/

    I personally give you a high-five and want to thank you for your contribution to this world. This is the most comprehensive list of Top 40 Hadoop Blogs on the internet and I’m honored to have you as part of this!

    Also, you have the honor of displaying the badge on your blog.

    Best,
    Anuj

Leave a Reply

Your email address will not be published. Required fields are marked *

Categories

October 2017
MTWTFSS
« Sep  
 1
2345678
9101112131415
16171819202122
23242526272829
3031