Programmatic advertising is an actively developing industry and research area. Some of the research in this area concerns the development of optimal or approximately optimal contracts and policies between publishers, advertisers and intermediaries such as ad networks and ad exchanges. Both the development of contracts and the construction of policies governing their implementation are difficult challenges, and different models take different features of the problem into account. In programmatic advertising decisions are made in real time, and time is a scarce resource particularly for publishers who are concerned with content load times. Policies for advertisement placement must execute very quickly once content is requested; this requires policies to either be pre-computed and accessed as needed, or for the policy execution to be very efficient. We formulate a stochastic optimization problem for per publisher ad sequencing with binding latency constraints. Within our context an ad request lifecycle is modeled as a sequence of one by one solicitations (OBOS) subprocesses/lifecycle stages. From the viewpoint of a supply side platform (SSP) (an entity acting in proxy for a collection of publishers), the duration/span of a given lifecycle stage/subprocess is a stochastic variable. This stochasticity is due both to the stochasticity inherent in Internet delay times, and the lack of information regarding the decision processes of independent entities.
In our work we model the problem facing the SSP, namely the problem of optimally or near-optimally choosing the next lifecycle stage of a given ad request lifecycle at any given time. We solve this problem to optimality (subject to the granularity of time) using a classic application of Richard Bellman's dynamic programming approach to the 0/1 Knapsack Problem. The DP approach does not scale to a large number of lifecycle stages/subprocesses so a sub-optimal approach is needed. We use our DP formulation to derive a focused real time dynamic programming (FRTDP) implementation, a heuristic method with optimality guarantees for solving our problem. We empirically evaluate (through simulation) the performance of our FRTDP implementation relative to both the DP implementation (for tractable instances) and to several alternative heuristics for intractable instances. Finally, we make the case that our work is usefully applicable to problems outside the domain of online advertising.