Digital Advertising is an astonishing Machine Learning playground: it combines data-rich activities, scaling challenges and a lot of automation, especially since the rise of Programmatic buying and selling of ads in real time.
With 20 billion page views and more than 3 billion unique viewer IDs each month, we are now reaching interesting volumes for our algorithms.
In this first post we will describe some of the Machine Learning use cases that we have been working on:
- View-through rate prediction
- Broken creative detection
- Bid-request relevancy prediction
- Look-alike modeling
View-through rate prediction (VTR)
When we started experimenting with machine learning two years ago, we wanted to predict the probability that a video would be watched for more than x seconds, where x depends on the advertiser's requirement. The aim of this prediction is to show users only the ads most likely to interest them. Since at Teads we charge buyers only when an ad is viewed, another benefit of this model is that it avoids wasting inventory. When the predicted view-through rate for a given advertiser is too low, the display opportunity is left free for someone else to take.
At that time, several options were available to build a Machine Learning system, ranging from open source initiatives like Spark MLlib and Scikit-Learn to the recently introduced managed Machine Learning service from Amazon. As we were already using Spark to compute analytics jobs, we initially studied MLlib.
Unfortunately, MLlib had (and still has) major limitations for our use cases. One of them is that Spark uses DenseVectors in its logistic regression implementation, which is incompatible with high-dimensional sparse data like ours. We also wanted to be able to use the same code for offline training and online predictions to avoid discrepancies.
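To make the sparse-versus-dense point concrete, here is a minimal Python sketch. It is not MLlib or Teads code, and the dimension and feature indices are made up for illustration. It only shows that a sparse representation stores and multiplies the few active features, while a dense one carries every zero along.

```python
def dense_dot(weights, features):
    """Dot product over every dimension, zeros included."""
    return sum(w * x for w, x in zip(weights, features))


def sparse_dot(weights, features):
    """Dot product that only visits the non-zero features.

    `features` maps a feature index to its value.
    """
    return sum(weights[index] * value for index, value in features.items())


# Hypothetical sizes: 100,000 dimensions, 3 active features per example.
DIMENSIONS = 100_000
weights = [0.0] * DIMENSIONS
weights[12], weights[40_513], weights[99_999] = 0.8, -0.3, 0.1

sparse_example = {12: 1.0, 40_513: 1.0, 99_999: 1.0}

# The dense equivalent stores (and multiplies) all 100,000 values.
dense_example = [0.0] * DIMENSIONS
for index, value in sparse_example.items():
    dense_example[index] = value

sparse_score = sparse_dot(weights, sparse_example)
dense_score = dense_dot(weights, dense_example)
```

Both scores are identical, but the sparse version touches 3 entries instead of 100,000, which is what matters when each example activates only a handful of features.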
This led us to develop our own prediction library, acting as a thin abstraction layer between Spark and underlying implementations from processing libraries like Breeze or our custom algorithms. This library is part of a more general prediction framework that enables us to test new experimental approaches and guarantees that the same code is used both online and offline.
Coming back to the VTR prediction, we used our framework to design an efficient model by tuning its hyper-parameters (training duration, regularization, optimization, …) and selecting an appropriate set of features.
These features include information from the advertiser, the publisher, the user and various interactions amongst them. For example, some ads have a better VTR on specific websites or on specific users.
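As an illustration of what such features might look like, the sketch below builds a sparse vector with base features and interaction (cross) features using the hashing trick. The field names, the `device` field standing in for user information, the bucket count and the use of CRC32 are all assumptions made for the example, not a description of our actual feature set.

```python
import zlib

NUM_BUCKETS = 2 ** 20  # hypothetical size of the hashed feature space


def bucket(name, value):
    """Deterministically map a (name, value) pair to a feature index."""
    key = f"{name}={value}".encode("utf-8")
    return zlib.crc32(key) % NUM_BUCKETS


def extract_features(advertiser, publisher, device):
    """Return a sparse {index: 1.0} vector with base and interaction features."""
    return {
        bucket("advertiser", advertiser): 1.0,
        bucket("publisher", publisher): 1.0,
        bucket("device", device): 1.0,
        # Interaction terms let a linear model learn, for instance, that one
        # advertiser performs better on one specific website.
        bucket("advertiser_x_publisher", f"{advertiser}|{publisher}"): 1.0,
        bucket("advertiser_x_device", f"{advertiser}|{device}"): 1.0,
    }


example = extract_features("brand_a", "news_site", "mobile")
```

Because the function is deterministic and has no external state, the same code can be called at training time and at prediction time.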
To be effective, the resulting model learns 10^5+ parameters from 10^7+ examples and is updated every few hours. A single prediction takes 1 ms.
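For readers who want to see where hyper-parameters such as regularization and training duration come into play, here is a minimal sketch of a logistic regression trained with stochastic gradient descent on sparse features. It is a generic textbook formulation, not our prediction library, and the default values of `epochs`, `learning_rate` and `l2` are arbitrary.

```python
import math
import random


def sigmoid(z):
    """Logistic function, clamped to avoid overflow in exp()."""
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, features):
    """Probability that the video is viewed, given sparse features."""
    z = bias + sum(weights.get(i, 0.0) * v for i, v in features.items())
    return sigmoid(z)


def train(examples, epochs=20, learning_rate=0.1, l2=1e-4, seed=0):
    """Fit weights with SGD; `examples` is a list of (features, label)."""
    rng = random.Random(seed)
    data = list(examples)
    weights, bias = {}, 0.0
    for _ in range(epochs):
        rng.shuffle(data)
        for features, label in data:
            error = predict(weights, bias, features) - label
            bias -= learning_rate * error
            for i, v in features.items():
                w = weights.get(i, 0.0)
                # Gradient of the log loss plus the L2 penalty on this weight.
                weights[i] = w - learning_rate * (error * v + l2 * w)
    return weights, bias


# Toy data: feature 0 is always followed by a view, feature 1 never is.
toy = [({0: 1.0}, 1), ({1: 1.0}, 0)] * 50
weights, bias = train(toy)
p_view = predict(weights, bias, {0: 1.0})
p_skip = predict(weights, bias, {1: 1.0})
```

On this toy data the model learns a positive weight for feature 0 and a negative one for feature 1, so `p_view` ends up above 0.5 and `p_skip` below it.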
Broken creative detection
In another attempt to avoid inventory waste, we adopted a machine learning approach to detect broken video ads, meaning ads that cannot be played properly. We cannot simply check once whether a creative is "broken", because its status can change over time and can have many causes:
- It can be directly related to the availability and quality of the creative’s files,
- It can also be due to the creative’s behavior, which depends on the execution context. This case is impossible to test exhaustively, as it would require assessing all the combinations of publishers (webpages) and user contexts (OS and browser type and version).
Thus, we needed to make a prediction as to when a video is likely to be broken so that we no longer try to display it.
This gives the opportunity for other advertisers to display their ads, and for the publishers to increase their fill rate.
The resulting algorithm predicts whether a creative is unlikely to start and is able to distinguish between many different contexts.
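To give an intuition of context-dependent detection, the sketch below uses a much simpler stand-in than a learned model: a smoothed start rate counted per (creative, context) pair. The class name, the prior values and the 0.5 threshold are hypothetical choices for the example only.

```python
class StartRateTracker:
    """Tracks how often a creative actually starts, per execution context."""

    def __init__(self, prior_rate=0.9, prior_weight=20.0):
        # The prior acts like `prior_weight` imaginary attempts that started
        # at `prior_rate`, so a few early failures do not condemn a creative.
        self.prior_rate = prior_rate
        self.prior_weight = prior_weight
        self.attempts = {}
        self.starts = {}

    def record(self, creative, context, started):
        key = (creative, context)
        self.attempts[key] = self.attempts.get(key, 0) + 1
        if started:
            self.starts[key] = self.starts.get(key, 0) + 1

    def start_rate(self, creative, context):
        key = (creative, context)
        numerator = self.starts.get(key, 0) + self.prior_rate * self.prior_weight
        denominator = self.attempts.get(key, 0) + self.prior_weight
        return numerator / denominator

    def looks_broken(self, creative, context, threshold=0.5):
        return self.start_rate(creative, context) < threshold


tracker = StartRateTracker()
for _ in range(100):
    tracker.record("creative_1", "safari", started=False)
    tracker.record("creative_1", "chrome", started=True)
```

Here `creative_1` is flagged as broken on `safari` but not on `chrome`, and a pair that has never been observed falls back to the prior rate, so it keeps being served until evidence accumulates.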
Bid-request relevancy prediction
Alongside the previous studies, we embraced Programmatic, an automated and non-managed way of delivering ads. With Programmatic, Teads opened its platform to external demand (DSPs). Prior to this work, whenever a slot was available to display an ad to a user, our SSP used to systematically send a bid request to all the connected DSPs.
This was rather inefficient for both sides: it wasted a lot of network resources and put a useless load on our connected DSPs. The challenge here was to learn which requests were interesting to the buyers.
To tackle this issue, we developed a model that computes the probability that a given bid request will trigger a response from a given DSP. Using this model, only the most relevant users/contexts are sent. On the buyer side, only high-quality inventory is seen, which simplifies the DSP’s filtering process and improves overall performance.
We used a logistic regression model to classify the different requests according to their probability of getting a response. Then we defined a threshold above which requests are sent. Several feature combinations were studied to adjust the model. The classifier’s parameters were tuned following an offline experiment protocol.
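One way to picture the offline choice of a threshold is to replay scored historical requests and measure, for each candidate threshold, how much traffic would be cut and how many actual responses would be kept. The sketch below does exactly that on made-up data; the function name and the numbers are assumptions for illustration.

```python
def evaluate_threshold(scored_requests, threshold):
    """Return (traffic_cut, responses_kept) for a given threshold.

    `scored_requests` is a list of (predicted_probability, got_response).
    Requests with a probability at or above the threshold are sent.
    """
    total = len(scored_requests)
    total_responses = sum(1 for _, responded in scored_requests if responded)
    sent = [(p, responded) for p, responded in scored_requests if p >= threshold]
    kept_responses = sum(1 for _, responded in sent if responded)
    traffic_cut = 1.0 - len(sent) / total
    responses_kept = kept_responses / total_responses if total_responses else 1.0
    return traffic_cut, responses_kept


# Made-up history: predicted probability and whether the DSP responded.
history = [
    (0.90, True), (0.80, True), (0.60, False), (0.30, True),
    (0.20, False), (0.10, False), (0.05, False), (0.02, False),
]

cut_low, kept_low = evaluate_threshold(history, 0.25)
cut_high, kept_high = evaluate_threshold(history, 0.50)
```

On this toy history, a threshold of 0.25 cuts half of the requests while keeping every response, whereas 0.50 cuts more traffic but starts losing responses. The trade-off between the two numbers is what the threshold controls.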
As a result, we achieved a significant reduction in calls that turned out to be pointless. On average, we are able to cut 60% of the Bid Requests without impacting delivery for either DSPs or Publishers.
Figure: Throttling effect on the Network Out volumes generated by Bid Requests. The last uptrend is specific to that day.
However, traffic reduction will never be optimal, because our models need to stay reactive when the market evolves. Hence, we need to keep a high exploration rate and continue sending some calls that have a low response probability, in order to detect any change in market behavior.
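A simple way to picture this exploration is an epsilon-greedy style rule: requests above the threshold are always sent, and a random fraction of the others is sent anyway so that the model keeps observing how DSPs react to them. The threshold and exploration rate below are placeholders, not the values we use.

```python
import random


def should_send(response_probability, threshold=0.25, exploration_rate=0.1,
                rng=random):
    """Decide whether to send a bid request to a DSP."""
    if response_probability >= threshold:
        return True
    # Exploration: still send a fraction of low-probability requests so that
    # a change in the DSP's behavior can show up in future training data.
    return rng.random() < exploration_rate


rng = random.Random(0)
explored = sum(should_send(0.01, rng=rng) for _ in range(10_000))
```

With an exploration rate of 0.1, roughly one in ten low-probability requests still goes out, which bounds the achievable traffic reduction but keeps the training data from going stale.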
Look-alike Modeling (on-going)
Some of our advertisers know the particular users who are highly interested in their brand or products and wish to promote to a similar audience. We plan to leverage our first-party data and our machine learning tools to compute similarities between any user and a target audience.
User clustering based on browsing history is a promising way of addressing this challenge. This will allow advertisers to deliver their ads to the most interested and interesting users.
Figure: How to identify users with similar interests?
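Since this work is still ongoing, the following is only a sketch of one possible similarity measure, not our method: each user is represented by visit counts per content category (a made-up stand-in for browsing history), the target audience is summarized by its average profile, and candidates are scored by cosine similarity to that average.

```python
import math


def centroid(profiles):
    """Average profile of a list of sparse {category: count} profiles."""
    total = {}
    for profile in profiles:
        for key, value in profile.items():
            total[key] = total.get(key, 0.0) + value
    return {key: value / len(profiles) for key, value in total.items()}


def cosine_similarity(a, b):
    """Cosine similarity between two sparse profiles (0.0 if one is empty)."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical seed audience provided by an advertiser.
seed_audience = [
    {"sports": 8, "cars": 5},
    {"sports": 6, "cars": 7, "travel": 1},
]
target = centroid(seed_audience)

candidates = {
    "user_1": {"sports": 4, "cars": 3},
    "user_2": {"cooking": 9, "travel": 2},
}
scores = {user: cosine_similarity(profile, target)
          for user, profile in candidates.items()}
```

In this toy example `user_1` scores close to 1 and `user_2` close to 0, so only the former would be considered a look-alike of the seed audience.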
As one can imagine, there is an endless supply of applications for Machine Learning in the fast-paced AdTech environment. In particular, other research topics address industry challenges like:
- Campaign conversion,
- Cross-device user identification,
- Inventory forecasting,
- Dynamic Creative Optimization,
- And many other cool things …
* * * *
Machine Learning gives us the possibility to improve the advertising experience on all sides and, most importantly, to scale our business. There is still a lot to do, and we are growing our team of Data Scientists & ML Engineers in both our Paris and Montpellier offices to tackle these exciting challenges.
This is an original post by Benjamin Davy, Innovation Manager at Teads, and Cyrille Dubarry, Lead Data Scientist at Teads, which was posted on Medium.