Ready to Start?
Run effective ML-based UA campaigns based on contextual targeting.
With the effective deprecation of the IDFA on iOS 14, advertisers are challenged to find an alternative targeting solution that is not user-based, yet still generates competitive CPIs and valuable users. The game is different, but the goal remains the same: effective growth.
As contextual targeting now takes center stage, we need information that goes beyond the easily manipulated store category. To get it, we use a word representation method called ELMo to represent each bundle as a vector, which lets us measure the contextual 'distance' between apps. The calculated distance is then fed, alongside hundreds of other features, to our prediction models and plays a significant role in our bidding logic.
To highlight its effectiveness, we created the Context Distance Calculator. It shows you the apps closest to yours in terms of mechanics, theme, features, and more, based on a detailed comparison of the apps' store description texts.
When engineering context-related features for our install and purchase prediction models, we considered two key preconceived notions: that context plays a major role in predicting user behavior and ad engagement, and that context can be significantly nuanced.
These notions rendered store categories useless, as they include inherent bias and, in some cases, blatant inaccuracies due to ASO concerns. Other approaches, such as creating categories ourselves by clustering with topic modeling and embedding methods (such as LDA or Word2Vec), didn't generate the information gain and accuracy improvement we aimed for. We realized that we had to find a way to represent each and every bundle individually.
After weighing the complexity of the task and the tools commonly used to solve language-related problems with machine learning, our data scientists' research led to testing ELMo. ELMo uses a neural network to learn associations between words and their meanings in different contexts, by learning which phrases they appear in across a massive data set of 5.5 billion tokens (words and their component parts).
Once the ELMo model is trained, it represents each word as a vector. This allows us to measure the cosine distance between any two words, which indicates how semantically and syntactically similar they are. Since considering each word is crucial for understanding context, ELMo proves to be one of the most reliable options for such a complex task.
An example of word vectors that are contextually close, with closeness measured by cosine distance
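To make the cosine distance measurement concrete, here is a minimal sketch in plain Python. The function name `cosine_distance` and the tiny 3-dimensional vectors are made up for illustration; real ELMo vectors are far larger and come from the trained model, not from hand-picked numbers.

```python
import math


def cosine_distance(u, v):
    """Return 1 minus the cosine similarity of two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)


# Hypothetical toy vectors (assumption: not actual ELMo output).
puzzle = [0.9, 0.1, 0.0]
match = [0.8, 0.2, 0.1]
racing = [0.0, 0.2, 0.9]

# Vectors pointing in similar directions get a smaller distance.
print(cosine_distance(puzzle, match))
print(cosine_distance(puzzle, racing))
```

A distance near 0 means the two vectors point in almost the same direction, while a distance near 1 means they are close to orthogonal. The same function works unchanged on bundle vectors.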
A depiction of how a bundle vector is formed from a weighted sum aggregation of its words
After generating the word vectors, the next step is to aggregate them for each bundle in a way that represents its context effectively. Essentially, this means embedding the entire store description into a single vector representation, so that we can measure the contextual 'distance' between different apps.
This step in our research and development process turned out to be the most convoluted. Simply averaging the vectors of all words in the app's store description sounds reasonable, but it would lose valuable data about how frequently specific words appear or repeat within that app's store description, as well as across the entire corpus.
After extensive testing, the approach that generated the most information gain was based on a staple numerical statistic in NLP (natural language processing, the area of machine learning that focuses on text analysis): TF-IDF, or Term Frequency-Inverse Document Frequency. It weighs how often a word appears in a document against how common that word is across the corpus, so it basically indicates how rare and distinctive specific words are.
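As a rough illustration of the statistic, the sketch below computes one common TF-IDF variant (term count divided by document length, multiplied by the natural log of the number of documents over the number of documents containing the word). The exact variant, the function name `tf_idf`, and the toy tokenized descriptions are assumptions made for this example, not a description of our production pipeline.

```python
import math
from collections import Counter


def tf_idf(docs):
    """docs: list of token lists. Returns one {word: score} dict per document."""
    n_docs = len(docs)
    # Document frequency: in how many documents does each word appear?
    doc_freq = Counter(word for doc in docs for word in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        scores.append({
            word: (count / len(doc)) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        })
    return scores


# Hypothetical, already-tokenized store descriptions.
docs = [
    ["match", "three", "puzzle", "game"],
    ["racing", "game", "cars"],
    ["puzzle", "game", "levels"],
]
scores = tf_idf(docs)

# "game" appears in every description, so its score drops to zero,
# while "racing" appears in only one and scores highest there.
print(scores[1])
```

Words that show up in every description carry no distinguishing signal under this variant, which is exactly the behavior that makes the statistic useful for surfacing what is unique about an app.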
Eventually, our bundle representation evolved into a weighted average of the word vectors, with each vector weighted by its normalized TF-IDF value. This puts emphasis on the rarer, more nuanced words that usually express each app's unique themes, mechanics and features, and therefore provides our models with very granular yet valuable information.
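The weighted aggregation can be sketched as follows. This is a simplified illustration under stated assumptions: normalization here means dividing each TF-IDF weight by the sum of the weights, each word is given a single vector (ELMo actually produces a vector per word occurrence), and the name `bundle_vector` and all numbers are hypothetical.

```python
def bundle_vector(word_vectors, weights):
    """Weighted average of word vectors.

    word_vectors: {word: list of floats}, all of the same length.
    weights: {word: non-negative TF-IDF score}.
    """
    # Keep only words that have both a vector and a positive weight.
    words = [w for w in weights if w in word_vectors and weights[w] > 0]
    total = sum(weights[w] for w in words)
    if total == 0:
        raise ValueError("no word with a positive weight and a vector")
    dim = len(word_vectors[words[0]])
    result = [0.0] * dim
    for w in words:
        share = weights[w] / total  # normalized weight, shares sum to 1
        for i, x in enumerate(word_vectors[w]):
            result[i] += share * x
    return result


# Hypothetical 2-dimensional word vectors and TF-IDF weights.
vectors = {"puzzle": [1.0, 0.0], "game": [0.0, 1.0]}
weights = {"puzzle": 3.0, "game": 1.0}

# The rarer word ("puzzle") pulls the bundle vector toward itself.
print(bundle_vector(vectors, weights))
```

Compared with a plain average, which would place this toy bundle exactly halfway between the two words, the weighted version leans toward the more distinctive word.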