2 Attention Metric
2.1 Definition
For each processed news story, our topic model generates a vector of 200 probabilities. Each probability represents the proportion of the story's text associated with one of the topics that the model detects. Based on these values, we calculate an attention metric that measures the popularity of a topic at a given temporal granularity and geographical zone. Currently, we provide daily and monthly attention values for the US, with the possibility of extending this coverage to other time zones by prior agreement.
To minimise errors during the inference process, we consider only news stories that contain at least 100 words and have topics that capture at least 10% of the content. This helps ensure the accuracy of the scores.
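The filter described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `is_eligible` is hypothetical, words are counted by splitting on whitespace, and the 10% condition is read as "at least one topic has a probability of 0.10 or more", which is one plausible interpretation of the text rather than a confirmed rule.

```python
from typing import Sequence

MIN_WORDS = 100          # minimum story length stated in the text
MIN_TOPIC_SHARE = 0.10   # minimum share of content captured by a topic


def is_eligible(text: str, topic_probs: Sequence[float]) -> bool:
    """Return True if a story passes both filters (hypothetical helper)."""
    # Assumption: words are whitespace-separated tokens.
    word_count = len(text.split())
    # Assumption: the 10% rule applies to the single strongest topic.
    strongest_topic = max(topic_probs, default=0.0)
    return word_count >= MIN_WORDS and strongest_topic >= MIN_TOPIC_SHARE
```

Under this reading, a story whose probability mass is spread almost evenly across all 200 topics would be discarded, since no single topic would reach the 10% threshold.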
The daily attention score for a topic is calculated by summing the probabilities assigned to that topic across the news stories processed during the day. This sum is then normalized by dividing it by the number of news stories processed on the same day. Because each probability lies in [0, 1], the resulting attention value also lies in the interval [0, 1], which facilitates the interpretation of the metric.
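A minimal sketch of the daily calculation is given below. It assumes that the input contains one probability vector per story processed on the day in question, that all vectors have the same length, and that a day without stories yields an attention of zero; the function name `daily_attention` and the `n_topics` parameter are hypothetical.

```python
from typing import List, Sequence


def daily_attention(
    stories: Sequence[Sequence[float]], n_topics: int = 200
) -> List[float]:
    """Average per-topic probability over one day's stories (hypothetical helper)."""
    n_stories = len(stories)
    if n_stories == 0:
        # Assumption: no processed stories means zero attention for every topic.
        return [0.0] * n_topics

    # Sum the probabilities of each topic over all stories of the day.
    totals = [0.0] * n_topics
    for probs in stories:
        for topic_id, prob in enumerate(probs):
            totals[topic_id] += prob

    # Normalize by the number of stories so every value stays in [0, 1].
    return [total / n_stories for total in totals]
```

For example, with two topics and two stories whose vectors are `[0.5, 0.5]` and `[1.0, 0.0]`, the per-topic sums are 1.5 and 0.5, and dividing by the two stories gives attention values of 0.75 and 0.25.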