July 27, 2017

Big Twitter Data Predicts Trends

“When you notice patterns over time, you have a suspicion that there’s an underlying reason why they happen, and maybe we can encode that in a model so that with just a little bit of data we can predict what will happen,” noted Josh Montague. Montague and Scott Hendrickson, another data scientist at Twitter, created what they call “The Social Media Pulse,” a model that can predict surges in tweet volume when big events happen.

From their report:

In the digitally connected world, social media platforms are often the primary means by which people share their observations and perspectives during significant events. Such platforms present low-friction ways to share their point-of-view experiences, whether in simple text or visual media (photos, videos, gifs, etc.). Because of this ease of sharing, the aggregate data produced by the platform’s users is a rich source of insight into broad, cultural behavior. At times, we can even observe the ways by which these behaviors manifest in platform-specific patterns. Given enough data that displays these patterns, we can begin to develop models based on them.

An analyst can be prepared to produce both descriptive and predictive results based on observed data by empowering them with models that describe the users and the users’ responses to events. Obtaining a model representation of the data enables the analyst to compare parameters across multiple events (e.g., time scales, coefficient magnitudes); or, for a single event, one could compare similar parameters across multiple social platforms. Such a model could also be a component of broader trend- or event-detection methods, potentially assisting the analyst in handling real-time news media or public relations.

Their model looks at three types of events that can explode on Twitter: Expected Events, Unexpected Network Spread Events, and Unexpected Social Media Pulse Events. “With a Social Media Pulse model applied to observed data, one can calculate relevant metrics like an estimated time to the Pulse peak, or total expected Tweet volume,” noted Cassie Stewart, who is in Data Process Marketing at Twitter, in a blog announcement. “The Social Media Pulse model can take on a range of similar shapes, as shown in the figure below that compares three different earthquakes. The resulting fits can then be compared across multiple events to draw comparisons.”

[Figure: Social Media Pulse model fits compared across three different earthquakes]

“This model is just a start, but with it we have an opportunity to look for, and compare patterns observed on the platform,” says Stewart. “Through the use of analyses like these you can gain quantitative insights into real-time, real-world events, to better quantify the observations and conclusions made from social data streams.”
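To make the metrics Stewart mentions more concrete, here is a minimal sketch of how a time to peak and a total expected Tweet volume could be read off a fitted pulse curve. The article does not give the model's functional form, so the gamma-shaped curve below is an assumption chosen purely for illustration, and the names `pulse_rate`, `time_to_peak`, and `expected_volume`, along with the parameters `total`, `alpha`, and `beta`, are hypothetical rather than taken from the team's released code.

```python
import math


def pulse_rate(t, total, alpha, beta):
    """Tweets per minute at t minutes after the event.

    Assumed gamma-shaped curve (illustrative only):
    total * beta**alpha / Gamma(alpha) * t**(alpha - 1) * exp(-beta * t)
    """
    if t <= 0:
        return 0.0
    scale = total * beta ** alpha / math.gamma(alpha)
    return scale * t ** (alpha - 1) * math.exp(-beta * t)


def time_to_peak(alpha, beta):
    """Minutes from the event to the maximum rate (mode of the gamma shape)."""
    if alpha <= 1:
        return 0.0  # the curve decays from the start, so the peak is at t = 0
    return (alpha - 1) / beta


def expected_volume(total, alpha, beta, horizon, step=0.01):
    """Approximate Tweet count between 0 and `horizon` minutes (trapezoid rule)."""
    n_steps = int(round(horizon / step))
    area = 0.0
    prev = pulse_rate(0.0, total, alpha, beta)
    for i in range(1, n_steps + 1):
        cur = pulse_rate(i * step, total, alpha, beta)
        area += 0.5 * (prev + cur) * step
        prev = cur
    return area


if __name__ == "__main__":
    # Hypothetical fitted parameters for one event.
    total, alpha, beta = 10_000, 2.0, 0.5
    print("minutes to peak:", time_to_peak(alpha, beta))
    print("tweets in first hour:", round(expected_volume(total, alpha, beta, 60)))
```

Under this assumption, comparing events comes down to comparing the fitted `alpha` and `beta` values: a larger `beta` means a faster rise and decay, while `total` sets the overall size of the response.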

The team has released the code, with details and examples, to encourage developers to incorporate the model into third-party applications. (PDF Download and GitHub Repository)