Thursday, March 23, 2023


SDSU geography professor Ming-Hsiang Tsou follows disease-related keywords on Twitter in order to identify outbreaks of influenza. Photo: Whitney Mullen

Hashtag Health

SDSU geography professor Ming-Hsiang Tsou's method of using Twitter to track the spread of influenza is producing results.
By Michael Price

As the United States enters the sniffly, sneezy heart of flu season, a social media–monitoring program led by a San Diego State University researcher could clue in physicians and health officials to when and where severe outbreaks are occurring in real time.

SDSU geography professor Ming-Hsiang Tsou, who leads the project, follows disease-related keywords that pop up on Twitter in order to identify locations where outbreaks of influenza are occurring, sometimes weeks before traditional methods can detect them. Last month, the first results from the project were published in the Journal of Medical Internet Research. Tsou’s technique might allow officials to more quickly and efficiently direct resources to outbreak zones and better contain the spread of the disease.

“There is the potential to use social media to really improve the way we monitor the flu and other public health concerns,” Tsou said.

Flu dynamics

Tsou's program plots tweets with flu-related keywords onto a map.
The Centers for Disease Control and Prevention (CDC) defines flu season as the period from October through May, usually peaking around February. The disease normally shows up first in the western half of the country and then zooms east. The 2012-13 flu season was atypical. Health officials documented the season’s initial cases on the East Coast, and it peaked earlier than normal, in January, petering out by March.

Such unpredictability makes it difficult for hospitals and regional health agencies to prepare for where and when to deploy physicians and nurses armed with vaccines and medicines.

Current methods for disease reporting involve hospitals tracking incoming patients with flu-like symptoms and forwarding those data to regional health agencies. The agencies deliver that information to the CDC, which collates it and gives national guidance on where outbreaks are occurring. It’s a relatively lengthy and costly enterprise.

Tsou thought there had to be a better way. For several years, he has explored ways researchers can mine the tremendous volume of continuously updating, freely available social media content to uncover societal trends. The overarching project is called Mapping Ideas from Cyberspace to Realspace and is funded by a $1.3 million grant from the National Science Foundation. Tsou wondered, could his techniques be used to track the flu?

For the recent study, he and colleagues selected 11 U.S. cities for which disease reporting was readily available and developed a computer program to monitor tweets originating from within a 17-mile radius of those cities. Whenever people tweeted the keywords “flu” or “influenza,” the program would record the username; the user’s location (and GPS coordinates, when available); the tweet’s 140-character content; time and date; whether the tweet came from a cell phone or a computer; whether it was a tweet or retweet; and if it included a link to a website.
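As a rough illustration of the collection step described above, the sketch below keeps a tweet only if it mentions "flu" or "influenza" and its coordinates fall within 17 miles of a city center. It is not the team's program: the tweet dictionary layout, the function names, and the use of the haversine great-circle formula are all assumptions made for the example.

```python
import math
import re

EARTH_RADIUS_MILES = 3958.8   # mean Earth radius
RADIUS_MILES = 17.0           # radius reported in the article
# Whole-word match, so "fluent" does not count as "flu" (an assumption).
KEYWORDS = re.compile(r"\b(flu|influenza)\b", re.IGNORECASE)


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def is_candidate(tweet, city_lat, city_lon):
    """True if the tweet mentions a keyword and lies within the radius.

    `tweet` is a hypothetical dict with "text", "lat", and "lon" keys.
    """
    if not KEYWORDS.search(tweet["text"]):
        return False
    distance = haversine_miles(tweet["lat"], tweet["lon"], city_lat, city_lon)
    return distance <= RADIUS_MILES
```

A real collector would also need a fallback for tweets that carry only a profile location rather than GPS coordinates, which the article notes are not always available.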

From June 2012 to the beginning of December, the algorithm recorded 161,821 tweets containing the word “flu” and 6,174 containing “influenza.”

Tsou then compared his team’s data to regional reports from city and county health agencies based on the CDC’s definition of influenza-like illnesses (ILI). Nine of the 11 cities showed a statistically significant correlation between locally reported outbreaks and a higher-than-normal count of keyword-containing tweets; Tsou’s algorithm detected outbreaks earlier than local reports in five of them. The cities with the strongest correlations were San Diego, Denver, Jacksonville, Seattle and Fort Worth.
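The comparison described above can be pictured as correlating two weekly series for a city: keyword tweet counts and reported ILI figures. The article does not say which statistic the team used, so the sketch below assumes Pearson's correlation coefficient, and adds a simple lagged variant to show how tweets might be tested as an early signal. The function names are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have equal length of at least 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


def lagged_r(tweet_counts, ili_counts, lag):
    """Correlate tweets in week t with ILI reports in week t + lag (lag >= 0)."""
    if lag < 0:
        raise ValueError("lag must be non-negative")
    return pearson_r(tweet_counts[: len(tweet_counts) - lag], ili_counts[lag:])
```

If the correlation at a lag of one or two weeks is higher than at lag zero, that would be consistent with tweets rising before the official reports do. Testing for statistical significance, as the study reports doing, would take an additional step not shown here.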

“Traditional procedures take at least two weeks to detect an outbreak,” Tsou said. “With our method, we’re detecting daily.”

The tightest correlation between tweets and ILI data occurred with non-retweets and tweets that did not include a URL, possibly because retweets and links to other websites are less likely to reflect individuals posting about their own mucousy maladies, Tsou said.

“Original tweets have more power to predict relevance than retweets,” he said.
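The filtering that produced the tightest correlation, dropping retweets and tweets that contain a link, might look like the sketch below. The field names ("text", "is_retweet") and the "RT @" prefix check are assumptions for illustration, not the project's actual data format.

```python
import re

# Matches http:// or https:// links anywhere in the tweet text.
URL_PATTERN = re.compile(r"https?://\S+", re.IGNORECASE)


def is_original_without_link(tweet):
    """True for a tweet that is neither a retweet nor contains a URL.

    `tweet` is a hypothetical dict with a "text" key and an optional
    boolean "is_retweet" key; old-style manual retweets are caught by
    their "RT @" prefix.
    """
    text = tweet["text"]
    is_retweet = tweet.get("is_retweet", False) or text.lstrip().startswith("RT @")
    has_url = URL_PATTERN.search(text) is not None
    return not is_retweet and not has_url


def keep_originals(tweets):
    """Filter a list of tweets down to original, link-free posts."""
    return [t for t in tweets if is_original_without_link(t)]
```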

#Achoo #need a tissue

The next step in Tsou’s research is hunting for even finer-grained correlations between ILI data and specific symptomatic keywords like “cough,” “sneeze,” “congestion,” and “sore throat.”

Last week, Tsou demonstrated the power of this technique by calibrating his program to locate tweets originating from the San Diego region with the keywords “cold” plus either “sneezing” or “coughing” or “congestion.” It identified a handful of tweets that seemed to closely hit the mark:


  • "At home fighting a cold. Headache, congestion, body aches, etc. This sucks!"
  • "Blueberry, Strawberry, orange juice smoothie...lethal combination of Vitamin C. I'm going to destroy my cough and cold! #bam"

“These represent people who probably have the flu,” Tsou said. “On the other hand, there are also some tweets that are probably not related to the flu, even though we used this very relevant keyword search. For example, ‘It's cold in my room and I can't stop sneezing.’”

The trick will be tuning his program to filter out these less relevant tweets and zero in on those that indicate the tweeter likely has the flu, Tsou said.
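To see why keyword matching alone cannot solve this, consider a minimal sketch of the query described above: "cold" plus at least one symptom word. Matching on word stems ("sneez", "cough", "congest") is an assumption made here so that variants such as "cough" and "coughing" both count; the article does not describe how the program matches words.

```python
import re

COLD = re.compile(r"\bcold\b", re.IGNORECASE)
# Stem-based symptom match (an assumption): sneeze/sneezing, cough/coughing,
# congestion/congested.
SYMPTOM = re.compile(r"\b(sneez|cough|congest)\w*", re.IGNORECASE)


def matches_query(text):
    """True if the text contains "cold" and at least one symptom word."""
    return bool(COLD.search(text)) and bool(SYMPTOM.search(text))


examples = [
    "At home fighting a cold. Headache, congestion, body aches, etc. This sucks!",
    "It's cold in my room and I can't stop sneezing.",
]
for text in examples:
    print(matches_query(text), "-", text)
```

Both examples pass the filter, including the one about a chilly room, because the query looks only at which words appear and not at what they mean. Separating the two cases would require some additional signal beyond the keywords themselves.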

Mark H. Sawyer, co-author on the recent paper and medical director of the University of California, San Diego Immunization Coalition, which partners with the City of San Diego Health and Human Services Agency, said Twitter tracking could provide a valuable weapon to health officials looking to gain some advantage in their yearly battle with the flu.

“The health department uses all sorts of different metrics to measure the flu season—where it’s breaking out, how quickly it’s spreading,” he said. “The more tools we have, the better.”

Tsou envisions this kind of “infoveillance,” as he calls it, applying to a range of public health concerns in the future, such as monitoring regional incidences of heart attack or diabetes. The project is connected to a larger SDSU initiative, Human Dynamics in the Mobile Age, one of the university’s four recently selected Areas of Excellence. Tsou is a core faculty member for the initiative.

He also founded a company, PathGeo, through SDSU’s technology transfer office that is designed to allow businesses to track societal trends relevant to their interests, such as political campaigning or regional buying habits. The company is presently located in the university’s Zahn Innovation Center.
