Foodborne illness is not only an unpleasant experience, but also a major public health concern. Many individuals who acquire foodborne illnesses do not seek medical care and do not report their illness to health departments, which can make complete and timely outbreak detection nearly impossible. With the emergence of social media as a primary form of communication, many individuals do, however, complain to their friends and followers online about their illness, symptoms and possible causes. So, how can we harness the power of social media to stop foodborne outbreaks?
As a fellow with the Project SHINE Informatics Training in Place Program in the New York City Department of Health and Mental Hygiene (DOHMH), with support from the Alfred P. Sloan Foundation and the National Science Foundation, I have been tasked with developing a system that uses data from Twitter to identify complaints of foodborne illness across the city. The DOHMH has a long history of applying innovative methods to improve foodborne disease surveillance. We use the citywide non-emergency information system, “311,” where anyone can submit a food poisoning complaint related to a New York City restaurant. Additionally, in 2011, after identifying reports of illness on the restaurant review website Yelp that were not reported to 311, DOHMH began collaborating with Yelp and Columbia University to obtain a daily feed of Yelp reviews and develop a machine learning program that uses text mining to identify reviews pertaining to foodborne illness. This project was supported by two former CSTE Applied Epidemiology fellows, Cassandra Harrison, MPH, and Kenya Murray, MPH, and resulted in the full integration of Yelp into our foodborne illness complaint system. Each year, approximately 4,000 restaurant-associated complaints are received via 311 and Yelp combined, leading to the detection of about 30 outbreaks.
Nevertheless, New York City is a large metropolitan area with more than 8.5 million residents, 78 percent of whom eat food purchased from the city’s approximately 24,000 restaurants and 15,000 food retailers at least once per week. There are ample opportunities for exposure to foodborne pathogens at New York City restaurants. Even with the integration of Yelp and 311, we remain concerned that we are not receiving all reports of restaurant-associated foodborne illness incidents in the city.
Working with Columbia University, we have developed a system very similar to the one used for Yelp reviews. It pulls publicly available data from Twitter’s application programming interface (API) and uses text mining and machine learning to identify tweets indicating foodborne illness. We have also developed a web-based application that displays all Yelp reviews and tweets for epidemiologists to review and manually classify, and that allows us to track follow-up and conduct interviews with complainants.
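To give a sense of what "text mining and machine learning" can mean in this setting, here is a minimal sketch of a text classifier. It is not the model DOHMH and Columbia University built, whose details are not described here. It assumes a simple multinomial Naive Bayes approach with add-one smoothing, and the class name `NaiveBayesTextClassifier`, the labels `"illness"` and `"other"`, and every example tweet are made up for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and return its word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesTextClassifier:
    """Hypothetical illustration: multinomial Naive Bayes over word counts."""

    def __init__(self):
        self.doc_counts = Counter()               # documents seen per label
        self.word_counts = defaultdict(Counter)   # word frequencies per label
        self.vocabulary = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            self.doc_counts[label] += 1
            for token in tokenize(text):
                self.word_counts[label][token] += 1
                self.vocabulary.add(token)
        return self

    def log_scores(self, text):
        """Return an unnormalized log-probability for each label."""
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        # Words never seen in training carry no information, so skip them.
        tokens = [t for t in tokenize(text) if t in self.vocabulary]
        scores = {}
        for label, n_docs in self.doc_counts.items():
            score = math.log(n_docs / total_docs)  # class prior
            total_words = sum(self.word_counts[label].values())
            for token in tokens:
                count = self.word_counts[label][token]
                # Add-one (Laplace) smoothing avoids log(0).
                score += math.log((count + 1) / (total_words + vocab_size))
            scores[label] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Made-up training examples; a real system would need many labeled tweets.
training_texts = [
    "got food poisoning after eating at that taco place last night",
    "been vomiting all day since the sushi dinner",
    "stomach cramps and diarrhea after lunch at the diner",
    "I think the chicken sandwich made me sick, throwing up all night",
    "best pizza in the city, loved the dinner",
    "sick beat on this new track",
    "lunch at the diner was great today",
    "waiting in line for tacos, so hungry",
]
training_labels = ["illness"] * 4 + ["other"] * 4

classifier = NaiveBayesTextClassifier().fit(training_texts, training_labels)
print(classifier.predict("threw up all night after eating sushi"))
print(classifier.predict("loved the pizza, great lunch"))
```

Note the training tweet "sick beat on this new track": slang like this is one reason keyword matching alone falls short, and it is why flagged tweets still go to epidemiologists for manual review.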
Using this application, we can respond to Twitter users we believe to be tweeting about a potential food poisoning incident and ask them to complete a brief online survey. The survey asks about the restaurant name and location, date of their visit, details of the incident and contact information for follow-up. DOHMH staff attempt to interview all users who submit surveys to obtain more information about their symptoms, incubation period and three-day food history.
The process of developing and launching the application was extensive; we encountered many roadblocks, such as accessing data through firewalls and obtaining secure public-facing servers to allow survey data collection. We have only recently started tweeting and sending surveys; so far, the survey completion rate has been low (roughly two percent), but we have observed an overall positive reaction from the public to our tweets. We hope the response rate increases over time and the application is successful, so we can share our work and lessons learned with other health departments that want to incorporate social media into their surveillance and outbreak detection efforts.
Already, our project was recognized at the 2016 New York City Technology Forum as the Most Innovative Use of Social Media/Citizen Engagement. Since then, we’ve enhanced the application to allow us to automate processes and increase the sustainability of the project over time. We have also evaluated different data sources and aim to incorporate those that will increase both the timeliness and completeness of foodborne illness outbreak detection in New York City.