The Problem
- Vast underreporting: Only ~3% of food poisoning cases are ever reported to health departments, leaving most outbreaks undetected.
- Complaint-driven detection: Health agencies rely on consumer complaints to spot outbreaks, yet very few people formally report a foodborne illness to authorities.
- Untapped signals: Many individuals share food poisoning stories on platforms like Twitter instead of through official channels, so critical clues go unseen by health agencies.
The Solution
- Developed an automated system to continuously scan Twitter for posts indicating possible food poisoning (using keywords and ML filters).
- Engaged directly with users who tweeted about getting sick (e.g., via prompts or replies), encouraging them to submit official illness reports to their local health department.
- Integrated tweet-based alerts into health department workflows by providing a dashboard that mapped suspected cases and facilitated targeted restaurant inspections.
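The keyword stage of the scanning step can be illustrated with a minimal sketch. The term list, the `is_candidate` helper, and the sample posts are all hypothetical; the actual keywords and filters used by the system are not described here.

```python
import re

# Hypothetical keyword list for illustration only; the terms the real
# system used are not given in this summary.
SYMPTOM_TERMS = [
    "food poisoning", "vomiting", "threw up", "diarrhea", "nausea",
]

# One case-insensitive pattern with word boundaries around each term.
SYMPTOM_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(t) for t in SYMPTOM_TERMS) + r")\b",
    re.IGNORECASE,
)

def is_candidate(text: str) -> bool:
    """Return True if the post mentions at least one symptom term."""
    return SYMPTOM_PATTERN.search(text) is not None

# Made-up sample posts.
posts = [
    "Got food poisoning from the taco place downtown, never again",
    "This new album is sick!",
    "Threw up all night after that buffet",
]
candidates = [p for p in posts if is_candidate(p)]
```

A keyword pass like this is cheap but noisy, which is presumably why it is paired with ML filters: it only narrows the stream before a classifier makes the final call.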
Architecture Overview
- Twitter Data Pipeline: Set up continuous ingestion of tweets mentioning food poisoning symptoms or related keywords, filtered by location.
- NLP Classification: Employed text processing and machine-learning classifiers to distinguish likely foodborne illness reports from unrelated chatter.
- User Engagement Module: Automated response system that contacts authors of flagged tweets with guidance on how to report their illness to authorities.
- Health Dept Dashboard: A web dashboard for officials that displays a map of tweet-indicated incidents, links to each case’s details, and tracks which have been escalated to inspections.
- Multi-City Template: Designed the system to be easily deployable to different city or state health departments with minimal customization.
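To make the NLP Classification component more concrete, here is a small sketch of a text classifier that separates likely illness reports from unrelated chatter. It uses a hand-written multinomial Naive Bayes model as a stand-in; the summary does not say which models the system used, and the `NaiveBayes` class, the `tokenize` helper, and the training examples are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())

class NaiveBayes:
    """Tiny multinomial Naive Bayes with Laplace smoothing (illustrative)."""

    def __init__(self):
        self.word_counts = {True: Counter(), False: Counter()}
        self.doc_counts = Counter()
        self.vocab = set()

    def fit(self, examples):
        for text, label in examples:
            self.doc_counts[label] += 1
            for tok in tokenize(text):
                self.word_counts[label][tok] += 1
                self.vocab.add(tok)

    def _log_score(self, tokens, label):
        # Log prior plus the sum of smoothed log likelihoods.
        total_docs = sum(self.doc_counts.values())
        score = math.log(self.doc_counts[label] / total_docs)
        denom = sum(self.word_counts[label].values()) + len(self.vocab)
        for tok in tokens:
            score += math.log((self.word_counts[label][tok] + 1) / denom)
        return score

    def predict(self, text):
        tokens = tokenize(text)
        return self._log_score(tokens, True) > self._log_score(tokens, False)

# Hypothetical labeled examples: True means "likely illness report".
TRAIN = [
    ("got food poisoning after eating at that diner", True),
    ("been vomiting all night since the sushi dinner", True),
    ("stomach cramps and nausea after lunch at the buffet", True),
    ("this song is sick love it", False),
    ("my code is poisoning my weekend", False),
    ("watching a documentary about food trucks", False),
]

model = NaiveBayes()
model.fit(TRAIN)
flagged = model.predict("nausea and vomiting after the buffet")
ignored = model.predict("this song is sick")
```

With this toy training set, `flagged` is `True` and `ignored` is `False`. A production classifier would need far more labeled data and a proper evaluation before its flags could drive outreach or inspections.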
Results and Impacts
- Deployed across 18 public health agencies (U.S. and U.K.), tripling the volume of citizen-reported food poisoning cases compared to prior reporting rates.
- Cut detection and response time by ~60%, enabling inspectors to identify and address outbreak clusters much faster than traditional complaint-driven methods.
- Achieved high precision: the combined tweet classification and geolocation approach was ~91% accurate at pinpointing verifiable foodborne illness incidents within the relevant jurisdiction.
Skills and Tools Used
| Technique/Skill | Tools/Implementation |
|---|---|
| Social Media Mining | Twitter API integration for real-time data streaming |
| NLP Classification | Machine-learning models to flag food-poisoning-related tweets |
| Geospatial Analysis | Geocoding tweet locations to map incidents to jurisdictions |
| Web Dashboard | Interactive visualization for health officials (maps, alerts, case management) |
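The Geospatial Analysis row can be sketched as a lookup that assigns a coordinate pair to a jurisdiction. This sketch assumes a tweet's location has already been geocoded to latitude and longitude, and it approximates each jurisdiction with a rectangular bounding box. The `BoundingBox` class, the district names, and the coordinates are hypothetical; real jurisdiction boundaries are irregular polygons and would need a proper geospatial library.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class BoundingBox:
    """Rectangular stand-in for a jurisdiction boundary (illustrative)."""
    name: str
    min_lat: float
    max_lat: float
    min_lon: float
    max_lon: float

    def contains(self, lat: float, lon: float) -> bool:
        return (self.min_lat <= lat <= self.max_lat
                and self.min_lon <= lon <= self.max_lon)

# Hypothetical jurisdictions with made-up coordinates.
JURISDICTIONS = [
    BoundingBox("north_district", 40.0, 41.0, -75.0, -74.0),
    BoundingBox("south_district", 39.0, 40.0, -75.0, -74.0),
]

def assign_jurisdiction(lat: float, lon: float) -> Optional[str]:
    """Return the first jurisdiction containing the point, or None."""
    for box in JURISDICTIONS:
        if box.contains(lat, lon):
            return box.name
    return None

north = assign_jurisdiction(40.5, -74.5)
south = assign_jurisdiction(39.5, -74.5)
outside = assign_jurisdiction(10.0, 10.0)
```

Points on a shared edge match the first box in the list, so the ordering of `JURISDICTIONS` decides ties. Posts that fall outside every box return `None` and can be dropped or routed for manual review.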
Cross-Project Capabilities
- Digital Epidemiology: Pioneered using social media posts as disease surveillance signals, a method extended in later health monitoring projects.
- Public Engagement: Techniques to prompt and capture user reports via digital platforms were reused in other participatory data projects (surveys, forums).
- Real-Time Pipeline: Expertise in building live data pipelines and anomaly detection translated to subsequent projects in both health and security domains.
Published Papers/Tools
- Research Outputs: Three peer-reviewed publications (2018–2020) documenting the methods and impact of Twitter-based foodborne illness surveillance.
- Operational Tool: The Twitter surveillance dashboard was implemented by multiple city and state health departments, becoming a standard tool for live food safety monitoring.