Earlier this month the Smart Chicago Collaborative, in partnership with local developers Cory Nissen, Joe Olson, and Scott Robbin, and the Chicago Department of Public Health (CDPH), launched Foodborne Chicago, an innovative application that trawls Twitter for mentions of food poisoning in Chicago, enabling a team of administrators to connect with affected people and encourage them to report details of their food poisoning to the CDPH.
The Foodborne Chicago application is a collection of services that together make up a complex workflow. This post explains the overall architecture of the application and where development is headed.
Foodborne searches Twitter for all tweets near Chicago containing the string “food poisoning”. The ingestion service consumes thousands of tweets, storing them in a large MongoDB instance. A collection of classification servers running R churns through the collected tweets, applying a series of filters. The tweets are then classified by a model, trained via supervised learning, that determines whether each tweet describes a food poisoning illness. The Twitter crawler, classification machines, and MongoDB instance all run as EC2 instances on the Smart Chicago Collaborative Amazon Web Services account.
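The ingestion step described above can be sketched roughly as follows. This is a minimal illustration, not the crawler's actual code: the bounding box coordinates, the `Tweet` fields, and the stored document layout are all assumptions made for the example.

```python
from dataclasses import dataclass

# Approximate bounding box around Chicago (assumed values, for illustration only).
CHICAGO_BOUNDS = {"south": 41.64, "north": 42.03, "west": -87.95, "east": -87.52}
SEARCH_PHRASE = "food poisoning"


@dataclass
class Tweet:
    # Hypothetical minimal tweet record; real tweets carry many more fields.
    tweet_id: str
    text: str
    lat: float
    lon: float


def near_chicago(tweet):
    """Return True if the tweet's coordinates fall inside the bounding box."""
    b = CHICAGO_BOUNDS
    return b["south"] <= tweet.lat <= b["north"] and b["west"] <= tweet.lon <= b["east"]


def should_ingest(tweet):
    """Keep only tweets that mention the search phrase and are near Chicago."""
    return SEARCH_PHRASE in tweet.text.lower() and near_chicago(tweet)


def to_document(tweet):
    """Build the record to store; classification stays None until a classifier runs."""
    return {"_id": tweet.tweet_id, "text": tweet.text, "classification": None}


incoming = [
    Tweet("1", "Food poisoning again. Ugh.", 41.88, -87.63),  # Chicago, matches
    Tweet("2", "Food poisoning in NYC", 40.71, -74.00),       # wrong location
    Tweet("3", "Great lunch today", 41.88, -87.63),           # no search phrase
]
stored = [to_document(t) for t in incoming if should_ingest(t)]
```

Storing each tweet with an empty `classification` field lets the classification servers pick up unprocessed records later, independently of the crawler.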
Here is a sample of actual tweets and the determination of the classifier:
Classified as food poisoning:
- Knocked down by food poisoning for the second day. Not a good way to start the week
- Stomach flu/food poisoning is like eating gas station sushi without the joys of eating gas station sushi
- I think I ate my food too quick, either that or I sense food poisoning
- Food poisoning at the first chapter meeting. Awesome..
- My stomach keeps making the weirdest noise. Possibly food poisoning from Golden Nugget!
Classified as not food poisoning:
- I read that over six million people will get food poisoning this year with 100,000 requiring hospitalization. This is entirely preventable.
- It’s really hard to snack while watching Honey Boo Boo. It’s the second best diet to food poisoning.
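To make the idea of supervised learning concrete, here is a small sketch of a classifier trained on labeled text. It is not the Foodborne model, which runs in R and is not detailed in this post. The naive Bayes technique, the label names, and the toy training sentences are all assumptions chosen to keep the example short.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of examples
        self.vocab = set()

    def train(self, examples):
        for text, label in examples:
            self.doc_counts[label] += 1
            for word in tokenize(text):
                self.word_counts[label][word] += 1
                self.vocab.add(word)

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label in sorted(self.doc_counts):
            # Start from the log prior, then add a log likelihood per known word.
            score = math.log(self.doc_counts[label] / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in tokenize(text):
                if word not in self.vocab:
                    continue  # words never seen in training carry no signal
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data, invented for this sketch (not real tweets).
TRAINING = [
    ("sick food poisoning stomach", "food_poisoning"),
    ("stomach hurts food poisoning", "food_poisoning"),
    ("article food poisoning statistics", "not_food_poisoning"),
    ("joke diet food poisoning", "not_food_poisoning"),
]

model = NaiveBayes()
model.train(TRAINING)
```

Because every training sentence contains "food poisoning", the phrase itself does not separate the classes; the surrounding words do, which mirrors the samples above.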
The Foodborne web application, a standard Ruby on Rails application, runs on Heroku and has a scheduled job that loads classified tweets from the MongoDB instance every few minutes. Its administrative interface shows the admin team, a partnership between Smart Chicago and the CDPH, a list of tweets that the classifier has marked as food poisoning. For each tweet, the application shows whether it has been replied to and, if it has not, offers a simple mechanism for sending an @-reply. The reply can be one of a standard set of messages or a custom message, depending on context.
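The reply step could look something like the sketch below. The real application is written in Ruby on Rails; this Python version only illustrates the logic. The `replied` field, the function names, and the wording of the standard reply are hypothetical. The 140-character check reflects the tweet length limit in force in 2013.

```python
MAX_TWEET_LENGTH = 140  # Twitter's limit at the time of writing (2013)

# Hypothetical wording; the actual standard replies are not given in this post.
STANDARD_REPLIES = [
    "Sorry to hear you're ill. The Chicago health department can help if you report it.",
]


def needs_reply(tweet_record):
    """A tweet needs attention if no admin has replied to it yet."""
    return not tweet_record.get("replied", False)


def build_reply(screen_name, template_index=None, custom_message=None):
    """Compose an @-reply from a standard template or a custom message."""
    if custom_message is not None:
        message = custom_message
    elif template_index is not None:
        message = STANDARD_REPLIES[template_index]
    else:
        raise ValueError("provide a template index or a custom message")
    reply = f"@{screen_name} {message}"
    if len(reply) > MAX_TWEET_LENGTH:
        raise ValueError("reply exceeds the tweet length limit")
    return reply


queue = [
    {"id": "1", "user": "someone", "replied": False},
    {"id": "2", "user": "another", "replied": True},
]
pending = [t for t in queue if needs_reply(t)]
first_reply = build_reply(pending[0]["user"], template_index=0)
```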
When users respond to the Twitter @-reply, they fill out a simple food poisoning report form on Foodborne. This form is submitted to the City of Chicago via its Open311 interface. This submission is equivalent to the person calling Chicago 311 to report their food poisoning. The 311 software routes the submission to the Chicago Department of Public Health, where investigators review the submission and take action, including conducting inspections, based on the report.
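As a rough illustration of the submission step, the sketch below builds the form-encoded body for an Open311 service request. The parameter names (`api_key`, `service_code`, `description`, `address_string`, `first_name`, `email`) come from the Open311 GeoReport v2 specification. The function name, the API key, and the service code value are placeholders, and the endpoint is deliberately left out because the city's actual configuration is not described here.

```python
from urllib.parse import parse_qs, urlencode


def build_service_request(api_key, service_code, description, address,
                          first_name=None, email=None):
    """Return the URL-encoded body for a GeoReport v2 service request POST."""
    params = {
        "api_key": api_key,
        "service_code": service_code,
        "description": description,
        "address_string": address,
    }
    # Optional contact fields are included only when the person supplied them.
    if first_name:
        params["first_name"] = first_name
    if email:
        params["email"] = email
    return urlencode(params)


body = build_service_request(
    api_key="PLACEHOLDER_KEY",            # hypothetical value
    service_code="PLACEHOLDER_SERVICE",   # hypothetical; the real code is not given here
    description="Got sick after lunch",
    address="123 Example St, Chicago, IL",
)
decoded = parse_qs(body)
```

The resulting body would be sent as a POST to the endpoint's service request resource, which is what makes the form submission equivalent to a call to 311.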
Foodborne has a number of exciting development goals ahead. The backend infrastructure, while adequate, can be made considerably more efficient. Joe and Cory are exploring how to use EC2 spot instances and queuing tools to perform the classification work when computing resources are less expensive. The administrative interface will be extended to show more information about suspected food poisoning tweets, including whether a person has submitted a request to 311. Scott and Cory are also working on building a feedback loop to the classifier. Eventually, administrators will be able to flag tweets that are incorrectly classified as relating to food poisoning illness, and the classifier model will then learn to ignore similar tweets in the future.
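One simple way to picture the planned feedback loop is to fold administrator flags back into the labeled training data before the model is retrained. The sketch below is only an assumption about how that might work, since the feature is still being designed; the function name, label names, and example texts are invented for illustration.

```python
NOT_FOOD_POISONING = "not_food_poisoning"


def apply_feedback(training_examples, flagged_texts):
    """Return a new training set in which flagged tweets are labeled as negatives.

    training_examples: list of (text, label) pairs.
    flagged_texts: tweets an administrator marked as wrongly classified.
    """
    corrected = dict(training_examples)  # text -> label, keeps insertion order
    for text in flagged_texts:
        # Overwrite an existing label, or add the tweet as a new negative example.
        corrected[text] = NOT_FOOD_POISONING
    return list(corrected.items())


# Invented examples, not real tweets.
training = [
    ("stomach hurts food poisoning", "food_poisoning"),
    ("new movie is like food poisoning", "food_poisoning"),
]
flags = ["new movie is like food poisoning", "food poisoning stats are scary"]
updated = apply_feedback(training, flags)
```

Retraining the classifier on the updated set is what would let it ignore similar tweets in later runs.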
Foodborne is an exciting addition to the collection of applications hosted by the Smart Chicago Collaborative. We’re proud of the work the entire Foodborne team has done, and look forward to supporting future development. If you’re a developer working with open data in Chicago, you may qualify for free hosting, too!