Geo‐parsing Messages from Microtext
Social media use during crises has become commonplace, as shown by the volume of messages during the Haiti earthquake of 2010 and the Japan tsunami of 2011. Location mentions are particularly important in disaster messages because they can show emergency responders where problems have occurred. This article explores the sorts of locations that occur in disaster‐related social messages, how well off‐the‐shelf software identifies those locations, and what is needed to improve automated location identification, called geo‐parsing. To do this, we sampled Twitter messages from the February 2011 earthquake in Christchurch, Canterbury, New Zealand. We manually annotated locations in the messages to create a gold standard against which to measure the locations identified by Named Entity Recognition (NER) software. The Stanford NER software found some locations that were proper nouns, but did not identify uncapitalized locations, local streets and buildings, or the non‐standard place abbreviations and misspellings that are plentiful in microtext. We review how these problems might be solved in software research, and model a readable crisis map that shows crisis location clusters via enlarged place labels.
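The comparison of NER output against a manually annotated gold standard can be illustrated with a minimal sketch. It is not the authors' evaluation protocol: the exact-match rule, the `normalize` and `score` helpers, and the sample mentions are all assumptions made for illustration.

```python
from typing import Iterable, Set, Tuple

# A mention is (message id, location text). This representation is an
# assumption for illustration, not the annotation scheme used in the article.
Mention = Tuple[int, str]


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so 'Colombo  St' equals 'colombo st'."""
    return " ".join(text.lower().split())


def score(gold: Iterable[Mention], predicted: Iterable[Mention]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) using exact match on normalized mentions."""
    gold_set: Set[Mention] = {(i, normalize(t)) for i, t in gold}
    pred_set: Set[Mention] = {(i, normalize(t)) for i, t in predicted}
    true_positives = len(gold_set & pred_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical sample: the gold standard includes a street, an uncapitalized
# place, and a local abbreviation that a tagger might miss.
gold = [(1, "Christchurch"), (1, "cathedral square"), (2, "Colombo St"), (3, "chch")]
predicted = [(1, "Christchurch"), (2, "Twitter")]

precision, recall, f1 = score(gold, predicted)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Exact matching is the strictest choice. A partial-overlap rule would give credit for near misses such as a tagger returning "Colombo" for "Colombo St".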
Document Type: Research Article
Affiliations: Language Technologies Institute, School of Computer Science, Carnegie Mellon University
Publication date: 2011-12-01