Location extraction from social media: geoparsing, location disambiguation and geotagging
1-27
Middleton, Stuart
Kordopatis-Zilos, Giorgos
Papadopoulos, Symeon
Kompatsiaris, Yiannis
30 June 2018
Middleton, Stuart, Kordopatis-Zilos, Giorgos, Papadopoulos, Symeon and Kompatsiaris, Yiannis (2018) Location extraction from social media: geoparsing, location disambiguation and geotagging. ACM Transactions on Information Systems, 36 (4), [40]. (doi:10.1145/3202662).
Abstract
Location extraction, also called toponym extraction, is a field covering geoparsing, extracting spatial representations from location mentions in text, and geotagging, assigning spatial coordinates to content items. This paper evaluates five ‘best of class’ location extraction algorithms. We develop a geoparsing algorithm using an OpenStreetMap database, and a geotagging algorithm using a language model constructed from social media tags and multiple gazetteers. Third party work evaluated includes a DBpedia-based entity recognition and disambiguation approach, a named entity recognition and Geonames gazetteer approach and a Google Geocoder API approach. We perform two quantitative benchmark evaluations, one geoparsing tweets and one geotagging Flickr posts, to compare all approaches. We also perform a qualitative evaluation recalling top N location mentions from tweets during major news events. The OpenStreetMap approach was best (F1 0.90+) for geoparsing English, and the language model approach was best (F1 0.66) for Turkish. The language model was best (F1@1km 0.49) for the geotagging evaluation. The map-database was best (R@20 0.60+) in the qualitative evaluation. We report on strengths, weaknesses and a detailed failure analysis for the approaches and suggest concrete areas for further research.
Text: location extraction from social media - Accepted Manuscript
More information
Accepted/In Press date: 27 March 2018
e-pub ahead of print date: 15 June 2018
Published date: 30 June 2018
Identifiers
Local EPrints ID: 419443
URI: http://eprints.soton.ac.uk/id/eprint/419443
ISSN: 1046-8188
PURE UUID: 84f38428-d8cf-4eda-a9bd-bfacf0619d22
Catalogue record
Date deposited: 12 Apr 2018 16:30
Last modified: 16 Mar 2024 06:27
Contributors
Author: Stuart Middleton
Author: Giorgos Kordopatis-Zilos
Author: Symeon Papadopoulos
Author: Yiannis Kompatsiaris