The University of Southampton
University of Southampton Institutional Repository

A new Twitter verb lexicon for natural language processing.

European Language Resources Association
Williams, Jennifer
3a1568b4-8a0b-41d2-8635-14fe69fbb360
Katz, Graham
dcae091a-0290-4bf5-886b-a2815ccc6177

Williams, Jennifer and Katz, Graham (2012) A new Twitter verb lexicon for natural language processing. In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12). European Language Resources Association. 6 pp.

Record type: Conference or Workshop Item (Paper)

Abstract

We describe in-progress work on a new lexical resource: a list of 486 verbs annotated with quantified temporal durations for the events they describe. The resource is being compiled from more than 14 million tweets from the Twitter microblogging site. We are creating this lexicon of verbs and typical durations to address a gap in the information represented in existing research. The data in this lexicon is unlike that in existing resources, which have traditionally been composed of literature excerpts, news stories, and full-length weblogs. Knowledge of how long an event lasts is crucial for natural language processing and is especially useful when the temporal duration of an event is implied. We use Twitter data because people publicly post real events, and the real durations of those events, throughout the day.

This record has no associated files available for download.

More information

Published date: 25 May 2012

Identifiers

Local EPrints ID: 470365
URI: http://eprints.soton.ac.uk/id/eprint/470365
PURE UUID: 1495e52c-19f4-4623-8153-e8d2bc51da3b
ORCID for Jennifer Williams: orcid.org/0000-0003-1410-0427

Catalogue record

Date deposited: 07 Oct 2022 16:32
Last modified: 17 Mar 2024 04:12

Contributors

Author: Jennifer Williams
Author: Graham Katz
