The University of Southampton
University of Southampton Institutional Repository

Using Twitter data for population estimates

Yildiz, Dilek, Munson, Jo, Vitali, Agnese, Tinati, Ramine and Holland, Jennifer (2017) Using Twitter data for population estimates. In SIS 2017. Statistics and Data Science: new challenges, new generations. Firenze University Press. pp. 1025-1031.

Record type: Conference or Workshop Item (Paper)

Abstract

Twitter is increasingly used as a source of data for the social sciences. However, deriving the demographic characteristics of users and dealing with the non-random, non-representative populations from which they are drawn pose challenges for social scientists. This paper has two objectives. First, it compares different methods for estimating demographic information from Twitter data, based on the crowd-sourcing platform CrowdFlower and the image-recognition software Face++. Second, it proposes a method for calibrating the non-representative sample of Twitter users with auxiliary information from official statistics, thereby allowing findings based on Twitter to be generalized to the general population.
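
The abstract does not say which calibration technique the paper uses. As an illustration only, the sketch below shows post-stratification, one common way to reweight a non-representative sample against official population totals. It is not presented as the authors' method. The function names, the age groups and all counts are hypothetical and were chosen only to make the arithmetic easy to follow.

```python
def poststratification_weights(sample_counts, population_counts):
    """Return one weight per stratum so that the weighted sample
    reproduces the population shares (hypothetical helper)."""
    n_sample = sum(sample_counts.values())
    n_population = sum(population_counts.values())
    weights = {}
    for stratum, count in sample_counts.items():
        if count == 0:
            # An empty stratum cannot be reweighted; it must be merged first.
            raise ValueError(f"no sampled users in stratum {stratum!r}")
        sample_share = count / n_sample
        population_share = population_counts[stratum] / n_population
        weights[stratum] = population_share / sample_share
    return weights


def weighted_mean(stratum_means, sample_counts, weights):
    """Weighted mean of a per-stratum quantity (hypothetical helper)."""
    numerator = sum(
        weights[s] * sample_counts[s] * stratum_means[s] for s in sample_counts
    )
    denominator = sum(weights[s] * sample_counts[s] for s in sample_counts)
    return numerator / denominator


# Made-up numbers: younger users are over-represented in the sample.
sample_counts = {"16-24": 500, "25-44": 400, "45+": 100}
population_counts = {"16-24": 150, "25-44": 350, "45+": 500}

weights = poststratification_weights(sample_counts, population_counts)
```

With these made-up counts, the over-represented "16-24" group receives a weight below 1 and the under-represented "45+" group a weight above 1, so that the weighted sample matches the population shares in each stratum.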

This record has no associated files available for download.

More information

Published date: June 2017
Venue - Dates: Statistics and Data Science: new challenges, new generations, Florence, Italy, 2017-06-28 - 2017-06-30

Identifiers

Local EPrints ID: 413025
URI: http://eprints.soton.ac.uk/id/eprint/413025
PURE UUID: 20851149-a1b2-4aab-b310-028e95063f1c
ORCID for Agnese Vitali: orcid.org/0000-0003-0029-9447

Catalogue record

Date deposited: 14 Aug 2017 16:30
Last modified: 15 Mar 2024 15:38

Contributors

Author: Dilek Yildiz
Author: Jo Munson
Author: Agnese Vitali
Author: Ramine Tinati
Author: Jennifer Holland

Contact ePrints Soton: eprints@soton.ac.uk

ePrints Soton supports OAI 2.0 with a base URL of http://eprints.soton.ac.uk/cgi/oai2
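
As a minimal sketch of how that base URL is used, the snippet below builds OAI-PMH 2.0 request URLs from it. The verbs `Identify` and `ListRecords` and the `oai_dc` metadata prefix come from the OAI-PMH 2.0 specification. The helper name `oai_request_url` is hypothetical, and the code only constructs URLs without contacting the repository.

```python
from urllib.parse import urlencode

# Base URL as stated in this record.
BASE_URL = "http://eprints.soton.ac.uk/cgi/oai2"


def oai_request_url(verb, **params):
    """Build an OAI-PMH request URL for the given verb (hypothetical helper)."""
    query = urlencode({"verb": verb, **params})
    return f"{BASE_URL}?{query}"


# Ask the repository to describe itself.
identify_url = oai_request_url("Identify")

# List records in unqualified Dublin Core, the format every
# OAI-PMH repository is required to support.
list_records_url = oai_request_url("ListRecords", metadataPrefix="oai_dc")
```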

This repository has been built using EPrints software, developed at the University of Southampton, but available to everyone to use.
