On two existing approaches to statistical analysis of social media data
Using social media data for statistical analysis of the general population commonly faces two basic obstacles: firstly, social media data are collected for different objects than the population units of interest; secondly, the relevant measures are typically not available directly but need to be extracted by algorithms or machine learning techniques. In this paper we examine and summarise two existing approaches to statistical analysis based on social media data that can be discerned in the literature. In the first approach, analysis is applied to the social media data that are organised around the objects directly observed in the data; in the second, a different analysis is applied to a constructed pseudo survey dataset, which aims to transform the observed social media data into a set of units from the target population. We systematically elaborate the relevant data quality frameworks, exemplify their applications, and highlight some typical challenges associated with social media data.
Keywords: measurement, non-probability sample, quality, representation, test
Pages: 54-71
Patone, Martina
51bbd4cc-1c19-4a64-a0c2-1534b076fa79
Zhang, Li-Chun
a5d48518-7f71-4ed9-bdcb-6585c2da3649
1 April 2021
Patone, Martina and Zhang, Li-Chun (2021) On two existing approaches to statistical analysis of social media data. International Statistical Review, 89 (1), 54-71. (doi:10.1111/insr.12404).
Text: ISRfinal204 - Accepted Manuscript
More information
Accepted/In Press date: 25 July 2020
e-pub ahead of print date: 26 August 2020
Published date: 1 April 2021
Additional Information:
Publisher Copyright:
© 2020 The Authors. International Statistical Review published by John Wiley & Sons Ltd on behalf of International Statistical Institute
Identifiers
Local EPrints ID: 443044
URI: http://eprints.soton.ac.uk/id/eprint/443044
ISSN: 0306-7734
PURE UUID: 7c40e10e-5420-497a-ae33-35e8ff979fd5
Catalogue record
Date deposited: 06 Aug 2020 16:36
Last modified: 17 Mar 2024 05:47
Contributors
Author:
Martina Patone