The University of Southampton
University of Southampton Institutional Repository

Measuring the completeness of scholarly communications databases


Paszcza, Bartosz (2021) Measuring the completeness of scholarly communications databases. University of Southampton, Doctoral Thesis, 86pp.

Record type: Thesis (Doctoral)

Abstract

As scholarly communication has been digitised and moved online, large streams of data are being generated by millions of publications, citations, and viewership statistics. This data, gathered by a few specialised services, plays an important role in helping individual researchers conduct literature reviews, science policymakers analyse the impact of research, and science as a whole progress effectively.

This research aims to summarise the requirements regarding the scope, quality, transparency and accessibility of scholarly communication databases, and to create a uniform methodology for analysing these datasets based on these requirements. The methodology is then used to analyse Google Scholar, Microsoft Academic and Scopus, and the results are compared to other studies of these datasets. The high similarity between the results obtained using the designed methodology and those of established publications shows that the methodology may be a promising method of partially automated, cross-disciplinary analysis of scholarly databases. Finally, a method of conducting an automated overlap analysis of datasets is presented as a methodological contribution, alongside relevant statistics of precision and recall.

Text
Measuring the completeness of scholarly communications databases Bartosz Pa - Version of Record
Available under License University of Southampton Thesis Licence.

More information

Published date: July 2021

Identifiers

Local EPrints ID: 474095
URI: http://eprints.soton.ac.uk/id/eprint/474095
PURE UUID: 742bb97a-c084-4085-9228-eef9906668bf
ORCID for Bartosz Paszcza: orcid.org/0000-0001-6394-3573
ORCID for Leslie Carr: orcid.org/0000-0002-2113-9680
ORCID for Stevan Harnad: orcid.org/0000-0001-6153-1129
ORCID for Jeremy Frey: orcid.org/0000-0003-0842-4302

Catalogue record

Date deposited: 13 Feb 2023 17:58
Last modified: 17 Mar 2024 02:41

Contributors

Author: Bartosz Paszcza
Thesis advisor: Leslie Carr
Thesis advisor: Stevan Harnad
Thesis advisor: Jeremy Frey
