The University of Southampton
University of Southampton Institutional Repository

Manual Evaluation of Robot Performance in Identifying Open Access Articles


Hajjem, Chawki and Harnad, Stevan (2006) Manual Evaluation of Robot Performance in Identifying Open Access Articles

Record type: Monograph (Project Report)

Abstract

Antelman et al. (2005) hand-tested the accuracy of the algorithm that Hajjem et al.'s (2005) software robot used to trawl the web and automatically identify Open Access (OA) and Non-Open-Access (NOA) articles (references derived from the ISI database). Antelman et al. found much lower accuracy than Hajjem et al. had reported. Hajjem et al. have now redone the hand-testing on a larger sample (1000) in Biology and demonstrated that their original estimate of the robot's accuracy was much closer to the correct one. The discrepancy arose because both Antelman et al. and Hajjem et al. had hand-checked a sample other than the one the robot was sampling. Our present sample, identical with what the robot saw, yielded: d' 2.62, bias 0.68, true OA 93%, false OA 12%. We also checked whether the OA citation advantage (the ratio of the average citation counts for OA articles to the average citation counts for NOA articles in the same journal/issue) was an artifact of false OA. The robot-based citation advantage of OA over NOA for this sample [(OA-NOA)/NOA x 100] was 70%. We partitioned this into the ratio of the citation counts for true (93%) OA articles to the NOA articles versus the ratio of the citation counts for the false (12%) "OA" articles to the NOA articles. The "false OA" advantage for this 12% of the articles was 33%, so there is definitely a false OA advantage bias component in our results. However, the true OA advantage, for 93% of the articles, was 77%. So in fact, we are underestimating the true OA advantage.
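The two kinds of figures quoted in the abstract can be illustrated with a minimal sketch. It assumes, since the abstract does not say so explicitly, that "true OA 93%" is the hit rate and "false OA 12%" is the false-alarm rate of an equal-variance Gaussian signal detection model, and that "bias" is the likelihood-ratio criterion (beta). The function names and the citation means passed to `citation_advantage` are hypothetical and chosen only for illustration; they are not data from the report.

```python
from math import exp
from statistics import NormalDist


def sensitivity_and_bias(hit_rate, false_alarm_rate):
    """Return (d', beta) under an equal-variance Gaussian model (assumed here)."""
    z = NormalDist().inv_cdf           # inverse of the standard normal CDF
    z_hit = z(hit_rate)
    z_fa = z(false_alarm_rate)
    d_prime = z_hit - z_fa             # sensitivity: separation of the two distributions
    beta = exp((z_fa ** 2 - z_hit ** 2) / 2)  # likelihood ratio at the decision criterion
    return d_prime, beta


def citation_advantage(mean_oa, mean_noa):
    """Percentage advantage as defined in the abstract: (OA - NOA) / NOA x 100."""
    return (mean_oa - mean_noa) / mean_noa * 100


# Rates as rounded in the abstract (93% true OA, 12% false OA).
d_prime, beta = sensitivity_and_bias(0.93, 0.12)
print(f"d' = {d_prime:.2f}, beta = {beta:.2f}")

# Hypothetical mean citation counts, for illustration only.
print(f"advantage = {citation_advantage(17.0, 10.0):.0f}%")
```

With the rounded rates, this sketch gives values of roughly 2.65 for d' and 0.67 for beta, close to but not identical with the reported 2.62 and 0.68. The small gap would be expected if the report computed its figures from unrounded counts, though that is an inference rather than something the abstract states.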

Image
sigdet.gif - Other
Download (11kB)
Text
manual-eval.html - Other
Download (15kB)
Image
true-falseOAA.gif - Other
Download (17kB)
Text
manual-eval.pdf - Other
Download (71kB)
Text
manual-eval.doc - Other
Download (71kB)


More information

Published date: March 2006
Additional Information: Commentary On: http://dlist.sir.arizona.edu/1015/
Keywords: signal detection analysis, citation analysis, open access, research impact, webmetrics
Organisations: Web & Internet Science

Identifiers

Local EPrints ID: 262220
URI: http://eprints.soton.ac.uk/id/eprint/262220
PURE UUID: 0ddc827c-62a9-4078-b4b5-18ea68b8c550
ORCID for Stevan Harnad: orcid.org/0000-0001-6153-1129

Catalogue record

Date deposited: 30 Mar 2006
Last modified: 15 Mar 2024 02:48

Contributors

Author: Chawki Hajjem
Author: Stevan Harnad
