<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html xmlns:v="urn:schemas-microsoft-com:vml"
 xmlns:o="urn:schemas-microsoft-com:office:office"
 xmlns:w="urn:schemas-microsoft-com:office:word"
 xmlns="http://www.w3.org/TR/REC-html40">
<head>
  <meta name="Title"
 content="Manual Evaluation of Algorithm Performance on Identifying OA">
  <meta name="Keywords" content="">
  <meta http-equiv="Content-Type" content="text/html; charset=macintosh">
  <meta name="ProgId" content="Word.Document">
  <meta name="Generator" content="Microsoft Word 10">
  <meta name="Originator" content="Microsoft Word 10">
  <link rel="File-List" href="manual-eval_files/filelist.xml">
  <link rel="Edit-Time-Data" href="manual-eval_files/editdata.mso">
<!--[if !mso]>
<style>
v\:* {behavior:url(#default#VML);}
o\:* {behavior:url(#default#VML);}
w\:* {behavior:url(#default#VML);}
.shape {behavior:url(#default#VML);}
</style>
<![endif]-->
  <title>Manual Evaluation of Algorithm Performance on Identifying OA</title>
<!--[if gte mso 9]><xml>
 <o:DocumentProperties>
  <o:Author>Stevan Harnad</o:Author>
  <o:Template>Normal</o:Template>
  <o:LastAuthor>Stevan Harnad</o:LastAuthor>
  <o:Revision>2</o:Revision>
  <o:LastPrinted>2006-03-30T17:24:00Z</o:LastPrinted>
  <o:Created>2006-03-30T17:32:00Z</o:Created>
  <o:LastSaved>2006-03-30T17:32:00Z</o:LastSaved>
  <o:Pages>5</o:Pages>
  <o:Words>949</o:Words>
  <o:Characters>5318</o:Characters>
  <o:Company>Universit&#233; du Qu&#233;bec &#224; Montreal</o:Company>
  <o:Lines>91</o:Lines>
  <o:Paragraphs>20</o:Paragraphs>
  <o:CharactersWithSpaces>6647</o:CharactersWithSpaces>
  <o:Version>10.2418</o:Version>
 </o:DocumentProperties>
</xml><![endif]--><!--[if gte mso 9]><xml>
 <w:WordDocument>
  <w:DisplayHorizontalDrawingGridEvery>0</w:DisplayHorizontalDrawingGridEvery>
  <w:DisplayVerticalDrawingGridEvery>0</w:DisplayVerticalDrawingGridEvery>
  <w:UseMarginsForDrawingGridOrigin/>
 </w:WordDocument>
</xml><![endif]-->
  <style>
<!--
 /* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
	{mso-style-parent:"";
	margin:0in;
	margin-bottom:.0001pt;
	mso-pagination:widow-orphan;
	font-size:12.0pt;
	font-family:Times;}
h1
	{mso-style-next:Normal;
	margin-top:12.0pt;
	margin-right:0in;
	margin-bottom:3.0pt;
	margin-left:0in;
	mso-pagination:widow-orphan;
	page-break-after:avoid;
	mso-outline-level:1;
	font-size:16.0pt;
	font-family:Helvetica;
	mso-font-kerning:16.0pt;}
h3
	{mso-style-next:Normal;
	margin-top:12.0pt;
	margin-right:0in;
	margin-bottom:3.0pt;
	margin-left:0in;
	mso-pagination:widow-orphan;
	page-break-after:avoid;
	mso-outline-level:3;
	font-size:13.0pt;
	font-family:Helvetica;}
a:link, span.MsoHyperlink
	{color:blue;
	text-decoration:underline;
	text-underline:single;}
a:visited, span.MsoHyperlinkFollowed
	{color:purple;
	text-decoration:underline;
	text-underline:single;}
@page Section1
	{size:8.5in 11.0in;
	margin:.5in .5in 41.05pt .5in;
	mso-header-margin:.5in;
	mso-footer-margin:.5in;
	mso-paper-source:0;}
div.Section1
	{page:Section1;}
-->
  </style><!--[if gte mso 9]><xml>
 <o:shapedefaults v:ext="edit" spidmax="1043"/>
</xml><![endif]--><!--[if gte mso 9]><xml>
 <o:shapelayout v:ext="edit">
  <o:idmap v:ext="edit" data="1"/>
 </o:shapelayout></xml><![endif]-->
</head>
<body style="background-color: white;" lang="EN-US" link="blue"
 vlink="purple">
<div class="Section1">
<h1 style="text-align: center;" align="center">Manual Evaluation of
Robot Accuracy
in Automatically Identifying Open Access Articles on the Web</h1>
<p class="MsoNormal"><!--[if !supportEmptyParas]-->&nbsp;<!--[endif]--><o:p></o:p></p>
<h3 style="text-align: center;">Chawki Hajjem (UQaM) &amp; Stevan
Harnad (UQaM &amp; U. Southampton)</h3>
<p class="MsoNormal"><!--[if !supportEmptyParas]-->&nbsp;<!--[endif]--><o:p></o:p></p>
<p class="MsoNormal" style="margin-left: 0.5in;"><!--[if !supportEmptyParas]-->&nbsp;<!--[endif]--><o:p></o:p></p>
<p class="MsoNormal" style="margin-left: 0.5in;"><b>Previous AmSci
Topic Thread</b><span style="font-weight: normal;">:</span></p>
<p class="MsoNormal" style="margin-left: 0.5in;">"Manual Evaluation of
Algorithm Performance on Identifying OA" (Dec 2005)</p>
<p class="MsoNormal" style="margin-left: 0.5in;"><a
 href="http://www.ecs.soton.ac.uk/%7Eharnad/Hypermail/Amsci/5021.html">http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/5021.html</a>
</p>
<p class="MsoNormal" style="margin-left: 0.5in;"><!--[if !supportEmptyParas]-->&nbsp;<!--[endif]--><o:p></o:p></p>
<p class="MsoNormal" style="margin-left: 0.5in;"><b>References:<o:p></o:p></b></p>
<p class="MsoNormal" style="margin-left: 0.5in;"><!--[if !supportEmptyParas]-->&nbsp;<!--[endif]--><o:p></o:p></p>
<p class="MsoNormal" style="margin-left: 0.5in;">Antelman, K.,
Bakkalbasi, N., Goodman, D., Hajjem, C. and Harnad, S. (2005)
Evaluation of Algorithm Performance on Identifying OA. Technical
Report, North Carolina State University Libraries, North Carolina State
University. <a href="http://eprints.ecs.soton.ac.uk/11689/">http://eprints.ecs.soton.ac.uk/11689/</a>
</p>
<p class="MsoNormal" style="margin-left: 0.5in;"><!--[if !supportEmptyParas]-->&nbsp;<!--[endif]--><o:p></o:p></p>
<p class="MsoNormal" style="margin-left: 0.5in;">Hajjem, C., Harnad, S.
and Gingras,
Y. (2005) Ten-Year Cross-Disciplinary Comparison of the Growth of Open
Access
and How it Increases Research Citation Impact. IEEE Data Engineering
Bulletin
28(4) pp. 39-47. <a href="http://eprints.ecs.soton.ac.uk/11688/">http://eprints.ecs.soton.ac.uk/11688/</a>
</p>
<p class="MsoNormal"><!--[if !supportEmptyParas]-->&nbsp;<!--[endif]--><o:p></o:p></p>
<p class="MsoNormal">In an unpublished study, Antelman et al. (2005)
hand-tested the accuracy of the algorithm that Hajjem et al.'s (2005)
software robot used to identify Open Access (OA) and Non-Open-Access
(NOA) articles in the ISI database. With their larger Biology sample of
nearly 600 articles (half OA, half NOA), Antelman et al. found much
lower accuracy (d' 0.98, bias 0.78, true OA 77%, false OA 41%) than
Hajjem et al. had found with their smaller Biology sample of 200
(d' 2.45, bias (beta) 0.52, true OA 93%, false OA 16%). In Sociology
(sample size 600), Antelman et al. found even lower, near-chance
performance (d' 0.11, bias 0.99, true OA 53%, false OA 49%).</p>
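<p class="MsoNormal">&nbsp;</p>
<p class="MsoNormal">For readers who want to check figures of this
kind, the sketch below recomputes d' and bias from a hit rate (true OA)
and a false-alarm rate (false OA). It assumes the standard
equal-variance Gaussian signal-detection model, and it assumes that the
reported "bias" is the likelihood-ratio measure beta; neither
assumption is stated explicitly in the reports, and the function name
is illustrative only. Because the quoted rates are rounded, the
recomputed values should agree with the reported ones only
approximately.</p>
<pre>
```python
import math
from statistics import NormalDist
from typing import Tuple


def d_prime_and_beta(hit_rate: float, false_alarm_rate: float) -> Tuple[float, float]:
    """Signal-detection measures under an assumed equal-variance Gaussian model.

    hit_rate is the proportion of true OA articles that were called OA;
    false_alarm_rate is the proportion of NOA articles that were called OA.
    Both rates must lie strictly between 0 and 1.
    """
    z_hit = NormalDist().inv_cdf(hit_rate)
    z_fa = NormalDist().inv_cdf(false_alarm_rate)
    d_prime = z_hit - z_fa
    # beta is the likelihood ratio at the decision criterion (assumed bias measure)
    beta = math.exp((z_fa ** 2 - z_hit ** 2) / 2.0)
    return d_prime, beta


# Rounded rates quoted in the text for the two Biology hand-tests.
for label, hit, fa in [("Antelman et al.", 0.77, 0.41), ("Hajjem et al.", 0.93, 0.16)]:
    d, b = d_prime_and_beta(hit, fa)
    print(f"{label}: d' = {d:.2f}, beta = {b:.2f}")
```
</pre>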
<p class="MsoNormal"><!--[if !supportEmptyParas]-->&nbsp;<!--[endif]--><o:p></o:p></p>
<p class="MsoNormal">Hajjem et al. have now redone the hand-testing on
a still larger sample (1000) in Biology. We think we have identified
the reason for the discrepancy and have demonstrated that Hajjem et
al.'s original estimate of the robot's accuracy was closer to the
correct one.</p>
<p class="MsoNormal"><!--[if !supportEmptyParas]-->&nbsp;<!--[endif]--><o:p></o:p></p>
<p class="MsoNormal">The discrepancy arose because Antelman et al. were
hand-checking a sample other than the one the robot had sampled. The
robot's starting points (templates) are the ISI articles. The ISI
bibliographic data (author, title, etc.) for each article are first
used to trawl the web automatically with search engines, looking for
hits. The robot then applies its algorithm to the first 60 hits,
calling the article "OA" if the algorithm judges that it has found at
least one OA full-text among the 60 hits sampled, and "NOA" if it does
not find one.</p>
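<p class="MsoNormal">&nbsp;</p>
<p class="MsoNormal">The decision rule just described can be sketched
as follows. This is only an illustration of the "at least one OA
full-text among the first 60 hits" rule: the function and parameter
names are hypothetical, and the per-hit test is passed in as a
placeholder because the robot's actual algorithm for judging a single
hit is not described here.</p>
<pre>
```python
from itertools import islice
from typing import Callable, Iterable

MAX_HITS = 60  # number of search hits examined per article, as stated in the text


def classify_article(
    hit_urls: Iterable[str],
    looks_like_oa_full_text: Callable[[str], bool],  # hypothetical per-hit test
    max_hits: int = MAX_HITS,
) -> str:
    """Return "OA" if any of the first max_hits hits is judged OA, else "NOA"."""
    examined = islice(hit_urls, max_hits)
    # any() stops at the first hit that the per-hit test accepts
    return "OA" if any(looks_like_oa_full_text(url) for url in examined) else "NOA"
```
</pre>
<p class="MsoNormal">Note that the verdict depends entirely on which
hits happen to be among the 60 examined: a different search engine, or
the same search on a different day, can return a different set of hits
and hence a different verdict for the same article.</p>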
<p class="MsoNormal"><!--[if !supportEmptyParas]-->&nbsp;<!--[endif]--><o:p></o:p></p>
<p class="MsoNormal">Antelman et al. did not hand-check these same 60
hits for
accuracy, because the hits themselves were not saved; the only thing
recorded
was the robot's verdict on whether a given article was OA or NOA. So
Antelman
et al. generated another sample -- with different search engines, on a
different occasion -- for about 300 articles that the robot had
previously
identified as having an OA version in its sample, and 300 for which it
had not
found an OA version in its sample; Antelman et al.'s hand-testing found
much
lower accuracy.</p>
<p class="MsoNormal"><!--[if !supportEmptyParas]-->&nbsp;<!--[endif]--><o:p></o:p></p>
<p class="MsoNormal">Hajjem et al.'s first test of the robot's accuracy
made the
very same mistake of hand-checking a new sample instead of saving the
hits, and
perhaps it yielded higher accuracy only because the time difference
between the
two samples was much smaller (but the search engines were again not the
same
ones used). Both accuracy hand-tests were based on incommensurable
samples.</p>
<p class="MsoNormal"><!--[if !supportEmptyParas]-->&nbsp;<!--[endif]--><o:p></o:p></p>
<p class="MsoNormal">Testing the robot's accuracy in this way is
analogous to the following way of testing the accuracy of an instant
blood test for the presence of a disease. The instant test is applied
in a vast number of villages by testing a sample of 60 villagers in
each, declaring the disease present in the village ("OA") if a positive
case is detected in the sample of 60, and absent ("NOA") otherwise. The
instant test's accuracy is then checked against a reliable incubated
test, but this is done by picking <i>another</i> sample of 60 from 100
of the villages that had previously been identified as "OA" on the
basis of the instant test and from 100 that had been identified as
"NOA." Clearly, to test the accuracy of the first, instant test, the
second test ought to have been performed on the very same
<i>individuals</i> on whom the first test had been performed, not on
another sample based only on the overall outcome of the first test at
the whole-village level.</p>
<p class="MsoNormal"><!--[if !supportEmptyParas]-->&nbsp;<!--[endif]--><o:p></o:p></p>
<p class="MsoNormal">So when we hand-checked the actual hits (URLs)
that the
robot had identified as "OA" or "NOA" in our Biology sample
of 1000, saving all the hits this time, the robot's accuracy was again
much
higher: d' 2.62, bias 0.68, true OA 93%, false OA 12%.</p>
<p style="margin-left: 80px;" class="MsoNormal"><!--[if !supportEmptyParas]--><img
 src="sigdet.gif" v:shapes="_x0000_s1040"
 height="243" width="540"><!--[endif]--> <o:p></o:p></p>
<p class="MsoNormal">All this merely concerned the robot's accuracy in
detecting
true OA.<span style="">&nbsp; </span>But our larger
hand-checked sample now also allowed us to check whether the OA
citation
advantage (the ratio of the average citation counts for OA articles to
the
average citation counts for NOA articles in the same journal/issue) was
an
artifact of false OA:</p>
<p class="MsoNormal"><!--[if !supportEmptyParas]-->&nbsp;<!--[endif]--><o:p></o:p></p>
<p class="MsoNormal">We already had the robot's estimate of the
citation advantage of OA over NOA for this sample [(OA-NOA)/NOA x 100 =
70%], and we could now partition it into two components: the advantage
for the true (93%) OA articles relative to the NOA articles (false NOA
was very low, and would in any case have worked against an OA citation
advantage), and the advantage for the false (12%) "OA" articles
relative to the NOA articles. The "false OA" advantage for this 12% of
the articles was 33%, so there is definitely a false-OA component in
our results. However, the true OA advantage, for the 93%, was 77%,
which is higher than the robot's overall estimate of 70%. The false
"OA" articles therefore dilute the estimate rather than inflate it: in
fact, we are underestimating the OA advantage.
</p>
<p style="margin-left: 80px;" class="MsoNormal"><img
 src="true-falseOAA.gif" v:shapes="_x0000_s1042"
 height="372" width="540"></p>
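<p class="MsoNormal">&nbsp;</p>
<p class="MsoNormal">The percentage advantage used above, (OA-NOA)/NOA
x 100, is simple enough to state as a one-line function. The sketch
below is illustrative only: the function name is ours, and the mean
citation counts are hypothetical numbers chosen to show the arithmetic,
not data from the study (in which the comparison was made within the
same journal/issue).</p>
<pre>
```python
def citation_advantage_percent(mean_oa_citations: float, mean_noa_citations: float) -> float:
    """Percentage by which the OA mean exceeds the NOA mean: (OA - NOA) / NOA x 100."""
    return 100.0 * (mean_oa_citations - mean_noa_citations) / mean_noa_citations


# Hypothetical mean citation counts, for illustration only (not study data).
noa_mean = 10.0
examples = {
    "all robot-identified OA": 17.0,
    "true OA only": 17.7,
    "false OA only": 13.3,
}
for label, oa_mean in examples.items():
    advantage = citation_advantage_percent(oa_mean, noa_mean)
    print(f"{label}: {advantage:.0f}% advantage over NOA")
```
</pre>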
<p class="MsoNormal"><!--[if !supportEmptyParas]-->&nbsp;<!--[endif]--><o:p></o:p></p>
<p class="MsoNormal">As explained in previous postings on the American
Scientist
topic thread, the purpose of the robot studies is not to get the most
accurate
possible estimate of the current percentage of OA in each field we
study, nor
even to get the most accurate possible estimate of the size of the OA
citation
Advantage. The advantage of a robot over much more accurate
hand-testing is
that we can look at a much larger sample, and faster -- indeed, we can
test all
of the articles in all the journals in each field in the ISI database,
across
years. Our interest at this point is in nothing more accurate than a
rank-ordering of %OA as well as %OA citation Advantage across fields
and years.
We will nevertheless tighten the algorithm a little; the trick is not
to make
the algorithm so exacting for OA as to make it start producing
substantially
more false NOA errors, thereby weakening its overall accuracy for %OA
as well
as %OA advantage.</p>
<p class="MsoNormal"><!--[if !supportEmptyParas]-->&nbsp;<!--[endif]--><o:p></o:p></p>
</div>
</body>
</html>
