<!doctype html public "-//w3c//dtd html 4.0 transitional//en">
<html>
<head>
   <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
   <meta name="GENERATOR" content="Mozilla/4.61 (Macintosh; I; PPC) [Netscape]">
   <title>Nature Debates: The self-archiving initiative</title>
</head>
<body bgcolor="#FFFFFF">
See also the ongoing
discussion of liberating the refereed literature at:
<br><i>Nature</i>: <a  href="http://www.nature.com/nature/debates/e-access/index.html">http://www.nature.com/nature/debates/e-access/index.html</a>
<br><i>Science</i>: <a  href="http://www.sciencemag.org/cgi/eletters/291/5512/2318b">http://www.sciencemag.org/cgi/eletters/291/5512/2318b</a>
<br><i>American Scientist</i>: <a  href="http://amsci-forum.amsci.org/archives/september98-forum.html">http://amsci-forum.amsci.org/archives/september98-forum.html</a>
<br>&nbsp;
<table BORDER=0 CELLSPACING=0 CELLPADDING=0 WIDTH="455" >
<tr ALIGN=RIGHT BGCOLOR="#CCCCCC">
<td BGCOLOR="#CCCCCC">
<center>26 April 2001 Nature 410, 1024 - 1025 (2001)
<br>&copy; Macmillan Publishers Ltd.</center>

<h1>
<b>The self-archiving initiative</b></h1>

<h3>
Freeing the refereed research literature online</h3>
</td>
</tr>

<tr ALIGN=RIGHT BGCOLOR="#FFFFCC">
<td BGCOLOR="#FFFFCC" class="yellowbox">
<br><b>Stevan Harnad</b>
<br>Intelligence/Agents/Multimedia Group&nbsp;
<br>Department of Electronics and Computer Science&nbsp;
<br>University of Southampton UK
<br><a  href="http://cogsci.soton.ac.uk/~harnad">http://cogsci.soton.ac.uk/~harnad</a></td>
</tr>

<tr>
<td>
<div class="bodytwo"></div>&nbsp;

<table ALIGN=RIGHT BORDER=0 WIDTH="10%" >
<tr>
<td><img  SRC="nature2.jpg" height=176 width=149></td>
</tr>
</table>

<p>Unlike the authors of books and magazine articles, who write for royalties
or fees, the authors of refereed journal articles write only for 'research
impact'. To be cited and built on in the research of others, their findings
have to be accessible to their potential users. From the authors' viewpoint,
toll-gating access to their findings is as counterproductive as toll-gating
access to commercial advertisements.
<p>With the online age, it has at last become possible to free the literature
from this unwelcome impediment. Authors need only deposit their refereed
articles in 'eprint' archives at their own institutions; these interoperable
archives can then all be harvested into a global virtual archive, its full
contents freely searchable and accessible online by everyone (see <a  href="#B1">The
transition scenario</a>).&nbsp;
<p>Unlike the royalty/fee-based literature, which constitutes the vast
majority of the printed word, the special, tiny literature of refereed
journal articles is, and always has been, an 'author giveaway'. Researchers
never benefited from the fact that people had to pay access tolls to read
their papers (as subscriptions and, for the online version, site-licences
or pay-per-view). On the contrary, those access barriers represent impact
barriers for researchers, whose careers and standing depend largely on
the visibility and uptake of their research.
<p>There are currently at least 20,000 refereed journals across all fields
of scholarship, publishing more than 2 million refereed articles each year.
The world's institutions that can afford the tolls collectively pay an
average of $2,000 for each of those refereed papers.
In exchange for that fee, that particular paper is accessible
to readers at those, and only those, paying institutions.
<p>The research libraries of the world can be divided into the (minority)
Harvards and the (majority) Have-nots, the last by no means limited to
the developing world. It is obvious how the Have-nots would benefit from
free access to the entire refereed literature, for without it their meagre
serials budgets can afford only a pitifully small portion. But not even
Harvard can afford access to anywhere near all of the literature (see <a  href="http://fisher.lib.virginia.edu/newarl/index.html" target="_blank">Association
of Research Libraries Statistics</a>). Hence, most refereed articles are
inaccessible to most researchers. For the authors, this means that much
of their potential impact is lost. And it is solely this curtailed research
impact and access that is being purchased by the collective $2,000 outlay
per article mentioned above.
<p>This is the way things had to be in the past, when print-on-paper was
the only publishing medium, and the sizeable costs of printing and distribution
had to be recovered somehow. The new online era may be threatening the
majority, royalty/fee-based literature (books, magazine articles) in the
form of digital piracy; but for the 'giveaway' research literature, it
has at last made it possible to eliminate all those counterproductive access/impact
barriers.
<p>Not all costs have vanished, of course. Although the costs of printing
and distribution (and their online successors, such as publishers' PDF
page-images) are no longer essential ones, the cost of the quality-control
and certification that differentiates the refereed literature from an unfiltered,
anarchic vanity press still needs to be paid. Paper and PDF files have
become mere options, purchasable by those who want and can afford them.
Refereeing, however, is essential.
<p><span class="bodytwo"><b>Essential costs of refereeing</b></span>
<br>Refereeing (peer review) is the system of evaluation and feedback by
which expert researchers assure the quality of each other's research findings.
Referees' services are donated free to virtually all scientific journals,
but there is a real cost to implementing the refereeing procedures, which
include archiving submitted papers on a website; selecting appropriate
referees; tracking submissions through rounds of review and author revision;
making editorial judgments, and so on.
<p>The minimum cost of refereeing has been estimated as $500 per accepted
article (see <a  href="http://documents.cern.ch/archive/electronic/other/agenda/a01193/a01193s5t11/transparencies/">slideshow</a>),
but that figure almost certainly has inessential costs wrapped into
it (for example, the creation of the publisher's PDF). I think that the
true figure for peer-review implementation alone across all refereed journals
probably averages closer to $200 per article, or even lower. Hence, quality-control
costs account for only 10% of the collective tolls actually being paid
per article.
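<p>The arithmetic behind that 10% figure can be checked with a minimal
sketch. It uses only the two per-article estimates quoted above; the variable
names are illustrative, and the $200 figure is the author's rough estimate
rather than a measured cost.
<pre>
```python
# Per-article figures quoted in the text (US dollars).
TOLLS_PER_ARTICLE = 2000       # collective access tolls paid
REFEREEING_PER_ARTICLE = 200   # rough estimate, peer review only

# Share of today's tolls that pays for quality control.
share = REFEREEING_PER_ARTICLE / TOLLS_PER_ARTICLE

# What would remain per article if only refereeing were paid for.
saving = TOLLS_PER_ARTICLE - REFEREEING_PER_ARTICLE

print(f"quality-control share: {share:.0%}")
print(f"potential saving per article: ${saving:,}")
```
</pre>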
<p>Can this situation, in which the authors' and referees' giveaways are
needlessly being held hostage to obsolete printing costs and cost-recovery
methods, be remedied? Note that it is not simply a matter of lowering the
financial access barriers: even if those were slashed by 90%, most researchers
would still be unable to access most research papers. There is an optimal
solution, and it is inevitable: the refereed research literature must be
freed online for everyone, everywhere, for ever. The irreducible 10% or
so quality-control cost need no longer be paid for by readers' institutions;
it can be paid in the form of quality-control service costs, per paper
published, by authors' institutions, out of their savings on subscription
costs.
<p>Journal publishers certainly will not scale down to becoming only quality-control
providers of their own accord. Nor can libraries effect such a transition
on their own. And authors cannot and should not be expected to stop submitting
their research to established high-quality, high-impact journals in favour
of new, alternative journals just because those are prepared to provide
stand-alone quality control right now. Journal niches are largely filled
already, and immediate careers and standing are far more important to researchers
than the potential long-term benefits of risky sacrifices.
<p>But researchers can hasten the optimal and inevitable outcome without
any sacrifice or risk. The entire refereed journal literature can be freed,
virtually overnight, without authors having to give up their established
refereed journals, by a method that a portion of the physics community
has already shown to work. These physicists have since 1991 been publicly
self-archiving their research papers online, both before and after refereeing
(preprints and postprints), in the physics '<a  href="http://www.arxiv.org" target="_blank">eprint
archive</a>'. This archive currently holds 150,000 articles. The number
of new articles being self-archived there is currently about 30,000 annually,
and increasing by some 3,500 papers each year. The archive, with its 14
mirror-sites world-wide, gets about 160,000 user hits each weekday at its
US site alone. So there is no doubt that self-archiving is feasible, and
that when papers are thus made freely accessible online, they are heavily
used.
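<p>A rough projection of the archive's holdings, assuming the figures just
quoted hold steady: 150,000 articles held, about 30,000 new deposits in the
coming year, and yearly deposits rising by about 3,500. The function name and
the five-year horizon are illustrative choices, not part of the archive's
own reporting.
<pre>
```python
def project_holdings(held, per_year, yearly_increase, years):
    """Return projected total holdings at the end of each year."""
    totals = []
    for _ in range(years):
        held += per_year              # this year's deposits
        totals.append(held)
        per_year += yearly_increase   # deposits grow linearly
    return totals

projection = project_holdings(150_000, 30_000, 3_500, years=5)
for year, total in enumerate(projection, start=1):
    print(f"after year {year}: about {total:,} articles")
```
</pre>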
<p>But although these physicists have shown the way to free the refereed
research literature, authors in other disciplines have been slow to realize
that the system can work for them too. They have assumed that there must
be something unique about physics that makes self-archiving work. This
misapprehension has been encouraged by the incorrect impression that the
physics archive contains only unrefereed preprints, and that self-archiving
somehow compromises the quality control of journals.
<p>Yet absolutely nothing has changed in peer review in physics. The same
authors who self-archive continue to submit all their papers to their journals
of choice, just as they always did, and virtually all the papers in the
archive appear in refereed journals about 12 months after journal submission.
The only thing that has changed is that a growing portion of the refereed
literature in physics is accessible, free for all, online. Yet even in
physics, self-archiving is still growing far too slowly: at the present
linear growth rate it will be another decade before the entire physics
literature is online and free.
<p><span class="bodytwo"><b>Institution-based self-archiving</b></span>
<br>There is now a way both to accelerate the rate of self-archiving in
physics and to extend the practice to the other disciplines (see <a  href="#B1">The
transition scenario</a>). My original 'subversive proposal'<sup>2</sup> to free the
refereed literature through author self-archiving fell largely on deaf
ears because self-archiving in an anonymous FTP archive or a web home page
would be unsearchable, unnavigable, irretrievable and hence unusable. Nor
has centralized archiving, even when made available to other disciplines,
been catching on fast enough (it has taken three years for the number
of articles in <a  href="http://cogprints.soton.ac.uk" target="_blank">cogprints</a>
to reach 1,000).
<p>The new breakthrough is agreement on metadata tagging standards that
make the contents of distributed archives interoperable, hence harvestable
into one global virtual archive, all papers searchable and retrievable
by everyone for free. The <a  href="http://www.openarchives.org" target="_blank">open
archives initiative</a> (OAI) has now provided the metadata tagging standards
and a registry for all OAI-compliant eprint archives. The <a  href="http://www.eprints.org" target="_blank">self-archiving
initiative</a> is providing free software for institutions to create OAI-compliant
archives, interoperable with all other open archives, ready to be registered
and for their contents to be harvested into searchable global archives,
interlinked to one another by citations (see <a  href="http://cite-base.ecs.soton.ac.uk/cgi-bin/search" target="_blank">citebase</a>).
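<p>The idea of harvesting interoperable archives into one searchable
collection can be sketched as follows. This is a toy model, not the OAI
protocol itself: the archive names, records and field names are all
hypothetical, and a real harvester would fetch the records over the network
from registered archives.
<pre>
```python
# Toy model: each institutional archive exposes records that
# share the same metadata fields (all data here is invented).
ARCHIVES = {
    "archive-a.example": [
        {"id": "a:1", "title": "Symbol grounding",
         "creator": "Author One"},
    ],
    "archive-b.example": [
        {"id": "b:7", "title": "Grounding in robots",
         "creator": "Author Two"},
        {"id": "b:8", "title": "Quark masses",
         "creator": "Author Three"},
    ],
}

def harvest(archives):
    """Merge every archive's records into one virtual archive."""
    merged = []
    for source, records in archives.items():
        for record in records:
            # Keep track of which archive holds the full text.
            merged.append({**record, "source": source})
    return merged

def search(index, term):
    """Return records whose title contains the search term."""
    term = term.lower()
    return [r for r in index if term in r["title"].lower()]

global_index = harvest(ARCHIVES)
for hit in search(global_index, "grounding"):
    print(hit["source"], hit["id"], hit["title"])
```
</pre>
<p>Because every archive in the sketch uses the same field names, the search
needs no per-archive special cases; that shared vocabulary is what agreed
metadata standards are meant to provide.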
<p>Distributed, institution-based self-archiving benefits research institutions
in three ways. First, it maximizes the visibility and impact of their own
refereed research output. Second, by symmetry, it maximizes their researchers'
access to the full refereed research output of all other institutions.
Third, institutions themselves can hasten the transition to self-archiving
and so more quickly reduce their library's annual serials expenditures
to about 10% of their present level (paid to journal publishers for refereeing their submissions).
<p>The institutional library can help researchers to do self-archiving
and can maintain the institution's own eprint archives as an outgoing refereed
collection for external use, in place of the old incoming collection bought
by subscription for internal use. Institutional library consortial power
can also be used to provide leveraged support for journal publishers who
commit themselves to a timetable of downsizing on the way to becoming pure
quality-control service providers (see <a  href="http://www.arl.org/sparc/home/index.asp" target="_blank">SPARC</a>).
<br>&nbsp;
<br>&nbsp;
<p><span class="bodytwo"><b>References</b></span>
<p>1. Odlyzko, A. M. "The economics of electronic journals" in Technology
and Scholarly Communication (eds Ekman, R. &amp; Quandt, R.) 380-393 (Univ.
Calif. Press, Berkeley, 1998). <a  href="http://www.press.umich.edu/jep/04-01/odlyzko.html" target="_blank">http://www.press.umich.edu/jep/04-01/odlyzko.html</a>
<p>2. Harnad, S. "Universal FTP archives for esoteric science and scholarship:
a subversive proposal" in Scholarly Journals at the Crossroads: A Subversive
Proposal for Electronic Publishing (eds Okerson, A. &amp; O'Donnell, J.)
1 (Association of Research Libraries, Washington DC, 1995). <a  href="http://www.arl.org/scomm/subversive/toc.html" target="_blank">http://www.arl.org/scomm/subversive/toc.html</a>
<br>&nbsp;</td>
</tr>

<tr>
<td BGCOLOR="#FFFFCC"><span class="bodytwo"><a NAME="B1"></a><b>The transition
scenario</b></span>
<p>As soon as all refereed journal articles are self-archived by their
authors in their institution's eprint archive, the literature is freed
from all access barriers and impact barriers. Self-archiving could be done
virtually overnight. The day after, all refereed research becomes freely
accessible online to researchers the world over.&nbsp;
<p>One possible outcome is that this will be the end of it. The refereed
literature will be free online for those who want it and cannot get it
any other way, but those who can afford to get it the old way via paying
journals will continue to do so. In this event, the access/impact problem
will be solved, but the library's budget crisis will not: it will simply
become less important.
<p>An alternative outcome is that when the refereed literature is accessible
online for free, users will prefer the free version (as so many physicists
already do). Journal revenues will then shrink and institutional savings
grow, until journals eventually have to scale down to providing only the
essentials (the quality-control service), with the rest (paper version,
online PDF version, other 'added values') sold as options.
<p>In neither of these outcomes is peer review itself compromised or put at
risk; nor do authors have to give up, even temporarily, submitting to their
established journals of choice. All they have to do is self-archive their
preprints and postprints in their institutional eprint archives.
<p>Nor are copyright restrictions an obstacle to self-archiving: preprints
can be self-archived without any restriction at the time the paper is submitted
to a journal. When the final draft is accepted, authors can ask the journal
to let them retain the right to give away that draft online by self-archiving
it. In practice, many publishers will agree to this if the author asks,
although most do not publicly state it as policy. For these papers, the
author can self-archive the refereed postprint alongside the pre-refereeing
preprint(s). For those publishers who insist that all rights are transferred,
authors can sign the agreement and self-archive a linked 'corrigenda' file,
listing for the user what changes have to be made in the preprint to make
it equivalent to the postprint. (See <a  href="http://www.cogsci.soton.ac.uk/~harnad/Tp/resolution.htm#Harnad/Oppenheim" target="_blank">copyright details</a>.)</td>
</tr>

<tr>
<td>&nbsp;</td>
</tr>
</table>

</body>
</html>
