Harnad, S. (2003) Electronic Preprints and Postprints. Encyclopedia of Library and Information Science. Marcel Dekker, Inc.
http://www.ecs.soton.ac.uk/~harnad/Temp/eprints.htm


Eprints: Electronic Preprints and Postprints


Stevan Harnad

Intelligence/Agents/Multimedia Group
Department of Electronics and Computer Science
University of Southampton Highfield, Southampton
SO17 1BJ UNITED KINGDOM
harnad AT ecs.soton.ac.uk
http://www.cogsci.soton.ac.uk/~harnad/


SUMMARY: Preprints are drafts of a research paper before peer review and postprints are drafts of a research paper after peer review. Researchers have always given away their preprints and postprints in order to increase the impact of their work. The online age has at last made it possible for researchers to maximize their work's visibility, usage and impact by self-archiving their preprints and postprints in institutional Eprint Archives, making them openly accessible to all would-be users worldwide.

Research Papers Before and After Peer-Review

To understand what a preprint is, we first need to understand what a peer-reviewed journal article is. There are 20,000 peer-reviewed journals, in all disciplines and in most of the world's languages. They publish scholarly and scientific research that has been reviewed or "refereed" by qualified experts ("peers") to evaluate whether it meets that particular journal's established quality standards. The verdict of the referees can be to accept, to reject, or to ask the author to revise and resubmit.

A successfully revised article is accepted and published in the journal, and the author (in paper days) used to receive, or could order, various quantities of "reprints" or "offprints" of that article for distribution to any interested users. The Institute for Scientific Information's "Current Contents" magazine has for decades been publishing the contents pages of the top 6500 of the world's 20,000 refereed journals, along with the authors' addresses, so potential users can write to the author to request a reprint.

These days, the reprint will often be available in electronic form. This has come to be called an "eprint," but the term "eprint" also applies to earlier versions of the paper, including the unrefereed, not yet revised or accepted draft that was originally submitted to the journal for peer review. That draft is called the "preprint." In reality, there might be a succession of revised drafts, all preprints, until the final accepted draft. Moreover, even after publication, the draft might be further revised to correct errors and add postpublication updates. The critical milestone is the publication itself. Let us agree to call all prepublication drafts "preprints" and all postpublication drafts (including the official, accepted, published draft itself) "postprints." Eprints are either preprints or postprints in electronic form.

The Physics Preprint Culture

Disciplines differ in how much emphasis they place on preprints. Traditionally, certain areas of physics have made intensive use of preprints. Even before the advent of electronic texts, high energy physics, for example, was a preprint-intensive culture. Researchers would share their findings with one another even before they had been refereed, mailing preprints to individuals as well as to institutional distribution points and citing one another's work in preprint form when the postprint was not yet available.

It must be noted, however, that apart from this much more extensive use of the preprint phase of research findings, physicists have always continued to rely on peer review, just as every other discipline did. Virtually all preprints were submitted to journals for peer review, and most of them were eventually accepted for publication in some journal, though perhaps after various degrees of revision. There is a quality hierarchy of journals in all disciplines and all specialty areas, with the highest-quality, highest-rejection-rate, and highest-impact-factor (most cited) journals at the top, grading down into lower and lower quality journals, until something near a vanity press at the bottom. The journal name is the known quality-certification tag that researchers rely upon in deciding what is worth reading, trusting and trying to build upon in a literature that has otherwise grown too large to navigate any other way.

There is some evidence that physics not only has more of a preprint culture but also has a lower rejection rate than other fields -- not because its work is of lower quality, but because physicists may be more realistic about the appropriate quality level of their work. They submit it directly to the suitable journal, rather than submitting it to the highest-quality journal first and then, after each rejection, to a lower-level one until they find one that accepts it. As a consequence, the preprint in some areas of physics may be closer to the postprint, again because the authors prepared it more realistically, rather than waiting to have the referees tell them what else they needed to do to improve it. Nevertheless, all papers in physics are submitted for peer review, and published in peer-reviewed journals, just as in other disciplines.

These details are given here because many (including physicists) have concluded from the growth of the preprint culture, and then of the eprint culture, in physics that peer review was becoming obsolete there. Not only is there no evidence that this is the case, but there are good reasons to believe it is unlikely ever to be the case. Even a law-abiding neighbourhood still needs police, human nature being what it is. It is the authors' knowledge that their preprints are all destined to be answerable to the "invisible hand" of peer review that keeps preprints in physics to their high standards.

But it was no doubt because it was a preprint culture already that the transition to eprints happened soonest and fastest in physics. Once email became available, it immediately made sense to distribute preprints in that new medium rather than through slower and more costly paper dissemination via mail; and then, once the Web was available, there was no need to email the text, just to deposit it in the Web Archive and alert interested subscribers about its URL through email or alerting lists: http://arxiv.org

From the very beginning of electronic preprints in physics, however, there were electronic postprints as well (with a delay of 8-12 months between preprint and postprint, during which the refereeing, revision, and eventual acceptance took place). Once a final draft was accepted, authors would either deposit the postprint in the Physics Archive or, if the changes had only been minor or trivial, sometimes they would merely update the reference data for their preprint, putting in its full publication details for bibliographic citation.

The Physics Eprint Archive has been growing steadily since its birth in 1991, but its growth has been only linear. This was fast enough so that within a few years virtually the entire annual high energy physics output was archived there and accessible to all, but if we extrapolate the linear growth we find that it will still take another 10 years (i.e., till the year 2011) for the peer-reviewed research literature (of that year) in all areas of physics to at last be freely accessible online.
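The extrapolation described above is simple arithmetic and can be sketched as follows. This is only an illustration of the calculation: the function name and the percentage figures are hypothetical placeholders, not actual archive statistics.

```python
def years_until_full_coverage(current_share, target_share, annual_increase):
    """Years needed for a linearly growing share to reach the target.

    All three arguments use the same unit (here: percent of a year's
    published papers that are deposited in the archive).
    """
    if annual_increase <= 0:
        raise ValueError("linear growth requires a positive annual increase")
    remaining = max(target_share - current_share, 0)
    return remaining / annual_increase


# Hypothetical placeholder figures, chosen only to show the arithmetic:
# 40% coverage today, growing by 6 percentage points per year.
years = years_until_full_coverage(current_share=40, target_share=100, annual_increase=6)
print(years)  # 10.0
```

Under linear growth the waiting time shrinks only in proportion to any increase in the annual deposit rate, which is why the rate itself is the quantity to accelerate.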

Before we consider the reasons and possible remedy for this, we need to consider why open online access to the peer reviewed research literature -- not only in physics, but in all fields of research -- is so important to have.

Free On-Line Full-Text Access to Peer-Reviewed Research

Fully explaining why and how free online access is the optimal and inevitable solution for this special literature (of 20,000 peer-reviewed journals) would go beyond the scope of this article, but it is based on two factors:

(1) All peer-reviewed research is and always has been an author/institution give-away: The first factor is that this literature differs from most other literatures in that it is all, without exception, an author give-away: Not one of the authors of the annual 2 million refereed articles seeks royalties or fees in exchange for his text. The only thing they seek is as many readers and users as possible, for it is the research impact of their articles -- of which a rough measure is the number of times each is cited -- that brings these authors and their institutions their rewards (employment, promotion, tenure, grants, prizes, prestige). Subscription/license sales revenue brings these authors nothing: on the contrary, these toll-based access-barriers, because they are also impact-barriers, are in direct conflict with the interests of research and researchers. And that is also the second factor:

(2) Unaffordable access-tolls block the potential usage and impact of give-away peer-reviewed research: Most of the 2 million refereed research articles published annually are inaccessible to most of their would-be users worldwide because their institutions cannot afford to pay the access tolls. No institution can afford anywhere near all of the 20,000 journals in which they appear and most can afford only a small and shrinking proportion of them. All of that missed potential impact is a loss not only for researchers, their careers and their institutions, but for research itself, whose progress depends on researchers' accessing, using and building upon one another's findings.

The physicists have found the way around the access-barriers, but their way is not growing quickly enough, and is not spreading to other disciplines quickly enough. Exactly what is the physicists' way, and what might be wrong (or not right enough) about it? What the physicists are doing is self-archiving their preprints and postprints in a central, discipline-based Eprint Archive. But a discipline is not an interested party, insofar as research progress is concerned. It is researchers and their institutions who have interests. It is researchers and their institutions who share the publish-or-perish rewards of research impact (research funding, prizes, prestige). Hence institutional self-archiving is the natural complement to discipline-based self-archiving. Institutions can provide both the incentives and the means to spread the practice of self-archiving across disciplines and to accelerate its rate:

How institutions are facilitating the filling of Eprint Archives across disciplines:

(1) Installation of "OAI-compliant" Eprint Archives: Free software has been designed that creates Eprint Archives http://www.eprints.org that conform to the Open Archives Initiative (OAI) protocol for metadata harvesting http://www.openarchives.org. This guarantees that all such Eprint Archives will be "interoperable," as if all the papers therein were in one seamless global archive, accessible to and navigable by everyone, everywhere.

(2) Adoption of a university-wide policy that all faculty maintain and update a standardised online curriculum vitae (CV) for annual review.

(3) Mandating that the full digital text of all refereed publications should be deposited in the University Eprint Archives and linked to their entry in the author's online CV. (It is made clear to all faculty how self-archiving is in the interest of their own research and standing, maximizing the visibility, accessibility and impact of their work.)

(4) Offering trained digital librarian help in showing faculty how to self-archive their papers in the university Eprint Archive (it is very easy).

(5) Offering trained digital librarian help in doing "proxy" self-archiving, on behalf of any authors who feel that they are personally unable (too busy or technically incapable) to self-archive for themselves. They need only supply their digital full-texts in word-processor form: the digital archiving assistants can do the rest (usually only a few dozen keystrokes per paper).

(The most important ingredients in a successful self-archiving program are a policy of mandated self-archiving for all refereed research and a trained proxy self-archiving service, which ensures that lack of time or skill does not become grounds for non-compliance. The proxy self-archiving will only be needed to set the first wave of self-archiving reliably in motion. The rewards of self-archiving -- in terms of visibility, accessibility and impact -- will maintain the momentum once the archive has reached critical mass. And even students can do for faculty the few keystrokes needed for each new paper thereafter.)

(6) Digital librarians, collaborating with web system staff, provide the proper maintenance, backup, mirroring, upgrading, and migration that ensure the perpetual preservation of the university Eprint Archives. Mirroring and migration are handled in collaboration with counterparts at all other institutions supporting OAI-compliant Eprint Archives.
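To make the interoperability described in step (1) concrete, here is a minimal sketch of how a harvester might query an OAI-compliant archive and read the titles out of its reply. The "ListRecords" verb, the "oai_dc" metadata prefix and the two XML namespaces come from the OAI protocol for metadata harvesting; the archive address, the function names and the sample response are hypothetical, and no network request is made.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI_NS = "http://www.openarchives.org/OAI/2.0/"
DC_NS = "http://purl.org/dc/elements/1.1/"


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build a ListRecords request URL for an archive's OAI base URL."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


def extract_titles(xml_text):
    """Return the Dublin Core titles of all records in an OAI response."""
    root = ET.fromstring(xml_text)
    titles = []
    for record in root.iter(f"{{{OAI_NS}}}record"):
        for title in record.iter(f"{{{DC_NS}}}title"):
            titles.append(title.text)
    return titles


# Hypothetical response, shaped like a real ListRecords reply but invented
# for this example.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:archive.example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An Example Postprint</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

print(list_records_url("http://archive.example.org/oai"))
print(extract_titles(SAMPLE_RESPONSE))
```

Because every compliant archive answers the same request in the same format, a harvester can run the same two steps against each institution's archive and merge the results into one searchable collection.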

Free online access to all refereed research output is both optimal and inevitable. Once the preprints and postprints of all 2 million articles appearing annually in the world's 20,000 peer-reviewed journals are openly accessible, research progress will become much more rapid and interactive: every article will be hyperlinked directly to each article it cites <http://opcit.eprints.org>, new forms of peer-commentary journals (e.g. <http://psycprints.ecs.soton.ac.uk/> and <http://www.bbsonline.org/>) will promote and preserve research's collective, cumulative and self-corrective cycles of interaction, and new scientometric search engines (e.g. <http://citebase.eprints.org> and <http://citeseer.nj.nec.com/cs>) will provide rich new measures and predictors of research usage, direction and impact.
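As a rough illustration of the simplest such measure, the sketch below counts how often each paper is cited once citation links are available. It is not a description of how any of the services named above work; the function name and the paper identifiers are hypothetical.

```python
from collections import Counter


def citation_counts(citations):
    """Count incoming citations per paper.

    `citations` maps each citing paper's identifier to the list of
    identifiers it cites. A citing paper is counted at most once per
    cited paper, and self-citations are ignored.
    """
    counts = Counter()
    for citing, cited_list in citations.items():
        for cited in set(cited_list):
            if cited != citing:
                counts[cited] += 1
    return counts


# Hypothetical citation links between three papers.
links = {
    "paper_a": ["paper_b", "paper_c"],
    "paper_b": ["paper_c"],
    "paper_c": [],
}
counts = citation_counts(links)
print(counts.most_common(1))  # [('paper_c', 2)]
```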
