ABSTRACT: The Los Alamos Eprint Archive (LANL) is a public repository for a growing proportion of the current research literature in Physics. The Open Citation-linking Project (OpCit) is making this resource still more powerful and useful for its current physicist users by connecting each paper to each paper it cites; this can be extended to all the rest of the disciplines in other Open Archives designed to be interoperable through compliance with the Santa Fe Convention. A citation-linked online digital corpus also allows powerful new forms of online informetric analysis that go far beyond static citation analysis, measuring researchers' usage of all phases of the literature, from pre-refereeing preprint to post-refereeing postprint, from download to citation, yielding an embryology of learned inquiry.
Current Contents helped enfranchise scientists from (what was then called) the third (and second) world, as well as many from the less prosperous institutions of the first world. By bringing them at least the weekly contents pages of the vast pre-digital journal literature to which their institutions could not afford to subscribe (and for which their research activities scarcely afforded the time for the legwork of shelf-browsing in any case), Gene made it possible to let their fingers do the walking. Reprint requests could be mailed to the authors of the papers that non-first-world scientists needed to read (or, at the better-heeled first-world institutions, secretaries or students could be dispatched to the journal shelves to photocopy them).
But turnaround times were still slow and uncertain: The requested reprints were a long time in coming, if they came at all, in the non-first world; and even in the first, a lot of time and resources were wasted retrieving reprints for which a glance at their abstracts or full texts, once they were in hand, immediately revealed that it had been a false alarm. And even the relevant "hits" had to be committed to growing, groaning reprint shelves in labs, which eventually created storage, navigation and retrieval problems of their own.
Nevertheless, on balance, the increased access to the literature that this system of monitoring and requesting provided was undeniably a benefit to research and researchers, and increased both productivity and impact.
And Gene's other major contribution, Science Citation Index, made it possible to monitor and measure that productivity and impact. Gene Garfield did not invent the "publish or perish" metric of productivity, but he certainly fine-tuned it, with the citation-ratios of papers, authors and journals helping to supply promotion/tenure committees (as well as library serials-selection committees and sometimes even research funding committees) with a greater variety of beans to count -- supplementing the peer-review system itself, which, if I weren't a vegetarian, I would call the real meat of research assessment (Harnad 1998c).
Nor was citation analysis merely an evaluative metric: It was also a way of charting the present, past and future course of research, sorting out the pedigrees of ideas and findings, and in general doing what might be called the quantitative "embryology" of knowledge.
All that, despite the obstacles that both enterprises faced in the papyrocentric Gutenberg era. Journal contents pages had to be gathered, photo-copied, cut/pasted, collated, printed and mailed around the world by ISI every week; citation lists had to be laboriously retyped, compiled, analyzed, printed and again mailed around the world in vast, heavy, cumulative compendia. What a different world it would have been if all those data - journal contents pages and reference lists - had been digital to begin with! Nothing to photo-copy, key-in, scan or cut/paste; and, if we threw in the Net along with the bytes, nothing to print or mail either! And as long as we are digi-dreaming, why not throw in the full texts of all those articles too? Then it is not only ISI that no longer needs to bother with digitizing or mailing, but researchers too can burn all their remaining reprint-request cards -- and free their shelves of offprints!
If the entire journal corpus had been digital and online, far more intelligent and customized automatic alerting services could have been devised than the mere scanning of weekly contents pages; and instead of passively waiting to be alerted, researchers could actively search and navigate the entire journal literature -- not only through subject-, keyword-, and even full-text-searching, but also through citation-searching of a completely interlinked corpus. Searchers could even set thresholds for the impact levels of the papers, authors and journals to which they wished to restrict their search. For the earlier embryological stages of papers, hit-rates (down-load frequencies) for the preprint could supplement citation-rates for the reprint. And this rich, dynamic and growing embryonic corpus would have been the database for Gene's pioneering bibliometric analyses, with online user-based measures such as citation-surfing, down-loading, and hit-immediacy to complement the offline author-based measures such as publishing, citing and citation-immediacy.
This digi-dream is now becoming a reality, thanks to the Open Archives Initiative (http://www.openarchives.org), interoperable open archiving software (http://www.eprints.org) and the Open Citation (OpCit) Linking Project (http://opcit.eprints.org).
To implement this immediately, the
entire preprint and reprint literature would need to be available online
in a usable, unified form. It is not, yet. But in Physics a sufficiently
large and representative subset of it already is (see http://xxx.lanl.gov/cgi-bin/show_monthly_submissions).
So the work on that subset is now
underway, and successful results based on it will not only generalize to
the rest of the literature, once it is all online, but they will help to
draw it all online more quickly.
The LANL Archive represents a substantial
body of literature in Physics, Mathematics and Computer Science, but the
full texts are archived in a variety of forms, from HTML to TeX to PDF
to PS (Figure 1), and the first problem that needs to be solved is designing
a way to integrate and navigate them seamlessly.
Figure 1. File formats in LANL Archive.
One especially important feature of full texts -- their reference list -- is arguably the most natural and powerful way of interconnecting and navigating this literature. The "links" are already provided by the authors themselves, and users already have a long, skilled tradition of navigating with them "offline" (looking up the references on paper).
The Open Journal Project (Carr &
Hitchcock 1995, Evans et al. 1998) and CogPrints (http://cogprints.soton.ac.uk)
successfully used citation linking to interconnect a small but interdisciplinary
"seed" database of full texts in the Cognitive Sciences with a much
larger 10-year set of abstracts and their reference lists from a subset
of the ISI (Institute for Scientific Information, http://www.isinet.com/prodserv/citation/citsci.html)
journal citation database in the Cognitive Sciences (Psychology, Neurobiology,
Computer Science, Linguistics, Philosophy). This work went some way toward
solving the problem of automatically recognizing and linking (within and
between texts) the finite but noisy set of existing citation formats (Hitchcock
et al. 1997a-c, 1998a,b; Giles
et al. 1998; Bollacker
et al. 1998). The reaction of users was exhilaration with citation-based
navigation, but frustration at accessing only abstracts. The obvious conclusion
to be drawn from this was that the real power of citation linking can only
be realized with full-text linking. OpCit is now doing this with the LANL
Archive.
Figure 2. Distribution of "hits" of different kinds.
The LANL Archive has an additional
interest from the standpoint of usage and citation analysis: It consists
of both unrefereed preprints and refereed reprints. Within
one year of being deposited, about 60% of the papers in LANL are updated
to include the full reference to the journal in which they have been accepted
for publication. It is not yet clear how many preprints are updated to
append the full final text of the refereed reprint, but LANL papers are
being updated as many as four times (Figure 3).
Figure 3. Multiple Updates by LANL Subfield.
A new form of citation has also appeared (in both the paper and the online literature): citing the LANL preprint number (Youngen 1998). This will no doubt become a standard practice and must hence be covered as a special case of citation linking, so that links can be dynamically updated and aliased to the common source as soon as the reprint takes the place of the preprint.
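One way to picture this aliasing is a lookup table in which the preprint number and the later journal reference both resolve to the same record. The Python sketch below is illustrative only: the class `AliasTable`, its methods, the record ID and both sample citation keys are hypothetical, and nothing here describes how OpCit or LANL actually store their links.

```python
class AliasTable:
    """Maps every known citation key for a paper to one canonical record ID."""

    def __init__(self):
        self._canonical = {}  # normalised citation key -> canonical record ID

    @staticmethod
    def _normalise(key):
        # Ignore case and all whitespace so trivially different spellings match.
        return "".join(key.split()).lower()

    def register(self, record_id, *keys):
        """Attach one or more citation keys (preprint number, journal ref) to a record."""
        for key in keys:
            self._canonical[self._normalise(key)] = record_id

    def resolve(self, key):
        """Return the canonical record ID for a citation key, or None if unknown."""
        return self._canonical.get(self._normalise(key))


table = AliasTable()
# At deposit time only the preprint number is known (made-up identifier).
table.register("rec-1", "hep-th/9901001")
# Later the journal reference is added as an alias of the same record (made-up reference).
table.register("rec-1", "Phys. Rev. D 59, 123456 (1999)")
```

With a table of this kind, a citation to the preprint number and a citation to the published reference would lead to the same entry, so links made before publication need not be rewritten afterwards.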
The emerging patterns of preprint vs. reprint citation and use are also a natural object of analysis in their own right; they represent a revealing microcosm of the overall transitional process that is taking place as this new medium evolves its own niche in scholarly and scientific research practice - the "scholarly skywriting" continuum - to reveal previously hidden embryological stages of learned inquiry and interaction (Harnad 1990).
LANL itself is upgrading to keep pace with new technical developments and with evolving practices among physicists. But now the adaptation has to go beyond the physics community, which is already accustomed to the present LANL self-archiving interface and procedure, to other disciplines that are familiar neither with LANL nor with eprint (= electronic preprint and reprint) self-archiving.
The Open Archives Initiative (http://www.openarchives.org) has recently formulated the Santa Fe Convention (Van de Sompel & Lagoze 2000), agreeing to adopt a subset of the Dienst Protocol (Halpern & Lagoze 1999) for tagging and sharing metadata to ensure that all compliant archives are interoperable. Eprints (http://eprints.org), in collaboration with OpCit (http://opcit.eprints.org), has accordingly designed and released (free) open-archiving software for creating, customizing and maintaining Santa-Fe-compliant Open Archives. Eprints is designed for adoption by all universities and research institutions worldwide, so they can collectively become a distributed, interoperable, universal archive of the research literature.
The constraint of citation linking itself provides a shared skeletal structure to constrain the design and to unify the format of open archiving: The full range of variation in citation formats exists in all disciplines; hence this skeletal structure must be extractable from all texts, in a form that can be used for hypertext linking. The adaptations (in both the author interface and the infrastructure for depositing texts) dictated by the need to extract and link citations in the texts, their reference lists, and the texts they cite, for all formats, will then constrain the drafting of future texts, the formats in which authors are encouraged to submit them, and the way those formats are processed by the Eprints software. In other words: whatever it takes to make all deposits interoperable specifically for citation extraction and linking will also help to make them interoperable in other respects, because citation linking is a representative microcosm of text interlinking in general.
To the extent that these citation-specific
adaptations influence author practices, they should also help to speed
the standardization of formats and procedures that will eventually converge
on the optimal online universal resource for the learned research community
(Harnad 1999).
OpCit is making it possible to retrieve
LANL papers in a citation-linked format (currently HTML or PDF), so that
once a user has retrieved an entry-level paper, navigation of the entire
archive can continue via citation-links, with no need to launch another
top-down search (although the top-down capabilities -- keyword, author,
and eventually even full-text search of the archive -- continue to be available
at the paper level). Heuristics and algorithms for content classification
have been developed to do the citation linking (Hitchcock et al. 1997b;
Giles et al. 1998).
One of the many advantages of extending this work to LANL is that partial results can be used to hasten progress toward fuller results: A subset of the LANL Archive can already be fully citation-linked immediately, using the current linking tools, namely, those papers that have correctly specified, well-formed bibliographic citations and have been typeset by software which maintains the textual contents of the page. That subset has now been fully linked to all papers it cites that are likewise in the Archive (including those that are not yet themselves part of the fully interlinkable subset because their own references cannot be further linked outwards: their titles, author-names, abstracts and keywords are nevertheless enough to find and link inward into them).
Users will accordingly have a chance to experience and compare functionality under two conditions: when they retrieve papers whose full texts are also linked (http://arabica.ecs.soton.ac.uk), and when they retrieve papers that are dead-ends, like the abstracts that frustrated the users of the ISI/Open Journal Cognitive Science database (http://journals.ecs.soton.ac.uk/TryOJ.htm).
[Heavily used Open Archives like LANL allow author and user culture to evolve and converge very rapidly under the pressure of collective feedback about impact barriers: An attempt to retrieve a dead-end paper can be made to trigger an automatic email to the author of that paper indicating that a user has tried to "cite-visit" it, but that this was not possible because the author has not yet provided a version from which linkable citation data could be derived. This could be accompanied by clear instructions on how to self-archive such a version now; authors could indicate whether they wanted to see such access-failure reports for their deposited work instantly, weekly, monthly, semi-annually, or never; they could of course also request successful "hit" statistics, thereby self-monitoring their brainchildren from their earliest embryological stages onward; Figure 4.]
Figure 4. Age of paper at download minus age of paper at citation. The negative region of the graph is papers that were downloaded first, then cited; the peak around zero is papers that were both downloaded and cited within the same short interval, and the positive region is papers that were downloaded after being cited.
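The quantity plotted in Figure 4 can be illustrated with a short calculation. The Python sketch below assumes that each paper contributes one download date and one citation date, both measured from its deposit date; the source does not say which events were paired, so the function name, its arguments and the sample dates are all hypothetical.

```python
from datetime import date


def download_minus_citation_age(deposited, downloaded, cited):
    """Age of paper at download minus age of paper at citation, in days.

    Negative: the paper was downloaded before it was cited.
    Positive: the paper was downloaded after it was cited.
    """
    age_at_download = (downloaded - deposited).days
    age_at_citation = (cited - deposited).days
    return age_at_download - age_at_citation


# Made-up dates: one paper downloaded before its citation, one downloaded after.
early = download_minus_citation_age(date(1999, 1, 1), date(1999, 1, 11), date(1999, 3, 2))
late = download_minus_citation_age(date(1999, 1, 1), date(1999, 6, 1), date(1999, 3, 2))
```

Because both ages are counted from the same deposit date, the difference reduces to the download date minus the citation date, which is why the sign alone tells which event came first.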
This double inducement -- (i) from
experience as a user, able to fully cite-navigate some papers but unable
to do so with other papers because they had not yet been archived in a
form that could be citation linked, and (ii) from experience as an author,
learning of unsuccessful attempts to cite-navigate through one's work via
citation links -- should help to accelerate and focus changes in author
practices that will in turn increase the ratio of usable documents even
while OpCit is still working directly on extending the reference link extraction
tools beyond the immediately linkable subset in the current corpus.
For example, author-end citation patterns
are being analyzed to determine the scope of the LANL Archive: What proportion
of citations point to current papers that are in LANL? what proportion
to current papers that are not in LANL? or to papers that pre-date LANL
(Table 1)? to books? to papers in their unrefereed preprint form? to papers
in their final published form (Figure 5)? how do these patterns change
as the Archive's holdings grow, as its user-base grows, as its years of
coverage grow?
Table 1. LANL Citations to pre-LANL ("Antique") Papers
Figure 5. Citations to papers with
and without journal reference (Hep-th 1998-2000).
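The proportions asked about above can be computed by sorting each extracted citation into a coarse category and counting. The Python sketch below is only an illustration under stated assumptions: the category names, the record layout (`id`, `year`), the cut-off year and the sample data are hypothetical and are not the analysis behind Table 1 or Figure 5.

```python
from collections import Counter

ARCHIVE_START_YEAR = 1991  # assumed cut-off for "pre-archive" papers


def classify(citation, archive_ids):
    """Assign one citation to a coarse category."""
    if citation["id"] in archive_ids:
        return "in_archive"
    if citation["year"] < ARCHIVE_START_YEAR:
        return "pre_archive"
    return "not_in_archive"


def proportions(citations, archive_ids):
    """Return the share of citations falling into each category."""
    counts = Counter(classify(c, archive_ids) for c in citations)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {category: n / total for category, n in counts.items()}


# Made-up citation records, two of which are held in the archive.
sample = [
    {"id": "a", "year": 1998},
    {"id": "b", "year": 1985},
    {"id": "c", "year": 1997},
    {"id": "d", "year": 1999},
]
shares = proportions(sample, archive_ids={"a", "d"})
```

Repeating the count for successive years of deposits would show how the shares move as the holdings grow.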
Reader-end citation-based navigation
patterns are being analyzed to determine how Open Archives are used. This
is entirely new informetric territory, because citation searching could
only be done off-line until now, so there was no automatic way to analyze
how readers actually go about it. Such data will be used to provide feedback
for optimizing the features of the Eprints interface (4.2), to monitor
and document open archiving and citation navigating practices, and to chart
the course of both knowledge creation and use along the entire scholarly
skywriting continuum (Figure 6).
Figure 6. Download Frequency vs.
Citation Frequency Across Time: After the initial download peak, the
(eventually) more highly cited papers show higher and more sustained
download frequency.
The general applicability of these techniques to interoperable digital library architectures (Lagoze & Payette 1998; Leiner 1998) is also being investigated. The Santa Fe Convention is establishing a set of standards for low-level interoperability, i.e., a means of communicating meta-data and meta-information not only between the existing mirror servers within the current network of online archives, but also between that network and other resources.
In particular, the problem of citations
that are associated once and for all with destination URLs must be addressed.
For practical flexibility, the recognition and analysis of citation information
must be separate from document format convention or locus (e.g., centralized
discipline-specific Open Archives like LANL or CogPrints, distributed university-based
Open Archives like Eprints, and primary/secondary publisher archives or
aggregation agents to which the citations may eventually be linked, perhaps
with the help of the emerging scholarly publishing standard, SLinkS [Hellman
1998]). It is also currently impossible to convert mathematical markup
to HTML; MathML (an application of XML) is on the horizon and promises to
improve this situation.
Some commercial publishers have now
started to provide citation links between articles in their proprietary
journal databases and online bibliographic services. Partners in the Open
Journal Project, BioMedNet (Hitchcock et
al. 1998a) and ISI (Hitchcock et al. 1998b),
were among the first to implement such links, along with The
Institute of Physics, which developed its HyperCite service (IOP Publishing
1996). Centralized discipline-based Open Archives like LANL and CogPrints,
however, do not have any of the financial firewall and access-barrier problems
that arise between proprietary databases; and in physics LANL has much
more comprehensive, self-contained coverage of the current literature.
This will also be true of the distributed Open Archives created through
the institutional author self-archiving initiative using Eprints (Harnad
1999).
The Distributed Link Service (DLS), which applied these links, was a WWW implementation of the hypertext techniques that had previously been demonstrated in Southampton University's Microcosm research environment (Carr et al. 1995, 1996a, 1998a). It made use of a WWW proxy environment to add links to HTML or PDF documents while they were delivered from a digital library (in plain, unlinked form) to a user's browser (with links integrated into them). The DLS software used various modules, called "agents" (because they have an "expertise" at automatically recognising particular kinds of information in the document) (Carr et al. 1998b):
7.1 Key-Word Agent. The keyword-agent is very simple and uses various databases of stand-alone links (which can be key words or other text strings), attaching them to the papers whenever those strings appear.
7.2 Name-Agent. The name-agent looks for different appearances of a name [e.g. "Eugene Garfield", "Garfield, E." or "Garfield et al."], possibly in a specified context.
7.3 Citation-Agent. The citation-agent recognises occurrences of citations in academic papers in a large (and extendable) variety of formats and analyses their contents to determine author, year, publisher, page range and the like. It uses this information from each citation to perform a lookup in a bibliographic database and to add a link to either the online full text of the cited article (if the database shows that it is available) or to the bibliographic record consisting of abstract and citations (from the ISI database).
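The citation-agent's two steps, parsing a citation and then looking it up, can be sketched as follows. This is a minimal Python illustration that handles a single citation layout; the regular expression, the function names, the database layout and the example.org URL are all assumptions, and the DLS agents themselves recognise many more formats than this.

```python
import re

# One assumed layout: "Surname, I. (YEAR) Title. Journal VOL: FIRST-LAST"
CITATION = re.compile(
    r"(?P<author>[A-Z][A-Za-z'-]+),\s*(?:[A-Z]\.\s*)+"
    r"\((?P<year>\d{4})\)\s*"
    r"(?P<title>.+?)\.\s+"
    r"(?P<journal>.+?)\s+(?P<volume>\d+):\s*(?P<first>\d+)\s*-\s*(?P<last>\d+)"
)


def parse_citation(text):
    """Extract author, year, title, journal, volume and page range, or None."""
    match = CITATION.search(text)
    if match is None:
        return None
    fields = match.groupdict()
    return {
        "author": fields["author"],
        "year": int(fields["year"]),
        "title": fields["title"],
        "journal": fields["journal"],
        "volume": int(fields["volume"]),
        "pages": (int(fields["first"]), int(fields["last"])),
    }


def link_for(citation, database):
    """Prefer the full-text URL; fall back to the bibliographic record."""
    key = (citation["author"].lower(), citation["year"], citation["pages"][0])
    record = database.get(key)
    if record is None:
        return None
    return record.get("full_text_url") or record["record_url"]


# Hypothetical bibliographic database with no full text for this entry.
database = {
    ("garfield", 1955, 108): {
        "full_text_url": None,
        "record_url": "http://example.org/record/garfield1955",
    }
}
parsed = parse_citation(
    "Garfield, E. (1955) Citation Indexes for Science: A New Dimension in "
    "Documentation through Association of Ideas. Science 122: 108-111"
)
link = link_for(parsed, database)
```

Keying the lookup on surname, year and first page is one simple choice that tolerates differences in how titles and journal names are abbreviated; other keys would work as well.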
Link Harvester: A stand-alone program that extracts pre-existing links from a document and adds them into a database. A new link-free version of the document is also generated.
Link Interpolater: A stand-alone program that inserts links from a database into a document. If different sets of databases are selected, the same document can be linked into different navigation strategies (e.g., citation, keyword, overlaid subject index).
Citation Harvester: A stand-alone program that extracts citations from a paper's reference lists for storage in a database.
Citation Interpolater: A stand-alone program that inserts links into a document based on the contents of a citation database. Links can be added to other documents in the same archive, to documents in other archives, or to generic bibliographic citation databases (such as ISI's Web of Science).
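The harvest-then-interpolate cycle described above can be sketched in a few lines. The Python example below is a simplified illustration, not the DLS code: it assumes plain `<a href="...">` anchors with no other attributes, treats the link database as a dictionary from phrase to URL, and uses hypothetical function names and example.org URLs.

```python
import re
from html import escape

# Matches only simple anchors of the form <a href="URL">text</a>.
ANCHOR = re.compile(
    r'<a\s+href="(?P<href>[^"]*)"\s*>(?P<text>.*?)</a>',
    re.IGNORECASE | re.DOTALL,
)


def harvest_links(html):
    """Return (link_free_html, links), where links maps anchor text to URL."""
    links = {}

    def strip(match):
        links[match.group("text")] = match.group("href")
        return match.group("text")

    return ANCHOR.sub(strip, html), links


def interpolate_links(text, links):
    """Wrap each known phrase of a link-free text in an anchor."""
    if not links:
        return text
    # Try longer phrases first so a short phrase cannot split a longer one.
    phrases = sorted(links, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(p) for p in phrases))

    def wrap(match):
        url = escape(links[match.group(0)], quote=True)
        return '<a href="{}">{}</a>'.format(url, match.group(0))

    return pattern.sub(wrap, text)


document = 'See <a href="http://example.org/microcosm">Microcosm</a> and the DLS.'
plain, harvested = harvest_links(document)
relinked = interpolate_links(plain, harvested)
other_view = interpolate_links(plain, {"DLS": "http://example.org/dls"})
```

Keeping the links in a separate database is what lets the same link-free document be presented under different navigation strategies, as the second call with a different database shows.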
To view a fragment of the first page of an ACM DL library article with both keyword and person links added wherever interesting people and systems are mentioned, see: http://www.staff.ecs.soton.ac.uk/~lac/somewords/image4.gif
To view another fragment from an ACM
DL article that has been automatically populated with links to the ACM
library from any citation of CACM or an ACM Hypertext Conference, see:
http://www.staff.ecs.soton.ac.uk/~lac/somewords/image5.gif
9.1. Other Kinds of Links: Papers can be automatically provided with other kinds of links using distributed link overlays as demonstrated in the Microcosm and DLS technologies (Carr et al. 1996b; De Roure et al. 1996; http://www.mmrg.ecs.soton.ac.uk/publications.html). These overlays can include links based on keywords, author-names (pointing to papers other than the explicitly cited ones), glossaries/indices, and even an inverted index for the corpus as a whole. Such services can be applied to the Open Archives data-bases (http://www.openarchives.org/sfc/data_provider_template.htm) by Open Archive service-providers (http://www.openarchives.org/sfc/service_provider_template.htm).
9.2. Revision/Update Linking: There is no reason a research report should remain in a "frozen" state after it is published. The published version, suitably tagged, is a permanent formal milestone, especially for citation purposes, but an interlinked Archive also allows authors to deposit updated and revised drafts. The automatic linking system can be adapted to accommodate this, providing automatic forward and backward linking between versions.
9.3. Commentary Links: For the same reason that links from unpublished preprints to refereed reprints to revised drafts of papers are of value, so are links to comments on papers, and authors' replies to comments, on the model of Behavioral and Brain Sciences (BBS) (http://www.princeton.edu/~harnad/bbs.html ) and Psycoloquy (http://www.princeton.edu/~harnad/psyc.html ) (Harnad 1979, 1984, 1998a).
9.4. Journal Links: There are several ways in which citation-linked Open Archives like LANL, CogPrints, and Eprints can be useful to the journals in which their papers are published. They can provide links to the version of a paper in the journal's own official online archive. They can also provide links to cited papers in the journal's online archive that do not appear in open archives. Authors might wish to have arrangements for official links with the published version in order to provide an authenticated draft, or one in which the paper page images can be viewed or cited by page and line.
9.5. Peer Review: LANL is already beginning to provide another useful service that Open Archives can offer to the journals in which their papers are published: Authors submitting papers to the American Physical Society (APS) journals can already do so by simply specifying the LANL version as their official submission. Referees can then be directed to that citation-enhanced draft in reviewing it. A password-controlled, non-public sector could also be created in LANL that would allow referee reports to be linked just as commentaries are in 9.3 above, but under the control of each journal. This would effectively be the implementation of online peer review (Harnad 1982, 1995, 1998a) for journals, and might be a model for the future relationship between refereed journals and open archives. Journals could also upload their final drafts to an open archive for distribution in their own formats with journal-specific identifying graphics, etc. The official journal version would then be part of the paper's overall revision "history," which could continue with comments, responses and updates (Harnad 1990, 1992).
(Nor is there anything to prevent journals from using interoperable open-archiving software such as Eprints as "closed" open archives, in that their metadata are open but their full texts are behind a financial firewall.)
9.6. Links to Proprietary Databases: Citation links leading out of Open Archives could also go to proprietary databases that charge for their services. These could include journal home archives, archives of scanned contents of back issues of journals, electronic books, and secondary publisher databases, such as INSPEC, MEDLINE or ISI. There are, however, strategic questions about whether OpCit should implement links that entail charges to the user (Bachrach et al. 1998; Harnad 1998b,d, 2000; Harnad et al. 2000).
9.7. Links to Other Public Archives: Provisions could be made for citation links to papers in public archives other than Open Archives, but it may be more useful to make other public archives Santa-Fe-compliant (as they are not competing in any sense, and only stand to benefit from interoperability, economies of scale, shared resources and development), perhaps through interfaces such as NCSTRL, into one seamless interconnected Archive; this too would provide constraints to help guide convergence into a unified, distributed, global archive (Hitchcock et al. 1997c).
9.8. Links to Authors' Home Server
Archives: Apart from mirroring, one useful form of redundancy that
discipline-based open archives might encourage is that all their authors
should also archive their papers on their home institutional servers, to
which the central archives would also be linked. This is why the Eprints
software has been developed. Links to the author's email address and URL
are also standard components of the central version (Harnad 1995).
The World Wide Web is predicated on
hypertext connections between documents, but for the scientific/scholarly
world the scholarly link par excellence is formal citation of one paper
by another. This is the way researchers have naturally been interconnecting
their writings all along, but until now it has only been possible to follow
those connections off-line, piece-wise, mediated by a great deal of real
footwork in between. Now the entire corpus can be navigated via citations
on-line.
Commercial
journal publishers, along with secondary
indexing/abstracting services, are exploring ways of interconnecting
the on-line journal literature, but those initiatives are intrinsically
and severely limited by financial firewalls (Bachrach et al. 1998, Harnad
1998b,d; Odlyzko 1995, 1998) that prevent free navigation across full texts
and their citations until and unless the access fees for each "hit" are
first paid through subscription, site-license or pay-per-view (S/L/P).
(To allow the full texts to be browsed for free would be equivalent to
giving away the literature for free in the on-line medium.)
Open Archives do not have this constraint;
citation linking within the Physics Archive, some of whose subfields are
virtually complete, yields seamless public access worldwide to the entire
corpus. OpCit has the citation linking tools and has applied them to completely
intralink the Physics Archive. A citation-linked online digital corpus
also makes possible powerful new forms of online informetric analysis that
go far beyond static citation analysis, measuring researchers' usage of
all phases of the literature, from pre-refereeing preprint to post-refereeing
postprint, from download to citation, yielding an embryology of learned
inquiry.
The huge, international usership of LANL, extended still further by Santa Fe compliance, guarantees that the proposed enhancements will not only be widely tested, but that, if successful, they will strongly influence the evolution of open archiving of the rest of the refereed literature in all disciplines. There is no question that radical changes in scholarly/scientific publishing and communication practices are poised to take place (with teaching and learning practises ready to follow suit; Light et al. 2000). Citation linking will help to guide and hasten them in the right direction.
ACKNOWLEDGEMENTS: Many thanks to Ian Hickman and Tim Brody (http://opcit.eprints.org/opcitresearch.shtml) for the ongoing citation analyses of which the figures in this paper are a sample.
REFERENCES
Bachrach, S., Berry, S.R., Blume, M., von Foerster, T., Fowler, A., Ginsparg, P., Heller, S., Kestner, N., Odlyzko, A., Okerson, A., Wigington, R., & Moffat, A. (1998) Intellectual Property: Who Should Own Scientific Papers? Science 281 (5382): 1459-1460. September 4 1998. http://www.sciencemag.org/cgi/content/full/281/5382/1459
Bollacker, K.D., Lawrence, S., Giles, C.L. (1998) CiteSeer: An Autonomous Web Agent for Automatic Retrieval and Identification of Interesting Publications. Agents. 116-123 http://www.neci.nj.nec.com/homepages/lawrence/citeseer.html
Cameron, R.D. (1997) A Universal Citation Database as a Catalyst for Reform in Scholarly Communication. First Monday 2(4) April 27. http://firstmonday.org/issues/issue2_4/cameron/index.html
Carr, L., Davis, H., Hall, W., Hey, J. (1996a) Turning the Web into a Library. In Proceedings of ELVIRA: The UK Digital Libraries Conference, De Montfort University, UK. http://www.mmrg.ecs.soton.ac.uk/publications/archive/carr1996c/
Carr, L. and Hall, W. (1998) Linking as Applied Coherence, Presentation at the First International Workshop on the Use of the WWW for the Public Understanding of Science, CERN November 1998.
Carr, L., De Roure, D., Hall, W., & Hill. G. (1998a) Implementing an Open Link Service for the World-Wide Web, WWW Journal, 1(2), Baltzer. http://www.mmrg.ecs.soton.ac.uk/publications/archive/carr1998b/
Carr, L., De Roure, D., Hall, W., Hill, G., (1995) The Distributed Link Service: A Tool for Publishers, Authors and Readers, World Wide Web Journal 1(1), 647-656, O'Reilly & Associates. http://www.mmrg.ecs.soton.ac.uk/publications/archive/carr1995/
Carr, L., Davis, H., De Roure, D., Hall, W., & Hill, G. (1996b) Open Information Services,
Computer Networks and ISDN Systems, 28 (7/11), 1027-1036, Elsevier.
http://www.mmrg.ecs.soton.ac.uk/publications/archive/carr1996b/
Carr, L., Hall, W., Hitchcock, S.,
(1998b) Link Services or Link Agents? In Proceedings of the Ninth ACM Conference
on Hypertext, Pittsburgh. June 1998 http://www.mmrg.ecs.soton.ac.uk/publications/archive/carr1998a/
Carr, L., Hitchcock, S. (1995) The Open Journal Project. http://journals.ecs.soton.ac.uk
Carr, L., Hitchcock, S., Hall, W. & Harnad, S. (2000) A usage based analysis of CoRR [A commentary on "CoRR: a Computing Research Repository" by Joseph Y. Halpern] ACM SIGDOC Journal of Computer Documentation. May 2000. http://www.cogsci.soton.ac.uk/~harnad/Papers/Harnad/harnad00.halpern.htm
Chen C. and Carr L. (1999) Trailblazing
the literature of hypertext: author co-citation analysis (1989-1998). In
Proceedings of the Tenth ACM Conference on Hypertext, Darmstadt. February
1999
Davis, H., Hall, W., Heath, I., Hill,
G. & Wilkins, R. (1992) "Towards an Integrated Information Environment
with Open Hypermedia Systems" in the Proceedings of the ACM Conference
on Hypertext (ECHT'92), Milan, November 1992, ACM Press, pp 181-190
http://www.mmrg.ecs.soton.ac.uk/publications/archive/davis1992/
Davis, J. R. and Lagoze, C. (1999) "NCSTRL: Design and Deployment of a Globally Distributed Digital Library," to appear in Journal of the American Society for Information Science (JASIS) http://www2.cs.cornell.edu/lagoze/papers/NCSTRL-IEEE3.doc
De Roure, D.C., Carr, L.A., Hall, W. and Hill, G.J. (1996) A Distributed Hypermedia Link Service. In Proceedings of the Third International Workshop on Services in Distributed and Networked Environments (SDNE96), Macau, June 3-4, 1996, IEEE Computer Society Press, pp 156-161. http://www.mmrg.ecs.soton.ac.uk/publications/archive/deroure1996a/
Garfield, E., (1955) Citation Indexes for Science: A New Dimension in Documentation through Association of Ideas. Science 122: 108-111 http://www.garfield.library.upenn.edu/papers/science_v122(3159)p108y1955.html
Giles, C.L., Bollacker, K. and Lawrence, S. (1998) CiteSeer: An Automatic Citation Indexing System, The Third ACM Conference on Digital Libraries, ACM Press, 89-98. http://www.neci.nj.nec.com/homepages/lawrence/citeseer.html
Ginsparg, P. (1994) First Steps Towards Electronic Research Communication. Computers in Physics. (August, American Institute of Physics). 8(4): 390-396. http://xxx.lanl.gov/blurb/
Ginsparg, P. (1996) Winners and Losers in the Global Research Village. Invited contribution, UNESCO Conference HQ, Paris, 19-23 Feb 1996 http://xxx.lanl.gov/blurb/pg96unesco.html
Hall, W., Davis, H.C. and Hutchings, G.A. (1996) Rethinking Hypermedia: the Microcosm Approach. Boston USA, Kluwer Academic Press 195pp.
Halpern, J. Y. and Lagoze, C. (1999) "The Computing Research Repository: Promoting the Rapid Dissemination and Archiving of Computer Science Research," (submitted to) Digital Libraries '99, The Fourth ACM Conference on Digital Libraries, Berkeley, CA. http://www2.cs.cornell.edu/lagoze/papers/dl99.pdf
Harnad, S. (1979) Creative disagreement. The Sciences 19: 18 - 20.
Harnad, S. (ed.) (1982) Peer commentary on peer review: A case study in scientific quality control. New York: Cambridge University Press.
Harnad, S. (1984) Commentaries, opinions and the growth of scientific knowledge. American Psychologist 39: 1497 - 1498.
Harnad, S. (1990) Scholarly Skywriting and the Prepublication Continuum of Scientific Inquiry. Psychological Science 1: 342 - 343 (reprinted in Current Contents 45: 9-13, November 11 1991). http://www.cogsci.soton.ac.uk/~harnad/Papers/Harnad/harnad90.skywriting.html
Harnad, S. (1991) Post-Gutenberg Galaxy: The Fourth Revolution in the Means of Production of Knowledge. Public-Access Computer Systems Review 2 (1): 39 - 53. http://www.cogsci.soton.ac.uk/~harnad/Papers/Harnad/harnad91.postgutenberg.html
Harnad, S. (1992) Interactive Publication: Extending the American Physical Society's Discipline-Specific Model for Electronic Publishing. Serials Review, Special Issue on Economics Models for Electronic Publishing, pp. 58 - 61. http://www.cogsci.soton.ac.uk/~harnad/Papers/Harnad/harnad92.interactivpub.html
Harnad, S. (1995) Universal FTP Archives for Esoteric Science and Scholarship: A Subversive Proposal. In: Ann Okerson & James O'Donnell (Eds.) Scholarly Journals at the Crossroads; A Subversive Proposal for Electronic Publishing. Washington, DC., Association of Research Libraries, June 1995. http://www.arl.org/scomm/subversive/toc.html
Harnad, S. (1996) Implementing Peer Review on the Net: Scientific Quality Control in Scholarly Electronic Journals. In: Peek, R. & Newby, G. (Eds.) Scholarly Publishing: The Electronic Frontier. Cambridge MA: MIT Press. Pp. 103-118. http://www.cogsci.soton.ac.uk/~harnad/Papers/Harnad/harnad96.peer.review.html
Harnad, S. (1998a) Learned Inquiry and the Net: The Role of Peer Review, Peer Commentary and Copyright. Learned Publishing 4(11): 283-292 http://citd.scar.utoronto.ca/EPub/talks/Harnad_Snider.html
Harnad, S. (1998b) On-Line Journals and Financial Fire-Walls. Nature 395(6698): 127-128. http://www.cogsci.soton.ac.uk/~harnad/nature.html
Harnad, S. (1998c) The invisible hand of peer review. Nature [online] (c. 5 Nov. 1998) http://helix.nature.com/webmatters/invisible/invisible.html
Harnad, S. (1998d) For Whom the Gate Tolls? Free the Online-Only Refereed Literature. American Scientist Forum. September 1998 http://www.cogsci.soton.ac.uk/~harnad/amlet.html
Harnad, S. (1999) Free at Last: The Future of Peer-Reviewed Journals. D-Lib Magazine 5(12) December 1999 http://www.dlib.org/dlib/december99/12harnad.html
Harnad, S. (2000) E-Knowledge: Freeing the Refereed Journal Corpus Online. Computer Law & Security Report 16(2) 78-87. [Rebuttal to Bloom Editorial in Science and Relman Editorial in New England Journal of Medicine] http://www.cogsci.soton.ac.uk/~harnad/Papers/Harnad/harnad00.scinejm.htm
Harnad, S., Varian, H. & Parks, R. (2000) Academic publishing in the online era: What Will Be For-Fee And What Will Be For-Free? Culture Machine 2 (Online Journal) http://www.cogsci.soton.ac.uk/~harnad/Temp/Varian/new1.htm
Hellman, E. (1998) Scholarly Link Specification Framework http://www.openly.com/SLinkS/
Hitchcock, S., L. Carr, W. Hall (1997a) Web journals publishing: a UK perspective, Serials, Vol.10, no.3, pp 285-299. (ISSN 0953-0460) http://www.mmrg.ecs.soton.ac.uk/publications/archive/hitchcock1997/
Hitchcock, S., Carr, L., Harris, S., Hey, J. & Hall, W. (1997b) "Citation linking: improving access to online journals". Proceedings of Second ACM conference on Digital Libraries, Philadelphia, pp115-122. http://www.mmrg.ecs.soton.ac.uk/publications/archive/hitchcock1997b/
Hitchcock, S., Quek, F., Carr, L., Hall, W., Witbrock, A., and Tarr, I. (1997c) Linking Everything to Everything: Journal Publishing Myth or Reality? ICCC/IFIP conference on Electronic Publishing 97: New Models and Opportunities, Canterbury, UK, April. http://journals.ecs.soton.ac.uk/IFIP-ICCC97.html
Hitchcock, S., Carr, L., Harris, S., Probets, S., Evans, D., Hall, W. and Brailsford, D. (1998a) Linking electronic journals: lessons from the Open Journal project, DLib Magazine, Dec 1998
Hitchcock, S., F. Quek, L. Carr, W. Hall, A. Witbrock and I. Tarr (1998b) Towards Universal Linking in Electronic Journals. Serials Review 24(1), 21-33.
Hitchcock, S., Carr, L., Jiao, Z., Bergmark, D., Hall, W., Lagoze, C. & Harnad, S. (2000) Developing services for open eprint archives: globalisation, integration and the impact of links. Proceedings of the 5th ACM Conference on Digital Libraries. San Antonio Texas June 2000. http://www.cogsci.soton.ac.uk/~harnad/Papers/Harnad/harnad00.acm.htm
Lagoze, C. and Payette, S. (1998) "An Infrastructure for Open Architecture Digital Libraries," Cornell University Computer Science, Technical Report TR98-1690, June 1998 http://ncstrl.cs.cornell.edu/Dienst/UI/1.0/Display/ncstrl.cornell/TR98-1690
Lassila, O., and Swick, R. (eds) (1999) Resource Description Framework (RDF) Model and Syntax Specification. W3C Proposed Recommendation (January 1999). http://www.w3.org/TR/PR-rdf-syntax/
Leiner, B.M. (1998) "The NCSTRL Approach to Open Architecture for the Confederated Digital Library," D-Lib Magazine, December 1998
Light, P., Light, V., Nesbitt, E. & Harnad, S. (2000) Up for Debate: CMC as a support for course related discussion in a campus university setting. In R. Joiner (Ed) Rethinking Collaborative Learning. London: Routledge (in press). http://www.cogsci.soton.ac.uk/~harnad/Papers/Harnad/harnad00.skyteaching.html
Odlyzko, A.M. (1995) Tragic loss or good riddance? The impending demise of traditional scholarly journals, International Journal of Human-Computer Studies, 42 (1995), 71-122. http://www.research.att.com/~amo/doc/tragic.loss.txt
Odlyzko, A.M. (1998) The economics of electronic journals. In: Ekman R. and Quandt, R. (Eds) Technology and Scholarly Communication Univ. Calif. Press, 1998. http://www.research.att.com/~amo/doc/economics.journals.txt
Okerson A. & O'Donnell, J. (Eds.) (1995) Scholarly Journals at the Crossroads; A Subversive Proposal for Electronic Publishing. Washington, DC., Association of Research Libraries, June 1995. http://www.arl.org/scomm/subversive/toc.html
Pope, S. & Miller, L. (1998) Using the web for peer review and publication of scientific journals. Conservation Ecology [online], (c. 5 Nov. 1998) http://www.consecol.org/Journal/consortium.html
Probets, S., D. F. Brailsford, L. Carr and W. Hall (1998) Dynamic Link Inclusion in Online PDF Journals. In Proceedings of Seventh International Conference on Electronic Publishing, Document Manipulation and Typography. Springer-Verlag (Lecture Notes in Computer Science Series). April 1998. http://www.mmrg.ecs.soton.ac.uk/publications/archive/probets1998/
Van de Sompel, H., & Hochstenbach, P. (1999) Reference Linking in a Hybrid Library Environment. D-Lib Magazine Volume 5 Issue 4 http://www.dlib.org/dlib/april99/van_de_sompel/04van_de_sompel-pt1.html
Van de Sompel, H., & Lagoze, C. (2000) The Santa Fe Convention of the Open Archives Initiative. D-Lib Magazine Volume 6 Issue 2 http://www.dlib.org/dlib/february00/vandesompel-oai/02vandesompel-oai.html
Youngen, G. (1998) Citation patterns of the physics preprint literature with special emphasis on the preprints available electronically. UIUC Physics and Astronomy library [online] (c. 5 Nov. 1998) http://www.physics.uiuc.edu/library/preprint.html