Steve Hitchcock* and Wendy Hall**
IAM Research Group, Department of Electronics and Computer Science
University of Southampton SO17 1BJ, UK
Contact email: sh94r@ecs.soton.ac.uk
Abstract. Influential scientists are urging journal publishers to free their published works so they can be accessed in comprehensive digital archives. That would create the opportunity for new services that dynamically interconnect material in the archives. To achieve this, two issues endemic to scholarly journal publishing need to be tackled: decoupling journal content from the publishing process, and defragmenting the control of access to works at the article level. It is not necessary to wait for publishers to act. It was predicted that, enabled by links, e-journal publishing would become more distributed (Hitchcock et al. 1998). An editorially controlled new-model e-journal that links material from over 100 distributed, open access sources realises that prediction. Perspectives in Electronic Publishing (PeP) combines the functions of a review journal with original materials and access to full-text papers on a focussed topic, in this case electronic publishing, in a single coherent package that indexes and links selected works. The paper describes the main features of PeP and how it can be used, and considers whether PeP contributes to the scientists' objective of a dynamic and integrated scientific literature.
Presented at the ICCC/IFIP 5th Conference on Electronic Publishing - ELPUB2001, University of Kent at Canterbury, UK, July 2001. This version of the paper is an HTML copy of that presented in the conference proceedings. Posted on the Web in June 2001.
This paper is not about the form of the archive, whether it should be a single centralised repository or the distributed and harvested archives proposed by the Open Archives initiative (OAi). It is about the 'dynamic': the innovations and services we might begin to see if the content of the archive or archives becomes openly accessible to third parties. In particular, it investigates the benefits when works are freely available to readers in some form. 'Unimpeded access to these archives and open distribution of their contents', Roberts et al. (2001) continue, 'will enable researchers to take on the challenge of integrating and interconnecting the fantastically rich, but extremely fragmented and chaotic, scientific literature'.
That statement encapsulates what this paper, and the model it promotes, contend are the vital features of any system of electronic scholarly communication: it must be both dynamic and integrated.
Thus, to get to the proposed model we first have to deal with two issues endemic to the existing scholarly journal publishing system:
- decoupling journal content from the publishing process;
- defragmenting the control of access to works at the article level.
The publisher leads stages 1 and 2. By stage 3 there is a third party, the author, whose interest is believed to be conjoined with the publisher's. Successful review of a submitted paper assures the author that the paper is fit to be viewed by his or her peers. In fact, if fulfilled correctly, the journal review has principally established that the paper is consistent with the profile of the journal. Nevertheless, joint interests have been confirmed between publisher and author, and a small cost has been incurred.
What are the implications of the peer review process? The publisher has learnt from a non-contracted expert witness, the peer reviewer, that the work probably has some value in its market. That value may be imprecise, but it is something that can be exploited in the journal framework. In terms of the paper, the publisher so far has only a small cost to recover, but in terms of stages 1 and 2 a bigger investment needs to be recouped and the paper has to be able to make a contribution to those costs.
What happens next is pivotal. The publisher needs the author's consent to publish the work. For many publishers that is not enough and the author may be asked to assign all rights in the work, that is, transfer complete ownership and beneficial interest in the work to the publisher. In part this is not unreasonable. If the author's journal selection has been good and the review has strictly been associated with the journal, then that journal ought to be the most appropriate publication vehicle for the paper. In an era of multiple media, however, authors would be better off reserving some rights that journals alone cannot serve, as we see in the steps below.
With all rights acquired the publisher puts the paper into its journal production process, stage 4. This invariably incurs a larger cost than that of the peer review process and will be passed on to the end user. The end user is never consulted on the value of the production process. A journal may have especially high design and production standards, which can improve reading, but this is a sideshow. The main deal has already been done, and if the peer review has been effective the target end user needs to see the paper whatever the cost and effect of production. The evidence, presented by Tenopir and King (2000), is that while journals continue to be accessed at high levels, personal subscriptions to journals have been decimated over the last 30 years.
In print the journal publishing process is indivisible, but not electronically. Consider an author in a field that supports e-print archives. Prior to journal review the author deposits the paper in a freely-accessible archive. Within 24 hours the author's peers will have been alerted and are able to access and read the archive version of the paper. After review the author retains the right to self-archive the work and supplements the original e-print in the archive with a revised version that has satisfied the reviewers. The work - the words, the presentation - is entirely that of the author. The process of deposit in the archive takes a few minutes of the author's time so long as it was produced in a format that adheres to the requirements of the automated archive. Within 24 hours the author's peers are able to access the new version of the paper.
It might be argued that the purpose of the review is diminished outside the scope of the journal. In practice the distinction is rarely made, as a Journal-Ref, a note of the journal that has reviewed and accepted the paper, is added to the metadata for the paper in the archive.
The author benefits in every respect. Instant readership for the preprint is guaranteed in the archive; for the best papers the status conferred by the Journal-Ref tag assures continuing higher numbers of readings in the archive (Harnad and Carr 2000).
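The mechanism can be pictured as a small addition to the archive's metadata record. The sketch below is a minimal illustration in Python, assuming a simple key-value record; the identifier and field names are hypothetical, though they echo the journal-ref field used by the physics archive.

```python
# A minimal, hypothetical e-print record as it might appear in an archive.
eprint = {
    "id": "pub/0106001",   # archive identifier (invented)
    "title": "An Example Paper",
    "authors": ["A. Author"],
    "versions": ["v1 (preprint)", "v2 (revised after review)"],
}

def add_journal_ref(record: dict, journal: str, details: str) -> None:
    """Note which journal has reviewed and accepted the paper.

    The archived text is unchanged; only the metadata now signals
    that the work has passed peer review.
    """
    record["journal_ref"] = journal + ", " + details

add_journal_ref(eprint, "Journal of Examples", "Vol. 1, 2001")
```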
The publisher feels short-changed. Should it? The journal's status has been attained by the publisher's efforts over many years, but what the publisher is left with is a commodity, for that is what the paper has become, of dubious value now that exclusivity has been lost.
This is not necessarily the case. In physics, the only discipline with a large, universal e-print archive, journal publishers have continued to thrive in exactly this scenario. Data on profit growth may be unavailable, but there have been no reports of exceptional fall-off in subscriptions to journals in physics since the archive was launched in 1991.
Other publishers in other fields are more sceptical. Physics is a special case, they argue. Some journals invoke an embargo at stage 2, denying consideration to papers that have been submitted elsewhere, such as to e-print archives. Or, with all rights in hand, they could feasibly demand withdrawal of any archived versions prior to publication. Some publishers have set up preprint archives that make papers free-to-view, but only up to the point of journal publication (Dessy 2000). It appears that non-physics authors might be denied the benefits enjoyed by their physics colleagues.
Except that authors can claim the prize for themselves, by doing exactly as the physics author does and reserving the right to self-archive their own versions of their own works. Should authors feel any twinge of guilt towards the journal publisher they could surmise the following: if the journal has an appreciably high status - there seems little point in submitting if it has not - it has no doubt been achieved by fortune of exclusive publication to date and it no longer needs that advantage; also, that if publishing is difficult and the publisher adds significant value that the self-archiving author alone could not, then the publisher should be capable of competing with sources based on author self-archiving.
In effect, the paper has been decoupled not from the peer review process, which the author values most, but from the production process, a superficiality of less importance.
Other commentators have theorised on the process of decoupling and the consequent benefits. Describing the 'deconstructed' journal, Smith (1999) wanted to establish the idea that most of the activities involved in journal publishing are independent ('quality control activities are not concerned with distribution') and therefore that there is no obvious need for these roles to be controlled, and the resulting product owned, by a single publisher.
The Scholar's Forum proposal (Buck et al. 1999), which aimed to wrest control of peer review and value-adding publishing tasks from journals to a Los Alamos-like 'document database', seems to have been stillborn but indicated the concerns of influential academics. Phelps (1998), a university vice-chancellor, similarly elaborated a national electronic archive as a means to introduce competition: 'we must find ways to introduce competition into every phase of the process that journals once performed as a bundled effort -- quality certification, editorial improvement, distribution, indexing, and archiving'.
A similar process of defragmentation is underway in the online publishing space, reversing the process of specialisation forced on paper journals by page constraints and other factors, and the consequent stagnation in the ability of non-specialist users to access these works. The ACM saw that users wanted access to the whole corpus of its publications: 'The business model and marketing campaign ... de-emphasized subscriptions to individual journals in favor of a single, annual access fee for unlimited usage of our Digital Library' (Rous 1999). This has resulted in faster growth of individual subscribers, the ACM claims, reversing the entrenched steep declines in personal subscriptions affecting most print journals (Tenopir and King 2000). The importance of integrating access to journals to offer comprehensive coverage within fields has been established in user studies (Borghuis et al. 1996) and seems to be the choice of scholars and librarians (Bailey 1994).
The most vivid evocation of defragmented scholarly publication is 'skywriting' (Harnad 1991): 'just as if each contribution were being written in the sky, for all peers to see and append to'. Even this is but a step in the direction of Engelbart's (1975) remarkable NLS (oNLine System), described as a 'workplace' for knowledge workers, supporting dialogue and collaboration as well as access to texts and information services: 'publication time is very much shorter; significant "articles" may be as short as one sentence; cross-reference citations may easily be much more specific (i.e., pointing directly to a specific passage); catalogs and indexes can be accessed and searched online as well as in hard copy; and full-text retrieval with short delays is the basic operating mode. The end effect of these changes is a form of recorded dialogue whose impact and value has a dramatic qualitative difference over the traditional, hard-copy journal system'.
Defragmentation has the flexibility to create new packages that build on journal branding but are not dependent on it. Examples of this are the subject-focus portals such as BioMedNet and more recently the so-called 'virtual journals' promoted in most cases by these portal operators. Virtual journals appear to offer new services to the user, notably personalisation, but in some cases are simply vehicles for pay-per-view. This is because access remains tied to the original journal or publisher, the journal that performed peer review and most likely obtained exclusive rights to publish.
At the journal level defragmentation is not new, and has been practised by journal aggregators and secondary (bibliographic) services for many years. Where they are able to provide full-text content as a third-party service or a supplement to their own data, however, that content has had to be delivered in the form of the original journal package. Subscriptions or site licenses buy access only to whole journals. Documents can be ordered individually but are typically delivered in a non-processable, non-enhanceable form such as a print facsimile.
As with the Windows example, we are concerned with the defragmentation of the control of access to works, in this case at the article level. In other words, dissemination of individual works need no longer be tied to single, exclusive journal packages. Without this distinction it could be argued that the e-journal model proposed in this paper is re-fragmenting the journal literature.
The model described here, Perspectives in Electronic Publishing (PeP), links papers from over 100 distributed, open access sources. That model is part of a larger project than can be fully described here, but we can cover the essential features. First, a simple guided tour will show what the implementation can do. It is important to separate the general model from the specific implementation, but once familiar with the features we can try to place this model among other emerging e-publishing models and judge whether it fulfils the criteria of a dynamic, integrated e-journal aspired to above.
At the front page you are presented with a list of the papers most recently added to the service. These papers, all editorially selected and focussed on the theme of electronic publishing, may have been posted anywhere on the Web. To be considered for inclusion, in this implementation, papers must be freely accessible.
PeP covers all aspects of electronic networked publishing, with an emphasis on academic publishing and on journals. It covers the publishers, the publishing process and intermediary services. Coverage extends to research and technical development that will impact on publishing, including some aspects of digital libraries. Changes to the legal framework of publishing for the network environment are another important component.
When a paper is chosen for inclusion, a bibliographic record describing it is created in a database. This contains typical data on the paper - publication data, versioning data, information on authors - as well as editorial comment and chosen extracts from the paper. Some of these data are presented to the user; a separate program that builds the link data also collects data from the record.
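To make the shape of these records concrete, the following is a minimal sketch in Python; the field names are illustrative and do not reflect PeP's actual schema.

```python
from dataclasses import dataclass, field

@dataclass
class PePRecord:
    """Bibliographic record for a paper selected for PeP (fields illustrative)."""
    title: str
    authors: list[str]
    source_url: str        # where the full text lives; PeP holds no copy
    published: str         # publication data: source, date, etc.
    versions: list[str] = field(default_factory=list)
    comment: str = ""      # editorial comment presented to the user
    extracts: list[str] = field(default_factory=list)  # chosen passages

def link_data(record: PePRecord) -> list[tuple[str, str]]:
    """Collect (phrase, target) pairs for the separate linking program:
    the title and each chosen extract become candidate anchor phrases."""
    phrases = [record.title] + record.extracts
    return [(phrase, record.source_url) for phrase in phrases]
```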
At the time PeP was launched in the second quarter of 2001 the database described a few hundred papers, all accessible to the Web user. Users can find specific papers using search, or browse using an index. This index is presented in a second browser window, for reasons described below, so is referred to as the 'remote' index.
At this stage we have essentially a simple library-like catalogue. This is transformed by requesting the link data mentioned above. Technical constraints limit this part of the service to users of Microsoft Internet Explorer 5.x (MIE5.x). For those users, the browser downloads a link applet containing the link data. This WebLink applet interacts with pages subsequently downloaded by the user and attempts to insert links, which are displayed as small graphics to differentiate them from conventionally authored, underlined and coloured, text links.
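The applet's behaviour can be approximated as a text-matching pass over each downloaded page: wherever a page fragment matches a phrase in the link data, a small graphical link is inserted. The sketch below is illustrative only, written in Python rather than the applet's own environment, and assumes link data as (phrase, target) pairs; the image file name is invented.

```python
import re

def insert_links(html: str, link_data: list[tuple[str, str]]) -> str:
    """Insert a small graphical link after the first occurrence of each
    phrase, approximating what the WebLink applet does in the browser."""
    for phrase, target in link_data:
        pattern = re.compile(re.escape(phrase), re.IGNORECASE)
        marker = (' <a href="%s"><img src="pep-link.gif" alt="[PeP]"></a>'
                  % target)
        # Naive: this may also match text inside tags; a real implementation
        # would walk the document's text nodes instead.
        html = pattern.sub(lambda m, mk=marker: m.group(0) + mk, html, count=1)
    return html
```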
There are two types of link: links that open the PeP record describing a paper, and links that retrieve the full text of a paper. Following the PeP or New link to the left of the listing in a contents page opens the PeP record for the chosen paper (Figure 2). Alternatively, the graphical link inserted beside the title of the paper retrieves the full text (Figure 3). Notice that although the paper is served from the originating site - it is not copied to PeP! - it also has added links. This is an example of a well-linked paper. The applet will try to add links to any page displayed in the linked browser window - it is not discriminating - but it is, by definition, more likely to find text fragments matching link data in a page with some relevance to electronic publishing.
Figure 2. A record for every paper entered in PeP, with bibliographic details, comment and notable extracts, and a link to the full text.
Figure 3. Original full text with added links, or return to PeP using the remote index (unlinked browser and applet windows both minimised).
It now becomes clear why the index is remote from the main text window. It would be inappropriate and potentially misleading to frame pages from sources other than PeP. Compared with a print journal the remote index can be seen to be acting as the permanent binding, or glue, of the journal.
Using the added links the user is able to explore the literature on a focussed topic from over 100 sources, one paper linked to another in a web-like form. Or, more likely for a small application, it could be viewed as a tree structure, using the index to return to the database when a web path leads nowhere. As the collection grows these paths should begin to reveal new insights and perhaps unanticipated relationships between works. No other service, as far as we are aware, presents such relationships between the full texts of papers.
As with many new Web applications, what PeP was designed to be and what it proves to be useful for may be different. That is down to users and is subject to ongoing evaluation. In this discussion PeP is considered to be a journal in its original and traditional sense:
Defn. journal
a record of current transactions; an account of day-to-day events: a record of experiences, ideas, or reflections kept regularly for private use: a record of transactions kept by a deliberative or legislative body.
PeP could equally well be on another topic. Focussing on electronic publishing works well in this framework because it is a discursive topic, amenable to some informality and not dependent on formal peer review. It is also interdisciplinary, appealing to a broad constituency with widely varying degrees of commitment. Every researcher has a stake in the effectiveness of publishing as a communication channel for his or her work, more so at times of change like the present. So while relatively few researchers would see electronic publishing as their primary interest, many will contribute intermittently, and when they do they will usually address their own peers, not always the broader community. In other words it is a highly fragmented literature that PeP can act on to the benefit of researchers.
Another application could be based on commercial journal papers, with appropriate agreements, including only refereed papers, say, although this runs the risk of breaking the access criterion. Despite the best efforts of journal publishers - site licences, virtual journals, database service providers and journal aggregators, CrossRef, etc. - the journals industry is manifestly unable to achieve breadth and access together, because the fee-based journal structure fragments access and prevents this. Ultimately, the effective researcher who is happy to pay a fee or use a library-based subscription requires that those combined subscriptions cover all that he or she may conceivably need to access. The effective large research university, meanwhile, probably requires access to absolutely everything. This is clearly impossible as long as the fragmented journal model continues, but distinctly feasible if the electronic environment is utilised effectively.
The ideal platform for the PeP model is the distributed archives advocated by the Open Archives initiative (Van de Sompel and Lagoze 2000). For PeP these will most constructively be e-print archives, although OAi aspires to encompass broader types of materials that might be found in digital libraries. E-print archives provide access to raw author-created data, the ideal foundation to encourage competition for compelling 'virtual journals' and PeP clones.
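As a sketch of how a PeP-like service might gather records from such archives, the snippet below issues a ListRecords request using the OAi metadata-harvesting protocol (OAI-PMH) and reads back Dublin Core records. The endpoint URL is hypothetical and the namespaces are those of the protocol as later standardised; this is an illustration, not PeP's actual mechanism.

```python
import urllib.request
import xml.etree.ElementTree as ET

OAI = "{http://www.openarchives.org/OAI/2.0/}"
DC = "{http://purl.org/dc/elements/1.1/}"

def harvest(base_url):
    """Yield (titles, identifiers) from one page of an OAI-PMH repository."""
    url = base_url + "?verb=ListRecords&metadataPrefix=oai_dc"
    with urllib.request.urlopen(url) as response:
        tree = ET.parse(response)
    for record in tree.iter(OAI + "record"):
        titles = [t.text for t in record.iter(DC + "title")]
        identifiers = [i.text for i in record.iter(DC + "identifier")]
        yield titles, identifiers

# Hypothetical endpoint; any OAi-compliant e-print archive would serve.
# for titles, identifiers in harvest("http://eprints.example.org/oai2"):
#     print(titles, identifiers)
```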
Microcosm (Davis et al. 1992) was among the first systems to show that within digital information systems links can be managed as entities separate from the other information content. Long heralded by hypertext developers, this gave the capability to interconnect materials created independently and in different media. The potential has become more evident with the emergence of networked information services such as the Web, where sites are frequently presented as self-contained 'islands' of information rather than interconnected webs.
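The separation can be made concrete with a linkbase held apart from any document, in the spirit of Microcosm's generic links. The Python sketch below is illustrative and is not Microcosm's actual data model.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Link:
    selection: str                 # the text the link is anchored on
    destination: str               # target document
    generic: bool = True           # applies in any document if True
    source: Optional[str] = None   # otherwise applies only in this document

class Linkbase:
    """Links managed as entities separate from the documents they join."""

    def __init__(self):
        self.links = []

    def add(self, link):
        self.links.append(link)

    def resolve(self, selection, document):
        """Find the links that apply to a selection made in a given document."""
        return [l for l in self.links
                if l.selection.lower() == selection.lower()
                and (l.generic or l.source == document)]
```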
PeP links were specified to be simple but editorially controlled, recognising that integration based on links needs to be focussed on content. An early motivation for this approach was outlined by O'Reilly (1996), founder of the eponymous book publisher, who saw how the core skill of non-fiction publishing, creating 'information interfaces', could be applied more powerfully on the Web: 'In the old model, the information product is a container. In the new model, it is a core. One bounds a body of content, the other centers it.' O'Reilly urges publishers to reinforce the fundamentals of the Internet, 'participation, access, communication'.
Links and search are the computational tools provided to support the online user. PeP's editorially controlled link service is the bridge between the computational 'overlay' and integration at the article level. Raney (1998) argues that citation and search should be seen as tools, not alternatives to journals, but in the design of PeP they are integral.
Links also fulfil the requirement for time-critical data services. Works are fixed in time, but links added to those works can point forward or backward from that moment in time. Links are interactive, and not just for the user: inserting links enables PeP to interact with the texts selected for inclusion.
PeP is a prime example of decoupling journal processing tasks. As currently structured, it can include papers with a much lower cost overhead than conventional journals, because it does not perform all the functions - refereeing, editing and layout, etc. - of those journals. PeP has limited scope but unlimited space, which it can fill faster than any conventional journal. This reinforces the integration feature, which becomes more powerful as the information space grows.
It could be argued that PeP is parasitic on journals that perform valuable services for authors. It is true that among the best sources in PeP are professionally published open access e-journals, but PeP supplements these with its own original content in the form of review articles, and adds services for the user that are not available elsewhere, the link 'perspectives' of the title. This is a manifestation of Hellman's (2001) call to allow layers of functionality to be added to journals.
By including content from over 100 sources, PeP is a beneficiary of the defragmentation of access to both well-known and little-known sources. While selection of articles is a component of PeP, thereby recreating fragments from the whole of the freely accessible literature, it does not own the selected works and therefore does not compromise the benefits of defragmentation for other services.
PeP has adapted to the dynamic of the online environment without relinquishing the editorial coherence of the conventional journal in favour of wholesale automation.
PeP is not unique in any single feature but is unusual in its combination of features. Its innovations may be less conspicuous, but by responding to the needs of users they may ultimately prove more effective.
What about the cautions outlined in the introduction? No economic arguments have been advanced for PeP because its economics have not yet been evaluated. Whatever those arguments might be, they will not be that producing PeP is cost-free. Improved and cost-effective functionality for users will be the main determinant of the impact of PeP. It is less overtly appealing to authors in its present form, lacking formal refereeing and an established identity, but unlike conventional journals it is not dependent on individual submissions, rather on the willingness of authors generally to make their works available from open-access journals or other freely accessible sources.
PeP is almost certainly not a universal model, but we have analysed why it works in this case and what features to look for in topics that might be amenable to similar treatment.
PeP is a journal for users today. It may prove to be built on an intermediate technology, but one that can embrace change. PeP is unlike other journals but recognises its debt to those journals and other publications from which it draws while adding its own original content and perspectives.
It may be presumptuous to draw parallels with Nelson's (1999) vision for a digital publishing system, but for all the reasons he expressed, PeP may not suit everyone: 'Those who see work as finished and closed, who see subjects as bounded and hierarchically related, who see research as final ... who don't worry about lost information, who don't worry about minority points of view, who don't see the need to express alternatives, who don't see a need for intercomparison, who don't see a need for literary continuity, have a great deal of difficulty with this paradigm.'
Anderson, K. et al. (2001) Publishing Online-Only Peer-Reviewed Biomedical Literature: Three Years of Citation, Author Perception, and Usage Experience. Journal of Electronic Publishing, Vol. 6, No. 3, March
Bailey, C. W. Jr (1994) Scholarly Electronic Publishing on the Internet, the NREN, and the NII: Charting Possible Futures. Serials Review, Vol. 20, No. 3, 7-16
Bjork, B.-C. and Turk, Z. (2000) How Scientists Retrieve Publications: An Empirical Study of How the Internet Is Overtaking Paper Media. Journal of Electronic Publishing, Vol. 6, No. 2, December
Borghuis, M. et al. (1996) TULIP Project Final Report. Elsevier Technical Report http://www.elsevier.nl/inca/homepage/about/resproj/trmenu.htm
Buck, A. M. et al. (1999) Scholar's Forum: A New Model For Scholarly Communication. Author's server http://library.caltech.edu/publications/ScholarsForum/
Bush, V. (1945) As We May Think. Atlantic Monthly, July
Butler, D. et al. (2001) Future E-access to the Primary Literature. Nature Web Debates, 5 April
Carr, L. et al. (1995) The Distributed Link Service: A Tool for Publishers, Authors and Readers. Fourth World Wide Web Conference (WWW4), Boston, December
Davis, H. C. et al. (1992) MICROCOSM: An Open Hypermedia Environment for Information Integration. CSTR 92-15, Computer Science Technical Report, University of Southampton http://www.bib.ecs.soton.ac.uk/records/1317
Dessy, R. (2000) Chemical E-Preprints: The Ostriches. Trends in Analytical Chemistry, Vol. 19
Engelbart, D. C. (1975) NLS Teleconferencing Features: the Journal, and Shared-Screen Telephoning. IEEE Catalog No. 75CH0988-6C, pp. 173-176
Harnad, S. (1991) Post-Gutenberg Galaxy: the Fourth Revolution in the Means of Production of Knowledge. Public-Access Computer Systems (PACS) Review, Vol. 2, No. 1, 39-53
Harnad, S. and Carr, L. (2000) Integrating, Navigating and Analyzing Eprint Archives through Open Citation Linking (the OpCit Project). Current Science Online, Vol. 79, No. 5, 10 September
Hellman, E. (2001) Less is More! Re: Reader-Designated HyperLinking In/Between/Among E-Journals. Web4Lib Electronic Discussion Archive, 13 February http://sunsite.berkeley.edu/Web4Lib/archive/0102/0218.html
Hitchcock, S. et al. (1998) Making the Most of Electronic Journals. Computing Research Repository, cs.DL/9812016, 14 December
Kling, R. and McKim, G. (2000) Not Just a Matter of Time: Field Differences in the Shaping of Electronic Media in Supporting Scientific Communication. Journal of the American Society for Information Science, Vol. 51, No. 14
Nelson, T. H. (1999) Xanalogical Media: Needed Now More Than Ever. ACM Computing Surveys, Vol. 31, No. 4, December
O'Reilly, T. (1996) Publishing Models for Internet Commerce. Communications of the ACM, Vol. 39, No. 6, June, 79-86
Phelps, C. E. (1998) Achieving Maximal Value from Digital Technologies in Scholarly Communication. ARL Proceedings, October
Raney, R. K. (1998) Into a Glass Darkly. Journal of Electronic Publishing, Vol. 4, No. 2, December
Roberts, R. J. et al. (2001) Building A "GenBank" of the Published Literature. Science, Vol. 291, No. 5512, 2318-2319, 23 March
Rous, B. (1999) ACM: a Case Study. Journal of Electronic Publishing, Vol. 4, No. 4, June
Smith, J. W. T. (1999) The Deconstructed Journal - a New Model for Academic Publishing. Learned Publishing, Vol. 12, No. 2, April
Tenopir, C. and King, D. W. (2000) Towards Electronic Journals: Realities for Scientists, Librarians, and Publishers. Psycoloquy, Vol. 11, Issue 084
Van de Sompel, H. and Lagoze, C. (2000) The Santa Fe Convention of the Open Archives Initiative. D-Lib Magazine, Vol. 6, No. 2, February
Varmus, H. et al. (1999) Journals Online: PubMed Central and Beyond. HMS Beagle: The BioMedNet Magazine, No. 61, 3 September