The Distributed Link Service, a distributed system of link services for the World Wide Web [Carr et al 95, Carr et al 96], has been used to impose configurable navigation structures upon suites of static document resources in the World Wide Web. This paper describes the integration of the link service functionality into a Web proxy server, and examines the implications for the service's user interface in terms of including, representing, discriminating, prioritising and traversing links.
Key words: Proxy services, hypertext links, HTTP stream transducers, open hypermedia
The World Wide Web (WWW) is undoubtedly one of the more successful hypertext systems, but it is a largely closed system, dependent on the use of HTML document content for the provision of linking facilities. Although links may be created to documents other than those in HTML and image formats, such links are dead ends: there is no way to follow any further links from those documents (e.g. links from spreadsheet documents). There is also no way for additional links to be made available by third parties, as all link information is embedded in documents.
WWW embedded links and the external links provided by an open hypermedia system are described as locspecs and refspecs respectively, according to an extended version of the original Dexter model [Grønbæk & Trigg]. By applying refspecs to the WWW it is possible to employ an open hypertext approach to the authoring and management of World Wide Web hypertext documents [Hill et al 95] and to provide more flexible facilities. This paper will show how we have provided a link service for the WWW, based upon the model used in the Microcosm open hypertext system [Hill et al 93].
The development of open hypermedia systems has highlighted a number of advantages over closed systems which embed link information into documents. The most significant examples are briefly described below.
In particular, the use of generic links allows common links to be authored only once - wherever the source selection of the link occurs, the link is available, including any documents subsequently made available. Typically such links would be created on names of people and places, or common terms, to provide access to more detailed information. In a closed system, such links need to be created wherever the source term appears in a document, and new documents also need to be linked into the system manually.
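The behaviour of a generic link can be illustrated with a minimal sketch. It assumes a hypothetical in-memory linkbase that maps source terms to destination URLs; the names `GENERIC_LINKS` and `apply_generic_links` and the example URLs are illustrative only and are not taken from the DLS.

```python
import html
import re

# Hypothetical linkbase: each source term is authored exactly once.
GENERIC_LINKS = {
    "Southampton": "http://example.org/places/southampton",
    "Microcosm": "http://example.org/systems/microcosm",
}


def apply_generic_links(text, linkbase):
    """Wrap every whole-word occurrence of a linkbase term in an HTML anchor.

    `text` is plain text; the result is an HTML fragment.
    """
    if not linkbase:
        return html.escape(text)
    # Longest terms first, so that overlapping terms prefer the longer match.
    terms = sorted(linkbase, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(t) for t in terms) + r")\b")
    parts = []
    last = 0
    for match in pattern.finditer(text):
        parts.append(html.escape(text[last:match.start()]))
        url = html.escape(linkbase[match.group(1)], quote=True)
        parts.append('<a href="%s">%s</a>' % (url, html.escape(match.group(1))))
        last = match.end()
    parts.append(html.escape(text[last:]))
    return "".join(parts)
```

Because the terms are held apart from the documents, a document written after the linkbase was authored picks up the same links with no further authoring effort.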
This form of linking also reduces maintenance requirements, as changes to links need only be made to the central link databases, and will immediately be effective wherever the link is available. This can reduce problems frequently encountered in the WWW, such as link fossilisation and decay [Hill et al 95]. Finally, a separate link database allows much more efficient automatic processing and editing of links.
In addition, the type of linking described in the previous section allows the user a more flexible approach to link traversal. Rather than rely on those links highlighted by the system, the user is also able to select arbitrary items and query the system for possible links - thus creating a 'reader-led' navigation paradigm.
Readers may also be provided with the facilities necessary to create their own links, allowing them to annotate material which in other systems they would not be able to annotate and freeing them from a hypertext structure created purely by designated authors. If these databases may be shared with other users, collaborative authoring of hypertext resources is enhanced.
Another possibility is a separation between information provider and link provider. At present, hypertext material is usually delivered with links inextricably bound to the associated material. A link service can help to overcome this restriction, by providing the facility to apply completely different link sets to a set of documents, or conversely to apply existing links to new documents not available when the links were originally created. This makes it possible for third parties to offer pure linking services which end users may apply to any documents which they can access, breaking the common binding between content and link structure.
Finally, this facility can also aid in more efficient management of hypertextual information. If a variety of link structures are to be applied to a particular set of documents, changes to the document set are easier to make if the link information is managed separately. If link information had to be embedded in the documents, then many different document sets would have to be maintained in order to provide alternative link structures. Similarly, if new documents are introduced, existing link information need not be embedded in them to facilitate navigation; the links are immediately available.
We have developed the Distributed Link Service (DLS) as such a system. It is able to work in conjunction with existing WWW resources to support an additional underlying link service, which is able to provide the features described in the previous section. This system is based upon our experiences developing the Microcosm hypertext system [Davis et al 94]. Like Microcosm, the DLS utilises a variety of link database processes to offer flexible hypertext functionality to a wide range of end-user applications.
The DLS [Carr et al 95] is composed of two parts: the server facilities, which are accessed via the WWW, and the client interface, which works in conjunction with a WWW browser.
The link server facilities of the DLS are implemented as modules of a pseudo-WWW proxy server. The server implements enough of the Hypertext Transfer Protocol (HTTP) to allow normal interaction with a browser, but also contains modules to allow the creation, traversal and editing of links, which are stored in a number of link databases. The databases use an SGML-style mark-up, and record the source and destination attributes of the link, the type of the link, its creation time and a link description.
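As an illustration of the kind of record involved, the following sketch parses a hypothetical SGML-style link entry. The element names (`link`, `src`, `dest`, `time`, `desc`) are assumptions chosen only to mirror the attributes listed above; the actual DLS linkbase syntax is not reproduced here.

```python
import re
from dataclasses import dataclass

# Hypothetical SGML-style record; element names are illustrative assumptions.
SAMPLE = """
<link type="generic">
<src>Microcosm</src>
<dest>http://example.org/microcosm.html</dest>
<time>1996-09-20T12:00:00</time>
<desc>Overview of the Microcosm system</desc>
</link>
"""


@dataclass
class Link:
    link_type: str
    source: str
    destination: str
    created: str
    description: str


_LINK_RE = re.compile(r'<link\s+type="([^"]*)">(.*?)</link>', re.S)


def _field(body, name):
    """Return the stripped text of the first <name>...</name> element, or ''."""
    match = re.search(r"<%s>(.*?)</%s>" % (name, name), body, re.S)
    return match.group(1).strip() if match else ""


def parse_linkbase(text):
    """Extract every link record found in a linkbase string."""
    return [
        Link(
            link_type,
            _field(body, "src"),
            _field(body, "dest"),
            _field(body, "time"),
            _field(body, "desc"),
        )
        for link_type, body in _LINK_RE.findall(text)
    ]
```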
There are three kinds of data that can move between nodes in this diagram: the query, link data or the results of resolving the query. The nodes represent link processing agents, of which the link resolution agent (with local link data) is the only instance encountered so far; others will be introduced later.
In the simplest scenario, the link data is static: the query travels from left to right, is resolved at a node or nodes with the appropriate link data, and the results travel from right to left back to the user. There are three types of nodes which may exist separately or in combination:
Finally, the LRA itself may be mobile. Instead of the data moving to the agent, the agent can then move to the data. This model is a topic of research but has not been realised in any practical DLS implementations at this time.
Note that a simple client might talk to the first process in the diagram, but more sophisticated clients are possible which incorporate the functionality of any of the process types discussed above - these are 'heavyweight clients'. For example, the client may have knowledge about which link server to contact and may itself implement some concurrency or fault tolerance. In particular, it might have link resolution functionality so that link resolution can occur when the user is offline; this is particularly appropriate when the user is using mobile equipment.
The controller establishes a dynamic session (a binding of a user and host together with a set of link server parameters) which is used to control the behaviour of the link server from that point in time onwards for that particular user. It is intended that the user will invoke the controller just once to set their preferred configuration, and only again afterwards to adjust the configuration - links will always be added automatically to the documents according to the last settings of the controller.
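The session idea can be sketched as a table keyed on the user and host, holding the most recent settings chosen through the controller. This is a minimal illustration under stated assumptions: the class names and the parameter names (`linkbases`, `max_links`) are hypothetical, and nothing here reflects how the DLS actually stores its sessions.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    user: str
    host: str
    params: dict = field(default_factory=dict)


class SessionTable:
    """Remembers, per (user, host) pair, the last settings given to the controller."""

    def __init__(self, defaults=None):
        self._defaults = dict(defaults or {})
        self._sessions = {}

    def configure(self, user, host, **params):
        """Create the session on first use, then overlay the new settings."""
        key = (user, host)
        session = self._sessions.get(key)
        if session is None:
            session = Session(user, host, dict(self._defaults))
            self._sessions[key] = session
        session.params.update(params)
        return session

    def settings_for(self, user, host):
        """Settings applied to every later request from this user and host."""
        session = self._sessions.get((user, host))
        return dict(session.params) if session else dict(self._defaults)
```

A request from a user who has never invoked the controller simply receives the defaults, which matches the intention that the controller need only be used to change the configuration.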
The control panel provides a greater degree of control over the linking process, enabling the user to specify in some detail which link databases are switched on and off as the user browses in and out of a number of document resources, and to control the kinds of linkbase that are used at such a point (e.g. internal navigation through a resource vs citation of documents external to the resource).
The Open Journal Framework [Carr et al 96] makes use of this kind of control panel to help the user navigate through large suites of collected but separate Internet resources, all integrated by the use of linkbases. By introducing a model of Internet resources (collections of documents and associated link databases) and aggregations of these resources (collections of collections of documents and associated link databases), it is possible to define the user's 'static location' in a document space, and hence to know what hypertext actions are applicable at each point in that document space. If the user travels outside all known resources (e.g. to a colleague's personal home page), then the option still remains to apply the most general links or else to have the link server refrain from applying any links.
Without this model the same sets of link databases are applied to any document which the user sees.
The recent standard for Cascading Style Sheets for HTML documents [Lie & Bos] allows the presentation of many document features to be controlled by visual parameters such as font, size and colour. WWW links in HTML documents are normally tightly bound to previously marked-up anchors, and so a style-sheet's only option for parametrising link presentation is to change the typographic attributes of the (fixed) anchor. By contrast, the DLS has complete freedom to choose how to elaborate a link by binding it to any suitable anchor site in the document. The DLS may apply the link to any part of the document's contents, or may invent a new piece of content to act as an anchor (in the form of a distinguishing marker or a more general annotation).
DSSSL, a related standard for document styles and semantics [cite DSSSL], operates on a model in which documents are processed in two passes: firstly to rewrite and re-order their components and secondly to apply formatting operations to the revised components. This model allows new content to be created for a document as it is processed and is the kind of model which the DLS employs, in contrast to Cascading Style Sheets.
Using the controller, links can be formatted according to the following styles given that a fragment of the document's content has been chosen as a link site
Since many dozens of link databases may be active at any one time, providing links of various levels of relevance, an important task of the server is to throttle over-zealous link producers. It allows the user to choose the overall proportion of link items to document content, the maximum number of links to appear on each key phrase, the maximum number of links to a particular resource, and which link authors to prefer.
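One way such throttling could work is sketched below. The function name, the parameter names and the representation of a candidate link as a dictionary are all assumptions made for illustration; in particular, treating the destination host as 'the resource' and measuring document content in words are simplifications, not the DLS's own definitions.

```python
from collections import Counter
from urllib.parse import urlparse


def throttle(candidates, word_count, *, max_ratio=0.1, max_per_phrase=2,
             max_per_resource=3, preferred_authors=()):
    """Select which candidate links to keep for one document.

    candidates: list of dicts with the keys 'phrase', 'dest' and 'author'.
    word_count: size of the document, used to cap the overall link density.
    """
    # Preferred authors come first, in the order given; the sort is stable,
    # so all other candidates keep their original relative order.
    rank = {author: i for i, author in enumerate(preferred_authors)}
    ordered = sorted(candidates, key=lambda c: rank.get(c["author"], len(rank)))

    budget = int(word_count * max_ratio)   # overall proportion of links to content
    per_phrase = Counter()
    per_resource = Counter()
    kept = []
    for cand in ordered:
        if len(kept) >= budget:
            break
        resource = urlparse(cand["dest"]).netloc   # assumption: host == resource
        if per_phrase[cand["phrase"]] >= max_per_phrase:
            continue
        if per_resource[resource] >= max_per_resource:
            continue
        per_phrase[cand["phrase"]] += 1
        per_resource[resource] += 1
        kept.append(cand)
    return kept
```

The greedy, first-come selection is a deliberate simplification: ranking by author preference happens before the per-phrase and per-resource limits are applied, so a preferred author's links are the last to be discarded.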
The accustomed user interface for links (click and go) is convenient in some respects: uncomplicated and immediate, it allows the user to jump directly between information resources. However, it is in other ways a very unnatural activity when compared with the sequence of actions it mimics in the physical world, away from the can-do atmosphere of cyberspace.
When readers attend to a journal article in a library, they do not immediately follow each citation and cross-reference that is encountered as the text is digested. Instead, they may make a mental note to follow it up at a more convenient time, even scribbling it down on a pad. Only when the paper has been read to the readers' satisfaction will the attention then be turned to the cited material.
In other words, users have prior experience with a reading model in which they evaluate the content before they evaluate the links, whereas the WWW and most other hypertext environments provide an environment in which the user is repeatedly interrupted, stacking up unfinished document contexts to be returned to later. Although the hypertext environment itself does not force the user to switch contexts to the linked material, the lack of support for any other browsing protocol often makes it the line of least resistance.
In computing terms, the hypertext browser imposes a stack-based document evaluation modality onto the user, replacing a natural queue-based information processing methodology. This stack-based approach is impossible in the real world because of the significant time taken to change document contexts when compared to the Web.
As well as providing mechanisms for controlling the prioritising and presentation of links, the DLS supplies a mechanism to help control the link following process, making it more like the real-world experience described above. It does this by providing an auxiliary "navigation planner window" adjacent to the user's browser window, such that users can drag link anchors from the browser window onto the planner window, where they are displayed as icons. The icons can be moved around the window, clustered together according to the user's own informal classification scheme and subsequently double-clicked to make the browser display the relevant Web document.
As such, an electronic notepad has been created for jotting down interesting places to visit (a speculative bookmark list). However, a secondary function of the notepad is to pre-fetch the referenced documents while the user finishes browsing the main document, so that the reader really does get instantaneous access when the follow-up texts are examined. (In fact, the referenced URL and all embedded data must be fetched, so that documents containing frames or images will display without delay.)
A further function of the navigation notepad is to contextualise navigation, i.e. to make explicit the context in which the current document and its linked items are being read. This embodies the notion that reading is done not in an intellectual vacuum, but as part of a process of writing, of note-taking and of goals and strategies for creating other documents.
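The deferred-reading and pre-fetching behaviour of the notepad can be sketched as follows. This is a minimal illustration, not the DLS planner: the class and method names are hypothetical, the `fetch` callable is supplied by the caller, and only the top-level document is fetched, whereas the text above notes that embedded data must be fetched as well.

```python
from concurrent.futures import ThreadPoolExecutor


class NavigationPlanner:
    """Holds links set aside for later and fetches them in the background."""

    def __init__(self, fetch, workers=2):
        self._fetch = fetch        # caller-supplied callable: url -> document
        self._pool = ThreadPoolExecutor(max_workers=workers)
        self._pending = {}         # url -> Future, kept in the order dropped

    def drop(self, url):
        """Record a link dragged onto the planner and start pre-fetching it."""
        if url not in self._pending:
            self._pending[url] = self._pool.submit(self._fetch, url)

    def planned(self):
        """URLs still waiting to be read, oldest first."""
        return list(self._pending)

    def open(self, url):
        """Return the document for a planned link (the 'double-click' action)."""
        return self._pending.pop(url).result()

    def open_next(self):
        """Open the oldest planned link: queue-like, first in, first out."""
        if not self._pending:
            raise IndexError("no planned links")
        url = next(iter(self._pending))
        return url, self.open(url)

    def close(self):
        self._pool.shutdown(wait=True)
```

Reading then proceeds in the order the links were noted, rather than in the nested, most-recent-first order that immediate link following imposes.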
Some research in the WWW community [Brooks et al 95] has focussed on the use of transducers which intercept the flow of communication between a client and server, modifying the request or response in some way. Such transducers have been used to experiment with adding extra functionality to the document server in the form of annotations, indexes and change marks [Meeks et al]. The DLS server acts very similarly, modifying the WWW server response (the document) by adding extra data to it, but it does so as a 'mutant proxy server' rather than an extra processing node on the communications stream. The DLS breaks the transducer model by allowing the client to communicate directly with it by using the 'Link Remote Controller' described in section 3.2.
Others have investigated the use of independent meta-information servers, but in order to provide collaborative annotations for WWW pages rather than links [Röscheisen et al]. This is actually quite similar in concept to the DLS, as links can easily be provided within an annotation framework and vice versa. The DLS itself provides support for annotations through the inclusion of extra metadata in the link databases.
Early work on spatial metaphors for organising and classifying hypertext material [Marshall 91] inspired a prototype of the link access facilities described in section 4.3 by one of the authors [Carr 95]. Further work in the hypertext community on the use of spatial hypertext [Marshall 93, Marshall 94] has resulted in a commercial system (Web Squirrel [Bernstein 96]) which provides improved link access facilities for the WWW.
The WAIBA project of the OSF [Brooks 96] also produced software tools for improving link access for users of the Web. In particular, a Table of Contents agent produced a structural overview of a Web hierarchy to help the user make decisions about how to browse that part of the information space.
We are continuing to develop the Distributed Link Service following our open hypermedia philosophy, adopting new browser and server technologies as they become available. Future work on the network protocols includes an investigation of client-side link resolution (the 'heavyweight client'), link caching on proxies and multicasting to multiple link servers.
From the user-interface side, user trials are scheduled to determine the exact practical usefulness of this link discrimination by colour in a world of varying browser, rendition and screen hardware technologies.
Tools which utilise the link service are being designed within specific projects, and we hope to make generic tools available in the future.
[Brooks et al 95] Application-Specific Proxy Servers as HTTP Stream Transducers, C. Brooks, M. Mazer, S. Meeks and J. Miller, Proceedings of The Web Revolution: Fourth International World Wide Web Conference, in The Web Journal 1(1), O'Reilly and Associates.
[Brooks 96] Wide Area Information Browsing Assistance Final Technical Report, C. Brooks, Technical Report, The Open Group Research Institute, 20 September 1996. <URL: http://www.osf.org/www/waiba/papers/y2report/y2report.htm>
[Carr 95] Structure in Text and Hypertext, L. Carr, PhD Thesis, University of Southampton, UK (1995). <URL: http://journals.ecs.soton.ac.uk/lacethesis/>
[Carr et al 95] The Distributed Link Service: A Tool for Publishers, Authors and Readers, L. Carr, D. De Roure, W. Hall and G. Hill, Proceedings of The Web Revolution: Fourth International World Wide Web Conference, in The Web Journal 1(1), O'Reilly and Associates.
[Carr et al 96] Open Linking Services, L. Carr, D. De Roure, W. Hall and G. Hill, Proceedings of the Fifth International World Wide Web Conference.
[Davis et al 92] H. Davis, W. Hall, I. Heath, G. Hill, R. Wilkins, Towards an Integrated Information Environment with Open Hypermedia Systems, in ECHT '92, Proceedings of the Fourth ACM Conference on Hypertext, Milan, Italy, November 30-December 4, 1992, ACM Press, 181-190.
[Davis et al 94] H. Davis, S. Knight, W. Hall, Light Hypermedia Link Services: A Study of Third Party Application Integration, in Proceedings of the Sixth ACM Conference on Hypertext, Edinburgh, Scotland, September 1994, ACM Press, 41-50.
[Davis et al 95] H. Davis, A. Lewis, A. Rizk, OHP: A Draft Proposal for an Open Hypermedia Protocol, presented at ACM Hypertext 96 Conference, Open Hypermedia Systems Workshop, <URL: http://diana.ecs.soton.ac.uk/~hcd/protweb.htm>
[De Roure et al 96] A Distributed Hypermedia Link Service, D. De Roure, L. Carr, W. Hall and G. Hill, Proceedings of the Third International Workshop on Services in Distributed and Networked Environments (SDNE96), IEEE Computer Society Press 1996.
[De Roure et al 96b] Agents for Distributed Multimedia Information Management, D. De Roure, W. Hall, H. Davis and J. Dale, Proceedings of PAAM'96.
[Grønbæk & Trigg] Toward a Dexter-based reference model for open hypermedia: Unifying embedded references and link objects, K. Grønbæk, R. Trigg, in the Proceedings of the Seventh ACM Conference on Hypertext, 1996.
[Hall 94] W. Hall, Ending the Tyranny of the Link, IEEE Multimedia 1,1 pp 60-68 (1994).
[Hill et al 93] G. Hill, R. Wilkins, W. Hall, Open and Reconfigurable Hypermedia Systems: A Filter Based Model, Hypermedia, 5(2), 1993.
[Hill et al 95] Applying Open Hypertext Principles to the WWW, G. Hill, W. Hall, D. De Roure, L. Carr, International Workshop on Hypermedia Design 1995, Montpellier, France, 1-2 June 1995.
[Lie & Bos] Cascading Style Sheets: Designing for the Web, Hakon Wium Lie, Bert Bos, Addison-Wesley Pub Co, ISBN: 020141998X. 1997
[Malcolm et al 91] Malcolm, K.C., Poltrock, S.E., Schuler, D. Industrial Strength Hypermedia: Requirements for a Large Engineering Enterprise. In: Hypertext 91: Proceedings of Third ACM Conference on Hypertext, San Antonio, TX. ACM Press, 1991, 13-24.
[Marshall 91] Marshall, C.C., Halasz F.G., Rogers R.A., and Janssen, W.C., Aquanet: a hypertext tool to hold your knowledge in place. In Proceedings of Hypertext '91, (San Antonio, Texas December 16-18), 1991, pp. 261-275.
[Marshall 93] Marshall, C.C., Shipman, F. M. III. "Searching for the Missing Link: Discovering Implicit Structure in Spatial Hypertext." In Proceedings of Hypertext '93, (Seattle, Washington, November 14-18), 1993, pp. 217-230.
[Marshall 94] Marshall, C.C.; Shipman, F.M.; Coombs, J.H. VIKI: Spatial Hypertext Supporting Emergent Structure. In Proceedings of the ACM European Conference on Hypermedia Technologies (Edinburgh, Scotland, Sept. 18-23), 1994, pp. 13-23.
[Meeks et al] Transducers and Associates: Circumventing the Limitations of the World Wide Web, W. Meeks, C. Brooks, M. Mazer in Proceedings of the Conference on Emerging Technologies and Applications in Communications 96, Portland, Oregon.
[Østerbye & Wiil] The Flag Taxonomy of Open Hypermedia Systems, K. Østerbye, U. Wiil, in the Proceedings of the Seventh ACM Conference on Hypertext, 1996.
[Pearl 89] Pearl, A. Sun's Link Service: A Protocol for Open Linking. In: Hypertext '89 Proceedings, Pittsburgh PA, 1989, 137 - 146
[Röscheisen et al] Shared Web Annotations as a Platform for Third-party Value-Added Information Providers: Architecture, Protocols and Usage Examples, M. Röscheisen, C. Mogensen, T. Winograd, Technical Report CSDTR/DLTR, Computer Science Department, Stanford University, Stanford, CA 94305, USA. <URL: http://www-diglib.stanford.edu/rmr/TR/TR.html>