As an alternative it is possible to make the link service function transparently by integrating it into the document delivery service, accomplished by grafting the link service into a WWW proxy. This paper discusses the benefits to the user of such a system of backroom link services and compares it with the use of WWW transducers [Brooks et al] and open hypermedia shims [Davis et al 96].
Key words: Open hypermedia, link services, WWW
The World Wide Web (WWW) is undoubtedly one of the more successful hypertext systems, but it is a largely closed system, dependent on the use of HTML document content for the provision of linking facilities. Although links may be created to documents other than those in HTML and image formats, such links are dead ends, and there is no way to follow any further links e.g. links from spreadsheet documents. There is also no way for additional links to be made available by third parties, as all link information is embedded in documents.
WWW embedded links and the external links provided by an open hypermedia system are described as locspecs and refspecs respectively, according to an extended version of the original Dexter model [Grønbæk & Trigg]. By applying refspecs to the WWW it is possible to employ an open hypertext approach to the authoring and management of World Wide Web hypertext documents [Hill et al 95] and to provide more flexible facilities. This paper will show how we have provided a link service for the WWW, based upon the model used in the Microcosm open hypertext system [Hill et al 93].
The development of open hypermedia systems has highlighted a number of advantages over closed systems which embed link information into documents. The most significant examples are briefly described below.
In particular, the use of generic links allows common links to be authored only once: wherever the source selection of the link occurs, the link is available, including in any documents subsequently made available. Typically such links would be created on names of people and places, or common terms, to provide access to more detailed information. In a closed system, such links need to be created wherever the source term appears in a document, and new documents also need to be linked into the system manually.
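The behaviour of a generic link can be illustrated with a minimal sketch. It is not the Microcosm or DLS implementation: the `GenericLink` record, the `resolve_generic` function, the case-insensitive exact match and the example.org URLs are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenericLink:
    """Hypothetical record for a generic link: it is keyed on the source
    text alone, not on any particular document or position."""
    source_text: str
    destination: str
    description: str


def resolve_generic(selection, linkbase):
    """Return every generic link whose source text matches the selection.

    Matching is case-insensitive and ignores surrounding whitespace; this
    matching rule is an assumption of the sketch.
    """
    wanted = selection.strip().lower()
    return [link for link in linkbase if link.source_text.lower() == wanted]


# A linkbase authored once, independently of any document.
linkbase = [
    GenericLink("Southampton", "http://example.org/places/southampton", "About the city"),
    GenericLink("hypertext", "http://example.org/glossary/hypertext", "Glossary entry"),
]

# The same link is found whichever document the selection was made in,
# including documents published after the link was authored.
matches = resolve_generic("  southampton ", linkbase)
```

Because the lookup never consults the document, a document added tomorrow that mentions "Southampton" is linked without any further authoring.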
This form of linking also reduces maintenance requirements, as changes to links need only be made to the central link databases, and will immediately be effective wherever the link is available. This can reduce problems frequently encountered in the WWW, such as link fossilisation and decay [Hill et al 95]. Finally, a separate link database allows much more efficient automatic processing and editing of links.
In addition, the type of linking described in the previous section allows the user a more flexible approach to link traversal. Rather than rely on those links highlighted by the system, the user is also able to select arbitrary items and query the system for possible links, thus creating a 'reader-led' navigation paradigm.
Readers may also be provided with the facilities necessary to create their own links, allowing them to annotate material which in other systems they would not be able to annotate and freeing them from a hypertext structure created purely by designated authors. If these databases may be shared with other users, collaborative authoring of hypertext resources is enhanced.
Another possibility is a separation between information provider and link provider. At present, hypertext material is usually delivered with links inextricably bound to the associated material. A link service can help to overcome this restriction, by providing the facility to apply completely different link sets to a set of documents, or conversely to apply existing links to new documents not available when the links were originally created. This makes it possible for third parties to offer pure linking services which end users may apply to any documents which they can access, breaking the common binding between content and link structure.
Finally, this facility can also aid in more efficient management of hypertextual information. If a variety of link structures are to be applied to a particular set of documents, changes to the document set are easier to make if the link information is managed separately. If link information had to be embedded in the documents, then many different document sets would have to be maintained in order to provide alternative link structures. Similarly, if new documents are introduced, existing link information need not be embedded in them to facilitate navigation; the links are immediately available.
We have developed the Distributed Link Service (DLS) as such a system. It is able to work in conjunction with existing WWW resources to support an additional underlying link service, which is able to provide the features described in the previous section. This system is based upon our experiences developing the Microcosm hypertext system [Davis et al 94]. Like Microcosm, the DLS utilises a variety of link database processes to offer flexible hypertext functionality to a wide range of end-user applications.
The DLS [Carr et al 95] is composed of two parts: the server facilities, which are accessed via the WWW, and the client interface, which works in conjunction with a WWW browser.
There are several different link database categories supported by the system. At the most general level are server databases, which apply whenever the system is queried. Link databases may also be provided for a group of documents, or a particular document. In addition, a variety of 'context' link databases are available from which the user may select. By choosing a different context, the user may adjust the available link set to best suit their current information requirements. The user is also provided with a personal link database in which they may create private links that only they have access to.
The server receives details from the DLS client of the user's selection, the document in which the selection was made, and the context selected. The followlink module determines which link databases are required, and gathers these together to satisfy the request. Like Microcosm, the system supports the use of generic links, which allows links to be applicable beyond the scope in which they were originally created.
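The way a followlink request might gather the relevant link databases can be sketched as follows. This is an illustration under stated assumptions rather than the DLS code: the `LinkStore` layout, the `follow_link` signature, the use of URL prefixes to identify a group of documents, and the representation of a link as a `(source_text, destination)` pair are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class LinkStore:
    """Hypothetical container for the link database categories.

    Every link is a (source_text, destination) pair.
    """
    server: list = field(default_factory=list)        # applied to every query
    by_group: dict = field(default_factory=dict)      # URL prefix -> links
    by_document: dict = field(default_factory=dict)   # document URL -> links
    by_context: dict = field(default_factory=dict)    # context name -> links
    personal: dict = field(default_factory=dict)      # user name -> private links


def follow_link(store, selection, document, context=None, user=None):
    """Gather the databases relevant to the request, then match the selection."""
    candidates = list(store.server)
    for prefix, links in store.by_group.items():
        if document.startswith(prefix):
            candidates.extend(links)
    candidates.extend(store.by_document.get(document, []))
    if context is not None:
        candidates.extend(store.by_context.get(context, []))
    if user is not None:
        candidates.extend(store.personal.get(user, []))
    wanted = selection.strip().lower()
    return [dest for (source, dest) in candidates if source.lower() == wanted]


store = LinkStore(
    server=[("WWW", "http://example.org/www")],
    by_group={"http://example.org/journal/": [
        ("proxy", "http://example.org/journal/glossary#proxy")]},
    by_context={"history": [("WWW", "http://example.org/history/www")]},
    personal={"alice": [("WWW", "http://example.org/~alice/notes")]},
)
```

Choosing a different context or user changes only which databases are gathered; the matching step is the same in every case.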
The editlink module provides an HTML form which allows the user to select from the available link databases and edit the links they contain, for example by changing the default link description or updating the type of a link. The createlink module accepts details of start and end points for a link, and enters a new link into the specified context link database if there is one. Otherwise, the link is entered into the user's personal link database. The context module provides a list of the different context link databases available on the server. This can be used by the client to present a menu of contexts to the user.
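The fallback rule of the createlink module, and the listing offered by the context module, can be sketched in a few lines. The function names, the dictionary-based databases and the treatment of an unknown context name (which here also falls back to the personal database) are assumptions of this sketch, not documented DLS behaviour.

```python
def create_link(context_dbs, personal_dbs, user, source, destination, context=None):
    """Store a new link and report which database received it.

    context_dbs maps context names to lists of links; personal_dbs maps user
    names to lists of links. Both layouts are hypothetical.
    """
    if context is not None and context in context_dbs:
        target = context_dbs[context]
        where = "context:" + context
    else:
        # No usable context was given, so the link stays private to the user.
        target = personal_dbs.setdefault(user, [])
        where = "personal:" + user
    target.append((source, destination))
    return where


def list_contexts(context_dbs):
    """Names a client could present to the user as a menu of contexts."""
    return sorted(context_dbs)


context_dbs = {"history": [], "glossary": []}
personal_dbs = {}
first = create_link(context_dbs, personal_dbs, "alice",
                    "WWW", "http://example.org/www", context="history")
second = create_link(context_dbs, personal_dbs, "alice",
                     "DLS", "http://example.org/dls")
```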
A major problem with the interactive client is the engineering effort required to produce and maintain software that applies the available link services to a range of different viewing applications using a variety of WWW browsers on a range of different host operating systems. Hence an alternative, 'interfaceless' approach was investigated: to make the link service transparent to its users by embedding it in the Web's document transport system, compiling links into documents as they were delivered to the user by a specially adapted WWW proxy server.
This approach requires no extra client software for the user, which is an immediate practical benefit, but it does suffer from a number of disadvantages. Firstly, the loss of interaction makes it impossible to create a link by the usual method of making a selection and choosing Start Link from the menu. It also changes (perhaps for the worse) the browsing paradigm from "reader-directed enquiry" to "click on a predefined choice" [Hall 94]. Secondly, this behind-the-scenes link compilation is applicable only to documents which are delivered via the WWW and which are coded in well-understood document formats that can themselves support some form of hypertext link. These requirements abandon some of the advantages of the open system previously described, since there are relatively few document formats which can have links embedded.
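The link compilation step that such a proxy performs on an HTML document might look like the following sketch. It is deliberately simplified and is not the DLS proxy: the `compile_links` name, the regular-expression tokenising of tags, and the rules of matching whole words case-insensitively and leaving text inside existing anchors untouched are all assumptions.

```python
import re
from html import escape


def compile_links(html, links):
    """Wrap each occurrence of a link source phrase in an anchor element.

    links maps a source phrase to a destination URL. Only text outside tags
    and outside existing <a> elements is rewritten.
    """
    if not links:
        return html
    # Longest phrases first, so "link service" is preferred over "link".
    phrases = sorted(links, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(p) for p in phrases) + r")\b", re.IGNORECASE)
    lookup = {phrase.lower(): url for phrase, url in links.items()}

    def anchor(match):
        url = lookup[match.group(0).lower()]
        return '<a href="%s">%s</a>' % (escape(url, quote=True), match.group(0))

    out = []
    inside_anchor = False
    # The capturing group makes re.split keep the tags in the result.
    for part in re.split(r"(<[^>]*>)", html):
        if part.startswith("<"):
            if re.match(r"<a\b", part, re.IGNORECASE):
                inside_anchor = True
            elif re.match(r"</a\b", part, re.IGNORECASE):
                inside_anchor = False
            out.append(part)
        elif inside_anchor:
            out.append(part)   # never nest a new link inside an authored one
        else:
            out.append(pattern.sub(anchor, part))
    return "".join(out)


page = '<p>See Microcosm and <a href="x">Microcosm</a>.</p>'
compiled = compile_links(page, {"Microcosm": "http://example.org/microcosm"})
```

A production proxy would need a real HTML parser and would have to handle scripts, comments and character encodings; the sketch shows only the substitution that turns external link data into embedded links at delivery time.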
There are three kinds of data that can move between nodes in this diagram: the query, link data or the results of resolving the query. The nodes represent link processing agents, of which the link resolution agent (with local link data) is the only instance encountered so far; others will be introduced later.
In the simplest scenario, the link data is static: the query travels from left to right, is resolved at a node or nodes with the appropriate link data, and the results travel from right to left back to the user. There are three types of nodes which may exist separately or in combination:
1. LRAs. These resolve the query against local link data and return the result.
2. Caches. The same query will return the same results from the same linkbases, so it is possible for processes to cache the result of queries in order to speed up response and promote scalability.
3. LRA proxies. These processes appear as LRAs but propagate the query to other nodes and aggregate the responses. There are two roles for these: implementing concurrent processing of queries, and providing redundancy to cope with failure of parts of the system.
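The three node types listed above can be sketched as small classes sharing one `resolve` method, so that they may be combined freely. The class names, the dictionary linkbase and the `BrokenLRA` stand-in for a failed node are hypothetical, and for brevity the proxy queries its upstream nodes one after another rather than concurrently.

```python
class LRA:
    """Resolves a query against local link data (selection -> destinations)."""

    def __init__(self, linkbase):
        self.linkbase = linkbase

    def resolve(self, query):
        return list(self.linkbase.get(query, []))


class CachingLRA:
    """Remembers earlier results; valid while the linkbases stay static."""

    def __init__(self, upstream):
        self.upstream = upstream
        self.cache = {}
        self.hits = 0

    def resolve(self, query):
        if query in self.cache:
            self.hits += 1
        else:
            self.cache[query] = self.upstream.resolve(query)
        return list(self.cache[query])


class LRAProxy:
    """Looks like an LRA but forwards the query and aggregates the answers."""

    def __init__(self, upstreams):
        self.upstreams = upstreams

    def resolve(self, query):
        results = []
        for node in self.upstreams:
            try:
                answers = node.resolve(query)
            except Exception:
                continue          # tolerate the failure of one node
            for answer in answers:
                if answer not in results:
                    results.append(answer)
        return results


class BrokenLRA:
    """Stand-in for an unreachable node, used only to show fault tolerance."""

    def resolve(self, query):
        raise ConnectionError("node unavailable")


local = LRA({"hypertext": ["http://example.org/a"]})
remote = LRA({"hypertext": ["http://example.org/b", "http://example.org/a"]})
front = CachingLRA(LRAProxy([local, BrokenLRA(), remote]))
first = front.resolve("hypertext")
second = front.resolve("hypertext")   # served from the cache
```

Since every node exposes the same interface, a client cannot tell whether it is talking to a single LRA, a cache or a proxy in front of several servers.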
The simple scenario extends naturally to the case where the link data is itself mobile. Here, the LRAs can request the link data from link data servers, which may be identical to document servers: link data is just a special document type. This means that link data can be cached using the same techniques as for the caching of document data. The link data server is then another process type in the diagram, and there is a new type of LRA which can import link data from these servers.
Finally, the LRA itself may be mobile. Instead of the data moving to the agent, the agent can then move to the data. This model is a topic of research but has not been realised in any practical DLS implementations at this time.
Note that a simple client might talk to the first process in the diagram, but more sophisticated clients are possible which incorporate the functionality of any of the process types discussed above; these are "heavyweight clients". For example, the client may have knowledge about which link server to contact and may itself implement some concurrency or fault tolerance. In particular, it might have link resolution functionality so that link resolution can occur when the user is offline; this is particularly appropriate when the user is using mobile equipment.
The purpose of the controller is to give the user the ability to choose how links are selected and displayed within the processed documents. The simple control panel in figure 3 gives the user the ability to choose which of the server's installed linkbases are to be combined with requested documents, as well as the opportunity to choose whether the links are displayed by underlining the link source text (the default), by inserting asterisks after the link source text (a footnote style) or by inserting citation markers after the source text and then appending a "link bibliography" to the document as a whole. It is also possible to completely bypass the link compilation if a "normal" document viewing mode is required.
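The presentation choices offered by the control panel can be illustrated with a small rendering function. The `render` name, the style keywords, the case-sensitive phrase matching and the exact markup produced are assumptions of this sketch; only the four behaviours themselves (underlined source text, asterisks, citation markers with a link bibliography, and no compilation) come from the description above.

```python
import re

STYLES = ("underline", "asterisk", "citation", "none")


def render(text, links, style="underline"):
    """Present the links found in text according to the chosen style.

    links maps a source phrase to a destination URL.
    """
    if style not in STYLES:
        raise ValueError("unknown style: %r" % style)
    if style == "none" or not links:
        return text            # "normal" viewing mode: no link compilation
    phrases = sorted(links, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(p) for p in phrases))
    bibliography = []          # destinations in order of first citation

    def mark(match):
        phrase = match.group(0)
        url = links[phrase]
        if style == "underline":
            return '<a href="%s">%s</a>' % (url, phrase)
        if style == "asterisk":
            return '%s<a href="%s">*</a>' % (phrase, url)
        if url not in bibliography:
            bibliography.append(url)
        return "%s [%d]" % (phrase, bibliography.index(url) + 1)

    body = pattern.sub(mark, text)
    if style == "citation" and bibliography:
        entries = ["[%d] %s" % (n, url) for n, url in enumerate(bibliography, 1)]
        body += "\n\nLinks:\n" + "\n".join(entries)
    return body


sample = "Microcosm inspired the DLS."
sample_links = {"Microcosm": "http://example.org/microcosm",
                "DLS": "http://example.org/dls"}
cited = render(sample, sample_links, style="citation")
```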
The controller establishes a dynamic session (a binding of a user and host together with a set of link server parameters) which is used to control the behaviour of the link server from that point in time onwards for that particular user. It is intended that the user will invoke the controller just once to set their preferred configuration, and only again afterwards to adjust the configuration; links will always be added automatically to the documents according to the last settings of the controller.
A more complex control panel provides a greater degree of control over the linking process. This enables the user to specify in some detail which link databases are switched on and off as the user browses in and out of a number of document resources, to control the kinds of linkbase that are used at such a point (e.g. internal navigation through a resource vs citation of documents external to the resource) and to determine how the server is to cull links from a potentially over-annotated document. The Open Journal Framework [Carr et al 96] makes use of the control panel to help the user navigate through large suites of collected but separate Internet resources, all integrated by the use of linkbases. Since at any one time there may be many dozens of link databases active, providing links of various levels of 'pertinence', an important task of the server is to throttle over-zealous link producers and allow the user to choose the overall proportion of link items to document content, the maximum number of links to appear on each key phrase, the maximum number of links to a particular resource and which link authors are to be given preference. All these can be controlled from a more complicated version of the standard session's control panel.
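Some of the culling controls described above can be sketched as a single filtering pass. The `Candidate` record, the `cull` signature and the policy of ranking preferred authors first and then applying the two caps in order are assumptions; the sketch leaves out the overall proportion of links to document content.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical candidate link produced by some active linkbase."""
    phrase: str        # key phrase in the document
    destination: str   # URL of the link target
    resource: str      # resource the destination belongs to
    author: str        # author of the link


def cull(candidates, max_per_phrase, max_per_resource, preferred_authors=()):
    """Keep at most max_per_phrase links on each key phrase and at most
    max_per_resource links into each resource, favouring preferred authors."""
    rank = {author: i for i, author in enumerate(preferred_authors)}
    # sorted() is stable, so authors of equal rank keep their original order.
    ordered = sorted(candidates, key=lambda c: rank.get(c.author, len(rank)))
    per_phrase, per_resource = Counter(), Counter()
    kept = []
    for c in ordered:
        if per_phrase[c.phrase] >= max_per_phrase:
            continue
        if per_resource[c.resource] >= max_per_resource:
            continue
        per_phrase[c.phrase] += 1
        per_resource[c.resource] += 1
        kept.append(c)
    return kept


a = Candidate("hypertext", "http://example.org/1", "journalA", "alice")
b = Candidate("hypertext", "http://example.org/2", "journalB", "bob")
c = Candidate("hypertext", "http://example.org/3", "journalB", "carol")
d = Candidate("proxy", "http://example.org/4", "journalB", "bob")
kept = cull([a, b, c, d], max_per_phrase=2, max_per_resource=1,
            preferred_authors=["bob"])
```

In this example the preferred author's first link is kept, the second link into the same resource is dropped by the per-resource cap, and the remaining place on the phrase goes to the next author in line.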
By introducing a model of Internet resources (collections of documents and associated link databases) and aggregations of these resources (collections of collections of documents and associated link databases), it is possible to define the user's "static location" in a document space, and hence to know what hypertext actions are applicable at what point in that document space. If the user travels outside all known resources (e.g. to a colleague's personal home page), they have the option either of still applying the most general links or of having the link server refrain from applying any links. Without this model (in the case of the simple controller) the same sets of link databases are applied to any document which the user sees.
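A minimal sketch of this resource model follows. The `Resource` and `Aggregation` records, the use of URL prefixes to decide membership, and the choice to apply the general linkbases inside known resources as well as (optionally) outside them are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    prefix: str       # URL prefix shared by the documents of the resource
    linkbases: list   # names of the linkbases associated with the resource


@dataclass
class Aggregation:
    members: list     # Resource objects collected together
    linkbases: list   # linkbases that integrate the members


def linkbases_for(url, aggregations, general, apply_outside=True):
    """Decide which linkbases apply at the user's location in document space."""
    selected = []
    for aggregation in aggregations:
        for resource in aggregation.members:
            if url.startswith(resource.prefix):
                selected.extend(resource.linkbases)
                selected.extend(aggregation.linkbases)
    if not selected:
        # Outside every known resource: most general links, or none at all.
        return list(general) if apply_outside else []
    selected.extend(general)
    # Remove duplicates while keeping the order of first appearance.
    return list(dict.fromkeys(selected))


journals = Aggregation(
    members=[Resource("http://example.org/journal-a/", ["a-internal"]),
             Resource("http://example.org/journal-b/", ["b-internal"])],
    linkbases=["cross-journal-citations"],
)
inside = linkbases_for("http://example.org/journal-a/vol1.html",
                       [journals], ["general"])
outside = linkbases_for("http://example.org/~colleague/home.html",
                        [journals], ["general"], apply_outside=False)
```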
The use of an Open Hypermedia Protocol (OHP) for such an environment has been recently discussed [Davis et al 96]. The architecture to support this uses shims to convert between the native protocols of a link service and the native protocols of a client application so as to allow an application to make use of many different link services. A client's shim communicates with each server's shim by using the OHP standard and so receives its linking information independently of the implementation of the link service.
The shims work is not directly comparable with the situation on the WWW, as a native WWW client speaks a standard protocol to a WWW server not to receive linking information, but to receive a document (which indirectly contains links). The DLS server masquerades as a WWW server in order to resolve an explicit request for a document and an implicit request for links, and translates the results back into the data that the client is expecting, i.e. a document. This can be seen as a variant on the standard OHP architecture where the client shim is actually co-located with the link server. In this situation the link server has the additional responsibilities of procuring the document and merging the links into the document; in the OHP scenario these two tasks are accomplished by the client. Although the two situations are not exactly congruent, it is possible that OHP could be effectively used between the components of the distributed server described in section 3.1.
Research in the WWW community [Brooks et al] has been focussed on the use of transducers which intercept the flow of communication between a client and server, modifying the request or response in some way. Such transducers have been used to experiment with adding extra functionality to the document server in the form of annotations, indexes and change marks [Meeks et al]. The DLS server acts very similarly, modifying the WWW server response (the document) by adding extra data to it, but for reasons of efficiency it does so as a "mutant proxy server" rather than an extra processing node on the communications stream. The DLS also breaks the transducer model by allowing the client to communicate directly with it by using the "Link Remote Controller" described in section 3.2.
Others have investigated the use of independent meta-information servers but to provide collaborative annotations for WWW pages rather than links [Röscheisen et al]. This is actually quite similar in concept to the DLS, as links can be easily provided within an annotation framework (in fact the DLS provides support for annotations through the inclusion of extra metadata in the link databases).
We are continuing to develop the Distributed Link Service following our open hypermedia philosophy, adopting new browser and server technologies as they become available. Future work includes an investigation of client-side link resolution (the 'heavyweight client'), link caching on proxies, multicasting to multiple link servers and experiments on controlling the presentation of links. Tools which utilise the link service are being designed within specific projects, and we hope to make generic tools available in the future.
[Carr et al 95] The Distributed Link Service: A Tool for Publishers, Authors and Readers, L. Carr, D. De Roure, W. Hall and G. Hill, Proceedings of The Web Revolution: Fourth International World Wide Web Conference, in The Web Journal 1(1), O'Reilly and Associates.
[Carr et al 96] Open Linking Services, L. Carr, D. De Roure, W. Hall and G. Hill, Proceedings of the Fifth International World Wide Web Conference.
[Davis et al 92] H. Davis, W. Hall, I. Heath, G. Hill, R. Wilkins, Towards an Integrated Information Environment with Open Hypermedia Systems, in ECHT '92, Proceedings of the Fourth ACM Conference on Hypertext, Milan, Italy, November 30-December 4, 1992, ACM Press, 181-190.
[Davis et al 94] H. Davis, S. Knight, W. Hall, Light Hypermedia Link Services: A Study of Third Party Application Integration, in Proceedings of the Sixth ACM Conference on Hypertext, Edinburgh, Scotland, September 1994, ACM Press, 41-50.
[Davis et al 95] H. Davis, A. Lewis, A. Rizk, OHP: A Draft Proposal for an Open Hypermedia Protocol, presented at ACM Hypertext 96 Conference, Open Hypermedia Systems Workshop, <URL: http://diana.ecs.soton.ac.uk/~hcd/protweb.htm>
[De Roure et al 96] A Distributed Hypermedia Link Service, D. De Roure, L. Carr, W. Hall and G. Hill, Proceedings of the Third International Workshop on Services in Distributed and Networked Environments (SDNE96), IEEE Computer Society Press, 1996.
[De Roure et al 96b] Agents for Distributed Multimedia Information Management, D. De Roure, W. Hall, H. Davis and J. Dale, Proceedings of PAAM'96
[Grønbæk & Trigg] Toward a Dexter-based reference model for open hypermedia: Unifying embedded references and link objects, K. Grønbæk, R. Trigg, in the Proceedings of the Seventh ACM Conference on Hypertext, 1996.
[Hall 94] W. Hall, Ending the Tyranny of the Link, IEEE Multimedia 1,1 pp 60-68 (1994).
[Hill et al 93] G. Hill, R. Wilkins, W. Hall, Open and Reconfigurable Hypermedia Systems: A Filter Based Model, Hypermedia, 5(2), 1993.
[Hill et al 95] Applying Open Hypertext Principles to the WWW, G. Hill, W. Hall, D. De Roure, L. Carr , International Workshop on Hypermedia Design 1995, Montpellier, France, 1-2 June 1995.
[Malcolm et al 91] Malcolm, K.C., Poltrock, S.E., Schuler, D. Industrial Strength Hypermedia: Requirements for a Large Engineering Enterprise. In: Hypertext 91: Proceedings of Third ACM Conference on Hypertext, San Antonio, TX. ACM Press, 1991, 13-24.
[Meeks et al] Transducers and Associates: Circumventing the Limitations of the World Wide Web, W. Meeks, C. Brooks, M. Mazer, in Proceedings of the Conference on Emerging Technologies and Applications in Communications 96, Portland, Oregon.
[Østerbye & Wiil] The Flag Taxonomy of Open Hypermedia Systems, K. Østerbye, U. Wiil, in the Proceedings of the Seventh ACM Conference on Hypertext, 1996.
[Pearl 89] Pearl, A. Sun's Link Service: A Protocol for Open Linking. In: Hypertext '89 Proceedings, Pittsburgh PA, 1989, 137 - 146
[Röscheisen et al] Shared Web Annotations as a Platform for Third-party Value-Added Information Providers: Architecture, Protocols and Usage Examples, M. Röscheisen, C. Mogensen, T. Winograd, Technical Report CSDTR/DLTR, Computer Science Department, Stanford University, Stanford, CA 94305, USA. <URL: http://www-diglib.stanford.edu/rmr/TR/TR.html>