The World-Wide Web is an open system: its formats and protocols are well-documented and are negotiated in an international open forum. However, its current use as a hypermedia system is closed in the sense that the link information is hidden within a document's data, working against the aims of large-scale hypermedia mentioned above. Since this 'closedness' is not a fundamental design feature of the Web but a consequence of current practice in Web document design, it is entirely possible to augment the technology to provide the kind of link service described above. This article discusses the ways and means of combining link service capabilities with the World-Wide Web.
The experience that most other hypertext systems provide is in the realm of individual documents or local document collections with a controlled environment and context. The design of the WWW project has kept the node and links model of these traditional closed hypertext systems intact but extended the node addressing scheme to allow remote nodes and defined a node transport mechanism to allow the hypertext to be extended across a network.
This simple node-links model, and the familiar authoring paradigm that accompanies it, have particular implications for the scaleability and maintainability of a very large and highly distributed corpus.
The problem of topic-based navigation of the Web is similar to the problem of finding a file on a particular subject on the Internet's anonymous FTP service. In that environment, enthusiastic volunteers at first published regular lists of sites and of the kinds of files held at each site. Some sites also provided a file containing a complete list of all the files available from their machine. Eventually a single site provided a database of the names of the files available at all of the well-known anonymous FTP sites; an interactive query service (known as archie) allowed any user to find out where a file was archived given a fragment of that file's name. This service has now been replicated at several dozen sites across the whole Internet, so that any user can obtain a list of potentially relevant files, as long as the name of each file is indicative of its contents. A similar system could be applied to the Web; software is already available to allow an administrator to catalogue each of a Web server's files automatically.
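The essence of the archie-style service described above can be sketched as a two-part process: an indexer that walks a server's files recording where each name occurs, and a query function that matches a name fragment. This is a minimal illustrative sketch, not the actual archie or Web-cataloguing software; the function names are our own.

```python
import os

def catalogue(root):
    """Walk a server's document tree and build an archie-style index
    mapping each file name to the paths at which it is stored.
    (Illustrative sketch; real cataloguing tools also recorded
    details such as sizes and dates.)"""
    index = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            index.setdefault(name, []).append(os.path.join(dirpath, name))
    return index

def query(index, fragment):
    """archie-style lookup: return every location whose file name
    contains the given fragment."""
    return [path
            for name, paths in index.items()
            if fragment in name
            for path in paths]
```

As the text notes, such a service only helps when file names are indicative of file contents; the query matches names, never the data inside the files.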
Link fossilisation is a significant disadvantage of the WWW, and occurs because link specifications have to be published as part of the document and cannot be changed without revising the document. Link decay is also seen, since links refer to their destination anchors via a specific machine name and path name. Any change to the position of the destination requires every source document which refers to it to be changed: once published, a document can never be moved or deleted. Although this is not an insurmountable problem in a locally controlled context, the WWW used as a world-wide publishing mechanism assumes that every document is forever associated with its published address. Dead ends frequently occur in the WWW because only native WWW documents can have embedded links. If traversing a link leads to a foreign document being displayed by a foreign application (e.g. a spreadsheet file displayed by Excel) then no WWW links may be followed from it.
The flexibility of Microcosm link sources provides a reversed hypertext authoring paradigm: which other nodes may lead to the current node, rather than where can the current node lead? Effectively the author, using the generic link mechanism, is labelling the document with key words or key phrases. The authoring paradigm has thus become declarative in nature, describing the data rather than the processes involved in document links.
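The generic link mechanism can be sketched as a linkbase keyed by phrase, consulted whenever the user selects text in any document. The linkbase contents and names below are hypothetical, and the matching shown is deliberately simplified:

```python
def follow_link(linkbase, selection):
    """Resolve a user selection against a linkbase of generic links.
    A generic link fires on its key phrase wherever that phrase is
    selected, in any document, so the author describes the data
    ('this phrase concerns X') rather than wiring node to node.
    Matching here is a plain case-insensitive comparison; Microcosm
    itself also supports links scoped to a particular document or
    anchor position."""
    phrase = selection.strip().lower()
    return linkbase.get(phrase, [])

# A hypothetical linkbase for a biology course: each entry labels a
# key phrase with the destination documents it should lead to.
linkbase = {
    "mitochondrion": ["organelles.html#mito", "energy-metabolism.html"],
    "cell membrane": ["membranes.html"],
}
```

The declarative character is visible in the data: the linkbase says what each phrase is about, not which button in which document leads where.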
Hypertext packages are frequently difficult to author in a scaleable or generic fashion which allows for expansion or economical re-use for different purposes. The links, authored for a particular purpose, are fixed inside the document content and fixed to specific destinations.
Updating a Microcosm hypertext by adding new nodes involves one of two scenarios. If the nodes are new general resources (primary materials) then a group of new generic links must be added which will retrospectively apply to the existing hypertext components. If instead they are new secondary materials (e.g. student essays or teacher commentaries on the primary materials) then they will already be affected by the existing links. In this respect the Microcosm hypertext model is incrementally scaleable.
Changing the purpose of the hypertext may involve keeping the collection of nodes substantially the same, but reworking the links to provide different structures of access. In many hypertext environments, including the Web, changing the links means rewriting the texts because the links are embedded in the texts. In Microcosm it simply means applying a new set of linkbases to the same material, in a similar way to Intermedia's use of webs. A further advantage of Microcosm is that material added during the repurposing process will automatically be affected by any retained linkbases. Since many hypertext environments provide only embedded point-to-point linking (i.e. from here you can go there), they fail to offer such expandability or maintainability.
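Repurposing by swapping linkbases can be sketched as follows: the resolver takes an ordered set of linkbases, so changing the hypertext's purpose means changing that set while the documents stay untouched. The linkbases shown are hypothetical examples:

```python
def resolve(linkbases, selection):
    """Apply an ordered set of linkbases to one selection, collecting
    every destination offered.  Repurposing the hypertext means
    substituting a different set of linkbases here, not rewriting
    any document; new material is covered automatically because the
    retained linkbases apply to any selection, wherever it is made."""
    phrase = selection.strip().lower()
    hits = []
    for lb in linkbases:
        hits.extend(lb.get(phrase, []))
    return hits

# Hypothetical linkbases authored over the same document collection:
# one for an introductory course, one for an advanced seminar.
intro = {"entropy": ["glossary.html#entropy"]}
advanced = {"entropy": ["boltzmann.html", "information-theory.html"]}
```

Presenting the material to beginners means resolving against `[intro]`; re-presenting the identical documents to an advanced audience means resolving against `[advanced]`, or both sets together.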
As a particular example of the advantages of this authoring paradigm, consider setting up a multiple-choice test based on material in a standard course text. In a normal environment containing only specific links between nodes, for each possible wrong link (i.e. wrong answer) a separate correcting explanation must be written for the user, recalling the material in the original sources. Using Microcosm, the question, the text of each answer and any explanations written will automatically be linked back to the concepts in the original sources.
Microcosm does not suffer from some of the problems of the Web. Dead ends do not occur because almost any program can be used as a Microcosm viewer for many different kinds of data: links can be followed not only between text and graphic files, but between word processed documents, CAD documents, spreadsheets, databases, video documents and simulations etc. Links do not get fossilised because they are not embedded in the documents to which they refer, and they are less prone to decay because they represent rules for linking sets of documents together, rather than specific hardwired document references.
We are experimenting with various approaches to combining Microcosm and the World-Wide Web. The first approach is to treat the Web as just another application which the user can control from their personal information environment and in which Microcosm acts as the 'glue', linking together information from Web pages and local documents. In this scenario the local information environment, controlled by Microcosm, is the primary focus and the WWW viewer co-operates to provide the usual link following and authoring services to the user, so that the user can follow Microcosm links to and from Web pages as well as clicking on buttons in Web pages.
The second approach is to treat the two environments as distinct, but to provide a conversion from sets of Microcosm hypertexts into the appropriate Web formats. This allows the hypertext author to create documents and links in Microcosm's flexible environment and then have them compiled together with the documents by mcm2html into sets of static HTML files for the Web. The end user then sees only a set of WWW documents with embedded buttons for navigation.
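The kind of transformation mcm2html performs might be sketched as follows, assuming generic links keyed by phrase (the text does not describe the tool's internals, so this is a simplified guess at one step of the compilation):

```python
import re

def compile_to_html(text, linkbase):
    """One step of compiling a document for the static-Web approach:
    every occurrence of a generic-link phrase is replaced by an
    embedded HTML anchor pointing at that link's first destination.
    This freezes Microcosm's dynamic links into the fixed embedded
    buttons a plain WWW browser expects.  (Naive sketch: it ignores
    multiple destinations and does not guard against phrases
    matching inside already-generated markup.)"""
    for phrase, dests in linkbase.items():
        pattern = re.compile(re.escape(phrase), re.IGNORECASE)
        text = pattern.sub(
            lambda m, d=dests[0]: '<a href="%s">%s</a>' % (d, m.group(0)),
            text)
    return text
```

The trade-off described in the text is visible here: once compiled, the links are embedded and static, so any change to the linkbase requires recompiling the whole document set.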
A third approach is to provide Microcosm's flexible link services to WWW users who do not have a local Microcosm environment. This is achieved by mimicking both Microcosm's architecture and external link databases in the Web. The Microcosm architecture consists of individual requests marshalled through a chain of processes: these requests are implemented as HTTP messages, received by a CGI script and then routed through a set of processes on the server. Each of these processes may try to satisfy a link request by accessing a particular link database, or by matching some data in an external resource such as a dictionary or a set of manual pages. Any results get returned to the user as an answer to the HTTP request.
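The server-side chain of processes can be sketched as a sequence of filter functions, each given the chance to satisfy the link request. The filters and their contents below are hypothetical stand-ins for the link databases, dictionaries and manual-page matchers the text mentions:

```python
def handle_request(query, filters):
    """Server side of the CGI approach: a link request (here reduced
    to the selected text) is routed through a chain of filter
    processes.  Each filter may contribute destinations -- from a
    link database, a dictionary, a set of manual pages -- and the
    combined results are returned as the answer to the HTTP request."""
    results = []
    for f in filters:
        results.extend(f(query))
    return results

# Two hypothetical filters: a linkbase lookup and a glossary lookup.
def linkbase_filter(query):
    linkbase = {"http": ["rfc-index.html"]}
    return linkbase.get(query.lower(), [])

def glossary_filter(query):
    glossary = {"http": ["glossary.html#http"]}
    return glossary.get(query.lower(), [])
```

In the real system each filter is a separate process on the server and the request arrives as an HTTP message via a CGI script; the chain structure, however, is the essential point.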
The user's interface to this third approach is in the form of an adjunct to the standard WWW browser, an icon which allows the user to bring up a menu of link options (follow/create/show links). This icon may be attached to the browser's title bar itself (as in the Microsoft Windows version) or may be a part of the desktop (the X11 version), but it is required to allow the user access to Microcosm's selection/action link following paradigm: the user may select a piece of text in the WWW client (or any other) window and choose follow link from the adjunct's menu. The adjunct causes the WWW browser to send an HTTP request to the server (using the CCI standard, if available) and after a short delay the client receives an HTML document with a list of possible destinations that were determined by the server. This document (titled "Available Links") contains a set of standard HTML buttons which link to the destinations given in the link databases, and allows the user to choose from the set of possible destinations.
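The "Available Links" reply document might be generated along these lines: an HTML page listing one anchor per destination returned by the link databases, from which the user chooses. The exact markup the system emits is not given in the text, so this is an assumed, minimal rendering:

```python
def available_links_page(destinations):
    """Render an 'Available Links' reply document: a page with one
    standard HTML anchor per possible destination.  The user follows
    whichever anchor corresponds to the destination they want."""
    items = "\n".join('<li><a href="%s">%s</a></li>' % (d, d)
                      for d in destinations)
    return ("<html><head><title>Available Links</title></head>\n"
            "<body><h1>Available Links</h1>\n<ul>\n%s\n</ul>\n"
            "</body></html>" % items)
```

Because the reply is itself an ordinary HTML document, any unmodified WWW browser can display it and follow the chosen link, which is what allows this approach to serve users with no local Microcosm environment.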
An open hypermedia system like Microcosm has an intrinsically different feel from closed hypermedia systems. The onus is on the user to interrogate the system in order to ask for more information, rather than expecting the system to announce to the user that there is more information about a particular subject. This far more flexible model allows us to customise the hypertext environment to the user's needs [Hall 94].
Currently, in order to access a piece of information on the Web it is necessary either to know its address or to be able to find a document that contains a link which references it. In an environment which has no alternative method of navigation (e.g. a hierarchical structure) this can cause considerable problems, especially if documents are revised. Although this is a problem even in a localised hypertext environment, it is especially significant in a global, uncoordinated information system. Using generic links, the reader can instead select any relevant text to act as a link to the required information.
Similarly, WWW authors would have greater freedom in the authoring process: instead of providing explicit buttons for navigation to every relevant piece of material, generic and other dynamically generated links can be used to provide a range of services across a whole domain of information.