The use of hypermedia links to navigate multimedia information collections has grown dramatically with the rising popularity of the World Wide Web. Typically, links on the web run from one point to another, and their source anchors tend to be text based.
Over the last six years the Multimedia Research Group at Southampton University has developed Microcosm, an open architecture for hypermedia systems handling large multimedia information collections [4, 7, 6, 11]. The architecture has three layers: an applications layer, in which general and special purpose media viewers, together with third-party application packages, provide the user interface to the multimedia collection; a link service layer, in which any number of communicating processes provide the hypermedia functionality; and a storage layer, where the documents, link information and other databases are maintained.
One important feature of the architecture is that link information is held in link databases (linkbases), separately from the documents being linked, so that documents remain in their native format and may be prepared and viewed with third-party application packages. This also means that links may be established with media on read-only devices such as CD-ROM.
A second important feature is the generic link. Generic links differ from the more common point-to-point links in that, once a link has been authored between a source selection and a destination point, it may be followed from every occurrence of the source selection in any document which has access to the linkbase. Thus a generic link from the word "Amsterdam" to a video of the city may be followed from any occurrence of the word "Amsterdam". At the link authoring stage, the generic link works by storing the content of the source selection (i.e. the word "Amsterdam") as the source anchor of the link; at the link following stage, it works by matching the user's selection against the source anchors in the linkbase.
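The authoring and following stages described above can be illustrated with a minimal sketch. It is not the Microcosm implementation: the names `GenericLink`, `Linkbase`, `author` and `follow`, the destination string, and the choice of case-insensitive exact matching are all assumptions made purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class GenericLink:
    source_anchor: str  # content of the source selection, e.g. "Amsterdam"
    destination: str    # hypothetical destination identifier


@dataclass
class Linkbase:
    """Links are held here, separately from the documents they connect."""
    links: list = field(default_factory=list)

    def author(self, selection: str, destination: str) -> None:
        # Authoring stage: store the *content* of the selection as the anchor,
        # not its position in any particular document.
        self.links.append(GenericLink(selection.strip(), destination))

    def follow(self, selection: str) -> list:
        # Following stage: match the user's selection against stored anchors.
        # Case-insensitive exact matching is an assumption of this sketch.
        key = selection.strip().lower()
        return [link.destination for link in self.links
                if link.source_anchor.lower() == key]


linkbase = Linkbase()
linkbase.author("Amsterdam", "video/amsterdam-city")  # hypothetical destination

# The same link can be followed from a selection made in any document.
destinations = linkbase.follow("amsterdam")
```

Because the anchor records content rather than a document position, a single authored link serves every document that consults the linkbase, and the documents themselves are never modified.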
We have recently extended the Microcosm architecture to create MAVIS (Microcosm Architecture for Video, Image and Sound) which provides a framework for authoring and following generic links from non-text media[11]. Thus, a user may select part of an image by dragging a rectangle over the required region with the mouse, and may specify this selection as the source anchor for a generic link. A user making another sub-image selection in another image may follow the link if the selection is similar to the source anchor on which the link was authored. The generic links in non-text media work by extracting a variety of representations or signatures from the selection and using pattern matching between signatures at the link following stage.
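As a rough illustration of signature-based link following, the sketch below uses a normalised grey-level histogram as the signature and histogram intersection as the matching function. Both are stand-ins chosen for brevity; the text above does not specify which signatures or matchers MAVIS uses. The function names, the `(signature, destination)` linkbase layout, the bin count and the 0.8 threshold are all hypothetical.

```python
def histogram_signature(pixels, bins=4, levels=256):
    """Normalised grey-level histogram of a rectangular selection.

    `pixels` is a list of rows of integer grey values in [0, levels).
    """
    counts = [0] * bins
    total = 0
    for row in pixels:
        for value in row:
            counts[value * bins // levels] += 1
            total += 1
    if total == 0:
        raise ValueError("selection is empty")
    return [c / total for c in counts]


def similarity(sig_a, sig_b):
    """Histogram intersection: 1.0 for identical signatures, 0.0 for disjoint ones."""
    return sum(min(a, b) for a, b in zip(sig_a, sig_b))


def follow_image_link(selection, linkbase, threshold=0.8):
    """Return destinations whose anchor signature is similar to the selection.

    `linkbase` is a list of (anchor_signature, destination) pairs; the
    threshold is an arbitrary value chosen for this sketch.
    """
    sig = histogram_signature(selection)
    return [dest for anchor_sig, dest in linkbase
            if similarity(sig, anchor_sig) >= threshold]


# Authoring: store the signature of the selected region as the source anchor.
dark_region = [[10, 20], [30, 40]]
image_linkbase = [(histogram_signature(dark_region), "doc/dark-texture")]

# Following: a similar selection made in another image resolves the same link.
matches = follow_image_link([[5, 15], [25, 35]], image_linkbase)
```

A single histogram ignores spatial layout, which is one reason a system would extract a variety of signatures from each selection and match on several of them.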
It should be clear that the generic link works by matching representations of the selection content. Hence we can refer to navigation with generic links as content based navigation.
In the next section we briefly refer to related work and in section 3 we give more detail of generic link following with MAVIS. In section 4 we discuss how a digital thesaurus could enhance generic link following from text and then extend the approach by describing, with a prototype example, how a multimedia thesaurus could provide enhanced generic link following from text and non-text media. The paper concludes with some final comments and a note on future work.