Figure 1: A User Interface for the MMT with MAVIS
The thesaurus described so far is similar to thesauruses which have been used in information retrieval systems for many years. However, integrating a digital thesaurus with generic link following significantly enhances the potential of generic link following from text. More importantly, we propose a multimedia extension to the basic digital thesaurus concept which will provide enhanced generic link following from non-text media as well as from text.
The multimedia thesaurus (MMT) consists of a network of representations. Many of these will be text, corresponding to a traditional digital thesaurus, but some will be representations of terms extracted from other media. We have seen that, in order to provide generic links from non-text media, indexes of features such as shape or texture description vectors or sound representations are accumulated. In the multimedia thesaurus these will be associated with equivalent text representations. A new set of relations in the network will, at a minimum, consist of broader representation, narrower representation, equivalent representation and related representation, as generalisations of the text term relations mentioned earlier. The MMT will also include preferred representation indicators, which will typically be attached to a text representation, as this will offer maximum storage efficiency and matching efficiency when link following. An is_a relation will also be used to indicate representations of specific instances of objects. For image-based representations, thumbnails of the image selection will be associated with the node to facilitate more informed navigation around the MMT.
In a natural extension to the MMT, other relations such as is-part-of and is-older-than could be introduced, making the MMT a semantic network of media-based representations and offering users more powerful search facilities at the link following stage.
It is possible to view the MMT structurally in two parts. The first is a network of concepts and their relationships, that is, the MMT restricted to its preferred terms. This contains all the important conceptual/semantic relationships in the MMT. The second part consists of all the equivalent representations, which are linked to their preferred term nodes in the concept network. We have already prototyped a concept network as an aid to enhanced link following [2].
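The two-part view can be illustrated with a small data structure. The following is only a minimal sketch under stated assumptions, not the MAVIS implementation: the names `Representation`, `Concept` and `MMT`, the use of a text term as the preferred representation, and the identifier `shape-vector-001` are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Representation:
    """One representation of a term (hypothetical structure)."""
    media: str  # e.g. "text", "shape", "texture", "sound"
    key: str    # a text term, or an identifier for a stored feature vector


@dataclass
class Concept:
    """A node of the concept network, keyed by its preferred representation."""
    preferred: Representation
    equivalents: set = field(default_factory=set)  # equivalent representations
    broader: set = field(default_factory=set)      # preferred terms of broader concepts
    narrower: set = field(default_factory=set)     # preferred terms of narrower concepts
    related: set = field(default_factory=set)      # preferred terms of related concepts


class MMT:
    def __init__(self):
        self.concepts = {}  # part 1: preferred text term -> Concept
        self.index = {}     # part 2: any representation -> preferred text term

    def add_concept(self, term):
        rep = Representation("text", term)
        self.concepts[term] = Concept(preferred=rep)
        self.index[rep] = term
        return self.concepts[term]

    def add_equivalent(self, term, rep):
        """Attach a media-based representation to an existing preferred term."""
        self.concepts[term].equivalents.add(rep)
        self.index[rep] = term

    def add_broader(self, narrow, broad):
        """Record both directions of the broader/narrower relation."""
        self.concepts[narrow].broader.add(broad)
        self.concepts[broad].narrower.add(narrow)

    def preferred_of(self, rep):
        """Return the preferred term for a representation, or None if unknown."""
        return self.index.get(rep)


# Illustrative usage with made-up data.
mmt = MMT()
mmt.add_concept("amphora")
mmt.add_concept("Greek amphora")
mmt.add_broader("Greek amphora", "amphora")
shape = Representation("shape", "shape-vector-001")
mmt.add_equivalent("Greek amphora", shape)
```

In this sketch the conceptual relations live only on the preferred-term nodes, while `index` maps every equivalent representation back to its node, mirroring the split described above.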
By associating media-based representations with equivalent term nodes in the MMT at the generic link authoring stage (if they are not already in the MMT), the following new levels of functionality become possible:
When a generic link is authored from a representation of an object which is in the MMT, it will be possible to follow the link from any of the media-based representations for the object which are also present. Thus, a generic link could be authored from an image of a Greek amphora using a shape representation and followed from a text document via selection of the text term "Greek amphora".
In particular application areas, multiple views of objects may be available. For example, in a multimedia museum application, multiple views of an exhibit may be provided, and representations of these could be associated with each other via the MMT. A generic link from one view could then be followed from any of the other views of the same artefact.
As in the case of text described earlier, it will be possible to use the MMT at the link following stage to broaden or narrow the specificity of the attempt at generic link following by navigating the broader and narrower representation relations in the MMT.
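The behaviour described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the system's actual algorithm: the tables `EQUIVALENT`, `BROADER` and `LINKS`, the function `follow`, and all identifiers such as `doc:greek-pottery` are hypothetical. Generic links are assumed to be stored against preferred terms, and a non-text selection is resolved by exact lookup, whereas a real system would match feature vectors by similarity.

```python
from collections import deque

# Hypothetical MMT fragment: each representation maps to its preferred term.
EQUIVALENT = {
    ("text", "Greek amphora"): "Greek amphora",
    ("shape", "shape-vector-001"): "Greek amphora",  # shape of an amphora image
    ("text", "amphora"): "amphora",
    ("text", "vessel"): "vessel",
}

# Broader representation relation between preferred terms.
BROADER = {"Greek amphora": ["amphora"], "amphora": ["vessel"]}

# Generic links, stored against preferred terms (an assumption of this sketch).
LINKS = {
    "Greek amphora": ["doc:greek-pottery"],
    "vessel": ["doc:ceramic-vessels"],
}


def follow(rep, broaden=0):
    """Return link destinations reachable from a selected representation.

    `broaden` is the number of broader-representation steps to include
    in addition to the selected term itself.
    """
    term = EQUIVALENT.get(rep)
    if term is None:
        return []
    results = []
    seen = {term}
    queue = deque([(term, 0)])
    while queue:
        current, depth = queue.popleft()
        results.extend(LINKS.get(current, []))
        if depth < broaden:
            for parent in BROADER.get(current, []):
                if parent not in seen:
                    seen.add(parent)
                    queue.append((parent, depth + 1))
    return results


# The same link is found from the text term and from the shape representation.
from_text = follow(("text", "Greek amphora"))
from_shape = follow(("shape", "shape-vector-001"))
# Broadening by two steps also reaches links authored on "vessel".
broadened = follow(("text", "Greek amphora"), broaden=2)
```

Narrowing could be handled symmetrically by traversing a narrower-representation table in place of `BROADER`.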
Although the use of an MMT offers significant enhancements for generic link following, the construction of the MMT is a non-trivial activity and will involve substantial time overheads. However, for some applications the text thesaurus may already exist, or the time investment may be deemed worthwhile.