In the Microcosm architecture, a link authored from text is stored in the linkbase as information about its source anchor and its destination anchor. For a specific (point-to-point) link, the source anchor records the selection content, the file in which the selection occurs and its location within that file. To follow a specific link from a particular selection in a particular document, all three items in the source anchor must match those of the selection. When a generic link is authored from text, only the selection itself is recorded in the linkbase as the source anchor. To follow a generic link from a particular selection, only the selection content needs to match that recorded in the link. Thus, a generic link may be followed from any instance of the source selection in any document. This substantially reduces link authoring effort, but until the introduction of MAVIS the generic link was only available from text.
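The matching rules for specific and generic links can be illustrated with a minimal sketch. It is not Microcosm code: the names `SourceAnchor`, `Link`, `can_follow` and `follow`, the use of a character offset as the location, and the file names are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class SourceAnchor:
    """Hypothetical source anchor: a generic link stores only the selection."""
    selection: str
    document: Optional[str] = None  # None for a generic link
    offset: Optional[int] = None    # None for a generic link

    @property
    def is_generic(self) -> bool:
        return self.document is None and self.offset is None


@dataclass(frozen=True)
class Link:
    source: SourceAnchor
    destination: str


def can_follow(link: Link, selection: str, document: str, offset: int) -> bool:
    """Apply the matching rule for the link type."""
    if link.source.selection != selection:
        return False  # text selections must match exactly
    if link.source.is_generic:
        return True   # generic link: the selection alone is enough
    # specific link: document and location must match as well
    return link.source.document == document and link.source.offset == offset


def follow(linkbase: List[Link], selection: str, document: str, offset: int) -> List[str]:
    """Return the destinations of every link that can be followed."""
    return [l.destination for l in linkbase if can_follow(l, selection, document, offset)]


# Illustrative linkbase: one specific link and one generic link.
linkbase = [
    Link(SourceAnchor("vase", "intro.txt", 120), "vase_history.txt"),
    Link(SourceAnchor("vase"), "glossary.txt"),
]

print(follow(linkbase, "vase", "intro.txt", 120))
print(follow(linkbase, "vase", "letters.txt", 7))
```

In this sketch the first query satisfies both links, whereas the second, made from a different document, is satisfied only by the generic link.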
The MAVIS architecture was developed to provide generic links from non-text media. It recognises that this is not simply a matter of recording the media-based selection in the link source anchor and then matching selections when a link is followed. Text selections can be matched exactly, but non-text selections require similarity estimation. The architecture also recognises that, in order to provide content-based retrieval and navigation from non-text media, it is necessary to be able to extract and match a wide range of representations from a range of different media types. This functionality is provided by modules controlled by a media table subsystem. Each module is responsible for all the processing associated with one particular representation for one particular medium. It must contain algorithms to extract the representation, to index the representation for rapid retrieval and to estimate a similarity value given two representations to be matched.
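The three responsibilities of a module (extraction, indexing and similarity estimation) might be expressed as an abstract interface along the following lines. This is a sketch under stated assumptions rather than the MAVIS implementation: the class and method names are hypothetical, the toy module uses a coarse grey-level histogram compared by histogram intersection, and its "index" is a plain dictionary scanned linearly, whereas a real module would use a structure designed for rapid retrieval.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Sequence, Tuple


class RepresentationModule(ABC):
    """Hypothetical interface: one representation for one medium."""
    medium: str

    @abstractmethod
    def extract(self, selection):
        """Compute the representation of a selection."""

    @abstractmethod
    def index(self, key: str, representation) -> None:
        """Store a representation so that it can be retrieved later."""

    @abstractmethod
    def similarity(self, a, b) -> float:
        """Estimate how similar two representations are."""


class ToyHistogramModule(RepresentationModule):
    """Illustrative module: normalised grey-level histogram of pixel values."""
    medium = "image"

    def __init__(self, bins: int = 4, max_value: int = 256) -> None:
        self.bins = bins
        self.max_value = max_value
        self._store: Dict[str, Tuple[float, ...]] = {}

    def extract(self, selection: Sequence[int]) -> Tuple[float, ...]:
        if not selection:
            raise ValueError("selection must contain at least one value")
        counts = [0] * self.bins
        for value in selection:
            counts[min(value * self.bins // self.max_value, self.bins - 1)] += 1
        return tuple(c / len(selection) for c in counts)

    def index(self, key: str, representation: Tuple[float, ...]) -> None:
        self._store[key] = representation  # assumption: no real index structure

    def similarity(self, a: Tuple[float, ...], b: Tuple[float, ...]) -> float:
        # Histogram intersection: 1.0 for identical, 0.0 for disjoint histograms.
        return sum(min(x, y) for x, y in zip(a, b))

    def query(self, representation: Tuple[float, ...], top_k: int = 3) -> List[Tuple[str, float]]:
        """Rank stored selections by similarity to the query representation."""
        scored = [(k, self.similarity(representation, r)) for k, r in self._store.items()]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:top_k]


module = ToyHistogramModule()
module.index("dark_region", module.extract([0, 10, 20, 30]))
module.index("bright_region", module.extract([200, 220, 240, 250]))
print(module.query(module.extract([5, 5, 5, 70])))
```

Unlike exact text matching, the query returns a ranked list of candidates with similarity scores, so a threshold or a limit on the number of results is needed to decide which matches to present.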
The main modules implemented so far include a colour histogram module based on the Tek HVC colour representation, a texture module which uses statistical geometric texture statistics [3], a shape module which uses rotation, scale and translation invariant moments, a further shape module which uses chord length distribution and a demonstrator sound module which uses the Fourier transform of the selection from a digital sound file as the representation. The media table subsystem maintains an association between the different media and the modules which can process selections from them. Via the subsystem, the user can control which modules are active and, if retrieval or link following is to be based on more than one representation, the weighting associated with each representation. The architecture is designed so that additional modules may be added to support other representations as required. More details of the MAVIS architecture and its use for content-based retrieval and navigation have already been published [11].
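The role of the media table can be sketched as a registry that maps each medium to its modules, together with an active flag and a weight for each module. The `MediaTable` class below is hypothetical, and combining the module scores as a weighted mean is an assumption made for illustration; the text does not specify how the weightings are applied. The two registered matchers are stand-ins that return fixed scores.

```python
from typing import Callable, Dict, List

# A matcher takes two selections and returns a similarity value.
Matcher = Callable[[object, object], float]


class MediaTable:
    """Hypothetical registry associating media with representation modules."""

    def __init__(self) -> None:
        self._entries: Dict[str, Dict[str, dict]] = {}

    def register(self, medium: str, name: str, matcher: Matcher, weight: float = 1.0) -> None:
        """Add a module for a medium; new modules start active."""
        self._entries.setdefault(medium, {})[name] = {
            "matcher": matcher, "weight": weight, "active": True,
        }

    def set_active(self, medium: str, name: str, active: bool) -> None:
        self._entries[medium][name]["active"] = active

    def set_weight(self, medium: str, name: str, weight: float) -> None:
        if weight < 0:
            raise ValueError("weight must be non-negative")
        self._entries[medium][name]["weight"] = weight

    def active_modules(self, medium: str) -> List[str]:
        return [n for n, e in self._entries.get(medium, {}).items() if e["active"]]

    def combined_similarity(self, medium: str, a: object, b: object) -> float:
        """Weighted mean of the scores from all active modules (an assumption)."""
        active = [e for e in self._entries.get(medium, {}).values() if e["active"]]
        total = sum(e["weight"] for e in active)
        if total <= 0:
            raise ValueError("no active, positively weighted module for this medium")
        return sum(e["weight"] * e["matcher"](a, b) for e in active) / total


table = MediaTable()
# Stand-in matchers with fixed scores, for illustration only.
table.register("image", "colour_histogram", lambda a, b: 1.0, weight=3.0)
table.register("image", "texture", lambda a, b: 0.0, weight=1.0)

print(table.combined_similarity("image", "selection_a", "selection_b"))
table.set_active("image", "texture", False)
print(table.combined_similarity("image", "selection_a", "selection_b"))
```

Because modules are looked up by medium and name at query time, a new representation can be supported by registering another module without changing the code that follows links.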