A Model for Authoring and Costing an Industrial Hypermedia Application.

Wills GB, Heath I, Crowder RM, Hall W.

Technical Report No. 98-6 November 1998.

ISBN Number 085432685-5

Abstract.

This report critically examines the general authoring methodologies presented in the literature and proposes an authoring methodology suitable for large-scale hypermedia applications in the industrial environment. It then examines the information required to estimate the cost of producing an industrial application using the methodology presented.

Contents.

1. Introduction

2. Hypermedia Authoring.
2.1 Authoring Methodology for an Industrial Environment.
2.2 Information Audit.
2.3 Automatic Dissection of Large Text Documents.
2.4 Automatic Link Generation.
2.4.1 Automatic Explicit Link Generation.
2.5 Information Reuse.
2.5.1 Modular-Hypermedia Applications and Information Reuse
2.6 Existing Information
2.6.1 Compatibility between Different Electronic Asset Systems
2.6.2 Dealing with Asset Paper Information.
2.7 Link Integrity.
2.7.1 Updating the information.

3. Effort Estimation.
3.1 Effort Estimation for the industrial application.
3.2 Learning Curve Effect
3.3 Legacy Paper information.
3.4 Legacy Electronic Information
3.5 New Information.
3.6 Structural Linking.
3.7 Cognitive and Pedagogical Linking.
3.8 Records, Reports and Statistics.

4. Authoring Cost Model
4.1 Overheads.
4.2 Cost of Employment
4.3 The Equipment Cost
4.4 Additional Process Cost
4.5 Total Authoring Costs.

5. Summary

1. Introduction

This research presents an authoring methodology for use in producing an industrial hypermedia application. Hence, the authoring methodology needs to be flexible enough to accommodate the current and any future philosophies used by the manufacturing industry.

Hypermedia authoring is an unfortunate turn of phrase, as it could be confused with authoring in the traditional sense, which conjures up an image of a person writing a book, manual or procedure. Put simply, hypermedia authoring is the process by which associations between pieces of information are made using hyperlinks. However, hypermedia authoring is the term used in the hypertext community and will be used in this report.

In an industrial environment, there already exists an established information space, often consisting of several different information systems. These will already have structure, content and access methods. Therefore, existing information systems need to be encompassed in the design of any industrial hypermedia application [Wills 98]. In addition, the personnel who implement the authoring methodology are unlikely to be computer scientists or IT specialists. That is, most of the structural linking will be carried out by clerical assistants, with other specialists (procurement managers, engineers, process owners, quality staff, etc.) brought in to carry out pedagogical and conceptual linking. Hence, in this report the author has set out an authoring methodology that takes into account the structures and constraints that exist in an industrial environment, and the background of the people who will have to author and implement the system.

2. Hypermedia Authoring.

Ginige et al [Ginige 95] state that similar issues arise in developing hypermedia applications, as in paper based systems:

How to structure the information, which will inherently depend on the underlying structure.
The scale and intended use of the hypermedia application will determine the methodology used.
The information structure is determined by:

The material that will be included.
The underlying structure of the information and any subsets that are to be supported.
Navigation and access methods: table of contents, index, full-text search and keyword search.

They also suggest that creating hypermedia can be divided into two main areas:

Authoring: The process of structuring and storing the information in a fashion appropriate to the intended use of the information.
Publishing: The process of presenting the information to the user, including issues such as look and feel, screen layout and usability.

In addition, they give three common authoring approaches: programming-language-based, screen-based, and information-based. The one most applicable to this research is the information-based approach, where the content is obtained from existing information. The information is structured and stored in a database, and the author then links related concepts and decides how the information should be presented on the screen. Keeping the original structure of the information allows the viewpoint of the original author to be kept. This is well suited to the development of large hypermedia information systems that require a well-defined process.

Structuring the information involves [Ginige 95]:

Identifying the key concepts that best describe the information. Identifying the key concepts will help later when searching for specific information.
Breaking down the information into nodes. Each node should contain only a single theme. Nodes typically contain information that would lose all usefulness if broken down further.

Salton et al [Salton 94] have shown that by dividing the articles of an encyclopaedia into sections and subsections, each identified by an appropriate section heading, a query can be successfully refined. This retrieval strategy is applicable to a large collection of text.

Perlman [Perlman 89] also found that when "(re)structuring information, it is useful to consider the goals of the users. Different tasks may be better supported by different chunking and structuring". In addition, he noted that experienced users avoided getting lost by using bookmarks. Hutchings [Hutchings 93] suggests that, along with navigational aids, some form of guided tour would further enhance the ability of the naive user to familiarise themselves with the application.

The greatest challenge when linking a large industrial hypermedia application is the need to mentally manage all the existing nodes within the information space. This typically becomes a problem when the information space grows above 100 nodes. Mentally managing such a large number of links produces cognitive overload for the author. On smaller hypermedia applications, this problem can be overcome by using link maps, which show (sometimes pictorially) the source and destination of the links.

It can be argued that authoring really is a production process, in which conventional authoring (making the links) is but a small part [Ginige 97]. The rationale behind the argument is that authoring involves gathering and digitising the information, dissecting large documents, generating the links (manually or otherwise) within the information space and storing the information (see Figure 2-1). This does not accurately represent the process required for an industrial hypermedia application, as the information content cannot generally be changed (edited) without going through a change request procedure. In addition, some of the information will already be in proprietary electronic information systems. The process represented in Figure 2-1 is a one-off process, whereas in the industrial environment information will need to be added or removed with relative ease. Ideally this process should be mechanised.

Figure 2-1. The actual writing of a hypermedia application has been described as a production process.

2.1 Authoring Methodology for an Industrial Environment.

While the general authoring methodologies address some of the requirements of an industrial environment, the size and complexity of such an environment aggravate some of the problems found in general hypermedia authoring, namely:

Time required to author (hence cost).
Reducing the cognitive burden to the author.
Maintainability of the information in terms of document control standards.
Information reuse to ensure that the growth of the system is controlled.
Existing information systems (both paper and electronic).

Hence, an authoring methodology is required that takes these considerations into account. The authoring methodology for the industrial environment comes directly from the design methodology [Wills 98] and will include aspects of the best practice of paper-based documentation management systems and hypermedia authoring.

New hypermedia documents are written as a collection of nodes that are linked together to give some form of structure. Linking within an industrial hypermedia application should ensure that:

All relevant procedures, standards, fault trees, drawings, tool lists, safety instructions, etc. are included.
Links that can be grouped into cognitive frames should be on different Linkbases.
The procedural information will be linked together so that a pedagogical structure is in place.
In the event of a fault condition, the user will be provided with clear and safe instructions.

The summary of the complete authoring methodology for an industrial environment is shown in Figure 2-2. The different aspects are expounded upon below, while a practical 'How to' guide is included in a technical report [Wills 97b].

When authoring, a paradox exists. The documents (or nodes) need to be as short as possible while maintaining their meaning, as this aids updating and revision control. At the same time, the documents need to be long enough that the system is not continually accessing the storage medium. This is especially important if the storage medium is not a physical part of the delivery system. Therefore, existing ‘large’ text documents are to be ‘dissected’ into smaller nodes. The nodes will be linked together, so that the original structure is still available.

Figure 2-2 The Authoring methodology for an industrial hypermedia application.

2.2 Information Audit.

It is essential to identify what information is to be included in the application. Hence an information audit of the specific working environment is required. This can be conducted at the same time as the contextual review, as was the case in this research. However, the information audit is still required even in those situations in which a contextual review is not applicable, that is, when the people producing the industrial hypermedia application work in and understand the environment in which the application is to be used. The aim of the information audit is to identify:

Existing electronic systems and the level of integration required. This is to identify the effort required to enable the different systems to communicate with each other.
Existing electronic information and the level of conversion. Within each electronic system the information will have been written for display in a certain format. Therefore, if the hypermedia application is to have a consistent look and feel, the authors will need to identify which pieces of information need to be modified to conform to the procedures/guidelines for producing new documents. Typical changes are dissecting very long and commonly used documents into useful nodes, left-justifying text, removing large amounts of 'white space', using formatting and colours in text documents, using blocks in drawings, etc.
Existing information that is not currently in documentation control. This includes information that is kept in logbooks, or other 'useful' information kept as reminders of good work practice.
Additional information not yet included in the documented information space. This would include menus for navigation and information that could now be included (video, sound, photographs, background or educational material, etc.).

Once the information audit is complete, an estimate of the effort to produce the application can be made.

2.3 Automatic Dissection of Large Text Documents.

Paper documents have to present the information within the page size of the document. Therefore significant time and effort is spent on the formatting/pagination of documents, for example to ensure that images do not span page breaks. This is not a problem in electronic documents, as the concept of page size is lost. However, large electronic technical documents can be awkward and difficult to read by scrolling through the pages, especially if the text refers to tables or technical drawings elsewhere within the document. Similarly, engineering drawings are often drawn or conceptualised as one large drawing and then, with careful drafting, dissected into smaller drawings for use in maintenance, installation, etc.

While new or amended documents can be written so that they consist of several small files, enabling the users to access the required information effectively, it is necessary to dissect the existing large documents into smaller sections or nodes. Each of these smaller nodes will contain a reasonably sized chunk of information, so that the node makes sense in its own right. In practice, nodes will vary in length from the equivalent of a large paragraph through to a section several screen lengths long. This not only enables the user to get to the information they need easily, it also gives the author flexibility in laying out the screen for the novice user [Wills 97a]. That is, the user will be able to navigate a technical manual, for example, by using a contents page, by stepping through the nodes as if they were pages, or by jumping straight to a node from another document.

An example of the automatic ‘dissecting’ of a manual using macros has been completed. The original document was created using a template and consisted of 24 chapters and 320 pages, with each chapter held in a separate file. The procedure used was to:

First, put the file for each chapter into its own separate directory.
Then, working on each chapter in its directory:
1. Copy the contents page into a separate file and de-link the fields.
2. Copy the information between one Heading 2 style and the next Heading 2 style into a new file.
3. Save each new file under the name of the original chapter and the Heading 2 name.
4. Append the Eurotherm symbol and the manual name to the end of each new file.

The files were then registered with Microcosm [Heath 94].
A guided tour was created to ensure that the original structure was still available, so that for the user moving from one screen to the next was like turning a page in a book.
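
The dissection steps above can be illustrated with a short sketch. This is not the macro used in the study, which was written in the word-processor's own macro language. It is a minimal Python illustration that assumes a chapter has already been read into a list of (style, text) paragraphs. The names `Paragraph`, `dissect` and the footer string are hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass
class Paragraph:
    style: str  # e.g. "Heading 2" or "Normal" (assumed style names)
    text: str


def dissect(chapter_name, paragraphs, footer):
    """Split one chapter into nodes, one node per 'Heading 2' section.

    Text before the first Heading 2 is ignored in this sketch.
    """
    nodes = {}
    title, body = None, []

    def flush():
        # Skip a heading with no associated text (a common authoring error).
        if title is not None and any(line.strip() for line in body):
            safe = re.sub(r"[^A-Za-z0-9]+", "_", title).strip("_")
            # Node name = original chapter name + Heading 2 name.
            nodes[f"{chapter_name}_{safe}"] = "\n".join([title, *body, footer])

    for p in paragraphs:
        if p.style == "Heading 2":
            flush()
            title, body = p.text, []
        elif title is not None:
            body.append(p.text)
    flush()
    return nodes


chapter = [
    Paragraph("Heading 1", "Chapter 3"),
    Paragraph("Heading 2", "Installation"),
    Paragraph("Normal", "Mount the unit."),
    Paragraph("Heading 2", "Empty section"),
    Paragraph("Heading 2", "Wiring"),
    Paragraph("Normal", "Connect the supply."),
]
nodes = dissect("ch03", chapter, "Company logo | Manual name")
for name in nodes:
    print(name)
```

The heading with no text is dropped rather than saved as an empty node, which mirrors the kind of human error the macro had to tolerate.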

It was necessary to program the macro to cope with human error. The common errors were a heading style with no associated text, or a heading style applied to a page break. A file was appended to each section giving the company logo, company name and title of the document. The RTF viewer does not display the headers and footers of a document, as the concept of a page is lost when viewing electronically. The information contained in the header and footer therefore needs to be 'attached', as in the case above. In addition, it is possible to include common navigation buttons in this attached file. Caution is needed when adding buttons: if they are too specific in nature, they may cause problems with information reuse.

2.4 Automatic Link Generation.

Automatic generation of links is a current area of research [Bulford 96, Cleary 96]. This becomes more significant as the system gets larger and the cognitive burden for the author increases. Allen [Allen 96] has suggested that pattern matching can be used to find the structural links of a document. Structural links are those links that represent the layout or possible structure of a document. Carr refers to the structure of a technical document as having a superstructure, that is "despite the variety of texts, the author is leading the reader towards a particular conclusion via a particular interpretation of the facts--a directed presentation" [Carr 94]. Carr was able to use the structure and a mark-up language to produce embedded links in a document. That is, the author had to mark up the document in one application prior to running the process that created the links in another application package.

2.4.1 Automatic Explicit Link Generation.

The largest cost in producing a hypermedia application is authoring. The high effort and cost result from the time spent by a person experienced in the topic area in selecting and collating the information, and then manually linking the data into a cognitive and pedagogical structure that is easy to navigate [Crowder 97]. Therefore as much as possible of the authoring process needs to be automated, thereby reducing the time required of the ‘expert’ to link the hypermedia application and hence reducing the cost.

The hierarchical structure of many technical manuals and procedures comes to the aid of the industrial author, as most chapters, sections, subsections, etc. are formatted using heading styles, or at least are numbered, in bold or underlined, and normally in a different font size. Yet the majority of authoring effort is spent on producing these structural and explicit links, in an almost administrative role, to link:

The table of contents (or index) to the relevant section.
A list of tables (or figures) to the relevant table (or figure) and their reference in the text.
Explicit references to other procedures, data sheets, etc, be it on the local system, or on the World-Wide-Web.

Therefore, the majority of these links can be generated automatically using the macro languages available in many modern word-processing packages. Macros have been written using Visual Basic for Applications in Microsoft Office to automatically link a large existing manual that had been dissected into smaller nodes using the method described in section 2.3. This enables a user to access the major sections of a document directly, without having to go to the contents or index page, and with little or no authoring input, again reducing the time and cost of authoring.
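
As an illustration of how such structural links might be derived, the following sketch matches the entries of a contents list against the titles of the dissected nodes. It is a simplified Python model, not the Visual Basic for Applications macro used in this work, and the `Link` record is a hypothetical stand-in rather than the Microcosm linkbase format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    source_doc: str   # document holding the link anchor
    source_text: str  # anchor text, here the contents entry
    dest_doc: str     # node the link leads to
    kind: str         # "structural" in this sketch


def toc_links(toc_doc, toc_entries, node_titles):
    """Link each contents entry to the node whose title matches it.

    node_titles maps node file name -> section title.
    Entries with no matching title are returned for manual attention.
    """
    by_title = {title.strip().lower(): doc for doc, title in node_titles.items()}
    links, unmatched = [], []
    for entry in toc_entries:
        dest = by_title.get(entry.strip().lower())
        if dest is None:
            unmatched.append(entry)
        else:
            links.append(Link(toc_doc, entry, dest, "structural"))
    return links, unmatched


links, unmatched = toc_links(
    "contents.rtf",
    ["Installation", "Wiring Guide"],
    {"ch03_Installation.rtf": "Installation", "ch03_Wiring.rtf": "Wiring"},
)
print(links)
print(unmatched)
```

The unmatched list stands for the residual manual effort: a contents entry that has been reworded so that it no longer matches its section heading cannot be linked by simple matching.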

Automatic generation of links is made easier by the use of templates, procedures and guidelines for the construction of the documents. These guidelines, procedures and templates are a result of the design stage. In practice several templates, procedures or guidelines are required (e.g. for manuals, memos, reports, specifications, etc.). In many companies these will already exist, either as part of the company’s quality procedures or just as good practice, as they improve efficiency and encourage consistency. The small authoring effort required to complete the linking process was mainly due to human manipulation of the automatically generated contents list, that is, the headings in the contents list had been changed from the actual headings of the different sections. In addition, the author had to create a linkbase beforehand, as they would if generating the links manually. Figure 2-3 shows the procedure for automatically dissecting the Eurotherm manual and creating the structural links.

To ensure that comparisons of the time taken to produce the structural links and dissect the manual automatically and manually are valid, the following areas are not included:

The decision on how to dissect the manual into meaningful sections and deciding on the titles to be given to each section. As there is no difference between manually and automatically carrying out the task there is no time saving involved.
The checking and correcting of the results at the end. Although a comparison of the number of errors is worth noting, the manual linking becomes tedious after a while and hence more errors are produced by the person manually linking the documents.
The manual production of the appropriate linkbase and logical types in Microcosm, as the time taken is the same whether the application is linked manually or automatically.
The writing of the macro to dissect and structurally link the manual. The rationale here is that the time taken to write and test the macros only becomes an issue where the documents are small in size and few in number.

Hence only the following areas are included in the evaluation:

The time taken to actually cut, paste and save the new section into RTF format (with the appropriate title).
The time taken to produce the structural links.
The time taken to produce the guided tour. This ensured that the original structure was maintained.

The macros dissected a 320 page manual, labelled each section and linked the sections together in less than three minutes. Even after considerable experience, it still took the author just over four hours to manually link a smaller 143 page training manual and produce the guided tour. The advantage of this method over those reported elsewhere [e.g. Carr 94] is that the user/author is not required to do anything different from their normal task. All that is required is that they have used the functionality of the word-processor. In addition, the macro requires only the skills of a competent programmer (for example, an engineer who can write or record a basic macro) and not those of a specialist.
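
To put the two timings on a common footing, the short calculation below converts them to minutes per page. It is only a rough sketch: "less than three minutes" is taken as 3 minutes (an upper bound), "just over four hours" as 240 minutes (a lower bound), and the manual effort is assumed to scale linearly with page count, which the report does not claim.

```python
AUTO_PAGES, AUTO_MINUTES = 320, 3.0        # upper bound on the automatic time
MANUAL_PAGES, MANUAL_MINUTES = 143, 240.0  # lower bound on the manual time

manual_rate = MANUAL_MINUTES / MANUAL_PAGES  # minutes per page, manual
auto_rate = AUTO_MINUTES / AUTO_PAGES        # minutes per page, automatic

# Assumed linear extrapolation of the manual rate to the 320 page manual.
manual_for_320 = manual_rate * AUTO_PAGES
ratio = manual_for_320 / AUTO_MINUTES

print(f"manual rate:    {manual_rate:.2f} min/page")
print(f"automatic rate: {auto_rate:.4f} min/page")
print(f"manual estimate for 320 pages: {manual_for_320 / 60:.1f} hours")
print(f"manual/automatic ratio: {ratio:.0f}")
```

Under these assumptions, manually linking the 320 page manual would take roughly nine hours, a ratio of well over one hundred to one in favour of the macros.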

Figure 2-3 The flow diagram for automatically dissecting the existing electronic Eurotherm Manuals into useful size nodes and creating structural links.

2.5 Information Reuse.

Information in an industrial environment is rarely used in one place only. For example, a process line will often consist of several major pieces of equipment that are integrated together. These major pieces of equipment will often consist of sub-systems at the same revision as other sub-systems used elsewhere on the process line or within the factory. Similarly, in other parts of the organisation, procedures, reports, etc. are used by more than one department. Information therefore needs to be reused.

Garzotto et al [Garzotto 96] have set out a classification for different types of node reuse. While their work is mainly on the classification of reuse, they do give a number of areas that need to be considered when reusing information:

Reuse is not free, that is it requires careful implementation and design.
To avoid ambiguities a schema is required at the start of the design process to describe the choice of reuse.
To avoid confusing the user with links like 'Next Page', care needs to be taken, as 'Next Page' may have different meanings depending on the context it is used in.
As a general rule unnecessary changes of presentation should be avoided.
When a node is developed, reference to elements outside the node itself should be avoided, as these are liable to create problems when the node is placed into a different context.

Hardman also found that reusing or duplicating objects such as 'show me where' buttons also caused problems [Hardman 89].

2.5.1 Modular-Hypermedia Applications and Information Reuse

The information relating to the equipment on a process line has the advantage that it can essentially be associated with a physical object and not just a concept. Hence, reuse is more straightforward in an industrial hypermedia environment for objects at the same revision level.

A process line will very often consist of several major pieces of equipment that are integrated together, and this lends itself to zonal sectioning. As shown in Figure 2-4, the different components of a complete caterpuller are manufactured by different vendors.

A sheathing line will generally contain two caterpullers, each with a drive unit. The drive unit consists of a motor, gearbox, drive cabinet and drive controller. In addition, the drive controller is also used on other equipment within the line and on other process lines within the factory. The information for these individual components can be viewed as objects of information (see Figure 2-4).

Figure 2-4 The diagram shows how the MHA of a drive can be assembled using pre-authored components from different manufacturers.

The information for each piece of equipment was authored separately in a hypermedia application consisting of all the necessary:

Microcosm specific: Filters, Linkbases, and Computed Links.
Pirelli specific: Engineering Data, Documents, Pictures, Videos, Procedures, Safety instructions, etc.

As these applications are small compared to the eventual size of an industrial application, the author has called them Modular-hypermedia applications (MHAs). MHAs save time, and hence cost, by not having to re-author the same information for units at the same revision level and modification state. MHAs also enable portability and modularisation of components and subsystems on the process line. Hence, an MHA may itself contain several other separately authored MHAs.

A similar process is then used to include all the other information within the factory as a whole, building the information space out of MHAs. These MHAs can be created ‘off-line’, building a library of pre-authored MHAs. It is envisaged that the information relating to equipment supplied by OEMs will be in the form of electronic hypermedia manuals, pre-linked and ready to integrate into the hypermedia application.

If a slice through the information space is taken, the structure will appear hierarchical. However, in practice the growth of the information space will be dendritic in nature. By creating the larger applications out of MHAs, the author is able to use current tools and techniques normally limited to smaller hypermedia applications, for example link maps. Also, each smaller application is more manageable, which reduces the cognitive burden on the author.

The hierarchical structure shown in Figure 2-5 can also be represented in Microcosm using logical types. These are similar to folders in File Manager or Windows Explorer. However, Microcosm does not store the files in these logical types but only a pointer to where they are in the file management system. Hence, the same file can be referred to many times in different logical types, and yet only one copy will exist.

Figure 2-5 Diagram showing an example of how different linkbases are used to aid maintainability, reusability and authorability of an MHA

Each concept or pedagogical structure is represented by a different group of links, each group being held in a separate linkbase. In addition, links referring to information outside (external to) the MHA are kept separate from those linking information only within (internal to) the MHA. This separation of the links also increases the maintainability of the MHA. In addition, the separation of internal and external links enables the danger pointed out by Garzotto et al [Garzotto 96] when reusing information to be easily identified.
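
The nesting of MHAs and the separation of internal and external links can be pictured with a small data structure. The sketch below is illustrative only: the class `MHA`, its fields and the file names are hypothetical and do not correspond to Microcosm's own storage format.

```python
from dataclasses import dataclass, field


@dataclass
class MHA:
    name: str
    revision: str
    internal_links: list = field(default_factory=list)  # (source, dest) inside this MHA
    external_links: list = field(default_factory=list)  # (source, dest) leaving this MHA
    components: list = field(default_factory=list)      # separately authored, nested MHAs

    def all_components(self):
        """Yield this MHA and every nested MHA, depth first."""
        yield self
        for component in self.components:
            yield from component.all_components()

    def links_to_review(self):
        """External links are the ones to check when an MHA is placed in a new context."""
        return [(m.name, link) for m in self.all_components() for link in m.external_links]


motor = MHA("motor", "A")
gearbox = MHA("gearbox", "A")
controller = MHA(
    "drive_controller",
    "B",
    internal_links=[("controller_contents.rtf", "controller_faults.rtf")],
    external_links=[("controller_faults.rtf", "site_safety_procedure.rtf")],
)

# The same pre-authored controller MHA is reused in both drive units.
drive_1 = MHA("drive_unit_1", "A", components=[motor, gearbox, controller])
drive_2 = MHA("drive_unit_2", "A", components=[motor, gearbox, controller])

print(drive_1.links_to_review())
```

Because only the external links can point at something that is absent in the new context, keeping them in their own linkbase reduces the check on reuse to a single list.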

2.6 Existing Information.

As the volume of electronic information increases within an organisation, there is a need to share information between disparate computer systems. Barron presents a wide-ranging survey of the practical issues involved in producing portable documents, including multimedia and hypermedia documents [Barron 97]. However, he did not mention DXF documents. He concludes that "we should accept that for the foreseeable future there are going to be multiple approaches to portability, and concentrate our efforts on achieving a "good enough" solution".

2.6.1 Compatibility between Different Electronic Asset Systems

To enable compatibility between different word processing applications, use was made of an intermediary format, Rich Text Format (RTF). In addition, Word 97 is able to import RTF for further processing using macros. However, there can still be problems with using intermediary formats. For example, tables do not always translate correctly, especially where text has been arranged using tabs, as the tab settings often differ between packages. The intermediary format used for technical drawings was the Drawing Exchange Format (DXF). These formats were chosen for the following reasons:

The files are then readable in a read-only viewer, and therefore cannot be changed without the correct permissions and loading the original application package.
These are well-used standards and most modern systems will have the facility to change the documents into these formats.
There are fully aware viewers for these formats in Microcosm.

When authoring technical drawings in AutoCAD, use is made of the ability to define blocks, on which links can be made to the name or attributes of the block. Therefore, short yet unique descriptive names were used for each block. As the link is to the name (or attribute) of a block and not on a physical location, when the drawing is updated, and if the block is moved within the drawing, the links referring to that block do not require re-linking. Unfortunately, it is not so easy to solve the problem of re-linking with digitised raster images, as a component within the image is represented by a collection of pixels bound by a set of rectangular co-ordinates. If a drawing is updated, and an item moved, the link will need to be updated.
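
The difference between anchoring a link on a block name and anchoring it on a rectangular region can be shown with a toy model. The sketch below does not use the AutoCAD or Microcosm interfaces. The `Drawing` class, the block name and the coordinates are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Drawing:
    blocks: dict  # block name -> (x, y) position in the drawing


def resolve_by_name(drawing, block_name):
    """A name-anchored link finds the block wherever it currently sits."""
    return drawing.blocks.get(block_name)


def resolve_by_region(drawing, region):
    """A region-anchored link only finds blocks still inside the rectangle."""
    x0, y0, x1, y1 = region
    return [
        name
        for name, (x, y) in drawing.blocks.items()
        if x0 <= x <= x1 and y0 <= y <= y1
    ]


revision_a = Drawing({"MOTOR_M1": (10, 10)})
revision_b = Drawing({"MOTOR_M1": (80, 40)})  # block moved in the updated drawing

link_region = (0, 0, 20, 20)  # rectangle drawn around the motor in revision A

print(resolve_by_name(revision_a, "MOTOR_M1"), resolve_by_name(revision_b, "MOTOR_M1"))
print(resolve_by_region(revision_a, link_region), resolve_by_region(revision_b, link_region))
```

The name-anchored link resolves in both revisions, whereas the region-anchored link, which is all that is available for a raster image, no longer finds the component once it has moved.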

2.6.2 Dealing with Asset Paper Information.

Authoring hypermedia applications for an industrial environment is a relatively straightforward task when using the above strategies, provided all the information is in electronic format. However, in many industrial environments documentation, especially manuals from Original Equipment Manufacturers (OEMs), is still in paper format. Hence it was necessary, for viewing and integration into the system, to convert the information held on paper into an electronic format. Once the process of scanning and converting a text document into an electronic format is completed, the document is easier to update.

The scanning process can be time consuming and hence costly. In this research, the length of time required depended largely on the quality of the paper documents and the quality of the software and hardware used. The largest portion of the time was spent on checking and correcting the information after conversion. Hence it is important to obtain the 'master copy' of a document and not a photocopy, as this will affect the quality and hence the time and cost of conversion.

Where the documentation was of poor quality (legibility) it was easier to enter the information manually, i.e. to retype the text documents. In the case of poor quality drawings it is often more cost effective to redraw or trace the drawing using computer aided drafting, rather than scan the documents to obtain the electronic version. Whether by raster-vector conversion or manual entry (retyping, redrawing), when a document is converted to electronic format it in effect becomes an updated document and therefore should be subject to the same quality control procedures as any other updated document.

Where forms are to be used in the electronic system, it may not be sufficient just to scan the details into electronic format, as data entry is required. The method used to produce the form will depend on its use. If the form is for information only, a simple template can be written for the form; the users then call up the template, fill in the details and e-mail the form. However, where forms are used for data entry and the data is used in subsequent operations, a database or spreadsheet is more appropriate. Other programs can then access the information for further manipulation, e.g. producing charts, reports, etc. The author has demonstrated this using a commercially available package (Microsoft Access): forms have been created and the data entered is stored in a database, for example the weekly current checks carried out by the maintainers.

2.7 Link Integrity.

Many people have felt the frustration at following a link in a World-Wide-Web document that ends in an error message as the destination no longer exists. This may happen for a variety of reasons the most common being that the document has been deleted or moved, and results in what is called a dangling link. This is very common in distributed systems and is overcome by insisting that the server system and human administration inform any link service that a document has moved or deleted [Risk 97]. However, links can still be become invalid when a document is edited.

Davis has shown how some of the link integrity issues may be solved in Microcosm [Davis 95]. Davis suggests that two new tags be added to the description of the link, one for the source document and one for the destination document. Each tag gives the operating system's date and time stamp at the moment the link was created. At the end of the filter chain [Heath 94] resides a 'checker' that tests whether the links for the document are older than the document date. If this is the case, that is, the document has changed, the system will try to ensure that the links are still valid and inform the user that the document has changed and that the links may not be valid. Davis refers to the problem of invalid links as the editing problem. He suggests that the editing problem is more severe than the problem of a dangling link since, given a closed file or virtual file system, it is possible to prevent dangling links. He also suggests that the simplest way to deal with concurrency problems is not to allow them to happen, that is, documents and linkbases are read-only. He then describes a crude locking and notification procedure to ensure that only one author edits a document at a time. Davis concludes by saying "it is necessary to impose some conditions on the use of the system if [link] integrity is to be assured."
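
The time-stamp check described above can be illustrated with a short sketch. This is not Microcosm code: the `Link` record, its field names and the `find_stale_links` helper are assumptions made purely for illustration. Each link carries the two time stamps recorded when it was created, and a link is flagged when either document has been modified since.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    """Hypothetical link record carrying the two time-stamp tags."""
    source_doc: str
    dest_doc: str
    source_stamp: float  # stamp recorded in the source tag at link creation
    dest_stamp: float    # stamp recorded in the destination tag at link creation


def find_stale_links(links, modified_times):
    """Return the links that are older than one of their documents.

    modified_times maps a document name to its current modification time.
    A document missing from the mapping is treated as changed, since the
    link can no longer be checked against it.
    """
    stale = []
    for link in links:
        src_now = modified_times.get(link.source_doc)
        dst_now = modified_times.get(link.dest_doc)
        if src_now is None or dst_now is None:
            stale.append(link)
        elif src_now > link.source_stamp or dst_now > link.dest_stamp:
            stale.append(link)
    return stale


if __name__ == "__main__":
    links = [
        Link("manual.txt", "drive.txt", 100.0, 100.0),
        Link("manual.txt", "wiring.txt", 100.0, 100.0),
    ]
    # wiring.txt was edited at time 250, after the link was created.
    current = {"manual.txt": 100.0, "drive.txt": 100.0, "wiring.txt": 250.0}
    for link in find_stale_links(links, current):
        print("links may be invalid:", link.source_doc, "->", link.dest_doc)
```

A flagged link is only a candidate for review: as the text notes, the system can then try to revalidate it or warn the user.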

2.7.1 Updating the information.

As stated earlier, the maintainability of the system is improved, and revision control of the links made easier, by using the hierarchical structure of the information space and by authoring with MHAs. That is, when an MHA is amended, only the linkbases associated with that MHA need to be ‘frozen’ and stored as part of the change process, in the same way that a copy of the document prior to the amendment is kept as part of the document change procedure. Moreover, only the linkbases that have been affected by the change, and not all of the linkbases associated with the MHA, need to be ‘frozen’, reducing even further the amount of information stored after an amendment to an MHA. The method used is to carry out the changes off-line and then re-import the changed MHA into the compiled system. If the change is not a global change, the author is responsible for identifying which occurrences of the information the change applies to. Alternatively, the author can mark which occurrences the change does not apply to, if this is easier.

The issue of concurrency, that is, more than one author needing to update a node, document or MHA at the same time, is handled in the same way as in a paper-based system: no two people can edit the same document at the same time. This is a simple matter of configuration control. The document or MHA is 'checked out' as being authored when the change-note is approved.
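
The check-out rule can be sketched as a small register. The class and method names below are hypothetical and do not correspond to any Microcosm or document-control API; the sketch only shows the one-author-at-a-time constraint.

```python
class CheckOutRegister:
    """Hypothetical register allowing one author per document or MHA."""

    def __init__(self):
        self._holders = {}  # item name -> author currently holding it

    def check_out(self, item, author):
        """Record that `author` is editing `item`; refuse if already held."""
        holder = self._holders.get(item)
        if holder is not None and holder != author:
            raise RuntimeError(f"{item} is already checked out by {holder}")
        self._holders[item] = author

    def check_in(self, item, author):
        """Release `item`; only the current holder may check it in."""
        if self._holders.get(item) != author:
            raise RuntimeError(f"{item} is not checked out by {author}")
        del self._holders[item]

    def holder(self, item):
        """Return the author holding `item`, or None if it is free."""
        return self._holders.get(item)
```

In use, a second author who tries to check out an item that is already held receives an error rather than a copy to edit, which mirrors the paper-based procedure.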

Configuration control can also solve the problem of updating a shared document. For example, a modification may take place on a drive unit of which there are a number of instances, but the modification affects only some of them. A new MHA is created for the modified drives; this references the updated documents plus the unaffected documents, which are still shared with the unmodified drives. A new MHA is created because the MHA is the smallest granularity that makes a physical object or concept unique. Note that an MHA is not necessarily large; it may consist of only a few nodes.
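
A minimal sketch of the drive-unit example, treating an MHA simply as a set of document names. This representation, the function name and the document names are assumptions for illustration only.

```python
def derive_modified_mha(original_docs, updated_docs):
    """Build the document set for the MHA of a modified unit.

    original_docs: set of document names referenced by the existing MHA.
    updated_docs: mapping from an affected document name to the name of
        its updated replacement.
    Unaffected documents are kept as shared references; affected ones are
    swapped for their updated versions.
    """
    unknown = set(updated_docs) - set(original_docs)
    if unknown:
        raise ValueError(f"not in the original MHA: {sorted(unknown)}")
    return {updated_docs.get(doc, doc) for doc in original_docs}


if __name__ == "__main__":
    original = {"parts-list", "wiring-diagram", "setup-procedure"}
    modified = derive_modified_mha(
        original, {"wiring-diagram": "wiring-diagram-mod1"}
    )
    print("shared with unmodified drives:", sorted(original & modified))
    print("unique to modified drives:", sorted(modified - original))
```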

3. Effort Estimation.

The effort required to author a hypermedia application directly affects the cost of the application. However, the authors are unaware of any effort models for industrial hypermedia.

3.1 Effort Estimation for the industrial application.

In order to estimate the effort, and then the cost, of authoring an industrial hypermedia application using the authoring methodology described in this report, a model of the process is needed. The basic model is based on the process model in British Standard (BS) 6143 Part 1 [BSI 92]. The basic model is shown in Figure 3-1, while the complete process model is shown in Figure 3-2.

Figure 3-1 Basic Process Model

The model can be described as:

Inputs: Materials or data that are transformed by the process to create the output.
Outputs: The results of the transformation of inputs. The output includes material or data that conform to the requirements, waste and process information.
Controls: Inputs that define, regulate and/or influence the process. This embraces procedures, methods, plans, standards, policies, legislation and strategies.
Resources: Contributing factors, which are not transformed to become outputs. That is people, equipment, materials, accommodation and environment requirements.

Figure 3-2 Authoring Process Model

The amount of effort will depend on how thoroughly the information audit was carried out. This represents a trade-off, since more effort spent on the audit reduces authoring to a mechanised process. That is, the operator only needs to follow the procedure and not decide how or where to save information, as the audit has identified all the information required and the file structures required to store it.

In addition, the effort will be governed by the choices made by the team responsible for the authoring process. Some of the possible choices are explored below, and equations for the effort required to author the application are given. The complete authoring effort is the sum of the individual efforts detailed below:

Dealing with asset paper documents (E_P).
Dealing with existing electronic documents (E_E).
Creating new electronic documents (E_N).
Creating the structural links (E_SL).
Creating cognitive and pedagogical links (E_CP).
Maintaining records, reports and statistics (E_M).

3.2 Learning Curve Effect

When people are first introduced to a task they frequently take longer to perform it than when they have repeated it a number of times; this is known as the learning-curve effect [Drury 96]. The learning-curve effect can only be applied to direct labour or to variable overheads that are directly affected by labour effort. The learning curve is expressed as:
$$T = aX^{b} \tag{3-1}$$

Where T is the cumulative average of the time required to carry out the task X times, a is the time required to carry out the task the first time, X is the number of times the task is to be carried out.

The exponent b is defined as:

$$b = \frac{\log(\text{learning rate})}{\log 2} \tag{3-2}$$

The learning curve is based on real-world observations and hence the relationships described are empirical [Arnold 90]. The learning rate can vary between 65% and 90% in the early stages of production, and levels out to reach a steady state in which no further reduction in the time to perform the task can be achieved by learning. To use the learning rate one can use the formula or standard tables. The tables give the unit time and total time for different learning curves against the number of times the task is performed.
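
As an illustration of how such tables are built, the following sketch tabulates the cumulative average time and the total time for a given learning rate, using the cumulative-average form in which the average time for X repetitions is a multiplied by X raised to the power b, with b = log(rate)/log 2. The function names and the 10-hour, 80% example are illustrative assumptions, not values from this report.

```python
import math


def learning_exponent(rate):
    """Exponent b for a learning rate given as a fraction, e.g. 0.8 for 80%."""
    if not 0.0 < rate <= 1.0:
        raise ValueError("rate must be in the interval (0, 1]")
    return math.log(rate) / math.log(2)


def cumulative_average_time(first_time, repetitions, rate):
    """Cumulative average time per repetition, T = a * X**b."""
    return first_time * repetitions ** learning_exponent(rate)


def learning_table(first_time, rate, counts):
    """Rows of (X, cumulative average time, total time) for each X in counts."""
    rows = []
    for x in counts:
        average = cumulative_average_time(first_time, x, rate)
        rows.append((x, average, x * average))
    return rows


if __name__ == "__main__":
    # Illustrative values: first attempt takes 10 hours, 80% learning rate.
    for x, average, total in learning_table(10.0, 0.8, [1, 2, 4, 8]):
        print(f"X={x:2d}  average={average:6.2f} h  total={total:6.2f} h")
```

With an 80% rate, each doubling of the number of repetitions multiplies the cumulative average time by 0.8, so the averages run 10, 8, 6.4 and 5.12 hours.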

3.3 Asset Paper information.

The main effort in this part of the authoring process is changing the paper information into electronic information. This task can be sub-divided into four main activities:

Gathering the information to be converted.
Converting the information from paper to a standard raster format and saving the information with appropriate file names and file hierarchy.
Cleaning-up the information after conversion before further processing.
Processing the information into a vector format.

Not all documents require further processing. That is, some documents only need to be in the raster format supplied by the conversion process, while others will need to be converted into a vector format, for example ASCII text using OCR techniques. Where paper documents are retyped, drawings electronically redrafted, or forms converted to electronic data entry, these can be classed as new electronic documents and hence the effort is calculated separately.

The conversion process can be sub-contracted to any of a number of companies that offer a bureau service. These companies will also do some of the post-conversion tidying-up, so the effort for correcting the converted documents is reduced. However, the cost is not zero and will be taken into account in the cost model. Where a significant number of documents are converted into electronic information, sampling can be introduced into the checking process. The sampling should be carried out in accordance with a recognised statistical process, for example BS 6001 [BSI 96].

Therefore the effort required to deal with the paper information (E_P) is the sum of the efforts for gathering, converting, correcting and processing the paper information. When estimating the time to carry out each process, it is assumed that time is included for verifying the work.

The individual processes that make up the conversion of paper-held information to electronic information depend on:

The number of single sheets of paper involved (S_i), which may be different for each process.
The cumulative average time taken to perform the tasks with learning curve (T_i).

$$E_P = \sum_{i} S_i\,T_i \tag{3-3}$$

3.4 Asset Electronic Information

The existing electronic information needs to have a common look and feel. In addition, the majority of existing electronic information has been written for the paper paradigm; that is, the documents need to be printed on paper for the reader to gain the full benefit of the layout. Hence essential documents need to conform to the templates set by the designers. This will not only ensure that they are readable in electronic format but also aid link generation and screen management. Another part of the effort is dissecting long documents into information nodes and assigning a meaningful label to each node. In addition, the effort required to ensure that the converted paper documents conform to the required standard needs to be included within this calculation. The two basic processes for asset electronic information are conversion and dissection. This gives the equation for the effort required to change the existing electronic information (E_E) as:

$$E_E = \sum_{i \in \{\text{conversion},\ \text{dissection}\}} D_{a,i}\,T_i \tag{3-4}$$

where D_a is the number of asset electronic documents, which may not be the same for conversion and dissection.

It is assumed that the lengths of the documents are normally distributed, and hence the average size can be used to calculate the effort required. If this is not the case, then a separate time for the average length of document in each class needs to be calculated.

3.5 New Information.

A significant amount of information within the working environment is held as corporate knowledge, that is, information that is known to a person or group of people and is vital for the smooth running of the organisation, but is not actually recorded. This information has first to be collected and then entered into the system. In addition, information that is useful to have but was not previously available can now be added; obviously it is up to the management team to ensure that this does not get out of hand. The effort also includes paper documents that are retyped, drawings that are electronically redrafted, and forms that are converted to electronic data entry. The effort will therefore vary with the number of types of information and with the number of pieces of information of each type. N is the number of new nodes that need to be created.

The effort required to produce the new information (E_N) can be calculated as:

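$$E_N = \sum_{i} N_i\,T_i \tag{3-5}$$

where the sum is taken over the types of new information, N_i is the number of new nodes of type i, and T_i is the cumulative average time, with learning curve, to create a node of that type.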

3.6 Structural Linking.

Once the information is in the correct format, linking may begin. The first and easiest links to create are the structural links, as this is an administrative task. It should be an automated process, especially where the documents conform to the templates. Structural linking is usually carried out on a document set or a group of nodes, hence the effort required is on average the same irrespective of the size of the document. The symbol D_s is therefore used to represent the number of documents. In addition, there is the effort required to supervise and organise the process, and there will be a need to carry out some manual structural linking. Manual linking is required to link menus and other small groups of documents that do not conform to any standard template, and depends on the number of links (L) required. Effort is also required to make a number of link databases (LB) for both the manual and automatic linking processes. All documents need to be registered with the hypermedia document management system and grouped into a number of MHAs (M). In the case of Microcosm, this involves creating appropriate logical types and setting the path variables.

3-6

3.7 Cognitive and Pedagogical Linking.

The effort here involves experts from the relevant disciplines and departments making the associations that are not obvious. The effort will depend on the number of links (L) required to be made and the number of linkbases (LB_cp) required. Included in this effort is the time taken to produce the link clusters (LC) [Crowder 98].

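$$E_{CP} = L\,T_{L} + LB_{cp}\,T_{LB} + LC\,T_{LC} \tag{3-7}$$

where each T is the cumulative average time, with learning curve, for the corresponding task.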

3.8 Records, Reports and Statistics.

In an industrial environment it is essential that the documentation audit trail is maintained. Hence all documents that have been changed or created need to be recorded in the company's document control system; N_c is the number of nodes changed. Similarly, if the effort is to be managed effectively and kept within budget, it will need to be monitored and actual times compared with forecast times. Hence the need to record statistical information; Stat is the number of statistical records kept. In addition, it is essential for auditing purposes to report on the progress of the process; R is the number of reports that will need to be written. This effort therefore needs to be included in the model.

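$$E_M = N_c\,T_{Nc} + Stat\,T_{Stat} + R\,T_{R} \tag{3-8}$$

where each T is the cumulative average time, with learning curve, for the corresponding task.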

4. Authoring Cost Model for Industrial Hypermedia

The authors are unaware of any other cost models for industrial hypermedia. However, the literature does contain estimates of effort and cost for the design of multimedia applications [Lechtenberg 98]. The obvious difference is that this report focuses on the production of the hypermedia application, with particular emphasis on the linking aspects of production. In addition, multimedia applications tend to be smaller, and hence their cost model includes the design process.

The method of allocating costs will vary between organisations. The cost method used in this research is generally termed the engineering cost method [Arnold 90], and is used where a product or process is not part of the company's normal business activity. The engineer estimates the total time, the labour required, the materials used and the capital equipment needed to perform the activity. The difficulty comes in estimating the indirect costs of such items as insurance, maintenance and power. However, the engineering cost method leads to a very accurate prediction of future costs. In addition, the cost of the Information Audit should be included. The rationale is that the effort required to carry out the audit is an integral part of the authoring methodology and has a direct effect on the efficiency of the authoring process: a poor audit will result in greater effort in the authoring process.

4.1 Overheads.

Overheads are those costs that cannot be directly assigned to a cost object such as a product, process or customer group [Drury 96]. Included in this cost is the cost of services, such as lighting, heating, building maintenance, rent for floor space and the appropriate proportion of the business rate. The traditional method of allocating these costs is to divide the overhead cost among the various cost centres of the organisation. Each cost centre then further apportions the cost among its activities. This method works well for cost accounting, especially where the overheads are small in relation to the direct cost. Where the overheads are larger, however, a small but increasing number of engineering companies are changing over to activity-based costing to allocate the overheads, especially when costing is to be used for management decisions [Moore 98]. Another method commonly used in the estimation of costs is to allocate a figure for overhead costs as a function of the cumulative labour costs, for example an additional forty percent. What is clear is that different companies will use different methods of allocating these costs. However, the final result is still the same: a fixed figure that represents the overhead costs (C_O).

4.2 Cost of Employment

There is a cost of employment other than the salary paid to the employees. These costs include the employer's National Insurance contribution, pension fund contributions, and health and other insurances. It is preferable to calculate an average hourly rate for these costs and add it to the hourly rate of the worker's salary to produce a cost of employment [Drury 96]. This needs to be calculated for each of the salary scales or bands used to pay employees. In the first instance it may be sufficient to assume that the people employed in the task are all paid the same. However, this is rarely the case, due to factors such as full- or part-time employment, length of service, seniority, etc. Hence the extent to which these factors are taken into account will depend on the accuracy of the cost estimate required.

4.3 The Equipment Cost

In cost accounting, depreciation is used to spread the cost of equipment over a number of years [Reynolds 92]. The 'life term' of the equipment is chosen based on the nature of the equipment; for computer equipment this is generally relatively short, that is, less than five years. The depreciation can be a constant amount that allows for scrap value at the end, or a constant fraction of the residual amount, which produces larger depreciation values in the early stages. However, for cost forecasting it makes more sense to include the full cost of the equipment (C_E), including the cost of maintenance agreements, shipping, insurance, etc. In addition, the actual cost of purchasing equipment can be spread by the use of lease-purchase agreements, in which the company leases the equipment for a set period. If the company keeps the equipment to the end of the agreement it will own the equipment; before then it can return the equipment as in any other leasing agreement.
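
The two depreciation schemes mentioned above can be compared with a short sketch. The function names are illustrative, and the scrap value, fraction and life term in the example are assumed figures chosen only to show the shape of each schedule.

```python
def straight_line(cost, scrap_value, years):
    """Constant yearly charge that leaves `scrap_value` at the end."""
    charge = (cost - scrap_value) / years
    return [charge] * years


def reducing_balance(cost, fraction, years):
    """Yearly charge equal to a constant fraction of the residual value."""
    charges = []
    residual = cost
    for _ in range(years):
        charge = residual * fraction
        charges.append(charge)
        residual -= charge
    return charges


if __name__ == "__main__":
    # Assumed example: equipment costing 5000 written down over four years.
    print("constant amount:  ", straight_line(5000.0, 500.0, 4))
    print("constant fraction:", reducing_balance(5000.0, 0.4, 4))
```

The constant-fraction schedule charges 2000 in the first year and 432 in the fourth, against a constant 1125 per year, which is the front-loading the text refers to.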

4.4 Additional Process Cost

The additional process costs (C_P) are variable overhead costs, in that they are overheads unique to the process itself. They include such costs as the cost of the information audit, managing and supervising the process, the materials and power consumed in the process, the cost of any subcontracted work, etc.

4.5 Total Authoring Costs.

The total cost of the authoring process is:

4-1

An example costing of the effort model is given in Appendix A. Using a different number of people will not affect equation 4-1. That is, the effort is divided by the number of people used while the cost of employment is multiplied by the number of people used, so the two cancel each other out. However, each person carrying out the task will be subject to the learning-curve effect. For example, in scanning S sheets, applying the learning curve directly assumes that the same person carries out the whole task and hence gets quicker. The more people assigned to the task, the fewer sheets each actually scans, and hence the less their time per sheet improves. Common sense is required here, as it is not possible to allocate more people to a task than there is equipment to complete the task. The simplest method is to divide the total number of paper sheets to be scanned by the number of people directly involved in the process. Hence equation 4-1 can be written as:

4-2

where P is the number of people used in each separate part of the process. This has the effect of slightly increasing the effort time, allowing for each individual's learning. Similarly, there is a natural mechanical limit to the learning rate. This mechanical limit is a result of:

Ergonomics, that is, the number of movements the human must make to perform the operations limits the speed of operation.
Mechanical limits of the machine, for example the scanning process will be limited by the speed of the scanner to scan a document and by the speed of the processor/program to convert from a raster to a vector format.

Therefore the learning curve must take this into account. Hence

4-3

where C_ML is a constant representing the Mechanical Limit of the process.
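
A minimal sketch of these two adjustments follows, under stated assumptions: the work is shared equally among the people, each person follows the cumulative-average learning curve independently, and C_ML is treated as a lower bound on the cumulative average time per unit. The report's own equations may treat the limit differently, and the function and parameter names are illustrative.

```python
import math


def effort_with_team(first_time, units, rate, people=1, mechanical_limit=0.0):
    """Total hours for `units` repetitions shared equally among `people`.

    Each person performs units / people repetitions on their own learning
    curve. The cumulative average time per unit is not allowed to fall
    below `mechanical_limit` (an assumed reading of C_ML as a floor).
    """
    if people < 1 or units <= 0:
        raise ValueError("people must be >= 1 and units must be > 0")
    exponent = math.log(rate) / math.log(2)
    per_person = units / people
    average_time = first_time * per_person ** exponent
    average_time = max(average_time, mechanical_limit)
    return units * average_time


if __name__ == "__main__":
    # 320 sheets, 16 minutes for the first sheet, 95% learning rate.
    for people in (1, 2, 4):
        hours = effort_with_team(16 / 60, 320, 0.95, people=people)
        print(f"{people} people: {hours:.2f} hours in total")
```

Under these assumptions, halving each person's share of the sheets raises the total effort by a factor of one over the learning rate, which is the slight increase described in the text.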

5. Summary

This report has presented the general methods of authoring hypermedia applications found in the literature. It has also presented some of the problems faced by authors of hypermedia systems. Industrial hypermedia applications are typically classed as large-scale hypermedia applications due to their size and complexity. The areas that require special consideration are the scalability and reuse of information, the maintainability of the hypermedia application, and the cognitive burden on the author.

The largest cost in producing a hypermedia application is the authoring cost. The industrial authoring methodology given in this report has demonstrated that, by using modern word-processing packages, the majority of the structural links can be produced automatically. Using the hierarchical structure of most technical information allowed the concept of Mini-Hypermedia Applications (MHAs), or modular applications, to be developed. Using MHAs allows a complex application to become more manageable, enables the system to be scaled, encourages reuse of information and reduces the cognitive burden on the author.

Based on the authoring methodology presented in this report, a model that allows the effort, or time, needed to produce the hypermedia application to be estimated has been presented. The actual figures and paths through the model will depend largely on the decisions made by the team responsible for the authoring process. The total effort is the sum of the individual efforts given in equations 3-3 to 3-8. Once the effort has been calculated, the cost of the process can be ascertained. The factors that affect the cost of the hypermedia application are mainly the factory overhead costs, the cost of employment, the equipment costs and the additional overhead costs applicable only to the process.

Acknowledgements.

The authors acknowledge the EPSRC (Engineering and Physical Science Research Council) for funding the work under grant number GR/L/10482.

References.

[Allen 96]	Allen J. Automatic Hypertext Link Typing. The Seventh ACM Conference on Hypertext, HYPERTEXT ’96 Washington DC March 16-20 1996. pp 42-52
[Arnold 90]	Arnold J, Hope T. Accounting for Management Decision 2nd edition, Prentice Hall 90
[Barron 97]	Barron DW. Portable documents: problems and (partial) solutions. Electronic Publishing. Origination, Dissemination and Design. Volume 8 Issue 4 December 1997 pp 343-367.
[Buford 96]	Buford JF. Evaluating HyTime: The Seventh ACM Conference on Hypertext, HYPERTEXT ’96 Washington DC March 16-20 1996. pp 105-115
[BSI 92]	BSI. BS 6143 Part 1 1992 Guide to the economics of quality Part 1 Process Cost Model. British Standards Institute. http://www.bsi.org.uk
[BSI 96]	BSI BS 6001 attribute sampling system British Standards Institute. http://www.bsi.org.uk
[BSI 97]	BSI. BS EN ISO 9000-3:1997 - Quality management and quality assurance standards. Guidelines for the application of ISO 9001:1994 to the development, supply, installation and maintenance of computer software. British Standards Institute http://www.bsi.org.uk.
[Carr 94]	Carr L, Hall W, Davis H, De Roure D. The microcosm Link Server and application to the World Wide Web. Presented at the 1994 WWW Conference, Geneva. Available at http://wwwcosm.ecs.soton.ac.uk/publications
[Cleary 96]	Cleary C., Bareiss R. Practical Methods for Automatically Generating Typed Links. The Seventh ACM Conference on Hypertext, HYPERTEXT ’96 Washington DC March 16-20 1996. pp 31-41
[Crowder 96]	Crowder RM, Hall W, Heath I, Wills GB. Requirements Specification: FIRM: Factory Information Resource Management, EPSRC Grant : GR/L/10482, 21^st June 1996, University of Southampton.
[Crowder 97]	Crowder RM, Wills GB, Heath I, Hall W. The Application Of Hypermedia In The Factory Information Environment. IEE 5^th International Conference on FACTORY 2000, Cambridge, UK 2-4 April 1997.
[Crowder 98]	Crowder RM, Wills GB, Heath I, Hall W. Hypermedia Information Management: A New Paradigm. 3rd International Conference on Managing Innovation in Manufacture, University of Nottingham, 6-8 July 1998, pages 329-334. 1998
[Davis 95]	Davis HC. Data integrity problems in an Open Hypermedia Link . PhD Thesis University of Southampton November 1995.
[Drury 96]	Drury C. Management and Cost Accounting 4th edition. Thomson 1996
[Eurotherm 96]	590 Digital Product Manual, Firmware Vision 4. Eurotherm Drives Limited 1996.
[Garzotto 96]	Garzotto F., Luca Mainetti L., Paolini P. Information Reuse in Hypermedia Applications. The Seventh ACM Conference on Hypertext, HYPERTEXT ’96 Washington DC March 16-20 1996. pp 93-104
[Ginige 95]	Ginige A, Lowe DB, Robertson J. Hypermedia Authoring. IEEE Multimedia. Vol. 2 No 3 Winter 1995. pp 24-35.
[Ginige 97]	Ginige A, Lowe D. Hypermedia Engineering: Process for developing large hypermedia systems. Tutorial at The Eighth ACM Conference on Hypertext. Southampton, UK, 9-11 April 1997.
[Hardman 89]	Hardman L. Evaluating the Usability of the Glasgow Online Hypertext. Hypermedia Vol. 1 Number 1 1989 pp 35-63.
[Heath 94]	Heath I, Hall W, Crowder RM, Pasha MA, Soper P. Integrating a Knowledge Base with an open Hypermedia System and its application in an Industrial Environment. Proceedings for the 3rd International Conference on Information and Knowledge Management CIKM'94 Workshop on Intelligent Hypertext Nov 29-Dec 1, 1994, Nist Gaithersburg, Maryland, USA.
[Hutchings 93]	Hutchings GA Patterns of Interaction with Hypermedia Systems: A study of Authors and users. PhD Thesis University of Southampton. 1993.
[Lechtenberg 98]	Lechtenberg S, Joubert GR, Effort Estimation for Multimedia Information System Development. awaiting publication, e-mail joubert@informatik.tu-clausthal.de.
[Moore 98]	Moore M. As Useful as ABC? IEE Manufacturing Engineering Vol. 77, No. 2 April 1998 pp 92-94
[Perlman 89]	Perlman G. Asynchronous Design/Evaluation Methods for Hypertext Technology Development. Hypertext'89 ACM conference November 1989 pp 61-81.
[Reynolds 92]	Reynolds AJ. The finance of Engineering Companies, An introduction for students and practising Engineers. Edward Arnold 1992.
[Risk 97]	Risk A, Sutcliffe D. Distributed link service in the Aqarelle project. Hypertext 97. ACM conference on Hypertext Southampton 6-11 April 1997 pp 208-209.
[Wills 97a]	Wills G.B, Heath I, Crowder R.M, Hall W. Evaluation of a User Interface Developed for Industrial Applications. University of Southampton Technical report No M97-4 ISBN-0854326499 at http://www.mmrg.ecs.soton.ac.uk/publications.html
[Wills 97b]	Wills G.B, Heath I, Crowder R.M, Hall W. Hypermedia Authoring in an Industrial Environment. University of Southampton Technical report No M97-6 ISBN-0854326626. At http://www.mmrg.ecs.soton.ac.uk/publications.html
[Wills 98a]	Wills G.B, Heath I, Crowder R.M, Hall W. Industrial Hypermedia Design University of Southampton Technical report No. M98-2 ISBN:- 0-585432-668-5. At http://www.mmrg.ecs.soton.ac.uk/publications.html

Appendix A Sample Costing for the Authoring Process

The information resource base used in the prototype to demonstrate the principles, and used in the evaluation, consisted of 20 MHAs containing over 760 nodes, which equates to 276 Mbytes of information. Over 2300 structural links were produced and held in 38 different linkbases, with over 500 additional generic links held in a further 22 linkbases, and over fifty clusters were produced. In addition, there are 18 compute-link linkbases and 11 guided tours.

Effort	Task	Number	Initial Time (Hours)	Effort Hours (with Learning Curve @ 95%)
Asset Paper
Gathering Information	All	320	0.27	55.68
Conversion	Text	120	0.10	8.42
	Images	200	0.12	15.77
Cleaning up	Text	120	0.17	14.03
	Images	200	0.25	33.78
Processing	OCR	120	0.08	7.02
	Correcting OCR	120	0.17	14.03
	thumbnail	30	0.20	4.66
	Scaling Images	200	0.08	11.26
New Information
	video	4	4.00	14.44
	Drawings	2	1.00	1.90
	Lists/Menus	6	0.42	2.19
Asset Electronic
Look & Feel	Text	221	0.17	24.70
	List	11	0.10	0.92
	Drawings	277	0.25	45.67
Structural Linking
	Automatic	1	0.17	0.17
	Manual	2600	0.08	121.08
	Linkbases	72	0.20	10.49
	MHAs	20	0.25	4.01
	Guided tours	11	0.50	4.61
Administration
	Document Registration	760	0.05	23.26
Total Hours Admin				418.11
Cognitive Linking
	Manual	350	0.08	18.91
	Linkbases	5	0.25	1.11
	Clusters	68	0.42	20.73
Total Hours Expert				40.75
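
The effort figures in the table can be reproduced from the Number and Initial Time columns. The sketch below assumes the cumulative-average learning curve of Section 3.2 at the 95% rate given in the table header, and assumes that the initial times were whole minutes before being rounded to two decimal places of an hour (for example, 16 minutes shown as 0.27 hours). Neither the function name nor the minute values appear in the report.

```python
import math

LEARNING_RATE = 0.95  # learning curve stated in the table header


def effort_hours(number, initial_minutes, rate=LEARNING_RATE):
    """Total hours for `number` repetitions under the learning curve."""
    exponent = math.log(rate) / math.log(2)
    first_time = initial_minutes / 60.0
    return number * first_time * number ** exponent


# (task, number, assumed initial time in minutes, hours quoted in the table)
ROWS = [
    ("Gathering information", 320, 16, 55.68),
    ("Conversion, text", 120, 6, 8.42),
    ("Structural linking, manual", 2600, 5, 121.08),
    ("Document registration", 760, 3, 23.26),
]

if __name__ == "__main__":
    for task, number, minutes, quoted in ROWS:
        hours = effort_hours(number, minutes)
        print(f"{task}: {hours:.2f} h (table: {quoted})")
```

For the rows checked here the computed hours agree with the table to within rounding.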

The total effort equates to three months of a person's time. The effort assumes that only one person will carry out each task and that all work is undertaken 'in-house'. It is also assumed that the design phase and information audit have identified all the information required and the file structures required to store the information. That is, the operator only needs to follow the procedure and not decide how or where to save information.

The cost of the effort can be calculated by using the standard university pay scales for the cost of employment, and by costing the overheads with the same method as the university uses when submitting bids for research grants.

The costs are broken down as follows:

The person carrying out the administrative type tasks will be on the administration scale.
The expert linking is calculated on the scale as a research assistant.
The additional process cost (C_P) is calculated as twenty percent of the cost of each effort activity, plus the cost of auditing, which is 600 person hours at the expert cost of employment.
The overhead cost (C_O) is based on the standard university costing of forty percent of the sum of the additional process cost and the cost of each effort activity.
The equipment cost (C_E) is based on:

Two networked Pentium II PCs with hard drives of about 7 GB.
Data capture hardware and software for paper and microfiche asset information.
Appropriate software, this includes word-processing, databases, computer aided drafting etc.
The cost of PC support.

Cost	Process	4251
	Additional	850
	audit	9000
	Overheads	5641
	Equipment	5000
	Total	24742
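
The cost table can be checked against the rules listed above. In the sketch below the expert hourly rate of 15 is not stated in the report; it is inferred from the audit figure (9000 for 600 person hours). The process cost of 4251 is taken directly from the table, and the function and parameter names are illustrative.

```python
def authoring_cost(process_cost, audit_hours, expert_rate, equipment_cost,
                   additional_rate=0.20, overhead_rate=0.40):
    """Break down the total authoring cost using the Appendix A rules."""
    additional = additional_rate * process_cost   # 20% of the effort cost
    audit = audit_hours * expert_rate             # audit at the expert rate
    # Overheads are 40% of the effort cost plus all additional process costs.
    overheads = overhead_rate * (process_cost + additional + audit)
    total = process_cost + additional + audit + overheads + equipment_cost
    return {
        "process": process_cost,
        "additional": additional,
        "audit": audit,
        "overheads": overheads,
        "equipment": equipment_cost,
        "total": total,
    }


if __name__ == "__main__":
    costs = authoring_cost(4251.0, 600, 15.0, 5000.0)
    for name, value in costs.items():
        print(f"{name:>10}: {value:9.2f}")
```

The computed figures agree with the table to within rounding of the individual entries.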
