
Data Structures for Image Processing in C

M.R.Dobie and P.H.Lewis
Department of Electronics and Computer Science
University of Southampton
Southampton SO9 5NH

January 1991

Abstract:

This paper describes a library of data structures and functions written in the C language which are designed to provide a framework for implementing image processing algorithms.

A flexible and efficient approach to handling images and associated data in an object-oriented manner is described. Various design and implementation issues are discussed, with examples of using the structures provided and implementing new functions within the general framework.

Keywords: data structures, image processing, C, object oriented programming.

Introduction

Image understanding is a very broad area of study which places many demands on computers. It usually involves complex calculations on large amounts of data, and the data usually has a wide range of different representations. Consequently, there are many data structures in use for handling the different types of information in different representations. As an example, one has only to look at the large number of formats used for storing images on disk or tape. Many have their own advantages (e.g. low storage requirements), but the big disadvantage is that using each new format means using (or writing) a new set of routines to convert and manipulate it.

The same is true of internal image processing data structures. There are many different types of data (e.g. colour images, edge maps, convolution masks) and many different representations of the data (for example, a colour image can be represented by the RGB, HSV, YUV or YIQ models).

In our laboratories we have a variety of projects dealing with images on several different types of hardware. Images can be stored on disk, captured in real time from a camera or stored on LaserVision disks. There is a need for low level image processing operations (e.g. thresholding, convolution) and for medium level processing (e.g. matching algorithms and Hough Transform) using monochrome and colour images. The work described in this paper is designed to provide a framework which is capable of supporting these requirements and which is easy to extend, allowing easier exploration of higher level, image understanding techniques.

In this specific case, our aim was to implement image processing routines which are suitable for image sequence analysis. These routines need to cover areas such as image acquisition, display, differencing, edge detection, tracking algorithms and handling sequences. The data structures need to be flexible enough to cope with different types of images (for example, monochrome, colour and edge maps) and to deal with arbitrarily shaped images. Ideally, it should be easy for anyone else to apply the routines to another problem and to implement new routines.

The decision was made to implement everything in the C language as a library. This gives an easy interface to our image acquisition and display hardware and allows easy integration with existing image processing software in the department (also written in C) and with most system software (like a graphical user interface). There are two parts to the design, the data structures and the function interfaces.

Data Structure Design

In [4], Piper and Rutovitz discuss data structures for image processing. Their main conclusions are that a systematic and organised object oriented approach to data structures allows a very flexible system to be built. The use of pointers to refer to objects and the ability to process arbitrarily shaped images are two features which help to facilitate this. Some example structures are shown which demonstrate how processing functions can be implemented. The paper does not explore the problems of handling multiple data types in C.

Piper and Rutovitz describe how their routines, together with a suitable intermediate file format, have been compiled into filter programs for use in the UNIX shell environment to provide an interactive set of image processing tools.

The work of Piper and Rutovitz has been used as a starting point by the Numerical Algorithms Group in the development of a standard image processing algorithms library (IPAL) [1]. At the moment there are no freely available image processing libraries as there are for other application areas (databases or numerical analysis, for example). Unfortunately, the IPAL library is still very much under development and it didn't supply the functionality we required for image sequence analysis, so we decided to implement our own data structures incorporating some of the features of IPAL; specifically, the notion of a flexible image structure which can handle many types of image data and is easy to extend.

Following the lead of Piper and Rutovitz [4] we decided that the data structures would be referenced via a single pointer. The main advantage of this technique is that we only ever pass a single pointer between routines, which is fast and requires little stack space. Another advantage is that one routine can deal with different types of object referenced via the same type of pointer. All the information which a routine might need about the object (type, sizes etc.) is available via the pointer.

This idea is applied inside the objects as well. Many objects contain pointers which can point to a number of different structures depending on the type and representation of the object itself. This approach also cuts down the memory requirements of the system, since there are never any redundant fields in the objects to allow for more complex structures which may not be present.

This leads to elegant interfaces to the routines with all the detailed information about the object contained within the object itself, rather than being passed in a huge parameter list (which is confusing to the reader and inefficient). Because of the simplified interfaces to the routines, very little learning is required before a newcomer can use the routines in an application.

This approach does have a disadvantage. When implementing a routine using the data structures, the routine must check each object and execute a relevant piece of code for that type of object, or signal an error condition if an object has been supplied for which the routine is inappropriate. The flexibility and ease of use from the caller's point of view place a burden on the routine implementor, since he or she must cope with all the possible types of objects that may be passed to the routine.

Function Design

As already mentioned, the input parameters to the library functions have been simplified by the design of the data structures. Several schemes for returning the output parameters from library functions were considered. In particular, it is consistent for functions to return pointers to objects as their output, since we have already decided that their inputs will be pointers to objects. There are several types of functions with different needs for their interfaces. We have functions that return new objects (e.g. threshold image A and put the result in image B), functions that return information about existing objects (e.g. calculate the area of image A) and functions that don't return anything (e.g. destroy image A). Some functions need the ability to communicate error conditions and others don't.

If a function returns an object, there are several ways it can be done:

1. The caller creates an empty output object and passes a pointer to it into the function, which fills it in and returns an error code.
2. The function creates the output object itself and returns a pointer to it.
3. The function overwrites one of its input objects with the result.

The third technique can be eliminated straight away. Although some image processing operations can be performed `in situ', many give a result of a completely different type to the inputs. This method would involve an object changing its type halfway through a program (which is asking for confusion) and would require explicit copying of the input objects by the programmer if they were to be preserved. The other two methods look more promising. Taking the thresholding example from above and assuming a structure type IMAGE, we can compare the code required to call a library function in figure 1.


  
Figure 1: Calling the threshold function using the first technique, with an explicit error variable err, and using the second technique.

The first technique needs one more line of code than the second, but it handles the error code neatly. Its big disadvantage is that the programmer needs to know the relationship between the input and output images of the threshold function (e.g. are they the same size or type?) in order to create the correct empty IMAGE object.

Using the second technique, this relationship is encapsulated within the threshold function itself, so the function creates the correct output image depending on the input image and the programmer doesn't have to worry about it. The second technique achieves a higher level of abstraction which makes it easier to use. This is the method chosen for the library. The handling of error conditions is discussed in section 4.

If a function returns several results (e.g. some edge detectors return an edge magnitude image and an edge direction image) these can all be returned in one IMAGE object. The type of the object will reflect the fact that it contains both the edge magnitude and edge direction images. This is appropriate for logically related results. Independent multiple results can be handled in the traditional C fashion by passing several parameters by reference, which can be altered by the function. An example is a function which returns several statistics about an image. Both methods are used in the library.

Another point to note is that with the second technique you can cascade function calls. This is illustrated in figure 2. The pointer returned by acquire_image points to the input image for threshold and the thresholded image is the parameter for display_image. The notational convenience of this code increases the usability of the functions. One consequence is that each function should fail gracefully if its input parameters are NULL to prevent a program crashing within a library routine.


  
Figure 2: A cascaded function call: the image returned by acquire_image is thresholded and the result is passed straight to display_image.

One thing to watch for is that this code will create structures which are not subsequently destroyed and therefore the free memory will be used up. This is fine if you have lots of memory. A garbage collection system or automatic destruction (like that in C++) would solve the problem.

To solve the problem in standard C, the intermediate pointers have to be saved in variables and the intermediate objects destroyed explicitly. A simple technique is shown in figure 3.


  
Figure 3: Saving the intermediate pointers in temporary variables so that the intermediate objects can be destroyed explicitly with destroy_image.

Error Handling

 

There are several ways in which functions could notify their caller of an error condition. The neatest way is for the function to return an error code, but this is ruled out by the function design, which returns pointers to objects as the function return value.

An alternative is for the caller to pass in the address of a variable which the function can set to the error code. This was used in the example above. This is not very readable and unnecessarily complicates the arguments to the function.

We chose a method similar to that used by the standard C run-time library [3]. A global variable (called image_error_code) is provided which is set to an error code if an error occurs. An error condition is signalled by a function returning an invalid value. Functions which return pointers will return NULL if there is an error. Functions which normally have no return value will return a boolean, with FALSE indicating an error.

Error messages can be optionally displayed on the standard error output for information or debugging, as well as the program taking appropriate action after examining the image_error_code. This is similar to the technique used in the IPAL library being developed by the Numerical Algorithms Group [1].

Note that functions which take pointers to objects as arguments do not attempt to check whether the pointer points to a valid object, since there is no reliable way of doing this. It is up to the caller to ensure the integrity of pointers passed into the library functions. The most a function can do is check whether the type of object that it is passed is sensible for that function's operation.

A Small Example

Figure 4 shows a small example program demonstrating how the library functions are called in practice. This program acquires a monochrome image from the current video source and thresholds the centre portion of it, displaying the results in the frame store. The result is shown as figure 5.


  
Figure 4: A small example program which thresholds the centre portion of an acquired image.

The program declares some variables. There are two IMAGEs and one WINDOW. The IMAGE structures hold an image and the WINDOW structure defines the shape and size of an area. These structures are discussed in more detail in section 6.1. The first function call acquires a full screen monochrome image from the frame store. It creates the structure and returns a pointer to it, which is saved in big_image. Next we create a WINDOW structure which represents the centre portion of the image. The coordinates give a square area and the NORMAL indicates a rectangular window (rather than an arbitrary shape).

Next we have a cascaded function call. fg_copy_mono copies the area from the frame store and this image is used as the argument to the thresholding function. The area is thresholded at a level of 128 using the monochrome plane of the image (specified by PLANE_MONO) and the resulting binary image is saved in small_image.

After clearing the frame store we display the original image and the thresholded area on top of it. Figure 5 shows this applied to a picture of a cobra.


  
Figure 5: The output of the thresholding example program

Implementation

The library has been implemented in C running under UNIX. This combination allows large data structures to be dynamically created and is ideal for the data structures described above. The library can be split into two parts. There are routines which create and manipulate the data structures and routines which operate on the data itself. These are summarised in appendix A.

Implementing the Data Structures

 

Each data structure is encapsulated in one source file and one header file. The header file declares the data structure itself and all the corresponding types and the functions which operate on the structure. The source file defines the functions.

Each source file has routines to create, copy and destroy a data structure. Often there are file I/O routines too. Many structures have routines to retrieve and set some of their fields, and to perform simple operations on the object they represent.

Most of the data structure manipulation routines are written in terms of common low level memory management routines, which deal in the blocks of memory that make up a data structure. These routines are invoked through macros with the type of the object as a parameter. This increases the readability of the code, since the programmer doesn't need to cast the returned pointer or calculate the size of the object; the macro fills these in automatically.

The macro is defined as in figure 6, where create_structf is a function which calls the system to allocate some memory and checks to see if enough was available.


  
Figure 6: The definition of the structure creation macro, which invokes create_structf with the size of the object.

This allows the implementor of a new structure to write the code in figure 7.

Figure 7: Using the creation macro to allocate a new structure.

A similar abstraction is available for copying objects and for reading and writing objects to files. Destroying objects is all done by the same function which calls the system to free the memory.

The main data structures available are IMAGE and SEQUENCE. An IMAGE looks like,

IMAGE
    image type
    data storage type
    number of frames
    image size WINDOW
    pointer to the raw data

The image size WINDOW defines the size, shape and position of the image. A number of separate images (frames) can be stored in a single structure (although this is not commonly used, as memory is at a premium). The data storage type defines the type of the raw image data (for example, bytes, integers, floating point) and the image type defines the type of image in the structure.

Several types of images have been incorporated into the library. These are monochrome images, colour images (both Red, Green, Blue and Hue, Saturation, Brightness), edge maps, thresholds and Hough transform accumulators. Depending on the type, an image may have several planes in each of its frames. A colour HSB image has three planes, one each for Hue, Saturation and Brightness. An edge map has two planes, one for edge magnitude and one for edge direction. A threshold image is simply a binary image of 0s and 1s which is generated by some functions to indicate which areas of an image satisfy some criterion. Such an image can be used by other functions to apply an operation to a specific area of an image. A Hough transform accumulator image currently holds a two dimensional Hough accumulator array and is generated by the Hough transform functions.

The WINDOW structure which defines the size and shape of an image has the following structure,

WINDOW
    window type
    bounding coordinates
    BOUNDARY

The bounding coordinates specify the smallest rectangle enclosing the image and the BOUNDARY structure represents an arbitrary shape if necessary. The window type defines whether or not a BOUNDARY structure is present.

A BOUNDARY structure is a head for two linked lists. It contains pointers to the first and last elements of each list. One list contains elements which represent horizontal segments of scan lines (i.e. two x coordinates and a y coordinate). The segments are stored in left to right, top to bottom order and together represent the area of the shape. The actual image data that is stored for such an area is that contained within the bounding rectangle of the shape. This makes traversing the image data more efficient (see section 6.3).

The other list contains elements which represent vertices of a polygon. This list can be used to draw the outline of the shape. Both lists are created and maintained independently as conversion between them would be unnecessarily complex.

At the same level as an IMAGE is the SEQUENCE. A SEQUENCE is an abstract representation for a set of IMAGEs. It is an abstract representation since it does not contain the images themselves, but it knows where to get them from.

SEQUENCE
    sequence type
    frame number data
    sequence source

The sequence source points to an IMAGE object or a VIDEODISC object, depending on the sequence type, where the frames are stored. The frame number data holds information about the length of the sequence and where to find it. The same code can access individual IMAGEs from the sequence whatever device the frames are stored on. This abstraction is very useful for dealing with `real world' images from laserdiscs (which we could not hope to store on magnetic disc) and short artificial test sequences which we create ourselves. A SEQUENCE can be converted to an IMAGE object (containing all the frames) for processing of all the frames together, which is useful in some cases.

The error handling code is in another source file. This defines the global image_error_code variable and all the values it can take. It also provides functions to make error messages visible or invisible.

Routines which interface with different types of hardware are in separate source files. There are routines to capture and display images using frame grabber hardware, to digitise images from a video disc player and to display images on a SUN screen.

Implementing Image Processing Routines

 

The actual image processing routines are in a separate source file for each routine. These routines are quite complicated as they have to deal with all possible types of data structure passed to them. Although this makes things easier for the library user, it complicates matters for the function implementor.

There are two main ways in which the IMAGE structures differ. The first is the image type. Some functions (for example displaying and copying) are applicable to all types of images whereas other functions only make sense for some types. Each function must examine the type of the image and decide how to process it, even if it just means signalling an error condition.

The other difference between IMAGE structures is the storage type. This affects how large the structure is and how the function has to access the image data.

A function implementing a typical image processing operation would have the structure shown in figure 8.


  
Figure 8: The outline of a typical image processing function, which creates and returns a new IMAGE.


  
Figure 9: A definition of the process_type macro.

In the final switch statement which deals with the storage type, the process_type macro is invoked for each case. The type for accessing the image data is the first parameter of the macro. This allows the macro (which does the processing) to be written in a type independent way and reduces the scope for errors in duplicated code.

The process_type macro might be defined as in figure 9.

At this level we set up two pointers to the source and destination image data. The macro iterates over all the sub images in the structure (usually only one), processing the planes according to the type of the image. Here, process_plane would be a macro that is given the source and destination pointers and some information about the size of the plane and sets pixels in the destination image as some function of the source image data. It is here that all the indescribable pointer manipulations take place which are the key to an efficient implementation of a function. It leaves the pointers ready at the start of the next plane, so that this example would perform the same operation on each plane of each frame in the IMAGE structure.

This type of structure has been flexible enough to implement many image processing functions. These cover image acquisition and display, edge detection, image differencing, Hough transforms, image scaling, image thresholding and conversions between different image types.

Efficiency Considerations

 

One of the aims during the design and implementation of this library has been efficiency. There are trade-offs to be made between types of efficiency. In the design of the function interfaces pointers have been used to pass whole image structures to functions. This gives efficiency of expression as well as a speed advantage when compared to functions with many parameters. The main area where efficiency gains can be made is in the implementation of the processing routines themselves. Here a decision was made to achieve speed efficiency over space efficiency.

It is well known that sequential access to memory using pointer manipulation is faster than similar access using an array [3]. Given the static nature of arrays in C, we decided to use pointers and dynamic memory allocation throughout the library. This allows the implementation of arbitrary sized images without wasting memory and without a maximum size constraint imposed by the library.

Pixel Manipulation

In the lowest level macro of a function implementation (like the process_plane macro in section 6.2) all the pixel manipulation is performed with pointers. Typically there is at least one pointer to source image(s) and a pointer to a destination image.

For a rectangular image the method used to traverse the image data depends on the operation being applied. For a simple pixel operation (like global thresholding) one loop is sufficient, incrementing the pointers width x height times. For an operation using x and y coordinates, two nested loops are used with the pointers being incremented in the inner loop. For a neighbourhood operation, surrounding pixel data is accessed by adding and subtracting offsets from the pointers. An offset of 1 allows data one pixel to the left and right to be accessed and an offset of width allows data one pixel above and below to be accessed. This easily extends for larger neighbourhoods.

For an arbitrarily shaped image there is a BOUNDARY structure associated with the image's image size WINDOW, as described in section 6.1. The actual data stored corresponds to the rectangular bounding area which encloses the shape of the image. There is some space redundancy here, as more data is stored than strictly required, but this is offset by the fact that the simple pointer manipulations just described can still be used for arbitrarily shaped images, leading to simpler, faster and more robust code. The only difference need be in the loops which control the traversal of the data.

These loops consist of an outer loop (to follow the linked list in the BOUNDARY structure) and an inner loop to move along each horizontal line segment. However, if it is not vital that only the area of the image is processed, then the whole rectangular bounding area can be processed as though it were a rectangular image. For example, to display an arbitrarily shaped image it is vital that only pixels within the image are displayed. To globally threshold the image it doesn't matter if all the pixels in the bounding rectangle are processed, as long as all the image pixels are thresholded.

Clearly, there are more trade-offs here. Treating the image as rectangular allows the same (simpler) code to be used for all cases. The disadvantage is that you are processing more data than you need to. The cost of processing the extra pixels needs to be weighed against the extra complexity of the traversal loops. This depends on the cost of processing an individual pixel and the particular shape of the image.

Note that for the rectangular case, no special action has to be taken to account for edge effects in neighbourhood operations, whereas the traversal code for a shape would have to be altered to allow a border around the arbitrary area. So far, arbitrarily shaped images have been treated as rectangular except where necessary (image display, template matching and Hough transform matching).

Future Developments

The library is constantly having new functions added as they are needed. It is hoped that a number of other projects will use the data structures (or the ideas behind them) so that eventually a large base of useful software will be developed.

Since the library was first described (in [2]) many functions have been added and a few more image types have been created. No major changes have been required to the basic design of the data structures and functions.

The library code is written in an object-oriented fashion, which makes it an ideal candidate for implementation in C++. The C++ source code would be a lot neater and more robust. Memory management would be greatly improved as each object would be automatically destroyed as soon as it goes out of scope.

Data Structures and Functions

 

This appendix describes the functions and data structures that the library contains. More detailed documentation and source code are available from the authors (e-mail: mrd@ecs.soton.ac.uk).

Bibliography

1
M. K. Carter, K. M. Crennell, E. Golton, R. Maybury, A. Bartlett, S. Hammarling, and R. Oldfield.
The design and implementation of a portable image processing algorithms library in FORTRAN and C.
In Proceedings of the 3rd IEE International Conference on Image Processing and its Applications, pages 516-520, 1989.

2
M. R. Dobie and P. H. Lewis.
Tools and data structures for image processing in C.
1990 Research Journal, Department of Electronics and Computer Science, University of Southampton, pages 61-63, 1990.

3
Brian W. Kernighan and Dennis M. Ritchie.
The C Programming Language.
Prentice Hall, 1978.

4
Jim Piper and Denis Rutovitz.
Data structures for image processing in a C language and UNIX environment.
Pattern Recognition Letters, 3:119-129, March 1985.

Mark Dobie
1998-06-08