Acknowledgment: Part of this work was presented at MILCOM '90 (IEEE), Monterey, CA, USA and International SIAM Conference, Washington, DC, June 1991.

20th October 1992

F. Marvasti and C. Liu\* (Communications Research Group, Dept. of Electronics and Electrical Eng., King's College London, University of London, Strand WC2R 2LS, London, United Kingdom)

\* Presently at Robotics, Skokie, IL, USA

## References

- 1 MARVASTI, F.: 'Spectral analysis of random sampling and error free recovery by an iterative method', *Trans. IECE Jpn.*, February 1986, E69, (2), pp. 79-82
- 2 MARVASTI, F., ANALOUI, M., and GAMSHADZAHI, M.: 'Recovery of signals from nonuniform samples using iterative methods', IEEE Trans., April 1991, ASSP-39, (4), pp. 872–878
- 3 MARVASTI, F.: 'A unified approach to zero-crossings and nonuniform sampling of single and multidimensional signals and systems' (NONUNIFORM Publication, Oak Park, IL 1987)
- 4 TAYLOR, A.: 'Functional analysis' (John Wiley, New York, 1958), p. 164

## DEVICE FOR GENERATING BINARY SEQUENCES FOR STOCHASTIC COMPUTING

M. van Daalen, P. Jeavons, J. Shawe-Taylor and D. Cohen

Indexing terms: Binary sequences, Logic and logic design, Information theory

A novel technique for the generation of high speed stochastic bit streams in which the '1' density is proportional to a given value is presented. Bit streams of this type are particularly useful in bit serial stochastic computing systems, such as digital stochastic neural networks. The proposed circuitry is highly suitable for VLSI fabrication.

Introduction: The use of stochastic bit streams allows a dramatic simplification of the circuitry required to implement many devices [1], because the multiplication of two values may be performed by computing the bit-wise conjunction of two corresponding streams. The highly pipelined digital design described in this Letter was developed for use with high speed digital stochastic neural networks, as described in References 2-4.
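The multiplication property can be illustrated with a short simulation. This is a minimal sketch, assuming two statistically independent streams produced by a software random number generator; the helper names `bernoulli_stream` and `density` are illustrative and do not come from the Letter.

```python
import random

def bernoulli_stream(p, n, rng):
    """Return n independent bits, each set with probability p."""
    return [1 if rng.random() < p else 0 for _ in range(n)]

def density(bits):
    """Fraction of bits that are set."""
    return sum(bits) / len(bits)

rng = random.Random(1)          # fixed seed so the run is repeatable
n = 200_000
a = bernoulli_stream(0.6, n, rng)
b = bernoulli_stream(0.3, n, rng)

# Bit-wise conjunction of the two streams
product = [x & y for x, y in zip(a, b)]

# The density of the AND stream should be close to 0.6 * 0.3 = 0.18
print(density(a), density(b), density(product))
```

The agreement relies on the two streams being independent; correlated streams would not multiply in this way.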

The proposed stochastic bit stream generator is highly pipelined and thus potentially extremely fast. The individual pipeline stages, known as modulators, are simple, and the number used defines the overall resolution of the generated bit stream. The approach taken is to synthesise the required stochastic bit stream by appropriately combining many independent stochastic bit streams with a bit probability of 0.5.\* These fixed value bit streams, referred to as carrier streams, are easy to generate using linear feedback shift registers [5], as described later.

Complete stream generator: The generator is constructed as a pipeline of k series connected single bit modulators, one for each bit of resolution in the required probability value. The input consists of a k bit binary value, representing a probability in the range $0_2$ – $0.1111 \dots 111_2$. The individual k binary bits of this value will be called 'modulation bits', and they are each connected, in sequence, to one of the k bit modulators.

Bit modulator: The circuit diagram below (Fig. 1) shows the logic required to implement a bit modulator. Each bit modulator processes a bit stream from the preceding stage (the very first stage is supplied with an all zero stream of bits) according to the value of the modulation bit which is passed to it. The modulation bit is connected to the terminal marked mod bit.



Fig. 1 General bit stream modulator element

The output of the bit modulator is connected to a clocked flipflop which allows these devices to be cascaded in series, thus forming a pipeline, producing one new output bit with every clock cycle. Such a bit stream generator would also require a holding register to provide the respective modulation bit inputs with the appropriate binary probability value.

The particular logic operation implemented by the general bit modulator depends on whether the value of the modulation bit is '1' or '0'. If it is '1', then the modulator effectively calculates the bit-wise OR of the input stream and the carrier stream. If the modulation bit is '0', then the modulator calculates the bit-wise AND of the input stream and the carrier stream. Thus stream generators that are intended to generate fixed stochastic values may be efficiently implemented.

Furthermore, very efficient usage of electrically reconfigurable FPGAs would be possible [6], as the required binary probability value can be directly encoded into the stream generator as a series of '1' or '0' bit modulators. To change this value, the relevant part of the FPGA may be rapidly reconfigured.

To understand how each modulator contributes to the operation of the overall stream generator, consider the probability of a bit being set in the output of a particular modulator on any clock cycle when the probability of a bit being set in the input stream from the previous stage is p. If the modulation bit is '1', then it follows from the independence of the input stream and the carrier stream that the probability of the output bit being set is $\frac{1}{2}p + \frac{1}{2}$. On the other hand, if the modulation bit is '0', then the probability of the output bit being set is $\frac{1}{2}p$.
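The two cases can be checked exactly by enumerating the four combinations of input bit and carrier bit. This is a minimal sketch, assuming an ideal carrier with bit probability 1/2 that is independent of the input; the function names are illustrative only.

```python
from fractions import Fraction
from itertools import product

def modulate(in_bit, carrier_bit, mod_bit):
    """One bit modulator: OR with the carrier if mod_bit is 1, AND otherwise."""
    return (in_bit | carrier_bit) if mod_bit else (in_bit & carrier_bit)

def output_probability(p_in, mod_bit):
    """Exact P(output = 1) for an input of density p_in and an independent
    carrier of density 1/2, summed over all (input, carrier) combinations."""
    half = Fraction(1, 2)
    total = Fraction(0)
    for in_bit, carrier_bit in product((0, 1), repeat=2):
        weight = (p_in if in_bit else 1 - p_in) * half
        total += weight * modulate(in_bit, carrier_bit, mod_bit)
    return total

p = Fraction(1, 4)
print(output_probability(p, 1))  # p/2 + 1/2 = 5/8
print(output_probability(p, 0))  # p/2       = 1/8
```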

The combined effect of the sequence of modulators, with the modulation bits taken from the desired binary probability value, is to construct a bit stream in which the probability of a bit being set is equal to this probability value.\*
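A behavioural simulation of the whole chain shows this convergence. It is a sketch under stated assumptions rather than a model of the actual circuit: the carriers are idealised independent random bits, pipeline latency is ignored, and the bit ordering (least significant modulation bit at the first stage) is inferred from the halving performed by each stage. The names `generate_stream` and `target_value` are hypothetical.

```python
import random

def generate_stream(mod_bits, n_cycles, rng):
    """Simulate a chain of bit modulators.

    mod_bits lists the binary fraction most significant bit first, e.g.
    [1, 0, 1] for 0.101 in binary (0.625). The least significant bit drives
    the first stage, so the most significant bit ends up with weight 1/2.
    """
    out = []
    for _ in range(n_cycles):
        bit = 0  # the first stage is fed an all zero stream
        for mod in reversed(mod_bits):
            carrier = rng.getrandbits(1)  # idealised carrier, P(1) = 1/2
            bit = (bit | carrier) if mod else (bit & carrier)
        out.append(bit)
    return out

def target_value(mod_bits):
    """Value of the binary fraction 0.b1 b2 ... bk."""
    return sum(b / 2 ** (i + 1) for i, b in enumerate(mod_bits))

rng = random.Random(7)
bits = [1, 0, 1, 1]                      # 0.1011 in binary = 0.6875
stream = generate_stream(bits, 100_000, rng)
measured = sum(stream) / len(stream)
print(target_value(bits), measured)      # measured should be close to 0.6875
```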

Bit stream resolution: All modulators multiply their input bit streams by $\frac{1}{2}$. A consequence of this is that in a given stream generator each modulator acquires a particular weighting factor with respect to the final output stream. The kth modulator, which is also the one furthest from the final output, has the smallest weighting factor. This then defines the resolution of a k modulator bit stream generator as being $1/2^k$.

To accurately represent a value as a stochastic bit stream of this type, it is important that an appropriate number of stream bits is processed, such that any inaccuracies due to random variance errors are eliminated. As the bit stream is a Bernoulli sequence, the number of bits set in it follows a binomial distribution, whose variance is a function of the encoded bit probability. The worst case variance occurs when $p = \frac{1}{2}$, and it can be shown that the number of stream bits required to achieve the maximum level of accuracy is given by the following expression: $n = 2^{2v-2}$, where v is the number of bits in the binary probability value.
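The Letter does not give the derivation. One reading that reproduces the expression, offered here as an assumption, is to require that the worst case standard deviation of the estimated density equals one resolution step $2^{-v}$. With $x_i$ denoting the individual stream bits and $\hat{p}$ the density measured over n bits:

$$
\hat{p} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad
\operatorname{Var}(\hat{p}) = \frac{p(1-p)}{n} \le \frac{1}{4n} \quad \text{(equality at } p = \tfrac{1}{2}\text{)}
$$

$$
\sigma_{\max} = \frac{1}{2\sqrt{n}} = 2^{-v}
\;\Longrightarrow\; \sqrt{n} = 2^{v-1}
\;\Longrightarrow\; n = 2^{2v-2}
$$

Under this reading, an 8 bit probability value would call for $2^{14} = 16384$ stream bits.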

Generating the carrier streams: A chip containing n bit stream generators, each with k modulators, will require kn statistically independent carrier streams. Ideally, each stream should be based on a truly random source of bits, but this is difficult to arrange and thus impractical within a digital device.

<sup>\*</sup> JEAVONS, P., COHEN, D., and SHAWE-TAYLOR, J.: 'Generating binary sequences for stochastic computing', submitted to IEEE Trans. Information Theory

The solution described in this Letter is to make use of a linear feedback shift register, configured to generate a maximal length pseudorandom bit sequence [7]. A criterion for maximal length PRBS generators ($[2^n - 1]$ bits) is that the shift register polynomial of degree n must be irreducible over the Galois field of order 2 where n is prime. Tables of useful register sizes and tap positions that lead to such sequences are given in References 7 and 8.
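A small software model makes the maximal length property concrete. This is a minimal sketch, assuming a 7 stage Fibonacci-style register with the polynomial $x^7 + x + 1$, chosen only to keep the example short; it is not the register size or tap set used by the authors, and the function names are illustrative.

```python
def lfsr_step(state, taps):
    """Advance a Fibonacci LFSR by one clock.

    state is a tuple of bits, oldest sequence element first. The new bit is
    the XOR (sum over GF(2)) of the bits at the tap offsets.
    """
    feedback = 0
    for t in taps:
        feedback ^= state[t]
    return state[1:] + (feedback,)

def lfsr_period(seed, taps):
    """Count clocks until the register returns to its starting state."""
    state = lfsr_step(seed, taps)
    steps = 1
    while state != seed:
        state = lfsr_step(state, taps)
        steps += 1
    return steps

def lfsr_bits(seed, taps, length):
    """Return the first `length` bits of the output sequence."""
    state, out = seed, []
    for _ in range(length):
        out.append(state[0])
        state = lfsr_step(state, taps)
    return out

N = 7
TAPS = (0, 1)                   # s[i+7] = s[i] XOR s[i+1], i.e. x^7 + x + 1
SEED = (1, 0, 0, 0, 0, 0, 0)    # any non-zero starting state will do

p = lfsr_period(SEED, TAPS)
ones = sum(lfsr_bits(SEED, TAPS, p))
print(p, ones)
```

For a maximal length register of degree 7 the period is $2^7 - 1 = 127$, and one period contains 64 ones and 63 zeros, so the bit probability is very close to 0.5.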

Multiple streams from small number of taps: One solution to the problem of generating many independent sequences has been described in Reference 5, where a single large PRBS generator is used. The multiple carrier streams are all derived from it by adding together in different combinations the outputs from a small number of taps taken from appropriate points along the shift register. These distinct carrier streams are each part of the main pseudorandom sequence, but shifted so that they start from different positions.

It is suggested in Reference 5 that, to minimise unwanted correlations between the derived streams, the tap positions and combinations used to produce them should be organised in such a way that the resulting sequences start from well-spaced locations in the original sequence. However, it should be noted that this is not sufficient to ensure that the derived sequences are not highly correlated. In fact, if the number of derived sequences is greater than the number of taps, then this method is bound to introduce correlations between them.

To illustrate this point consider using just two tap positions yielding two sequences, $(s_i)$ and $(t_i)$, and combine these to derive a third sequence $(s_i \oplus t_i)$. If three carrier sequences are uncorrelated then the sequence obtained by taking the bitwise AND of all three will have bits set with probability 0.125. However, the bitwise AND of the sequences $(s_i)$, $(t_i)$ and $(s_i \oplus t_i)$ is the all zero sequence, which indicates that these three sequences are highly correlated.
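The effect is easy to reproduce. The sketch below stands in software random bits for the two tap outputs, which is an assumption made for brevity; the argument depends only on the XOR relationship between the three streams, not on how the first two are produced.

```python
import random

def density(bits):
    """Fraction of bits that are set."""
    return sum(bits) / len(bits)

rng = random.Random(3)
n = 10_000
s = [rng.getrandbits(1) for _ in range(n)]   # stand-in for the first tap
t = [rng.getrandbits(1) for _ in range(n)]   # stand-in for the second tap
u = [a ^ b for a, b in zip(s, t)]            # derived third stream

# Pairwise, the streams look independent: each AND has density near 0.25
pair_su = [a & c for a, c in zip(s, u)]
pair_tu = [b & c for b, c in zip(t, u)]

# Taken together they are not: s AND t AND (s XOR t) can never be 1
triple = [a & b & c for a, b, c in zip(s, t, u)]

print(density(pair_su), density(pair_tu), density(triple))
```

The pairwise densities give no hint of the problem, which is why well-spaced starting positions alone do not guarantee usable carrier streams.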

Multiple streams from successive taps: The method presented in this Letter for generating the carrier streams makes use of a single maximal length PRBS generator. The streams are simply obtained directly by tapping successive elements of the PRBS shift register. These bit streams are highly overlapping, but almost perfectly uncorrelated.\*

Any realistic device will require a large number of carrier streams, such that typical shift register lengths will be of the order of 1000 bits (lengths of this order will also produce extremely long PRBS sequences). Such a shift register would be sectioned up, with each portion supplying carrier streams to the local stream generators. In this way a large PRBS shift register can be conveniently and efficiently distributed across a chip.

To ensure that the carrier stream inputs to successive modulators in a bit stream generator do not coincide, we simply clock the PRBS shift register and the stochastic bit stream generators such that they produce their bit streams in opposite directions relative to each other. In this way the elements of the PRBS sequence $(s_i)$ which are used for the generation of the jth bit from a bit stream generator with k modulators are $s_j, s_{j+2}, \dots, s_{j+2k-2}$. Provided $2k \le n$, where n is the length of the PRBS shift register, all possible k bit binary sequences will be generated equally often at these positions as the PRBS shift register is clocked, with the exception of the all zero sequence, which will have a probability deficit of $1/(2^n - 1)$. Hence for large values of n, the inputs to the modulators will be effectively uncorrelated and will have almost the required frequencies of 0s and 1s.
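The counting claim can be checked on a small register. This is a minimal sketch, assuming n = 7 with the polynomial $x^7 + x + 1$ and k = 3 modulators, values chosen for illustration only; it tallies the k bit patterns found at positions $s_j, s_{j+2}, \dots, s_{j+2k-2}$ over one full period.

```python
from collections import Counter

def m_sequence(n, taps):
    """One period of a Fibonacci LFSR sequence (assumed maximal length)."""
    state = (1,) + (0,) * (n - 1)
    out = []
    for _ in range(2 ** n - 1):
        out.append(state[0])
        feedback = 0
        for t in taps:
            feedback ^= state[t]
        state = state[1:] + (feedback,)
    return out

N, K = 7, 3                      # 2K <= N, as the text requires
seq = m_sequence(N, (0, 1))      # x^7 + x + 1, chosen for illustration
P = len(seq)                     # 2^7 - 1 = 127

# Pattern seen by the K modulators when producing output bit j:
# s_j, s_{j+2}, ..., s_{j+2K-2}, with indices wrapping around the period.
counts = Counter(
    tuple(seq[(j + 2 * i) % P] for i in range(K)) for j in range(P)
)
for pattern in sorted(counts):
    print(pattern, counts[pattern])
```

Each non-zero pattern occurs $2^{n-k} = 16$ times and the all zero pattern 15 times, so its relative frequency is 15/127 rather than 16/127, a shortfall of $1/(2^n - 1)$.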

Conclusion: This Letter has shown how it is possible to build a device capable of simultaneously generating many stochastic bit streams for use in bit serial stochastic computing systems [1-3]. The overall design is relatively simple, and is highly pipelined, thus allowing extremely fast operation. The initial prototypes are expected to produce bit streams at a rate in excess of 50 MHz.

M. van Daalen, P. Jeavons, J. Shawe-Taylor and D. Cohen (Connection Science and Machine Learning Group, Royal Holloway and Bedford New College, University of London, Egham, Surrey, United

## References

- 1 GAINES, B. R.: 'Stochastic computing systems', Advances in Information Systems Science, 1969, 2, pp. 37–172
- 2 VAN DAALEN, M., JEAVONS, P., and SHAWE-TAYLOR, J.: 'Probabilistic bit stream neural chip: Implementation'. Int. Workshop on VLSI for Artificial Intelligence and Neural Networks, Oxford University, September 1990
- 3 SHAWE-TAYLOR, J., JEAVONS, P., and VAN DAALEN, M.: 'Probabilistic bit stream neural chip: Theory', Connection Science, 1991, 3, (3),
- 4 STANFORD TOMLINSON, M., JUN., WALKER, D. J., and SILVILOTTI, M. A.: 'A digital neural network architecture for VLSI'. IJCNN, San Diego, 1990, II, pp. 545-550
- 5 ALSPECTOR, J., GANNETT, J. W., HABER, S., PARKER, M. B., and CHU, R.: 'A VLSI efficient technique for generating multiple uncorrelated noise sources and its application to stochastic neural networks', IEEE Trans., 1991, CAS-38, (1), pp. 109-122
- 6 Toshiba Ltd: 'Field programmable gate array'. TC9900, 1992
- 7 WATSON, E. J.: 'Primitive polynomials (mod 2)', Math. Comp., 1962, 16, pp. 368-369
- 8 ZIERLER, N.: 'Primitive trinomials whose degree is a Mersenne exponent', Inf. Control., 1969, 15, pp. 67-69

## GRATING ASSISTED VERTICAL COUPLER/FILTER FOR EXTENDED TUNING RANGE

L. L. Buhl, R. C. Alferness, U. Koren, B. I. Miller, M. G. Young, T. L. Koch, C. A. Burrus and G. Raybon

Indexing terms: Optical couplers, Multiplexing

Broadly tunable, narrowband wavelength filters that can be integrated with other photonic devices are important elements for WDM multi/demultiplexing systems. Recently, we have demonstrated a broadly tunable, narrowband wavelength selective grating assisted vertical directional coupler with a 215 Å tuning range. In this Letter, we report an improved version of the vertically stacked buried rib InGaAsP/InP integrable wavelength filter that has a tuning range greater than 370 Å.

Broadly tunable, narrowband wavelength filters that can be integrated with other photonic devices (e.g. lasers, detectors, amplifiers) are important elements for WDM multi/demultiplexing systems. Recently, we demonstrated a broadly tunable, narrowband wavelength selective grating assisted vertical directional coupler with a 215 Å tuning range [1]. Although this initial demonstration indicated enhanced tunability compared with, for example, reflection grating filters, the enhancement was less than expected in an optimally designed device. Essentially, the enhanced tunability of the grating assisted vertical coupler filter results because the normalised change in the tuned centre wavelength, $\Delta \lambda/\lambda$, depends on the tuning induced index change normalised by the difference in index between the two waveguides, rather than by the index of either waveguide as it would for a grating reflector. We report an improved version of the vertically stacked buried rib InGaAsP/InP integrable wavelength filter that has a tuning range greater than 370 Å.

The schematic diagram for the improved structure, which is essentially an inverted version of the earlier design, is shown in Fig. 1. The device consists of two buried rib waveguides with very different effective indices ($\Delta N = 0.11$), which can be easily achieved by different waveguide material compositions. In this case the bottom buried rib waveguide is of $\lambda_g = 1.1 \,\mu\text{m}$ bandgap material and the top waveguide is $\lambda_g = 1.3 \,\mu\text{m}$. The coarse period grating ($\Lambda_g = 16 \,\mu\text{m}$) used to phase match the evanescent coupling between the two waveguides at a particular $\lambda_0$ is located above the top waveguide and is of the same material composition as the top waveguide.

This present device offers several advantages over the previous implementation described in Reference 1. Most importantly, placing the higher bandgap wavelength waveguide on