DIGITAL INSTRUMENTATION FOR WALSH-FOURIER SPECTRUM MEASUREMENT

by


A Thesis
Submitted to the Faculty of Graduate Studies in Partial Fulfilment of the Requirements for the Degree
Doctor of Philosophy

McMaster University
August 1979
DIGITAL INSTRUMENTATION FOR WALSH-FOURIER SPECTRUM MEASUREMENT
DOCTOR OF PHILOSOPHY (1979)  McMaster UNIVERSITY
(Electrical Engineering)  Hamilton, Ontario

TITLE: Digital Instrumentation for Walsh-Fourier Spectrum Measurement

AUTHOR: K. Muniappan, B.E. (Madras, 1966)
        M.Eng. (McMaster, 1976)

SUPERVISOR: Professor R. Kitai

NUMBER OF PAGES: xvi, 187
ABSTRACT

A review of Walsh Spectral Analysers (WSA) using direct and fast transform methods is presented. A serial processor is developed that uses long shift registers to perform a Hadamard transform. This is extended by adding additional hardware to yield coefficients in dyadic and sequency order. Incoming data is thereby stored in a permutated manner, followed by sequential retrieval and transfer to a Hadamard transform processor. This scheme is faster than an earlier processor described by Geadah and Corinthios. Also, a new pipeline structure with identical stages is evolved.

The design of a Microprocessor-based Walsh-Fourier Spectral Analyser is given in detail. It uses an off-the-shelf single board microcomputer System 80/10 in conjunction with a special purpose board. The latter includes A/D conversion circuitry, a direct memory access controller, and a frequency multiplication module (FMM). The purpose of FMM is to generate \(2^n\) sampling pulses (64 in the system built) within one cycle of the input signal; this feature is required to compute the Fourier/Walsh coefficients of a periodic signal without leakage error. The new FMM circuit is a substantial improvement over earlier designs and permits a higher frequency of operation at a low clock
frequency.

Two methods (one by Siemens and Kitai and the other by Tadokoro and Higuchi) of Walsh to Fourier conversion are reviewed. The conversion process is compared with the Cooley-Tukey FFT methods with respect to the number of multiplications, memory requirements, and effects of finite word length in computation.

The Walsh to Fourier conversion process is implemented through software to obtain 64 sine and cosine components. The instrument is interfaced to a HP 2647A intelligent graphic terminal for the display of Walsh and Fourier coefficients.
ACKNOWLEDGEMENTS

I would like to express my gratitude to Dr. R. Kitai for his supervision of the entire work and the valuable help in the writing and presentation of this thesis.

Thanks are also due to Dr. A.S. Gladwin, Dr. S.K. Sarna and Dr. P.C. Yip who, as members of the Supervisory committee, have offered encouragement and many useful suggestions.

Special thanks are due to my friend Dr. M. Nallasamy at the Indian Scientific satellite Project, India for his initial financial assistance and encouragement which made this work possible.

The completion of this work would not have been possible without the financial assistance of McMaster University and of the National Science and Engineering Research Council of Canada.

The author wishes to thank Miss. P. Dillon of the Word Processing Centre for typing the thesis.

Finally, I am grateful to my father-in-law, Mr. G. Ayyavoo, my wife Kasturi and my son Ashok for their cooperation and constant encouragement during the course of the work.
TABLE OF CONTENTS

ABSTRACT iii
ACKNOWLEDGEMENTS v
LIST OF FIGURES viii
LIST OF TABLES xi
LIST OF SYMBOLS AND ABBREVIATIONS xii

CHAPTER 1: INTRODUCTION 1

CHAPTER 2: WALSH FUNCTIONS: DEFINITION AND ORDERING 9
  2.1 Introduction 9
  2.2 Walsh functions: Definition 9
  2.3 Properties of Walsh Functions 13
  2.4 Walsh and Fourier Expansion 16
  2.5 Ordering of Walsh Functions 17

CHAPTER 3: WALSH SPECTRAL ANALYSER 23
  3.1 Concepts of the Walsh Spectral Analyser (WSA) 23
  3.2 A Review of Walsh Spectral Analysers 24
  3.3 Fast Transform Algorithms and their Hardware Implementations 33
  3.4 New Design Approach for a Digital WSA to obtain Coefficients in Natural, Dyadic and Sequency Orderings 59
<table>
<thead>
<tr>
<th>TABLE OF CONTENTS (continued)</th>
<th></th>
</tr>
</thead>
<tbody>
<tr>
<td><strong>CHAPTER 4:</strong> MICROPROCESSOR-BASED WALSH SPECTRAL ANALYSER</td>
<td></td>
</tr>
<tr>
<td>4.1 Design Principles</td>
<td>63</td>
</tr>
<tr>
<td>4.2 System Organization</td>
<td>64</td>
</tr>
<tr>
<td>4.3 Hardware Design</td>
<td>70</td>
</tr>
<tr>
<td><strong>CHAPTER 5:</strong> WALSH-TO-FOURIER CONVERSION</td>
<td>117</td>
</tr>
<tr>
<td>5.1 Introduction</td>
<td>117</td>
</tr>
<tr>
<td>5.2 Walsh-to-Fourier Conversion due to Siemens and Kitai</td>
<td>118</td>
</tr>
<tr>
<td>5.3 Walsh-to-Fourier Conversion Method of Tadokoro and Higuchi</td>
<td>124</td>
</tr>
<tr>
<td>5.4 A Comparison of the Walsh-to-Fourier Conversion Process with the Cooley-Tukey FFT Algorithm for Computing the DFT</td>
<td>132</td>
</tr>
<tr>
<td><strong>CHAPTER 6:</strong> SOFTWARE DESIGN</td>
<td>147</td>
</tr>
<tr>
<td>6.1 Introduction</td>
<td>147</td>
</tr>
<tr>
<td>6.2 WFORM (Fast Walsh Hadamard Transform) Subroutine</td>
<td>147</td>
</tr>
<tr>
<td>6.3 Walsh-to-Fourier Conversion Subroutine (WFCON)</td>
<td>153</td>
</tr>
<tr>
<td>6.4 List of Subroutines for WSA</td>
<td>169</td>
</tr>
<tr>
<td>6.5 Main Program</td>
<td>170</td>
</tr>
<tr>
<td>6.6 Computational Errors</td>
<td>173</td>
</tr>
<tr>
<td><strong>CHAPTER 7:</strong> SUMMARY AND CONCLUSIONS</td>
<td>177</td>
</tr>
<tr>
<td><strong>REFERENCES</strong></td>
<td>180</td>
</tr>
<tr>
<td>Figure</td>
<td>Description</td>
</tr>
<tr>
<td>--------</td>
<td>-----------------------------------------------------------------------------</td>
</tr>
<tr>
<td>1.1(a)</td>
<td>Display of Walsh spectrum of 1 kHz sinusoid</td>
</tr>
<tr>
<td>1.1(b)</td>
<td>Fourier amplitude spectrum of the same signal after Walsh to Fourier Conversion</td>
</tr>
<tr>
<td>2.1</td>
<td>The set of eight Walsh functions</td>
</tr>
<tr>
<td>2.2</td>
<td>Walsh-ordered discrete Walsh functions for N = 8</td>
</tr>
<tr>
<td>3.1</td>
<td>Schematic diagram of an Analog Processor for computing B(1) from given eight data points X(0) \ldots X(7)</td>
</tr>
<tr>
<td>3.2</td>
<td>Schematic diagram of a N-coefficient Direct-type Walsh Spectral Analyzer</td>
</tr>
<tr>
<td>3.3</td>
<td>FWHT Signal flow-graph for N = 8</td>
</tr>
<tr>
<td>3.4</td>
<td>Organization of a Pipeline FWHT Processor</td>
</tr>
<tr>
<td>3.5</td>
<td>Processor organization using long shift registers</td>
</tr>
<tr>
<td>3.6</td>
<td>Pipeline Processor</td>
</tr>
<tr>
<td>3.7</td>
<td>FWT signal flow-graph for N = 8</td>
</tr>
<tr>
<td>3.8(a)</td>
<td>Normal butterfly operations</td>
</tr>
<tr>
<td>3.8(b)</td>
<td>Reverse butterfly operations</td>
</tr>
<tr>
<td>3.9</td>
<td>Mapping of X(\ell) to \tilde{X}(\ell) for sequency ordered transform by S_W(\cdot)</td>
</tr>
<tr>
<td>3.10</td>
<td>Mapping of X(\ell) to X(\ell) for dyadic-ordered transform by S_D(\cdot)</td>
</tr>
<tr>
<td>3.11</td>
<td>Special purpose Permuting Module</td>
</tr>
<tr>
<td>FIGURE</td>
<td>Description</td>
</tr>
<tr>
<td>--------</td>
<td>-------------------------------------------------</td>
</tr>
<tr>
<td>4.1</td>
<td>Organization of a Microprocessor-based WSA</td>
</tr>
<tr>
<td>4.2</td>
<td>Circuit for modification of six 1sbs of offset</td>
</tr>
<tr>
<td>4.3</td>
<td>System 80/10 Block diagram</td>
</tr>
<tr>
<td>4.4(a)</td>
<td>Circuit diagram of LCD</td>
</tr>
<tr>
<td>4.4(b)</td>
<td>Output waveform of LCD</td>
</tr>
<tr>
<td>4.5</td>
<td>Block diagram of the Frequency trippler</td>
</tr>
<tr>
<td>4.6</td>
<td>Waveforms of Frequency trippler circuit</td>
</tr>
<tr>
<td>4.7</td>
<td>Block diagram of Digital Frequency Multiplier — Earlier method</td>
</tr>
<tr>
<td>4.8</td>
<td>PDM characteristics for a clock frequency of 100 KHz</td>
</tr>
<tr>
<td>4.10</td>
<td>Control circuitry of FMM</td>
</tr>
<tr>
<td>4.11</td>
<td>Fractional counter and accumulator</td>
</tr>
<tr>
<td>4.12</td>
<td>Integral counter</td>
</tr>
<tr>
<td>4.13</td>
<td>Digitally controlled Pulse Duration Modulator</td>
</tr>
<tr>
<td>4.14</td>
<td>Pulse width vs control voltage characteristics</td>
</tr>
<tr>
<td>4.15</td>
<td>Triggering and load control</td>
</tr>
<tr>
<td>4.16(a)</td>
<td>Circuit diagram of A/D converter</td>
</tr>
<tr>
<td>4.16(b)</td>
<td>A/D Converter Timing diagram</td>
</tr>
<tr>
<td>4.17(a)</td>
<td>DMA controller circuit diagram</td>
</tr>
<tr>
<td>FIGURE</td>
<td>Description</td>
</tr>
<tr>
<td>--------</td>
<td>------------------------------------------------------------------------------------------------------------</td>
</tr>
<tr>
<td>4.17(b)</td>
<td>Address data interface between 8257 and System 80/10 in programming mode</td>
</tr>
<tr>
<td>4.18</td>
<td>Timing diagram of DMA operation</td>
</tr>
<tr>
<td>4.19</td>
<td>Control circuit</td>
</tr>
<tr>
<td>5.1</td>
<td>Walsh to Fourier Transformation Process</td>
</tr>
<tr>
<td>5.2</td>
<td>Number of multiplications required to compute the Fourier coefficients by the FFT and Walsh to Fourier Conversion method</td>
</tr>
<tr>
<td>5.3</td>
<td>Number of multiplications required in the Walsh to Fourier conversion method when the number $L$ of desired components is less than $N$</td>
</tr>
<tr>
<td>5.4(a)</td>
<td>Signal flow-graph of the decimation in time FFT algorithm</td>
</tr>
<tr>
<td>5.4(b)</td>
<td>Signal flow-graph of the decimation in frequency FFT algorithm</td>
</tr>
<tr>
<td>5.5</td>
<td>Improvement of the computational errors of the Walsh to Fourier Conversion process</td>
</tr>
<tr>
<td>6.1</td>
<td>Flowchart of WTPORM (FWHT) subroutine</td>
</tr>
<tr>
<td>6.2</td>
<td>Pattern of sign changes of the conversion matrix elements for cosine components from the corresponding conversion matrix elements for sine components</td>
</tr>
<tr>
<td>6.3</td>
<td>Flowchart of Walsh to Fourier conversion subroutine</td>
</tr>
<tr>
<td>6.4(a)</td>
<td>Flowchart of main Walsh spectral analyser program in Intel 8080 Assembly language</td>
</tr>
<tr>
<td>6.4(b)</td>
<td>Flowchart of BASIC program for hp 2647A graphic terminal</td>
</tr>
</tbody>
</table>
## LIST OF TABLES

<table>
<thead>
<tr>
<th>TABLE</th>
<th>Description</th>
<th>Page</th>
</tr>
</thead>
<tbody>
<tr>
<td>4.1</td>
<td>I/O addresses of SBC 80/10</td>
<td>73</td>
</tr>
<tr>
<td>4.2</td>
<td>System Bus</td>
<td>74</td>
</tr>
<tr>
<td>4.3</td>
<td>Timing error of FMM</td>
<td>84</td>
</tr>
<tr>
<td>4.4</td>
<td>FMM characteristics</td>
<td>91</td>
</tr>
<tr>
<td>5.1</td>
<td>First calibration coefficients of sin $2\pi i/N$</td>
<td>128</td>
</tr>
<tr>
<td>5.2</td>
<td>Conversion matrix $F_b$ for $N = 16$</td>
<td>130</td>
</tr>
<tr>
<td>6.1</td>
<td>$F_b$ matrix elements</td>
<td>159</td>
</tr>
<tr>
<td>6.2</td>
<td>Compensation matrix elements</td>
<td>168</td>
</tr>
<tr>
<td>6.3</td>
<td>Calibration coefficients of a sinusoidal signal peak to peak 4 volts</td>
<td>174</td>
</tr>
<tr>
<td>6.4</td>
<td>Amplitude and phase spectrum of a sinusoidal signal peak to peak 4 volts</td>
<td>175</td>
</tr>
</tbody>
</table>
LIST OF SYMBOLS AND ABBREVIATIONS

\{ \phi \} 
- set of orthogonal functions

x(t) 
- analog input signal

T 
- time interval

M_n 
- subspace formed by the first n elements of \{ \phi \}

\phi_k 
- kth member in the set \{ \phi \}

WSA 
- Walsh Spectral Analyser

\{ \phi \}_W 
- set of sequency ordered Walsh functions

wal(k,t) 
- Walsh function of index k

cal(s,t) 
- even symmetry Walsh function of sequency s

sal(s,t) 
- odd symmetry Walsh function of sequency s

i_k, t_k 
- kth bit of the binary representation of i and t

W^k_j 
- Boolean representation of sampled Walsh function of index k, j represents sampling instant

\bar{g_i}^k 
- ith bit of Gray code representation of k

\bar{W} 
- Walsh matrix

\oplus 
- modulo 2 sum or Exclusive-OR

B(k) 
- kth sequency ordered Walsh coefficient

A_s 
- sth cal coefficient

B_s 
- sth sal coefficient

a_k 
- kth cosine component

b_k 
- kth sine component

\{ \phi \}_p 
- Paley ordered set of Walsh functions
LIST OF SYMBOLS AND ABBREVIATIONS (continued)

Pal(k,t) kth member of \( \{ \phi \}_p \)
rad_k(t) kth Rademacher function
sgn signum
b(i) Gray code to binary conversion of i
\( \{ \phi \}_h \) Hadamard-ordered set of Walsh functions
wal_h(k,t) kth member of \( \{ \phi \}_h \)

\( \text{<i> } \) Bit-reversal of i
N number of samples

\( X(i) \) Discrete \( i \)th sample of \( x(t) \)
L(t) output of delta sigma modulator

\( D_{j^+} \) Pulse width proportional to \( X(j) \)
\( D_{j^-} \) Pulse width proportional to \( -X(j) \)

\( \Delta \theta \) \( T/N \)

\( P_0 \) Pulse sequence derived from \( D_{j^+} \)
\( N_0 \) Pulse sequence derived from \( D_{j^-} \)
\( \alpha(k) \) Pulse train equivalent to \( k \)th Walsh coefficient

wal'(k,t) digital representation of \( \text{wal}(k,t) \)
FWHT Fast Walsh Hadamard Transform
FWT Fast Walsh Transform

\( B_h(N) \) Hadamard-ordered (\( N \times 1 \)) Walsh coefficient matrix

\( H(N) \) (\( N \times N \)) Hadamard matrix

\( \otimes \) Kronecker product

\( B_h(k) \) \( k \)th Hadamard-ordered Walsh coefficient
<table>
<thead>
<tr>
<th>Symbol</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>A</td>
<td>Adder</td>
</tr>
<tr>
<td>S</td>
<td>Subtractor</td>
</tr>
<tr>
<td>z_i</td>
<td>Delay operation by i sampling periods</td>
</tr>
<tr>
<td>C_i</td>
<td>i-th bit of a binary counter C</td>
</tr>
<tr>
<td>P(N)</td>
<td>'Ideal' shuffle operator</td>
</tr>
<tr>
<td>I(N)</td>
<td>(N x N) identity matrix</td>
</tr>
<tr>
<td>T_P</td>
<td>Processing time</td>
</tr>
<tr>
<td>BRO</td>
<td>Bit reversal operation</td>
</tr>
<tr>
<td>S_W(*)</td>
<td>Permutation operator for sequency ordered Walsh transform</td>
</tr>
<tr>
<td>A</td>
<td>Equal by definition</td>
</tr>
<tr>
<td>i</td>
<td>i operated by S_W(*)</td>
</tr>
<tr>
<td>SR^r</td>
<td>Shift right by r places</td>
</tr>
<tr>
<td>BR(*)</td>
<td>Bit reversal operator</td>
</tr>
<tr>
<td>X(*)</td>
<td>Data permuted by S_W(*)</td>
</tr>
<tr>
<td>B_d(k)</td>
<td>k-th dyadic-ordered Walsh coefficient</td>
</tr>
<tr>
<td>S_D(*)</td>
<td>Permutation operator for dyadic-ordered Walsh transform</td>
</tr>
<tr>
<td>i</td>
<td>i operated by S_D(*)</td>
</tr>
<tr>
<td>X(*)</td>
<td>Data permuted by S_D(*)</td>
</tr>
<tr>
<td>G1 ... GI</td>
<td>Logic elements</td>
</tr>
<tr>
<td>LCD</td>
<td>Level crossing detector</td>
</tr>
<tr>
<td>T_s</td>
<td>Input signal period</td>
</tr>
<tr>
<td>T_c</td>
<td>Clock period</td>
</tr>
</tbody>
</table>
LIST OF SYMBOLS AND ABBREVIATIONS (continued)

\( a_{fs} \)  
sth coefficient of \( \cos 2\pi ft \); an element of the conversion matrix in fth row and sth column

\( b \)  
(\( N/2 \times 1 \)) sine coefficient matrix

\( F_a \)  
cosine to cosine conversion matrix

\( x_b(t) \)  
band-limited signal

\( K \)  
compensation matrix

\( F(k) \)  
kth discrete Fourier coefficient

\( F_b \)  
the Walsh to Fourier conversion matrix for the discrete case

\( DFT \)  
Discrete Fourier Transform

\( DIT \)  
Decimation in time

\( DIF \)  
Decimation in frequency

\( e_R \)  
roundoff error random variable

\( \sigma^2_R \)  
variance of roundoff error

\( e_S \)  
scaling error random variable

\( \sigma^2_S \)  
variance of scaling error

\( \sigma^2_{EF} \)  
variance of roundoff and scaling errors combined of a DFT coefficient

\( \sigma^2_T \)  
variance of scaling error in an output Walsh coefficient

\( \sigma^2_R \)  
variance of roundoff error in an output sine/cosine component by the Walsh to Fourier conversion method
\( \sigma^2_S \) variance of scaling error in an output sine/cosine component by the Walsh to Fourier Conversion

\( \sigma^2_{EW} \) variance of error in an output Fourier coefficient by the Walsh to Fourier conversion method
CHAPTER 1
INTRODUCTION

In recent years there has been a growing interest in the study of orthogonal transforms in the area of digital signal processing. This is primarily due to the advances made in the computer technology and special purpose digital processors. Research efforts and application of such transforms include image and speech processing, selection in pattern recognition \([1,2,3,4,5]\), analysis and design of communication systems \([6]\), generalized Wiener filtering \([7]\) and spectroscopy.

The basic idea behind the study of complete systems of orthogonal functions \(\{\psi\}\), is that any function \(x(t)\) defined in an interval \((0-T)\) which is square summable within the interval can be approximated by means of an orthogonal projection on the subspace \(M_n\) spanned by the first \(n\) elements of \(\{\psi\}\),

\[
x(t) = x_n = \sum_{k=1}^{n} \langle x(t), \psi_k \rangle \psi_k
\]

with the property that the mean square error can be made arbitrarily small by choosing \(n\) sufficiently large \([8]\) (Bessel's inequality). \(\psi_k\) in Eqn. (1.1) is the kth member
of the set $\{\phi\}$. The inner product is defined by,

$$\langle x(t), \phi(t) \rangle = \frac{1}{T} \int_{0}^{T} x(t) \phi(t) \, dt$$

The important advantages of a function representation by orthogonal series expansions are: (i) there are many orthonormal sets for which formulas for the $n$th element of the set have been worked out and are given in standard texts, (ii) the invariance of the inner product, and (iii) the ease of extending the projection to a subspace $M_{n+1}$ without the need to be wholly recomputed if the projection on $M_n$ is known; only $\langle x(t), \phi_{n+1} \rangle$ need to be evaluated. The signal representations of various orthogonal transforms are judged in terms of the mean square error criterion.

The literature abounds with algorithms and machine designs for the computation of the Discrete Fourier Transform [24,25,26], and digital spectral analysers that are based on Fast Fourier Transform (FFT) algorithms are commercially available [27,28]. While a Fourier analyser permits the characterization of signals and systems in the frequency domain, an alternative representation in terms of bi-valued orthogonal Walsh functions has led to the Walsh Spectral Analyser (WSA) which finds applications for the following reasons:

1. It can be efficiently implemented on either general-purpose or special-purpose computers due to the
binary nature of Walsh functions.

2. In certain applications, such as image processing [1], fast and computationally efficient Walsh analysis may be preferred over Fourier analysis.

3. For band limited signals, a measured Walsh Spectrum may be transformed to yield corresponding Fourier spectrum. The total process of Walsh Spectrum measurement followed by conversion to Fourier spectrum is faster than the Cooley-Tukey FFT algorithm in situations where up to 64 data samples are used [23]. For data lengths greater than 64, this method is superior to the FFT method in applications where the number L of Fourier components is relatively small compared with N. The number of multiplications in that case is approximately NL/6. A 64-point data analysis suffices in many applications such as amplifier distortion measurement, biomedical signal analysis, power-system spectral measurements and some areas of pattern recognition.

This thesis deals mainly with the design of a Microprocessor-based WSA using a fast transform algorithm. The approach followed is different from earlier instrumentation for this purpose [15,16,17] in that the input sequence is permuted using minimal LSI hardware, and a fast Walsh Hadamard Transform (FWHT) is implemented.
Apart from the display, the instrument is contained entirely within an off-the-shelf Intel System 80/10 which consists of a cabinet containing an 8080 based single-board computer SBC 80/10, a card cage with ready-wired bus and power supply lines for up to three additional boards, and a power supply. The cabinet dimensions are 43.2 cm x 8.9 cm height, 50.8 cm depth (17 x 3.5 x 20 inches). Only one additional specially-designed board is used: It contains circuits for trigger level adjustment, A/D conversion, 1K of read-write memory, a frequency multiplier module (FMM) and a direct memory access (DMA) controller.

The FMM circuit [21] generates 64 equally-spaced pulses within one period of the input signal, so that the data window spans one cycle. The leakage error in the Fourier spectrum measurement is made negligibly small. Another feature of the circuit is that it does not require a high-speed clock for its proper functioning. The signal frequency may lie anywhere within the range 0.2 Hz to 10 KHz without any range switching. A measurement uses two cycles of input signal, the first being used for frequency determination and the second cycle for data aquisition. This is followed by processing of 24 msec duration, to yield the Walsh spectrum.

Given the Walsh coefficients of the signal, the corresponding Fourier coefficients are obtained by way of a
matrix multiplication so that both spectra are available. The Walsh to Fourier conversion process uses the matrix multiplication [22],

$$[a] = [K]^{-1} [F_a] [A]$$  \hspace{1cm} (1.2)

$$[b] = [K]^{-1} [F_b] [B]$$  \hspace{1cm} (1.3)

where

$$[a] = (N/2 \times 1) \text{ cosine coefficient matrix}$$

$$[b] = (N/2 \times 1) \text{ sine coefficient matrix}$$

$$[B] = (N/2 \times 1) \text{ sal coefficient matrix}$$

$$[A] = (N/2 \times 1) \text{ cal coefficient matrix}$$

$$[K]^{-1} = \text{ diagonal } (N/2 \times N/2) \text{ compensation matrix}$$

$$[F_a], [F_b] = (N/2 \times N/2) \text{ conversion matrices for cosine and sine components, respectively.}$$

The conversion process takes about 1 sec, most of the time being devoted to software multiplication.

The results (Walsh and Fourier components) computed are in binary form. They are converted into signed BCD numbers and transmitted to a Hewlett Packard graphics terminal type 2647A with processing capability for displaying numerical and graphic data. The Walsh/Fourier spectrum of a sinusoidal signal as displayed on the graphics terminal is shown in Fig. 1.1.

The instrument uses less than 2K words of program memory (PROM) and 1K words of RAM. The System 80/10 monitor is not made use of so that space is left for additional 2K
Fig. 1.1(a) Display of Walsh spectrum of 1kHz sinusoid

Fig. 1.1(b) Fourier amplitude spectrum of the same signal after Walsh to Fourier Conversion
words of ROM. Also two additional boards may be plugged into the card cage. The system therefore lends itself to extension to include other measurements at low cost. The use of a System 80/10 package and LSI controller type 8257 for DMA were found to reduce development costs and time in a significant way.

Walsh functions and their properties are introduced in Chapter 2. These functions are used in different orderings, so there is the need to define the orderings and to establish the relationship between them.

Numerous designs for a WSA are reported in the literature [9,13,14,15,16,17]. A brief review of the existing designs is presented in Chapter 3. To aid the description, the WSA's are classified into direct and Fast transform type. Long shift registers are well suited to the implementation of FWHT. Improved design techniques to adapt the FWHT processor to yield the Walsh coefficients in dyadic and sequency ordering are presented; they reduce the processing time when compared to an earlier method [17]. Also a pipeline architecture using identical stages is evolved.

Chapter 4 deals with the organization of the Microprocessor based Walsh-Fourier Spectral Analyser. A detailed description is given of the design of each major section shown in the block diagram of the instrument.
The process of obtaining the Fourier coefficients from the Walsh coefficients was first reported by Siemens and Kitai [22] and more recently by Tadokoro and Higuchi [23], using a different algorithm. The two methods are compared in Chapter 5.

The Walsh to Fourier conversion process is compared with the FPT methods with respect to the memory requirements and the effects of finite word length in addition to the number of multiplications given in reference [23].

The software of the WSA is described with the aid of flowcharts in Chapter 6. Here the discussion is limited to two special purpose subroutines, viz, WTPORM for the computation of Walsh transform and WFCON for the Walsh-to-Fourier Conversion. Data structure used in the system and its effect on the accuracy of the final results are discussed.

The limitations of the present design and suggestions for improvement are given.
CHAPTER 2

WALSH FUNCTIONS: DEFINITIONS AND ORDERINGS

2.1 Introduction

Numerous definitions, methods of generation and application of Walsh functions have appeared in the literature. This chapter is concerned with definitions and the notations used. The definitions outlined here find use in the design of Walsh functions generator. The properties of Walsh functions that are necessary to develop algorithms for the computation of Walsh coefficients and the Walsh-to-Fourier conversion process are outlined. The significance of different orderings of Walsh functions and the relationships between them are discussed. These aid the development of digital processors to yield coefficients in different orderings.

2.2 Walsh functions: Definition

The set of Walsh functions \( \{\#\}_w \) was originally defined by J.L. Walsh in an interval \((0-1)\) recursively [29],

\[
\phi_0(t) = \begin{cases} 
1 & 0 \leq t < 1/2 \\
-1 & 1/2 \leq t < 1 
\end{cases}
\]

\[
\phi_1(t) = \begin{cases} 
1 & 0 \leq t < 1/2 \\
-1 & 1/2 \leq t < 1 
\end{cases}
\]

\[
\phi_2(t) = \begin{cases} 
1 & 0 \leq t < 1/4 \\
-1 & 1/4 \leq t < 1/2 \\
1 & 1/2 \leq t < 3/4 \\
-1 & 3/4 \leq t < 1 
\end{cases}
\]

\[
\phi_3(t) = \begin{cases} 
1 & 0 \leq t < 1/8 \\
-1 & 1/8 \leq t < 1/4 \\
1 & 1/4 \leq t < 3/8 \\
-1 & 3/8 \leq t < 1/2 \\
1 & 1/2 \leq t < 5/8 \\
-1 & 5/8 \leq t < 3/4 \\
1 & 3/4 \leq t < 7/8 \\
-1 & 7/8 \leq t < 1 
\end{cases}
\]

\[\vdots\]
\[
\phi_1^1(t) = 1 \\
= -1 \\
\phi_2^2(t) = 1 \\
= -1 \\
\phi_{n+1}^{2k-1}(t) = \phi_n^k(2t) \\
\phi_{n+1}^{2k}(t) = \phi_n^{k-1}(2t-1) \\
\phi_{n+1}^{2k}(t) = (-1)^{k+1} \phi_n^{k}(2t-1) \\
\phi_{n+1}^{2k}(t) = (-1)^{k} \phi_n^{k}(2t-1) \\
\text{where} \\
k = 1, 2, \ldots, 2^n-1, \ n = 1, 2, \ldots, \ \\
0 \leq t < 1/4, 1/4 \leq t < 3/4 \\
1/4 \leq t < 1/2, 3/4 \leq t < 1 \\
0 \leq t < 1/2 \\
1/2 \leq t < 1 \\
\tag{2.1}
\]

The functions in the set \{\phi\} exhibit alternately even and odd symmetry with respect to the middle of the interval. The even and odd Walsh functions are given distinct notation, \text{cal}(s,t) and \text{sal}(s,t) respectively [6]. There is a close connection between \text{sal} and sine functions as well as between \text{cal} and cosine functions. They are related to \text{wal}(k,t) by

\[
\text{cal}(s,t) = \text{wal}(k,t), \ k = 2s \\
\text{sal}(s,t) = \text{wal}(k,t), \ k = 2s-1 \\
\tag{2.2a}
\]

where \(s\) is called the sequency [30]; that is one half the average number of zero-crossings per second.

Using this terminology, Harmuth [6] developed a recursive definition of Walsh functions in the form of a difference equation.
wal(2k + p, t) = (-1)^{[k/2]+p} \{wal(k, 2(t + 1/4))

+ (-1)^{k+p} \{wal(k, 2(t - 1/4))\} \quad (2.3)

where p = 0 or 1, \(i = 0, 1, 2, \ldots\), wal(0, t) = 1 for \(-1/2 \leq t < 1/2\); \([k/2]\) represents the integer part of \(k/2\).

This definition covers only the interval \(-1/2 \leq t < 1/2\); periodic Walsh functions can be formed by duplicating the functions over each successive interval. The first 8 Walsh functions in the interval 0 to 1 are shown in Fig. 2.1.

Another approach is to introduce Walsh functions through a definition in terms of the binary representation of indices. The formula due to Pratt, Kane and Andrews [31] is

\[
\text{wal}(k, t) = \sum_{i=1}^{\infty} (k_i' + k_{i+1}) t_i \quad (2.4)
\]

where \(k\) is an integer between 0 to \(2^{m-1}\) with the binary representation

\[
k = (k_m k_{m-1} \ldots k_2 k_1)_{1/2}
\]

and \(t\) is a real number between 0 and 1 with the binary expansion

\[
t = (t_1 t_2 \ldots t_i).
\]

Since \(k_i' = 0\) for all \(i > m\), only the first \(m\) digits of \(t\) appear in Eqn. (2.4). Hence, wal\((k, t)\) can be represented by
Fig. 2.1 The set of eight Walsh functions
a $2^m$ bit string $W_j^k$, $0 \leq j < 2^m$. Choosing to denote 1 by Boolean 0 and -1 by Boolean 1, we have

$$W_j^k = \sum_{i=1}^{m} \ (j_i + j_{i+1}) \ k_{m-i+1} \mod 2 \quad (2.5)$$

$j$ being an integer constructed by taking the first $m$ binary digits of $t$.

The set of $2^m$ Walsh functions can be thus represented by a $(2^m \times 2^m)$ Boolean matrix which is a useful representation in the computation of the Discrete Walsh transform. The Boolean matrix can be considered as the sampled version of the continuous set of wal(k,t) in Fig. 2.1, the samples being taken at $t = j/2^m$ for $j = 0, 1, \ldots, 2^m-1$. As an example the Boolean matrix $\bar{W}$ for $m = 3$ is given in Fig. 2.2. The expression within the brackets represents binary to Gray code-conversion [32]. Thus Eqn. (2.5) becomes

$$W_j^k = \sum_{i} g_i^k j_{m-i+1} \quad (2.6)$$

where $g_i^k$ denotes the $i$th bit of Gray code representation of $k$. The design of some Walsh function generators [33,34,35, 36] are based on Eqn. (2.6).

2.3 Properties of Walsh Functions

The properties of Walsh functions which can be established using the definitions given before are:
\[ \begin{bmatrix}
1 & 1 & 1 & 1 & 1 & 1 & 1 \\
1 & 1 & 1 & -1 & -1 & -1 & -1 \\
1 & 1 & -1 & -1 & -1 & 1 & 1 \\
1 & 1 & -1 & -1 & 1 & -1 & -1 \\
1 & -1 & -1 & 1 & 1 & 1 & 1 \\
1 & -1 & -1 & -1 & 1 & -1 & -1 \\
1 & -1 & -1 & 1 & -1 & -1 & -1 \\
\end{bmatrix} \]

\[ \bar{W} = \text{wal}(0,t) ; \text{cal}(0,t) \]

\[ \text{wal}(1,t) ; \text{sal}(1,t) \]

\[ \text{wal}(2,t) ; \text{cal}(1,t) \]

\[ \text{wal}(3,t) ; \text{sal}(2,t) \]

\[ \text{wal}(4,t) ; \text{cal}(2,t) \]

\[ \text{wal}(5,t) ; \text{sal}(3,t) \]

\[ \text{wal}(6,t) ; \text{cal}(3,t) \]

\[ \text{wal}(7,t) ; \text{sal}(4,t) \]

**Fig. 2.2** Walsh-ordered discrete Walsh functions

for \( m = 3 \)
(i) Walsh functions form a complete set of mutually orthogonal functions in the unit interval and hence the possibility of using them for signal approximation or representation. Due to the orthogonality of Walsh functions, the following relation applies:

\[
\int \text{wal}(k,t) \text{wal}(i,t) \, dt = \begin{cases} 
1 & k = i \\
0 & k \neq i
\end{cases}
\] (2.7)

It can be shown that the product \([\bar{W}][\bar{W}]^T\) is \(2^{-m}[I]\) where \([I]\) is the identity matrix.

(ii) \(\text{wal}(k,j) = \text{wal}(j,k)\). Hence \(\bar{W}\) is symmetric.

(iii) The product of two Walsh functions yields another Walsh function (closure property)

\[
\text{wal}(i,t) \text{wal}(k,t) = \text{wal}((i \oplus k), t)
\]

where \(i + k = i_r \cdot 2^r (i_r + k_r) \mod 2\) and \(i_r, k_r\) represent the \(r\)th bit of the binary representation of \(i\) and \(k\) respectively. As an example consider the multiplication of \(\text{wal}(6,t)\) and \(\text{wal}(12,t)\). Using the binary representation for 6 and 12, one obtains 10 for modulo 2 addition:

\[
\begin{array}{c}
6 \quad 0110 \\
+ 12 \quad 1100 \\
10 \quad 1010 \\
\end{array}
\]

The multiplication of Walsh functions is associative:

\[
[\text{wal}(i,t) \text{wal}(k,t)] = \text{wal}(i,t) [\text{wal}(k,t) \text{wal}(l,t)]
\]

Walsh functions form a group with respect to multiplication.

(A group \([G, \cdot]\) is an algebraic structure, where \(G\) is a set
and is a composition on that set such that the axioms of associativity, closure, identity, and inverses are valid [37].) The inverse of an element in the set \( \{e\}_w \) is the element itself; the identity element is \( \text{wal}(0,t) \).

2.4 **Walsh and Fourier Series Expansion**

A signal \( x(t) \) defined in an interval \((0-T)\) can be represented in terms of the sequency ordered Walsh series according to

\[
x(t) = \sum_{k=0}^{\infty} B(k) \text{wal}(k,t)
\]  

(2.8)

where the \( k \)th Walsh coefficient \( B(k) \) is given by

\[
B(k) = \frac{1}{T} \int_{0}^{T} x(t) \text{wal}(k,t) \, dt
\]  

(2.9)

Eqn. (2.8) can be written in terms of \( ca \) and \( sa \) coefficients \( A_s, B_s \) in the form

\[
x(t) = A_0 + \sum_{s=1}^{\infty} \{A_s \, ca(s,t) + B_s \, sa(s,t)\}
\]  

(2.10)

where

\[
A_s = \frac{1}{T} \int_{0}^{T} x(t) \, ca(s,t) \, dt
\]  

(2.11)

\[
B_s = \frac{1}{T} \int_{0}^{T} x(t) \, sa(s,t) \, dt
\]  

(2.12)

A Fourier series representation of the signal is given by
\begin{equation}
\begin{aligned}
x(t) = \frac{a_0}{2} + \sum_{k=1}^{\infty} \left[ a_k \cos \frac{2\pi kt}{T} + b_k \sin \frac{2\pi kt}{T} \right]
\end{aligned}
\tag{2.13}
\end{equation}

where cosine and sine coefficients $a_k$ and $b_k$ are defined by

\begin{align}
a_k &= \frac{2}{T} \int_{0}^{T} x(t) \cos \frac{2\pi kt}{T} \, dt \\
b_k &= \frac{2}{T} \int_{0}^{T} x(t) \sin \frac{2\pi kt}{T} \, dt
\end{align}
\tag{2.14, 2.15}

2.5 Ordering of Walsh Functions

Walsh functions in three different orderings are used [38, 39, 40, 41]. The mathematical nature of Walsh functions is such that no universally used ordering exists. This is in sharp contrast to the complex exponential functions that are always ordered by their frequency. For the complex exponentials are characters of the group under modulo $N$ addition, which has a natural ordering by arithmetic value. Geometrically we may consider integers to be points on the real line, which has only one dimension and hence one ordering. On the other hand, Walsh functions are characters of the dyadic group, which is the group of binary vectors under bit-wise addition modulo-2. The space formed by these vectors, called dyadic space, has many dimensions but only two points along each axis (0 or 1). Since there is no natural ordering of the dyadic group, there is no single ordering of Walsh functions. The three types of orderings
used are: (i) sequency or Walsh, (ii) dyadic or Paley, and (iii) Natural or Hadamard.

2.5.1 Sequency or Walsh Ordering

This is the ordering which was originally employed by Walsh, and is characterised by the fact that \( \text{wal}(k,t) \) has \( k \) sign changes in the interval \( (0,1) \). The number of sign changes is used as generalised frequency \([30]\). This ordering resembles the ordering of sinusoidal orthogonal functions with increasing frequency and is favoured by researchers in most engineering applications \([42,43]\).

Fast algorithms for computing Walsh coefficients in sequency ordering require data shuffling. Details are given in Chapter 3, where transform techniques are considered.

2.5.2 Dyadic or Paley Ordering

This ordering was first used by Paley \([7]\) when he proposed alternatives to the original definition of Walsh. Let the set of Walsh functions in dyadic order be denoted by

\[
\{\Psi\}_p = \{\text{Pal}(k,t)\}_{k=0}^{N-1}
\]  

(2.16)

Index \( k \) in Eqn. (2.16) represents the \( k \)th member in the set. The merits of Paley ordering of Walsh functions are \([44]\),
1. \( \text{Pal}(k,t) \) has a simpler recursive definition:

\[
\text{Pal}(k,t) = \begin{cases} 
\text{Pal}(k,2t); & 0 \leq t < 1/2 \\
\text{Pal}(k,2t-1); & 1/2 \leq t < 1 
\end{cases}
\]

(2.17)

\[
\text{Pal}(2k+1,t) = \begin{cases} 
\text{Pal}(k,2t); & 0 \leq t < 1/2 \\
-\text{Pal}(k,2t-1); & 1/2 \leq t < 1 
\end{cases}
\]

(2.18)

In contrast, the sign to be attached to \( \text{wal}(2k,t) \) and \( \text{wal}(2k+1,t) \) must be determined by splitting \( k \) into two parts \( 2^m + i, 0 \leq i < 2^m \) and using

\[
\text{wal}(2k,t) = (-1)^{i+1} \text{wal}(k,2t-1) \quad 1/2 < t < 1
\]

\[
\text{wal}(2k+1,t) = (-1)^i \text{wal}(k,2t-1)
\]

(2.19)

2. \( \text{Pal}(k,t) \) can be expressed as a product of Rademacher functions:

\[
\text{Pal}(k,t) = \prod_{i=0}^{n} \text{rad}_i(t) \quad \text{for} \quad k = \sum_{i=0}^{n} k_i 2^i, \quad k_i \in (0,1)
\]

(2.20)

where \( \text{rad}_i(t) = \text{sgn}(\sin(2^{i-1}2\pi t)) \); \( i = 1, 2, \ldots, 3 \)

(2.21)

3. Gibbs [44] showed that Walsh functions are eigen solutions of logical differential equations, and the equation that gives \( \text{Pal}(k,t) \) is simpler.

4. Dyadic ordering is preferred in filtering applications due to the existence of dyadic convolution theory, analogous to arithmetic convolution used in linear
5. Yuen [45] computed the Walsh spectra of a large class of continuous functions, including \( \exp(t) \), \( \sin(t) \), \( \cos(t) \) and several powers of \( t \) and compared each spectrum in sequency order with that in dyadic order. He showed that in general the dyadic-ordered spectrum shows better convergence towards the higher-order end. The good convergence property of a dyadic ordered Walsh Spectrum has some significant implications in the case of function approximation. It means that an estimate can be made regarding the magnitude of the expansion coefficients and certain higher order coefficients need not be computed. The bounds on the expansion have been derived by Yuen.

Paley ordered Walsh functions are related to sequency ordered Walsh functions according to

\[
\text{Pal}(k,t) = \text{wal}[b(k),t]
\]

where \( b(k) \) is Gray code to binary code conversion of \( k \) [7]. This is illustrated below for \( m = 3 \).

<table>
<thead>
<tr>
<th>Pal(k,t)</th>
<th>k</th>
<th>b(k)</th>
<th>\text{wal}(b(k),t)</th>
</tr>
</thead>
<tbody>
<tr>
<td>k = 0</td>
<td>0 0 0</td>
<td>0 0 0</td>
<td>b(k) = 0</td>
</tr>
<tr>
<td>1</td>
<td>0 0 1</td>
<td>0 0 1</td>
<td>1</td>
</tr>
<tr>
<td>2</td>
<td>0 1 0</td>
<td>0 1 1</td>
<td>3</td>
</tr>
<tr>
<td>3</td>
<td>0 1 1</td>
<td>0 1 0</td>
<td>2</td>
</tr>
<tr>
<td>4</td>
<td>1 0 0</td>
<td>1 1 1</td>
<td>7</td>
</tr>
<tr>
<td>5</td>
<td>1 0 1</td>
<td>1 1 0</td>
<td>6</td>
</tr>
<tr>
<td>6</td>
<td>1 1 0</td>
<td>1 0 0</td>
<td>4</td>
</tr>
<tr>
<td>7</td>
<td>1 1 1</td>
<td>1 0 1</td>
<td>5</td>
</tr>
</tbody>
</table>
The formula for a Walsh function in Paley order can be written [7],

\[ \text{Pal}(k,j) = (-1)^{\sum_{i=1}^{m} k_{m-i+1} j_{i}} \]  \hspace{1cm} (2.23)

\( k_{i}, j_{i} \) are the \( i \)th binary digits in the binary representation of \( k \) and \( j \).

2.5.3 Natural or Hadamard Ordering

A third ordering called Natural or Hadamard ordering was proposed by Henderson [89]. It has the advantage that the computation of a finite number of Walsh coefficients (normally an integral power of 2) from a given data sequence is faster than that for dyadic or sequency-ordered coefficients. The algorithms used (and called Fast Walsh Hadamard Transform) are discussed in the next chapter. The set of Walsh functions in this order

\[ \{\$\}_{h} = \{\text{wal}_{h}(k,t)\}_{k=0}^{N-1} \]  \hspace{1cm} (2.24)

is related to the sequency ordered Walsh functions by [7]

\[ \text{wal}_{h}(k,t) = \text{wal}[b(<k>),t] \]  \hspace{1cm} (2.25)

where \(<k>\) is obtained by bit reversal of \( k \) and \( b(<k>) \) is the Gray code to binary code conversion of \(<k>\). For purposes of illustration, Eqn. (2.25) is evaluated for \( N = 8 \) and is given below [7].
\[ \text{wal}_h(k,t) \]

\begin{array}{cccc}
    & k & \langle k \rangle & b\langle k \rangle & \text{wal}[b(\langle k \rangle),t] \\
    \text{k} & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \ \\
    1 & 0 & 0 & 1 & 0 & 0 & 1 & 1 & 1 & 7 \\
    2 & 0 & 1 & 0 & 0 & 1 & 0 & 1 & 1 \ \\
    3 & 0 & 1 & 1 & 1 & 1 & 0 & 0 & 0 & 4 \\
    4 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \ \\
    5 & 1 & 0 & 1 & 1 & 0 & 1 & 0 \ \\
    6 & 1 & 1 & 0 & 0 & 1 & 0 & 0 \ \\
    7 & 1 & 1 & 1 & 1 & 1 & 0 & 1 \\
\end{array}

The formula for a discrete Walsh function in Hadamard order is given by [7]

\[ m \sum_{i=1}^{m} k_i j_i \]

\[ \text{wal}_h(k,j) = (-1)^{m} \]

where \( k_i \) and \( j_i \) are the binary digits of \( k \) and \( j \) as used before.
CHAPTER 3

WALSH SPECTRAL ANALYSER

3.1 Concepts of the Walsh Spectral Analyser (WSA)

The purpose of the WSA is to compute the Walsh coefficients \( \{B(k)\}_{k=0}^{N-1} \) of a signal defined in a finite interval \((0-T)\) as the inner products of a set of orthogonal Walsh functions \( \{\text{wal}(k,t)\}_{k=0}^{N-1} \) and the input signal \( x(t) \), by

\[
B(k) = \frac{1}{T} \int_0^T \text{wal}(k,t) x(t) \, dt \quad (3.1)
\]

In the case of sampled input signals, integration in Eqn. (3.1) is replaced by summation. If there are \( N \) uniformly spaced samples in the interval \((0-T)\), Eqn. (3.1) can be modified to

\[
B(k) = \frac{1}{T} \sum_{i=0}^{N-1} \text{wal}(k,\frac{it}{N}) X(\frac{it}{N}) = \frac{1}{N} \sum_{i=0}^{N-1} \text{wal}(k,i) X(i) \quad (3.2)
\]

where \( X(\frac{it}{N}) \) is a sample of \( x(t) \) at the instant \( \frac{it}{N} \).

Taking the sampling interval \( T/N \) as unity, Eqn. (3.2) reduces to

\[
B(k) = \frac{1}{N} \sum_{i=0}^{N-1} \text{wal}(k,i) X(i) \quad (3.3)
\]
Varied approaches are followed to obtain a finite number of Walsh coefficients defined by Eqns. (3.1) and (3.3). They are discussed in sections 3.2, 3.3.

3.2 Review of Walsh Spectral Analysers

3.2.1 Analog Processors

Walsh Spectral analysers can be classified into either Analog or Digital type depending on the nature of elements used in building the instrument. Early processors were of the analog type, for example the design of Gethoffer [46] uses an analog shift register to store N sampled values which are combined in an operational amplifier (one for each coefficient) to obtain the output coefficient as an analog voltage. This process is illustrated for B(1) in Fig. 3.1. Harmuth [6] employed similar methods in the design of sequency filters.

Analog transformation is suitable for relatively fast real-time operation and for small data length. But it is not attractive for waveform analysis where many parallel outputs are desired from the viewpoints of accuracy (particularly at low frequencies), economy and reliability. The complexity of digital integrated circuits (ICS) grows with time. The advent of the Microprocessor as a general-purpose digital electronic block, whose function is determined by programming, is revolutionary. We can look
3.1 Schematic diagram of an Analog Processor for computing B(1) given eight data points X(0), ..., X(7).
for many technological innovations to improve digital circuit performance, cost, reliability, and flexibility. Low frequency signal analysis is not a problem with digital techniques. These considerations favour the choice of digital techniques in the design of instrumentation.

3.2.2 Digital Processors

Digital WSA's are classified into two viz 1) Direct method, 2) Fast Walsh Transformation method. Both accept N discrete quantized input samples and compute Walsh coefficients according the Eqn. (3.3); but the computing procedures differ.

**Direct Methods** [9,13,47]

A direct type WSA is schematically illustrated in Fig. 3.2. Each input signal sample is multiplied with a Walsh function of desired sequency (the multiplication reduces to that of determining the sign of the result as Walsh functions take only values ±1); the intermediate results are accumulated in an adder/subtractor. To obtain N coefficients, an N-ary Walsh function generator, N adder/subtractors are required. Walsh function generators are also required for transmission of signals by Walsh carriers and in signal synthesis using inverse Walsh transform apparatus and in real-time Walsh spectrum measurements.
Fig. 3.2 Schematic diagram of a N-coefficient Direct-type Walsh Spectral Analyser
Many contributions have been made to Walsh function generators [33,34,36,48], many of which have been reviewed [50]. A WSA of this type was first designed by Siemens and Kitai [9,13] and recently by Ashouri and Constantinides [15]. The instrument developed by Siemens is designed to yield Walsh spectral coefficients of periodic waveforms in real-time. The computation takes two cycles of input signal; the first cycle is devoted to measure the period of the input signal using digital methods and this period information is used to generate Walsh functions during the second cycle. An array of 64 Walsh functions are generated simultaneously, using a hazard-free type generator [36]. The generator is clocked by 64 pulses generated by a "Digital Frequency Multiplier" during the second cycle. The input sample is multiplied by Walsh functions and the multiplied outputs are accumulated, there being one accumulator for each coefficient. A special purpose accumulator using counter methods is used. The digital values of samples obtained from an Analog-to-Digital converter (A/D) are fed into the accumulator in a serial manner, starting with the least significant bit. The accumulator has the merit of giving the final result in the desired binary coded decimal form.

The frequency range of input signal analysed was
limited to about 30 Hz by the digital frequency multiplier circuit. An improved multiplier to extend the frequency range was investigated in the present work, and is described in detail in Chapter 4.

The design due to Ashouri and Constantinides [15] follows an approach which is similar to the above, but was designed to compute the Walsh spectrum of Delta-Sigma modulated signals. Delta-sigma modulation is a method of encoding an analog signal \( x(t) \) to give at the output \( L(t) \) which is bi-valued \( (\pm 1) \) according to the following criterion:

\[
\int_{t_1}^{t_2} x(t) \, dt = \int_{t_1}^{t_2} L(t) \, dt
\]

The \( k \)th Walsh coefficient for a delta-sigma modulated signal is then,

\[
B(k) = \frac{1}{N-1} \sum_{t=0}^{N-1} L(t) \, \text{wal}(k,t)
\]

Since \( L(t) \) and \( \text{wal}(k,t) \) can take only two values \( \pm 1 \), \( L(t) \) \( \text{wal}(k,t) \) is realized using exclusive-OR gates. The output is accumulated in an integrator for coefficients in analog form or in an up-down counter which can be realized by using readily available counter ICS (for an example TTL Bi-directional Counter 74191/74193).

Knauer [47] proposed a Hadamard processor following a
similar parallel approach. His instrument computes only 8 coefficients and was used in video-compression applications. This was considered to be one of the choices in the design of a Real-time Walsh-Hadamard/ Cosine Image Processor by Hein and Ahmed [51].

A hybrid Real-time Walsh waveform analyser differing in the final accumulation process from the previous processors was designed by Tanada and Sano [14], using both analog and digital techniques. Assuming that a N coefficient representation of a signal defined in a period (0-T) is desired, the time interval is divided into N equal parts. The average value of the signal during each sub-interval $\Delta \theta$ is obtained by analog integrators and transformed into a pulse. The pulse width (duration) is given by

$$D_{i+} = \frac{h}{E} \left( 1 + \frac{x(i)}{E} \right)$$  \hspace{1cm} (3.6)

where $0 < h \leq 1/2$, and $E$ and $h$ are constants. For example, $x(i) = -E$, 0 and $+E$ corresponds to time widths $D_{i+} = 0$, $h\Delta \theta$ and $2h\Delta \theta$. A reversed quantity for $-x(i)$ is necessary for multiplication with Walsh functions. This is represented by

$$D_{i-} = \frac{h}{E} \left( 1 - \frac{x(i)}{E} \right) = 2h\Delta \theta - D_{i+}$$  \hspace{1cm} (3.7)

that is, the time width $D_{i-}$ can be obtained from a pulse of width $2h\Delta \theta$ by dividing it into $D_{i+}$ and $D_{i-}$. With a pulse sequence $P_{o}$ consisting of $D_{i+}$ with a logic duration of 1 as its width and another pulse sequence $N_{o}$ consisting of $D_{i-}$,
multiplication can be performed by exclusively selecting either of the sequences according to the sign of Walsh function. Representing +1, -1 of Walsh functions by logic 0 and 1 respectively, the multiplication can be stated as follows:

If a Walsh function assumes logic 0, select $P_0$; otherwise select $N_0$ for the summation process. The pulses $P_0$ and $N_0$ selected on the above basis during each time interval are combined to get a pulse train $a(k)$ corresponding to the kth Walsh coefficient,

$$a(k) = \sum_{i=1}^{N} \left[ P_0 \text{ wal}'(k,i) \right] \cup \left[ N_0 \text{ wal}'(k,i) \right]$$  (3.8)

where wal'(k,i) is the digital representation of Walsh function wal(k,i) with the mapping mentioned earlier. The output pulse train $a(k)$ delivered by the multiplier is transformed into an analog voltage representing the Walsh coefficient $B(k)$ by an adder consisting of high-speed current switches with an integrator connected to them. Alternatively time widths $D_{i+}$ can be quantized with a sufficient number of clock pulses and the analyser can be constructed with only digital circuits after time-width transformation. Quantization of time width introduces errors in the final computed coefficients; the error is inversely proportional to the clock frequency used in the process. The summation in Eqn. (3.8) can then be affected
using only up-down counters, similar to the earlier processors discussed, and two's complement arithmetic is avoided.

The direct methods perform $N$ arithmetic operations (only additions or subtractions) for each coefficient (Eqn. (3.3)); a total of $N^2$ operations are needed for a $N$-coefficient processor. Fast Transform methods discussed in sections 3.3 perform only $N \log_2 N$ arithmetic operations. As a result, from the computational point of view, direct methods are inefficient when compared with transform methods and are costlier in terms of hardware. In the literature, it is often said that Direct methods are faster and operate in "real-time". The term "real-time" is loosely defined in practice. In electronic systems, processes are described as occurring in real time, only when they keep pace with the chronology of events in the "real world" - i.e. in the environment we perceive events. A processor can collect a finite segment of data and process it; while processing the previous segment it can acquire and store more data. This permits the generation of coefficients that are continuously updated. If the processor is fast enough, a contiguous stream of data is acquired, producing a series of coefficients (each slightly "out of date" when generated) and no data is lost. A processor with this capability can be certainly called a "real-time" processor and can also be
built using the Fast Transform approach using less hardware. Direct methods differ from the Fast Transform methods in that the former type delivers output coefficients instantaneously at the end of data acquisition (zero response time) while in the latter case there is a finite lag between the end of data acquisition and the delivery of output coefficients. High cost of hardware and bulkier equipment using the direct approach may be justified in applications requiring zero response time.

### 3.3 Fast Transform Algorithms and their Hardware Implementations

The use of Fast Fourier Transform (FFT) algorithms for the fast computation of Discrete Fourier Transform is well known. With the use of Walsh functions in engineering applications, similar fast algorithms to compute Walsh coefficients in Hadamard/Natural order and sequency order (hereafter called Fast Walsh-Hadamard Transform – FWHT and Fast Walsh Transform FWT respectively) were developed.

#### 3.3.1 Walsh-Hadamard Transform (FWHT)

It is well known that an efficient way to implement a FWHT of a $N$ length data sequence ($N = 2^m$) is to decompose the transform matrix $H(N)$ in Eqn. (3.9) into $m$ matrices that have many zero elements, thereby reducing the number of
arithmetic operations [15]. The FWHT is given by

$$B_h(N) = H(N) \times (N)$$  \hspace{1cm} (3.9)

where $B_h(N)$ = (N\times1) coefficient matrix (vector) in Hadamard order

$H(N)$ = (N\times N) Hadamard matrix

$X(N)$ = (N\times1) data vector

For example the decomposition for $H(8)$ is as follows:

$$H(8) = \begin{bmatrix}
1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\
1 & -1 & 1 & 1 & 1 & -1 & -1 & -1 \\
1 & 1 & 1 & 1 & -1 & -1 & 1 & 1 \\
1 & -1 & -1 & -1 & 1 & 1 & 1 & 1 \\
1 & 1 & -1 & 1 & -1 & -1 & 1 & 1 \\
1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\
1 & -1 & -1 & -1 & 1 & 1 & 1 & 1 \\
1 & 1 & -1 & 1 & -1 & -1 & 1 & 1
\end{bmatrix}$$  \hspace{1cm} (3.10)

$$= G_0 \cdot G_1 \cdot G_2$$  \hspace{1cm} (3.11)

$$= P \cdot P \cdot P$$  \hspace{1cm} (3.12)

$$= Q \cdot Q \cdot Q$$  \hspace{1cm} (3.13)

where

$$G_0 = \begin{bmatrix}
1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
1 & -1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & -1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & -1
\end{bmatrix}$$  \hspace{1cm} (3.14)
\[
G_1 = \begin{bmatrix}
1 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 1 & 0 & 0 & 0 \\
1 & 0 & -1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 1 \\
0 & 0 & 0 & 0 & 1 & 0 & -1 \\
0 & 0 & 0 & 0 & 1 & 0 & -1
\end{bmatrix}
\]
(3.15)

\[
G_2 = \begin{bmatrix}
1 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 1 \\
1 & 0 & 0 & 0 & -1 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & -1 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & -1
\end{bmatrix}
\]
(3.16)

\[
P = \begin{bmatrix}
1 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 1 & 0 \\
1 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & -1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & -1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1
\end{bmatrix}
\]
(3.17)

\[
Q = \begin{bmatrix}
1 & 0 & 0 & 0 & 1 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 1 & 0 & 0 & 0 & -1 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & -1
\end{bmatrix}
\]
(3.18)

The factorization in Eqn. (3.11) is due to the fact that a Hadamard matrix of order \( N = 2^m \) can be generated by the successive Kronecker product of core matrix \( H(2)^{m} \) of order 2 which is the lowest order Hadamard matrix [7].
\[ H(2) = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} \]  

(3.19)

(The Kronecker product \( U \otimes R \) of two matrices is obtained by
multiplying \( U \) with each element \( r_{ij} \) of \( R \) and substituting
the multiplied matrices \( r_{ij} U \) for the elements of \( r_{ij} \) in \( R \).)

For example,

\[ H(8) = H(2) \otimes H(2) \otimes H(2) \]  

(3.20)

The computation of Walsh coefficients using the
decomposition in Eqn. (3.11) is known as Fast Hadamard-
ordered Walsh-Hadamard Transform (FWHT).

The signal flow-graph shown in Fig. 3.3 illustrates
the computational process represented by Eqn. (3.11). It
involves three stages (iterations) of matrix multiplication
concerning to \( G_2, G_1 \) and \( G_0 \) in that order. The signal
flow-graph in Fig. 3.3 resembles to that of radix-2
Decimation-in-Time FFT algorithms [24]; the multiplication
associated with the FFT algorithms are absent in Fig. 3.3.

The following remarks on (FWHT) can be made for the general
case \( N = 2^m \):

1. The number of iterations required is \( N \log_2 N \).
2. The ith iteration induces \( 2^{i-1} \) partitions of size
   \( N/2^{i-1} \) on \( N \) data points.
3. Each iteration consists of repeated application of a
   basic operation widely known as the butterfly-operation in
   Digital Signal Processing.
Fig. 3.3: FFT Signal flow-graph for N = 8.
A butterfly operation combines two inputs to give two outputs according to

\[ X(i) = X(i) + X(i + N/2^j) \quad (3.21) \]

\[ X(i + N/2^j) = X(i) - X(i + N/2^j) \quad (3.22) \]

where \( i = 0, 1, \ldots, (N/2) - 1 \)

\[ j = 1, 2, \ldots, m = \log_2 N \]

It is seen from Eqns. (3.21) and (3.22) that intermediate results computed in an iteration are stored in the same memory locations in which original data being transformed were stored. This feature, known as in-place computation, does not require extra memory.

4. The number of arithmetic operations for the process is \( N \log_2 N \). (Note that arithmetic operations involve only additions and subtractions in the case of FWHT.)

This algorithm is adopted for software implementation of FWHT in a general purpose computer \([7, 52]\) due to the in-place computational advantage (the same approach is adopted in a Microprocessor based Walsh-spectral analyser developed in the present work).

3.3.2 Pipe-line Processor

A pipe-line hard-wired Hadamard Walsh processor was designed by Ashouri and Constantinides \([15]\) using the FWHT algorithm according to the signal flow-graph shown in Fig. 3.3 and is similar to the one described by Gorginsky and
Works [25] for a pipe-line FFT processor. The structure of the processor for $N = 8$ is given Fig. 3.4. It consists of $m$ stages; each consisting of a shift register $SR_1$ of $N/2^1$ bits, an adder $A_1$, subtractor $S_1$ and switches $SW_1$. The switches are controlled by a $m$-bit binary counter $C$. Considering the operation of the first stage, $SW_1$ is in position 1. $N/2$ data points are entered sequentially into $SR_1$. At the $(N/2 + 1)$th sample, switches $SW_1$ change to position 2 and $-X(i + N/2) + X(i)$ is entered into $SR_1$, $i$ being the index of the data in the last bit of $SR_1$. The sum $X(i) + X(i + N/2)$ is formed by $A_1$ and entered into the second stage. After $N$ samples, $SW_1$ changes to the position 1; now $-X(i + N/2) + X(i)$ from $SR_1$ is entered into the second stage. These operations are the ones defined by the iteration number 1 in Fig. 3.3. The operation of the other stages is similar, except that the switches in the succeeding stage operates twice as fast as in the preceding stage; they perform successive iterations of FWHT. The binary output of the counter $C$ identifies the order of the transformed output. It can be seen that there is a delay of $N$ sampling periods between the instants at which the $i$th sample in a data window of $N$ points is entered into the processor and the $i$th coefficient is available at the output. A comparison of this structure with a new pipeline structure is given in section 3.3.
Fig. 3.4 Organization of a Pipeline FWHT Processor
3.3.3 **Parallel Array Processor**

Elliot and Shum [16] designed a Parallel Array Processor based on the decomposition given by Eqn. (3.12). Their design uses two arrays of \( N \) registers (the word array is used in the sense of representing storage elements for \( N \) data points); one array for the \( N \) input data points and the other for storing intermediate results. Each pair of result registers is connected to a fixed pair of adders/subtractors. To begin with, \( N \) data points are acquired in the data registers. At the end of data acquisition, the data registers' outputs are clocked into the result registers. The output of the adders/subtractors are fed back to the result registers based on Eqn. (3.12) and given by

\[
X(i) = X(j) + X(j+1) \quad i = 0, 1, \ldots, N/2 - 1
\]

\[
X(i + N/2) = X(j) - X(j+1) \quad j = 2i
\]

The process of feedback is repeated \( m \) times to obtain the Hadamard ordered coefficients. Due to the parallel computation scheme used, its response time is shorter than the previously discussed pipeline structure. For example, 64 coefficients could be obtained in about 200 nanoseconds with TTL ICs. This processor's response time is close to that of the direct type processor. The parallel structure with increased hardware (when compared to Pipeline or Serial organization) can be justified in applications with fast
response time as discussed earlier.

3.3.4 Serial Organization: Shift Register Implementation

A processor which performs the butterfly operations of FWHT algorithms given in Eqns. (3.11 - 3.13) serially is termed as a serial processor. (A pipeline and parallel type processors discussed earlier perform m and N/2 butterfly operations at a time respectively.) This feature could result in a low cost solution to the computation of FWHT and may meet the requirements of many engineering applications. The architecture of a Serial Processor using long shift registers, based on Eqn. (3.13) is illustrated in Fig. 3.5. The computation according to Eqn. (3.13) is to fetch two operands, representing two data points separated by N/2 sampling intervals for a butterfly operation from shift register memory, and to store the intermediate results in another memory. Note that in-place computational advantage of the algorithm in Eqn. (3.11) is lost in this process, but results in identical operations in all iterations.

The processor consists of four shift registers SR1-4 of N/2 bits long, an adder A, a subtractor S and four switches SWa, SWb, SWc, and SWd. At the start, all the switches are in the position 1. The first step is to shift the data X(i), i = 0, 1, ..., N-1 from external memory into SR1 and SR2. It is assumed that external memory is used as
a buffer to acquire data, while the previous data is processed in real-time environment. In non-real time application, data can be entered directly into SR1 and SR2. After N shifts SW_d changes to the position 2 and the first iteration commences. Indexing the data in SR1-SR2 from right to left, X(0) and X(N/2) are operands to A and S. The outputs of A and S are shifted into SR3-SR4 in that order. When S output is shifted into SR3-SR4, new operands to A and S are fetched by shifting the contents of SR1 and SR2. After N/2 operations SW_a, SW_b and SW_c change to position 2; the roles of SR1-SR2 and SR3-SR4 are interchanged. The operands to A and S are obtained from SR3-SR4 and the outputs of A and S are shifted into SR1-SR2. This is repeated m times, transformed coefficients are outputted after m iterations. As the results are outputted serially, new data for processing can be inputted simultaneously.

Geadah and Corinthios [17] derived a decomposition for the Hadamard transform with a view of implementation using a shift register type organization as follows

\[
H(N) = \prod_{i=1}^{m} P(N) D
\]  

(3.24)

where

\[
D = I(N/2) \times H(2) \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}
\]

(3.25)

\[
= I(N/2) \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}
\]

(3.25a)
where \( I(N/2) \) is \( (N/2 \times N/2) \) identity matrix. Expression (3.24) is identical to \( Q \) in Eqn. (3.13). \( P(N) \) is the 'ideal' permutation matrix, which operates on a vector of dimension \( N \) according to

\[
P(N) \text{col } [X(0), X(1) \cdots \frac{X(N-2)}{2} X(N/2) \cdots X(N-2) X(N-1)]
\]

(3.26)

\[
= \text{col } [X(0), X(N/2), X(1), X(N/2+1) \cdots X(N-2)/2, X(N-1)]
\]

The operations in Eqn. (3.26) are equivalent to storing the results of butterfly operations in Eqn. (3.25) sequentially. In the digital implementation, Geadah and Corinthios employed four shift registers, each \( N/2 \) bits long. The outputs of the adder and subtractor resulting in a butterfly operation are shifted in two lower shift registers. After all the butterfly operations are over in an iteration, the results are permuted into upper shift registers according to Eqn. (3.26). This could have been avoided, if the results of the butterfly operation were shifted sequentially into the two lower shift registers cascaded as one register and this set used as data memory for the next iteration as is done in the processor in Fig. 3.5. Since the processor shown in Fig. 3.5 involves only \( N \) shifts in an iteration instead of \( 3N/2 \) shifts in [17], its processing time is shorter than the latter by a factor of 1.5. The hardware requirements of both processors are roughly the same.

Since an add/subtract operation can be performed in a
few nanoseconds, the processing time is limited by the speed with which data can be entered into shift registers. Let $f_s$ be maximum shifting frequency. The processing time $T_p$ is then $mN/f_s$. For $f_s = 10 \text{ MHz}$ and $N = 1024$, $T_p = \frac{10 \times 1024}{10 \times 10^6} = 1.024 \text{ msec}$. This would permit real-time processing at a throughput rate of 1 MHz.

The processor organization shown in Fig. 3.5 can be modified for pipeline processing as shown in Fig. 3.6. It consists of $m$ identical stages; each performing one iteration in Eqn. (5.13). The outputs of A and S are inputted to the next stage. To start with, assume that all the switches are in position 1. The input/output of A and S are entered into upper shift registers. The operands to A and S are from lower shift registers. After N samples are inputted, the switches change to position 2 and the roles of the upper and lower shift registers are interchanged. This organization speeds up the computation by $m$ times, when compared to a single stage structure. For example, the pipeline processor could be used for real-time processing up to 10 MHz throughput rate, assuming the maximum shifting frequency of 10 MHz as before.

A comparison of the performance characteristics of the pipeline structure described here (called Type 1) with that described in Section 3.3.2 (called Type 2) is given below.
1. The processing speed in a Type 1 processor is limited by the time required to shift one word and perform one addition. In Type 2, there is a cascaded delay of \( m \) additions and \( m \) shifts. This suggests that Type 1 is faster than Type 2 by a factor of about \( m \).

2. In terms of hardware, both types use the same number of adders/subtractors and switches, but Type 1 uses \( 2mN \) memory words and Type 2 uses only \( N \) memory words.

3. Type 1 uses identical modules/stages; not so for Type 2.

4. There is delay of \( mN \) sampling periods between the incoming data and the corresponding output coefficients in Type 1 and \( N \) sampling periods in Type 2.

3.3.5 Fast Walsh Transform Algorithms (FWT) and their Digital Hardware Implementation

One implementation approach is to first compute the FWHT for which efficient algorithms and processors were discussed in the preceding sections. Subsequently, FWT coefficients can be obtained using Eqn. (2.25). However, since this approach involves Gray code-to-binary conversion in addition to bit reversal, it is not a very efficient procedure in the case of software implementation in a computer. In the case of hardware implementation, this
involves more hardware to permute the output coefficients as well as extra processing time. This approach had been used in [16].

A FWT algorithm due to Manz [53] has a signal flow-graph as shown in Fig. 3.7 for \( N = 8 \). It is essentially a simple modification of the FWHT signal flow-graph. The first step is to permute the \( N \) data points in bit-reversed order. Example: For \( N = 8 \), \( X(1) = X(001) \), where index 1 is represented by 3-bit binary number is exchanged with \( X(100) = X(4) \) due to the bit-reversal operation. This is a time consuming operation [54], if implemented in software. This can be effected through hardware during the data acquisition phase with no increased complexity in the circuits and is followed in a digital processor built in the present work.

The process consists of \( m \) iterations and in \( i \)th iteration data is partitioned into \( 2^{i-1} \) blocks of equal size consisting of \( \frac{N}{2^{i-1}} \) points. Two types of butterfly operations are performed alternating with blocks in an iteration starting with the normal type as used in FWHT. The other type is called reversed butterfly operation and is illustrated in Fig. 3.8. Normal butterfly operations are described by

\[
X_{k+1}(i) = X_k(i) + X_k(i+2) \tag{3.27a}
\]

\[
X_{k+1}(i+1) = X_k(i+1) + X_k(i+3) \tag{3.27b}
\]

\[
X_{k+1}(i+2) = X_k(i) - X_k(i+2) \tag{3.27c}
\]
Fig. 3.7 FWT signal flow-graph for N = 8

Fig. 3.8(a) Normal butterfly operations

Fig. 3.8(b) Reverse butterfly operations
\[ X_{k+1}(i+3) = X_k(i+1) - X_k(i+2) \]  

(3.27d)

However, reversed butterfly operations are described by:

\[ X_{k+1}(i+1) = X_k(i+1) - X_k(i+3) \]  

(3.28b)

\[ X_{k+1}(i+2) = X_k(i) + X_k(i+2) \]  

(3.28c)

\[ X_{k+1}(i+3) = X_k(i) + X_k(i+3) \]  

(3.28d)

Except for these two differences, FWHT and FWT have similar structures. In the case of software implementation, this calls for the incorporation of two subroutines for normal and reversed butterfly operations instead of one in the case of FWHT, as well as provision for the choice between the two in a block. This would increase program memory and processing time slightly, when compared with the FWHT.

The pipeline processor in Fig. 3.4 could be modified to implement the signal-flow graph in Fig. 3.7. The adder and subtractor units in Fig. 3.7 are interchanged periodically. For example, the \( A \) and \( S \) in the first stage do not interchange; in the second stage they are interchanged for every 4 inputs and in the third stage for every two inputs. But still one needs a bit-reversal permutation operation, before data is entered into the processor. There exists a permutation operator \([56]\) such that FWHT of the permuted data would yield coefficients in sequency order. This would eliminate the need to interchange the adders and subtractors in Fig. 3.7. It also
avoids the use of two types of butterfly operations mentioned earlier in the software implementation. The permutation operator for this objective is developed in Section 3.3.6. In real-time applications, data comprising N samples is acquired first and stored in one segment of memory and then the processing of the acquired data commences. Meanwhile new data is acquired and stored in another segment of input memory. If one could store the data in a permuted sequence, the FWHT would then deliver sequency ordered coefficients. The extra time necessary to permute the FWHT coefficients from a processor is avoided. Also extra memory needed for permutation is eliminated. The performance characteristics of a processor using this approach is compared with that designed by Geadah and Corinthios [17] in Section 3.4.

3.3.6 Permutation Operator for Input Data to Obtain FWHT Through FWHT of Permutated Data [56]

The set of N orthogonal basis Walsh functions in Hadamard/Natural order (rows of Hadamard matrix) are defined [43] as

\[ \text{wal}_h(k,j) = (-1)^{<k,j>} \]  

where \( k, j \) are integers between 0 and \( N-1 \) with binary
expansions,
\[ k = k_{m-1} 2^{m-1} + k_{m-2} 2^{m-2} \ldots + k_s 2^s \ldots + k_0 2^0 \]  \hspace{1cm} (3.30)
\[ j = j_{m-1} 2^{m-1} + j_{m-2} 2^{m-2} \ldots + j_s 2^s \ldots + j_0 2^0 \]  \hspace{1cm} (3.31)
\[ k_s, j_s \in [0,1] \]

and
\[ \langle k, j \rangle = \sum_{s=0}^{m-1} k_s j_s \]  \hspace{1cm} (3.32)

Index \( k \) in Eqn. (3.29) refers to the \( k \)th Walsh function in Hadamard order and index \( j \) refers to the discrete time with sampling period taken as unity.

\[ B_h(k) = \sum_{j=0}^{N-1} \text{wal}_h(k,j) X(j) \]  \hspace{1cm} (3.33)

Let \( S_w(\cdot) \) be a permutation operator which maps a set of integers \( \{0,1,\ldots,N-1\} \) one-to-one into itself (a bijection mapping). It is defined as follows.

Let
\[ S_w(k) \triangleq \bar{k} \]  \hspace{1cm} (3.34)

Let the \( m \) bit binary expansion of \( k \) and \( \bar{k} \) be,
\[ k = k_{m-1} k_{m-2} \ldots k_s \ldots k_0 \]  \hspace{1cm} (3.35)
\[ \bar{k} = k_{m-1} \bar{k}_{m-2} \ldots \bar{k}_s \ldots \bar{k}_0 \]  \hspace{1cm} (3.36)

\( k \) and \( \bar{k} \) are related as follows
\[ \bar{k}_s = k_{m-s}; \text{ for } s = 0 \]  \hspace{1cm} (3.37)
\[ = k_{m-s} \oplus k_{m-s-1}; \text{ for } s = 1,2,\ldots,m-1 \]

The operator \( S_w(\cdot) \) performs the bit reversal of the binary to Gray code conversion of \( k \). From Eqn. (2.25), it can be seen that
\[ B(k) = B_h(\vec{k}) = B_h(\mathbf{w}_w(k)) \]  

\[ = \sum_{j=0}^{N-1} \mathbf{w}_h(\vec{k}, j) \mathbf{x}(j) \]  

From the definition of \( \mathbf{w}_h(k, j) \) in Eqn. (3.29),

\[ \mathbf{w}_h(\vec{k}, j) = \mathbf{w}_h(k, \vec{j}) \]  

using Eqns. (3.39) and (3.40),

\[ B(k) = \sum_{j=0}^{N-1} \mathbf{w}_h(k, \mathbf{w}_w(j)) \mathbf{x}(j) \]  

\[ = \sum_{i=0}^{N-1} \mathbf{w}_h(k, i) \mathbf{x}(\mathbf{w}_w^{-1}(i)) \]  

\( \mathbf{w}_w^{-1}(\cdot) \) is the inverse operator and is given by [56]

\[ \mathbf{w}_w^{-1}(k) = \bigoplus_{r=0}^{m-1} \mathbf{SR}^r [\mathbf{BR}(k)] \]  

where \( \oplus \): bit per bit mod. 2 summation

\( \mathbf{SR}^r \): shift right \( r \) bits with left zero filled

\( \mathbf{BR}(\cdot) \): Bit reversal operator.

Eqn. (3.42) can be written as follows:

\[ B(k) = \sum_{i=0}^{N-1} \mathbf{w}_h(k, i) \bar{x}(i) \]  

\( \bar{x}(i) \) is the permuted sequence and is given by

\[ \mathbf{x}(i) \rightarrow \bar{x}(\mathbf{w}_w(i)), \quad i \epsilon \{0, 1, \ldots, N-1\} \]  

The permutation according to Eqn. (3.45) is illustrated by an example for \( N = 8 \).
\[ S_w(\ell) = \bar{\ell} \]

<table>
<thead>
<tr>
<th>( \ell )</th>
<th>( l_1 )</th>
<th>( l_2 )</th>
<th>( S_w(\ell) = \bar{\ell} )</th>
<th>( \bar{\ell} )</th>
<th>( X(\ell) )</th>
<th>( \bar{X}(\ell) )</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>1</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>1</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>2</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>1</td>
<td>1</td>
<td>0</td>
</tr>
<tr>
<td>3</td>
<td>0</td>
<td>1</td>
<td>1</td>
<td>0</td>
<td>1</td>
<td>0</td>
</tr>
<tr>
<td>4</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>1</td>
</tr>
<tr>
<td>5</td>
<td>1</td>
<td>0</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>1</td>
</tr>
<tr>
<td>6</td>
<td>1</td>
<td>1</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>1</td>
</tr>
<tr>
<td>7</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>1</td>
</tr>
</tbody>
</table>

It is also illustrated graphically in Fig. 3.9. It will be seen later that mapping of \( X(\ell) \) to \( \bar{X}(\ell) \) can be achieved readily through hardware with no extra computing time.

3.3.7 **Permutation Operator for Input Data to Obtain Coefficients in Dyadic Order through FWHT of Permuted Data**

Let \( B_d(k) \) denote the \( k \)th Walsh coefficient in dyadic order. Let \( S_D(\cdot) \) be a permutation operator which maps a set of integers \( \{0, 1, \ldots, N-1\} \) one-to-one onto itself (a bijection mapping). It is defined as follows:

\[
S_D(k) \triangleq \bar{k} \tag{3.46}
\]

Let the \( m \)-bit binary expansion of \( k \) and \( \bar{k} \) be

\[
k = k_{m-1} k_{m-2} \ldots k_0 \tag{3.47}
\]

\[
\bar{k} = k_{m-1} k_{m-2} \ldots k_0
\]

\( k \) and \( \bar{k} \) are related by

\[
\bar{k} = k_{m-1} k_{m-2} \ldots k_0 \tag{3.48}
\]

Eqn. (3.48) is a bit-reversal operation. From Eqns. (2.22)
Fig. 3.9 Mapping of $\hat{X}(\bar{\lambda})$ to $\bar{X}(\lambda)$ for sequency ordered transform by $S_w(.)$
and (2.25) \[ B_d(k) = B_h(k) \]

\[ = \sum_{j=0}^{N-1} w_1 h(k, j) X(j) \]  

(3.50)

using Eqn. (3.40) in (3.50),

\[ B_d(k) = \sum_{j=0}^{N-1} w_1 h(k, S_d(j)) X(j) \]  

(3.51)

\[ = \sum_{i=0}^{N-1} w_1 h(i, \lambda) X(S_d^{-1}(\lambda)) \]  

(3.52)

where \( S_d^{-1}(\cdot) \) is inverse operator equal to \( S_d(\cdot) \) in this case.

\[ X(\lambda) = \hat{X}(S_d(\lambda)) \]  

(3.53)

\[ \lambda \in \{0, 1, \ldots, N-1\} \]

The permutation according to Eqn. (3.53) is illustrated by an example for \( N = 8 \).

| \( \lambda \) | | | | | \( \hat{X}(\lambda) \) | | \( X(\lambda) \) |
|---|---|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 1 | 0 | 0 | 1 | 1 | 0 | 0 | 4 |
| 2 | 0 | 1 | 0 | 0 | 1 | 0 | 2 |
| 3 | 0 | 1 | 1 | 1 | 1 | 0 | 6 |
| 4 | 1 | 0 | 0 | 0 | 1 | 1 | 1 |
| 5 | 1 | 0 | 1 | 1 | 0 | 1 | 5 |
| 6 | 1 | 1 | 0 | 0 | 1 | 1 | 3 |
| 7 | 1 | 1 | 1 | 1 | 1 | 1 | 7 |
Its graphical representation is shown in Fig. 3.10.

Fig. 3.10 Mapping of $X(\ell)$ to $\hat{X}(\ell)$ for dyadic ordered transform by $S_D(.)$.
3.4 New Design Approach for a Digital WSA to Obtain Coefficients in Natural, Dyadic and Sequency Orderings

In sections 3.3.6 and 3.3.7, the permutation operators $S_w(\cdot)$ and $S_D(\cdot)$ were derived with the view that FWHT of the permuted data sequence would yield output coefficients in sequency and dyadic orderings respectively. This enables one to adapt a FWHT processor for obtaining transform coefficients in different orderings. In designing a digital WSA, the following approach is considered.

1. Discrete samples obtained from a signal are stored in the processor memory in the permuted sequence. Example: $X(0)$ is stored in memory location 0 and $X(1)$ is stored in memory location 7 for sequency order (from Fig. 3.9). In terms of hardware, this involves providing address information for the memory location, assuming that one uses semiconductor Random Access Memory (RAM).

2. The data from RAM is retrieved sequentially and is sent to the serial processor described in section 3.3.4. In the case of software implementation the FWHT on the permuted data is performed as discussed in section 3.3.1.

The schematic diagram of hardware called a Permuting Module to realize data shuffling according to Eqns. (3.45) and (3.53) is described with the aid of Fig. 3.11. It consists of an address counter $AC$, three tri-state gates $G_1$, $G_2$, $G_3$ and a set of EXCLUSIVE-OR-gates. $AC$ is $m$ bits long.
Tri-state gates G1, G2, G3.

Fig. 5.11 Special purpose Permuting Module
where \( m = \log_2 N \). Let the decimal representation of AC outputs be \( i \). From Fig. 3.11, it can be seen that input to G1 is \( S_w(i) \) and to G2, it is \( S_D(i) \). During the data acquisition phase, one of the three gates G1, G2 and G3 will be enabled by control circuitry; G1 for sequency ordered coefficients, G2 for dyadic ordered coefficients and G3 for Natural ordering. As a consequence, data stored in the memory would represent \( \hat{x}(i) \), \( \tilde{x}(i) \) and \( X(i) \) respectively. During data retrieval for further processing in a serial processor described earlier, only G3 is enabled. In the case of computer implementation, DMA could be considered as a kind of Direct Memory Access (DMA). This is described in detail in Chapter 4 where the Microprocessor based WSA is discussed in detail.

We compare the new approach with that proposed by Geadah and Corinthios [17]. Their algorithms using shift registers require a reshuffling of the intermediate results of an iteration; the reshufflings vary from iteration to iteration and with output coefficients orderings. This process takes \( N/2 \) shift operations for \( N/2 \) butterfly operations and an additional \( N \) shift operations for data reshuffling in an iteration. This slows down the processing speed by a factor of \( 3m/2(m+1) \) when compared with the new approach. When pipelined for higher input data rate, the
new design uses similar modules (assuming that one iteration uses one module) in contrast to the use of dissimilar modules in the case of [17].
CHAPTER 4
MICROPROCESSOR-BASED WALSH SPECTRAL ANALYSE

4.1 Design Principles

In a Microprocessor-based Walsh Spectral Analyser, one possible approach to obtain \( \{B(k)\}_{k=0}^{N-1} \) is to get \( \{B_{h}(k)\}_{k=0}^{N-1} \) by performing a FWHT through software and subsequently reorder \( B_{h}(k) \) stored in system memory according to Eqn. (2.25). This method is not attractive due to time-consuming reordering operation in software implementation or extra hardware required as stated in Chapter 3. The other approach is to implement Manz's algorithm [53] which requires a reordering of inputs in bit-reversed order. This is time-consuming through software, so a hardware implementation is preferred. Until now the bit-reversal operation has been implemented only through software in Microprocessor-based signal processors [55]. In Section 3.3.6, it is shown that if the data is permuted according to Eqn. (3.45), then FWHT of the permuted data sequence delivers \( \{B(k)\}_{k=0}^{N-1} \). The use of two kinds of butterfly operations of Manz's algorithm is also avoided, the processing time and program length are decreased due to the reduced bookkeeping operations by using one kind of butterfly operation. The permutation operation is realized.
through a DMA operation during the data acquisition phase.

In brief, the design principles are as follows:

(1) The input signal is sampled and converted into
digital values. The acquired data sequence is stored
in the system memory in a permuted order through DMA.

(2) After the completion of data acquisition, the FWHT is
performed according to the signal flow-graph of
Chapter 2.

4.2 System Organization

The organization of a Microprocessor-based WSA is
shown in Fig. 4.1. For descriptive purposes, the operations
of the instrument are divided into four distinct phases viz;
(1) Data acquisition, (2) Computation, (3) Control, and (4)
Display of results. The operating principles of functional
blocks shown in Fig. 4.1 are described in the following
sections. The circuit details of these blocks are dealt
with in Section 4.3.

4.2.1 Data Acquisition

Fast transform methods employed for spectrum
computation accept an input data sequence of \( N \) samples and
deliver \( N \) output coefficients. Normally \( N = 2^m \), where \( m \) is
an integer. As a result, an on-line instrument for this
purpose is required to acquire \( 2^m \) samples of input signal.
Fig. 4.1 Organization of a Microprocessor-based WSA
for one measurement cycle. For periodic signals, it is desired that $2^m$ samples span an integral number of periods to avoid leakage errors in the computation of discrete Fourier coefficients [19]. Automatic generation of sampling pulses at this rate is realized using the Frequency Multiplication Module (FMM) shown in Fig. 4.1. The FMM, in conjunction with a level crossing detector (LCD), determines the period $T_s$ of $x(t)$ digitally in the first cycle, after the process is started. Using the obtained period information $T_s$, equally spaced pulses at a repetition rate of $2^m/T_s$ are generated in the second cycle. The newfrequency multiplier circuit [21] is described in Section 4.3.4. A multiplication factor of 64 is used in the system built to obtain a 64-coefficient representation of the signal in a period. The output pulses of FMM form "convert commands" to the Analog to Digital Converter (A/D) employed to digitize $x(t)$. An eight bit A/D converter with output representation in two's complement form is used; this code is well suited to binary arithmetic used in the microprocessor system.

The output of the A/D is stored in Read-write memory (SBC 80/10 uses static RAM) in a permuted sequence according to Eqn. (3.45), starting from a base address in the RAM. Assume that memory locations starting from 0000 to 003F hexadecimal (0-63 decimal) are assigned for data storage.
The 64 output samples obtained from the A/D converter are indexed as \( X(0000) \), \( X(0001) \), \( X(0002) \), \( \ldots \), \( X(003F) \) (in 4 digit hexadecimal representation). In the case of sequential storage, \( X(0000) \) would be stored in memory location 0000, \( X(0001) \) in 0001 and so on. The memory address for the next sample storage can be obtained by a simple increment operation of an address counter. But when the data is to be stored in a permuted sequence according to Eqn. (3.45), the computation of memory address for next sample storage is not straightforward. For example, \( X(0001) \) should be stored in memory location that would be occupied by \( X(0020) \) in the case of sequential storage. Hence the address of the memory location is to be obtained by adding an offset dependent upon the data index to the base address as shown below for \( X(0001) \):

\[
\begin{array}{cccccc}
0000 & 0000 & 0000 & 0000 & \text{Base address} \\
+ & 0000 & 0000 & 0010 & 0000 & \text{Offset} \\
0000 & 0000 & 0010 & 0000 & (0020)
\end{array}
\]

It can be seen from Eqn. (3.45) that for a data length 64, only the six lsbs of the offset are affected; the remaining bits are all zero. As a result, the addition operation could be avoided by choosing a base address with the six lsbs zero. The six lsbs of the address counter which is incremented by one after the storage of each sample are modified by hardware as shown in Fig. 4.2 so that the final
output contains the proper address for data storage. The DMA block in Fig. 4.1 has been specially designed around the LSI peripheral chip 8257 for this purpose. Hardware details are described in Section 4.3.7.

4.2.2 Computation and Control

The system 80/10 in Fig. 4.1 is used for the following purposes in measuring Walsh coefficients:

1) To generate control and data information so that external circuit blocks are properly initialized for data acquisition.

2) To perform the FWHT according to the signal flowgraph in Fig. 3.3.

3) To output the coefficients for display.

These are realized through software residing in System's SBC-80/10 program memory. A brief description of SBC-80/10 and software details are included in Section 4.3.1 and Chapter 6.

In addition to software-generated command signals, various other signals such as the FMM start signal, the control signals for enabling tri-state buffers at appropriate instants, and the DMA termination signal etc. are derived using TTL circuits that are mounted on the special purpose board. They will be described later.
4.3 Hardware Design

4.3.1 System 80/10

The discussion is limited to a functional description. Circuit details are given in reference [57].

The SBC 80/10 is a complete computer system with central processing unit (CPU), system clock, read/write memory, non-volatile Read-only memory (ROM), Input/output (I/O) ports and drivers, serial communication interface, bus control logic and drivers all residing on a 6.75 x 12 inch printed circuit card. The System 80/10 block diagram is shown in Fig. 4.3.

The 8080A contains six 8-bit general purpose registers and an accumulator. The six registers can be addressed individually or in pairs, providing both single and double precision operators. The 8080A has a 16-bit program counter and allows the CPU to address directly 64K memory locations. It has a 16-bit stack pointer register which controls the addressing of a last in/first out stack located within any portion of memory. The stack permits almost unlimited subroutine nesting. Memory and I/O devices can be interfaced through the 16-bit address bus and 8-bit bi-directional data bus.

The 8080A instruction set includes five different types of instructions [58].
Fig. 4.3: System 80/10 Block Diagram
(1) Data Transfer type – move data between registers or between registers and memory.

(2) Arithmetic group – add, subtract, increment or decrement data in registers or in memory.

(3) Logical group – AND, OR, EXCLUSIVE-OR, compare, rotate or complement data in registers or in memory.

(4) Branch group – conditional or unconditional jump instructions, subroutine call instructions and return instructions.

(5) Stack, I/O and Machine control group – including I/O instructions as well as instructions for maintaining the stack and internal control flags.

The execution time of instructions varies from 4 to a maximum of 17 clock cycles; the clock frequency is 2 MHz.

The SBC-80/10 contains space for 1K of 8-bit words, using type 8111 static RAMs; addresses of RAM are from 3COO (hexadecimal) to 3FFF. Non-volatile read-only memory (ROM) of up to 4K, 8-bit words in increments of 1K can be installed in sockets provided for this purpose; addresses are from 0000 to 0FFF. ROMs used in the system are type 8708, erasable and electrically programmable.

Communication between external peripherals and the SBC-80/10 is through 48 programmable parallel I/O lines via two 8255 Programmable Peripheral Interface (PPI). Also serial communication is available via an 8251 universal
synchronous/asynchronous receiver/transmitter (USART). A Teletype is interfaced through this module. The address of parallel and serial I/O ports are given in Table 4.1.

Table 4.1: I/O Addresses of SBC 80/10

<table>
<thead>
<tr>
<th>I/O Address</th>
<th>Function</th>
</tr>
</thead>
<tbody>
<tr>
<td>E4</td>
<td>PPI1 Port A</td>
</tr>
<tr>
<td>E5</td>
<td>PPI1 Port B</td>
</tr>
<tr>
<td>E6</td>
<td>PPI1 Port C</td>
</tr>
<tr>
<td>E7</td>
<td>PPI1 Control Port</td>
</tr>
<tr>
<td>E8</td>
<td>PPI2 Port A</td>
</tr>
<tr>
<td>E9</td>
<td>PPI2 Port B</td>
</tr>
<tr>
<td>EA</td>
<td>PPI2 Port C</td>
</tr>
<tr>
<td>EB</td>
<td>PPI2 Control Port</td>
</tr>
<tr>
<td>EC/EE*</td>
<td>USART Data Port</td>
</tr>
<tr>
<td>ED/EF*</td>
<td>USART Control Port</td>
</tr>
</tbody>
</table>

* Jumper selectable

The SBC-80/10 has an external bus which includes system data and address buses and a clock. The DMA module and the A/D converter are interfaced to the system SBC-80/10 through this bus. The bus clock which is derived from the oscillator of the processor clock is used in the FMM module; no separate clock is needed. As the bus signals are required for the special purpose circuit board, they are listed in Table 4.2. A HP Graphics Terminal 2647A [59] is
### Table 4.2 System bus

#### PIN ASSIGNMENTS FOR CONNECTOR P1
(System Bus)

<table>
<thead>
<tr>
<th>POWER, SUPPLIES</th>
<th>PIN</th>
<th>Mnemonic</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td>GND</td>
<td>GND</td>
<td>Signal GND</td>
</tr>
<tr>
<td>3</td>
<td>VCC</td>
<td>VCC</td>
<td>+5VDC</td>
</tr>
<tr>
<td>5</td>
<td>VCC</td>
<td>VCC</td>
<td>+5VDC</td>
</tr>
<tr>
<td>7</td>
<td>VDD</td>
<td>VDD</td>
<td>+12VDC</td>
</tr>
<tr>
<td>9</td>
<td>GND</td>
<td>GND</td>
<td>Signal GND</td>
</tr>
<tr>
<td>BUS CONTROLS</td>
<td>13</td>
<td>BCLK/</td>
<td>Bus Clock</td>
</tr>
<tr>
<td></td>
<td>15</td>
<td>BBEN</td>
<td>Bus Pri. In</td>
</tr>
<tr>
<td></td>
<td>17</td>
<td>BUSY/</td>
<td>Bus Busy</td>
</tr>
<tr>
<td></td>
<td>19</td>
<td>MRDC/</td>
<td>Mem Read Cad</td>
</tr>
<tr>
<td></td>
<td>21</td>
<td>IOEC/</td>
<td>I/O Read Cad</td>
</tr>
<tr>
<td></td>
<td>23</td>
<td>XACK/</td>
<td>XFER Acklow</td>
</tr>
<tr>
<td>SPARES</td>
<td>25</td>
<td>AACK/</td>
<td>Special</td>
</tr>
<tr>
<td></td>
<td>27</td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>29</td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>31</td>
<td>CCLK/</td>
<td>Constant Clock</td>
</tr>
<tr>
<td></td>
<td>33</td>
<td></td>
<td></td>
</tr>
<tr>
<td>INTERRUPTS</td>
<td>35</td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>37</td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>39</td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>41</td>
<td></td>
<td></td>
</tr>
<tr>
<td>ADDRESS</td>
<td>43</td>
<td>ADRF/</td>
<td>Address Bus</td>
</tr>
<tr>
<td></td>
<td>45</td>
<td>ADRD/</td>
<td></td>
</tr>
<tr>
<td></td>
<td>47</td>
<td>ADRB/</td>
<td></td>
</tr>
<tr>
<td></td>
<td>49</td>
<td>ADRS/</td>
<td></td>
</tr>
<tr>
<td></td>
<td>51</td>
<td>ADR6/</td>
<td></td>
</tr>
<tr>
<td></td>
<td>53</td>
<td>ADR4/</td>
<td></td>
</tr>
<tr>
<td></td>
<td>55</td>
<td>ADR2/</td>
<td></td>
</tr>
<tr>
<td></td>
<td>57</td>
<td>ADR1/</td>
<td></td>
</tr>
<tr>
<td>DATA</td>
<td>59</td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>61</td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>63</td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>65</td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>67</td>
<td>D76/</td>
<td></td>
</tr>
<tr>
<td></td>
<td>69</td>
<td>D74/</td>
<td></td>
</tr>
<tr>
<td></td>
<td>71</td>
<td>D72/</td>
<td></td>
</tr>
<tr>
<td></td>
<td>73</td>
<td></td>
<td></td>
</tr>
<tr>
<td>POWER SUPPLIES</td>
<td>75</td>
<td>GND</td>
<td>Signal GND</td>
</tr>
<tr>
<td></td>
<td>77</td>
<td>V82/</td>
<td>-10VDC</td>
</tr>
<tr>
<td></td>
<td>79</td>
<td>VAA</td>
<td>-12VDC</td>
</tr>
<tr>
<td></td>
<td>81</td>
<td>VCC</td>
<td>+5VDC</td>
</tr>
<tr>
<td></td>
<td>83</td>
<td>GND</td>
<td>Signal GND</td>
</tr>
<tr>
<td></td>
<td>76</td>
<td>GND</td>
<td>Signal GND</td>
</tr>
<tr>
<td></td>
<td>78</td>
<td>V82/</td>
<td>-10VDC</td>
</tr>
<tr>
<td></td>
<td>80</td>
<td>VAA</td>
<td>-12VDC</td>
</tr>
<tr>
<td></td>
<td>82</td>
<td>VCC</td>
<td>+5VDC</td>
</tr>
<tr>
<td></td>
<td>84</td>
<td>VCC</td>
<td>+5VDC</td>
</tr>
<tr>
<td></td>
<td>86</td>
<td>GND</td>
<td>Signal GND</td>
</tr>
</tbody>
</table>

- **Used by Intellect MDS Bus.**
interfaced through the USART which provides an RS 232C interface for serial I/O data communication.

A software monitor for the system 80/10, contained in two ROMs 8226 is available. This aids the development of application software with the following capabilities:

(1) it gives the user access to console input and output routines as well as paper tape input and output control software.

(2) along with a TTY, it provides access to memory and registers and has control commands to begin execution and to display or alter the contents of memory or registers. The system monitor resides in two ROMS and only 2K of ROM is available for the user. The system monitor is used only for program development. In its final form, the instrument dispenses with the system monitor, leaving space for an additional 2K of ROM space for user's program.

4.3.2 **Level Crossing Detector (LCD)**

The purpose of LCD is to generate a TTL-compatible square wave signal with period equal to that of input signal $x(t)$ for frequency multiplication purposes. It is designed using a high speed comparator LM311 (G4) as shown in Fig. 4.4(a). When the positive going $x(t)$ crosses $V_{\text{ref}}$ set by potentiometer P1, G4 output goes LO marking the beginning of a period. The output goes HI when negative-going $x(t)$
Fig. 4.4(a) Circuit diagram of LCD

Fig. 4.4(b) Output waveform of LCD
crosses $V_{ref}$. This is illustrated in Fig. 4.4(b). A 50 mV hysteresis is introduced into the circuit through feedback resistors to prevent false triggering by noise during period measurement.

4.3.3 Frequency Multiplication Module (FMM)

Analog methods which perform frequency multiplication often employ phase-locked loop (PLL) techniques. However the response of PLL is slow and is not satisfactory for low frequency signals. Krishnamurthy et al. proposed an analog method using an analog integrator, a set of comparators and logic gates [60]. The principle of operation is explained with reference to Fig. 4.5 for a multiplication factor of 3.

The frequency multiplication system consists of a square wave generator $SG$, Integrator $I$, comparators $CR1$, $CR2$ and exclusive-OR gates EOR1 and EOR2, all connected as shown in Fig. 4.5. The triangular waveform at the output of $I$ is compared with preset reference voltages $VR1$ and $VR2$ equal to 1/3 and 2/3 respectively of the triangular wave amplitude in $CR1$ and $CR2$. The output waveforms corresponding to circuit blocks are given in Fig. 4.6. The outputs are then exclusive-ORed in the gate EOR1 and finally the output of gate EOR1 is exclusive-ORed with square wave $A$. The output waveform $F$ will have a frequency equal to 3 times the input frequency. Though the above system is designed for fixed
Fig. 4.6 Waveforms of the Frequency tripler circuit
frequency signals, it can be adapted to varying frequency by deriving the reference voltage to the comparators from the output of the Integrator I by using a peak detector [61].

The above principle can be extended to realize frequency multiplication by a factor n by using \((n-1)\) comparators with reference voltages of magnitude \(V/n, 2V/n, \ldots, (n-1)V/n\) where \(V\) represents the peak of triangular wave. The low frequency operation may be limited by the accuracy of the integrator. The accuracy of pulse positioning will be poorer than that for digital methods. Also the number of comparators increases with \(n\) and may not be attractive in terms of the number of ICs used when compared to digital methods. The frequency range of operation of this system is lower than that for digital methods.

There are also digital methods using rate multipliers but these usually do not generate equally-spaced pulses [62].

A digital technique to produce equally spaced pulses with fast response to changes in input frequency (one period) was first developed by Siemens and Kitai [9] and other systems using the same basic principle have since been reported [63, 65]. These methods are restricted to low frequency input signals by the maximum clocking rate for IC devices; for example, 6.6 kHz maximum at a clock frequency
of 55 MHz, and a multiplication factor of 64 (output pulse frequency: 363 kHz) using Schottky TTL. The output frequency is limited by the speed of TTL counter ICs. Due to the finite set up time for up/down mode control, independent up and down counters are used when operated with a 55 MHz clock. In the new method [21] the maximum output frequency is limited by the settling time of a D/A converter rather than by the speed of digital components. Only one up/down counter is used. In the system built, a low cost D/A converter was used having a 1 μsec settling time. This limits the maximum output pulse frequency to 660 kHz at a clock frequency of only 2 MHz. The maximum output pulse frequency would be increased five times, using the D/A converters with 200 ns settling time that are now available from leading manufacturers. Since the new method has evolved from the earlier one, we first describe the earlier method and its limitations.

Earlier Method

The basic principle of operation may be followed from Fig. 4.7 which includes three binary counters. Counters F and I are up-counters, while down-counter DC is presettable via its load input LD. Initially all three counters are cleared and the clock is connected to counters F and I in cascade. The clock generates pulses of period T where $T << T_S$ ($T_S$ is the period of the input signal). During the first
Fig. 4.7 Block diagram of Digital Frequency Multiplier - Earlier method

T₀ = (I)T ≈ Tₛ/2^k
cycle of the signal, the beginning of which is marked by a trigger signal from the 
LCD, the input period is determined in terms of accumulated clock pulse count in F and I. At 
the end of the first cycle DC is preset by loading the contents (I) of counter I; and switch S is operated so that 
the clock only feeds counter DC. When the counter contents 
(DC) reaches zero, an output pulse is generated which also 
reloads DC with (I). During this phase of operation, DC 
receives pulses at a rate which is $2^k$ times faster ($k$ being 
the number of binary stages of counter F) than the rate for 
counter I during the first cycle, so there are $2^k$ uniformly 
spaced pulses during time $T_S$. In Walsh spectral measurement 
applications, a second phase of duration $T_S$ suffices (i.e. 
the data window spans one cycle of the signal).

In processes of this basic type, an error arises because résidue (F) in the counter F is ignored. The 
output pulse period is $T_o = (I)T + (F)T/2^k$. The timing error between consecutive output pulses is

$$\frac{(F)T/2^k}{(I)T + (F)T/2^k}$$

which is roughly $-1/(I)$ maximum. This error is made small 
by increasing the minimum count $(I) = 2^{m-1}$ say, relative to 
the maximum possible value of $(F)$ viz. $2^k-1$. Table 4.3 
gives errors for different value of $m$, based on a 
multiplication factor of 64. Also the maximum input
Table 4.3 Timing Error of PMM

<table>
<thead>
<tr>
<th>k</th>
<th>m (bits)</th>
<th>Max. Timing error %</th>
<th>Minimum clock pulse count</th>
<th>Max input frequency for 10 MHz clock</th>
</tr>
</thead>
<tbody>
<tr>
<td>6</td>
<td>8</td>
<td>-0.763</td>
<td>8192</td>
<td>1221 Hz</td>
</tr>
<tr>
<td>6</td>
<td>10</td>
<td>-0.192</td>
<td>32768</td>
<td>305 Hz</td>
</tr>
<tr>
<td>6</td>
<td>12</td>
<td>-0.048</td>
<td>131072</td>
<td>76 Hz</td>
</tr>
</tbody>
</table>

frequency for a clock frequency of 10 MHz is given for each case.

The counter I has a latched underflow indicator at its mth bit to establish that \( I, \geq 2^{m-1} \) in any measurement. An overflow indicator is also provided. The maximum error in Eqn. (4.1) can be reduced to \( \pm \frac{1}{2^m} \) (magnitude halved, but error which may be positive or negative) by initializing counter F to half range (MSB of F=1) instead of zero [9].

An additional timing error is due to the finite clock frequency which introduces a maximum error \( -T \) in the period determination. The normalized maximum timing error is

\[
-\frac{1}{2^k (I) + (F)'}
\]

and is negligible in comparison with Eqn. (4.1).

The new system, a block diagram of which is shown in Fig. 4.8, uses the information contained in the counter F so as to obviate the error given by Eqn. (4.1). At the end of the first cycle the contents of the counter F and I give the
Fig. 4.8 Block diagram of Digital Frequency Multiplier - New method
period $T$ of the signal in terms of the number of clock pulses, as in the earlier systems. Division by $2^k$ yields an integer part $I$ and a fractional part $\psi = (B^2) / 2^k$. The process is such that successive output pulses are delayed relative to those for the earlier method by $\Delta, 2\Delta, 3\Delta, \ldots$ where $\Delta = \frac{1}{2} T$ and $0 < \psi < 1$. These successively-increasing delays are achieved by means of an accumulating adder shown as $A$ and ACC, a D/A converter and a linear pulse duration modulator (PDM). The up-counter $I$ and the down-counter DC of Fig. 4.7 are replaced by a presettable up-down counter $I$ and a latch $IL$ in Fig. 4.8 which serve the same purpose as before.

**Principle of Operation**

When initializing the instrument, switches $S_a$ and $S_b$ are in position 1 so that counters $F$ and $I$ are in cascade. The counter $I$ is in the up-count mode, and all digital storage devices are cleared. Before proceeding, we note that if $(F)$ happens to be zero at the end of the first cycle, there should be a zero delay in PDM. It is not practical to make a monostable circuit whose delay commences at zero, so the PDM circuit is designed to provide a delay $T$ for $(F) = 0$ and slightly less than $2T$ i.e. $(2-1/2^k)T$ when $(F)$ is maximum. The circuit uses delay $\Delta = (1+\psi)T$, and to compensate for this, the PDM circuit is triggered when counter $I$ reaches one instead of zero in its
down count mode. The one state of the counter I is detected by D₁ in Fig. 4.8. In the pulse delay process, a carry CY may be produced by adder A. When a CY occurs, the counter is allowed to count down to zero as detected by D₀; this corresponds to a delay of the triggering of the PDM by T secs, as described shortly.

The first phase, of duration Tₛ, is no different from that of the system in Fig. 4.7. At its termination switches S₁ and S₂ are changed to position 2, the contents (I)ₚ of I are latched in IL and counter I is changed to count down. At the beginning of the second phase, the output of adder A is (F). The digital output of A is converted into an analog voltage by the D/A converter, and applied to the PDM circuit. Counter I proceeds to count down and on reaching one, the PDM circuit is triggered to produce the first output pulse, the PDM delay being \( \tau_1 = [1+\tau]T \). Accumulator ACC is clocked so that (A) changes to 2(F), corresponding to a PDM delay \( \tau_2 = [1+2\tau]T \) and so on. Overflow of adder A first occurs at the rth output pulse which satisfies the condition \( \tau_1 = [1+\tau]T \geq 2T \); then (A) falls to r⁻¹, the carry latch CY is set, and D₀ is enabled so that the trigger pulse to the PDM is delayed by one clock pulse. The process of producing uniformly-timed output pulses continues in this way during the second cycle of the input signal; adder A overflows from time to time and the trigger pulse to PDM is
delayed by time interval $T$ whenever it does. The D/A converter output during this cycle is a staircase voltage which falls in value whenever $CY = 1$.

Regardless of the state of $CY$, the NOR gate in the detector circuit is low for a time duration $T$. The trailing edge of this output pulse is used to reload the counter $I$, so although the PDM circuit is triggered at $(I) = 1$ when $CY = 0$, reload takes place when $(I)$ falls to zero. Likewise, when $CY = 1$, the reload takes place when $(I)$ falls to $-1$ (two's complement).

**Timing of Output Pulses**

The duration of the $r$th PDM output pulse is given by

$$
\Delta_r = \left(1 + \left[\frac{r}{4}\right] \mod 1\right) T
$$

(4.3)

Output pulses occur at times $t_1$, $t_2$, etc. from the instant at which the second cycle commences:

$$
t_1 = [(I)-1]T + \Delta_1
$$

$$
t_2 = [(I) + (I)-1]T + \Delta_2 = [2(I)-1]T + \Delta_2
$$

etc. Also with every overflow of the adder, there is a cumulative delay $T$ in the output pulse timing, so that in general, the time of occurrence of the $r$th output pulse is

$$
t_r = \left[\frac{r(I) - 1}{4} + CY_r\right] T + \Delta_r
$$

(4.4)
where \( r \sum \bar{C}_Y \) is the sum of carries up to and including \( r \).

From Eqns. (4.3) and (4.4) the interval between any two successive output pulses is,

\[
T_r - T_{r-1} = (I)T + \nu T = (I)T + (F)T/2^k
\]

(4.5)

which is exactly the desired interval. The expressions (4.3) to (4.5) ignore the time quantization errors in the measurement of the period \( T_S \) of the signal in the first phase due to the finite clock frequency, and errors due to nonlinearity and imperfect adjustment of the PDM circuit. It is seen that \( I \) fills the role of a coarse digital timer, while \( F \) is used for fine timing adjustment. Small errors in the D/A converter or in the PDM circuit disturb the output pulse positions slightly, but they do not give rise to cumulative errors in output pulse timing.

**Example**

To illustrate the functioning of the circuit, the following are assumed:

- Clock frequency 100 KHz \((T = 10 \mu s)\)
- Input signal frequency 59.6 Hz \((T_S = 16.7785 \, \text{ms})\)
- Multiplication factor \(2^k = 64\) \((F \text{ has } 6 \text{ bits})\)

The required PDM characteristic for the above clocking rate is shown in Fig. 4.9. A unit increment in (A) gives rise to a 5/32 \( \mu \text{s} \) increment in output pulse duration, and the maximum pulse duration is 10(2^k)/64 \( \mu \text{s} \) < 20 \( \mu \text{s} \).
Fig. 4.9 PDM characteristics for a clock frequency of 100 KHz
In this example, $T_S/T = 1677.85$ and since the counters ignore fractions, their contents at the end of the first cycle are $(I) = 26$, $(P) = 13$, so that $\psi = 13/64$. The ideal output pulse period is $T_S/64 = 262.16$ $\mu$s. Table 4.4 lists the output (A) of the adder and carry CY, the PDM output pulse duration $\Delta_T$, and the times of occurrence $t_r$ of the output pulses, using Eqn. (4.4). The first eleven pulses suffice. The fluctuations of (A) and the occurrence of the carry are evident. The intervals between successive output pulses are constant at 262.03 $\mu$s, being low by about 0.05% due to the clock quantization.

<table>
<thead>
<tr>
<th>Output Pulse No.</th>
<th>(A)</th>
<th>CY</th>
<th>$\Delta_T$ (\mu s)</th>
<th>$t_r$ (\mu s)</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td>13</td>
<td>0</td>
<td>12.03</td>
<td>262.03</td>
</tr>
<tr>
<td>2</td>
<td>26</td>
<td>0</td>
<td>14.06</td>
<td>524.06</td>
</tr>
<tr>
<td>3</td>
<td>39</td>
<td>0</td>
<td>16.09</td>
<td>786.09</td>
</tr>
<tr>
<td>4</td>
<td>52</td>
<td>0</td>
<td>18.13</td>
<td>1048.13</td>
</tr>
<tr>
<td>5</td>
<td>1</td>
<td>1</td>
<td>10.16</td>
<td>1310.16</td>
</tr>
<tr>
<td>6</td>
<td>14</td>
<td>0</td>
<td>12.19</td>
<td>1572.19</td>
</tr>
<tr>
<td>7</td>
<td>27</td>
<td>0</td>
<td>14.22</td>
<td>1834.22</td>
</tr>
<tr>
<td>8</td>
<td>40</td>
<td>0</td>
<td>16.25</td>
<td>2096.25</td>
</tr>
<tr>
<td>9</td>
<td>53</td>
<td>0</td>
<td>18.28</td>
<td>2358.28</td>
</tr>
<tr>
<td>10</td>
<td>2</td>
<td>1</td>
<td>10.31</td>
<td>2620.31</td>
</tr>
<tr>
<td>11</td>
<td>15</td>
<td>0</td>
<td>12.34</td>
<td>2882.34</td>
</tr>
</tbody>
</table>
Maximum Speed

The maximum output frequency of the system is determined by the settling time of the D/A converter and the recovery time of the PDM circuit. The maximum duration of the PDM circuit output pulse is slightly less than 2T, so if this circuit recovers in a time that is less than T, the minimum value of (I) is 3. This assumes that the D/A converter settles within 2T. In the system constructed the settling time of the D/A converter determines the maximum clock frequency. A D/A converter type 7520 with an operational amplifier type 350J was used. A clock frequency of 2 MHz produces a maximum pulse output frequency of 660 KHz. For k = 6, the corresponding maximum input frequency is 10.3 KHz.

The PDM circuit used provides an output pulse whose duration is independent of analog input voltage variations during the triggered state. This is used to advantage by updating the D/A converter during the triggered state. Figure 4.8 shows that the PDM trigger also clocks the accumulator. Carry CY is prevented from changing during early D/A output updating since clocking of the CY latch takes place only when an output pulse is generated.

Input Signal Frequency Range

The minimum count (I) and the number of binary stages in (I) determines the frequency range covered. Thus if I and
its associated modules in the system are 12 bits long, a frequency range of 1365 to 1 in input signal is accommodated.

**Timing Errors**

Timing errors between any two consecutive pulses are due to the following:

a) Error due to the finite clock rate, given by Eqn. (4.2). For \( I = 26 \), as in the example, the maximum error is 0.058% at \( F = 0 \). The maximum error at \( I = 3 \) is 0.52%, which compares with the error obtained with the system described in [9] for minimum \( I = 511 \).

b) The spacings between consecutive output pulses vary because of nonlinearity of the D/A converter and of the PDM characteristic. Let these, combined, give a time error \( \delta \) in PDM pulse duration. The maximum pulse timing error, normalised with respect to \( T_0 \), is \( \delta/(I)T \) and is usually negligible compared with the clock quantization error.

### 4.3.4 Frequency Multiplication System: Circuit Description:

**Functional Organization**

For descriptive purposes, the frequency multiplication system circuitry can be functionally divided into five blocks as follows:
(1) Control circuitry
(2) Fractional counter F and accumulator ACC
(3) Integral counter I
(4) Digitally controlled Pulse duration Modulator
(5) Trigger and load pulse generator.

The circuitry details of each block are described in the following sections.

**Control Circuitry**

The circuit diagram is shown in Fig. 4.10. It responds to signals from a control panel or from a microcomputer to which this system could be interfaced and starts the pulse multiplication process. The Control Signals are TTL Compatible.

2.1 **START**: The negative going edge of the pulse sets flip-flop G5 and enables "Trigger from level crossing of input signal" to Status Counter SC input through NAND gate G6 and inverter G7.

2.2 "Trigger from level crossing of input signal $x(t)$": is derived from the input signal whose positive going edge marks the beginning of a cycle of the input signal $x(t)$. This signal, gated by the output of flip-flop G5 activates counter SC.

Control circuitry also incorporates a data multiplexer (MPXR). The counter SC is 2 bits long and is cleared initially so that its two outputs PHASE1 and PHASE2 are both
Fig. 4.10 Control circuitry of FMM
LO. The first trigger from level crossing of input $x(t)$ changes PHASE1 output to HI and marks the first phase of frequency multiplication process i.e. determination of the period of input signal $x(t)$. The second trigger changes PHASE2 to HI and (PHASE1)(PHASE2) represents the second phase of the multiplication process.

The PHASE2 signal is the selector input to MPXR consisting of TTL quadraple 2 to 1 Data Selector 74157 (G9); it represents switches $S_a$ and $S_b$ shown in the block diagram Fig. 4.8. A LO at input S corresponds to position 1 of $S_a$ and $S_b$ in Fig. 4.8. MPXR in this position gates A inputs to Y outputs. HI on S corresponds to position 2 of $S_a$ and $S_b$. MPXR gates B inputs to Y outputs in the position 2. The four pairs of input signals are:

a) system clock CP gated by PHASE1 signal is applied to 1A input; 1B input is held HI. 1Y output CP1 forms clock pulses to counter forming least two significant bits of fractional counter F.

b) system clock CP gated by PHASE1 signal is applied to 2A input; 2B input is held HI. 2A input is logical complement of 1A input. The corresponding 2Y output CP2 sends clock pulses to counter forming four most significant bits of the fractional counter F.

c) The carry output of the fractional counter F followed by inversion is applied to 3A input; 3B input is
held HI. The corresponding 3Y output CP3 delivers clock pulses to integral counter I in up-count mode.

d) 4A input is held HI; system clock CP is connected to 4B input. The corresponding 4Y output supplies clock pulses to the integral counter in down-count mode.

**Fractional Counter F and Accumulator ACC:**

The circuit diagram is shown in Fig. 4.11. Counter F is 6 bits long and consists of two cascaded binary counters 7476 and 74161. Carry CP13, generated when the 74161 counter overflows, drives the integral counter I in the up-count mode.

Accumulator ACC consists of a 6 bit TTL latch 74174 and serves as a register to store the output of 6 bit adder A. The accumulator is loaded with the output of adder A by a trigger pulse as explained later. The digital output of the adder (SB0-SB5) is the sum of the contents of the fractional counter (F) and the contents of ACC. Adder A consists of a 7482 and a 7483 in cascade.

**Integral Counter I**

The circuit diagram of an 8 bit integral counter I is shown in Fig. 4.12; additional cascaded stages may be added if desired. Countup pulses to I are the carry output pulses of the fractional counter F and count-down pulses are the clock pulses CP. The contents (I) of the counter I at the beginning of second period of input signal x(t) are saved in
Fig. 4.11 Fractional counter and accumulator
an 8212 8-bit latch IL. PHASE2 signal which marks the
beginning of the second period is used as clock to latch (I)
into IL.

The 00000000 and 00000001 states of the counter I are
decoded as negative going signals designated as ZERO and ONE
respectively by using Hex inverters 7404 and 8 input NAND
gates 7430. These signals are combined with the carry of
adder A to generate the trigger pulse to PDM and the load
pulse to the counter I.

**Digitally Controlled Pulse Duration Modulator (PDM):**

The PDM is required to deliver output pulses of
duration varying from 1 to 2 clock periods. The pulse
duration is linearly related to input voltage. Referring to
Fig. 4.13, the PDM consists of D/A converter followed by
analog pulse duration modulator.

**D/A Converter**

A type AD 7520 10 bit multiplying type converter with
current output is used. The reference voltage is -3.7 volts
derived from -5 volts by zener diode D1 and resistor R1.
The current output of the D/A is converted to voltage by a
BB3507J summing operational amplifier AMP [66]. A dc offset
voltage from potentiometer P1 is also applied to the summing
amplifier through a 10K input resistor R2. The summing
amplifier output is applied to the input of the analog pulse
duration modulator.
Analog Pulse Duration Modulator

The pulse duration modulator uses a TTL monostable multivibrator 74121. Its operation is explained by referring to Fig. 4.13. In the quiescent state the output at \( C_{\text{ext}} \) is HI (typically 4.8V). Diode D2 is reverse biased, provided that \( V_{\text{in}} \) is less than the voltage at the \( C_{\text{ext}} \) pin. Timing capacitor C is charged to a value \( (V_{\text{in}} - V_{\text{off}}) \) where \( V_{\text{off}} \) is the voltage at the \( R_{\text{ext}} / C_{\text{ext}} \) pin in the quiescent state (typically 0.7 V). When triggered by an input pulse, the voltage \( V_d \) at \( C_{\text{ext}} \) goes LO (0.7 V) and the voltage at \( R_{\text{ext}} / C_{\text{ext}} \) goes negative with respect to ground. Capacitor C charges linearly, the constant charging current I being adjustable by means of potentiometer P2. When the voltage at \( R_{\text{ext}} / C_{\text{ext}} \) reaches the threshold value \( V_{\text{th}} \) (typically 0.7 V), the monostable reverts back to the quiescent state. The pulse duration \( T_D \) is given by

\[
T_D = CV_{\text{in}} - C (V_{\text{off}} + V_d - V_{\text{th}})
\]

The second term is cancelled by offsetting \( V_{\text{in}} \) in the quiescent state.

Once the PDM circuit is triggered, the duration of the output pulse is not affected by a change in \( V_{\text{in}} \). This characteristic of the circuit is used to advantage in providing an early updating of the D/A converter output, which now settles while the PDM is in the triggered state.

The measured voltage vs pulse duration characteris-
tics obtained for the PDM with $C = 470$ pf, $R3 = 500$ ohms and $I = 0.760$ MA is shown in Fig. 4.14.

Triggering and Load Control

As explained previously, the PDM is activated during each output period at one of the following instants,

(a) $CY = 0$ and $(I) = 1$

(b) $CY = 1$ and $(I) = 0$

The trigger pulses commence at the second period of the input signal $x(t)$. This is realized by using combinational logic NAND and NOR gates as shown in Fig. 4.15.

The load pulse to the integral counter lags behind trigger pulse by one clock period $T$. Once the PDM circuit is triggered, its output pulse is not affected by a change in its analog input voltage. This permits early updating of the D/A converter by the leading edge of the PDM trigger pulse. Referring to Fig. 4.9, it is also used to update the accumulator ACC, rather than by using the output pulse. Early updating will, on occasion, affect the carry from adder $A$. A D-flip-flop is therefore included; it is clocked by the output pulse, and provides the trigger control signals $CY$ and $CY$. This makes the width of the trigger pulse to be $T$. The load pulse to counter $I$ is derived by triggering a monostable multivibrator from the trailing edge of the trigger pulse. Separate decoding circuits for the triggering and load pulse are avoided.
Fig. 4.14 Pulse width vs control voltage characteristics
FIG. 4.15 Triggering and load control
4.3.5 A/D Converter

The analog input signal x(t) is quantized by A/D converter ADC 82AG (successive approximation type) in 2.8 μsec. It accepts bipolar signals within ±5 volts range and provides an 8 bit digital output in complementary offset binary/two's complement representation. The circuit diagram is shown in Fig. 4.16(a).

The output pulses of FMM form convert commands. A conversion begins with the negative going of a command pulse. Simultaneously the status output ST goes HI. Completion of a conversion is marked by ST returning to LO; a write signal is generated at this instant to write the output of A/D in the System 80/10 memory. The timing diagram in Fig. 4.16(b) illustrates the above sequence of events. The output of A/D is interfaced to the System 80/10 data bus through an eight-bit bipolar 8212 input/output port. The mode of operation is determined by the "DENB" signal as seen below, which is derived by combining DACK1 and A/DR/ of DMA controller.

<table>
<thead>
<tr>
<th>DENB</th>
<th>STB</th>
<th>DS1,DS2</th>
<th>MD</th>
<th>output</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>tri-state</td>
</tr>
<tr>
<td>0</td>
<td>0</td>
<td>1</td>
<td>1</td>
<td>DATA in</td>
</tr>
</tbody>
</table>
Fig. 4.16(a) Circuit diagram of A/D converter

Fig. 4.16(b) A/D converter Timing diagram
The ±15V supplies required for A/D are derived from +5V using a type 546 Dual Power supply module [67].

4.3.6 DMA

The DMA function is to generate, upon request, memory addresses for the A/D data to be deposited directly in the System 80/10 RAM. The Programmable DMA controller 8257 shown in Fig. 4.17(a) has four channels [68]. It operates in three modes; (1) DMA read, which causes data to be transferred from memory to a peripheral, (2) DMA write, which causes data to be transferred from a peripheral to memory, and (3) DMA verify, which does not actually involve the transfer of data. In the instrumentation built for WSA only channel 1 in mode 2 is used. Each channel includes two sixteen-bit registers; (1) DMA address register, (2) Terminal count register. The former is loaded with the address of the first memory location to be accessed (memory location 0400 Hexadecimal is used as the starting address). The value loaded into the low-order 14 bits of the terminal count register specifies the number of DMA cycles minus one before terminal count TC output is activated. The two most significant bits of the terminal count register specify the type of DMA operation as follows:
<table>
<thead>
<tr>
<th>Bit 15</th>
<th>Bit 14</th>
<th>Type of DMA operation</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>0</td>
<td>verify DMA cycle</td>
</tr>
<tr>
<td>0</td>
<td>1</td>
<td>write DMA cycle</td>
</tr>
<tr>
<td>1</td>
<td>0</td>
<td>read DMA cycle</td>
</tr>
<tr>
<td>1</td>
<td>1</td>
<td>illegal</td>
</tr>
</tbody>
</table>

The 8257 registers are programmed by the system 80/10 using I/O instructions, and are configured as I/O devices with the four MSBs of address data as 0000 (1111 on the system bus); these four bits are decoded by NAND gate G34 to provide a chip select (CS) signal to the 8257 as shown in Fig. 4.17(a). The DMA request DRQ1 is generated through software via the output port EB. Upon receiving a DMA transfer request, the 8257

(1) acquires control of system 80/10 bus through hold request (HRQ) output. HRQ, after inversion, is connected to BPRN/ line of the system 80/10 Bus.

(2) acknowledges the DMA request and DACK1 output goes LO (DMA request on channel 1 is assumed). The first negative going transition of DACK1 is used to start the frequency multiplication process. It is also used to enable the A/D output to the system bus.

(3) outputs the least significant eight bits of memory address on its output lines A₀ to A₇. These outputs are operated by the EXCLUSIVE-OR gates G37 to modify the address bits according to
\begin{align*}
B_i &= A_i + A_{i+1} \quad i = 0, 1, \ldots, 4 \\
B_5 &= A_5
\end{align*}

These are connected to 8212 I/O port G39 in bit reversed order. This is exactly the operation involved in implementing Eqn. (3.45) in hardware. The inverters (G38) are used to make the address outputs compatible with the system 80/10 address bus. Note that bit DI_1 is held permanently at logic 0 and six output bits B_0 - B_5 are used to derive ADR1/-ADR6/. This means that alternative locations are skipped in depositing the A/D data in memory. The reason for doing this is to reformat the data as a 16-bit word through software, since computations are performed on 16-bit data. In the case of A/D with more than 8 bits, bit DI_1 of 8212 may be used to point to the next location to deposit the MSB of data, by complementing it before the 8257 issues the next DMA cycle. The most significant 8 bits are latched at I/O port 8212 by AEN and ADSTB signals. The sequence of events after a DMA request in DMA write mode is illustrated in Fig. 4.18. Note that A/DR/ output of 8257 enabling the A/D output to the data bus and write pulse MEMW are generated with a delay of one and two clock cycles from ADSTB pulse respectively. It is possible that an A/D conversion might not have been completed within this time. To avoid the transfer of incorrect data, memory write pulse MWTC/ is generated using the monostable multivibrator G44.
Fig. 4.17(b) Address data interface between 8257 and System 80/10 in programming mode
Fig. 4.18 Timing diagram of DMA operation
and flip-flop G45 triggered by the negative going edge of STATUS output of A/D as shown in Fig. 4.17(b). This prolongs the DMA write cycle by not returning an active READY signal by the system 80/10 memory.

During the programming mode, the 4 LSB address bits are transmitted from the system 80/10 bus to the 8257 A₀–A₃ inputs through 8212 I/O Port (G43) as shown in Fig. 4.17(b). I/O Ports G39 and G40 are in high impedance state while G43 is enabled during the programming mode of the 8257.

4.3.7 Control

The use of off-the-shelf general purpose peripheral ICs, tri-state I/O ports 8212, and programmable DMA controller 8257 results in a compact control structure. The control reduces to that of initiating and terminating a DMA request. This is implemented using a 7474 flip-flop (G48) as shown in Fig. 4.19. It is set by a BEGIN signal generated through software by the system 80/10. At the completion of required number of DMA cycles (64 samples in the system developed), TC output from the 8257 goes HI and resets G45, thereby terminating further DMA requests.

A provision is made through a two position switch to use either the input signal to be analysed or external marker pulses for frequency multiplication purposes. The latter mode permits the sequency analysis of non-periodic
Fig. 4.19 Control circuit
signals by computing the sequency coefficients of a signal defined within the period set by two marker pulses.

4.3.8 Output Display

Attention to the display of final measured quantities is important for the effective use of any instrument. In the case of Walsh spectral measurements, the coefficient values and a plot of amplitude vs sequency are desired. In Fourier analysis, the outputs of interest are numerical values of Fourier components, a plot of Magnitude vs Frequency and Phase vs Frequency. These capabilities are added by interfacing to an intelligent BASIC graphic terminal 2647A. (The cost of a terminal of this capability is much higher than the instrument itself.) The TERMINAL BASIC contains most of the standard BASIC statements together with statements that will allow a program to monitor and control terminal operations. This allows a program running in the 2647A to interact with peripheral devices (the Walsh Spectral Analyser in the present case). Detailed operation of the terminal is given in references [59,69].
CHAPTER 5
ON WALSH TO FOURIER CONVERSION

5.1 Introduction

Walsh-to-Fourier Conversion was studied by Blachman [70,71], Abramson [72], Siemens and Kitai [22,73] and recently by Tadokoro and Higuchi [23]. Signals can be classified into four spectral categories:

1. infinite Walsh series with infinite Fourier Series
2. finite Walsh series with finite Fourier Series
3. finite Walsh series with infinite Fourier Series
4. infinite Walsh series with finite Fourier Series

The last category is of particular interest. It has been shown that for a bandlimited signal with the highest normalised frequency component (harmonic) \( F \), all the \( F \) Fourier components can be determined from a finite number of Walsh coefficients with the sequency up to \( S \) without error, provided \( S \geq F \) [22,73,74,75,76]. The Discrete Fourier Transform of a given sampled data sequence is widely employed to compute the Fourier coefficients of a signal. The DFT is implemented on general or special purpose computers using FFT techniques. Given \( N \) data samples, one
can obtain $N$ Walsh coefficients using Fast Walsh transform algorithms as discussed earlier and $N$ Fourier coefficients using FFT algorithms. Tadokoro and Higuchi [23] have formulated an algorithm to obtain $N$ discrete Fourier coefficients from $N$ discrete Walsh coefficients (a coefficient obtained using sampled values is referred to as a discrete coefficient) and it differs from the method used by Siemens and Kitai. Either way, obtaining the Fourier coefficients through the Walsh coefficients reduces the number of multiplications when compared to the Cooley-Tukey FFT algorithm for a data length up to 64. The fast nature of this process was pursued by Horton for digital impedance relaying of power lines [77]. Also when the desired number $L$ of Fourier components is small compared to the input data length $N$, this conversion process is faster. (This feature is beneficial in narrow band signal analysis.) A comparison of the Walsh to Fourier transformation process with the FFT method with respect to the number of multiplications, storage requirements and the finite word length effects is now considered.

5.2 Walsh to Fourier Conversion Process due to Siemens and Kitai [22]

This process is illustrated in Fig. 5.1. Using the symmetry properties of the sine/cosine functions and cal/sal
Fig. 5.1 Walsh to Fourier Transformation Process
functions, it can be shown that $a_k$ terms of the Fourier expansion are functions of only $A_s$ terms of the corresponding Walsh expansions. Similarly $b_k$ terms depend only on $B_s$ terms.

The conversion process to obtain $b_k$ terms from $B_s$ terms is discussed below. A similar method applies to $a_k$ terms. Equating the odd parts of the Fourier and Walsh series representation (Eqns. (2.10) and 2.13) we get,

$$
\sum_{k=1}^{1} b_k \sin 2\pi kt = \sum_{s=1}^{2} B_s \text{sal}(s,t)
$$

(5.1)

$I$ is taken as unity henceforth. Multiplying both sides by $\sin 2\pi ft$ and integrating over unit interval, we get

$$
b_f = \sum_{s=1}^{2} b_{f,s} B_s; \quad f = \{1, 2, \ldots, \infty\}
$$

(5.2)

where $b_{f,s}$ is the $f$th sine coefficient of the $s$th sal function and is given by

$$
b_{f,s} = \frac{1}{2} \int_{0}^{1} \sin 2\pi ft \text{sal}(s,t) \, dt
$$

similarly $a_f$ terms are given by

$$
a_f = \sum_{s=1}^{2} a_{f,s} A_s; \quad f = \{1, 2, \ldots, \infty\}
$$

(5.3)

where $a_{f,s}$ is the $f$th cosine coefficient of the $s$th cal
function. In matrix form, Eqn. (5.2) can be written as,

$$[b] = [F_b] [B]$$  \hspace{1cm} (5.4)

where $[b]$ is the sine coefficient matrix
$[F_b]$ is the conversion matrix
$[B]$ is the Walsh (sal) coefficient matrix.

For a band limited signal $x_b(t)$

$$x_b(t) = \sum_{k=0}^{F} a_k \cos 2\pi kt + \sum_{k=0}^{F} b_k \sin 2\pi kt$$  \hspace{1cm} (5.5)

where $F$ is the highest frequency component of the input signal.

The $s$th sal coefficient of $x_b$ is then,

$$B_s = \int_0^1 x_b(t) \cdot \text{sal}(s,t) \, dt$$  \hspace{1cm} (5.6)

$$= \int_0^1 \left[ \sum_{k=1}^{F} b_k \sin 2\pi kt \right] \cdot \text{sal}(s,t) \, dt$$  \hspace{1cm} (5.7)

$$= \sum_{k=1}^{F} d_s,k \cdot b_k$$  \hspace{1cm} (5.8)

where $d_{s,k}$ is the $s$th sal coefficient of $\sin 2\pi kt$. In matrix form, Eqn. (5.8) can be written as,

$$[B] = [D] [b]$$  \hspace{1cm} (5.9)

where $[B]$ is a $(S \times 1)$ sal coefficient matrix
$[D]$ is the $(S \times F)$ conversion matrix
$S$ = a finite number.
\[ [b] = (F \times 1) \text{ sine coefficient matrix}, \]

One can solve for the set \( \{b_f\}_{f=1}^{F} \) by a system of \( F \) linearly independent equations for \( S > F \). Since

\[
d_{s,k} = \frac{1}{2} \int_{0}^{1} \sin(2\pi k t) \, \text{sal}(s,t) \, dt = \frac{1}{2} b_{k,s}
\]

\[
[D] = \frac{1}{2} \left[ F_b \right]^T
\]

(5.10)

From Eqn. (5.9), the sine coefficient matrix \( b \) is given by

\[
[b] = [K^{-1}] \left[ F_b \right] [B]
\]

(5.11)

where

\[
[K] = \frac{1}{2} \left[ F_b \right] [F_b]^T
\]

Using the orthogonal property of Walsh functions, Siemens proved that \( [K]^{-1} \) is a non-singular diagonal matrix [73] whose elements are given by

\[
k_{f,f} = \text{sinc}^{-2} \left( \frac{f}{2^m} \right); \quad f = \{1, 2, \ldots, 2^m - 1\}
\]

\[
= \text{sinc}^{-2} \left( \frac{f}{2^m} \right) \quad [m] = 2^m - 1 = F
\]

(5.12)

As \( N = \sum_{m=0}^{\infty} 2^m \to \infty \), \( k_{f,f} \) approaches unity. The rows of the matrix \( F_b \) represent the Walsh (sal) expansions of the corresponding sine functions. For example, the third row of \( F_b \) gives the Walsh expansion of \( \sin 2\pi(3t) \):

\[
\sin 6\pi t = \frac{4}{3\pi} \text{sal}(1,t) + 1.025 \text{sal}(3,t) + \ldots
\]

(5.13)
Similarly the columns of the matrix $F_B$ represent the Fourier expansions of the sal functions $\{\text{sal}(s,t)\}_{s=1,2,\ldots,F}$. For example, the first column of $F_B$ represents the Fourier coefficients of $\text{sal}(1,t)$:

$$\text{sal}(1,t) = \frac{4}{\pi} \sin 2\pi t - \frac{4}{3\pi} \sin 2\pi(3t) + \ldots \quad (5.14)$$

The elements $a_{f,s}$, $b_{f,s}$ can be computed using a non-recursive equation for the Fourier transform of a Walsh function [78]. The pattern of non-zero unique elements in $F_B$ is important in the Walsh-Fourier conversion process and is discussed below. If $s$ is odd, the sal function $\text{sal}(s,t)$ have non-zero Fourier coefficients for odd-numbered harmonics only; that is $b_{2k-1,s}$ only are non-zero. $\text{sal}(2s,t)$ can be considered as a wave $\text{sal}(s,t)$ with a time base that has been halved. Consequently, $\text{sal}(2s,t)$ has non-zero coefficients whose harmonics numbers are twice those of $\text{sal}(s,t)$; that is $b_{2}, b_{2s}, b_{6}, b_{2s} \ldots b_{2}(2k-1), 2s$ are non-zero elements. In general, the non-zero coefficients of $\text{sal}(2^k s, t)$ are $b_{2^k}, b_{2^k s}, b_{3 \cdot 2^k}, b_{2^k s}, \ldots, b_{2^k (2^k - 1)}, b_{2^k s}$, where $s$ is odd. In view of this property, the matrix $F_B$ of dimension $2^{m-1} \times 2^{m-1}$, has $2^{m-2}$ independent columns elements; any other column of elements is a subset of some odd-numbered column elements.

The $F_B$ matrix for $F = 8$ is given below.
<table>
<thead>
<tr>
<th></th>
<th>1</th>
<th>2</th>
<th>3</th>
<th>4</th>
<th>5</th>
<th>6</th>
<th>7</th>
<th>8</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td>1.27324</td>
<td>-0.52739</td>
<td>-0.10490</td>
<td>-0.2532</td>
<td>-</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>2</td>
<td>-1.21324</td>
<td>-</td>
<td>-</td>
<td>-</td>
<td>-0.52739</td>
<td>-</td>
<td></td>
<td></td>
</tr>
<tr>
<td>3</td>
<td>0.42441</td>
<td>1.024624</td>
<td>-0.68463</td>
<td>-0.283584</td>
<td>-</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>4</td>
<td>-</td>
<td>-</td>
<td>1.27324</td>
<td>-</td>
<td>-</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>5</td>
<td>0.25464</td>
<td>0.614774</td>
<td>-0.920075</td>
<td>-0.38110</td>
<td>-</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>6</td>
<td>-0.424413</td>
<td>-</td>
<td>-</td>
<td>1.02462</td>
<td>-</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>7</td>
<td>0.18189</td>
<td>-0.07534</td>
<td>0.37876</td>
<td>0.9144</td>
<td>-</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>8</td>
<td>-</td>
<td>-</td>
<td>-</td>
<td>-</td>
<td>1.27324</td>
<td></td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

denotes zero elements

5.3 Walsh to Fourier Conversion Method of Tadokoro and Higuchi [23]

The method described above starts with band limited signals. The Walsh and Fourier coefficients used are the analog inner products of the input signal with the basis functions according to Eqn. (2.11, 2.12, 2.14, 2.15).

The method due to Tadokoro and Higuchi can be considered as a discrete version of that due to Siemens and Kitai. Consider a data sequence $X(0)$, ..., $X(N-1)$ of length $N$ obtained by uniform sampling of an analog signal. The discrete Fourier transform of the data sequence is given by

$$ F(k) = \frac{1}{N} \sum_{i=0}^{N-1} X(i) e^{-j2\pi ki/N}, \quad k = 0, 1, \ldots, N-1 $$

where $w = e^{-j2\pi/N}$.
Given the $N$ DFT coefficients, the corresponding data sequence $X(0), \ldots, X(N-1)$ is obtained by the inverse transform

$$X(i) = \sum_{k=0}^{N-1} F(k) w^{-ik}, \quad i = 0, 1, \ldots, N-1$$

(5.16)

If the data sequence is real, then $F(k)$ satisfies the following symmetry conditions [24]

$$\text{Re}[F(k)] = \hat{a}_k = \text{Re}[F(N-k)]$$

(5.17a)

$$\text{Im}[F(k)] = \hat{b}_k = -\text{Im}[F(N-k)]$$

(5.17b)

(The notation $(\cdot)$ is used to differentiate the coefficients of the DFT from those obtained as the analog inner product of the input signal and the basis functions used by Siemens and Kitai.)

Using the symmetry relations of Eqns. (5.17a) and (5.17b), the data sequence can be represented by,

$$X(i) = \sum_{k=0}^{(N/2)-1} 2\hat{a}_k \cos \frac{2\pi ki}{N} + \sum_{k=1}^{N/2} 2\hat{b}_k \sin \frac{2\pi ki}{N}$$

(5.18)

The discrete Walsh transform of the given data sequence is given by

$$B(k) = \frac{1}{N} \sum_{i=0}^{N-1} X(i) \text{ wal}(k,i), \quad k = 0, 1, \ldots, N-1$$

(5.19)

and the inverse transform is given by

$$X(i) = \sum_{k=0}^{N-1} B(k) \text{ wal}(k,i), \quad i = 0, 1, \ldots, N-1$$

(5.20)
In terms of cal and sal functions, Eqn. (5.20) can be written as

\[ X(i) = \sum_{s=0}^{N/2-1} A_s \text{cal}(s,i) + \sum_{s=1}^{N/2} B_s \text{sal}(s,i) \]  

(5.21)

The Fourier component \( \hat{b}_k \) is computed according to

\[ \hat{b}_k = \frac{1}{N} \sum_{i=0}^{N-1} X(i) \sin \frac{2\pi ki}{N} \]  

(5.22)

Substituting for \( X(i) \) from Eqn. (5.21) in Eqn. (5.22) we get

\[ \hat{b}_k = \frac{1}{N} \sum_{i=0}^{N-1} \left( \sum_{s=0}^{N/2-1} A_s \text{cal}(s,i) + \sum_{s=1}^{N/2} B_s \text{sal}(s,i) \right) \sin \frac{2\pi ki}{N} \]  

(5.23)

Interchanging the order of summation gives

\[ \hat{b}_k = \sum_{s=0}^{N/2-1} A_s \sum_{i=0}^{N-1} \frac{1}{N} \text{cal}(s,i) \sin \frac{2\pi ki}{N} \]  

\[ + \sum_{s=1}^{N/2} B_s \sum_{i=0}^{N-1} \frac{1}{N} \text{sal}(s,i) \sin \frac{2\pi ki}{N} \]  

(5.24)

The sum \( \frac{1}{N} \sum_{i=0}^{N-1} \text{cal}(s,i) \sin \frac{2\pi ki}{N} \) is the sth cal coefficient of \( \sin 2\pi ik/N \). Similarly, \( \frac{1}{N} \sum_{i=0}^{N-1} \text{sal}(s,i) \sin \frac{2\pi ki}{N} \) is the sth sal coefficient of \( \sin 2\pi ik/N \). In the case of analog integration, due to the odd and even symmetry of cal/sal functions, we have the following relations:
\[
\frac{1}{2} \int_{-\frac{1}{2}}^{\frac{1}{2}} c(s,t) \sin 2\pi kt \, dt = 0 \quad (5.25)
\]
\[
\frac{1}{2} \int_{-\frac{1}{2}}^{\frac{1}{2}} c(s,t) \cos 2\pi kt \, dt = 0 \quad (5.26)
\]

This simplifies the conversion process in that \(b_k\) terms depend only on \(B_s\) terms and \(a_k\) terms only on \(A_s\) terms. Similar relationships do not hold if the discrete orthonormal basis functions employed in the Fourier series expansion are \(\{\sin 2\pi ki/N\}_{k=1}^{N/2}\), \(\{\cos 2\pi ki/N\}_{k=1}^{N/2}\). This is illustrated for

\[
\frac{1}{2} \sum_{i=0}^{N-1} c(1,i) \sin \frac{2\pi i}{N}
\]

for \(N = 8\) in Table 5.1.

Tadokoro and Higuchi simplified the conversion process by employing a displaced set of sinusoidal basis functions so that \(\hat{a}_k\) terms depend only on \(A_s\) terms and \(\hat{b}_k\) terms only on \(B_s\) terms. The sets of basis functions used are

\[
\{\sin k \left[ \frac{2\pi i}{N} + \frac{\pi}{N} \right] \}_{k=1}^{N/2}
\]
\[
\{\cos k \left[ \frac{2\pi i}{N} + \frac{\pi}{N} \right] \}_{k=0}^{(N/2)-1}
\]

With the above choice, Eqn. (5.24) reduces to

\[
b_k = \sum_{s=1}^{N/2} B_s b_k, s \quad (5.28)
\]
\[
\begin{array}{cccccccc}
  i & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 \\
  \text{cal}(1,t) & 1 & 1 & -1 & -1 & -1 & 1 & 1 \\
  \text{sin} \frac{2\pi i}{8} & 0 & 1/\sqrt{2} & +1 & +1/\sqrt{2} & 0 & -1/\sqrt{2} & -1 & -1/\sqrt{2} \\
  \text{cal}(1,t) \times \text{sin} \frac{2\pi i}{8} & 0 & 1/\sqrt{2} & -1 & -1/\sqrt{2} & 0 & +1/\sqrt{2} & -1 & -1/\sqrt{2} \\
\end{array}
\]

\[
\frac{1}{N} \sum_{i=0}^{N-1} \text{cal}(1,i) \text{ sin} \frac{2\pi i}{N} = \frac{(+1/\sqrt{2} - 1 - 1/\sqrt{2} + 1/\sqrt{2} - 1 - 1/\sqrt{2})}{8} = -0.25
\]

Table 5.1: First cal coefficient of \( \text{sin} \frac{2\pi i}{N} \)
where \( \hat{b}_k \) is the \( k \)th discrete sal coefficient of the \( k \)th

discrete sine function. In matrix form, Eqn. (5.28) can be

written

\[
\hat{b} = \hat{F}_b \hat{B}
\]

(5.29)

where \( \hat{b} = [N/2 \times 1] \) sine coefficient matrix

\( \hat{F}_b = [N/2 \times N/2] \) conversion matrix

\( \hat{B} = [N/2 \times 1] \) sal coefficient matrix

The \( k \)th row in \( \hat{F}_b \) is the discrete Walsh expansion of the

\( k \)th sinusoid and can be obtained by performing a Walsh

transform with the discrete sinusoid as the input data

sequence.

Similar relations exist between the sal coefficients

\( \hat{A} \) and the cosine components \( \hat{A} \). The conversion matrix

\( \hat{F}_b \) for \( N = 16 \) is given in Table 5.2. The corresponding

elements in the matrix \( \hat{F}_b \) and \( F_b \) used in the previous

method are not the same. The differences between them is

due to the differences in the computation of the matrix

elements and the basis sinusoidal functions used. Siemens

and Kitai's method computes the Fourier coefficients as

analog inner products of the signal and basis functions

according to Eqns. (2.14) and (2.15) whereas Tadokoro and

Higuchi's method computes the Discrete Fourier coefficients

from the Discrete Walsh coefficients; they formulated the

conversion process as the solution of \( N \) linear simultaneous

equations. The pattern of non-zero elements in \( F_b \) and \( \hat{F}_b \)
### Table 5.2: Conversion matrix $\tilde{F}_B$ for $N = 16$

<table>
<thead>
<tr>
<th>k\s</th>
<th>1</th>
<th>2</th>
<th>3</th>
<th>4</th>
<th>5</th>
<th>6</th>
<th>7</th>
<th>8</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td>1.28145</td>
<td>0</td>
<td>-0.53079</td>
<td>0</td>
<td>-0.10558</td>
<td>0</td>
<td>-0.25488</td>
<td>0</td>
</tr>
<tr>
<td>2</td>
<td>0</td>
<td>1.30656</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>-0.54119</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>3</td>
<td>0.44998</td>
<td>0</td>
<td>1.08636</td>
<td>0</td>
<td>-0.72588</td>
<td>0</td>
<td>0.30066</td>
<td>0</td>
</tr>
<tr>
<td>4</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>1.41420</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>5</td>
<td>0.30066</td>
<td>0</td>
<td>0.72588</td>
<td>0</td>
<td>1.08636</td>
<td>0</td>
<td>-0.44998</td>
<td>0</td>
</tr>
<tr>
<td>6</td>
<td>0</td>
<td>0.54118</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>1.30656</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>7</td>
<td>0.25489</td>
<td>0</td>
<td>-0.10558</td>
<td>0</td>
<td>0.53078</td>
<td>0</td>
<td>1.28145</td>
<td>0</td>
</tr>
<tr>
<td>8</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>1</td>
</tr>
</tbody>
</table>
are similar. Hence the number of multiplications needed in the conversion process using both the methods are equal. (The matrix $P_b$ and $K$ are multiplied and considered as a single matrix.)

The Discrete Fourier coefficients obtained through the second method are different from those obtained through conventional FFT methods as they assume a different set of sinusoidal functions. (The sets of basis functions assumed in the conventional DFT are

$$[\sin \frac{2\pi}{N} k i]_{k=1}^{N/2} \text{ and } [\cos \frac{2\pi}{N} k i]_{k=0}^{(N/2)-1}.$$  

Let $c_k$ and $d_k$ be respectively the real and imaginary parts of the $k$th conventional DFT coefficients. Then

$$c_k = \frac{1}{2} [a_k \cos \frac{k\pi}{N} + b_k \sin \frac{k\pi}{N}] \quad (5.30)$$

$$d_k = \frac{1}{2} [a_k \sin \frac{k\pi}{N} - b_k \cos \frac{k\pi}{N}] \quad (5.31)$$

In practice, the amplitude and the phase of the spectral components at different frequencies are of interest. To obtain the former, there is no need to transform the coefficients according to Eqns. (5.30) and (5.31). The latter can be obtained by adding a phase angle $k\pi/N$ to that obtained from the Fourier coefficients computed via the Walsh coefficients. Due to this, the additional transformation according to Eqns. (5.30, 5.31) which increases the number of multiplications by $2N$ may be of interest in
situations where only phase spectra are desired.

5.4 A Comparison of the Walsh-to-Fourier Conversion Process with the Cooley-Tukey FFT Algorithm of Computing the DFT

Over the years many fast algorithms [24,83,84,85] to compute the DFT have been developed. Some of the recent ones [87] are referred to as very fast Fourier transform algorithms. We limit ourselves to comparing the Walsh to Fourier Transformation with the commonly used Cooley and Tukey algorithm. The comparisons are made on the basis of the number of multiplications, memory requirements, and the effect of finite-word length in their implementation.

5.4.1 Speed

One of the factors normally considered in comparing the speeds of two signal processors is the number of multiplications required. The computation of the DFT coefficients using the Cooley-Tukey FFT algorithm according to Eqn. (5.15) requires \((N/2)\log_2 N\) complex multiplications i.e. \(2N\log_2 N\) real multiplications (a complex multiplication requires four real multiplications).

The Walsh to Fourier conversion process described above is achieved by the multiplication of two matrices \(F_w\) and the Walsh coefficients matrix. The number of multiplications needed to obtain the sine components from
The coefficients is equal to the number of non-zero elements in \( \hat{P}_b \) matrix which is square and of dimension \((N/2 \times N/2)\) for a data sequence of length \(N\). The odd-numbered rows of \( \hat{P}_b \) contain \(N/4\) non-zero elements. The \(k\)th sine coefficient is obtained by

\[
\hat{b}_k = \sum_{i=1}^{2^j} \hat{b}_{k-i} B[2^j(2i-1)]
\]

where

\[
k = 2^j(2q-1), \quad j, q \text{ are integers}
\]

\[
z = \left(\frac{N}{4}\right) \frac{1}{2^j} = \frac{N}{2^{j+2}}
\]

**Example:** The values of \(k, z\) for \(N = 16\) are

<table>
<thead>
<tr>
<th>(k)</th>
<th>1</th>
<th>2</th>
<th>3</th>
<th>4</th>
<th>5</th>
<th>6</th>
<th>7</th>
<th>8</th>
</tr>
</thead>
<tbody>
<tr>
<td>(q)</td>
<td>1</td>
<td>1</td>
<td>2</td>
<td>1</td>
<td>3</td>
<td>2</td>
<td>4</td>
<td>1</td>
</tr>
<tr>
<td>(z)</td>
<td>0</td>
<td>1</td>
<td>1</td>
<td>2</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>3</td>
</tr>
</tbody>
</table>

\(z\) determines the number of multiplications required for each \(\hat{b}_k\). Summing over \(z\) for all \(\hat{b}_k\)s using Eqn. (5.32), the number of multiplications required is

\[
M_s = \frac{N}{4} \times \frac{N}{4} + \frac{N}{8} \times \frac{N}{8} + \ldots + \frac{N}{16} \times \frac{1}{4^{k-1}} + \ldots + 4 + 1
\]

\[
= \frac{(N^2-4)}{12}
\]

An equal number of multiplications is required to obtain the Fourier cosine components, thereby making the total number of multiplications \(M_{\text{tot}}\) equal to \((N^2-4)/6\). \(M_{\text{tot}}\) for \(N = 8\),
16, 32 is given below and is illustrated graphically in Fig. 5.2.

<table>
<thead>
<tr>
<th>Method</th>
<th>N</th>
<th>8</th>
<th>16</th>
<th>32</th>
<th>64</th>
<th>128</th>
<th>256</th>
</tr>
</thead>
<tbody>
<tr>
<td>FFT</td>
<td>48</td>
<td>128</td>
<td>320</td>
<td>768</td>
<td>1792</td>
<td>4096</td>
<td></td>
</tr>
<tr>
<td>Walsh to Fourier</td>
<td>10</td>
<td>42</td>
<td>170</td>
<td>682</td>
<td>2730</td>
<td>10,922</td>
<td></td>
</tr>
</tbody>
</table>

The Walsh to Fourier conversion process is seen to be faster than the FFT method for data lengths up to 64. In certain applications [64,79], the number of frequency components of interest is relatively small compared to $N$. In the case of the FFT methods, the number of multiplications does not depend on the number of frequency components desired. But in the case of the Walsh to Fourier conversion, by limiting the conversion process to the selected number of components, one may reduce the number of multiplications. In that case, for $L = 2^r$ frequency components, the number of multiplications required is

$$N_L = 2 \left[ \frac{N}{4} \right] \times \frac{L}{4} + \frac{NL}{64} + \cdots + \frac{N}{2^{r+1}} \frac{L}{2^{r+1}}$$ (5.34a)

$$= N(L^2-1)/64$$ (5.34b)
Fig. 5.2 Number of multiplications required to compute the Fourier coefficients by the FFT and Walsh to Fourier Conversion method.
Given $N$, the number of multiplications required for different values of $L$ is shown graphically in Fig. 5.3 where it is seen that the Walsh to Fourier conversion process requires fewer multiplications for all $N$, provided that $L$ is relatively small compared to $N$.

5.4.2 Memory Requirements

The amount of memory capacity considered here represents only that required for the data and coefficients storage in a computer-based instrumentation for Fourier analysis. In the case of the FFT, $N$ words are required for twiddle factors $\{w^k\}_{k=0}^{N/2}$ and $2N$ words for input data; a total of $3N$ words are required. (Since the FFT calculations are done on complex numbers, 2 words are assumed for one data point).

As the Fast Walsh Transform operates on real data yielding real-valued coefficients, the Walsh to Fourier conversion method requires $N$ words for the data and $N^2/16$ words for the coefficient storage. The memory requirements for both methods for different values of $N$ are:

<table>
<thead>
<tr>
<th>$N$</th>
<th>8</th>
<th>16</th>
<th>32</th>
<th>64</th>
<th>128</th>
</tr>
</thead>
<tbody>
<tr>
<td>FFT</td>
<td>24</td>
<td>48</td>
<td>96</td>
<td>192</td>
<td>384</td>
</tr>
<tr>
<td>Walsh to Fourier Conversion</td>
<td>12</td>
<td>32</td>
<td>96</td>
<td>320</td>
<td>1152</td>
</tr>
</tbody>
</table>
Fig. 5.3 Number of multiplications required in the Walsh to Fourier conversion method when the number L of desired components is less than N
This favors the Walsh to Fourier Method when compared to FFT for a data length less than or equal to 32. The memory requirements becomes excessive for large values of $N$ for Walsh to Fourier conversion.

5.4.3 Effect of Finite Register Length

In practice, digital signal processing using general-purpose and special-purpose processors requires the representation of data in binary form with a finite number of bits. The results of processing may require additional bits for their representation. For example, a $b$-bit data sample multiplied by another $b$-bit data results in a product which is $2b$ bits long. The finite register length can be maintained by truncating or rounding the $b$ least significant bits, the effects of which depend on such factors as whether we use fixed-point or floating-point arithmetic and on the type of representation for the data. In processing with fixed-point arithmetic it is natural, in a signal processing context, to consider a register as representing a fixed-point fraction. Then the products of two fractions remains a fraction and the finite word-length constraint can be maintained by truncating or rounding the least significant bits. The result of addition or subtraction need not be truncated or rounded but it can increase in magnitude so that eventually the sum is not a fraction. This overflow
can be handled by requiring that the input data be sufficiently small so that the possibility of overflow is avoided, or by scaling the results appropriately when overflow occurs.

In the following discussion we assume the binary data representation in two's complement form with \((b+1)\) bits. Errors due to roundoff and truncation in the computation of the discrete Fourier transform were studied by Oppenheim and Weinstein [80] and Welsh [81] for the decimation in time (DIT) radix-2 fast Fourier transform algorithm. Recently Sundaramurthy and Reddy [82] obtained the error values for the DFT using the decimation in frequency (DIF) FFT algorithm. We describe briefly the statistical model used by them to obtain the error values, and we apply similar techniques to the Walsh to Fourier Transform algorithm described earlier to derive the computational errors. The results are compared with the FFT algorithms' error values [80, 81, 82].

The error due to a roundoff operation can be modelled as a random variable \(e_R\) uniformly distributed within the range \((-1/2)2^{-b}\) to \((1/2)2^{-b}\). It can be shown that the variance of this error is:

\[
\sigma_{e_R}^2 = \frac{1}{12} \times 2^{-2b}
\]

(5.35)
In a similar way the variance of the error due to scaling by a factor of 1/2 (shifting the number to the right by 1 bit and truncating the last bit shifted out) is [82],

$$2 \sigma_{e_s}^2 = \frac{2^{-2b}}{8}$$  \hspace{1cm} (5.36)

The signal flow-graphs depicting both the DIT and DIF FFT algorithms are shown in Fig. 5.4(a) and 5.4(b) respectively. At each stage, the algorithm passes through the entire array of $N$ complex numbers, two at a time, generating a new $N$ number array. Butterfly operation using the DIT FFT algorithm is

$$X_{k+1}(i) = X_k(i) + w X_k(j)$$  
$$X_{k+1}(j) = X_k(i) - w X_k(j)$$  \hspace{1cm} (5.37)

The butterfly operation for the DIF FFT algorithm is

$$X_{k+1}(i) = X_k(i) + X_k(j)$$  
$$X_{k+1}(j) = \frac{X_k(i) - X_k(j)}{w}$$  \hspace{1cm} (5.38)

where $w$ is some appropriate power of $w = e^{-2\pi j/N}$.

In implementing the FFT algorithm with fixed-point arithmetic we must guard against overflow. From Eqns. (5.31) and (5.38), it follows that

$$\max \{ |X_{k+1}(i)|, |X_{k+1}(j)| \}$$  
$$\leq 2 \max \{ |X_k(i)|, |X_k(j)| \}$$  \hspace{1cm} (5.39)

Eqn. (5.39) suggests that the maximum modulus increases by no more than a factor of two from stage to stage. As a
Fig. 5.4(a) Signal flow-graph of the decimation in time FFT algorithm

Fig. 5.4(b) Signal flow graph of the decimation in frequency FFT algorithm
result, overflow can be prevented by requiring that \(|x(i)| < 1\) and incorporating an attenuation factor of 1/2 at each stage (right shift). Using this step by step scaling procedure, the final result is less than 1. The computation according to Eqn. (5.37) introduces one complex noise source due to rounding and another due to scaling at each node. In the case of Eqn. (5.38), one complex noise source due to scaling is introduced at each node but the roundoff error source is introduced at one of the two nodes in a butterfly operation. With this model for the noise sources due to computation, the variance of output errors for the DIT and DIF algorithms respectively are [82]

\[
\sigma_{\text{EF}}^2_{\text{DIT}} = \frac{5}{3} \cdot 2^{-2b} \left[ 1 - \left(\frac{1}{2}\right)^m \right] \tag{5.40}
\]

\[
\sigma_{\text{EF}}^2_{\text{DIF}} = \frac{4}{3} \cdot 2^{-2b} \left[ 1 - \left(\frac{1}{2}\right)^m \right] \tag{5.41}
\]

where \(m = \log_2 N\).

A multiplication by factor 1 or \(j\) is noiseless. In Eqns. (5.40) and (5.41), it is assumed that such multiplications are noisy. The corresponding error values by subtracting the variance of the noise sources due to the noiseless multiplications are [82]

\[
\sigma_{\text{EF}}^2_{\text{DIT}} = \frac{2^{-2b}}{3} \left( 5 - \frac{4m}{N} - \frac{3}{N} \right) \tag{5.42}
\]
\[ \sigma_{DIF}^2 = \frac{2^{-2b}}{3} \left( \frac{19}{6} - \frac{4}{N} + \frac{4}{3N^2} \right) \]  

(5.43)

Walsh to Fourier Transformation Rounding and Scaling Errors

The transformation involves the computation of the Fast Walsh Transform followed by a matrix multiplication. As described in Chapter 3, the FWT performs a number of repeated butterfly operations analogous to the FFT algorithms. The butterfly operation in the case of the FWT is

\[ x_{k+1}^{(i)} = x_k^{(i)} + x_k^{(j)} \]
\[ x_{k+1}^{(j)} = x_k^{(i)} - x_k^{(j)} \]  

(5.44)

Assuming \(|x_k^{(i)}| < 1\), iteration by iteration scaling similar to the FFT algorithms is adopted to prevent overflow. It can be seen that the number of truncations errors propagating to any output node from the first, second, third and \(i\)th iteration are \(N/2\), \(N/4\), \(N/8\) and \(N/2^i\) respectively. The errors in the \(i\)th iteration undergo scalings by \(m^{-i}\) times. Hence the total error due to scaling is

\[ \sigma_T^2 = \sigma_s^2 \left[ \frac{N}{2} \left( \frac{1}{4} \right)^{m-1} + \frac{N}{4} \left( \frac{1}{4} \right)^{m-2} + \ldots + \frac{N}{2^i} \left( \frac{1}{4} \right)^{m-i} + \ldots + 1 \right] \]  

(5.45)

which reduces to

\[ \sigma_T^2 = 2 \sigma_s^2 \left[ 1 - \frac{1}{N} \right] \]

\[ = \frac{2^{-2b}}{4} \left[ 1 - \frac{1}{N} \right] \]  

(5.46)
In obtaining the Fourier components, we perform a matrix multiplication. As discussed in section 5.4.1, a maximum of \( N/4 \) real multiplications are needed in the worst case. Note that the roundoff errors due to successive multiplication are scaled. Hence the maximum error due to roundoff in the computed sine/cosine Fourier components is

\[
\sigma^2_R = \sigma^2_{e_R} \left[ \left(\frac{1}{4}\right)^{N/4} - \left(\frac{1}{4}\right)^{N/4-2} + \cdots + \left(\frac{1}{4}\right)^{(N/4)-i} \right] \tag{5.47}
\]

which reduces to

\[
\sigma^2_R = \sigma^2_{e_R} \times \frac{4}{3} \left[ 1 - \frac{1}{(4)^{N/4}} \right] \tag{5.48}
\]

The maximum variance of the error due to scaling is

\[
\sigma^2_S = \sigma^2_{e_S} \left[ \left(\frac{1}{4}\right)^{N/4} - \left(\frac{1}{4}\right)^{N/4-2} + \cdots + \left(\frac{1}{4}\right)^{(N/4)-i} \right] \tag{5.49}
\]

Hence the total error is

\[
\sigma^2_{E_W^{\text{WFT}}} = \left[ \sigma^2_{e_S} + \sigma^2_{e_R} \right] \times \frac{4}{3} \left[ 1 - \frac{1}{(4)^{N/4}} \right] + \frac{2^{-2b}}{4} \left[ 1 - \frac{1}{N} \right] \left(\frac{1}{4}\right)^{N/4} \tag{5.50}
\]

The second term is due to the scaling errors introduced in the computation of the Walsh coefficients. Eqn. (5.50) after simplification becomes

\[
\sigma^2_{E_W^{\text{WFT}}} = \frac{5}{18} \cdot 2^{-2b} - \frac{2^{-2b}}{36} \left(\frac{1}{4}\right)^{N/4} - \frac{2^{-2b}}{4N} \left(\frac{1}{4}\right)^{N/4} \tag{5.51}
\]

In general the kth sine or cosine components require a number of multiplications \( z = N/2^{j+2} \) where \( j \) is defined in Eqn. (5.32). In that case the variance of the computational
error is given by

\[ \sigma_{EW_{WFT}}^2 = \frac{5}{18} 2^{-2b} - \frac{2^{-2b}}{36} \left( \frac{1}{4} \right)^{N/2} \left( \frac{1}{4} \right)^{j+2} - \frac{2^{-2b}}{4N} \left( \frac{1}{4} \right)^{N/2} \left( \frac{1}{4} \right)^{j+2} \]  

(5.52)

Eqns. (5.51) and (5.52) give the variance of the error for either the sine or cosine components where as Eqns. (5.42) and (5.43) give the variances of the error for both the components combined (variance of a spectral component at a given frequency). When comparing the error performance of the FFT method with the Walsh to Fourier transformation, \( 2\sigma_{I_{WFT}}^2 \) should be used. From Eqns. (5.42), (5.43) and (5.51), the Walsh to Fourier transformation introduces a smaller computational error. The percentage improvement of over FFT algorithms is illustrated in Fig. 5.5.
Fig. 5.5 Improvement of computational errors of the Walsh to Fourier Conversion process

Over DIT FFT algorithm

% Improvement
CHAPTER 6
SOFTWARE DESIGN

6.1 Introduction
Software for the WSA is written in the form of functional modules or subroutines to be called by a main program. The list of subroutines is given later. Two of them, "WTFORM" and "WFCON" are special purpose modules for the WSA, and are described in detail below. Other subroutines used are general purpose in nature so that no descriptions are warranted. We also describe briefly the main program which links with a program written in BASIC in the HP2647A graphic terminal. Complete software details are available in a laboratory manual.

6.2 WTFORM (Fast Walsh Hadamard Transform) Subroutine
Chapter 4 shows that Walsh spectral analysis consists of two phases, viz, data acquisition and computation. The flow-chart of a subroutine developed for this purpose is shown in Fig. 6.1. It begins with the initialization of I/O ports of the system 80/10. (Port E8: output, E9: input, upper EA: input, lower EA: output), the address and the terminal count register of the DMA controller 8257 to BFBP and BFCO.
(hexadecimal) respectively. The DMA controller is set to
obtain 64 data samples with the starting address for the
data storage as 4000 (Hexadecimal) and is in the "write"
mode with the above parameter initialization. A "DMA request"
signal is generated next in the form of a pulse via bit 5 of
port EA and data acquisition commences. During this phase
the system 80/10 is in the hold mode and resumes operation
after 64 samples are deposited in the system memory.

The output of the A/D converter has 8 bits. The
magnitude of a coefficient value may attain a maximum of
$2^{14}-1$ for a 64 point discrete Walsh transform with the
implied binary point after the 6th least significant bit;
hence two word (16 bit) data representation is needed. The
DMA hardware is designed so that the sampled A/D outputs are
written with alternate memory locations skipped. In order
to represent the data as two word numbers the data is
reformatted so that the most significant byte is set to
11111111 or 00000000 depending on whether the least
significant byte (sampled A/D output) is positive or
negative.

The next step is to perform the FWHT (note that the
FWHT on the permuted data gives the sequency ordered Walsh
coefficients).

Based on the signal flow-graph (Fig. 3.3) for the
FWHT, different variables are set up for the FWHT. Their
designations and assignments are given below.

<table>
<thead>
<tr>
<th>Variable</th>
<th>Memory Location</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>LOC1</td>
<td>3C 50</td>
<td>16 bit (two word) variable containing the starting address of the data array i.e. X(0). (4000 is taken as the starting address.)</td>
</tr>
<tr>
<td></td>
<td>3C 51</td>
<td></td>
</tr>
<tr>
<td>LOC2</td>
<td>3C 52</td>
<td>16 bit (two word) variable containing the separation of two operands used in a butterfly operation. Initialized to 64 decimal and halved for each succeeding iteration.</td>
</tr>
<tr>
<td></td>
<td>3C 53</td>
<td></td>
</tr>
<tr>
<td>LOC3</td>
<td>3C 54</td>
<td>8 bit (one word) variable containing the number of butterfly operations in a partition. Initialized to 32 decimal and is halved for the succeeding iteration.</td>
</tr>
<tr>
<td>LOC4</td>
<td>3C 55</td>
<td>8 bit (one word) variable containing the number of blocks in an iteration. Doubled for the succeeding iteration.</td>
</tr>
<tr>
<td>LOC5</td>
<td>3C 56</td>
<td>8 bit (one word) variable containing the number of iterations. Decrement by one at the completion of each iteration.</td>
</tr>
</tbody>
</table>

With the variables initialized to the values indicated above, the first iteration of FWHT commences. The register pair DE is loaded with the starting address of the data array and serves as a pointer to the address of the upper operand [X(i) in Eqn. (3.21)], for a butterfly operation. By adding the value of LOC2 loaded in register pair HL with the contents of register pair DE, the address of the second operand for a butterfly operation at the start
of an iteration is derived. Registers B and C are loaded with the contents of LOC3 and LOC4 respectively and serve as counters for the number of butterfly operations in a block and the number of blocks to be performed in an iteration.

To perform a butterfly operation, first $X(i) - X(i + N/2^j)$ is computed and is stored in the memory location occupied by $X(i + N/2^j)$ previously. Next $2X(i)$ is formed and the contents of the memory location occupied by $X(i + N/2^j)$ is subtracted from it, thereby obtaining $X(i) + X(i + N/2^j)$. The result is stored in memory locations occupied by $X(i)$. After each butterfly operation, register B is decremented and tested for zero (completion of a block). If not zero register pairs DE and HL are incremented to point to the next operands and a branch to the point 2 shown in the flow-chart (Fig. 6.1) is taken to perform the next butterfly operation.

When the contents of register B are zero, register C is decremented in an iteration and is tested for zero. In the case of a non-zero value, register pairs DE and HL are updated as shown in the flowchart (Fig. 6.1) to point to the operands in the next block. Also register B is loaded with the contents of LOC3. A branch to the point 2 in the flowchart is taken to continue the next block of butterfly operations.

A zero in the register C indicates that all the
Fig. 6.1 Flowchart of WFORM (FWHT) subroutine
Fig. 6.1 Flowchart of WTFORM (FWHT) subroutine (continued)
butterfly operations in the current iteration are over. Now LOC5 is decremented and tested. If zero, the FWHT is completed and the subroutine "BICON" is called to convert the Walsh coefficients in binary form into signed BCD form. At the completion of BCD conversion, a return to the calling main program is taken.

If LOC5 is not zero, the contents of LOC3 (number of butterfly operations in a block) is halved, the contents of LOC4 (number of blocks in an iteration) is doubled and the contents of LOC2 (separation of two operands in a butterfly operation) is halved. With these updatings of the variables, a branch to point 1 in the flowchart (Fig. 6.1) is made to continue the next iteration. This completes the description of the flowchart.

The results are available both in binary and BCD form. The computation takes 23.844 msec for 64 Walsh coefficients. Binary to BCD conversion takes an additional 99.439 msec.

6.3 Walsh-to-Fourier Conversion Subroutine (WFCON)

Either of the two processes described in Chapter 5 can be considered for software implementation of Walsh to Fourier conversion in a microcomputer based WSA. The process due to Siemens and Kitai would yield Fourier coefficients without the numerical integration error
associated with the DFT coefficients provided the Walsh coefficients are computed without integration error; this is feasible in the instrument described in the thesis, if the signal is integrated over each interval and the integrator output is used as the data sequence for FWT [14]. The DFT components obtained by the process due to Tudakoro and Higuchi are different from those obtained from conventional FFT methods and requires correction in obtaining phase information of a spectral component as discussed in Section 5.3. In view of this, the process due to Siemens and Kitai was chosen.

The first microprocessor based Walsh to Fourier converter [75] used the first-generation 4004 microprocessor. The conversion time was 1.81 seconds for 32 coefficients. The slow speed can be attributed to the long cycle time of 4004. A later system [76] was based on the decimal PPS-25 microprocessor, the conversion time for 64 decimal coefficients being 2.5 seconds.

In the present work, the 64 binary-coded Fourier coefficients are obtained from 64 Walsh coefficients in about one second.

Data Representation and Storage

The Walsh coefficients computed are in 16 bit two's complement form with the implied binary point after the 6th lsb. Two memory locations are used for each coefficient.
As a result, two's complement representation for matrix elements is adopted in contrast to sign and magnitude representation in [75] and signed BCD form in [76]. Each conversion matrix element has a 9 bit fractional part and a 1 bit integer part. It is packed as two bytes with the remaining bits set to 1 or 0 depending on whether the coefficient is positive or negative. Booth's algorithm [88], which is independent of the signs of the operands for multiplication, is employed. The multiplication time is smaller, on the average, than conventional shift and add methods of multiplication. The Fourier coefficients computed are 32 bits long.

Observation of the conversion matrix elements for sine and cosine components shows that the magnitude of the corresponding elements in the two matrices are the same, but the signs may differ. Earlier implementations stored the magnitude and the signs of coefficients for sine and cosine components separately, resulting in additional software overheads to fetch the corresponding signs during the conversion process. It is seen that the signs of cosine conversion matrix-elements either differ or do not differ from the signs of the corresponding sine conversion matrix elements alternately as shown in Fig. 6.2. Using this property, during accumulation, a partial result of a multiplication is either added to or subtracted from the
<table>
<thead>
<tr>
<th>$f$</th>
<th>1</th>
<th>3</th>
<th>5</th>
<th>7</th>
<th>9</th>
<th>11</th>
<th>13</th>
<th>15</th>
<th>17</th>
<th>19</th>
<th>21</th>
<th>23</th>
<th>25</th>
<th>27</th>
<th>29</th>
<th>31</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>3</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>5</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>7</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>9</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>11</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>13</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>15</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>17</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>19</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>21</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>23</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>25</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>27</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>29</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>31</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

- denotes a sign change. No entry denotes no sign change

Fig. 6.2 Pattern of sign changes of the conversion matrix elements for cosine components from the corresponding conversion matrix elements for sine components
accumulated sum alternately for cosine components, assuming that the conversion matrix elements for sine components are stored.

The conversion process is achieved in two stages; in the first sal coefficient matrix $B$ is multiplied with $F_b$ and in the second, uncompensated Fourier components are multiplied by the appropriate compensation factors. This procedure requires the storage of only odd-numbered rows of $F_b$ and the diagonal elements of the matrix $[K]^{-1}$; i.e. $N^2/16$ + $N/2$ elements. If implemented as a one-stage procedure, the elements of a matrix obtained by the product of two matrices $F_a$ and $[K]^{-1}$ need to be stored; i.e. $(N^2-4)/12$ elements are required and this is greater than that for the former scheme. If the input signal is sequency limited to the first $N/2$ cal or sal components, the corresponding $N/2$ sine or cosine components are given by $F_b B$ or $F_a A$; this can be realized by excluding the second stage in a two stage conversion process.

The number of multiplications increases by $N$ in the two stage procedure over that of one stage procedure.

The matrix elements $b_{1,1}, b_{1,3} \ldots b_{1,(N/2)-1}, b_{3,1}$, $b_{3,3}, \ldots, b_{3,(N/2)-1} \ldots b_{2k-1,1}, b_{2k-1,3} \ldots b_{(N/2)-1,(N/2)-1}$ are stored in contiguous memory locations starting with the location $OA00$ (Hexadecimal). Zero-valued elements of $F_b$ are not stored. The conversion
matrix \( F_b \) elements are given both in decimal and Hexadecimal form in Table 6.1.

**Flowchart for Walsh-to-Fourier Conversion Subroutine**

The main task of the conversion subroutine is to retrieve appropriate Walsh coefficients and conversion matrix elements for multiplication to obtain Fourier coefficients according to Eqn. (5.32). Siemens' scheme for software implementation [73] was adopted in the previous designs [74,75]. Only non-zero unique elements of conversion matrix \( F_b \) are stored as explained before. Based on this storage scheme, Siemens' procedure uses the following algorithm. Let \( f \) be the order of a component in the Fourier sine coefficient matrix \( b \); it can be expressed as

\[
f = 2^k (2q-1)
\]

(6.1)

where \( q \) defines the row wherein the conversion coefficients for the \( f \)th component are stored. Note that not all the coefficients in that row are used for a given \( f \). \( k \) in Eqn. (6.1) is used in retrieving the Walsh coefficients in the order \( 2^k, 3(2^k), 5(2^k) \ldots \).

The above algorithm requires the computation of \( k, q \) for each \( f \), in addition to the process of generating actual memory addresses based on data representation using \( k \) and \( q \).

A different approach is adopted which avoids these computations. The flowchart of the subroutine is shown in
<table>
<thead>
<tr>
<th>Element No.</th>
<th>Value</th>
<th>Element No.</th>
<th>Value</th>
</tr>
</thead>
<tbody>
<tr>
<td></td>
<td>Decimal</td>
<td>Hex</td>
<td></td>
</tr>
<tr>
<td>b1, 1</td>
<td>1.273240</td>
<td>02 8B</td>
<td>p5, 1</td>
</tr>
<tr>
<td>b1, 3</td>
<td>-0.527393</td>
<td>FE F2</td>
<td>p5, 3</td>
</tr>
<tr>
<td>b1, 5</td>
<td>-0.104905</td>
<td>FF CB</td>
<td>p5, 5</td>
</tr>
<tr>
<td>b1, 7</td>
<td>-0.253263</td>
<td>FF 7F</td>
<td>p5, 7</td>
</tr>
<tr>
<td>b1, 9</td>
<td>-0.024944</td>
<td>FF F4</td>
<td>p5, 9</td>
</tr>
<tr>
<td>b1, 11</td>
<td>0.010332</td>
<td>00 05</td>
<td>p5, 11</td>
</tr>
<tr>
<td>b1, 13</td>
<td>-0.051944</td>
<td>FF EG</td>
<td>p5, 13</td>
</tr>
<tr>
<td>b1, 15</td>
<td>-0.125403</td>
<td>FF CO</td>
<td>p5, 15</td>
</tr>
<tr>
<td>b1, 17</td>
<td>-0.006161</td>
<td>FF PD</td>
<td>p5, 17</td>
</tr>
<tr>
<td>b1, 19</td>
<td>0.002552</td>
<td>00 01</td>
<td>p5, 19</td>
</tr>
<tr>
<td>b1, 21</td>
<td>0.000508</td>
<td>00 00</td>
<td>p5, 21</td>
</tr>
<tr>
<td>b1, 23</td>
<td>0.001225</td>
<td>00 00</td>
<td>p5, 23</td>
</tr>
<tr>
<td>b1, 25</td>
<td>-0.012442</td>
<td>FF FA</td>
<td>p5, 25</td>
</tr>
<tr>
<td>b1, 27</td>
<td>0.005154</td>
<td>00 02</td>
<td>p5, 27</td>
</tr>
<tr>
<td>b1, 29</td>
<td>-0.025909</td>
<td>FF F3</td>
<td>p5, 29</td>
</tr>
<tr>
<td>b1, 31</td>
<td>-0.062550</td>
<td>FF EO</td>
<td>p5, 31</td>
</tr>
<tr>
<td>b3, 1</td>
<td>0.424413</td>
<td>00 D9</td>
<td>b7, 1</td>
</tr>
<tr>
<td>b3, 3</td>
<td>1.024624</td>
<td>02 0C</td>
<td>b7, 3</td>
</tr>
<tr>
<td>b3, 5</td>
<td>-0.684632</td>
<td>FE A2</td>
<td>b7, 5</td>
</tr>
<tr>
<td>b3, 7</td>
<td>0.283584</td>
<td>00 91</td>
<td>b7, 7</td>
</tr>
<tr>
<td>b3, 9</td>
<td>-0.086024</td>
<td>FF D4</td>
<td>b7, 9</td>
</tr>
<tr>
<td>b3, 11</td>
<td>-0.207681</td>
<td>FF 96</td>
<td>b7, 11</td>
</tr>
<tr>
<td>b3, 13</td>
<td>-0.310816</td>
<td>FF G1</td>
<td>b7, 13</td>
</tr>
<tr>
<td>b3, 15</td>
<td>0.128744</td>
<td>00 41</td>
<td>b7, 15</td>
</tr>
<tr>
<td>b3, 17</td>
<td>-0.019097</td>
<td>FF F7</td>
<td>b7, 17</td>
</tr>
<tr>
<td>b3, 19</td>
<td>-0.046105</td>
<td>FF E9</td>
<td>b7, 19</td>
</tr>
<tr>
<td>b3, 21</td>
<td>0.030827</td>
<td>00 0F</td>
<td>b7, 21</td>
</tr>
<tr>
<td>b3, 23</td>
<td>-0.012760</td>
<td>FF FA</td>
<td>b7, 23</td>
</tr>
<tr>
<td>b3, 25</td>
<td>-0.042066</td>
<td>FF EB</td>
<td>b7, 25</td>
</tr>
<tr>
<td>b3, 27</td>
<td>-0.101556</td>
<td>FF CD</td>
<td>b7, 27</td>
</tr>
<tr>
<td>b3, 29</td>
<td>-0.151989</td>
<td>FF B3</td>
<td>b7, 29</td>
</tr>
<tr>
<td>b3, 31</td>
<td>0.062956</td>
<td>00 20</td>
<td>b7, 31</td>
</tr>
<tr>
<td>Element No.</td>
<td>Value</td>
<td>Hex</td>
<td>Element No.</td>
</tr>
<tr>
<td>------------</td>
<td>-----------</td>
<td>------</td>
<td>-------------</td>
</tr>
<tr>
<td>b9,1</td>
<td>0.141471</td>
<td>00 48</td>
<td>b13,1</td>
</tr>
<tr>
<td>b9,3</td>
<td>-0.058599</td>
<td>FF E2</td>
<td>b13,3</td>
</tr>
<tr>
<td>b9,5</td>
<td>0.294598</td>
<td>00 96</td>
<td>b13,5</td>
</tr>
<tr>
<td>b9,7</td>
<td>0.711223</td>
<td>01 6C</td>
<td>b13,7</td>
</tr>
<tr>
<td>b9,9</td>
<td>0.866628</td>
<td>01 BB</td>
<td>b13,9</td>
</tr>
<tr>
<td>b9,11</td>
<td>-0.358960</td>
<td>FF 49</td>
<td>b13,11</td>
</tr>
<tr>
<td>b9,13</td>
<td>-0.071463</td>
<td>FF DC</td>
<td>b13,13</td>
</tr>
<tr>
<td>b9,15</td>
<td>-0.172383</td>
<td>FF A8</td>
<td>b13,15</td>
</tr>
<tr>
<td>b9,17</td>
<td>-0.081531</td>
<td>FF D7</td>
<td>b13,17</td>
</tr>
<tr>
<td>b9,19</td>
<td>0.033771</td>
<td>00 11</td>
<td>b13,19</td>
</tr>
<tr>
<td>b9,21</td>
<td>-0.169780</td>
<td>FF AA</td>
<td>b13,21</td>
</tr>
<tr>
<td>b9,23</td>
<td>-0.409884</td>
<td>FF 2F</td>
<td>b13,23</td>
</tr>
<tr>
<td>b9,25</td>
<td>0.336383</td>
<td>00 AC</td>
<td>b13,25</td>
</tr>
<tr>
<td>b9,27</td>
<td>-0.139335</td>
<td>FF B9</td>
<td>b13,27</td>
</tr>
<tr>
<td>b9,29</td>
<td>-0.027715</td>
<td>FF F2</td>
<td>b13,29</td>
</tr>
<tr>
<td>b9,31</td>
<td>-0.066911</td>
<td>FF DE</td>
<td>b13,31</td>
</tr>
<tr>
<td>b11,1</td>
<td>0.115749</td>
<td>00 3B</td>
<td>b15,1</td>
</tr>
<tr>
<td>b11,3</td>
<td>0.279443</td>
<td>00 8F</td>
<td>b15,3</td>
</tr>
<tr>
<td>b11,5</td>
<td>0.418216</td>
<td>00 D6</td>
<td>b15,5</td>
</tr>
<tr>
<td>b11,7</td>
<td>-0.173231</td>
<td>FF A8</td>
<td>b15,7</td>
</tr>
<tr>
<td>b11,9</td>
<td>0.324092</td>
<td>00 A5</td>
<td>b15,9</td>
</tr>
<tr>
<td>b11,11</td>
<td>0.782427</td>
<td>01 90</td>
<td>b15,11</td>
</tr>
<tr>
<td>b11,13</td>
<td>-0.522801</td>
<td>FF F5</td>
<td>b15,13</td>
</tr>
<tr>
<td>b11,15</td>
<td>0.216551</td>
<td>00 6E</td>
<td>b15,15</td>
</tr>
<tr>
<td>b11,17</td>
<td>-0.129796</td>
<td>FF BE</td>
<td>b15,17</td>
</tr>
<tr>
<td>b11,19</td>
<td>-0.313355</td>
<td>FF G0</td>
<td>b15,19</td>
</tr>
<tr>
<td>b11,21</td>
<td>-0.468969</td>
<td>FF 10</td>
<td>b15,21</td>
</tr>
<tr>
<td>b11,23</td>
<td>0.194253</td>
<td>00 63</td>
<td>b15,23</td>
</tr>
<tr>
<td>b11,25</td>
<td>0.103830</td>
<td>00 35</td>
<td>b15,25</td>
</tr>
<tr>
<td>b11,27</td>
<td>0.250669</td>
<td>00 80</td>
<td>b15,27</td>
</tr>
<tr>
<td>b11,29</td>
<td>-0.167492</td>
<td>FF AB</td>
<td>b15,29</td>
</tr>
<tr>
<td>b11,31</td>
<td>0.069377</td>
<td>00 23</td>
<td>b15,31</td>
</tr>
<tr>
<td>Element No.</td>
<td>Value</td>
<td>Element No.</td>
<td>Value</td>
</tr>
<tr>
<td>------------</td>
<td>---------</td>
<td>-------------</td>
<td>---------</td>
</tr>
<tr>
<td>b17, 1</td>
<td>0.074896</td>
<td>b21, 2</td>
<td>0.160630</td>
</tr>
<tr>
<td>b17, 3</td>
<td>-0.031023</td>
<td>b21, 3</td>
<td>0.146375</td>
</tr>
<tr>
<td>b17, 5</td>
<td>-0.006171</td>
<td>b21, 5</td>
<td>0.219065</td>
</tr>
<tr>
<td>b17, 7</td>
<td>-0.014898</td>
<td>b21, 7</td>
<td>-0.090740</td>
</tr>
<tr>
<td>b17, 9</td>
<td>0.151260</td>
<td>b21, 9</td>
<td>0.169762</td>
</tr>
<tr>
<td>b17, 11</td>
<td>-0.062654</td>
<td>b21, 11</td>
<td>0.409843</td>
</tr>
<tr>
<td>b17, 13</td>
<td>0.314983</td>
<td>b21, 13</td>
<td>-0.273848</td>
</tr>
<tr>
<td>b17, 15</td>
<td>0.760436</td>
<td>b21, 15</td>
<td>0.113432</td>
</tr>
<tr>
<td>b17, 17</td>
<td>0.639012</td>
<td>b21, 17</td>
<td>0.189249</td>
</tr>
<tr>
<td>b17, 19</td>
<td>-0.347530</td>
<td>b21, 19</td>
<td>0.456888</td>
</tr>
<tr>
<td>b17, 21</td>
<td>-0.069128</td>
<td>b21, 21</td>
<td>0.683785</td>
</tr>
<tr>
<td>b17, 23</td>
<td>-0.166890</td>
<td>b21, 23</td>
<td>-0.283231</td>
</tr>
<tr>
<td>b17, 25</td>
<td>-0.016437</td>
<td>b21, 25</td>
<td>-0.151390</td>
</tr>
<tr>
<td>b17, 27</td>
<td>0.006809</td>
<td>b21, 27</td>
<td>-0.365489</td>
</tr>
<tr>
<td>b17, 29</td>
<td>-0.034229</td>
<td>b21, 29</td>
<td>0.244212</td>
</tr>
<tr>
<td>b17, 31</td>
<td>-0.082635</td>
<td>b21, 31</td>
<td>-0.101156</td>
</tr>
<tr>
<td>b19, 1</td>
<td>0.067013</td>
<td>b23, 1</td>
<td>0.055358</td>
</tr>
<tr>
<td>b19, 3</td>
<td>0.161783</td>
<td>b23, 3</td>
<td>-0.022930</td>
</tr>
<tr>
<td>b19, 5</td>
<td>-0.108100</td>
<td>b23, 5</td>
<td>0.115278</td>
</tr>
<tr>
<td>b19, 7</td>
<td>0.044776</td>
<td>b23, 7</td>
<td>0.276835</td>
</tr>
<tr>
<td>b19, 9</td>
<td>0.147608</td>
<td>b23, 9</td>
<td>0.399115</td>
</tr>
<tr>
<td>b19, 11</td>
<td>0.533326</td>
<td>b23, 11</td>
<td>-0.140716</td>
</tr>
<tr>
<td>b19, 13</td>
<td>0.224911</td>
<td>b23, 13</td>
<td>-0.027940</td>
</tr>
<tr>
<td>b19, 15</td>
<td>0.297864</td>
<td>b23, 15</td>
<td>-0.067454</td>
</tr>
<tr>
<td>b19, 17</td>
<td>0.791907</td>
<td>b23, 17</td>
<td>0.142620</td>
</tr>
<tr>
<td>b19, 19</td>
<td>-0.480492</td>
<td>b23, 19</td>
<td>0.296991</td>
</tr>
<tr>
<td>b19, 21</td>
<td>0.190905</td>
<td>b23, 21</td>
<td>0.716999</td>
</tr>
<tr>
<td>b19, 23</td>
<td>-0.060374</td>
<td>b23, 23</td>
<td>-0.588426</td>
</tr>
<tr>
<td>b19, 25</td>
<td>-0.145756</td>
<td>b23, 25</td>
<td>0.243734</td>
</tr>
<tr>
<td>b19, 27</td>
<td>0.218139</td>
<td>b23, 27</td>
<td>0.048482</td>
</tr>
<tr>
<td>b19, 29</td>
<td>0.090356</td>
<td>b23, 29</td>
<td>0.117045</td>
</tr>
<tr>
<td>b19, 31</td>
<td>0.067013</td>
<td>b23, 31</td>
<td>0.055358</td>
</tr>
<tr>
<td>Element No.</td>
<td>Decimal</td>
<td>Hex</td>
<td>Element No.</td>
</tr>
<tr>
<td>------------</td>
<td>-----------</td>
<td>---------</td>
<td>-------------</td>
</tr>
<tr>
<td>b25, 1</td>
<td>0.050930</td>
<td>00 1A</td>
<td>b29, 1</td>
</tr>
<tr>
<td>b25, 3</td>
<td>-0.021096</td>
<td>FF F6</td>
<td>b29, 3</td>
</tr>
<tr>
<td>b25, 5</td>
<td>0.106055</td>
<td>00 36</td>
<td>b29, 5</td>
</tr>
<tr>
<td>b25, 7</td>
<td>0.256040</td>
<td>00 83</td>
<td>b29, 7</td>
</tr>
<tr>
<td>b25, 9</td>
<td>-0.210127</td>
<td>FF 95</td>
<td>b29, 9</td>
</tr>
<tr>
<td>b27, 11</td>
<td>0.087037</td>
<td>00 2C</td>
<td>b29, 11</td>
</tr>
<tr>
<td>b27, 13</td>
<td>0.017313</td>
<td>00 08</td>
<td>b29, 13</td>
</tr>
<tr>
<td>b27, 15</td>
<td>0.041797</td>
<td>00 15</td>
<td>b29, 15</td>
</tr>
<tr>
<td>b27, 17</td>
<td>0.116814</td>
<td>00 3B</td>
<td>b29, 17</td>
</tr>
<tr>
<td>b27, 19</td>
<td>-0.048386</td>
<td>FF E8</td>
<td>b29, 19</td>
</tr>
<tr>
<td>b27, 21</td>
<td>0.243253</td>
<td>00 7C</td>
<td>b29, 21</td>
</tr>
<tr>
<td>b27, 23</td>
<td>0.587265</td>
<td>01 2C</td>
<td>b29, 23</td>
</tr>
<tr>
<td>b27, 25</td>
<td>0.715585</td>
<td>01 6E</td>
<td>b29, 25</td>
</tr>
<tr>
<td>b27, 27</td>
<td>-0.296405</td>
<td>FF 69</td>
<td>b29, 27</td>
</tr>
<tr>
<td>b27, 29</td>
<td>-0.058959</td>
<td>FF E2</td>
<td>b29, 29</td>
</tr>
<tr>
<td>b27, 31</td>
<td>-0.142339</td>
<td>FF B8</td>
<td>b29, 31</td>
</tr>
<tr>
<td>b27, 1</td>
<td>0.047157</td>
<td>00 18</td>
<td>b31, 1</td>
</tr>
<tr>
<td>b27, 3</td>
<td>0.113847</td>
<td>00 3A</td>
<td>b31, 3</td>
</tr>
<tr>
<td>b27, 5</td>
<td>0.170384</td>
<td>00 57</td>
<td>b31, 5</td>
</tr>
<tr>
<td>b27, 7</td>
<td>-0.070575</td>
<td>FF DC</td>
<td>b31, 7</td>
</tr>
<tr>
<td>b27, 9</td>
<td>0.037723</td>
<td>FF ED</td>
<td>b31, 9</td>
</tr>
<tr>
<td>b27, 11</td>
<td>-0.091072</td>
<td>FF D2</td>
<td>b31, 11</td>
</tr>
<tr>
<td>b27, 13</td>
<td>0.060826</td>
<td>00 1F</td>
<td>b31, 13</td>
</tr>
<tr>
<td>b27, 15</td>
<td>-0.025206</td>
<td>FF F4</td>
<td>b31, 15</td>
</tr>
<tr>
<td>b27, 17</td>
<td>0.100628</td>
<td>00 33</td>
<td>b31, 17</td>
</tr>
<tr>
<td>b27, 19</td>
<td>-0.242937</td>
<td>00 7C</td>
<td>b31, 19</td>
</tr>
<tr>
<td>b27, 21</td>
<td>0.363851</td>
<td>00 BA</td>
<td>b31, 21</td>
</tr>
<tr>
<td>b27, 23</td>
<td>-0.150600</td>
<td>FFF B3</td>
<td>b31, 23</td>
</tr>
<tr>
<td>b27, 25</td>
<td>0.316753</td>
<td>00 90</td>
<td>b31, 25</td>
</tr>
<tr>
<td>b27, 27</td>
<td>-0.680212</td>
<td>01 5C</td>
<td>b31, 27</td>
</tr>
<tr>
<td>b27, 29</td>
<td>0.454503</td>
<td>FF 18</td>
<td>b31, 29</td>
</tr>
<tr>
<td>b27, 31</td>
<td>0.188261</td>
<td>00 60</td>
<td>b31, 31</td>
</tr>
</tbody>
</table>
Fig. 6.3. Three address pointers are used for (1) the conversion matrix elements (2) the Walsh coefficients and (3) the Fourier coefficients (each treated as an one-dimensional array) and are denoted MPRA, MPDA and STRA respectively. In computations, these pointers are updated using the properties of the conversion matrix. They are initialized to the addresses of the first elements of the respective arrays. The matrix multiplication is carried out as follows:

Step 1: The conversion process begins with the first matrix element of the first row in $F_D$ and all of the multiplications associated with it are completed. The multiplications associated with a conversion element – say $b_{1,1}$ – can be written as $(b_{1,1}, B_1), (b_{1,1}, B_2), \ldots, (b_{1,1}, B_4); 2^k \leq N/2$. This shows that the index of Walsh coefficients associated with $b_{1,1}$ is doubled for successive multiplications starting with 1. Each successive product is assigned to a Fourier component whose index is also twice the previous value. To update the pointers MPDA, STRA, four variables denoted SPAN1, SPAN2, SPAN3 and SPAN4 are set up. SPAN1 and SPAN2 are initialized to 1; SPAN3 and SPAN4 are initialized to 2 (for 32 bits results). SPAN2 and SPAN4 are doubled by a left shift operation and are added to the current values of the pointers MPDA and STRA respectively after each multiplication. The uses of SPAN1
Entry Point

1. **INITIALIZATION**
   - Set MPRA to OAO
   - SPAN 1 = SPAN 2 = 1
   - SPAN 3 = SPAN 4 = 2
   - ROW = COLMN = 16

2. **Call MPY**

3. **Call ACCM**

4. **SPAN 2 > 32**
   - No
   - **SPAN 4 > 64**
   - Yes

5. **COLMN = COLMN - 1**

6. **COLMN = 0**
   - No
   - **Go to 2**
   - Yes

7. **Go to 3**

**Fig. 6.3** Flowchart of Walsh to Fourier conversion subroutine
Fig. 6.3 Flowchart of Walsh to Fourier conversion subroutine (continued)
and SPAN3 will be explained later.

The multiplications associated with a conversion element is terminated when either SPAN2 exceeds 32 or SPAN4 exceeds 64. Then all multiplications associated with the conversion element are completed.

Step 2: The next matrix element in the same row as in step 1 is used for multiplications; this element is retrieved with the pointer MPRA incremented by 2. The index of the first Walsh coefficient associated with this element would be higher than the index of the initial Walsh coefficient for the previous element by 2. SPAN1 which stores the later is incremented by 2; SPAN2 is reinitialized to the value of SPAN1. Since the products are contributions to Fourier components with the same starting index as for the previous element, SPAN4 is reinitialized to the value of SPAN3. Then step 1 is followed.

When all of the multiplications corresponding to one row is complete, step 3 is followed. This can be accomplished by way of a counter which is preset to 16.

Step 3: The starting index of the Fourier component to which the products of multiplications associated with the new row of elements are assigned is greater than that of the previous row by 2. For example, the products associated with the elements in the third row are assigned to $F_3$, $F_5$, $F_{12}$, ..., $2^k(3)$. As a result STRA is initialized to the
previous starting value plus 8. SPAN3 which stores the
index of the first Fourier component corresponding to a row
is incremented by 4; SPAN4 is also set equal to SPAN3. The
index of the Walsh coefficients associated with the first
element of the new row is 1 and hence SPAN1 and SPAN2 are
set to 1. Also MPDA is set to point to the beginning of
Walsh coefficient array. Then step 1 is followed.

If all of the rows of matrix elements are used, the
conversion process is complete and is returned to the main
program.

The program is written as a subroutine with the entry
point OD42. The subroutine assumes a data structure for
Walsh coefficients with cal and sal coefficients inter-
leaved as would be obtained by the FWHT algorithm.

The conversion time for 32 sine or cosine components
is 493 msec maximum.

This subroutine computes uncompensated sine or cosine
components. To compute both, it has to be executed twice
with the pointers initialized properly. Then they are
compensated by multiplying the Fourier components with
compensation factors which are listed in decimal and
hexadecimal form in Table 6.2.
Table 6.2: Compensation matrix elements

<table>
<thead>
<tr>
<th>No</th>
<th>Decimal</th>
<th>Hex</th>
</tr>
</thead>
<tbody>
<tr>
<td>$K_{1,1}$</td>
<td>1.00080</td>
<td>0100</td>
</tr>
<tr>
<td>$K_{2,2}$</td>
<td>1.00322</td>
<td>0101</td>
</tr>
<tr>
<td>$K_{3,3}$</td>
<td>1.00726</td>
<td>0102</td>
</tr>
<tr>
<td>$K_{4,4}$</td>
<td>1.01295</td>
<td>0103</td>
</tr>
<tr>
<td>$K_{5,5}$</td>
<td>1.02032</td>
<td>0105</td>
</tr>
<tr>
<td>$K_{6,6}$</td>
<td>1.02942</td>
<td>0107</td>
</tr>
<tr>
<td>$K_{7,7}$</td>
<td>1.04030</td>
<td>010A</td>
</tr>
<tr>
<td>$K_{8,8}$</td>
<td>1.05303</td>
<td>010E</td>
</tr>
<tr>
<td>$K_{9,9}$</td>
<td>1.06767</td>
<td>0111</td>
</tr>
<tr>
<td>$K_{10,10}$</td>
<td>1.08434</td>
<td>0116</td>
</tr>
<tr>
<td>$K_{11,11}$</td>
<td>1.10312</td>
<td>012A</td>
</tr>
<tr>
<td>$K_{12,12}$</td>
<td>1.12415</td>
<td>0120</td>
</tr>
<tr>
<td>$K_{13,13}$</td>
<td>1.14755</td>
<td>0126</td>
</tr>
<tr>
<td>$K_{14,14}$</td>
<td>1.17349</td>
<td>012C</td>
</tr>
<tr>
<td>$K_{15,15}$</td>
<td>1.20214</td>
<td>0134</td>
</tr>
<tr>
<td>$K_{16,16}$</td>
<td>1.23370</td>
<td>013C</td>
</tr>
<tr>
<td>$K_{17,17}$</td>
<td>1.26841</td>
<td>0145</td>
</tr>
<tr>
<td>$K_{18,18}$</td>
<td>1.30651</td>
<td>014E</td>
</tr>
<tr>
<td>$K_{19,19}$</td>
<td>1.34831</td>
<td>0159</td>
</tr>
<tr>
<td>$K_{20,20}$</td>
<td>1.39414</td>
<td>0165</td>
</tr>
<tr>
<td>$K_{21,21}$</td>
<td>1.44437</td>
<td>0172</td>
</tr>
<tr>
<td>$K_{22,22}$</td>
<td>1.49942</td>
<td>0180</td>
</tr>
<tr>
<td>$K_{23,23}$</td>
<td>1.55979</td>
<td>018F</td>
</tr>
<tr>
<td>$K_{24,24}$</td>
<td>1.62604</td>
<td>01A0</td>
</tr>
<tr>
<td>$K_{25,25}$</td>
<td>1.69879</td>
<td>01B3</td>
</tr>
<tr>
<td>$K_{26,26}$</td>
<td>1.77876</td>
<td>01C7</td>
</tr>
<tr>
<td>$K_{27,27}$</td>
<td>1.86670</td>
<td>01DE</td>
</tr>
<tr>
<td>$K_{28,28}$</td>
<td>1.96385</td>
<td>01F7</td>
</tr>
<tr>
<td>$K_{29,29}$</td>
<td>2.07104</td>
<td>0212</td>
</tr>
<tr>
<td>$K_{30,30}$</td>
<td>2.18965</td>
<td>0231</td>
</tr>
<tr>
<td>$K_{31,31}$</td>
<td>2.32119</td>
<td>0252</td>
</tr>
<tr>
<td>$K_{32,32}$</td>
<td>1.23370</td>
<td>015C</td>
</tr>
</tbody>
</table>
6.4 List of Other Subroutines Used in the WSA

1. SINCON: Initializes the variables to obtain the Fourier sine components from sal coefficients and computes sine components by calling subroutine WFCON. The entry point is 0E60.

2. COSCON: Initializes the variables to obtain the Fourier cosine components from sal coefficients and computes cosine components by calling the subroutine WFCON. The entry point is 0E80.

3. MPY: Multiplies two 16 bit binary numbers in two's complement form using Booth's algorithm. The entry point is 0F60.

4. ACCM: Used to either add or subtract a 32 bit word in registers BCDE with a 32 bit word stored in the system memory. Used by WFCON subroutine. The entry point is 0F83.

5. BICON: Converts an array of 64 16-bit binary numbers into an array of 64 signed BCD numbers. The entry point is 0F00.

6. CLEAR: Clears a segment of 252 memory locations. The entry point is 0F80.

7. COMPEN: Multiplies the cosine and sine Fourier components obtained by executing the WFCON subroutine with the compensation factors of the matrix \([K]^{-1}\). The entry point is 0F00.
6.5. Main Program

The main program consists two programs, viz., 1) program A written in 8080 ASSEMBLY language in the System 80/10, 2) program B written in BASIC in the graphics terminal hp2647A. The information transfer between the two programs is through a RS232C serial I/O interface. The flowcharts of the programs are given in Figs. 6.4(a) and 6.4(b). Program A is started by the system 80/10 Reset switch (this starts the program execution at location 0000) and B by the "RUN" command of hp2647A. The start of program B precedes the Reset operation i.e. the start of program A. The flowcharts are self explanatory. The features of program B are:

1. An output message "SYSTEM READY. TYPE 1 TO Start, 0 TO TERMINATE" is displayed.

2. A message in the form of one ASCII character is typed on the keyboard of 2647A as the response to the query in (1).

3. After computing the Walsh coefficients, an output message "SELECTION OF SPECTRUM. TYPE 'F' FOR FOURIER SPECTRUM, 'W' FOR WALSH SPECTRUM" is displayed. Depending on the response, either Walsh or Fourier coefficients are outputted.

4. In the case of Walsh coefficients, a table of cal and sal coefficients values in volts with the associated sequency numbers are given. Also plots of cal coefficient
Fig. 6.4(a) Flowchart of main Walsh spectral analyser program in Intel 8080 Assembly language
Start
- Receive PC
Display MS # 1
Receive RS

Stop
Is RS=1 or 0
1

Teceive PC
Display MS # 2
Is RS=W or F
W
Call WPLPOT
Display MS # 3
Is RS=1 or 0
0

Receive RS
Display MS # 4
Call FPLPOT

PC = Prompt Character
RS = Response
MS # 1: "SYSTEM READY. TYPE 1 TO START 0 TO TERMINATE"
MS # 2: "SELECTION OF SPECTRUM. TYPE W FOR WALSH COEFFICIENTS.
F FOR FOURIER COEFFICIENTS"
MS # 3: "TYPE 1 IF FOURIER COEFFICIENTS ARE NEEDED"
MS # 4: "TYPE 1 TO CONTINUE"

Fig. 6.4(b) Flowchart of BASIC program for hp 2647A graphic terminal
values vs sequency and sal coefficients values vs sequency are displayed. The results obtained for a sinusoidal voltage with a peak to peak voltage of 4 volts are illustrated in Table 6.3.

In the case of Fourier components, a table of magnitude in volts, phase in degrees and the frequency (Harmonic number) are given along with a plot of magnitude vs frequency. The results of Fourier analysis for the same sinusoidal signals are illustrated in Table 6.4.

5. After outputting the results as described above, a message "Type 1 to Continue" is displayed. Upon receipt of a response, the system is ready for the analysis of the next input signal.

6.6 Computational Errors

6.6.1 Errors on Walsh Coefficients

The computational errors in the following discussions are expressed in terms of the LSB of the A/D converter.

The 8-bit input data obtained from the A/D converter is in 2's complement form with an accuracy of ±1 lsb. (This error can be considered to be uniformly distributed.) In computing Walsh coefficient B(i) [Eqn. (3.3)], 64 sampled signal values are used (either added or subtracted) and the final result is divided by 64. Hence a total error in the computed error is ±1 lsb. (The fractional part due to
Table 6.3: Sal and cal coefficients of a sinusoidal signal 4 volts peak to peak

<table>
<thead>
<tr>
<th>sequency</th>
<th>coefficients</th>
<th>sequency</th>
<th>coefficients</th>
</tr>
</thead>
<tbody>
<tr>
<td></td>
<td>sal</td>
<td>cal</td>
<td></td>
</tr>
<tr>
<td>1</td>
<td>-2.4396</td>
<td>-0.6720</td>
<td>2</td>
</tr>
<tr>
<td>3</td>
<td>+1.0236</td>
<td>-0.2753</td>
<td>4</td>
</tr>
<tr>
<td>5</td>
<td>+0.2032</td>
<td>+0.0568</td>
<td>6</td>
</tr>
<tr>
<td>7</td>
<td>+0.4974</td>
<td>-0.1276</td>
<td>8</td>
</tr>
<tr>
<td>9</td>
<td>+0.0446</td>
<td>+0.0153</td>
<td>10</td>
</tr>
<tr>
<td>11</td>
<td>-0.0226</td>
<td>+0.0116</td>
<td>12</td>
</tr>
<tr>
<td>13</td>
<td>+0.0995</td>
<td>+0.0214</td>
<td>14</td>
</tr>
<tr>
<td>15</td>
<td>+0.2472</td>
<td>-0.0604</td>
<td>16</td>
</tr>
<tr>
<td>17</td>
<td>+0.0153</td>
<td>+0.0104</td>
<td>18</td>
</tr>
<tr>
<td>19</td>
<td>+0.0018</td>
<td>+0.0018</td>
<td>20</td>
</tr>
<tr>
<td>21</td>
<td>-0.0006</td>
<td>+0.0043</td>
<td>22</td>
</tr>
<tr>
<td>23</td>
<td>-0.0043</td>
<td>-0.0043</td>
<td>24</td>
</tr>
<tr>
<td>25</td>
<td>+0.0262</td>
<td>+0.0067</td>
<td>26</td>
</tr>
<tr>
<td>27</td>
<td>-0.0067</td>
<td>+0.0031</td>
<td>28</td>
</tr>
<tr>
<td>29</td>
<td>+0.0446</td>
<td>+0.0153</td>
<td>30</td>
</tr>
<tr>
<td>31</td>
<td>+0.1239</td>
<td>-0.0323</td>
<td>32</td>
</tr>
</tbody>
</table>

The level crossing detector was adjusted so that the synchronizing pulse for frequency multiplication was displaced by 13 degrees.
<table>
<thead>
<tr>
<th>Frequency</th>
<th>Amplitude</th>
<th>Phase</th>
<th>Frequency</th>
<th>Amplitude</th>
<th>Phase</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td>+3.9702</td>
<td>+13.0789</td>
<td>2</td>
<td></td>
<td></td>
</tr>
<tr>
<td>3</td>
<td></td>
<td>-</td>
<td>4</td>
<td></td>
<td></td>
</tr>
<tr>
<td>5</td>
<td></td>
<td></td>
<td>6</td>
<td></td>
<td></td>
</tr>
<tr>
<td>7</td>
<td></td>
<td></td>
<td>8</td>
<td></td>
<td></td>
</tr>
<tr>
<td>9</td>
<td></td>
<td></td>
<td>10</td>
<td></td>
<td></td>
</tr>
<tr>
<td>11</td>
<td></td>
<td></td>
<td>12</td>
<td></td>
<td></td>
</tr>
<tr>
<td>13</td>
<td></td>
<td></td>
<td>14</td>
<td></td>
<td></td>
</tr>
<tr>
<td>15</td>
<td></td>
<td></td>
<td>16</td>
<td></td>
<td></td>
</tr>
<tr>
<td>17</td>
<td></td>
<td></td>
<td>18</td>
<td></td>
<td></td>
</tr>
<tr>
<td>19</td>
<td></td>
<td></td>
<td>20</td>
<td></td>
<td></td>
</tr>
<tr>
<td>21</td>
<td></td>
<td></td>
<td>22</td>
<td></td>
<td></td>
</tr>
<tr>
<td>23</td>
<td></td>
<td>-</td>
<td>24</td>
<td></td>
<td></td>
</tr>
<tr>
<td>25</td>
<td></td>
<td>-</td>
<td>26</td>
<td></td>
<td></td>
</tr>
<tr>
<td>27</td>
<td></td>
<td></td>
<td>28</td>
<td></td>
<td></td>
</tr>
<tr>
<td>29</td>
<td></td>
<td></td>
<td>30</td>
<td></td>
<td></td>
</tr>
<tr>
<td>31</td>
<td></td>
<td></td>
<td>32</td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

- denotes zero
division by 64 is not neglected.)

6.6.2 Errors in Fourier Coefficients

Each Fourier coefficient is obtained by multiplying a maximum of 16 Walsh coefficients with the appropriate conversion matrix elements see Eqn. [(5.32)]. The conversion matrix elements are represented with a precision of nine fractional bits. After the conversion process, the fractional part is discarded, resulting in a maximum error of $-1 \text{ lsb}$, in addition to errors due to finite precision of the conversion elements. The latter errors are negligible when compared with the truncation error. Finally the computed Fourier coefficient is multiplied with the compensation element and the fractional part is ignored, introducing an additional error of $-1 \text{ lsb}$.

Hence the combined maximum error is $-2 \text{ lsb}$ in a computed Fourier coefficient. This error can be made $+1 \text{ lsb}$ by rounding off the fractional part rather than truncation.
CHAPTER 7

SUMMARY AND CONCLUSIONS

Walsh Spectral Analysers are classified into direct and transform types. Direct type instruments require $N^2$ arithmetic operations and transform type use $N \log_2 N$ operations to obtain $N$ coefficients from a given data length $N$. The former is costlier in terms of hardware and can be justified in situations where all the $N$ coefficients are needed immediately after the $N$th sample.

A serial processor using long shift registers implementing a Fast Walsh Hadamard Transform is suggested. The processor is faster than an earlier design used by Geadah and Corinthios [17] by 1.5 times. An improved method to adapt the above design to obtain coefficients in dyadic and sequency orderings is given. The method consists of storing the incoming data in a permuted manner in an auxiliary memory and retrieving the data sequentially from the memory. The data is sent to the serial processor for further processing. The permutation operation involves a binary to Gray code conversion followed by bit-reversal operation for sequency ordered coefficients and a bit-reversal operation for dyadic ordered coefficients.
This scheme avoids the different permutation operations between the iterations of the FWHT used in the design of Geadah and Corinthios [17]. It also results in identical stages, when a pipeline structure is used for higher input data rates. These processors can accept input data rates of the order of 1 MHz for digital devices available at present.

Microprocessors provide an alternative in realizing a Walsh Spectral Analyser. This is lower in cost and Walsh to Fourier conversion can be implemented through software. An instrument based on an off-the-shelf Intel System 80/10 is developed. It consists of a cabinet containing an 8080-based single-board computer SBC 80/10, a card cage with ready-wired bus and power supply lines for up to three additional boards and a power supply. Only one additional specially-designed board is used. It contains circuits for trigger level adjustment, A/D conversion, 1K of read-write memory, a frequency multiplier module (FMM) and a direct memory access (DMA) controller.

The FMM circuit generates 64 equally-spaced pulses within one period of the input signal, so that the data window spans one cycle. The leakage error in the Fourier spectrum measurement is made negligibly small. Another feature of the circuit is that it does not require a high-speed clock for its proper functioning. The signal frequency may lie anywhere within the range 0.2 Hz to 10 Hz.
without any range switching. A measurement uses two cycles of input signal, the first being used for frequency determination and the second cycle for data acquisition. This is followed by processing of 24 msec duration, to yield the Walsh spectrum.

Given the Walsh coefficients of the signal, the corresponding Fourier coefficients are obtained by way of a matrix multiplication so that both spectra are available. The conversion process takes about 1 sec, most of the time being devoted to software multiplication.

The computed results (Walsh and Fourier components) are transmitted to a Hewlett Packard graphical terminal type 2647A with processing capability for displaying numerical and graphic data.

The present instrument is based on an eight bit microprocessor, accepting digital data in 8 bit, two's complement representation. This limits the dynamic range of the input signal to 42 dB. For better dynamic range, an A/D converter of 12 bits should be used and the use of one of the very recent 16 bit microprocessors is desirable to simplify the software and to reduce the computation time.

The FMM circuit developed in the present work may be extended to a higher speed of operation by incorporating multiple D/A converters for fine tuning adjustment, each D/A to be used in a round robin fashion.
REFERENCES


