1 | The Propagation Environment | 5 | |
1.1 | Introduction to Communications Issues | 5 | |
1.2 | AWGN Channel | 7 | |
1.2.1 | Background | 7 | |
1.2.2 | Practical Gaussian Channels | 8 | |
1.2.3 | Gaussian Noise | 9 | |
1.2.4 | Shannon-Hartley Law | 11 | |
1.3 | The Cellular Concept | 12 | |
1.4 | Radio Wave Propagation | 16 | |
1.4.1 | Background | 16 | |
1.4.2 | Narrow-band Fading Channels | 19 | |
1.4.3 | Propagation Pathloss Law | 20 | |
1.4.4 | Slow Fading Statistics | 24 | |
1.4.4.1 | Fast Fading Statistics | 25 | |
1.4.4.2 | Doppler Spectrum | 30 | |
1.4.4.3 | Simulation of Narrowband Channels | 32 | |
1.4.4.3.1 | Frequency-domain fading simulation | 34 | |
1.4.4.3.2 | Time-domain fading simulation | 35 | |
1.4.4.3.3 | Box-Müller Algorithm of AWGN generation | 35 | |
1.4.5 | Wideband Channels | 36 | |
1.4.5.1 | Modelling of Wideband Channels | 36 | |
1.5 | Shannon's Message for Wireless Channels | 42 | |
2 | Modulation and Transmission | 47 | |
2.1 | The Wireless Communications Scene | 47 | |
2.2 | Modulation Issues | 49 | |
2.2.1 | Choice of Modulation | 49 | |
2.2.2 | Quadrature Amplitude Modulation | 52 | |
2.2.2.1 | Background | 52 | |
2.2.2.2 | Modem Schematic | 53 | |
2.2.2.2.1 | Gray Mapping and Phasor Constellation | 53 | |
2.2.2.2.2 | Nyquist Filtering | 56 | |
2.2.2.2.3 | Modulation and demodulation | 59 | |
2.2.2.2.4 | Data recovery | 61 | |
2.2.2.3 | QAM Constellations | 62 | |
2.2.2.4 | 16QAM BER versus SNR Performance over AWGN Channels | 65 | |
2.2.2.4.1 | Decision Theory | 65 | |
2.2.2.4.2 | QAM modulation and transmission | 69 | |
2.2.2.4.3 | 16-QAM Demodulation in AWGN | 69 | |
2.2.2.5 | Reference Assisted Coherent QAM for Fading Channels | 73 | |
2.2.2.5.1 | PSAM System Description | 73 | |
2.2.2.5.2 | Channel Gain Estimation in PSAM | 76 | |
2.2.2.5.3 | PSAM performance | 78 | |
2.2.2.6 | Differentially detected QAM | 81 | |
2.2.3 | Adaptive Modulation | 84 | |
2.2.3.1 | Background to Adaptive Modulation | 84 | |
2.2.3.2 | Optimisation of Adaptive Modems | 88 | |
2.2.3.3 | Adaptive Modulation Performance | 91 | |
2.2.3.4 | Equalisation Techniques | 92 | |
2.2.4 | Orthogonal Frequency Division Modulation | 93 | |
2.3 | Packet Reservation Multiple Access | 96 | |
2.4 | Flexible Transceiver Architecture | 100 | |
3 | Convolutional Channel Coding | 103 | |
3.1 | Brief Channel Coding History | 103 | |
3.2 | Convolutional Encoding | 104 | |
3.3 | State and Trellis Transitions | 107 | |
3.4 | The Viterbi Algorithm | 110 | |
3.4.1 | Error-free hard-decision Viterbi decoding | 111 | |
3.4.2 | Erroneous hard-decision Viterbi decoding | 115 | |
3.4.3 | Error-free soft-decision Viterbi decoding | 116 | |
4 | Block-based Channel Coding | 121 | |
4.1 | Introduction | 121 | |
4.2 | Finite Fields | 122 | |
4.2.1 | Definitions | 122 | |
4.2.2 | Galois Field Construction | 127 | |
4.2.3 | Galois Field Arithmetic | 128 | |
4.3 | RS and BCH Codes | 131 | |
4.3.1 | Definitions | 131 | |
4.3.2 | RS Encoding | 132 | |
4.3.3 | RS Encoding Example | 135 | |
4.3.4 | Circuits for Cyclic Encoders | 139 | |
4.3.4.1 | Polynomial Multiplication | 139 | |
4.3.4.2 | Shift Register Encoding Example | 140 | |
4.3.5 | RS Decoding | 143 | |
4.3.5.1 | Formulation of the Key-Equations | 143 | |
4.3.5.2 | Peterson-Gorenstein-Zierler Decoder | 150 | |
4.3.5.3 | PGZ Decoding Example | 152 | |
4.3.5.4 | Berlekamp-Massey algorithm | 158 | |
4.3.5.5 | Berlekamp-Massey Decoding Example | 166 | |
4.3.5.6 | Forney Algorithm | 170 | |
4.3.5.7 | Forney Algorithm Example | 175 | |
4.3.5.8 | Error Evaluator Polynomial Computation | 177 | |
4.4 | RS and BCH Codec Performance | 180 | |
4.5 | Summary and Conclusions | 184 | |
5 | Speech Signals and Coding | 189 | |
5.1 | Motivation | 189 | |
5.2 | Basic Characterisation of Speech Signals | 190 | |
5.3 | Classification of Speech Codecs | 195 | |
5.3.1 | Waveform Coding | 196 | |
5.3.1.1 | Time-domain waveform Coding | 196 | |
5.3.1.2 | Frequency-domain Waveform Coding | 197 | |
5.3.2 | Vocoders | 197 | |
5.3.3 | Hybrid Coding | 199 | |
5.4 | Waveform Coding | 199 | |
5.4.1 | Digitisation of Speech | 199 | |
5.4.2 | Quantisation Characteristics | 201 | |
5.4.3 | Quantisation Noise and Rate-Distortion Theory | 203 | |
5.4.4 | Non-uniform Quantisation: Companding | 206 | |
5.4.5 | Logarithmic Compression | 208 | |
5.4.5.1 | $\mu$-Law Compander | 210 | |
5.4.5.2 | A-law Companding | 212 | |
5.4.6 | Optimum Non-uniform Quantisation | 213 | |
6 | Predictive Coding | 221 | |
6.1 | Forward Predictive Coding | 221 | |
6.2 | DPCM Codec Schematic | 222 | |
6.3 | Predictor Design | 224 | |
6.3.1 | Problem Formulation | 224 | |
6.3.2 | Covariance Coefficient Computation | 226 | |
6.3.3 | Predictor Coefficient Computation | 228 | |
6.4 | Adaptive One-word-memory Quantization | 233 | |
6.5 | DPCM Performance | 236 | |
6.6 | Backward-Adaptive Prediction | 238 | |
6.6.1 | Background | 238 | |
6.6.2 | Stochastic model processes | 240 | |
6.7 | The 32 kbps G.721 ADPCM Codec | 244 | |
6.7.1 | Functional Description | 244 | |
6.7.2 | Adaptive Quantiser | 246 | |
6.7.3 | Quantiser scale factor adaptation | 246 | |
6.7.4 | Adaptation speed control | 248 | |
6.7.5 | Adaptive prediction and signal reconstruction | 249 | |
6.8 | Speech Quality | 251 | |
6.9 | G.726 and G.727 ADPCM Coding | 253 | |
6.9.1 | Motivation | 253 | |
6.9.2 | Embedded ADPCM coding | 254 | |
6.9.3 | Performance of the Embedded G.727 ADPCM Codec | 256 | |
6.10 | Rate-Distortion in Predictive Coding | 263 | |
7 | Analysis-by-synthesis Principles | 273 | |
7.1 | Motivation | 273 | |
7.2 | Analysis-by-synthesis codec structure | 274 | |
7.3 | The Short-term Synthesis Filter | 276 | |
7.4 | Long-Term Prediction | 280 | |
7.4.1 | Open-loop Optimisation of LTP parameters | 280 | |
7.4.2 | Closed-loop Optimisation of LTP parameters | 287 | |
7.5 | Excitation Models | 292 | |
7.6 | Adaptive Postfiltering | 295 | |
7.7 | Lattice-based Linear Prediction | 299 | |
8 | Speech Spectral Quantization | 307 | |
8.1 | Log-area ratios | 307 | |
8.2 | Line Spectral Frequencies | 311 | |
8.2.1 | Derivation of Line Spectral Frequencies | 311 | |
8.2.2 | Determination of Line Spectral Frequencies | 316 | |
8.2.3 | Chebyshev-description of Line Spectral Frequencies | 318 | |
8.3 | Spectral Vector Quantization | 325 | |
8.3.1 | Background | 325 | |
8.3.2 | Speaker-adaptive Vector Quantisation of LSFs | 325 | |
8.3.3 | Stochastic VQ of LPC Parameters | 328 | |
8.3.3.1 | Background | 328 | |
8.3.3.2 | The Stochastic VQ Algorithm | 330 | |
8.3.4 | Robust Vector Quantisation Schemes for LSFs | 333 | |
8.3.5 | LSF Vector-quantisers used in standard codecs | 336 | |
9 | RPE Coding | 339 | |
9.1 | Theoretical Background | 339 | |
9.2 | The RPE-LTP GSM Speech encoder | 348 | |
9.2.1 | Pre-processing | 348 | |
9.2.2 | STP analysis filtering | 350 | |
9.2.3 | LTP analysis filtering | 351 | |
9.2.4 | Regular Excitation Pulse Computation | 352 | |
9.3 | The RPE-LTP Speech Decoder | 353 | |
9.4 | Bit-sensitivity of the GSM Codec | 358 | |
9.5 | A Tool-box Based Speech Transceiver | 359 | |
10 | Forward-Adaptive CELP Coding | 363 | |
10.1 | Background | 363 | |
10.2 | The Original CELP Approach | 365 | |
10.3 | Fixed Codebook Search | 369 | |
10.4 | CELP Excitation Models | 371 | |
10.4.1 | Binary Pulse Excitation | 371 | |
10.4.2 | Transformed Binary Pulse Excitation | 373 | |
10.4.2.1 | Excitation Generation | 373 | |
10.4.2.2 | TBPE Bit Sensitivity | 375 | |
10.4.3 | Dual-rate Algebraic CELP Coding | 379 | |
10.4.3.1 | ACELP Codebook Structure | 379 | |
10.4.3.2 | Dual-rate ACELP Bit Allocation | 381 | |
10.4.3.3 | Dual-rate ACELP Codec Performance | 382 | |
10.5 | CELP Optimization | 383 | |
10.5.1 | Introduction | 383 | |
10.5.2 | Calculation of the Excitation Parameters | 385 | |
10.5.2.1 | Full Codebook Search Theory | 385 | |
10.5.2.2 | Sequential Search Procedure | 388 | |
10.5.2.3 | Full Search Procedure | 389 | |
10.5.2.4 | Sub-Optimal Search Procedures | 390 | |
10.5.2.5 | Quantization of the Codebook Gains | 393 | |
10.5.3 | Calculation of the Synthesis Filter Parameters | 396 | |
10.5.3.1 | Bandwidth Expansion | 396 | |
10.5.3.2 | Least Squares Techniques | 397 | |
10.5.3.3 | Optimization via Powell's Method | 401 | |
10.5.3.4 | Simulated Annealing and the Effects of Quantization | 403 | |
10.6 | CELP Error-sensitivity | 407 | |
10.6.1 | Introduction | 407 | |
10.6.2 | Improving the Spectral Information Error Sensitivity | 408 | |
10.6.2.1 | LSF Ordering Policies | 408 | |
10.6.2.2 | The Effect of FEC on the Spectral Parameters | 411 | |
10.6.2.3 | The Effect of Interpolation | 412 | |
10.6.3 | Improving the Error Sensitivity of the Excitation Parameters | 414 | |
10.6.3.1 | The Fixed Codebook Index | 414 | |
10.6.3.2 | The Fixed Codebook Gain | 415 | |
10.6.3.3 | Adaptive Codebook Delay | 416 | |
10.6.3.4 | Adaptive Codebook Gain | 417 | |
10.6.4 | Matching Channel Coders to the Speech Coder | 417 | |
10.6.5 | Error Resilience Conclusions | 422 | |
10.7 | Dual-mode Speech Transceiver | 423 | |
10.7.1 | The Transceiver Scheme | 423 | |
10.7.2 | Re-configurable Modulation | 424 | |
10.7.3 | Source-matched Error Protection | 427 | |
10.7.3.1 | Low-quality 3.1 kBd Mode | 427 | |
10.7.3.2 | High-quality 3.1 kBd Mode | 432 | |
10.7.4 | Packet Reservation Multiple Access | 434 | |
10.7.5 | 3.1 kBd System Performance | 437 | |
10.7.6 | 3.1 kBd System Summary | 441 | |
10.8 | Multi-slot PRMA Transceiver | 442 | |
10.8.1 | Background and Motivation | 442 | |
10.8.2 | PRMA-assisted Multi-slot Adaptive Modulation | 443 | |
10.8.3 | Adaptive GSM-like Schemes | 445 | |
10.8.4 | Adaptive DECT-like Schemes | 447 | |
10.8.5 | Summary of Adaptive Multi-slot PRMA | 448 | |
11 | Standard CELP Codecs | 451 | |
11.1 | Background | 451 | |
11.2 | The US DoD FS 1016 4.8 kbits/s CELP Codec | 452 | |
11.2.1 | Introduction | 452 | |
11.2.2 | LPC Analysis and Quantization | 453 | |
11.2.3 | The Adaptive Codebook | 455 | |
11.2.4 | The Fixed Codebook | 456 | |
11.2.5 | Error Concealment Techniques | 458 | |
11.2.6 | Decoder Post-Filtering | 459 | |
11.2.7 | Conclusion | 459 | |
11.3 | The IS-54 DAMPS speech codec | 460 | |
11.4 | The JDC speech codec | 464 | |
11.5 | The Qualcomm Variable Rate CELP Codec | 468 | |
11.5.1 | Introduction | 468 | |
11.5.2 | Codec Schematic and Bit Allocation | 469 | |
11.5.3 | Codec Rate Selection | 470 | |
11.5.4 | LPC Analysis and Quantization | 471 | |
11.5.5 | The Pitch Filter | 473 | |
11.5.6 | The Fixed Codebook | 474 | |
11.5.7 | Rate 1/8 Filter Excitation | 475 | |
11.5.8 | Decoder Post-Filtering | 476 | |
11.5.9 | Error Protection and Concealment Techniques | 477 | |
11.5.10 | Conclusion | 478 | |
11.6 | Japanese Half-Rate Speech Codec | 478 | |
11.6.1 | Introduction | 478 | |
11.6.2 | Codec Schematic and Bit Allocation | 479 | |
11.6.3 | Encoder Pre Processing | 482 | |
11.6.4 | LPC Analysis and Quantization | 482 | |
11.6.5 | The Weighting Filter | 484 | |
11.6.6 | Excitation Vector 1 | 484 | |
11.6.7 | Excitation Vector 2 | 485 | |
11.6.8 | Quantization of the Gains | 489 | |
11.6.9 | Channel Coding | 490 | |
11.6.10 | Decoder Post Processing | 491 | |
11.7 | The half-rate GSM codec | 493 | |
11.7.1 | Half-rate GSM codec outline | 493 | |
11.7.2 | Half-rate GSM codec spectral quantisation | 496 | |
11.7.3 | Error protection | 497 | |
11.8 | The 8 kbits/s G.729 Codec | 499 | |
11.8.1 | Introduction | 499 | |
11.8.2 | Codec Schematic and Bit Allocation | 499 | |
11.8.3 | Encoder Pre-Processing | 501 | |
11.8.4 | LPC Analysis and Quantization | 501 | |
11.8.5 | The Weighting Filter | 505 | |
11.8.6 | The Adaptive Codebook | 506 | |
11.8.7 | The Fixed Algebraic Codebook | 507 | |
11.8.8 | Quantization of the Gains | 511 | |
11.8.9 | Decoder Post Processing | 513 | |
11.8.10 | G.729 Error Concealment Techniques | 515 | |
11.8.11 | G.729 Bit-sensitivity | 517 | |
11.8.12 | Turbo-coded OFDM G.729 Speech Transceiver | 518 | |
11.8.12.1 | Background | 518 | |
11.8.12.2 | System Overview | 519 | |
11.8.12.3 | Turbo Channel Encoding | 519 | |
11.8.12.4 | OFDM in the FRAMES Speech/Data Sub-Burst | 521 | |
11.8.12.5 | Channel model | 522 | |
11.8.12.6 | Turbo-coded G.729 OFDM Parameters | 523 | |
11.8.12.7 | Turbo-coded G.729 OFDM Performance | 524 | |
11.8.12.8 | Turbo-coded G.729 OFDM Summary | 525 | |
11.8.13 | G.729 Summary | 527 | |
11.9 | The Reduced Complexity G.729 Annex A Codec | 527 | |
11.9.1 | Introduction | 527 | |
11.9.2 | The Perceptual Weighting Filter | 528 | |
11.9.3 | The Open Loop Pitch Search | 529 | |
11.9.4 | The Closed Loop Pitch Search | 529 | |
11.9.5 | The Algebraic Codebook Search | 530 | |
11.9.6 | The Decoder Post Processing | 531 | |
11.9.7 | Conclusions | 531 | |
11.10 | The Enhanced GSM codec | 532 | |
11.10.1 | Codec Outline | 532 | |
11.10.2 | Operation of the EFR-GSM Encoder | 534 | |
11.10.2.1 | Spectral quantisation in the EFR-GSM Codec | 534 | |
11.10.2.2 | Adaptive Codebook Search | 537 | |
11.10.2.3 | Fixed Codebook Search | 538 | |
11.11 | The IS-136 speech codec | 539 | |
11.11.1 | IS-136 codec outline | 539 | |
11.11.2 | IS-136 Bit Allocation Scheme | 541 | |
11.11.3 | Fixed Codebook Search | 543 | |
11.11.4 | IS-136 channel coding | 544 | |
11.12 | The ITU G.723.1 dual rate codec | 545 | |
11.12.1 | Introduction | 545 | |
11.12.2 | G.723.1 Encoding Principle | 546 | |
11.12.3 | Vector-Quantisation of the LSPs | 548 | |
11.12.4 | Formant-based Weighting Filter | 550 | |
11.12.5 | The 6.3 kbps high-rate G.723.1 excitation | 550 | |
11.12.6 | The 5.3 kbps low-rate G.723.1 excitation | 553 | |
11.12.7 | G.723.1 Bit Allocation | 554 | |
11.12.8 | G.723.1 Error Sensitivity | 556 | |
11.13 | Summary of Standard CELP-based Codecs | 559 | |
12 | Backward-Adaptive CELP Coding | 563 | |
12.1 | Introduction | 563 | |
12.2 | Motivation and Background | 564 | |
12.3 | Backward-Adaptive G.728 Schematic | 568 | |
12.4 | Backward-Adaptive G.728 Coding | 571 | |
12.4.1 | Error Weighting | 571 | |
12.4.2 | Windowing | 571 | |
12.4.3 | Codebook Gain Adaption | 576 | |
12.4.4 | Codebook Search | 580 | |
12.4.5 | Excitation Vector Quantization | 583 | |
12.4.6 | Adaptive Postfiltering | 585 | |
12.4.6.1 | Adaptive Long-term Postfiltering | 586 | |
12.4.6.2 | Adaptive Short-term Postfiltering | 588 | |
12.4.7 | Complexity and Performance of the G.728 Codec | 589 | |
12.5 | Reduced-Rate 16-8 kbps G.728-Like Codec I | 590 | |
12.6 | The Effects of Long Term Prediction | 594 | |
12.7 | Closed-Loop Codebook Training | 601 | |
12.8 | Reduced-Rate 16-8 kbps G.728-Like Codec II | 607 | |
12.9 | Programmable-Rate 8-4 kbps CELP Codecs | 609 | |
12.9.1 | Motivation | 609 | |
12.9.2 | Improvements Due to Increasing Codebook Sizes | 609 | |
12.9.3 | Forward Adaption of the Short Term Synthesis Filter | 610 | |
12.9.4 | Forward Adaption of the Long Term Predictor | 613 | |
12.9.4.1 | Initial Experiments | 613 | |
12.9.4.2 | Quantization of Jointly Optimized Gains | 615 | |
12.9.4.3 | Voiced/Unvoiced Switched Codebooks | 619 | |
12.9.5 | Low Delay Codecs at 4-8 kbits/s | 621 | |
12.9.6 | Low Delay ACELP Codec | 625 | |
12.10 | Error Sensitivity Issues | 629 | |
12.10.1 | The Error Sensitivity of the G.728 Codec | 630 | |
12.10.2 | The Error Sensitivity of Our 4-8 kbits/s Low Delay Codecs | 632 | |
12.10.3 | The Error Sensitivity of Our Low Delay ACELP Codec | 638 | |
12.11 | A Low-Delay Multimode Speech Transceiver | 638 | |
12.12 | Background | 638 | |
12.12.1 | 8-16 kbps Codec Performance | 640 | |
12.12.2 | Transmission Issues | 642 | |
12.12.2.1 | Higher-quality Mode | 642 | |
12.12.3 | Lower-quality Mode | 644 | |
12.13 | Speech Transceiver Performance | 644 | |
12.14 | Conclusion | 645 | |
13 | Wideband Speech Coding | 649 | |
13.1 | Subband-ADPCM Wideband Coding | 649 | |
13.1.1 | Introduction and Specifications | 649 | |
13.1.2 | G.722 Codec Outline | 650 | |
13.1.3 | Principles of Subband Coding | 654 | |
13.1.4 | Quadrature Mirror Filtering | 656 | |
13.1.4.1 | Analysis Filtering | 656 | |
13.1.4.2 | Synthesis Filtering | 660 | |
13.1.4.3 | Practical QMF Design Constraints | 661 | |
13.1.5 | G.722 Adaptive Quantisation and Prediction | 668 | |
13.1.6 | G.722 Coding Performance | 670 | |
13.2 | Wideband Transform-Coding at 32 kbps | 671 | |
13.2.1 | Background | 671 | |
13.2.2 | Transform-Coding Algorithm | 671 | |
13.3 | Subband-Split Wideband CELP Codecs | 675 | |
13.3.1 | Background | 675 | |
13.3.2 | Subband-based Wideband CELP coding | 676 | |
13.3.2.1 | Motivation | 676 | |
13.3.2.2 | Low-band Coding | 678 | |
13.3.2.3 | Highband Coding | 678 | |
13.3.2.4 | Bit allocation Scheme | 679 | |
13.4 | Fullband Wideband ACELP Coding | 680 | |
13.4.1 | Wideband ACELP Excitation | 680 | |
13.4.2 | Wideband 32 kbps ACELP Coding | 684 | |
13.4.3 | Wideband 9.6 kbps ACELP Coding | 685 | |
13.5 | Conclusions | 687 | |
14 | Introduction to Very Low Rate Speech Coding | 691 | |
14.1 | Sub-4.8 kbps Coding Techniques | 691 | |
14.1.1 | Analysis-by-Synthesis Coding | 693 | |
14.1.2 | Speech Coding at 2400bps | 696 | |
14.1.2.1 | Background to 2400bps Speech Coding | 696 | |
14.1.2.2 | Frequency Selective Harmonic Coder | 699 | |
14.1.2.3 | Sinusoidal Transform Coder | 700 | |
14.1.2.4 | Multiband Excitation Coders | 701 | |
14.1.2.5 | Sub-band Linear Prediction Coder | 703 | |
14.1.2.6 | Mixed Excitation Linear Prediction Coder | 705 | |
14.1.2.7 | Waveform Interpolation Coder | 706 | |
14.1.3 | Speech Coding Below 2400bps | 708 | |
14.2 | Linear Predictive Coding model | 712 | |
14.2.1 | Short Term Prediction | 712 | |
14.2.2 | Long Term Prediction | 714 | |
14.2.3 | Final Analysis-by-Synthesis Model | 715 | |
14.3 | Speech Quality Measurements | 716 | |
14.3.1 | Objective Speech Quality Measures | 717 | |
14.3.2 | Subjective Speech Quality Measures | 717 | |
14.3.3 | 2400bps Selection Process | 718 | |
14.4 | Speech Database | 721 | |
15 | Linear Predictive Vocoder | 723 | |
15.1 | Overview of a Linear Predictive Vocoder | 723 | |
15.2 | Line Spectrum Frequencies Quantization | 724 | |
15.2.1 | Line Spectrum Frequencies Scalar Quantization | 725 | |
15.2.2 | Line Spectrum Frequencies Vector Quantization | 726 | |
15.3 | Pitch Detection | 730 | |
15.3.1 | Voiced-Unvoiced Decision | 733 | |
15.3.2 | Oversampled Pitch Detector | 735 | |
15.3.3 | Pitch Tracking | 738 | |
15.3.3.1 | Computational Complexity | 743 | |
15.3.4 | Integer Pitch Detector | 744 | |
15.4 | Unvoiced Frames | 745 | |
15.5 | Voiced Frames | 747 | |
15.5.1 | Placement of Pulses | 747 | |
15.5.2 | Pulse Energy | 747 | |
15.6 | Adaptive Post Filter | 748 | |
15.7 | Results for Linear Predictive Vocoder | 751 | |
16 | Wavelets | 759 | |
16.1 | Conceptual Introduction to Wavelets | 759 | |
16.1.1 | Fourier Theory | 759 | |
16.1.2 | Wavelet Theory | 761 | |
16.1.3 | Detecting Discontinuities | 762 | |
16.2 | Introduction to Wavelet Mathematics | 764 | |
16.2.1 | Multiresolution Analysis | 765 | |
16.2.2 | Polynomial Spline Wavelets | 766 | |
16.2.3 | Pyramidal Algorithm | 767 | |
16.2.4 | Boundary Effects | 769 | |
16.3 | Preprocessing the Wavelet Transform Signal | 771 | |
16.3.1 | Spurious Pulses | 771 | |
16.3.2 | Normalization | 774 | |
16.3.3 | Candidate Glottal Pulses | 774 | |
16.4 | Voiced-Unvoiced Decision | 774 | |
16.5 | Wavelet Based Pitch Detector | 777 | |
16.5.1 | Dynamic Programming | 778 | |
16.5.2 | Autocorrelation Simplification | 782 | |
17 | Zinc Function Excitation | 787 | |
17.1 | Overview of Interpolated Zinc Function Prototype Excitation | 787 | |
17.1.1 | Coding Scenarios | 787 | |
17.1.1.1 | U-U-U Encoder Scenario | 788 | |
17.1.1.2 | U-U-V Encoder Scenario | 788 | |
17.1.1.3 | V-U-U Encoder Scenario | 788 | |
17.1.1.4 | U-V-U Encoder Scenario | 791 | |
17.1.1.5 | V-V-V Encoder Scenario | 791 | |
17.1.1.6 | V-U-V Encoder Scenario | 791 | |
17.1.1.7 | U-V-V Encoder Scenario | 791 | |
17.1.1.8 | V-V-U Encoder Scenario | 792 | |
17.1.1.9 | U-V Decoder Scenario | 793 | |
17.1.1.10 | U-U Decoder Scenario | 793 | |
17.1.1.11 | V-U Decoder Scenario | 793 | |
17.1.1.12 | V-V Decoder Scenario | 793 | |
17.2 | Zinc Function Modelling | 794 | |
17.2.1 | Error Minimization | 794 | |
17.2.2 | Computational Complexity | 795 | |
17.2.3 | Phases of the Zinc Functions | 797 | |
17.3 | Pitch Detection | 797 | |
17.3.1 | Voiced-Unvoiced Boundaries | 798 | |
17.3.2 | Pitch Prototype Selection | 798 | |
17.4 | Voiced Speech | 800 | |
17.4.1 | Energy Scaling | 803 | |
17.4.2 | Quantization | 804 | |
17.5 | Interpolation | 806 | |
17.5.1 | Amplitude Parameter Interpolation | 808 | |
17.5.2 | Position Parameter Interpolation | 808 | |
17.5.3 | Removal of Position Parameter Interpolation | 809 | |
17.5.4 | Line Spectrum Frequencies Pitch Synchronous Interpolation | 810 | |
17.5.5 | Interpolation Example | 811 | |
17.6 | Unvoiced Speech | 811 | |
17.7 | Adaptive Post Filter | 814 | |
17.8 | Results for Interpolated Zinc Function Prototype Excitation Coder | 814 | |
18 | Mixed Multiband Excitation | 821 | |
18.1 | Overview of Mixed Multiband Excitation | 821 | |
18.2 | Finite Impulse Response Filter | 826 | |
18.2.1 | Computational Complexity | 828 | |
18.3 | Mixed Multiband Excitation Encoder | 828 | |
18.3.1 | Voicing Strengths | 830 | |
18.4 | Mixed Multiband Excitation Decoder | 833 | |
18.5 | Adaptive Post Filter | 837 | |
18.6 | Results for Mixed Multiband Excitation Coder | 837 | |
18.6.1 | Results for a Mixed Multiband Excitation and Linear Predictive Coder | 837 | |
18.6.2 | Results for a Mixed Multiband Excitation and Zinc Function Prototype Excitation Coder | 845 | |
19 | Comparison of Speech Transceivers | 851 | |
19.1 | Background to Speech Quality Evaluation | 851 | |
19.2 | Objective Speech Quality Measures | 852 | |
19.2.1 | Introduction | 852 | |
19.2.2 | Signal to Noise Ratios | 853 | |
19.2.3 | Articulation Index | 854 | |
19.2.4 | Cepstral Distance | 855 | |
19.2.5 | Cepstral Example | 858 | |
19.2.6 | Logarithmic likelihood ratio | 861 | |
19.2.7 | Euclidean Distance | 862 | |
19.3 | Subjective Measures | 862 | |
19.3.1 | Quality Tests | 863 | |
19.4 | Comparison of Quality Measures | 864 | |
19.4.1 | Background | 864 | |
19.4.2 | Intelligibility tests | 865 | |
19.5 | Subjective Speech Quality of Various Codecs | 867 | |
19.6 | Speech Codec Bit-sensitivity | 869 | |
19.7 | Transceiver Speech Performance | 873 | |
20 | Zinc Function Excitation | 879 |