eScholarship
Open Access Publications from the University of California

UC Santa Barbara

UC Santa Barbara Electronic Theses and Dissertations

Recursively Adaptive Randomized Multi-Tree Coding (RAR MTC) of Speech with VAD/CNG

Abstract

A new tree codec for narrowband speech, “Recursively Adaptive Randomized Multi-Tree Coding (RAR MTC) with VAD/CNG”, is developed on a sample-by-sample analysis-and-synthesis linear predictive model by benchmarking and upgrading the tree coding models proposed by J. D. Gibson, W. Chang, and H. C. Woo in the 1990s. A simple Voice Activity Detection/Comfort Noise Generation (VAD/CNG) algorithm is added to the prior speech tree coder to lower the average bit rate and increase encoding efficiency. The backward adaptive all-pole short-term predictor, which was cascaded with a pitch-based long-term predictor, is replaced with a backward adaptive pole-zero predictor that tracks the input waveform better and predicts more accurately. RAR MTC encodes the initial samples of each voiced region by searching a 5-level Pitch Compensating Quantizer (PCQ) tree; a randomly interleaved 4-level and 2-level multitree (4-2 MTC) then encodes the remaining voiced samples, with its prediction parameters initialized by the 5-level tree coding. A newly developed gain control algorithm for the 2-level tree, based on the polarity pattern of the past 5 excitation values, improves gain tracking. In our simulations, these new features enable the RAR MTC codec to achieve performance competitive with the widely used AMR-NB standard, which is built on a CELP structure with a block-based predictive model, while offering lower delay and more natural tone recovery.
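
To illustrate how a VAD stage can lower the average bit rate, the following is a minimal sketch of a generic frame-energy detector, not the algorithm developed in the thesis. The frame length, the energy threshold, and the bit rates passed to `average_bit_rate` are hypothetical values chosen for illustration; the abstract does not specify any of them.

```python
import math

FRAME_LEN = 80  # hypothetical frame size: 10 ms at an 8 kHz sampling rate


def frame_energy_db(frame):
    """Mean-square energy of one frame in dB (small floor avoids log(0))."""
    mean_sq = sum(x * x for x in frame) / len(frame)
    return 10.0 * math.log10(mean_sq + 1e-12)


def simple_vad(samples, threshold_db=-40.0, frame_len=FRAME_LEN):
    """Return one boolean per full frame: True if the frame is judged active.

    The fixed threshold is an assumption; practical detectors usually adapt
    it to the background noise level.
    """
    flags = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        flags.append(frame_energy_db(frame) > threshold_db)
    return flags


def average_bit_rate(flags, active_bps, inactive_bps):
    """Average rate when active frames are coded at active_bps and inactive
    frames (replaced by comfort noise at the decoder) cost inactive_bps."""
    if not flags:
        return 0.0
    active_fraction = sum(flags) / len(flags)
    return active_fraction * active_bps + (1.0 - active_fraction) * inactive_bps


if __name__ == "__main__":
    # Four silent frames followed by four frames of a 200 Hz tone.
    silence = [0.0] * (4 * FRAME_LEN)
    tone = [0.5 * math.sin(2 * math.pi * 200 * n / 8000)
            for n in range(4 * FRAME_LEN)]
    flags = simple_vad(silence + tone)
    # The two rates below are placeholders, not figures from the thesis.
    print(flags, average_bit_rate(flags, active_bps=12000, inactive_bps=0))
```

In this sketch, the saving scales directly with the fraction of inactive frames, which is why conversational speech, with its frequent pauses, benefits from VAD/CNG.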
