A Physiologically-Motivated Analysis of the Performance of Multichannel Linear Predictive Approaches to Dereverberation
Abstract
In practical acoustic environments, reflections give rise to reverberation, which makes
speech perception more challenging, especially for individuals with hearing impairment.
This creates a need for speech reproduction systems such as hearing aids to
include strategies for reducing the perceptual impacts of reverberation (i.e., dereverberation
algorithms). In this thesis, an evaluation of one of the most prevalent techniques,
namely delay-and-predict dereverberation (Triki and Slock, 2006), is provided.
Recent advancements in physiologically motivated predictors of speech intelligibility
(SI) are leveraged to explain the complex impacts of reverberation/dereverberation
on speech perception. In particular, the neurogram similarity index measure (NSIM)
and the spectro-temporal modulation index (STMI) are utilized in addition to the
well-known hearing aid speech perception index (HASPI) and short-time objective
intelligibility (STOI). The results suggest that delay-and-predict dereverberation is
relatively effective at reducing the earlier part of room impulse responses (RIRs),
which provides sufficient restoration of temporal fine structure (TFS) and envelope
(ENV) acoustic cues to reduce listening effort (LE) and compensate for deficits in SI for
normal-hearing and hearing-impaired listeners. The algorithm is incapable of cancelling
the later part of RIRs, but introducing a small amount of autocorrelation
regularization is shown to greatly improve its impact on this late reverberation.
In practice, however, delay-and-predict performance is shown to be
limited by the number of microphones available, the need for large amounts of signal
data, the presence of interfering acoustic signals, and potentially by time-varying
acoustics. The evaluation also demonstrates that the NSIM and STMI provide a more
complete picture of the perceptual impacts of reverberation than HASPI or STOI.
However, the NSIM is found to be highly sensitive to phase distortions, which may or
may not reflect a realistic impact on speech perception, thus potentially limiting its
usefulness in the evaluation of complex signal processing algorithms.