A Physiologically-Motivated Analysis of the Performance of Multichannel Linear Predictive Approaches to Dereverberation

O'Shaughnessy, Kyle

Please use this identifier to cite or link to this item: http://hdl.handle.net/11375/32571

Title:	A Physiologically-Motivated Analysis of the Performance of Multichannel Linear Predictive Approaches to Dereverberation
Other Titles:	A Perceptual Evaluation of Multichannel Linear Predictive Dereverberation
Authors:	O'Shaughnessy, Kyle
Advisor:	Bruce, Ian
Department:	Electrical and Computer Engineering
Keywords:	Audio Signal Processing, Dereverberation, Hearing Aids, Speech Perception
Publication Date:	2025
Abstract:	In practical acoustic environments, reflections give rise to reverberation, which makes speech perception more challenging, especially for individuals with hearing impairment. This creates a need for speech reproduction systems such as hearing aids to include strategies for reducing the perceptual impacts of reverberation (i.e., dereverberation algorithms). This thesis evaluates one of the most prevalent techniques, namely delay-and-predict dereverberation (Triki and Slock, 2006). Recent advancements in physiologically motivated predictors of speech intelligibility (SI) are leveraged to explain the complex impacts of reverberation and dereverberation on speech perception. In particular, the neurogram similarity index measure (NSIM) and the spectro-temporal modulation index (STMI) are used in addition to the well-known hearing aid speech perception index (HASPI) and short-time objective intelligibility (STOI). The results suggest that delay-and-predict dereverberation is relatively effective at reducing the earlier part of room impulse responses (RIRs), which provides sufficient restoration of temporal fine structure (TFS) and envelope (ENV) acoustic cues to reduce listening effort (LE) and compensate for deficits in SI for normal-hearing and hearing-impaired listeners. The algorithm is incapable of cancelling the later part of RIRs, but introducing a small amount of autocorrelation regularization is shown to greatly improve its impact on this late reverberation. In practice, however, delay-and-predict performance is shown to be limited by the number of microphones available, the need for large amounts of signal data, the presence of interfering acoustic signals, and potentially by time-varying acoustics. The evaluation also demonstrates that the NSIM and STMI provide a more complete picture of the perceptual impacts of reverberation than HASPI or STOI.
However, the NSIM is found to be highly sensitive to phase distortions, which may or may not reflect a realistic impact on speech perception, thus potentially limiting its usefulness in the evaluation of complex signal processing algorithms.
URI:	http://hdl.handle.net/11375/32571
Appears in Collections:	Open Access Dissertations and Theses

Files in This Item:

File	Description	Size	Format
O'Shaughnessy_Kyle_J_2025Sept_MASc.pdf Open Access		12.96 MB	Adobe PDF
