Please use this identifier to cite or link to this item: http://hdl.handle.net/11375/29764
Title: EVALUATING THE INTELLIGIBILITY OF SPEECH CHIMAERAS BASED ON ENVELOPE AND TEMPORAL FINE STRUCTURE CUES
Authors: Li, Yujie
Advisor: Bruce, Ian
Department: Electrical and Computer Engineering
Keywords: speech intelligibility; envelope; temporal fine structure; mean-rate; fine-timing; speech chimaeras
Publication Date: 2024
Abstract: Speech intelligibility is a measure of the human ability to understand speech signals. A speech signal can be decomposed into its envelope and its temporal fine structure. The envelope is the contour of the signal amplitude over time and reveals the rhythm and intensity of the speech; the temporal fine structure is the rapidly oscillating portion of the signal and carries pitch and timbre information. Studies on speech intelligibility have shown that both the acoustic envelope and the temporal fine structure contribute to intelligibility, with the envelope playing an important role in quiet and the temporal fine structure playing an important role in background noise. In this thesis, two speech signals are selected, and the envelope of one signal is combined with the temporal fine structure of the other to generate different speech chimaera signals. Three methods are applied to evaluate speech intelligibility: the Spectro-Temporal Modulation Index (STMI), the Neurogram Similarity Index Measure (NSIM), and Cross-Correlation Coefficients (CCC). This thesis describes these three methods in detail, in particular the creation of physiologically based assessment matrices, and then analyzes and compares the results by fitting regression models that relate the values predicted by the different algorithms to experimentally measured subjective perception. The results show that combining the STMI with either the fine-timing NSIM or the temporal fine structure CCC provides the optimal prediction model for speech chimaera signals, and they offer some implications for speech intelligibility research.
Description: Speech intelligibility is a measure of how much speech information the listener perceives. Speech quality and intelligibility are not synonymous: in some cases, good intelligibility can be achieved with degraded speech quality. If speech quality is about the "how", intelligibility is about the "what". In other words, speech quality concerns how the speech sounds (whether it is clear, lossless, noiseless, and so on), while speech intelligibility concerns the comprehensibility of the speech information, that is, whether the listener can accurately understand what the speech signal conveys, i.e., the messages or vocabulary. After introducing the concept of speech intelligibility, this thesis investigates various methods for predicting it.
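The chimaera construction summarized in the abstract (the envelope of one signal combined with the temporal fine structure of another) can be illustrated with a minimal sketch. This record does not state how the thesis performs the decomposition, so everything below is an assumption made for illustration: a single-band Hilbert decomposition computed with NumPy's FFT, hypothetical function names (`analytic_signal`, `envelope_and_tfs`, `make_chimaera`), and toy test tones in place of speech. Chimaera studies commonly split the signals into several frequency bands first; that step is omitted here.

```python
import numpy as np

def analytic_signal(x):
    """Return the analytic signal of a real 1-D signal using the FFT."""
    n = len(x)
    spectrum = np.fft.fft(x)
    # Keep DC (and Nyquist for even n), double positive frequencies,
    # and zero out negative frequencies.
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * weights)

def envelope_and_tfs(x):
    """Split a signal into its Hilbert envelope and temporal fine structure."""
    z = analytic_signal(x)
    envelope = np.abs(z)          # slowly varying amplitude contour
    tfs = np.cos(np.angle(z))     # unit-amplitude, rapidly oscillating carrier
    return envelope, tfs

def make_chimaera(env_source, tfs_source):
    """Combine the envelope of one signal with the fine structure of another."""
    envelope, _ = envelope_and_tfs(env_source)
    _, tfs = envelope_and_tfs(tfs_source)
    return envelope * tfs

# Toy inputs (assumed, not speech): a 500 Hz tone with 4 Hz amplitude
# modulation supplies the envelope, a steady 1 kHz tone supplies the TFS.
fs = 16000
t = np.arange(fs) / fs
modulation = 1.0 + 0.8 * np.sin(2 * np.pi * 4 * t)
slow_am = modulation * np.sin(2 * np.pi * 500 * t)
steady_tone = np.sin(2 * np.pi * 1000 * t)

chimaera = make_chimaera(slow_am, steady_tone)
```

In this toy case the result is a 1 kHz tone whose amplitude follows the 4 Hz modulation of the other signal, which is the defining property of a chimaera: the slow amplitude contour comes from one source and the fast oscillation from the other.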
URI: http://hdl.handle.net/11375/29764
Appears in Collections: Open Access Dissertations and Theses

Files in This Item:
File: Li_Yujie_2024April_MASc.pdf
Description: Open Access
Size: 6.29 MB
Format: Adobe PDF
Items in MacSphere are protected by copyright, with all rights reserved, unless otherwise indicated.
