Identification of speech transients using variable frame rate analysis and wavelet packets

Daniel M Rasetshwane, J. Robert Boston, Ching Chung Li

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

5 Citations (Scopus)

Abstract

Speech transients are important cues for identifying and discriminating speech sounds. Yoo et al. and Tantibundhit et al. successfully identified speech transients and, by emphasizing them, improved the intelligibility of speech in noise [3] [4]. However, their methods are computationally intensive and unsuitable for real-time applications. This paper presents a method to identify and emphasize speech transients that combines subband decomposition by the wavelet packet transform with variable frame rate (VFR) analysis and unvoiced consonant detection. VFR analysis is applied to each wavelet packet to define a transitivity function that describes the extent to which the wavelet coefficients of that packet are changing. Unvoiced consonant detection identifies unvoiced consonant intervals, and the transitivity function is amplified during these intervals. The wavelet coefficients of each packet are multiplied by that packet's transitivity function, which amplifies coefficients at times when they are changing and attenuates them at times when they are steady. The inverse transform of the modified wavelet packet coefficients produces a signal corresponding to speech transients, similar to the transients identified by Yoo et al. and Tantibundhit et al. A preliminary implementation of the algorithm runs more efficiently than those earlier methods.
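
The processing chain described in the abstract (decompose into wavelet packets, weight each packet by a transitivity function, invert) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the Haar wavelet, the decomposition depth, the frame length, and in particular the transitivity measure, which is approximated as the normalized change in frame energy between adjacent frames. Unvoiced consonant detection is omitted. All function names are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_split(x):
    """One Haar analysis step: returns (approximation, detail), each half length."""
    a = [(x[i] + x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    d = [(x[i] - x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    return a, d

def haar_merge(a, d):
    """Inverse of haar_split."""
    x = []
    for ai, di in zip(a, d):
        x.extend([(ai + di) / SQRT2, (ai - di) / SQRT2])
    return x

def wp_decompose(x, levels):
    """Full wavelet packet tree: 2**levels packets of len(x) / 2**levels coefficients."""
    packets = [list(x)]
    for _ in range(levels):
        nxt = []
        for p in packets:
            nxt.extend(haar_split(p))  # appends the approximation, then the detail
        packets = nxt
    return packets

def wp_reconstruct(packets):
    """Merge sibling packets pairwise until one signal remains."""
    while len(packets) > 1:
        packets = [haar_merge(packets[i], packets[i + 1])
                   for i in range(0, len(packets), 2)]
    return packets[0]

def transitivity(coeffs, frame=4):
    """Assumed stand-in for the paper's transitivity function: a weight in
    [0, 1] per coefficient, proportional to the change in frame energy
    relative to the previous frame (the first frame gets weight 0)."""
    energies = [sum(c * c for c in coeffs[i:i + frame])
                for i in range(0, len(coeffs), frame)]
    change = [0.0] + [abs(energies[k] - energies[k - 1])
                      for k in range(1, len(energies))]
    peak = max(change)
    weights = []
    for k, ch in enumerate(change):
        w = ch / peak if peak > 0 else 0.0
        weights.extend([w] * len(coeffs[k * frame:(k + 1) * frame]))
    return weights

def extract_transients(x, levels=3, frame=4):
    """Weight each packet by its transitivity, then invert the transform.
    len(x) must be divisible by 2**levels."""
    packets = wp_decompose(x, levels)
    weighted = [[c * w for c, w in zip(p, transitivity(p, frame))]
                for p in packets]
    return wp_reconstruct(weighted)

if __name__ == "__main__":
    steady = [1.0] * 64               # no change anywhere
    step = [0.0] * 32 + [1.0] * 32    # one abrupt change
    print(max(abs(v) for v in extract_transients(steady)))
    print(max(abs(v) for v in extract_transients(step)))
```

In this sketch a steady input is fully attenuated because no frame energy changes, while an input with an abrupt change retains energy after the change. A practical implementation would use a longer wavelet filter and a smoother transitivity estimate.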

Original language: English (US)
Title of host publication: 28th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBS'06
Pages: 1727-1730
Number of pages: 4
DOI: https://doi.org/10.1109/IEMBS.2006.260720
State: Published - Dec 1 2006
Event: 28th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBS'06 - New York, NY, United States
Duration: Aug 30 2006 to Sep 3 2006

Publication series

Name: Annual International Conference of the IEEE Engineering in Medicine and Biology - Proceedings
ISSN (Print): 0589-1019

Conference

Conference: 28th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBS'06
Country: United States
City: New York, NY
Period: 8/30/06 to 9/3/06

Fingerprint

Speech intelligibility
Inverse transforms
Acoustic waves
Decomposition

ASJC Scopus subject areas

  • Signal Processing
  • Biomedical Engineering
  • Computer Vision and Pattern Recognition
  • Health Informatics

Cite this

Rasetshwane, D. M., Boston, J. R., & Li, C. C. (2006). Identification of speech transients using variable frame rate analysis and wavelet packets. In 28th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBS'06 (pp. 1727-1730). [4030147] (Annual International Conference of the IEEE Engineering in Medicine and Biology - Proceedings). https://doi.org/10.1109/IEMBS.2006.260720

@inproceedings{4fab9e0ee7664e8b82a31ae91baf03ff,
title = "Identification of speech transients using variable frame rate analysis and wavelet packets",
author = "Rasetshwane, {Daniel M} and Boston, {J. Robert} and Li, {Ching Chung}",
year = "2006",
month = "12",
day = "1",
doi = "10.1109/IEMBS.2006.260720",
language = "English (US)",
isbn = "1424400325",
series = "Annual International Conference of the IEEE Engineering in Medicine and Biology - Proceedings",
pages = "1727--1730",
booktitle = "28th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBS'06",

}

TY - GEN

T1 - Identification of speech transients using variable frame rate analysis and wavelet packets

AU - Rasetshwane, Daniel M

AU - Boston, J. Robert

AU - Li, Ching Chung

PY - 2006/12/1

Y1 - 2006/12/1

UR - http://www.scopus.com/inward/record.url?scp=34047174892&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=34047174892&partnerID=8YFLogxK

U2 - 10.1109/IEMBS.2006.260720

DO - 10.1109/IEMBS.2006.260720

M3 - Conference contribution

SN - 1424400325

SN - 9781424400324

T3 - Annual International Conference of the IEEE Engineering in Medicine and Biology - Proceedings

SP - 1727

EP - 1730

BT - 28th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBS'06

ER -