Robust cortical entrainment to the speech envelope relies on the spectro-temporal fine structure

Nai Ding, Monita Chatterjee, Jonathan Z. Simon

Research output: Contribution to journal › Article

77 Citations (Scopus)

Abstract

Speech recognition is robust to background noise. One underlying neural mechanism is that the auditory system segregates speech from the listening background and encodes it reliably. Such robust internal representation has been demonstrated in auditory cortex by neural activity entrained to the temporal envelope of speech. A paradox, however, then arises, as the spectro-temporal fine structure rather than the temporal envelope is known to be the major cue to segregate target speech from background noise. Does the reliable cortical entrainment in fact reflect a robust internal "synthesis" of the attended speech stream rather than direct tracking of the acoustic envelope? Here, we test this hypothesis by degrading the spectro-temporal fine structure while preserving the temporal envelope using vocoders. Magnetoencephalography (MEG) recordings reveal that cortical entrainment to vocoded speech is severely degraded by background noise, in contrast to the robust entrainment to natural speech. Furthermore, cortical entrainment in the delta-band (1-4 Hz) predicts the speech recognition score at the level of individual listeners. These results demonstrate that reliable cortical entrainment to speech relies on the spectro-temporal fine structure, and suggest that cortical entrainment to the speech envelope is not merely a representation of the speech envelope but a coherent representation of multiscale spectro-temporal features that are synchronized to the syllabic and phrasal rhythms of speech.

Original language: English (US)
Pages (from-to): 41-46
Number of pages: 6
Journal: NeuroImage
Volume: 88
DOIs: 10.1016/j.neuroimage.2013.10.054
State: Published - Mar 1 2014

Fingerprint

Noise
Magnetoencephalography
Auditory Cortex
Acoustics
Cues
Recognition (Psychology)

Keywords

  • Auditory cortex
  • Auditory scene analysis
  • Envelope entrainment
  • MEG

ASJC Scopus subject areas

  • Neurology
  • Cognitive Neuroscience

Cite this

Robust cortical entrainment to the speech envelope relies on the spectro-temporal fine structure. / Ding, Nai; Chatterjee, Monita; Simon, Jonathan Z.

In: NeuroImage, Vol. 88, 01.03.2014, p. 41-46.

@article{c059942cc7244ba89dcc3bf9bcee19ba,
title = "Robust cortical entrainment to the speech envelope relies on the spectro-temporal fine structure",
abstract = "Speech recognition is robust to background noise. One underlying neural mechanism is that the auditory system segregates speech from the listening background and encodes it reliably. Such robust internal representation has been demonstrated in auditory cortex by neural activity entrained to the temporal envelope of speech. A paradox, however, then arises, as the spectro-temporal fine structure rather than the temporal envelope is known to be the major cue to segregate target speech from background noise. Does the reliable cortical entrainment in fact reflect a robust internal {"}synthesis{"} of the attended speech stream rather than direct tracking of the acoustic envelope? Here, we test this hypothesis by degrading the spectro-temporal fine structure while preserving the temporal envelope using vocoders. Magnetoencephalography (MEG) recordings reveal that cortical entrainment to vocoded speech is severely degraded by background noise, in contrast to the robust entrainment to natural speech. Furthermore, cortical entrainment in the delta-band (1-4 Hz) predicts the speech recognition score at the level of individual listeners. These results demonstrate that reliable cortical entrainment to speech relies on the spectro-temporal fine structure, and suggest that cortical entrainment to the speech envelope is not merely a representation of the speech envelope but a coherent representation of multiscale spectro-temporal features that are synchronized to the syllabic and phrasal rhythms of speech.",
keywords = "Auditory cortex, Auditory scene analysis, Envelope entrainment, MEG",
author = "Nai Ding and Monita Chatterjee and Simon, {Jonathan Z.}",
year = "2014",
month = "3",
day = "1",
doi = "10.1016/j.neuroimage.2013.10.054",
language = "English (US)",
volume = "88",
pages = "41--46",
journal = "NeuroImage",
issn = "1053-8119",
publisher = "Academic Press Inc.",

}

TY - JOUR

T1 - Robust cortical entrainment to the speech envelope relies on the spectro-temporal fine structure

AU - Ding, Nai

AU - Chatterjee, Monita

AU - Simon, Jonathan Z.

PY - 2014/3/1

Y1 - 2014/3/1

N2 - Speech recognition is robust to background noise. One underlying neural mechanism is that the auditory system segregates speech from the listening background and encodes it reliably. Such robust internal representation has been demonstrated in auditory cortex by neural activity entrained to the temporal envelope of speech. A paradox, however, then arises, as the spectro-temporal fine structure rather than the temporal envelope is known to be the major cue to segregate target speech from background noise. Does the reliable cortical entrainment in fact reflect a robust internal "synthesis" of the attended speech stream rather than direct tracking of the acoustic envelope? Here, we test this hypothesis by degrading the spectro-temporal fine structure while preserving the temporal envelope using vocoders. Magnetoencephalography (MEG) recordings reveal that cortical entrainment to vocoded speech is severely degraded by background noise, in contrast to the robust entrainment to natural speech. Furthermore, cortical entrainment in the delta-band (1-4 Hz) predicts the speech recognition score at the level of individual listeners. These results demonstrate that reliable cortical entrainment to speech relies on the spectro-temporal fine structure, and suggest that cortical entrainment to the speech envelope is not merely a representation of the speech envelope but a coherent representation of multiscale spectro-temporal features that are synchronized to the syllabic and phrasal rhythms of speech.

KW - Auditory cortex

KW - Auditory scene analysis

KW - Envelope entrainment

KW - MEG

UR - http://www.scopus.com/inward/record.url?scp=84892867072&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84892867072&partnerID=8YFLogxK

U2 - 10.1016/j.neuroimage.2013.10.054

DO - 10.1016/j.neuroimage.2013.10.054

M3 - Article

C2 - 24188816

AN - SCOPUS:84892867072

VL - 88

SP - 41

EP - 46

JO - NeuroImage

JF - NeuroImage

SN - 1053-8119

ER -