Vowel recognition from articulatory position time-series data

Jun Wang, Ashok K. Samal, Jordan R. Green, Thomas D. Carrell

Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

5 Citations (Scopus)

Abstract

A new approach to recognizing vowels from articulatory position time-series data was proposed and tested in this paper. The approach mapped articulatory position time-series data directly to vowels, without extracting articulatory features such as mouth opening. The input time-series data were time-normalized and sampled to fixed-width vectors of articulatory positions. Three commonly used classifiers (Neural Network, Support Vector Machine, and Decision Tree) were trained on the vectors and their performances compared. A single-speaker dataset of eight major English vowels, acquired using an Electromagnetic Articulograph (EMA AG500), was used. Recognition rates under cross validation ranged from 76.07% to 91.32% across the three classifiers. In addition, the trained decision trees were consistent with the articulatory features commonly used to distinguish vowels descriptively in classical phonetics. The findings are intended to improve the accuracy and response time of a real-time articulatory-to-acoustics synthesizer.
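The pipeline the abstract describes — time-normalize variable-length articulatory trajectories to fixed-width vectors, then compare Neural Network, Support Vector Machine, and Decision Tree classifiers under cross validation — can be sketched roughly as follows. The synthetic trajectories and scikit-learn settings below are illustrative assumptions, not the paper's actual EMA data or classifier configurations.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

def to_fixed_width(series, n_samples=10):
    """Time-normalize a variable-length 1-D trajectory by resampling it
    at n_samples equally spaced points over its duration."""
    t_old = np.linspace(0.0, 1.0, len(series))
    t_new = np.linspace(0.0, 1.0, n_samples)
    return np.interp(t_new, t_old, series)

# Synthetic stand-in data: each "utterance" is a sensor trajectory of
# random length; the label is one of 8 vowel classes.
n_classes, n_per_class = 8, 20
X, y = [], []
for label in range(n_classes):
    for _ in range(n_per_class):
        length = int(rng.integers(30, 80))
        t = np.linspace(0.0, 1.0, length)
        traj = np.sin(2 * np.pi * 0.3 * (label + 1) * t) + rng.normal(0.0, 0.05, length)
        X.append(to_fixed_width(traj))
        y.append(label)
X, y = np.array(X), np.array(y)

# Compare the three classifier families named in the abstract.
for name, clf in [
    ("Decision Tree", DecisionTreeClassifier(random_state=0)),
    ("SVM", SVC()),
    ("Neural Network", MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)),
]:
    scores = cross_val_score(clf, X, y, cv=5)
    print(f"{name}: mean CV accuracy {scores.mean():.2%}")
```

Resampling to a fixed width lets standard vector-input classifiers handle utterances of differing durations; the paper additionally reads articulatory interpretations off the trained decision trees, which this sketch does not attempt.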

Original language: English (US)
Title of host publication: 3rd International Conference on Signal Processing and Communication Systems, ICSPCS'2009 - Proceedings
ISBN (Print): 9781424444748
DOIs: 10.1109/ICSPCS.2009.5306418
State: Published - Dec 2 2009
Event: 3rd International Conference on Signal Processing and Communication Systems, ICSPCS'2009 - Omaha, NE, United States
Duration: Sep 28 2009 - Sep 30 2009

Publication series

Name: 3rd International Conference on Signal Processing and Communication Systems, ICSPCS'2009 - Proceedings

Conference

Conference: 3rd International Conference on Signal Processing and Communication Systems, ICSPCS'2009
Country: United States
City: Omaha, NE
Period: 9/28/09 - 9/30/09

Keywords

  • Articulatory speech recognition
  • Decision tree
  • Neural network
  • Support vector machine
  • Time-series

ASJC Scopus subject areas

  • Signal Processing
  • Electrical and Electronic Engineering

Cite this

Wang, J., Samal, A. K., Green, J. R., & Carrell, T. D. (2009). Vowel recognition from articulatory position time-series data. In 3rd International Conference on Signal Processing and Communication Systems, ICSPCS'2009 - Proceedings [5306418] (3rd International Conference on Signal Processing and Communication Systems, ICSPCS'2009 - Proceedings). https://doi.org/10.1109/ICSPCS.2009.5306418
