Using topic modeling to develop multi-level descriptions of naturalistic driving data from drivers with and without sleep apnea

Elease J. McLaurin, John D. Lee, Anthony D. McDonald, Nazan Aksan, Jeffrey Dawson, Jon Tippin, Matthew Rizzo

Research output: Contribution to journal › Article

Abstract

One challenge in using naturalistic driving data is producing a holistic analysis of these highly variable datasets. Typical analyses focus on isolated events, such as large g-force accelerations indicating a possible near-crash. Examining isolated events is ill-suited for identifying patterns in continuous activities such as maintaining vehicle control. We present an alternative approach that converts driving data into a text representation and uses topic modeling to identify patterns across the dataset. This approach enables the discovery of non-linear patterns, reduces the dimensionality of the data, and captures subtle variations in driver behavior. In this study, topic models were used to concisely describe patterns in trips from drivers with and without untreated obstructive sleep apnea (OSA). The analysis included 5000 trips (50 trips from 100 drivers; 66 drivers with OSA; 34 comparison drivers). Trips were treated as documents, and speed and acceleration data from the trips were converted to “driving words.” The identified patterns, called topics, were determined based on regularities in the co-occurrence of the driving words within the trips. This representation was used in random forest models to predict the driver condition (i.e., OSA or comparison) for each trip. Models with 10, 15 and 20 topics had better accuracy in predicting the driver condition, with a maximum AUC of 0.73 for a model with 20 topics. Trips from drivers with OSA were more likely to be defined by topics for smaller lateral accelerations at low speeds. The results demonstrate topic modeling as a useful tool for extracting meaningful information from naturalistic driving datasets.
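The text-conversion step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the discretization scheme, so the bin edges, the token format, the sample layout (speed, longitudinal acceleration, lateral acceleration) and all function names below are hypothetical. The sketch only shows how each trip could become a "document" of driving words and how the trips could be summarized as a word-count matrix, which is the usual input to a topic model.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical bin edges (assumption): the abstract does not state how
# speed and acceleration were discretized.
SPEED_EDGES = [5.0, 15.0, 25.0]        # m/s, gives 4 speed levels
ACCEL_EDGES = [-1.0, -0.2, 0.2, 1.0]   # m/s^2, gives 5 acceleration levels


def level(value, edges):
    """Return the index of the bin that `value` falls into."""
    return bisect_right(edges, value)


def driving_word(speed, lon_acc, lat_acc):
    """Encode one sample as a token such as 's0_lon2_lat2' (hypothetical format)."""
    return (
        f"s{level(speed, SPEED_EDGES)}"
        f"_lon{level(lon_acc, ACCEL_EDGES)}"
        f"_lat{level(lat_acc, ACCEL_EDGES)}"
    )


def trip_to_document(samples):
    """Treat a trip as a document: one driving word per sample."""
    return [driving_word(*sample) for sample in samples]


def document_term_counts(documents):
    """Build a sorted vocabulary and a trips-by-words count matrix."""
    vocab = sorted({word for doc in documents for word in doc})
    rows = []
    for doc in documents:
        counts = Counter(doc)
        rows.append([counts.get(word, 0) for word in vocab])
    return vocab, rows


# Two made-up trips: (speed, longitudinal acceleration, lateral acceleration).
trips = [
    [(3.0, 0.0, 0.1), (3.5, 0.3, 0.1), (4.0, 0.0, -0.1)],
    [(20.0, 0.0, 0.0), (21.0, -1.5, 0.0), (20.0, 0.0, 0.0)],
]
docs = [trip_to_document(trip) for trip in trips]
vocab, matrix = document_term_counts(docs)
```

In a full pipeline, a count matrix of this kind would be passed to a topic model, and the resulting per-trip topic proportions would serve as features for a classifier such as the random forest mentioned in the abstract. Those steps are omitted here because they depend on modeling choices the abstract does not specify.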

Original language: English (US)
Pages (from-to): 25-38
Number of pages: 14
Journal: Transportation Research Part F: Traffic Psychology and Behaviour
Volume: 58
DOI: 10.1016/j.trf.2018.05.019
State: Published - Oct 2018

Fingerprint

Sleep Apnea Syndromes
Obstructive Sleep Apnea
sleep
driver
Area Under Curve
Sleep
event
regularity
Datasets

Keywords

  • Driver behavior
  • Drowsiness
  • Machine learning
  • Naturalistic driving data
  • Sleep apnea
  • Topic modeling

ASJC Scopus subject areas

  • Civil and Structural Engineering
  • Automotive Engineering
  • Transportation
  • Applied Psychology

Cite this

Using topic modeling to develop multi-level descriptions of naturalistic driving data from drivers with and without sleep apnea. / McLaurin, Elease J.; Lee, John D.; McDonald, Anthony D.; Aksan, Nazan; Dawson, Jeffrey; Tippin, Jon; Rizzo, Matthew.

In: Transportation Research Part F: Traffic Psychology and Behaviour, Vol. 58, 10.2018, p. 25-38.

@article{17419d7b8bc44d62be059b773d53b6ec,
title = "Using topic modeling to develop multi-level descriptions of naturalistic driving data from drivers with and without sleep apnea",
keywords = "Driver behavior, Drowsiness, Machine learning, Naturalistic driving data, Sleep apnea, Topic modeling",
author = "McLaurin, {Elease J.} and Lee, {John D.} and McDonald, {Anthony D.} and Nazan Aksan and Jeffrey Dawson and Jon Tippin and Matthew Rizzo",
year = "2018",
month = "10",
doi = "10.1016/j.trf.2018.05.019",
language = "English (US)",
volume = "58",
pages = "25--38",
journal = "Transportation Research Part F: Traffic Psychology and Behaviour",
issn = "1369-8478",
publisher = "Elsevier Limited",

}

TY - JOUR
T1 - Using topic modeling to develop multi-level descriptions of naturalistic driving data from drivers with and without sleep apnea
AU - McLaurin, Elease J.
AU - Lee, John D.
AU - McDonald, Anthony D.
AU - Aksan, Nazan
AU - Dawson, Jeffrey
AU - Tippin, Jon
AU - Rizzo, Matthew
PY - 2018/10
Y1 - 2018/10

KW - Driver behavior
KW - Drowsiness
KW - Machine learning
KW - Naturalistic driving data
KW - Sleep apnea
KW - Topic modeling
UR - http://www.scopus.com/inward/record.url?scp=85048196361&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85048196361&partnerID=8YFLogxK
U2 - 10.1016/j.trf.2018.05.019
DO - 10.1016/j.trf.2018.05.019
M3 - Article
C2 - 30559601
AN - SCOPUS:85048196361
VL - 58
SP - 25
EP - 38
JO - Transportation Research Part F: Traffic Psychology and Behaviour
JF - Transportation Research Part F: Traffic Psychology and Behaviour
SN - 1369-8478
ER -