Identifying Personal Messages: A Step towards Product/Service Review and Opinion Mining

Sasan Azizian, Elham Rastegari, Brian Ricks, Magie Hall

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

1 Citation (Scopus)

Abstract

Twitter is one of the most popular micro-blogging services, with millions of users exchanging information. Twitter's popularity and low barriers to entry have led many commercial entities to start using the service. As a result, the Twitter stream contains a mix of personal and professional tweets. These professional tweets are marketing messages and do not provide insight into individual people's experiences. Thus, separating personal tweets from commercial or professional ones is a crucial, though often overlooked, first step in mining micro-blogging data. Identifying personal messages is essential for opinion mining or product/service review in every domain, and it is especially crucial in the healthcare domain. In this study, we propose a method for classifying tweets as either personal or professional using a novel feature set. We collected and analyzed three data sets from the Twitter stream related to the healthcare domain. Using a large number of hand-labeled tweets as input, we trained several classifiers on our proposed feature set and compared their accuracy, precision, and recall using 10-fold cross-validation. On a combination of the three health-related data sets, the random forest classifier achieved the highest accuracy, 91.5%. This result shows that our approach can significantly increase the accuracy of data mining on the Twitter stream.
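The evaluation protocol described in the abstract (hand-labeled tweets, a feature vector per tweet, and 10-fold cross-validation reporting accuracy, precision, and recall) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the two features (URL presence, first-person pronoun count) are hypothetical stand-ins for the paper's feature set, which the abstract does not list; a nearest-centroid classifier is swapped in for the random forest and other classifiers so that the sketch needs only the standard library; and the tweets are synthetic toy data.

```python
import random

# Hypothetical stand-in features; the paper's actual feature set is not given in the abstract.
FIRST_PERSON = {"i", "me", "my", "mine", "we"}

def extract_features(tweet):
    """Map a tweet to a small numeric vector (illustrative features only)."""
    tokens = tweet.lower().split()
    has_url = float(any(t.startswith("http") for t in tokens))
    pronouns = float(sum(t.strip(".,!?") in FIRST_PERSON for t in tokens))
    return [has_url, pronouns]

class CentroidClassifier:
    """Nearest-centroid classifier; a simple stand-in, NOT the paper's random forest."""
    def fit(self, X, y):
        self.centroids = {}
        for label in set(y):
            rows = [x for x, lab in zip(X, y) if lab == label]
            self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict(self, X):
        def dist(a, b):
            return sum((p - q) ** 2 for p, q in zip(a, b))
        return [min(self.centroids, key=lambda lab: dist(x, self.centroids[lab]))
                for x in X]

def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_validate(X, y, make_model, k=10, positive="personal"):
    """Pool held-out predictions over k folds, then compute the three metrics."""
    tp = fp = fn = correct = 0
    for fold in k_fold_indices(len(X), k):
        held_out = set(fold)
        train = [i for i in range(len(X)) if i not in held_out]
        model = make_model().fit([X[i] for i in train], [y[i] for i in train])
        preds = model.predict([X[i] for i in fold])
        for i, pred in zip(fold, preds):
            correct += pred == y[i]
            tp += pred == positive and y[i] == positive
            fp += pred == positive and y[i] != positive
            fn += pred != positive and y[i] == positive
    return {
        "accuracy": correct / len(X),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }

# Synthetic toy data (not from the paper): 10 personal and 10 professional tweets.
tweets = [f"I took my flu shot on day {n}" for n in range(10)]
tweets += [f"Flu shots available now, offer {n} http://example.com/clinic" for n in range(10)]
labels = ["personal"] * 10 + ["professional"] * 10

X = [extract_features(t) for t in tweets]
metrics = cross_validate(X, labels, CentroidClassifier, k=10)
print(metrics)
```

To compare several classifiers as the abstract describes, one would call cross_validate once per model factory and tabulate the resulting metrics; any class exposing the same fit/predict pair, including a random forest from a machine-learning library, could be passed in place of CentroidClassifier.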

Original language: English (US)
Title of host publication: Proceedings - 2017 International Conference on Computational Science and Computational Intelligence, CSCI 2017
Editors: Fernando G. Tinetti, Quoc-Nam Tran, Leonidas Deligiannidis, Mary Qu Yang, Hamid R. Arabnia
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 876-881
Number of pages: 6
ISBN (Electronic): 9781538626528
DOI: https://doi.org/10.1109/CSCI.2017.152
State: Published - Dec 4 2018
Event: 2017 International Conference on Computational Science and Computational Intelligence, CSCI 2017 - Las Vegas, United States
Duration: Dec 14 2017 to Dec 16 2017

Publication series

Name: Proceedings - 2017 International Conference on Computational Science and Computational Intelligence, CSCI 2017

Conference

Conference: 2017 International Conference on Computational Science and Computational Intelligence, CSCI 2017
Country: United States
City: Las Vegas
Period: 12/14/17 to 12/16/17

Fingerprint

Classifiers
Data mining
Marketing
Health

Keywords

  • classification
  • data mining
  • features selection
  • healthcare
  • twitter

ASJC Scopus subject areas

  • Computer Science (miscellaneous)
  • Artificial Intelligence
  • Computer Networks and Communications
  • Computer Science Applications
  • Safety, Risk, Reliability and Quality

Cite this

Azizian, S., Rastegari, E., Ricks, B., & Hall, M. (2018). Identifying Personal Messages: A Step towards Product/Service Review and Opinion Mining. In F. G. Tinetti, Q-N. Tran, L. Deligiannidis, M. Q. Yang, & H. R. Arabnia (Eds.), Proceedings - 2017 International Conference on Computational Science and Computational Intelligence, CSCI 2017 (pp. 876-881). [8560911] (Proceedings - 2017 International Conference on Computational Science and Computational Intelligence, CSCI 2017). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/CSCI.2017.152
