Identifying malware genera using the Jensen-Shannon distance between system call traces

Jeremy D. Seideman, Bilal Khan, Antonio Cesar Vargas

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

5 Citations (Scopus)

Abstract

The study of malware often involves some form of grouping or clustering in order to indicate malware samples that are closely related. There are many ways that this can be performed, depending on the type of data that is recorded to represent the malware and the eventual goal of the grouping. While the concept of a malware family has been explored in depth, we introduce the concept of the malware genus, a grouping of malware that consists of very closely related samples determined by the relationships between samples within the malware population. Determining the boundaries of the malware genus is dependent upon the way that the malware samples are compared and the overall relationship between samples, with special attention paid to the parent-child relationship. Biologists have several criteria that are used to judge the usefulness of a genus when creating a taxonomy of organisms; we sought to design a classification that would be as useful in the world of malware research as it is in biology. We present two case studies in which we analyze a set of malware, using the Jensen-Shannon Distance between system call traces to measure distance between samples. The case studies show the genera that we create adhere to all of the criteria used when creating taxa of biological organisms.
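As an illustration of the distance measure named in the abstract, the following is a minimal Python sketch of the Jensen-Shannon distance between two system call traces. This record does not say how the authors convert a trace into a probability distribution, so the sketch assumes the simplest option: the relative frequency of each system call name in the trace. The function names, the base-2 logarithm (which bounds the distance to the range 0 to 1), and the sample call names are all assumptions made for illustration, not the authors' implementation.

```python
from collections import Counter
from math import log2, sqrt


def call_distribution(trace):
    """Turn a list of system call names into a relative-frequency dict.

    Assumption: a trace is modelled as a bag of call names (order ignored).
    """
    if not trace:
        raise ValueError("trace must contain at least one system call")
    counts = Counter(trace)
    total = sum(counts.values())
    return {call: n / total for call, n in counts.items()}


def kl_divergence(p, m):
    """Kullback-Leibler divergence D(p || m) in bits.

    m must assign non-zero probability to every call that p does.
    """
    return sum(pv * log2(pv / m[call]) for call, pv in p.items() if pv > 0)


def jensen_shannon_distance(trace_a, trace_b):
    """Square root of the Jensen-Shannon divergence between two traces."""
    p = call_distribution(trace_a)
    q = call_distribution(trace_b)
    # Mixture distribution M = (P + Q) / 2 over the union of observed calls.
    calls = set(p) | set(q)
    m = {c: 0.5 * (p.get(c, 0.0) + q.get(c, 0.0)) for c in calls}
    jsd = 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
    # Clamp tiny negative values caused by floating-point rounding.
    return sqrt(max(jsd, 0.0))


# Hypothetical traces; the call names are placeholders.
trace_1 = ["NtOpenFile", "NtReadFile", "NtReadFile", "NtClose"]
trace_2 = ["NtOpenFile", "NtWriteFile", "NtClose", "NtClose"]
distance = jensen_shannon_distance(trace_1, trace_2)
```

Under these assumptions, two traces with identical call frequencies are at distance 0, and two traces that share no calls are at distance 1. A pairwise matrix of such distances is the kind of input a grouping step could then work from.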

Original language: English (US)
Title of host publication: Proceedings of the 9th IEEE International Conference on Malicious and Unwanted Software, MALCON 2014
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1-7
Number of pages: 7
ISBN (Electronic): 9781479973293
DOI: https://doi.org/10.1109/MALWARE.2014.6999409
State: Published - Dec 29 2014
Event: 9th IEEE International Conference on Malicious and Unwanted Software, MALCON 2014 - Fajardo, Puerto Rico
Duration: Oct 28 2014 to Oct 30 2014

Publication series

Name: Proceedings of the 9th IEEE International Conference on Malicious and Unwanted Software, MALCON 2014

Other

Other: 9th IEEE International Conference on Malicious and Unwanted Software, MALCON 2014
Country: Puerto Rico
City: Fajardo
Period: 10/28/14 to 10/30/14

Fingerprint

Computer systems
Malware
Taxonomies
Grouping
Organism

ASJC Scopus subject areas

  • Artificial Intelligence
  • Visual Arts and Performing Arts

Cite this

Seideman, J. D., Khan, B., & Vargas, A. C. (2014). Identifying malware genera using the Jensen-Shannon distance between system call traces. In Proceedings of the 9th IEEE International Conference on Malicious and Unwanted Software, MALCON 2014 (pp. 1-7). [6999409] (Proceedings of the 9th IEEE International Conference on Malicious and Unwanted Software, MALCON 2014). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/MALWARE.2014.6999409

@inproceedings{351f14e284814946950d0021a56832dc,
title = "Identifying malware genera using the Jensen-Shannon distance between system call traces",
author = "Seideman, {Jeremy D.} and Bilal Khan and Vargas, {Antonio Cesar}",
year = "2014",
month = "12",
day = "29",
doi = "10.1109/MALWARE.2014.6999409",
language = "English (US)",
series = "Proceedings of the 9th IEEE International Conference on Malicious and Unwanted Software, MALCON 2014",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
pages = "1--7",
booktitle = "Proceedings of the 9th IEEE International Conference on Malicious and Unwanted Software, MALCON 2014",

}

TY - GEN

T1 - Identifying malware genera using the Jensen-Shannon distance between system call traces

AU - Seideman, Jeremy D.

AU - Khan, Bilal

AU - Vargas, Antonio Cesar

PY - 2014/12/29

Y1 - 2014/12/29

UR - http://www.scopus.com/inward/record.url?scp=84922516061&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84922516061&partnerID=8YFLogxK

U2 - 10.1109/MALWARE.2014.6999409

DO - 10.1109/MALWARE.2014.6999409

M3 - Conference contribution

AN - SCOPUS:84922516061

T3 - Proceedings of the 9th IEEE International Conference on Malicious and Unwanted Software, MALCON 2014

SP - 1

EP - 7

BT - Proceedings of the 9th IEEE International Conference on Malicious and Unwanted Software, MALCON 2014

PB - Institute of Electrical and Electronics Engineers Inc.

ER -