An efficient fault-tolerant scheduling algorithm for real-time tasks with precedence constraints in heterogeneous systems

Xiao Qin, Hong Jiang, D. R. Swanson

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

66 Citations (Scopus)

Abstract

In this paper, we investigate an efficient off-line scheduling algorithm in which real-time tasks with precedence constraints are executed in a heterogeneous environment. It provides more features and capabilities than existing algorithms that schedule only independent tasks in real-time homogeneous systems. In addition, the proposed algorithm takes the heterogeneities of computation, communication and reliability into account, thereby improving the reliability. To provide fault-tolerant capability, the algorithm employs a primary-backup copy scheme that enables the system to tolerate permanent failures in any single processor. In this scheme, a backup copy is allowed to overlap with other backup copies on the same processor, as long as their corresponding primary copies are allocated to different processors. Tasks are judiciously allocated to processors so as to reduce the schedule length as well as the reliability cost, defined to be the product of processor failure rate and task execution time. In addition, the time for detecting and handling a permanent fault is incorporated into the scheduling scheme, thus making the algorithm more practical. To quantify the combined performance of fault-tolerance and schedulability, the performability measure is introduced. Compared with the existing scheduling algorithms in the literature, our scheduling algorithm achieves an average of 16.4% improvement in reliability and an average of 49.3% improvement in performability.
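As a rough illustration of two ideas in the abstract, the Python sketch below computes a reliability cost and checks the backup-overlap rule. It is not the authors' implementation. The names `Copy`, `reliability_cost`, and `backups_may_coexist` are hypothetical, and summing the per-task product (failure rate times execution time) over all tasks is an assumption; the abstract only states the product itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    """One scheduled copy (primary or backup) of a task. Hypothetical type."""
    task: str
    processor: int
    start: float
    finish: float


def reliability_cost(failure_rates, exec_times, assignment):
    """Sum, over all tasks, of (failure rate of the assigned processor)
    times (execution time of the task on that processor).

    failure_rates: list indexed by processor id
    exec_times:    dict task -> list of execution times, indexed by processor id
    assignment:    dict task -> processor id
    Assumption: the total cost is the sum of the per-task products.
    """
    return sum(failure_rates[p] * exec_times[t][p] for t, p in assignment.items())


def intervals_overlap(a, b):
    """True if the two copies occupy intersecting time intervals."""
    return a.start < b.finish and b.start < a.finish


def backups_may_coexist(backup_a, backup_b, primary_processor):
    """Check the overlap rule from the abstract for two backup copies.

    Two backups that overlap in time on the same processor are acceptable
    only if their primary copies sit on different processors, so a single
    permanent processor failure can activate at most one of them.
    primary_processor: dict task -> processor id of that task's primary copy
    """
    if backup_a.processor != backup_b.processor:
        return True  # different processors: no shared time slot to contend for
    if not intervals_overlap(backup_a, backup_b):
        return True  # same processor, but disjoint time slots
    return primary_processor[backup_a.task] != primary_processor[backup_b.task]
```

With failure rates `[0.5, 0.25]`, execution times `{"t1": [4.0, 8.0], "t2": [2.0, 6.0]}`, and the assignment `{"t1": 0, "t2": 1}`, the cost is 0.5 × 4.0 + 0.25 × 6.0 = 3.5. A scheduler in this spirit would prefer assignments that lower this value without lengthening the schedule.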

Original language: English (US)
Title of host publication: Proceedings - International Conference on Parallel Processing, ICPP 2002
Editors: Tarek S. Abdelrahman
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 360-368
Number of pages: 9
ISBN (Electronic): 0769516777
DOIs: https://doi.org/10.1109/ICPP.2002.1040892
State: Published - Jan 1 2002
Event: International Conference on Parallel Processing, ICPP 2002 - Vancouver, Canada
Duration: Aug 18 2002 to Aug 21 2002

Publication series

Name: Proceedings of the International Conference on Parallel Processing
Volume: 2002-January
ISSN (Print): 0190-3918

Other

Other: International Conference on Parallel Processing, ICPP 2002
Country: Canada
City: Vancouver
Period: 8/18/02 to 8/21/02

Keywords

  • Computer science
  • Costs
  • Distributed computing
  • Fault detection
  • Fault tolerance
  • Fault tolerant systems
  • Performance evaluation
  • Processor scheduling
  • Real time systems
  • Scheduling algorithm

ASJC Scopus subject areas

  • Software
  • Mathematics(all)
  • Hardware and Architecture

Cite this

Qin, X., Jiang, H., & Swanson, D. R. (2002). An efficient fault-tolerant scheduling algorithm for real-time tasks with precedence constraints in heterogeneous systems. In T. S. Abdelrahman (Ed.), Proceedings - International Conference on Parallel Processing, ICPP 2002 (pp. 360-368). [1040892] (Proceedings of the International Conference on Parallel Processing; Vol. 2002-January). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICPP.2002.1040892

@inproceedings{4fb479f7de1e4fd189891cd66449f192,
title = "An efficient fault-tolerant scheduling algorithm for real-time tasks with precedence constraints in heterogeneous systems",
abstract = "In this paper, we investigate an efficient off-line scheduling algorithm in which real-time tasks with precedence constraints are executed in a heterogeneous environment. It provides more features and capabilities than existing algorithms that schedule only independent tasks in real-time homogeneous systems. In addition, the proposed algorithm takes the heterogeneities of computation, communication and reliability into account, thereby improving the reliability. To provide fault-tolerant capability, the algorithm employs a primary-backup copy scheme that enables the system to tolerate permanent failures in any single processor. In this scheme, a backup copy is allowed to overlap with other backup copies on the same processor, as long as their corresponding primary copies are allocated to different processors. Tasks are judiciously allocated to processors so as to reduce the schedule length as well as the reliability cost, defined to be the product of processor failure rate and task execution time. In addition, the time for detecting and handling a permanent fault is incorporated into the scheduling scheme, thus making the algorithm more practical. To quantify the combined performance of fault-tolerance and schedulability, the performability measure is introduced. Compared with the existing scheduling algorithms in the literature, our scheduling algorithm achieves an average of 16.4{\%} improvement in reliability and an average of 49.3{\%} improvement in performability.",
keywords = "Computer science, Costs, Distributed computing, Fault detection, Fault tolerance, Fault tolerant systems, Performance evaluation, Processor scheduling, Real time systems, Scheduling algorithm",
author = "Xiao Qin and Hong Jiang and Swanson, {D. R.}",
year = "2002",
month = "1",
day = "1",
doi = "10.1109/ICPP.2002.1040892",
language = "English (US)",
series = "Proceedings of the International Conference on Parallel Processing",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
pages = "360--368",
editor = "Abdelrahman, {Tarek S.}",
booktitle = "Proceedings - International Conference on Parallel Processing, ICPP 2002",

}

TY - GEN

T1 - An efficient fault-tolerant scheduling algorithm for real-time tasks with precedence constraints in heterogeneous systems

AU - Qin, Xiao

AU - Jiang, Hong

AU - Swanson, D. R.

PY - 2002/1/1

Y1 - 2002/1/1

AB - In this paper, we investigate an efficient off-line scheduling algorithm in which real-time tasks with precedence constraints are executed in a heterogeneous environment. It provides more features and capabilities than existing algorithms that schedule only independent tasks in real-time homogeneous systems. In addition, the proposed algorithm takes the heterogeneities of computation, communication and reliability into account, thereby improving the reliability. To provide fault-tolerant capability, the algorithm employs a primary-backup copy scheme that enables the system to tolerate permanent failures in any single processor. In this scheme, a backup copy is allowed to overlap with other backup copies on the same processor, as long as their corresponding primary copies are allocated to different processors. Tasks are judiciously allocated to processors so as to reduce the schedule length as well as the reliability cost, defined to be the product of processor failure rate and task execution time. In addition, the time for detecting and handling a permanent fault is incorporated into the scheduling scheme, thus making the algorithm more practical. To quantify the combined performance of fault-tolerance and schedulability, the performability measure is introduced. Compared with the existing scheduling algorithms in the literature, our scheduling algorithm achieves an average of 16.4% improvement in reliability and an average of 49.3% improvement in performability.

KW - Computer science

KW - Costs

KW - Distributed computing

KW - Fault detection

KW - Fault tolerance

KW - Fault tolerant systems

KW - Performance evaluation

KW - Processor scheduling

KW - Real time systems

KW - Scheduling algorithm

UR - http://www.scopus.com/inward/record.url?scp=84948470299&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84948470299&partnerID=8YFLogxK

U2 - 10.1109/ICPP.2002.1040892

DO - 10.1109/ICPP.2002.1040892

M3 - Conference contribution

AN - SCOPUS:84948470299

T3 - Proceedings of the International Conference on Parallel Processing

SP - 360

EP - 368

BT - Proceedings - International Conference on Parallel Processing, ICPP 2002

A2 - Abdelrahman, Tarek S.

PB - Institute of Electrical and Electronics Engineers Inc.

ER -