Locating text in color document images

Erel Ortaçaǧ, Bülent Sankur, Khalid Sayood

Research output: Contribution to journal › Conference article

1 Citation (Scopus)

Abstract

A novel algorithm for extracting text from cluttered color document images is developed and tested. The algorithm consists of a color segmentation stage followed by rule-based filtering of non-text regions. The text-segment extraction step uses measurements of geometrical properties and characterness properties together with a set of heuristic rules. The algorithm includes a fusion cycle that combines three different segmentation maps, and a restitution cycle that restores any deleted characters and/or their diacritical marks. The proposed method, which extracted text successfully from many color document images, has applications in color image indexing and retrieval.
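
The two stages named in the abstract (color segmentation, then rule-based filtering of non-text regions by geometrical properties) can be illustrated with a rough sketch. This is not the authors' implementation: the uniform color quantization, the 4-connected grouping, and every threshold and function name below are assumptions chosen only for illustration, and the fusion and restitution cycles are not modeled.

```python
from collections import deque


def quantize(pixel, levels=4):
    """Map an (r, g, b) pixel to a coarse color label.

    Assumed stand-in for a real color segmentation stage.
    """
    step = 256 // levels
    r, g, b = pixel
    return (r // step, g // step, b // step)


def connected_components(image):
    """Group 4-connected pixels that share a quantized color.

    `image` is a list of rows, each row a list of (r, g, b) tuples.
    Returns a list of (color_label, [(y, x), ...]) pairs.
    """
    h, w = len(image), len(image[0])
    labels = [[quantize(p) for p in row] for row in image]
    seen = [[False] * w for _ in range(h)]
    components = []
    for y in range(h):
        for x in range(w):
            if seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(y, x)])
            pixels = []
            while queue:
                cy, cx = queue.popleft()
                pixels.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < h and 0 <= nx < w and not seen[ny][nx]
                            and labels[ny][nx] == labels[y][x]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            components.append((labels[y][x], pixels))
    return components


def looks_like_character(pixels, min_h=3, max_h=40, max_aspect=3.0, min_fill=0.2):
    """Heuristic geometric rules; all thresholds are hypothetical."""
    ys = [p[0] for p in pixels]
    xs = [p[1] for p in pixels]
    height = max(ys) - min(ys) + 1
    width = max(xs) - min(xs) + 1
    aspect = width / height                 # very wide regions are unlikely characters
    fill = len(pixels) / (width * height)   # share of the bounding box that is covered
    return min_h <= height <= max_h and aspect <= max_aspect and fill >= min_fill


def candidate_text_regions(image):
    """Keep only the components that pass the geometric rules."""
    return [pixels for _, pixels in connected_components(image)
            if looks_like_character(pixels)]
```

In this sketch a region is rejected as non-text when its bounding box is too short, too tall, too elongated, or too sparsely filled. A practical system would tune such rules on real documents and add the characterness measures the abstract mentions.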

Original language: English (US)
Journal: European Signal Processing Conference
Volume: 1998-January
State: Published - Jan 1 1998
Event: 9th European Signal Processing Conference, EUSIPCO 1998 - Island of Rhodes, Greece
Duration: Sep 8 1998 to Sep 11 1998

ASJC Scopus subject areas

  • Signal Processing
  • Electrical and Electronic Engineering

Cite this

Locating text in color document images. / Ortaçaǧ, Erel; Sankur, Bülent; Sayood, Khalid.

In: European Signal Processing Conference, Vol. 1998-January, 01.01.1998.

Ortaçaǧ, Erel ; Sankur, Bülent ; Sayood, Khalid. / Locating text in color document images. In: European Signal Processing Conference. 1998 ; Vol. 1998-January.
@article{c00af39c277841038b5dfabc9f3a5727,
title = "Locating text in color document images",
author = "Erel Orta{\c{c}}aǧ and B{\"u}lent Sankur and Khalid Sayood",
year = "1998",
month = "1",
day = "1",
language = "English (US)",
volume = "1998-January",
journal = "European Signal Processing Conference",
issn = "2219-5491",

}

TY - JOUR

T1 - Locating text in color document images

AU - Ortaçaǧ, Erel

AU - Sankur, Bülent

AU - Sayood, Khalid

PY - 1998/1/1

Y1 - 1998/1/1

UR - http://www.scopus.com/inward/record.url?scp=85019541010&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85019541010&partnerID=8YFLogxK

M3 - Conference article

AN - SCOPUS:85019541010

VL - 1998-January

JO - European Signal Processing Conference

JF - European Signal Processing Conference

SN - 2219-5491

ER -