Concept based retrieval using generalized retrieval functions

Minkoo Kim, Jitender S. Deogun, Vijay V. Raghavan

Research output: Contribution to journal › Article

Abstract

One of the essential goals in information retrieval is to bridge the gap between the way users would prefer to specify their information needs and the way queries are required to be expressed. Rule Based Information Retrieval by Computer (RUBRIC) is one of the approaches proposed to achieve this goal. This approach involves the use of production rules to capture user-query concepts (or topics). In RUBRIC, a set of related production rules is represented as an AND/OR tree, or alternatively by a disjunction of Minimal Term Sets (MTSs). The retrieval output is determined by the evaluation of the weighted Boolean expressions of the AND/OR tree, and processing efficiency can be enhanced by employing MTSs. However, since the weighted Boolean expression ignores the term-term association unless it is explicitly represented in the tree, the terminological gap between users' queries and their information needs may still remain. To solve this problem, we adopt the generalized vector space model (GVSM) and the p-norm based extended Boolean model. Experiments are performed for two variations of the RUBRIC model, extended with GVSM, as well as for the integrated use of RUBRIC with the p-norm based extended Boolean model. The results are compared to the original RUBRIC model based on recall-precision.
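
As a rough illustration of the evaluation step the abstract describes, the sketch below scores a document against a small weighted AND/OR tree, first with strict Boolean operators and then with p-norm operators. Everything here is an assumption made for illustration and is not taken from the paper: the names `Term`, `Node`, `evaluate`, `p_or` and `p_and` are hypothetical, the min/max rule with multiplicative edge weights is only one common reading of a RUBRIC-style calculus, and the p-norm operators use the unweighted form usually attributed to the extended Boolean model.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Union


@dataclass
class Term:
    """Leaf of the AND/OR tree: a single index term (hypothetical name)."""
    name: str


@dataclass
class Node:
    """Internal node: op is "AND" or "OR"; weights[i] is the rule weight
    attached to children[i] (an assumed representation)."""
    op: str
    children: List[Union["Term", "Node"]]
    weights: List[float]


def p_or(values: List[float], p: float) -> float:
    """Unweighted p-norm OR: the p-mean of the operand scores."""
    return (sum(v ** p for v in values) / len(values)) ** (1.0 / p)


def p_and(values: List[float], p: float) -> float:
    """Unweighted p-norm AND: one minus the p-mean of the complements."""
    return 1.0 - (sum((1.0 - v) ** p for v in values) / len(values)) ** (1.0 / p)


def evaluate(node: Union[Term, Node], doc: Dict[str, float],
             p: Optional[float] = None) -> float:
    """Score doc (term -> weight in [0, 1]) against the tree.

    p=None uses min for AND and max for OR (assumed strict calculus);
    a finite p >= 1 uses the p-norm operators instead.
    """
    if isinstance(node, Term):
        return doc.get(node.name, 0.0)
    scores = [w * evaluate(c, doc, p) for c, w in zip(node.children, node.weights)]
    if p is None:
        return min(scores) if node.op == "AND" else max(scores)
    return p_and(scores, p) if node.op == "AND" else p_or(scores, p)


# Hypothetical concept: (a AND b) with rule weight 0.9, OR c with rule weight 0.5.
tree = Node("OR", [Node("AND", [Term("a"), Term("b")], [1.0, 1.0]), Term("c")],
            [0.9, 0.5])
doc = {"a": 1.0, "b": 0.0, "c": 1.0}

strict_score = evaluate(tree, doc)        # strict AND fails because b is absent
soft_score = evaluate(tree, doc, p=2.0)   # p-norm AND gives partial credit for a
```

With the strict operators the AND branch contributes nothing because one of its terms is missing, whereas the p-norm operators give that branch partial credit. The sketch does not model term-term association, which is what the GVSM extension in the paper addresses.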

Original language: English (US)
Pages (from-to): 119-135
Number of pages: 17
Journal: Fundamenta Informaticae
Volume: 47
Issue number: 1-2
State: Published - Jul 1 2001

Fingerprint

Information retrieval
Retrieval
Boolean Model
Vector Space Model
Production Rules
Computer Model
Query
Term
Vector spaces
Norm
Concepts
Model-based
Output
Evaluation
Processing
Experiment
Experiments

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Algebra and Number Theory
  • Information Systems
  • Computational Theory and Mathematics

Cite this

Concept based retrieval using generalized retrieval functions. / Kim, Minkoo; Deogun, Jitender S.; Raghavan, Vijay V.

In: Fundamenta Informaticae, Vol. 47, No. 1-2, 01.07.2001, p. 119-135.

Kim, M, Deogun, JS & Raghavan, VV 2001, 'Concept based retrieval using generalized retrieval functions', Fundamenta Informaticae, vol. 47, no. 1-2, pp. 119-135.
Kim, Minkoo ; Deogun, Jitender S. ; Raghavan, Vijay V. / Concept based retrieval using generalized retrieval functions. In: Fundamenta Informaticae. 2001 ; Vol. 47, No. 1-2. pp. 119-135.
@article{41a0a34aa8b3446b9a6a7a1eece9e896,
title = "Concept based retrieval using generalized retrieval functions",
author = "Minkoo Kim and Deogun, {Jitender S.} and Raghavan, {Vijay V.}",
year = "2001",
month = "7",
day = "1",
language = "English (US)",
volume = "47",
pages = "119--135",
journal = "Fundamenta Informaticae",
issn = "0169-2968",
publisher = "IOS Press",
number = "1-2",

}

TY - JOUR

T1 - Concept based retrieval using generalized retrieval functions

AU - Kim, Minkoo

AU - Deogun, Jitender S.

AU - Raghavan, Vijay V.

PY - 2001/7/1

Y1 - 2001/7/1

UR - http://www.scopus.com/inward/record.url?scp=0035401770&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0035401770&partnerID=8YFLogxK

M3 - Article

AN - SCOPUS:0035401770

VL - 47

SP - 119

EP - 135

JO - Fundamenta Informaticae

JF - Fundamenta Informaticae

SN - 0169-2968

IS - 1-2

ER -