Computing information gain for spatial data support

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Widespread use of GPS devices and an explosion of remotely sensed geospatial images, along with cheap storage devices, have resulted in vast amounts of data. More recently, with the advent of wireless technology, a large number of sensor networks have been deployed to monitor many human, biological and natural processes. This poses a challenge in many data-rich application domains: how best to choose the datasets to solve specific problems. Some of the datasets may be redundant, and including them in an analysis may not only be time-consuming but may also lead to erroneous conclusions. We propose the concept of data support as the basis for efficient, cost-effective and intelligent use of geospatial data in order to reduce uncertainty in the analysis and consequently in the results. Data support is defined as the process of determining the information utility of a data source to help decide which sources to include or exclude in order to improve the cost-effectiveness of an existing data analysis. In this article we use mutual information as the basis for computing data support. Mutual information is defined in information theory as a measure of the information gain or loss between two disjoint datasets. We use it to select the optimal datasets for specific applications. The effectiveness of the approach is demonstrated using an application in the hydrological analysis domain.
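To make the mutual-information idea in the abstract concrete, here is a minimal sketch using the standard textbook identity I(X;Y) = H(X) + H(Y) - H(X,Y) on discretized, paired observations. It is not the authors' implementation, which this record does not describe; the function names (`entropy`, `mutual_information`) and the sample readings are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    counts = Counter(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def mutual_information(xs, ys):
    """Estimate I(X;Y) = H(X) + H(Y) - H(X,Y) from paired discrete samples."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must be paired samples of equal length")
    joint = list(zip(xs, ys))  # each pair is one joint observation
    return entropy(xs) + entropy(ys) - entropy(joint)


# Hypothetical, made-up discretized readings (not data from the paper).
gauge_a = ["low", "low", "high", "high", "low", "high", "low", "high"]
gauge_b = list(gauge_a)            # an exact copy: fully redundant with gauge_a
unrelated_x = [0, 0, 1, 1]
unrelated_y = [0, 1, 0, 1]         # statistically independent of unrelated_x

mi_redundant = mutual_information(gauge_a, gauge_b)
mi_independent = mutual_information(unrelated_x, unrelated_y)
print(mi_redundant, mi_independent)
```

Under this sketch, a candidate source that shares all of its information with one already in the analysis (like `gauge_b`) has mutual information equal to that source's entropy, which suggests it is redundant, while an independent source yields a value of zero. How such scores are turned into include/exclude decisions is an assumption left open here.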

Original language: English (US)
Title of host publication: Proceedings of the 16th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, ACM GIS 2008
Pages: 431-434
Number of pages: 4
DOIs: https://doi.org/10.1145/1463434.1463502
State: Published - Dec 1 2008
Event: 16th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, ACM GIS 2008 - Irvine, CA, United States
Duration: Nov 5 2008 to Nov 7 2008

Publication series

Name: GIS: Proceedings of the ACM International Symposium on Advances in Geographic Information Systems

Conference

Conference: 16th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, ACM GIS 2008
Country: United States
City: Irvine, CA
Period: 11/5/08 to 11/7/08

Keywords

  • Information gain/loss
  • Spatial data support

ASJC Scopus subject areas

  • Earth-Surface Processes
  • Computer Science Applications
  • Modeling and Simulation
  • Computer Graphics and Computer-Aided Design
  • Information Systems

Cite this

Hong, T., Samal, A. K., & Soh, L-K. (2008). Computing information gain for spatial data support. In Proceedings of the 16th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, ACM GIS 2008 (pp. 431-434). [1463502] (GIS: Proceedings of the ACM International Symposium on Advances in Geographic Information Systems). https://doi.org/10.1145/1463434.1463502

@inproceedings{25476014ed1f41b1a507159a6df89f14,
title = "Computing information gain for spatial data support",
keywords = "Information gain/loss, Spatial data support",
author = "Tao Hong and Samal, {Ashok K} and Leen-Kiat Soh",
year = "2008",
month = "12",
day = "1",
doi = "10.1145/1463434.1463502",
language = "English (US)",
isbn = "9781605583235",
series = "GIS: Proceedings of the ACM International Symposium on Advances in Geographic Information Systems",
pages = "431--434",
booktitle = "Proceedings of the 16th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, ACM GIS 2008",

}

TY - GEN

T1 - Computing information gain for spatial data support

AU - Hong, Tao

AU - Samal, Ashok K

AU - Soh, Leen-Kiat

PY - 2008/12/1

Y1 - 2008/12/1

KW - Information gain/loss

KW - Spatial data support

UR - http://www.scopus.com/inward/record.url?scp=70449730071&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=70449730071&partnerID=8YFLogxK

U2 - 10.1145/1463434.1463502

DO - 10.1145/1463434.1463502

M3 - Conference contribution

SN - 9781605583235

T3 - GIS: Proceedings of the ACM International Symposium on Advances in Geographic Information Systems

SP - 431

EP - 434

BT - Proceedings of the 16th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, ACM GIS 2008

ER -