Strategies for bias reduction in estimation of marginal means with data missing at random

Baojiang Chen, Richard J. Cook

Research output: Chapter in Book/Report/Conference proceeding › Chapter

1 Citation (Scopus)

Abstract

Incomplete data are common in many fields of research, and interest often lies in estimating a marginal mean based on available information. This paper is concerned with the comparison of different strategies for estimating the marginal mean of a response when data are missing at random. We evaluate these methods based on the asymptotic bias, empirical bias and efficiency. We show that complete case analysis gives biased results when data are missing at random, but inverse probability weighted estimating equations (IPWEE) and a method based on the expected conditional mean (ECM) yield consistent estimators. While these methods give estimators which behave similarly in the contexts studied, they are based on quite different assumptions. The IPWEE approach requires analysts to specify a model for the missing data mechanism, whereas the ECM approach requires a model for the distribution of auxiliary variables driving the missing data mechanism. The latter can be a challenge in practice, particularly when the covariates are of high dimension or are a mixture of continuous and categorical variables. The IPWEE approach therefore has considerable appeal in many practical settings.
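The contrast between complete case analysis and inverse probability weighting described in the abstract can be illustrated with a small simulation. The sketch below is not taken from the chapter: the data-generating model (a binary auxiliary variable x, a normal response y with mean 1 + 2x, and observation probabilities of 0.9 and 0.3), the function names, and the use of stratum-wise observed proportions as the missingness model are all assumptions chosen for illustration. Under these assumptions the true marginal mean is 2, while the complete case mean converges to 1.5.

```python
import random


def simulate(n, seed=0):
    """Draw (x, y, r) triples; r = 1 means the response y is observed.

    Hypothetical setup (not from the chapter): missingness depends only
    on the always-observed auxiliary variable x, so data are MAR.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = 1 if rng.random() < 0.5 else 0
        y = rng.gauss(1.0 + 2.0 * x, 1.0)   # E[Y] = 1 + 2 * 0.5 = 2
        p_obs = 0.3 if x == 1 else 0.9      # assumed observation probabilities
        r = 1 if rng.random() < p_obs else 0
        data.append((x, y, r))
    return data


def complete_case_mean(data):
    """Average the response over individuals with observed y only."""
    observed = [y for _, y, r in data if r == 1]
    return sum(observed) / len(observed)


def fit_observation_probs(data):
    """Estimate P(R = 1 | x) by the observed proportion in each x stratum."""
    probs = {}
    for level in (0, 1):
        flags = [r for x, _, r in data if x == level]
        probs[level] = sum(flags) / len(flags)
    return probs


def ipw_mean(data, probs):
    """Solve sum_i (r_i / pi_i) * (y_i - mu) = 0 for mu."""
    num = sum(y / probs[x] for x, y, r in data if r == 1)
    den = sum(1.0 / probs[x] for x, _, r in data if r == 1)
    return num / den


data = simulate(100_000, seed=1)
cc = complete_case_mean(data)
ipw = ipw_mean(data, fit_observation_probs(data))
print(f"complete case: {cc:.3f}  IPW: {ipw:.3f}  truth: 2.000")
```

In this setup the complete case estimate over-represents individuals with x = 0, who are more likely to be observed and have lower responses, whereas weighting each observed response by the inverse of its estimated observation probability restores the balance. With a continuous or high-dimensional auxiliary variable, the stratum proportions would have to be replaced by a fitted model such as a logistic regression.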

Original language: English (US)
Title of host publication: Optimization and Data Analysis in Biomedical Informatics
Editors: Panos M Pardalos
Pages: 99-115
Number of pages: 17
DOI: https://doi.org/10.1007/978-1-4614-4133-5_5
State: Published - Dec 1 2012

Publication series

Name: Fields Institute Communications
Volume: 63
ISSN (Print): 1069-5265

Fingerprint

Weighted Estimating Equations
Bias Reduction
Missing at Random
Missing Data Mechanism
Asymptotic Bias
Categorical variable
Auxiliary Variables
Consistent Estimator
Incomplete Data
Appeal
Continuous Variables
Higher Dimensions
Biased
Covariates
Estimator
Evaluate
Model
Strategy

ASJC Scopus subject areas

  • Mathematics (all)

Cite this

Chen, B., & Cook, R. J. (2012). Strategies for bias reduction in estimation of marginal means with data missing at random. In P. M. Pardalos (Ed.), Optimization and Data Analysis in Biomedical Informatics (pp. 99-115). (Fields Institute Communications; Vol. 63). https://doi.org/10.1007/978-1-4614-4133-5_5

@inbook{9b7fdad6e0b44f6fa92548e1fc735e4e,
title = "Strategies for bias reduction in estimation of marginal means with data missing at random",
abstract = "Incomplete data are common in many fields of research, and interest often lies in estimating a marginal mean based on available information. This paper is concerned with the comparison of different strategies for estimating the marginal mean of a response when data are missing at random. We evaluate these methods based on the asymptotic bias, empirical bias and efficiency. We show that complete case analysis gives biased results when data are missing at random, but inverse probability weighted estimating equations (IPWEE) and a method based on the expected conditional mean (ECM) yield consistent estimators. While these methods give estimators which behave similarly in the contexts studied, they are based on quite different assumptions. The IPWEE approach requires analysts to specify a model for the missing data mechanism, whereas the ECM approach requires a model for the distribution of auxiliary variables driving the missing data mechanism. The latter can be a challenge in practice, particularly when the covariates are of high dimension or are a mixture of continuous and categorical variables. The IPWEE approach therefore has considerable appeal in many practical settings.",
author = "Baojiang Chen and Cook, {Richard J.}",
year = "2012",
month = "12",
day = "1",
doi = "10.1007/978-1-4614-4133-5_5",
language = "English (US)",
isbn = "9781461441328",
series = "Fields Institute Communications",
pages = "99--115",
editor = "Pardalos, {Panos M}",
booktitle = "Optimization and Data Analysis in Biomedical Informatics",

}

TY - CHAP

T1 - Strategies for bias reduction in estimation of marginal means with data missing at random

AU - Chen, Baojiang

AU - Cook, Richard J.

PY - 2012/12/1

Y1 - 2012/12/1

N2 - Incomplete data are common in many fields of research, and interest often lies in estimating a marginal mean based on available information. This paper is concerned with the comparison of different strategies for estimating the marginal mean of a response when data are missing at random. We evaluate these methods based on the asymptotic bias, empirical bias and efficiency. We show that complete case analysis gives biased results when data are missing at random, but inverse probability weighted estimating equations (IPWEE) and a method based on the expected conditional mean (ECM) yield consistent estimators. While these methods give estimators which behave similarly in the contexts studied, they are based on quite different assumptions. The IPWEE approach requires analysts to specify a model for the missing data mechanism, whereas the ECM approach requires a model for the distribution of auxiliary variables driving the missing data mechanism. The latter can be a challenge in practice, particularly when the covariates are of high dimension or are a mixture of continuous and categorical variables. The IPWEE approach therefore has considerable appeal in many practical settings.

UR - http://www.scopus.com/inward/record.url?scp=84874341159&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84874341159&partnerID=8YFLogxK

U2 - 10.1007/978-1-4614-4133-5_5

DO - 10.1007/978-1-4614-4133-5_5

M3 - Chapter

AN - SCOPUS:84874341159

SN - 9781461441328

T3 - Fields Institute Communications

SP - 99

EP - 115

BT - Optimization and Data Analysis in Biomedical Informatics

A2 - Pardalos, Panos M

ER -