On the use of a log-rate model for survey-weighted categorical data

Thomas M. Loughin, Christopher R. Bilder

Research output: Contribution to journal › Article

1 Citation (Scopus)

Abstract

For the analysis of survey-weighted categorical data, one recommended method of analysis is a log-rate model. For each cell in a contingency table, the survey weights are averaged across subjects and incorporated into an offset for a loglinear model. Supposedly, one can then proceed with the analysis of unweighted observed cell counts. We provide theoretical and simulation-based evidence to show that the log-rate analysis is not an effective statistical analysis method and should not be used in general. The root of the problem is in its failure to properly account for variability in the individual weights within cells of a contingency table. This results in goodness-of-fit tests that have higher-than-nominal error rates and confidence intervals for odds ratios that have lower-than-nominal coverage.
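The offset construction described in the abstract can be sketched roughly as follows. This is only an illustration, not the authors' code: the function name `log_rate_offsets`, the toy records, and the sign convention (offset = minus the log of the mean cell weight, so that the loglinear model is fitted to unweighted counts) are assumptions made here for the example.

```python
import math
from collections import defaultdict


def log_rate_offsets(records):
    """Return {cell: (unweighted_count, offset)} for (cell, weight) records.

    Assumption for this sketch: the offset is -log(mean weight in the cell).
    """
    totals = defaultdict(lambda: [0, 0.0])  # cell -> [count, sum of weights]
    for cell, weight in records:
        totals[cell][0] += 1
        totals[cell][1] += weight

    result = {}
    for cell, (count, weight_sum) in totals.items():
        mean_weight = weight_sum / count  # weights averaged across subjects
        result[cell] = (count, -math.log(mean_weight))
    return result


# Hypothetical 2x2 table: each record is (cell label, survey weight).
sample = [
    (("a", "x"), 2.0),
    (("a", "x"), 4.0),
    (("a", "y"), 1.0),
    (("b", "x"), 3.0),
    (("b", "y"), 1.5),
    (("b", "y"), 0.5),
]
offsets = log_rate_offsets(sample)
```

Note that the two subjects in cell `("a", "x")` carry different weights (2.0 and 4.0), yet only their mean enters the offset. That discarded within-cell variability is what the abstract identifies as the root of the problem.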

Original language: English (US)
Pages (from-to): 2661-2669
Number of pages: 9
Journal: Communications in Statistics - Theory and Methods
Volume: 40
Issue number: 15
DOI: 10.1080/03610926.2010.489178
State: Published - Jan 1 2011

Keywords

  • Clogg-Eliason
  • Contingency table
  • Loglinear model
  • Offset
  • Rao-Scott
  • Survey sampling

ASJC Scopus subject areas

  • Statistics and Probability

Cite this

On the use of a log-rate model for survey-weighted categorical data. / Loughin, Thomas M.; Bilder, Christopher R.

In: Communications in Statistics - Theory and Methods, Vol. 40, No. 15, 01.01.2011, p. 2661-2669.


@article{8a7db94c199b4cdab3991aa76589876d,
title = "On the use of a log-rate model for survey-weighted categorical data",
abstract = "For the analysis of survey-weighted categorical data, one recommended method of analysis is a log-rate model. For each cell in a contingency table, the survey weights are averaged across subjects and incorporated into an offset for a loglinear model. Supposedly, one can then proceed with the analysis of unweighted observed cell counts. We provide theoretical and simulation-based evidence to show that the log-rate analysis is not an effective statistical analysis method and should not be used in general. The root of the problem is in its failure to properly account for variability in the individual weights within cells of a contingency table. This results in goodness-of-fit tests that have higher-than-nominal error rates and confidence intervals for odds ratios that have lower-than-nominal coverage.",
keywords = "Clogg-Eliason, Contingency table, Loglinear model, Offset, Rao-Scott, Survey sampling",
author = "Loughin, {Thomas M.} and Bilder, {Christopher R.}",
year = "2011",
month = "1",
day = "1",
doi = "10.1080/03610926.2010.489178",
language = "English (US)",
volume = "40",
pages = "2661--2669",
journal = "Communications in Statistics - Theory and Methods",
issn = "0361-0926",
publisher = "Taylor and Francis Ltd.",
number = "15",
}

TY - JOUR

T1 - On the use of a log-rate model for survey-weighted categorical data

AU - Loughin, Thomas M.

AU - Bilder, Christopher R.

PY - 2011/1/1

Y1 - 2011/1/1

N2 - For the analysis of survey-weighted categorical data, one recommended method of analysis is a log-rate model. For each cell in a contingency table, the survey weights are averaged across subjects and incorporated into an offset for a loglinear model. Supposedly, one can then proceed with the analysis of unweighted observed cell counts. We provide theoretical and simulation-based evidence to show that the log-rate analysis is not an effective statistical analysis method and should not be used in general. The root of the problem is in its failure to properly account for variability in the individual weights within cells of a contingency table. This results in goodness-of-fit tests that have higher-than-nominal error rates and confidence intervals for odds ratios that have lower-than-nominal coverage.

KW - Clogg-Eliason

KW - Contingency table

KW - Loglinear model

KW - Offset

KW - Rao-Scott

KW - Survey sampling

UR - http://www.scopus.com/inward/record.url?scp=79956127880&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=79956127880&partnerID=8YFLogxK

U2 - 10.1080/03610926.2010.489178

DO - 10.1080/03610926.2010.489178

M3 - Article

VL - 40

SP - 2661

EP - 2669

JO - Communications in Statistics - Theory and Methods

JF - Communications in Statistics - Theory and Methods

SN - 0361-0926

IS - 15

ER -