A Comparison of Population-Averaged and Cluster-Specific Approaches in the Context of Unequal Probabilities of Selection

Natalie A. Koziol, James A. Bovaird, Sonia Suarez

Research output: Contribution to journal › Article

3 Citations (Scopus)

Abstract

Sampling designs of large-scale survey studies are typically complex, involving multiple design features such as clustering and unequal probabilities of selection. Single-level (i.e., population-averaged) methods that use adjusted variance estimators and multilevel (i.e., cluster-specific) methods provide two alternatives for modeling clustered data. Although the literature comparing these methods is vast, comparisons have been limited to the context in which all sampling units are selected with equal probabilities (thus circumventing the need for sampling weights). The goal of this study was to determine under what conditions single-level and multilevel estimators outperform one another in the context of a two-stage sampling design with unequal probabilities of selection. Monte Carlo simulation methods were used to evaluate the impact of several factors, including population model, informativeness of the design, distribution of the outcome variable, intraclass correlation coefficient, cluster size, and estimation method. Results indicated that the unweighted estimators performed similarly across conditions, whereas the weighted single-level estimators tended to outperform the weighted multilevel estimators, particularly under nonideal sample conditions. Multilevel weight approximation methods did not perform well when the design was informative. An empirical example is provided to demonstrate how researchers might investigate the implications of the simulation results in practice.
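The role of sampling weights under an informative design can be illustrated with a small simulation. The sketch below is not the study's simulation design or any of its estimators. It is a minimal, assumed setup (the function name `simulate`, the selection probabilities 0.4 and 0.1, the cluster counts, and the outcome model are all hypothetical) in which clusters with larger random effects are more likely to be selected at the first stage, and units are drawn by simple random sampling at the second stage. It compares an unweighted sample mean with an inverse-probability-weighted (Hajek-type) mean.

```python
import random


def simulate(seed=1, n_clusters=2000, cluster_size=50, n_per_cluster=10):
    """Two-stage sample with informative cluster selection (illustrative only).

    Returns (true population mean, unweighted sample mean, weighted sample mean).
    All design values are assumptions chosen for illustration.
    """
    rng = random.Random(seed)
    pop_total = 0.0
    sample = []  # list of (outcome, sampling weight) pairs

    for _ in range(n_clusters):
        u = rng.gauss(0.0, 1.0)  # cluster random effect
        ys = [10.0 + u + rng.gauss(0.0, 1.0) for _ in range(cluster_size)]
        pop_total += sum(ys)

        # Stage 1: selection probability depends on the cluster effect,
        # which makes the design informative for the outcome.
        p_cluster = 0.4 if u > 0 else 0.1
        if rng.random() < p_cluster:
            # Stage 2: simple random sample of units within the cluster.
            p_unit = n_per_cluster / cluster_size
            weight = 1.0 / (p_cluster * p_unit)  # inverse overall inclusion probability
            for y in rng.sample(ys, n_per_cluster):
                sample.append((y, weight))

    true_mean = pop_total / (n_clusters * cluster_size)
    unweighted = sum(y for y, _ in sample) / len(sample)
    weighted = sum(w * y for y, w in sample) / sum(w for _, w in sample)
    return true_mean, unweighted, weighted


if __name__ == "__main__":
    true_mean, unweighted, weighted = simulate()
    print(f"true mean:       {true_mean:.3f}")
    print(f"unweighted mean: {unweighted:.3f}")
    print(f"weighted mean:   {weighted:.3f}")
```

Under these assumptions the unweighted mean is pulled upward because high-outcome clusters are oversampled, while the weighted mean should land much closer to the population value. The sketch covers only a point estimate of a mean; it does not address the adjusted variance estimators, multilevel models, or weight approximation methods that the study compares.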

Original language: English (US)
Pages (from-to): 325-349
Number of pages: 25
Journal: Multivariate Behavioral Research
Volume: 52
Issue number: 3
DOI: 10.1080/00273171.2017.1292115
State: Published, May 4, 2017

Fingerprint

Unequal
Estimator
Sampling Design
Population
Two-stage Sampling
Intraclass Correlation Coefficient
Two-stage Design
Clustered Data
Variance Estimator
Factor Models
Population Model
Weights and Measures
Monte Carlo Method
Approximation Methods
Simulation Methods
Monte Carlo Simulation
Clustering
Cluster Analysis
Unit

Keywords

  • Clustered data
  • cluster-specific models
  • multilevel models
  • population-averaged models
  • sampling weights

ASJC Scopus subject areas

  • Statistics and Probability
  • Experimental and Cognitive Psychology
  • Arts and Humanities (miscellaneous)

Cite this

A Comparison of Population-Averaged and Cluster-Specific Approaches in the Context of Unequal Probabilities of Selection. / Koziol, Natalie A.; Bovaird, James A.; Suarez, Sonia.

In: Multivariate Behavioral Research, Vol. 52, No. 3, 04.05.2017, p. 325-349.

@article{b9671ff3b6ce4c7d826a31fc64989b75,
title = "A Comparison of Population-Averaged and Cluster-Specific Approaches in the Context of Unequal Probabilities of Selection",
abstract = "Sampling designs of large-scale survey studies are typically complex, involving multiple design features such as clustering and unequal probabilities of selection. Single-level (i.e., population-averaged) methods that use adjusted variance estimators and multilevel (i.e., cluster-specific) methods provide two alternatives for modeling clustered data. Although the literature comparing these methods is vast, comparisons have been limited to the context in which all sampling units are selected with equal probabilities (thus circumventing the need for sampling weights). The goal of this study was to determine under what conditions single-level and multilevel estimators outperform one another in the context of a two-stage sampling design with unequal probabilities of selection. Monte Carlo simulation methods were used to evaluate the impact of several factors, including population model, informativeness of the design, distribution of the outcome variable, intraclass correlation coefficient, cluster size, and estimation method. Results indicated that the unweighted estimators performed similarly across conditions, whereas the weighted single-level estimators tended to outperform the weighted multilevel estimators, particularly under nonideal sample conditions. Multilevel weight approximation methods did not perform well when the design was informative. An empirical example is provided to demonstrate how researchers might investigate the implications of the simulation results in practice.",
keywords = "Clustered data, cluster-specific models, multilevel models, population-averaged models, sampling weights",
author = "Koziol, {Natalie A.} and Bovaird, {James A.} and Suarez, Sonia",
year = "2017",
month = "5",
day = "4",
doi = "10.1080/00273171.2017.1292115",
language = "English (US)",
volume = "52",
pages = "325--349",
journal = "Multivariate Behavioral Research",
issn = "0027-3171",
publisher = "Psychology Press Ltd",
number = "3",
}

TY - JOUR

T1 - A Comparison of Population-Averaged and Cluster-Specific Approaches in the Context of Unequal Probabilities of Selection

AU - Koziol, Natalie A.

AU - Bovaird, James A.

AU - Suarez, Sonia

PY - 2017/5/4

Y1 - 2017/5/4

AB - Sampling designs of large-scale survey studies are typically complex, involving multiple design features such as clustering and unequal probabilities of selection. Single-level (i.e., population-averaged) methods that use adjusted variance estimators and multilevel (i.e., cluster-specific) methods provide two alternatives for modeling clustered data. Although the literature comparing these methods is vast, comparisons have been limited to the context in which all sampling units are selected with equal probabilities (thus circumventing the need for sampling weights). The goal of this study was to determine under what conditions single-level and multilevel estimators outperform one another in the context of a two-stage sampling design with unequal probabilities of selection. Monte Carlo simulation methods were used to evaluate the impact of several factors, including population model, informativeness of the design, distribution of the outcome variable, intraclass correlation coefficient, cluster size, and estimation method. Results indicated that the unweighted estimators performed similarly across conditions, whereas the weighted single-level estimators tended to outperform the weighted multilevel estimators, particularly under nonideal sample conditions. Multilevel weight approximation methods did not perform well when the design was informative. An empirical example is provided to demonstrate how researchers might investigate the implications of the simulation results in practice.

KW - Clustered data

KW - cluster-specific models

KW - multilevel models

KW - population-averaged models

KW - sampling weights

UR - http://www.scopus.com/inward/record.url?scp=85014765451&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85014765451&partnerID=8YFLogxK

U2 - 10.1080/00273171.2017.1292115

DO - 10.1080/00273171.2017.1292115

M3 - Article

C2 - 28281792

AN - SCOPUS:85014765451

VL - 52

SP - 325

EP - 349

JO - Multivariate Behavioral Research

JF - Multivariate Behavioral Research

SN - 0027-3171

IS - 3

ER -