### Abstract

In Prequential analysis, an inference method is viewed as a forecasting system, and the quality of the inference method is judged by the quality of its predictions. This is an alternative to more traditional statistical methods that focus on inferring the parameters of the data-generating distribution. In this paper, we introduce adaptive combined average predictors (ACAPs) for the Prequential analysis of complex data. That is, we use convex combinations of two different model averages to form a predictor at each time step in a sequence. A novel feature of our strategy is that the models in each average are re-chosen adaptively at each time step. To assess the complexity of a given data set, we introduce measures of data complexity for continuous response data. We validate our measures in several simulated contexts before using them in real data examples. The performance of ACAPs is compared with that of predictors based on stacking or likelihood-weighted averaging in several model classes and in both simulated and real data sets. Our results suggest that ACAPs achieve a better trade-off between model list bias and model list variability when the data are very complex. This implies that the choices of model class and averaging method should be guided by a concept of complexity matching, i.e., the analysis of a complex data set may require a more complex model class and averaging strategy than the analysis of a simpler data set. We propose that complexity matching is akin to a bias-variance trade-off in statistical modeling.
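The prequential idea in the abstract (predict each response from earlier data only, then score the prediction once the response is revealed) can be illustrated with a minimal sketch. This is not the authors' ACAP procedure: the two component predictors (`running_mean` and `window_mean`), the weight grid, and the rule in `best_weight` are hypothetical stand-ins chosen only to show a convex combination of two predictors whose weight is re-chosen at every time step.

```python
from statistics import fmean


def running_mean(history):
    """Stand-in predictor A (assumption): mean of all past responses."""
    return fmean(history) if history else 0.0


def window_mean(history, k=3):
    """Stand-in predictor B (assumption): mean of the last k responses."""
    return fmean(history[-k:]) if history else 0.0


def best_weight(preds_a, preds_b, ys, grid):
    """Pick the convex weight with the smallest past cumulative squared error."""
    def loss(w):
        return sum((w * a + (1 - w) * b - y) ** 2
                   for a, b, y in zip(preds_a, preds_b, ys))
    return min(grid, key=loss)


def prequential_combined(ys, grid=None):
    """Predict each y_t from earlier responses only, then score the prediction."""
    grid = grid or [i / 10 for i in range(11)]  # candidate weights in [0, 1]
    preds_a, preds_b, combined = [], [], []
    cumulative_loss = 0.0
    for t, y in enumerate(ys):
        history = ys[:t]  # data available before y_t is revealed
        a, b = running_mean(history), window_mean(history)
        # The weight is re-chosen at every step from past performance only.
        w = best_weight(preds_a, preds_b, history, grid) if t > 0 else 0.5
        y_hat = w * a + (1 - w) * b  # convex combination of the two predictors
        combined.append(y_hat)
        cumulative_loss += (y_hat - y) ** 2  # scored after y_t is observed
        preds_a.append(a)
        preds_b.append(b)
    return combined, cumulative_loss
```

Calling `prequential_combined([1.0, 2.0, 3.0, 4.0])` returns the sequence of one-step-ahead predictions together with the cumulative squared prediction error, which is the kind of quantity a prequential comparison of predictors would rank. Under these assumptions, swapping in richer component predictors changes only `running_mean` and `window_mean`.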

Original language | English (US) |
---|---|
Pages (from-to) | 274-290 |
Number of pages | 17 |
Journal | Statistical Analysis and Data Mining |
Volume | 2 |
Issue number | 4 |
DOIs | https://doi.org/10.1002/sam.10052 |
State | Published - Nov 1 2009 |

### Keywords

- Bayes model averaging
- Complexity
- Model selection
- Model uncertainty
- Predictive optimality
- Prequential analysis
- Stacking

### ASJC Scopus subject areas

- Analysis
- Information Systems
- Computer Science Applications

### Cite this

Clarke, Jennifer; Clarke, Bertrand. **Prequential analysis of complex data with adaptive model reselection.** *Statistical Analysis and Data Mining*, vol. 2, no. 4, 2009, pp. 274-290. https://doi.org/10.1002/sam.10052

Research output: Contribution to journal › Article

TY - JOUR

T1 - Prequential analysis of complex data with adaptive model reselection

AU - Clarke, Jennifer

AU - Clarke, Bertrand

PY - 2009/11/1

Y1 - 2009/11/1

KW - Bayes model averaging

KW - Complexity

KW - Model selection

KW - Model uncertainty

KW - Predictive optimality

KW - Prequential analysis

KW - Stacking

UR - http://www.scopus.com/inward/record.url?scp=77950257361&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=77950257361&partnerID=8YFLogxK

U2 - 10.1002/sam.10052

DO - 10.1002/sam.10052

M3 - Article

C2 - 20617104

AN - SCOPUS:77950257361

VL - 2

SP - 274

EP - 290

JO - Statistical Analysis and Data Mining

JF - Statistical Analysis and Data Mining

SN - 1932-1872

IS - 4

ER -