The causal meaning of genomic predictors and how it affects construction and comparison of genome-enabled selection models

Bruno D. Valente, Gota Morota, Francisco Peñagaricano, Daniel Gianola, Kent Weigel, Guilherme J.M. Rosa

Research output: Contribution to journal (Article)

9 Citations (Scopus)

Abstract

The term “effect” in additive genetic effect suggests a causal meaning. However, inferences of such quantities for selection purposes are typically viewed and conducted as a prediction task. Predictive ability as tested by cross-validation is currently the most acceptable criterion for comparing models and evaluating new methodologies. Nevertheless, it does not directly indicate if predictors reflect causal effects. Such evaluations would require causal inference methods that are not typical in genomic prediction for selection. This suggests that the usual approach to infer genetic effects contradicts the label of the quantity inferred. Here we investigate if genomic predictors for selection should be treated as standard predictors or if they must reflect a causal effect to be useful, requiring causal inference methods. Conducting the analysis as a prediction or as a causal inference task affects, for example, how covariates of the regression model are chosen, which may heavily affect the magnitude of genomic predictors and therefore selection decisions. We demonstrate that selection requires learning causal genetic effects. However, genomic predictors from some models might capture noncausal signal, providing good predictive ability but poorly representing true genetic effects. Simulated examples are used to show that aiming for predictive ability may lead to poor modeling decisions, while causal inference approaches may guide the construction of regression models that better infer the target genetic effect even when they underperform in cross-validation tests. In conclusion, genomic selection models should be constructed to aim primarily for identifiability of causal genetic effects, not for predictive ability.
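The contrast the abstract draws between predictive ability and causal meaning can be illustrated with a toy simulation. The sketch below is not the authors' simulation. It assumes a hypothetical scenario in which a genetic value `g` affects the phenotype `y` both directly and through an intermediate covariate `x`, and all names and parameter values (`a`, `b`, `c`, the sample size, the seed) are illustrative assumptions. Adding `x` to the regression improves hold-out prediction, yet the coefficient on `g` then reflects only the direct effect rather than the total effect of genotype on phenotype.

```python
import random


def center(values):
    """Subtract the mean so that no explicit intercept is needed."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def corr(u, v):
    """Pearson correlation, used here as the measure of predictive ability."""
    uc, vc = center(u), center(v)
    return dot(uc, vc) / (dot(uc, uc) * dot(vc, vc)) ** 0.5


def simulate(n, a, b, c, seed):
    """Assumed causal structure: g -> x -> y and g -> y."""
    rng = random.Random(seed)
    g = [rng.gauss(0.0, 1.0) for _ in range(n)]
    x = [a * gi + rng.gauss(0.0, 1.0) for gi in g]
    y = [b * gi + c * xi + rng.gauss(0.0, 1.0) for gi, xi in zip(g, x)]
    return g, x, y


def fit_g_only(g, y):
    """Least-squares slope of y on g alone."""
    gc, yc = center(g), center(y)
    return dot(gc, yc) / dot(gc, gc)


def fit_g_and_x(g, x, y):
    """Least-squares slopes of y on g and x (2x2 normal equations)."""
    gc, xc, yc = center(g), center(x), center(y)
    sgg, sxx, sgx = dot(gc, gc), dot(xc, xc), dot(gc, xc)
    sgy, sxy = dot(gc, yc), dot(xc, yc)
    det = sgg * sxx - sgx ** 2
    slope_g = (sgy * sxx - sxy * sgx) / det
    slope_x = (sxy * sgg - sgy * sgx) / det
    return slope_g, slope_x


def compare_models(n=20000, a=0.8, b=0.5, c=1.0, seed=1):
    g, x, y = simulate(n, a, b, c, seed)
    half = n // 2  # first half for fitting, second half held out
    g_tr, x_tr, y_tr = g[:half], x[:half], y[:half]
    g_te, x_te, y_te = g[half:], x[half:], y[half:]

    slope_a = fit_g_only(g_tr, y_tr)
    slope_b, slope_x = fit_g_and_x(g_tr, x_tr, y_tr)

    pred_a = [slope_a * gi for gi in g_te]
    pred_b = [slope_b * gi + slope_x * xi for gi, xi in zip(g_te, x_te)]

    return {
        "total_effect": b + a * c,   # effect of g on y through all paths
        "direct_effect": b,          # effect of g on y not passing through x
        "g_coef_without_x": slope_a,
        "g_coef_with_x": slope_b,
        "holdout_corr_without_x": corr(pred_a, y_te),
        "holdout_corr_with_x": corr(pred_b, y_te),
    }


result = compare_models()
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions, the model with `x` predicts held-out phenotypes better, while the model without `x` is the one whose `g` coefficient tracks the total effect. Which of the two quantities is the appropriate selection target depends on the causal structure, which is the modeling question the abstract raises.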

Original language: English (US)
Pages (from-to): 483-494
Number of pages: 12
Journal: Genetics
Volume: 200
Issue number: 2
DOI: 10.1534/genetics.114.169490
State: Published - Jan 1 2015

Fingerprint

Aptitude
Genome
Decision Support Techniques
Learning

Keywords

  • Causal inference
  • Genomic selection
  • Genpred
  • Model comparison
  • Prediction
  • Selection
  • Shared data resource

ASJC Scopus subject areas

  • Genetics

Cite this

The causal meaning of genomic predictors and how it affects construction and comparison of genome-enabled selection models. / Valente, Bruno D.; Morota, Gota; Peñagaricano, Francisco; Gianola, Daniel; Weigel, Kent; Rosa, Guilherme J.M.

In: Genetics, Vol. 200, No. 2, 01.01.2015, p. 483-494.

@article{b15d3082088141a2a05109141facf6ab,
title = "The causal meaning of genomic predictors and how it affects construction and comparison of genome-enabled selection models",
keywords = "Causal inference, Genomic selection, Genpred, Model comparison, Prediction, Selection, Shared data resource",
author = "Valente, {Bruno D.} and Gota Morota and Francisco Pe{\~n}agaricano and Daniel Gianola and Kent Weigel and Rosa, {Guilherme J.M.}",
year = "2015",
month = "1",
day = "1",
doi = "10.1534/genetics.114.169490",
language = "English (US)",
volume = "200",
pages = "483--494",
journal = "Genetics",
issn = "0016-6731",
publisher = "Genetics Society of America",
number = "2",

}

TY - JOUR
T1 - The causal meaning of genomic predictors and how it affects construction and comparison of genome-enabled selection models
AU - Valente, Bruno D.
AU - Morota, Gota
AU - Peñagaricano, Francisco
AU - Gianola, Daniel
AU - Weigel, Kent
AU - Rosa, Guilherme J.M.
PY - 2015/1/1
Y1 - 2015/1/1
KW - Causal inference
KW - Genomic selection
KW - Genpred
KW - Model comparison
KW - Prediction
KW - Selection
KW - Shared data resource
UR - http://www.scopus.com/inward/record.url?scp=84931317300&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84931317300&partnerID=8YFLogxK
U2 - 10.1534/genetics.114.169490
DO - 10.1534/genetics.114.169490
M3 - Article
C2 - 25908318
AN - SCOPUS:84931317300
VL - 200
SP - 483
EP - 494
JO - Genetics
JF - Genetics
SN - 0016-6731
IS - 2
ER -