The impact of truncating data on the predictive ability for single-step genomic best linear unbiased prediction

Jeremy T. Howard, Tom A. Rathje, Caitlyn E. Bruns, Danielle F. Wilson-Wells, Stephen D. Kachman, Matthew L. Spangler

Research output: Contribution to journal › Article

Abstract

Simulated and swine industry data sets were utilized to assess the impact of removing older data on the predictive ability of selection candidate estimated breeding values (EBV) when using single-step genomic best linear unbiased prediction (ssGBLUP). Simulated data included thirty replicates designed to mimic the structure of swine data sets. For the simulated data, varying amounts of data were truncated based on the number of ancestral generations back from the selection candidates. The swine data sets consisted of phenotypic and genotypic records for three traits across two breeds on animals born from 2003 to 2017. Phenotypes and genotypes were iteratively removed 1 year at a time based on the year an animal was born. For the swine data sets, correlations between corrected phenotypes (Cp) and EBV were used to evaluate the predictive ability on young animals born in 2016–2017. In the simulated data set, keeping data two generations back or greater resulted in no statistical difference (p-value > 0.05) in the reduction in the true breeding value at generation 15 compared to utilizing all available data. Across swine data sets, removing phenotypes from animals born prior to 2011 resulted in a negligible or a slight numerical increase in the correlation between Cp and EBV. Truncating data is a method to alleviate computational issues without negatively impacting the predictive ability of selection candidate EBV.
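The year-by-year truncation described for the swine data sets can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the record keys (`id`, `birth_year`, `cp`) and the `fit_ebv` callable, which stands in for an ssGBLUP run, are hypothetical. Holding the 2016–2017 animals out of the training data is also an assumption, since the abstract does not say how their records were handled.

```python
from math import sqrt


def truncate_by_birth_year(records, first_year_kept):
    """Drop records of animals born before first_year_kept."""
    return [r for r in records if r["birth_year"] >= first_year_kept]


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def truncation_curve(records, fit_ebv, cutoffs, validation_years=(2016, 2017)):
    """Return {cutoff year: correlation between Cp and EBV on validation animals}.

    fit_ebv is a hypothetical callable standing in for an ssGBLUP run: it takes
    the training records and the validation animal IDs and returns {id: EBV}.
    """
    # Assumption: validation animals are excluded from the training records.
    validation = [r for r in records if r["birth_year"] in validation_years]
    training = [r for r in records if r["birth_year"] not in validation_years]
    ids = [r["id"] for r in validation]
    cp = [r["cp"] for r in validation]

    curve = {}
    for cutoff in cutoffs:
        # Remove older data, re-estimate EBV, then score on the young animals.
        kept = truncate_by_birth_year(training, cutoff)
        ebv = fit_ebv(kept, ids)
        curve[cutoff] = pearson(cp, [ebv[i] for i in ids])
    return curve
```

Calling `truncation_curve(records, fit_ebv, range(2003, 2016))` would give one correlation per cutoff year, which is the kind of curve the abstract summarizes when it reports that removing phenotypes from animals born before 2011 changed the correlation only negligibly.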

Original language: English (US)
Pages (from-to): 251-262
Number of pages: 12
Journal: Journal of Animal Breeding and Genetics
Volume: 135
Issue number: 4
DOI: 10.1111/jbg.12334
State: Published - Aug 2018

Fingerprint

breeding value
breeding
swine
genomics
prediction
phenotype
pork industry
animals
young animals
datasets
industry
breeds
genotype

Keywords

  • data reduction
  • single-step genomic BLUP
  • swine

ASJC Scopus subject areas

  • Food Animals
  • Animal Science and Zoology

Cite this

The impact of truncating data on the predictive ability for single-step genomic best linear unbiased prediction. / Howard, Jeremy T.; Rathje, Tom A.; Bruns, Caitlyn E.; Wilson-Wells, Danielle F.; Kachman, Stephen D.; Spangler, Matthew L.

In: Journal of Animal Breeding and Genetics, Vol. 135, No. 4, 08.2018, p. 251-262.

@article{6179805a1bfa40baa18a7aea96384189,
title = "The impact of truncating data on the predictive ability for single-step genomic best linear unbiased prediction",
abstract = "Simulated and swine industry data sets were utilized to assess the impact of removing older data on the predictive ability of selection candidate estimated breeding values (EBV) when using single-step genomic best linear unbiased prediction (ssGBLUP). Simulated data included thirty replicates designed to mimic the structure of swine data sets. For the simulated data, varying amounts of data were truncated based on the number of ancestral generations back from the selection candidates. The swine data sets consisted of phenotypic and genotypic records for three traits across two breeds on animals born from 2003 to 2017. Phenotypes and genotypes were iteratively removed 1 year at a time based on the year an animal was born. For the swine data sets, correlations between corrected phenotypes (Cp) and EBV were used to evaluate the predictive ability on young animals born in 2016–2017. In the simulated data set, keeping data two generations back or greater resulted in no statistical difference (p-value > 0.05) in the reduction in the true breeding value at generation 15 compared to utilizing all available data. Across swine data sets, removing phenotypes from animals born prior to 2011 resulted in a negligible or a slight numerical increase in the correlation between Cp and EBV. Truncating data is a method to alleviate computational issues without negatively impacting the predictive ability of selection candidate EBV.",
keywords = "data reduction, single-step genomic BLUP, swine",
author = "Howard, {Jeremy T.} and Rathje, {Tom A.} and Bruns, {Caitlyn E.} and Wilson-Wells, {Danielle F.} and Kachman, {Stephen D.} and Spangler, {Matthew L.}",
year = "2018",
month = "8",
doi = "10.1111/jbg.12334",
language = "English (US)",
volume = "135",
pages = "251--262",
journal = "Journal of Animal Breeding and Genetics",
issn = "0931-2668",
publisher = "Wiley-Blackwell",
number = "4",
}

TY - JOUR

T1 - The impact of truncating data on the predictive ability for single-step genomic best linear unbiased prediction

AU - Howard, Jeremy T.

AU - Rathje, Tom A.

AU - Bruns, Caitlyn E.

AU - Wilson-Wells, Danielle F.

AU - Kachman, Stephen D.

AU - Spangler, Matthew L.

PY - 2018/8

Y1 - 2018/8

N2 - Simulated and swine industry data sets were utilized to assess the impact of removing older data on the predictive ability of selection candidate estimated breeding values (EBV) when using single-step genomic best linear unbiased prediction (ssGBLUP). Simulated data included thirty replicates designed to mimic the structure of swine data sets. For the simulated data, varying amounts of data were truncated based on the number of ancestral generations back from the selection candidates. The swine data sets consisted of phenotypic and genotypic records for three traits across two breeds on animals born from 2003 to 2017. Phenotypes and genotypes were iteratively removed 1 year at a time based on the year an animal was born. For the swine data sets, correlations between corrected phenotypes (Cp) and EBV were used to evaluate the predictive ability on young animals born in 2016–2017. In the simulated data set, keeping data two generations back or greater resulted in no statistical difference (p-value > 0.05) in the reduction in the true breeding value at generation 15 compared to utilizing all available data. Across swine data sets, removing phenotypes from animals born prior to 2011 resulted in a negligible or a slight numerical increase in the correlation between Cp and EBV. Truncating data is a method to alleviate computational issues without negatively impacting the predictive ability of selection candidate EBV.

KW - data reduction

KW - single-step genomic BLUP

KW - swine

UR - http://www.scopus.com/inward/record.url?scp=85050034089&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85050034089&partnerID=8YFLogxK

U2 - 10.1111/jbg.12334

DO - 10.1111/jbg.12334

M3 - Article

C2 - 29882604

AN - SCOPUS:85050034089

VL - 135

SP - 251

EP - 262

JO - Journal of Animal Breeding and Genetics

JF - Journal of Animal Breeding and Genetics

SN - 0931-2668

IS - 4

ER -