The impact of truncating data on the predictive ability for single-step genomic best linear unbiased prediction

Jeremy T. Howard, Tom A. Rathje, Caitlyn E. Bruns, Danielle F. Wilson-Wells, Stephen D. Kachman, Matthew L. Spangler

Research output: Contribution to journal › Article


Simulated and swine industry data sets were utilized to assess the impact of removing older data on the predictive ability of selection candidate estimated breeding values (EBV) when using single-step genomic best linear unbiased prediction (ssGBLUP). Simulated data included thirty replicates designed to mimic the structure of swine data sets. For the simulated data, varying amounts of data were truncated based on the number of ancestral generations back from the selection candidates. The swine data sets consisted of phenotypic and genotypic records for three traits across two breeds on animals born from 2003 to 2017. Phenotypes and genotypes were iteratively removed 1 year at a time based on the year an animal was born. For the swine data sets, correlations between corrected phenotypes (Cp) and EBV were used to evaluate the predictive ability on young animals born in 2016–2017. In the simulated data set, keeping data two generations back or greater resulted in no statistical difference (p-value > 0.05) in the reduction in the true breeding value at generation 15 compared to utilizing all available data. Across swine data sets, removing phenotypes from animals born prior to 2011 resulted in a negligible or a slight numerical increase in the correlation between Cp and EBV. Truncating data is a method to alleviate computational issues without negatively impacting the predictive ability of selection candidate EBV.
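The year-by-year truncation and the correlation between Cp and EBV described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the record layout (`id`, `birth_year`, `cp`) and the `fit_and_predict` callable, which stands in for whatever ssGBLUP software produces the EBV, are hypothetical names assumed here for the example.

```python
from math import sqrt


def truncate_by_birth_year(records, first_year_kept):
    """Drop records of animals born before first_year_kept."""
    return [r for r in records if r["birth_year"] >= first_year_kept]


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def predictive_ability(candidates, ebv_by_id):
    """Correlation between corrected phenotypes (Cp) and EBV of the candidates."""
    cp = [c["cp"] for c in candidates]
    ebv = [ebv_by_id[c["id"]] for c in candidates]
    return pearson(cp, ebv)


def truncation_scan(records, candidates, cutoff_years, fit_and_predict):
    """Refit once per cutoff year and record the resulting predictive ability.

    fit_and_predict(kept_records, candidates) is a hypothetical hook that must
    return a dict mapping candidate id -> EBV; the genetic evaluation itself
    is outside the scope of this sketch.
    """
    results = {}
    for cutoff in cutoff_years:
        kept = truncate_by_birth_year(records, cutoff)
        ebv_by_id = fit_and_predict(kept, candidates)
        results[cutoff] = predictive_ability(candidates, ebv_by_id)
    return results
```

Comparing the values returned by `truncation_scan` across cutoff years shows whether dropping older records changes the correlation for the selection candidates.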

Original language: English (US)
Pages (from-to): 251-262
Number of pages: 12
Journal: Journal of Animal Breeding and Genetics
Issue number: 4
State: Published - Aug 2018



Keywords

  • data reduction
  • single-step genomic BLUP
  • swine

ASJC Scopus subject areas

  • Food Animals
  • Animal Science and Zoology
