Comparing strategies for selection of low-density SNPs for imputation-mediated genomic prediction in U. S. Holsteins

Jun He, Jiaqi Xu, Xiao Lin Wu, Stewart Bauck, Jungjae Lee, Gota Morota, Stephen D. Kachman, Matthew L. Spangler

Research output: Contribution to journal › Article

2 Citations (Scopus)

Abstract

SNP chips are commonly used for genotyping animals in genomic selection, but strategies for selecting low-density (LD) SNPs for imputation-mediated genomic selection have not been addressed adequately. The main purpose of the present study was to compare the performance of eight LD (6K) SNP panels, each selected by a different strategy exploiting a combination of three major factors: evenly spaced SNPs, increased minor allele frequencies (MAF), and SNP-trait associations, either for single traits independently or for all three traits jointly. The imputation accuracies from 6K to 80K SNP genotypes were between 96.2 and 98.2%. Genomic prediction accuracies obtained using imputed 80K genotypes were between 0.817 and 0.821 for daughter pregnancy rate, between 0.838 and 0.844 for fat yield, and between 0.850 and 0.863 for milk yield. The two SNP panels optimized on the three major factors had the highest genomic prediction accuracy (0.821–0.863), and these accuracies were very close to those obtained using observed 80K genotypes (0.825–0.868). Further exploration of the underlying relationships showed that genomic prediction accuracies did not respond linearly to imputation accuracies, but were significantly affected by genotype (imputation) errors of SNPs in association with the traits to be predicted. SNPs optimal for map coverage and MAF were favorable for obtaining accurate imputation of genotypes, whereas trait-associated SNPs improved genomic prediction accuracies. Thus, the optimal LD SNP panels were the ones that combined both strengths. The present results have practical implications for the design of LD SNP chips for imputation-enabled genomic prediction.
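The three selection factors named in the abstract (even spacing, higher minor allele frequency, and SNP-trait association) can be illustrated with a toy sketch. The code below is not the authors' procedure: the `Snp` record, the equal-width binning, the additive score `maf + assoc_weight * assoc`, and the toy data are all assumptions made for illustration. The `concordance` helper shows one simple way an imputation accuracy could be expressed, as the fraction of matching genotype calls; the study may define it differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snp:
    name: str
    position: int   # base-pair position on a single chromosome
    maf: float      # minor allele frequency, between 0 and 0.5
    assoc: float    # hypothetical trait-association score, between 0 and 1


def select_panel(snps, n_bins, chrom_length, assoc_weight=1.0):
    """Keep at most one SNP per equal-width bin (even spacing),
    choosing the one with the highest MAF-plus-association score."""
    bin_width = chrom_length / n_bins
    best = {}
    for snp in snps:
        # Clamp so a SNP at the very end of the chromosome lands in the last bin.
        b = min(int(snp.position / bin_width), n_bins - 1)
        score = snp.maf + assoc_weight * snp.assoc
        if b not in best or score > best[b][0]:
            best[b] = (score, snp)
    return [best[b][1] for b in sorted(best)]


def concordance(true_genotypes, imputed_genotypes):
    """Fraction of genotype calls (coded 0/1/2) that agree."""
    if not true_genotypes or len(true_genotypes) != len(imputed_genotypes):
        raise ValueError("genotype lists must be non-empty and of equal length")
    matches = sum(t == i for t, i in zip(true_genotypes, imputed_genotypes))
    return matches / len(true_genotypes)


# Toy data, invented for this example only.
snps = [
    Snp("s1", 100, 0.10, 0.0),
    Snp("s2", 400, 0.45, 0.1),
    Snp("s3", 600, 0.30, 0.9),
    Snp("s4", 900, 0.50, 0.0),
]

# Panel using spacing, MAF, and association together.
print([s.name for s in select_panel(snps, n_bins=2, chrom_length=1000)])
# Panel using spacing and MAF only (association ignored).
print([s.name for s in select_panel(snps, n_bins=2, chrom_length=1000, assoc_weight=0.0)])
print(concordance([0, 1, 2, 1], [0, 1, 2, 2]))
```

In this toy setting, changing `assoc_weight` swaps which SNP represents the second bin, which mirrors the abstract's point that map coverage and MAF serve imputation while trait-associated SNPs serve prediction.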

Original language: English (US)
Pages (from-to): 137-149
Number of pages: 13
Journal: Genetica
Volume: 146
Issue number: 2
DOI: 10.1007/s10709-017-0004-9
State: Published - Apr 1 2018

Keywords

  • Genomic prediction
  • Holstein
  • Imputation
  • Low-density SNP chips

ASJC Scopus subject areas

  • Animal Science and Zoology
  • Genetics
  • Plant Science
  • Insect Science

Cite this

Comparing strategies for selection of low-density SNPs for imputation-mediated genomic prediction in U. S. Holsteins. / He, Jun; Xu, Jiaqi; Wu, Xiao Lin; Bauck, Stewart; Lee, Jungjae; Morota, Gota; Kachman, Stephen D.; Spangler, Matthew L. In: Genetica, Vol. 146, No. 2, 01.04.2018, p. 137-149.

@article{3e01472157b44be9a53c193a7b639788,
title = "Comparing strategies for selection of low-density SNPs for imputation-mediated genomic prediction in U. S. Holsteins",
keywords = "Genomic prediction, Holstein, Imputation, Low-density SNP chips",
author = "Jun He and Jiaqi Xu and Wu, {Xiao Lin} and Stewart Bauck and Jungjae Lee and Gota Morota and Kachman, {Stephen D.} and Spangler, {Matthew L.}",
year = "2018",
month = "4",
day = "1",
doi = "10.1007/s10709-017-0004-9",
language = "English (US)",
volume = "146",
pages = "137--149",
journal = "Genetica",
issn = "0016-6707",
publisher = "Springer Netherlands",
number = "2",

}

TY - JOUR
T1 - Comparing strategies for selection of low-density SNPs for imputation-mediated genomic prediction in U. S. Holsteins
AU - He, Jun
AU - Xu, Jiaqi
AU - Wu, Xiao Lin
AU - Bauck, Stewart
AU - Lee, Jungjae
AU - Morota, Gota
AU - Kachman, Stephen D.
AU - Spangler, Matthew L.
PY - 2018/4/1
Y1 - 2018/4/1
KW - Genomic prediction
KW - Holstein
KW - Imputation
KW - Low-density SNP chips
UR - http://www.scopus.com/inward/record.url?scp=85038081614&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85038081614&partnerID=8YFLogxK
U2 - 10.1007/s10709-017-0004-9
DO - 10.1007/s10709-017-0004-9
M3 - Article
C2 - 29243001
AN - SCOPUS:85038081614
VL - 146
SP - 137
EP - 149
JO - Genetica
JF - Genetica
SN - 0016-6707
IS - 2
ER -