Kernel-based whole-genome prediction of complex traits

A review

Gota Morota, Daniel Gianola

Research output: Contribution to journalReview article

46 Citations (Scopus)

Abstract

Prediction of genetic values has been a focus of applied quantitative genetics since the beginning of the 20th century, with renewed interest following the advent of the era of whole genome-enabled prediction. Opportunities offered by the emergence of high-dimensional genomic data fueled by post-Sanger sequencing technologies, especially molecular markers, have driven researchers to extend Ronald Fisher and Sewall Wright's models to confront new challenges. In particular, kernel methods are gaining consideration as a regression method of choice for genome-enabled prediction. Complex traits are presumably influenced by many genomic regions working in concert with others (clearly so when considering pathways), thus generating interactions. Motivated by this view, a growing number of statistical approaches based on kernels attempt to capture non-additive effects, either parametrically or non-parametrically. This review centers on whole-genome regression using kernel methods applied to a wide range of quantitative traits of agricultural importance in animals and plants. We discuss various kernel-based approaches tailored to capturing total genetic variation, with the aim of arriving at an enhanced predictive performance in the light of available genome annotation information. Connections between prediction machines born in animal breeding, statistics, and machine learning are revisited, and their empirical prediction performance is discussed. Overall, while some encouraging results have been obtained with non-parametric kernels, recovering non-additive genetic variation in a validation dataset remains a challenge in quantitative genetics.

Original languageEnglish (US)
Article number363
JournalFrontiers in Genetics
Volume5
Issue numberOCT
DOIs
StatePublished - Jan 1 2014

Fingerprint

Genome
Breeding
Research Personnel
Technology
Machine Learning
Datasets

Keywords

  • Kernel methods
  • SNP
  • Semi-parametric regression
  • Spatial distance
  • Whole-genome prediction

ASJC Scopus subject areas

  • Molecular Medicine
  • Genetics
  • Genetics(clinical)

Cite this

Kernel-based whole-genome prediction of complex traits : A review. / Morota, Gota; Gianola, Daniel.

In: Frontiers in Genetics, Vol. 5, No. OCT, 363, 01.01.2014.

Research output: Contribution to journalReview article

@article{b22bb4247d6b4575b38e617ba463ab41,
title = "Kernel-based whole-genome prediction of complex traits: A review",
abstract = "Prediction of genetic values has been a focus of applied quantitative genetics since the beginning of the 20th century, with renewed interest following the advent of the era of whole genome-enabled prediction. Opportunities offered by the emergence of high-dimensional genomic data fueled by post-Sanger sequencing technologies, especially molecular markers, have driven researchers to extend Ronald Fisher and Sewall Wright's models to confront new challenges. In particular, kernel methods are gaining consideration as a regression method of choice for genome-enabled prediction. Complex traits are presumably influenced by many genomic regions working in concert with others (clearly so when considering pathways), thus generating interactions. Motivated by this view, a growing number of statistical approaches based on kernels attempt to capture non-additive effects, either parametrically or non-parametrically. This review centers on whole-genome regression using kernel methods applied to a wide range of quantitative traits of agricultural importance in animals and plants. We discuss various kernel-based approaches tailored to capturing total genetic variation, with the aim of arriving at an enhanced predictive performance in the light of available genome annotation information. Connections between prediction machines born in animal breeding, statistics, and machine learning are revisited, and their empirical prediction performance is discussed. Overall, while some encouraging results have been obtained with non-parametric kernels, recovering non-additive genetic variation in a validation dataset remains a challenge in quantitative genetics.",
keywords = "Kernel methods, SNP, Semi-parametric regression, Spatial distance, Whole-genome prediction",
author = "Gota Morota and Daniel Gianola",
year = "2014",
month = "1",
day = "1",
doi = "10.3389/fgene.2014.00363",
language = "English (US)",
volume = "5",
journal = "Frontiers in Genetics",
issn = "1664-8021",
publisher = "Frontiers Media S. A.",
number = "OCT",

}

TY - JOUR

T1 - Kernel-based whole-genome prediction of complex traits

T2 - A review

AU - Morota, Gota

AU - Gianola, Daniel

PY - 2014/1/1

Y1 - 2014/1/1

N2 - Prediction of genetic values has been a focus of applied quantitative genetics since the beginning of the 20th century, with renewed interest following the advent of the era of whole genome-enabled prediction. Opportunities offered by the emergence of high-dimensional genomic data fueled by post-Sanger sequencing technologies, especially molecular markers, have driven researchers to extend Ronald Fisher and Sewall Wright's models to confront new challenges. In particular, kernel methods are gaining consideration as a regression method of choice for genome-enabled prediction. Complex traits are presumably influenced by many genomic regions working in concert with others (clearly so when considering pathways), thus generating interactions. Motivated by this view, a growing number of statistical approaches based on kernels attempt to capture non-additive effects, either parametrically or non-parametrically. This review centers on whole-genome regression using kernel methods applied to a wide range of quantitative traits of agricultural importance in animals and plants. We discuss various kernel-based approaches tailored to capturing total genetic variation, with the aim of arriving at an enhanced predictive performance in the light of available genome annotation information. Connections between prediction machines born in animal breeding, statistics, and machine learning are revisited, and their empirical prediction performance is discussed. Overall, while some encouraging results have been obtained with non-parametric kernels, recovering non-additive genetic variation in a validation dataset remains a challenge in quantitative genetics.

AB - Prediction of genetic values has been a focus of applied quantitative genetics since the beginning of the 20th century, with renewed interest following the advent of the era of whole genome-enabled prediction. Opportunities offered by the emergence of high-dimensional genomic data fueled by post-Sanger sequencing technologies, especially molecular markers, have driven researchers to extend Ronald Fisher and Sewall Wright's models to confront new challenges. In particular, kernel methods are gaining consideration as a regression method of choice for genome-enabled prediction. Complex traits are presumably influenced by many genomic regions working in concert with others (clearly so when considering pathways), thus generating interactions. Motivated by this view, a growing number of statistical approaches based on kernels attempt to capture non-additive effects, either parametrically or non-parametrically. This review centers on whole-genome regression using kernel methods applied to a wide range of quantitative traits of agricultural importance in animals and plants. We discuss various kernel-based approaches tailored to capturing total genetic variation, with the aim of arriving at an enhanced predictive performance in the light of available genome annotation information. Connections between prediction machines born in animal breeding, statistics, and machine learning are revisited, and their empirical prediction performance is discussed. Overall, while some encouraging results have been obtained with non-parametric kernels, recovering non-additive genetic variation in a validation dataset remains a challenge in quantitative genetics.

KW - Kernel methods

KW - SNP

KW - Semi-parametric regression

KW - Spatial distance

KW - Whole-genome prediction

UR - http://www.scopus.com/inward/record.url?scp=84949115828&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84949115828&partnerID=8YFLogxK

U2 - 10.3389/fgene.2014.00363

DO - 10.3389/fgene.2014.00363

M3 - Review article

VL - 5

JO - Frontiers in Genetics

JF - Frontiers in Genetics

SN - 1664-8021

IS - OCT

M1 - 363

ER -