TY - JOUR
T1 - Data Integration Methods for Phenotype Harmonization in Multi-Cohort Genome-Wide Association Studies With Behavioral Outcomes
AU - Luningham, Justin M.
AU - McArtor, Daniel B.
AU - Hendriks, Anne M.
AU - van Beijsterveldt, Catharina E.M.
AU - Lichtenstein, Paul
AU - Lundström, Sebastian
AU - Larsson, Henrik
AU - Bartels, Meike
AU - Boomsma, Dorret I.
AU - Lubke, Gitta H.
N1 - Funding Information:
This work was supported by FP7-602768 “ACTION: Aggression in Children: Unraveling gene-environment interplay to inform Treatment and InterventiON strategies” from the European Commission/European Union Seventh Framework Program. GL was in addition supported by DA-018673 awarded by the National Institutes of Health. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Publisher Copyright:
Copyright © 2019 Luningham, McArtor, Hendriks, van Beijsterveldt, Lichtenstein, Lundström, Larsson, Bartels, Boomsma and Lubke.
PY - 2019/12/10
Y1 - 2019/12/10
N2 - Parallel meta-analysis is a popular approach for increasing the power to detect genetic effects in genome-wide association studies across multiple cohorts. Consortia studying the genetics of behavioral phenotypes are often faced with systematic differences in phenotype measurement across cohorts, introducing heterogeneity into the meta-analysis and reducing statistical power. This study investigated integrative data analysis (IDA) as an approach for jointly modeling the phenotype across multiple datasets. We put forth a bi-factor integration model (BFIM) that provides a single common phenotype score and accounts for sources of study-specific variability in the phenotype. To capitalize on this modeling strategy, a phenotype reference panel was used as a supplemental sample with complete data on all behavioral measures. A simulation study showed that a mega-analysis of genetic variant effects in a BFIM was more powerful than meta-analysis of genetic effects on a cohort-specific sum score of items. Saving the factor scores from the BFIM and using those as the outcome in meta-analysis was also more powerful than the sum score in most simulation conditions, but a small degree of bias was introduced by this approach. The reference panel was necessary to realize these power gains. An empirical demonstration used the BFIM to harmonize aggression scores in 9-year-old children across the Netherlands Twin Register and the Child and Adolescent Twin Study in Sweden, providing a template for application of the BFIM to a range of different phenotypes. A supplemental data collection in the Netherlands Twin Register served as a reference panel for phenotype modeling across both cohorts. Our results indicate that model-based harmonization for the study of complex traits is a useful step within genetic consortia.
AB - Parallel meta-analysis is a popular approach for increasing the power to detect genetic effects in genome-wide association studies across multiple cohorts. Consortia studying the genetics of behavioral phenotypes are often faced with systematic differences in phenotype measurement across cohorts, introducing heterogeneity into the meta-analysis and reducing statistical power. This study investigated integrative data analysis (IDA) as an approach for jointly modeling the phenotype across multiple datasets. We put forth a bi-factor integration model (BFIM) that provides a single common phenotype score and accounts for sources of study-specific variability in the phenotype. To capitalize on this modeling strategy, a phenotype reference panel was used as a supplemental sample with complete data on all behavioral measures. A simulation study showed that a mega-analysis of genetic variant effects in a BFIM was more powerful than meta-analysis of genetic effects on a cohort-specific sum score of items. Saving the factor scores from the BFIM and using those as the outcome in meta-analysis was also more powerful than the sum score in most simulation conditions, but a small degree of bias was introduced by this approach. The reference panel was necessary to realize these power gains. An empirical demonstration used the BFIM to harmonize aggression scores in 9-year-old children across the Netherlands Twin Register and the Child and Adolescent Twin Study in Sweden, providing a template for application of the BFIM to a range of different phenotypes. A supplemental data collection in the Netherlands Twin Register served as a reference panel for phenotype modeling across both cohorts. Our results indicate that model-based harmonization for the study of complex traits is a useful step within genetic consortia.
KW - consortia
KW - data integration
KW - genome-wide association studies
KW - latent variable modeling
KW - phenotype harmonization
UR - http://www.scopus.com/inward/record.url?scp=85077331451&partnerID=8YFLogxK
U2 - 10.3389/fgene.2019.01227
DO - 10.3389/fgene.2019.01227
M3 - Article
AN - SCOPUS:85077331451
SN - 1664-8021
VL - 10
JO - Frontiers in Genetics
JF - Frontiers in Genetics
M1 - 1227
ER -