Genome-wide association study to relate genotype (DNA SNP) to phenotype (using predictor phenotypes in milk) for a large population of dairy cows across management systems
Leaders: Nicolas Gengler (University of Liège) and Mark Crowe (University College Dublin)
Task 4.1. Defining the 10,000 cow population
Using criteria such as the availability of mid-infrared (MIR) data and genetic links across herds, consortium members will select herds that keep good records, participate in milk recording schemes and are willing to record additional data, which will be added to the GplusE database. The aim is to recruit ~10,000 cows for the GWAS.
Task 4.2. Improvement and validation of MIR prediction equations
Additional reference phenotypes will be collected in the 10,000-cow population for animals having extreme phenotypes and lying outside the usual spectral ranges. Additional MIR profiles from milk samples will be added as reference profiles to improve the prediction equations. Simultaneously, an external validation process will check the quality of the predictions.
Task 4.3. Innovative phenotypic tools to predict production efficiency, health, metabolic status, fertility, environmental footprint and animal welfare state from milk biomarkers, MIR spectra and glycan profiles
Based on previous research done in the ROBUSTMILK and OptiMIR projects and in this project (WP3), innovative biomarkers (metabolites, MIR or glycan profiles) will be used to predict production efficiency, health, physiological status, fertility, environmental footprint and animal welfare traits on a large scale. These tools will be used in the 10,000-cow population (see Task 4.4) and can later be disseminated. They will form the basis for the development of genomic tools.
Task 4.4. Acquiring phenotypes for the 10,000 cow population to be fed into the databases
MIR spectral data and glycan profiles will be obtained through collaboration with milk recording organisations (MROs). The innovative tools developed (Task 4.2) will be used, and all data will be collated in the consolidated database from WP2. Phenotypes predicted from MIR and/or glycan profiles will be computed. Pedigree information for all cows will be stored in the GplusE database.
Task 4.5. Acquiring genotypes for the 10,000 cow population
Some 15% of the existing consortium cows have already been SNP genotyped. The remaining 85% will be genotyped using the Illumina® BovineHD 777K Genotyping BeadChip. The genotype data will be added to the IGenoP database as described in WP2.
Task 4.6. Establishment of a repository of RNA for subsequent microRNA analysis
Additional blood samples will be collected from a sub-population of 500 phenotypically characterised cows. Coding and non-coding RNAs will be purified from these samples and stored as a bank, allowing a future project to analyse them for unique microRNAs associated with physiological status, health, fertility and welfare status.
Task 4.7. Modelling predicted environmental impact, welfare and product quality, traditional and other biomarker traits and their interactions
The experimental design will generate repeated records for multiple correlated traits. This specific data structure will require the development of advanced, adapted methods to model these data. Genetic parameters will be estimated using these adapted animal models. The estimated phenotypic and genetic correlations among predicted environmental impact, welfare and product quality, traditional and other biomarker traits will allow their complex relationships to be assessed. These results will be needed in WP7 to develop breeding strategies. The models developed will serve as the basis for the development of single-step genomic evaluation methods in Task 4.8.
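For orientation, a repeated-records animal model of the kind referred to above can be written as follows. This is a generic single-trait sketch and not the model the consortium will necessarily adopt; all symbols are assumptions introduced here for illustration: y is the vector of records, b the fixed effects, a the additive genetic effects, p the permanent environmental effects that account for repeated records on the same cow, e the residuals, X, Z and W the corresponding incidence matrices, and A the pedigree-based relationship matrix.

```latex
\begin{align}
\mathbf{y} &= \mathbf{X}\mathbf{b} + \mathbf{Z}\mathbf{a} + \mathbf{W}\mathbf{p} + \mathbf{e}, \\
\mathbf{a} &\sim N(\mathbf{0}, \mathbf{A}\sigma_a^2), \quad
\mathbf{p} \sim N(\mathbf{0}, \mathbf{I}\sigma_{pe}^2), \quad
\mathbf{e} \sim N(\mathbf{0}, \mathbf{I}\sigma_e^2), \\
h^2 &= \frac{\sigma_a^2}{\sigma_a^2 + \sigma_{pe}^2 + \sigma_e^2}, \qquad
r_g = \frac{\sigma_{a_1 a_2}}{\sqrt{\sigma_{a_1}^2 \, \sigma_{a_2}^2}}
\end{align}
```

Here h² is the heritability of a trait and r_g the genetic correlation between traits 1 and 2, with the additive genetic covariance in the numerator taken from a multi-trait extension of the same model. In a single-step genomic evaluation, A would typically be replaced by a matrix that combines pedigree and genomic relationships.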
Task 4.8. Genomic data preparation
The SNPs will be screened for call rate, minor allele frequency, deviation from Hardy-Weinberg equilibrium and Mendelian transmission errors. Only data meeting the required quality thresholds will be selected for analysis.
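As an illustration of this screening, the following minimal Python sketch applies three of the four filters (call rate, minor allele frequency and a chi-square test for Hardy-Weinberg equilibrium) to a single SNP. The function names, the genotype coding (0, 1 or 2 copies of one allele, None for a missing call) and the thresholds are assumptions made for the example, not values fixed by the project. Mendelian transmission checks need pedigree data and are left out.

```python
import math

def call_rate(genotypes):
    """Fraction of animals with a non-missing genotype call."""
    if not genotypes:
        return 0.0
    called = [g for g in genotypes if g is not None]
    return len(called) / len(genotypes)

def minor_allele_frequency(genotypes):
    """Frequency of the rarer allele among called genotypes."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    p = sum(called) / (2 * len(called))
    return min(p, 1 - p)

def hwe_p_value(genotypes):
    """P-value of a 1-df chi-square test for Hardy-Weinberg equilibrium."""
    called = [g for g in genotypes if g is not None]
    n = len(called)
    p = sum(called) / (2 * n)
    q = 1 - p
    expected = [n * q * q, 2 * n * p * q, n * p * p]
    observed = [called.count(0), called.count(1), called.count(2)]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square distribution with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))

def passes_qc(genotypes, min_call_rate=0.95, min_maf=0.01, min_hwe_p=1e-6):
    """Apply the three filters in turn; thresholds are illustrative only."""
    if call_rate(genotypes) < min_call_rate:
        return False
    if minor_allele_frequency(genotypes) < min_maf:
        return False
    return hwe_p_value(genotypes) >= min_hwe_p
```

A SNP with genotype counts of 25, 50 and 25 passes all three filters, whereas a SNP called as heterozygous in every animal fails the Hardy-Weinberg test, a pattern that often indicates a genotyping artefact.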
Task 4.9. Initial association analysis
Genotypic frequencies for all polymorphisms will be examined to determine the level of linkage disequilibrium among genotype combinations. We will also perform an SNP association analysis with key production efficiency, product quality, health, metabolic status, fertility, environmental footprint and animal welfare traits via the predictor phenotypes identified in WP2. Parentage information will be used to account for genetic relationships that may exist within families. Significant SNPs will be used to indicate genomic regions of interest for each trait. These significant SNPs will be validated within subgroups of the 10,000 cows studied (based on Task 4.6) by including the genomic information in the relationship matrix when solving the model.
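To make the linkage disequilibrium step concrete, the sketch below computes r² between two SNPs as the squared Pearson correlation of their genotype counts. This is a common approximation when haplotype phase is unknown and is not necessarily the estimator the project will use. The function name and the genotype coding (0, 1 or 2 copies of one allele, None for missing) are assumptions for the example.

```python
def genotype_r2(snp_a, snp_b):
    """Squared correlation between genotype counts at two SNPs.

    Animals with a missing call (None) at either SNP are skipped.
    Returns NaN when either SNP shows no variation among the animals used.
    """
    pairs = [(a, b) for a, b in zip(snp_a, snp_b)
             if a is not None and b is not None]
    n = len(pairs)
    if n == 0:
        return float("nan")
    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    # Sums of squares and cross-products around the means.
    ss_a = sum((a - mean_a) ** 2 for a, _ in pairs)
    ss_b = sum((b - mean_b) ** 2 for _, b in pairs)
    s_ab = sum((a - mean_a) * (b - mean_b) for a, b in pairs)
    if ss_a == 0 or ss_b == 0:
        return float("nan")
    return s_ab ** 2 / (ss_a * ss_b)
```

Two SNPs whose genotypes always agree give r² = 1, while two SNPs whose genotypes vary independently give a value close to 0.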
Task 4.10. Comparative genomics
A high-quality bovine genome sequence assembly (UMD 3.1) and associated transcript annotation is available from Ensembl (19,981 protein-coding genes annotated in release 64). This resource will enable candidate SNPs to be annotated by reference to the nearest or overlapping genes and classified as intergenic or as overlapping features such as 3’ or 5’ UTRs, splice sites, non-coding RNAs, exons or introns. SNPs in unannotated regions will be compared with homologous regions in other mammals (human, mouse etc.) to determine whether there are unannotated genes or conserved regulatory elements in the vicinity of the SNP. A list of candidate genes for each trait will then be compiled for input into WP7. Genomic and phenomic data will be used to annotate and illuminate human and other genomes in WP8.
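The nearest/overlapping-gene lookup described above can be sketched in a few lines of Python. The `Gene` record, the `annotate_snp` function and the gene names and coordinates in the usage note are hypothetical and serve only to show the logic. A real pipeline would read gene models from the Ensembl annotation and would also distinguish UTRs, splice sites, exons and introns.

```python
from typing import NamedTuple, Optional, Sequence, Tuple

class Gene(NamedTuple):
    name: str
    chrom: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive

def annotate_snp(chrom: str, pos: int, genes: Sequence[Gene]
                 ) -> Tuple[str, Optional[str], Optional[int]]:
    """Return (class, gene name, distance in bp) for one SNP.

    The class is "overlapping" when the SNP lies inside a gene and
    "intergenic" otherwise, in which case the nearest gene on the same
    chromosome is reported, or None if that chromosome has no genes.
    """
    same_chrom = [g for g in genes if g.chrom == chrom]
    for gene in same_chrom:
        if gene.start <= pos <= gene.end:
            return ("overlapping", gene.name, 0)
    if not same_chrom:
        return ("intergenic", None, None)

    def distance(gene: Gene) -> int:
        # Gap between the SNP and the closer gene boundary.
        return gene.start - pos if pos < gene.start else pos - gene.end

    nearest = min(same_chrom, key=distance)
    return ("intergenic", nearest.name, distance(nearest))
```

With two made-up genes, `Gene("GENE_A", "1", 100, 200)` and `Gene("GENE_B", "1", 500, 600)`, a SNP at position 150 on chromosome 1 is reported as overlapping GENE_A, and a SNP at position 450 as intergenic with GENE_B as the nearest gene, 50 bp away.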