
A novel reusable transcriptome-wide association study workflow used to map key genes linked to important cattle traits

Jayaraman, S.; Chitneedi, P. K.; Kadri, N. K.; Costa-Monteiro-Moreira, G.; Salavati, M.; Charlier, C.; Boichard, D.; Sanchez, M.-P.; Pausch, H.; Kuehn, C.; Prendergast, J. G.; Clark, E. L.

2025-06-12 genomics
10.1101/2025.06.10.658680 bioRxiv

Transcriptome-wide association studies (TWAS) are a powerful approach for studying the genes underlying complex traits by directly integrating GWAS and gene expression datasets. In cattle, they have previously been applied to identify genes driving fertility, milk production, and health. However, these studies have also highlighted several challenges, from difficulties in reproducing these complex analyses to limitations arising from poor genotype calls, especially when variants are called directly from RNA sequencing data. To address these and other challenges, we have developed, for the H2020 BovReg Project, a streamlined, species-agnostic, and reusable Nextflow TWAS workflow to integrate transcriptomic and GWAS summary statistic datasets. Our workflow first generates accurate genotype calls and gene expression prediction models from transcriptomic datasets and then applies these models to impute gene expression levels into GWAS cohorts, enabling the association of genes with traits of interest. We explore optimal strategies for calling genetic variants directly from transcriptomic data and illustrate that using imputation approaches specifically designed for low-pass sequencing data can improve variant calling over previously adopted methods. We demonstrate the utility of our TWAS workflow by applying it to both novel and publicly available GWAS cohorts for cattle, detecting novel gene-trait associations for complex traits. Using a new transcriptome annotation of the cattle genome generated for the BovReg project, we also illustrate how previously un-assayable associations can be detected. The results and the workflow we present provide a new resource for the community and contribute to a better understanding of the molecular drivers of complex traits in cattle, with the goal of eventually leveraging this information in future breeding decisions.
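
The association step described in the abstract (combining expression prediction models with GWAS summary statistics) can be illustrated with a minimal sketch. This is not the authors' Nextflow workflow; it assumes the commonly used summary-statistic form of the TWAS test, in which a gene-level z-score is the weighted sum of SNP-level GWAS z-scores divided by the standard deviation of the predicted expression. The function name `twas_z` and the toy inputs are hypothetical and chosen only for illustration.

```python
import math


def twas_z(weights, gwas_z, ld):
    """Gene-level TWAS z-score from summary statistics (illustrative sketch).

    weights: per-SNP weights from an expression prediction model
    gwas_z:  per-SNP GWAS z-scores for the trait
    ld:      SNP-by-SNP correlation (LD) matrix as a list of lists

    Computes w'z / sqrt(w' R w). All names here are assumptions,
    not part of the workflow described in the abstract.
    """
    n = len(weights)
    if len(gwas_z) != n or len(ld) != n or any(len(row) != n for row in ld):
        raise ValueError("weights, gwas_z and ld must have matching dimensions")

    # Weighted sum of SNP-level association signals.
    numerator = sum(w * z for w, z in zip(weights, gwas_z))

    # Variance of the predicted expression under the LD structure.
    variance = sum(
        weights[i] * ld[i][j] * weights[j]
        for i in range(n)
        for j in range(n)
    )
    if variance <= 0:
        raise ValueError("predicted expression has non-positive variance")

    return numerator / math.sqrt(variance)


if __name__ == "__main__":
    # Toy example: two SNPs in moderate LD, both with positive weights.
    toy_ld = [[1.0, 0.5], [0.5, 1.0]]
    print(twas_z([1.0, 1.0], [3.0, 4.0], toy_ld))
```

In this form, SNPs in strong LD inflate the denominator, so correlated signals are not counted as independent evidence for the gene.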

Matching journals

The top 3 journals account for 50% of the predicted probability mass.

1. BMC Genomics: 328 papers in training set, Top 0.1%, 33.0%
2. Genetics Selection Evolution: 33 papers in training set, Top 0.1%, 14.4%
3. G3 Genes|Genomes|Genetics: 351 papers in training set, Top 0.2%, 8.4%

50% of probability mass above

4. Frontiers in Genetics: 197 papers in training set, Top 1%, 4.9%
5. Gigabyte: 60 papers in training set, Top 0.3%, 3.6%
6. GigaScience: 172 papers in training set, Top 0.5%, 3.6%
7. Scientific Reports: 3102 papers in training set, Top 36%, 3.6%
8. NAR Genomics and Bioinformatics: 214 papers in training set, Top 1%, 2.1%
9. Genetics: 225 papers in training set, Top 2%, 2.1%
10. G3: Genes, Genomes, Genetics: 222 papers in training set, Top 0.3%, 2.1%
11. Genome Research: 409 papers in training set, Top 2%, 1.7%
12. PLOS Computational Biology: 1633 papers in training set, Top 16%, 1.7%
13. PLOS ONE: 4510 papers in training set, Top 57%, 1.5%
14. Genome Biology: 555 papers in training set, Top 5%, 1.3%
15. Communications Biology: 886 papers in training set, Top 14%, 1.2%
16. Molecular Ecology Resources: 161 papers in training set, Top 0.8%, 1.0%
17. Bioinformatics Advances: 184 papers in training set, Top 4%, 1.0%
18. PLOS Genetics: 756 papers in training set, Top 14%, 0.8%
19. Methods in Ecology and Evolution: 160 papers in training set, Top 2%, 0.8%
20. Bioinformatics: 1061 papers in training set, Top 9%, 0.7%
21. BMC Bioinformatics: 383 papers in training set, Top 7%, 0.7%
22. Developmental Dynamics: 50 papers in training set, Top 0.9%, 0.6%