
SFARI Genes and where to find them; classification modelling to identify genes associated with Autism Spectrum Disorder from RNA-seq data

Navarro, M.; Simpson, I.

2021-02-01 genomics
10.1101/2021.01.29.428754 bioRxiv

Motivation: Autism spectrum disorder (ASD) has a strong, yet heterogeneous, genetic component. Among the various methods that are being developed to help reveal the underlying molecular aetiology of the disease, one that is gaining popularity is the combination of gene expression and clinical genetic data. For ASD, the SFARI-gene database comprises lists of curated genes in which presumed causative mutations have been identified in patients. In order to predict novel candidate SFARI-genes, we built classification models combining differential gene expression data for ASD patients and unaffected individuals with a gene's status in the SFARI-gene list.

Results: SFARI-genes were not found to be significantly associated with differential gene expression patterns, nor were they enriched in gene co-expression network modules that had a strong correlation with ASD diagnosis. However, network analysis and machine learning models that incorporate information from the whole gene co-expression network were able to predict novel candidate genes that share features of existing SFARI genes and have support for roles in ASD in the literature. We found a statistically significant bias related to the absolute level of gene expression for existing SFARI genes and their scores. It is essential that this bias be taken into account when studies interpret ASD gene expression data at gene, module and whole-network levels.

Availability: Source code is available from GitHub (https://doi.org/10.5281/zenodo.4463693) and the accompanying data from The University of Edinburgh DataStore (https://doi.org/10.7488/ds/2980).

Contact: ian.simpson@ed.ac.uk

Matching journals

The top 7 journals account for 50% of the predicted probability mass.

1 | Journal of Neurodevelopmental Disorders | 15 papers in training set | Top 0.1% | 17.4%
2 | Frontiers in Psychiatry | 83 papers in training set | Top 0.3% | 10.0%
3 | Journal of Medical Genetics | 28 papers in training set | Top 0.1% | 6.8%
4 | Journal of Autism and Developmental Disorders | 12 papers in training set | Top 0.1% | 4.3%
5 | npj Genomic Medicine | 33 papers in training set | Top 0.1% | 4.3%
6 | PLOS ONE | 4510 papers in training set | Top 36% | 3.9%
7 | Bioinformatics | 1061 papers in training set | Top 5% | 3.6%
50% of probability mass above
8 | Molecular Autism | 29 papers in training set | Top 0.2% | 3.6%
9 | BMC Bioinformatics | 383 papers in training set | Top 3% | 3.6%
10 | Scientific Reports | 3102 papers in training set | Top 42% | 3.1%
11 | Genetics in Medicine | 69 papers in training set | Top 0.5% | 2.4%
12 | Genetic Epidemiology | 46 papers in training set | Top 0.3% | 2.3%
13 | Autism Research | 32 papers in training set | Top 0.3% | 1.9%
14 | Genes | 126 papers in training set | Top 1% | 1.7%
15 | F1000Research | 79 papers in training set | Top 2% | 1.7%
16 | Brain Communications | 147 papers in training set | Top 2% | 1.5%
17 | Translational Psychiatry | 219 papers in training set | Top 3% | 1.5%
18 | Frontiers in Genetics | 197 papers in training set | Top 6% | 1.5%
19 | American Journal of Medical Genetics Part B: Neuropsychiatric Genetics | 22 papers in training set | Top 0.2% | 1.3%
20 | PeerJ | 261 papers in training set | Top 10% | 1.2%
21 | International Journal of Molecular Sciences | 453 papers in training set | Top 12% | 0.9%
22 | PLOS Computational Biology | 1633 papers in training set | Top 21% | 0.9%
23 | Developmental Science | 15 papers in training set | Top 0.1% | 0.9%
24 | BioData Mining | 15 papers in training set | Top 0.9% | 0.7%
25 | JAMA Pediatrics | 10 papers in training set | Top 0.2% | 0.7%
26 | Communications Biology | 886 papers in training set | Top 24% | 0.7%
27 | Journal of Child Psychology and Psychiatry | 25 papers in training set | Top 0.4% | 0.7%
28 | International Journal of Epidemiology | 74 papers in training set | Top 3% | 0.7%
29 | European Journal of Human Genetics | 49 papers in training set | Top 2% | 0.6%
30 | European Psychiatry | 10 papers in training set | Top 0.8% | 0.6%
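
The statement that the top 7 journals account for 50% of the predicted probability mass can be checked with a running total over the listed percentages. The sketch below is only an illustration: the function name `journals_to_reach` is hypothetical, the values are the rounded percentages of the first ten journals copied from the list above, and nothing here reflects how the site itself computes the cutoff.

```python
from itertools import accumulate

# Rounded predicted probabilities (%) of the ten highest-ranked journals,
# copied from the list above. Rounding means totals are approximate.
probabilities = [17.4, 10.0, 6.8, 4.3, 4.3, 3.9, 3.6, 3.6, 3.6, 3.1]


def journals_to_reach(probs, threshold):
    """Return how many top-ranked entries are needed before the running
    total reaches the threshold, or None if it is never reached.

    Hypothetical helper for illustration only.
    """
    for rank, total in enumerate(accumulate(probs), start=1):
        if total >= threshold:
            return rank
    return None


cutoff_rank = journals_to_reach(probabilities, 50.0)
print(cutoff_rank)
```

With these rounded values the running total is about 46.7% after six journals and about 50.3% after seven, which agrees with the position of the "50% of probability mass above" marker.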