
Deep Learning Architectures For the Prediction of YY1-Mediated Chromatin Loops

Abbasi, A. F.

2022-09-19 bioinformatics
10.1101/2022.09.19.508478 bioRxiv

YY1-mediated chromatin loops play substantial roles in basic biological processes such as gene regulation, cell differentiation, and DNA replication. Predicting YY1-mediated chromatin loops is important for understanding diverse biological processes, which may lead to the development of new therapeutics for neurological disorders and cancers. Existing deep learning predictors can predict YY1-mediated chromatin loops in two different cell lines; however, they show limited performance when predicting YY1-mediated loops within the same cell line and suffer significant performance deterioration in the cross-cell-line setting. To provide computational predictors capable of large-scale analyses of YY1-mediated loop prediction across multiple cell lines, this paper presents two novel deep learning predictors. The two proposed predictors use Word2vec and one-hot encoding for sequence representation, combined with long short-term memory and a convolutional neural network, along with a gradient flow strategy similar to that of DenseNet architectures. Both predictors are evaluated on two benchmark datasets from two cell lines, HCT116 and K562. Overall, the proposed predictors outperform the existing DEEPYY1 predictor by an average maximum margin of 4.65% in AUROC and 7.45% in accuracy across both datasets on the independent test sets, and of 5.1% and 3.2% under 5-fold validation. In cross-cell evaluation, the proposed predictors achieve maximum performance improvements of up to 9.5% and 27.1% in AUROC on the HCT116 and K562 datasets.
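
The two sequence representations named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the four-letter alphabet, the all-zero row for unknown bases, the k-mer length, and the function names `one_hot_encode` and `kmer_tokens` are all assumptions made here for illustration. The k-mer tokens are the kind of "words" a Word2vec model could be trained on, but the abstract does not say how the paper tokenizes sequences.

```python
# Illustrative sketch only; alphabet, k, and function names are assumptions.
ALPHABET = "ACGT"


def one_hot_encode(seq):
    """Encode a DNA string as a list of 4-element rows, one row per base.

    Bases outside ALPHABET (for example N) become all-zero rows.
    """
    index = {base: i for i, base in enumerate(ALPHABET)}
    encoded = []
    for base in seq.upper():
        row = [0] * len(ALPHABET)
        col = index.get(base)
        if col is not None:
            row[col] = 1
        encoded.append(row)
    return encoded


def kmer_tokens(seq, k=3, stride=1):
    """Split a sequence into k-mers, stepping `stride` bases at a time."""
    seq = seq.upper()
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, stride)]


if __name__ == "__main__":
    print(one_hot_encode("ACGTN"))
    print(kmer_tokens("ACGTA", k=3))
```

A one-hot matrix of this kind is a typical input to a convolutional network, while token lists like those from `kmer_tokens` would be passed to an embedding step before a recurrent layer. Which predictor uses which representation is not stated in the abstract.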

Matching journals

The top 7 journals account for 50% of the predicted probability mass.
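
The 50% cutoff can be checked from the listed percentages. The sketch below assumes the cutoff is the first rank at which the cumulative sum of predicted probabilities reaches the threshold. The page does not state how it is computed, and the listed values are rounded, so this is an approximation. The function name `journals_to_reach` is hypothetical.

```python
def journals_to_reach(probabilities, threshold=50.0):
    """Return (count, cumulative_total) for the first rank at which the
    running sum of probabilities (in percent) reaches the threshold,
    or None if the threshold is never reached."""
    total = 0.0
    for count, p in enumerate(probabilities, start=1):
        total += p
        if total >= threshold:
            return count, total
    return None


if __name__ == "__main__":
    # Predicted probabilities (percent) of the eight top-ranked journals, as listed.
    top_ranked = [10.3, 8.6, 6.9, 6.5, 6.5, 6.5, 4.9, 4.0]
    print(journals_to_reach(top_ranked))
```

With the rounded values shown on the page, the running sum is about 45.3% after six journals and about 50.2% after seven, which agrees with the statement above.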

1. Briefings in Bioinformatics | 326 papers in training set | Top 0.3% | 10.3%
2. IEEE/ACM Transactions on Computational Biology and Bioinformatics | 32 papers in training set | Top 0.1% | 8.6%
3. BMC Bioinformatics | 383 papers in training set | Top 1% | 6.9%
4. Bioinformatics | 1061 papers in training set | Top 4% | 6.5%
5. Computational and Structural Biotechnology Journal | 216 papers in training set | Top 0.5% | 6.5%
6. Frontiers in Genetics | 197 papers in training set | Top 0.7% | 6.5%
7. Scientific Reports | 3102 papers in training set | Top 22% | 4.9%

50% of probability mass above

8. IEEE Journal of Biomedical and Health Informatics | 34 papers in training set | Top 0.4% | 4.0%
9. IEEE Transactions on Computational Biology and Bioinformatics | 17 papers in training set | Top 0.1% | 2.6%
10. Advanced Science | 249 papers in training set | Top 7% | 2.6%
11. Communications Biology | 886 papers in training set | Top 4% | 2.4%
12. Computers in Biology and Medicine | 120 papers in training set | Top 1% | 2.4%
13. PLOS ONE | 4510 papers in training set | Top 53% | 1.7%
14. Computational Biology and Chemistry | 23 papers in training set | Top 0.1% | 1.7%
15. Journal of Chemical Information and Modeling | 207 papers in training set | Top 2% | 1.4%
16. Frontiers in Bioinformatics | 45 papers in training set | Top 0.4% | 1.2%
17. Bioinformatics Advances | 184 papers in training set | Top 4% | 1.0%
18. Biology Methods and Protocols | 53 papers in training set | Top 2% | 0.9%
19. BioData Mining | 15 papers in training set | Top 0.6% | 0.9%
20. Journal of Computational Biology | 37 papers in training set | Top 0.4% | 0.9%
21. PLOS Computational Biology | 1633 papers in training set | Top 23% | 0.8%
22. Neurocomputing | 13 papers in training set | Top 0.6% | 0.8%
23. Frontiers in Immunology | 586 papers in training set | Top 7% | 0.8%
24. Quantitative Biology | 11 papers in training set | Top 0.7% | 0.8%
25. Genes | 126 papers in training set | Top 3% | 0.8%
26. Nature Machine Intelligence | 61 papers in training set | Top 3% | 0.8%
27. Nucleic Acids Research | 1128 papers in training set | Top 17% | 0.8%
28. iScience | 1063 papers in training set | Top 31% | 0.8%
29. NAR Genomics and Bioinformatics | 214 papers in training set | Top 4% | 0.8%
30. Methods | 29 papers in training set | Top 0.7% | 0.7%