PRESGENE: A web server for PRediction of ESsential GENE using integrative machine learning strategies

Nandi, S.; Panditrao, G.; Ganguli, P.; Sarkar, R. R.

2022-11-25 bioinformatics
10.1101/2022.11.25.517801 bioRxiv
The study of essential genes in disease-causing organisms has wide application in predicting therapeutic targets and exploring different clinical strategies. Predicting gene essentiality for large sets of genes in non-model, less explored organisms is challenging. Computational methods based on machine learning (ML) are widely adopted for essential gene prediction because they offer the key advantage of considering diverse biological features. Previous work from our group demonstrated two ML-based pipelines that predict essential genes with high accuracy and mitigate the problems of imbalanced labeled datasets and limited labeled data for essential genes. Here we present PRESGENE at https://presgene.ncl.res.in, an ML-based web server for predicting essential genes in unexplored eukaryotic and prokaryotic organisms. Our algorithms mitigate the problems of training dataset imbalance and limited availability of experimentally labeled data for essential genes. With its user-friendly web interface and high accuracy, PRESGENE is intended to give biologists a seamless way to obtain accurate essential gene predictions for novel organisms with limited labeled data.

Matching journals

The top 8 journals account for 50% of the predicted probability mass.

1. PLOS Computational Biology (1633 papers in training set): Top 2%, 12.3%
2. BMC Bioinformatics (383 papers in training set): Top 0.9%, 10.0%
3. Briefings in Bioinformatics (326 papers in training set): Top 0.7%, 6.8%
4. IEEE/ACM Transactions on Computational Biology and Bioinformatics (32 papers in training set): Top 0.1%, 6.8%
5. PLOS ONE (4510 papers in training set): Top 28%, 6.3%
6. Computers in Biology and Medicine (120 papers in training set): Top 0.7%, 3.9%
7. Computational Biology and Chemistry (23 papers in training set): Top 0.1%, 2.9%
8. Scientific Reports (3102 papers in training set): Top 46%, 2.6%

50% of probability mass above

9. Bioinformatics (1061 papers in training set): Top 6%, 2.4%
10. Frontiers in Genetics (197 papers in training set): Top 3%, 2.3%
11. BioData Mining (15 papers in training set): Top 0.2%, 1.9%
12. PeerJ (261 papers in training set): Top 7%, 1.7%
13. F1000Research (79 papers in training set): Top 2%, 1.7%
14. Journal of Bioinformatics and Systems Biology (14 papers in training set): Top 0.2%, 1.7%
15. Frontiers in Bioinformatics (45 papers in training set): Top 0.2%, 1.7%
16. Frontiers in Molecular Biosciences (100 papers in training set): Top 2%, 1.5%
17. Computational and Structural Biotechnology Journal (216 papers in training set): Top 5%, 1.5%
18. Journal of Computational Biology (37 papers in training set): Top 0.3%, 1.3%
19. Genomics, Proteomics & Bioinformatics (171 papers in training set): Top 4%, 1.3%
20. BMC Medical Genomics (36 papers in training set): Top 0.7%, 1.2%
21. Informatics in Medicine Unlocked (21 papers in training set): Top 0.9%, 0.9%
22. iScience (1063 papers in training set): Top 30%, 0.8%
23. Genomics (60 papers in training set): Top 2%, 0.8%
24. Quantitative Biology (11 papers in training set): Top 0.7%, 0.7%
25. BMC Genomics (328 papers in training set): Top 6%, 0.7%
26. Genes (126 papers in training set): Top 3%, 0.7%
27. Gigabyte (60 papers in training set): Top 2%, 0.7%
28. Vaccines (196 papers in training set): Top 3%, 0.7%
29. Journal of Chemical Information and Modeling (207 papers in training set): Top 3%, 0.7%
30. Nucleic Acids Research (1128 papers in training set): Top 20%, 0.6%
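
The statement that the top 8 journals account for 50% of the predicted probability mass can be checked with a running sum over the probabilities listed above. The following is a minimal sketch, not the site's own code: the function name `journals_to_reach` and the variable `probabilities` are illustrative, and the values are the rounded percentages as displayed, so the sums are approximate.

```python
from itertools import accumulate

# Predicted probabilities (%) for the ten top-ranked journals,
# copied from the list above (rounded as displayed on the page).
probabilities = [12.3, 10.0, 6.8, 6.8, 6.3, 3.9, 2.9, 2.6, 2.4, 2.3]


def journals_to_reach(threshold, probs):
    """Return the smallest number of top-ranked entries whose
    cumulative probability reaches `threshold`, or None if the
    listed entries never reach it."""
    for count, total in enumerate(accumulate(probs), start=1):
        if total >= threshold:
            return count
    return None


count = journals_to_reach(50.0, probabilities)
print(count)
```

With these values the first seven journals sum to about 49.0% and the first eight to about 51.6%, so eight is the smallest count that crosses the 50% mark.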