
GAISHI: A Python Package for Detecting Ghost Introgression with Machine Learning

Huang, X.; Hackl, J.; Kuhlwilm, M.

2026-02-03 bioinformatics
10.64898/2026.01.31.703038 bioRxiv

Summary: Ghost introgression is a challenging problem in population genetics. Recent studies have explored supervised learning models, namely logistic regression and UNet++, to detect genomic footprints of ghost introgression. However, their applicability is limited because the existing implementations are tailored to the tasks in their respective publications and are not available as standalone software. Here, we present GAISHI, a Python package for identifying introgressed segments and alleles using machine learning, and demonstrate its usage in a Human-Neanderthal introgression scenario.

Availability and implementation: GAISHI is available on GitHub under the GNU General Public License v3.0. The source code can be found at https://github.com/xin-huang/gaishi.

Matching journals

The top 5 journals account for 50% of the predicted probability mass.

1. Bioinformatics; 1061 papers in training set; Top 2%; 18.3%
2. Frontiers in Genetics; 197 papers in training set; Top 0.3%; 10.3%
3. BMC Bioinformatics; 383 papers in training set; Top 1.0%; 9.9%
4. Bioinformatics Advances; 184 papers in training set; Top 0.2%; 9.9%
5. PLOS Computational Biology; 1633 papers in training set; Top 4%; 8.1%
50% of probability mass above
6. PLOS Genetics; 756 papers in training set; Top 4%; 3.6%
7. BMC Genomics; 328 papers in training set; Top 0.9%; 3.5%
8. G3 Genes|Genomes|Genetics; 351 papers in training set; Top 0.9%; 2.6%
9. Briefings in Bioinformatics; 326 papers in training set; Top 3%; 2.1%
10. Molecular Ecology Resources; 161 papers in training set; Top 0.5%; 2.0%
11. NAR Genomics and Bioinformatics; 214 papers in training set; Top 2%; 1.9%
12. Molecular Biology and Evolution; 488 papers in training set; Top 2%; 1.8%
13. GENETICS; 189 papers in training set; Top 0.7%; 1.7%
14. Scientific Reports; 3102 papers in training set; Top 59%; 1.7%
15. Genetics; 225 papers in training set; Top 2%; 1.7%
16. The American Journal of Human Genetics; 206 papers in training set; Top 2%; 1.6%
17. European Journal of Human Genetics; 49 papers in training set; Top 0.7%; 1.6%
18. Genetics Selection Evolution; 33 papers in training set; Top 0.1%; 1.5%
19. Genome Research; 409 papers in training set; Top 3%; 1.5%
20. Genome Biology; 555 papers in training set; Top 6%; 0.9%
21. Genome Biology and Evolution; 280 papers in training set; Top 2%; 0.8%
22. Nucleic Acids Research; 1128 papers in training set; Top 17%; 0.8%
23. GigaScience; 172 papers in training set; Top 3%; 0.8%
24. PLOS ONE; 4510 papers in training set; Top 68%; 0.7%
25. Nature Communications; 4913 papers in training set; Top 66%; 0.6%