PAH-former: Transfer Learning for Efficient Discovery of Pulmonary Arterial Hypertension-Associated Genes
Kawakami, T.; Hosokawa, S.; Masamichi, I.; Kurozumi, A.; Tanaka, R.; Minatsuki, S.; Ishida, J.; Isagawa, T.; Kodera, S.; Takeda, N.
Single-cell RNA sequencing (scRNA-seq) of patient samples holds promise for understanding disease mechanisms, but sample acquisition, processing, and data analysis are costly and labor-intensive, making it essential to leverage existing data. Pulmonary arterial hypertension (PAH) is a refractory disease characterized by pulmonary vascular remodeling, and access to patient specimens is limited because tissue is difficult to collect. In this study, we applied transfer learning to Geneformer, a deep learning model pre-trained on scRNA-seq datasets, fine-tuning it on public PAH lung tissue data to identify disease-relevant genes. The resulting model, which we named PAH-former, showed prediction accuracy that varied significantly depending on the dataset used for fine-tuning. PAH-former enabled in silico perturbation analysis, which identified PAH-related genes. Loss of function of PAH-related genes in human pulmonary artery endothelial cells increased the expression of SOX18, a signature gene of PAH. This integration of artificial intelligence and biological experiments can significantly advance our understanding of the molecular mechanisms of PAH.
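The in silico perturbation step mentioned above can be illustrated with a minimal toy sketch. It assumes the general scheme used by rank-based single-cell models: a cell is encoded as a list of genes ordered by expression, a gene is "deleted" by removing it from that list, and the effect is scored as the cosine shift between the original and perturbed cell embeddings. Everything here is hypothetical and for illustration only: the gene names, the `toy_embed` stand-in (a position-weighted mean of random vectors, not a trained transformer), and the function names are not taken from the paper or from the Geneformer library.

```python
import math
import random


def rank_encode(expression):
    """Order detected genes by descending expression (ties broken by name)."""
    detected = [(gene, value) for gene, value in expression.items() if value > 0]
    return [gene for gene, _ in sorted(detected, key=lambda gv: (-gv[1], gv[0]))]


def toy_embed(tokens, table, dim=8):
    """Hypothetical stand-in for a model: position-weighted mean of gene vectors."""
    vec = [0.0] * dim
    total = 0.0
    for pos, gene in enumerate(tokens):
        weight = 1.0 / (1 + pos)  # higher-ranked genes contribute more
        total += weight
        for i in range(dim):
            vec[i] += weight * table[gene][i]
    return [x / total for x in vec] if total else vec


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def deletion_shift(expression, gene, table):
    """Score an in silico deletion as 1 - cosine(original, perturbed) embedding."""
    tokens = rank_encode(expression)
    perturbed = [g for g in tokens if g != gene]  # remove the gene from the encoding
    return 1.0 - cosine(toy_embed(tokens, table), toy_embed(perturbed, table))


# Hypothetical toy data: four placeholder genes, one of them undetected.
rng = random.Random(0)
genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
table = {g: [rng.gauss(0, 1) for _ in range(8)] for g in genes}
cell = {"GENE_A": 5.0, "GENE_B": 3.0, "GENE_C": 1.0, "GENE_D": 0.0}

# Rank candidate genes by how far deleting each one moves the cell embedding.
shifts = {g: deletion_shift(cell, g, table) for g in rank_encode(cell)}
ranked = sorted(shifts, key=shifts.get, reverse=True)
```

In a real analysis the embedding would come from the fine-tuned model, and the shift would typically be assessed across many cells, and relative to a disease or healthy reference state, rather than for a single toy cell.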
Matching journals
The top 6 journals account for 50% of the predicted probability mass.