
Kernel Matrix Completion with Topological and Spectral Features for Multi-Modal Classification

Rinon, E. M.; Visaya, M. V.; Sambayan, R.

2026-04-22 bioinformatics
10.64898/2026.04.19.713528 bioRxiv

Kernel methods offer a robust framework for integrating multi-modal datasets into a unified representation, which supports more comprehensive data interpretation. When datasets are incomplete, multiple kernel learning is employed to make data completion and integration more efficient. We investigate kernel-based approaches to the incomplete-data problem, with applications to yeast protein data. Biological data such as yeast proteins can be represented through multiple modalities, including gene expression profiles, amino acid sequences, three-dimensional structures, and protein interaction networks. We introduce a computational pipeline based on kernel matrix completion, in which topological data analysis (TDA) and persistent spectral analysis are incorporated into the classification setting. TDA captures geometric structure across scales, while spectral descriptors reflect connectivity patterns through Laplacian eigenvalues. Kernel, topological, and spectral descriptors are used with support vector machines to discriminate between membrane and non-membrane yeast proteins. Empirical results show that the combined pipeline improves both kernel completion accuracy and ROC performance relative to baseline kernel-only approaches. The best-performing configuration achieves an ROC score of 0.8632 using the average of three kernels augmented with TDA features. These results demonstrate competitive performance relative to strong kernel-based baselines under incomplete-data conditions. The proposed pipeline provides a unified framework for learning from incomplete heterogeneous data while enriching kernel representations with geometric and spectral information.
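Two ingredients named in the abstract, averaging several kernels and describing connectivity through Laplacian eigenvalues, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `average_kernels` and `laplacian_spectrum`, the toy linear kernels, and the three-node path graph are all assumptions made for illustration. The sketch does not cover the kernel matrix completion step, and persistent spectral analysis would track such eigenvalues across a filtration rather than for a single graph.

```python
import numpy as np


def average_kernels(kernels):
    """Element-wise mean of same-shaped kernel (Gram) matrices (hypothetical helper)."""
    stacked = np.stack([np.asarray(k, dtype=float) for k in kernels])
    return stacked.mean(axis=0)


def laplacian_spectrum(adjacency):
    """Ascending eigenvalues of the combinatorial graph Laplacian L = D - A (hypothetical helper)."""
    a = np.asarray(adjacency, dtype=float)
    degree = np.diag(a.sum(axis=1))
    return np.linalg.eigvalsh(degree - a)


# Toy stand-ins for three data modalities: 5 samples, 3 features each (assumed data).
rng = np.random.default_rng(0)
views = [rng.normal(size=(5, 3)) for _ in range(3)]

# Linear kernels X X^T are symmetric positive semidefinite, and so is their average.
kernels = [x @ x.T for x in views]
k_avg = average_kernels(kernels)

# Undirected path graph 0 - 1 - 2 as a toy connectivity pattern.
path = np.array([[0, 1, 0],
                 [1, 0, 1],
                 [0, 1, 0]])
spectrum = laplacian_spectrum(path)
print(spectrum)  # approximately [0, 1, 3]
```

A matrix such as `k_avg` could then be passed to a support vector machine that accepts a precomputed kernel, for example scikit-learn's `SVC(kernel="precomputed")`; how the TDA and spectral features are combined with the kernel in the paper is not specified in the abstract.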

Matching journals

The top 6 journals account for 50% of the predicted probability mass.

| Rank | Journal | Papers in training set | Percentile | Predicted probability |
|---|---|---|---|---|
| 1 | Bioinformatics | 1061 | Top 0.9% | 26.0% |
| 2 | BMC Bioinformatics | 383 | Top 0.9% | 10.1% |
| 3 | PLOS Computational Biology | 1633 | Top 8% | 4.2% |
| 4 | Scientific Reports | 3102 | Top 36% | 3.6% |
| 5 | Briefings in Bioinformatics | 326 | Top 2% | 3.6% |
| 6 | PLOS ONE | 4510 | Top 39% | 3.6% |

50% of probability mass above

| Rank | Journal | Papers in training set | Percentile | Predicted probability |
|---|---|---|---|---|
| 7 | Bioinformatics Advances | 184 | Top 2% | 2.7% |
| 8 | IEEE Journal of Biomedical and Health Informatics | 34 | Top 0.6% | 2.6% |
| 9 | Frontiers in Bioinformatics | 45 | Top 0.1% | 2.1% |
| 10 | Frontiers in Molecular Biosciences | 100 | Top 1% | 1.9% |
| 11 | Journal of Computational Biology | 37 | Top 0.2% | 1.7% |
| 12 | Advanced Science | 249 | Top 11% | 1.7% |
| 13 | Nucleic Acids Research | 1128 | Top 12% | 1.5% |
| 14 | Frontiers in Genetics | 197 | Top 6% | 1.5% |
| 15 | Nature Communications | 4913 | Top 53% | 1.5% |
| 16 | Journal of Chemical Information and Modeling | 207 | Top 2% | 1.5% |
| 17 | Computational and Structural Biotechnology Journal | 216 | Top 6% | 1.3% |
| 18 | NAR Genomics and Bioinformatics | 214 | Top 3% | 1.2% |
| 19 | Communications Biology | 886 | Top 14% | 1.2% |
| 20 | IEEE Transactions on Computational Biology and Bioinformatics | 17 | Top 0.4% | 1.2% |
| 21 | GigaScience | 172 | Top 2% | 1.2% |
| 22 | Patterns | 70 | Top 1% | 1.2% |
| 23 | International Journal of Molecular Sciences | 453 | Top 12% | 1.0% |
| 24 | Journal of Proteome Research | 215 | Top 2% | 0.9% |
| 25 | Frontiers in Microbiology | 375 | Top 9% | 0.7% |
| 26 | Frontiers in Plant Science | 240 | Top 5% | 0.7% |
| 27 | Expert Systems with Applications | 11 | Top 0.4% | 0.7% |
| 28 | Proceedings of the National Academy of Sciences | 2130 | Top 46% | 0.7% |
| 29 | BioData Mining | 15 | Top 1% | 0.6% |