Robust Feature Selection for Cancer Microarray Data Using a Hybrid mRMR and Binary Lion Optimization Algorithm

Sahu, B.; Panigrahi, A.; Abhilash Pati, A. P.; Madhavi, B. K.; Mishra, J.; Budhathoki, R. K.; Mallik, S.

2025-10-08 public and global health
10.1101/2025.10.07.25337478 medRxiv

Microarray cancer datasets are characterized by a large number of irrelevant, redundant, and noisy features, which can severely hinder the accuracy and efficiency of classification algorithms. Feature selection, as a crucial branch of feature engineering, aims to enhance classification performance by identifying and retaining only the most informative features. However, feature selection is an NP-hard problem, and conventional search strategies are often prone to premature convergence on local optima and to increased computational burden. To address these challenges, global metaheuristic algorithms have been widely explored. The recently proposed Lion Optimization (LO) algorithm has shown promising results for continuous optimization problems, yet its design is not inherently suited to discrete feature selection tasks. To overcome this limitation, a binary variant of the LO algorithm, termed Binary Lion Optimization (BLO), is introduced for wrapper-based feature selection in microarray cancer data analysis. In this work, the Minimum Redundancy Maximum Relevance (mRMR) criterion is first employed as a filter method to identify an initial subset of relevant features, thereby reducing search complexity. This reduced feature subset is then optimized using the BLO algorithm to improve classification outcomes. The proposed mRMR-BLO framework was evaluated on several widely recognized cancer microarray datasets and benchmarked against four state-of-the-art binary optimization algorithms. Experimental results demonstrate that mRMR-BLO consistently identifies smaller yet highly discriminative feature subsets while achieving competitive or superior prediction accuracy. These findings highlight the potential of mRMR-BLO as an effective and robust tool for high-dimensional microarray cancer classification.
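
The two-stage idea in the abstract (an mRMR filter followed by a binary wrapper search) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify how BLO binarizes positions or scores subsets, so the S-shaped transfer function, the weighted fitness, and the names `mrmr_rank`, `binarize`, `fitness`, and `alpha` are all assumptions chosen for illustration. The relevance and redundancy scores are assumed to be precomputed (for example from mutual information).

```python
import math
import random


def mrmr_rank(relevance, redundancy, k):
    """Greedy mRMR filter: repeatedly pick the feature with the highest
    relevance minus its mean redundancy to the features already picked.
    `relevance[j]` and `redundancy[j][s]` are assumed to be precomputed."""
    selected = []
    remaining = list(range(len(relevance)))
    while remaining and len(selected) < k:
        def score(j):
            if not selected:
                return relevance[j]
            mean_red = sum(redundancy[j][s] for s in selected) / len(selected)
            return relevance[j] - mean_red
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


def sigmoid(x):
    """Numerically safe logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def binarize(position, rng):
    """Map a continuous position vector to a 0/1 feature mask using an
    S-shaped transfer function (an assumed binarization scheme)."""
    return [1 if rng.random() < sigmoid(x) else 0 for x in position]


def fitness(mask, error_rate, alpha=0.99):
    """Wrapper objective (lower is better): weighted sum of the classifier's
    error rate and the fraction of features kept. `alpha` is hypothetical."""
    kept = sum(mask)
    if kept == 0:
        return float("inf")  # an empty subset cannot be evaluated
    return alpha * error_rate + (1.0 - alpha) * kept / len(mask)
```

In a full wrapper, `error_rate` would come from cross-validating a classifier on the columns selected by `mask`, and the metaheuristic would update the continuous positions between calls to `binarize`; both steps are omitted here.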

Matching journals

The top 5 journals account for 50% of the predicted probability mass.

1. Expert Systems with Applications (11 papers in training set; Top 0.1%): 18.1%
2. PLOS ONE (4510 papers in training set; Top 14%): 12.9%
3. PLOS Computational Biology (1633 papers in training set; Top 3%): 10.4%
4. Scientific Reports (3102 papers in training set; Top 16%): 6.6%
5. BMC Cancer (52 papers in training set; Top 0.4%): 4.5%

50% of probability mass above

6. BMC Genomics (328 papers in training set; Top 1%): 2.7%
7. Biomedical Signal Processing and Control (18 papers in training set; Top 0.2%): 2.2%
8. Frontiers in Public Health (140 papers in training set; Top 4%): 1.9%
9. BMC Medical Informatics and Decision Making (39 papers in training set; Top 1%): 1.7%
10. Heliyon (146 papers in training set; Top 2%): 1.7%
11. Applied Sciences (24 papers in training set; Top 0.3%): 1.7%
12. Frontiers in Plant Science (240 papers in training set; Top 4%): 1.4%
13. Frontiers in Molecular Biosciences (100 papers in training set; Top 3%): 1.3%
14. Communications Biology (886 papers in training set; Top 14%): 1.3%
15. Journal of Genetics and Genomics (36 papers in training set; Top 2%): 0.9%
16. Computers in Biology and Medicine (120 papers in training set; Top 4%): 0.9%
17. Malaria Journal (48 papers in training set; Top 1%): 0.9%
18. Frontiers in Physics (20 papers in training set; Top 0.7%): 0.9%
19. IEEE Access (31 papers in training set; Top 0.8%): 0.8%
20. Briefings in Bioinformatics (326 papers in training set; Top 6%): 0.8%
21. BMC Bioinformatics (383 papers in training set; Top 7%): 0.8%
22. Nature Communications (4913 papers in training set; Top 62%): 0.8%
23. iScience (1063 papers in training set; Top 30%): 0.8%
24. Journal of Chemical Information and Modeling (207 papers in training set; Top 3%): 0.8%
25. Computational and Structural Biotechnology Journal (216 papers in training set; Top 9%): 0.8%
26. Frontiers in Physiology (93 papers in training set; Top 6%): 0.8%
27. Bioinformatics (1061 papers in training set; Top 9%): 0.8%
28. Physical Biology (43 papers in training set; Top 2%): 0.7%
29. Journal of Neural Engineering (197 papers in training set; Top 2%): 0.7%
30. Journal of The Royal Society Interface (189 papers in training set; Top 5%): 0.7%