Accepted: 19 January 2024 / Published online: 6 February 2024 | Amol Avinash Joshi, Rabia Musheer Aziz
The paper presents two hybrid two-phase approaches, SMOCS and CSSMO, that combine cuckoo search (CS) and spider monkey optimization (SMO) for gene selection and deep learning classification of cancer using gene expression data. The methods aim to identify a subset of genes that aids early-stage cancer prediction by combining the strengths of both metaheuristic algorithms. To enhance accuracy, the minimum redundancy maximum relevance (mRMR) method is employed to reduce redundancy in the cancer datasets. The selected gene subsets are then classified using deep learning (DL) to identify the groups or classes associated with specific cancer types. The performance of the proposed approaches is evaluated on six cancer datasets using recall, precision, F1-score, and confusion matrix analysis. The results show that both SMOCS and CSSMO achieve high prediction accuracy, with SMOCS reaching a maximum accuracy of 100% across all datasets. The paper highlights the importance of gene expression data in cancer research and the challenges of managing high-dimensional data, emphasizing the role of feature selection and extraction techniques in improving classifier performance.
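As a rough illustration of the evaluation metrics named above, the following minimal sketch derives per-class precision, recall, and F1-score, plus overall accuracy, from a confusion matrix. It is not the authors' code: the function names, the row/column convention (rows are true classes, columns are predicted classes), and the example counts are all assumptions made for illustration, and the numbers are not results from the paper.

```python
def per_class_metrics(confusion):
    """Return precision, recall and F1 for each class.

    Assumed convention: confusion[i][j] is the number of samples whose
    true class is i and whose predicted class is j.
    """
    n = len(confusion)
    metrics = []
    for k in range(n):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp                        # true k, predicted as another class
        fp = sum(confusion[i][k] for i in range(n)) - tp   # another class, predicted as k
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics.append({"precision": precision, "recall": recall, "f1": f1})
    return metrics


def accuracy(confusion):
    """Fraction of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(len(confusion)))
    return correct / total if total else 0.0


# Hypothetical two-class confusion matrix (made-up counts for illustration).
example = [
    [8, 2],  # true class 0: 8 predicted as 0, 2 predicted as 1
    [1, 9],  # true class 1: 1 predicted as 0, 9 predicted as 1
]

if __name__ == "__main__":
    print("accuracy:", accuracy(example))
    for label, m in enumerate(per_class_metrics(example)):
        print(label, m)
```

Reporting the per-class values alongside accuracy is useful on gene expression datasets where classes are imbalanced, since a high accuracy alone can hide poor recall on a minority class.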
A two-phase cuckoo search based approach for gene selection and deep learning classification of cancer disease using gene expression data with a novel fitness function