16 February 2024 | Chai Chee Chiet, Khoh Wee How, Pang Ying Han and Yap Hui Yen
This study proposes a lung cancer detection system that uses pre-trained convolutional neural network (CNN) models to improve the accuracy of early detection. It evaluates several pre-trained CNN models, namely ResNet-50, VGG-16, Xception, and MobileNet, on classifying lung cancer from CT scan images, and investigates whether adding extra layers and tuning parameters such as the number of epochs, batch size, and optimizer improves model performance. The dataset is LIDC-IDRI, a publicly available collection of CT scan images of lung cancer patients. The images are pre-processed with median filtering, anisotropic diffusion, and thresholding to reduce noise and improve image quality, and are then used to train and test the CNN models. The best-performing model is the pre-trained VGG-16 with added fully connected layers, a batch size of 16, and the Adam optimizer, which achieves an accuracy of 86.71%. The study also compares the performance of various machine learning models, including KNN regression and SVM with linear and RBF kernels, which achieved 100% accuracy. The results show that pre-trained CNN models outperform traditional machine learning models in terms of accuracy. The study concludes that the pre-trained VGG-16 model is the most effective for lung cancer detection and that future research should explore other models, such as DenseNet, for improved performance. It also recommends combining metadata with the pre-processed image data for more accurate results.
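
The pre-processing steps named above (median filtering, anisotropic diffusion, thresholding) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the order of the steps, the Perona-Malik form of the diffusion, the function names, and every parameter value (`size`, `n_iter`, `kappa`, `step`, `level`) are assumptions chosen for illustration only.

```python
import numpy as np


def median_filter(img, size=3):
    """Replace each pixel with the median of its size x size neighbourhood."""
    pad = size // 2
    padded = np.pad(img, pad, mode="edge")
    # View of shape (H, W, size, size): one window per output pixel.
    windows = np.lib.stride_tricks.sliding_window_view(padded, (size, size))
    return np.median(windows, axis=(-2, -1))


def anisotropic_diffusion(img, n_iter=10, kappa=30.0, step=0.2):
    """Perona-Malik diffusion (assumed variant): smooths flat regions, keeps edges.

    step should stay at or below 0.25 for the 4-neighbour scheme to be stable.
    """
    out = img.astype(np.float64)
    for _ in range(n_iter):
        padded = np.pad(out, 1, mode="edge")
        # Differences to the four nearest neighbours (north, south, west, east).
        diffs = (
            padded[:-2, 1:-1] - out,
            padded[2:, 1:-1] - out,
            padded[1:-1, :-2] - out,
            padded[1:-1, 2:] - out,
        )
        # Conduction is near 1 for small differences and near 0 across strong edges.
        out = out + step * sum(np.exp(-(d / kappa) ** 2) * d for d in diffs)
    return out


def threshold(img, level):
    """Return a boolean mask of pixels brighter than level."""
    return img > level


def preprocess(img, level=100.0):
    """Hypothetical pipeline: denoise, smooth, then binarise a 2-D slice."""
    denoised = median_filter(img, size=3)
    smoothed = anisotropic_diffusion(denoised)
    return threshold(smoothed, level)
```

Calling `preprocess(ct_slice)` on a 2-D array returns a boolean mask of the same shape. In practice the threshold level and diffusion parameters would need tuning to the intensity range of the CT data, which the summary above does not specify.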