Classification of lung cancer using ensemble-based feature selection and machine learning methods
Publication in refereed journal


Times Cited
Web of Science: 46 (as at 15/09/2020)

Other information
Abstract: Lung cancer is one of the leading causes of death worldwide. There are three major types of lung cancer: non-small cell lung cancer (NSCLC), small cell lung cancer (SCLC) and carcinoid. NSCLC is further classified into lung adenocarcinoma (LADC), squamous cell lung cancer (SQCLC) and large cell lung cancer. Many previous studies have demonstrated that DNA methylation markers are potential lung cancer-specific biomarkers. However, whether a single set of DNA methylation markers can simultaneously distinguish these three types of lung cancer remains unclear. In the present study, ROC (Receiver Operating Characteristic) curve analysis, RFs (Random Forests) and mRMR (Maximum Relevancy and Minimum Redundancy) were used to capture unbiased, informative and compact molecular signatures, which were then passed to machine learning methods to classify LADC, SQCLC and SCLC. As a result, a panel of 16 DNA methylation markers shows good classification power, with an accuracy of 86.54% and a recall of 84.37% in leave-one-out cross-validation (LOOCV), and an accuracy of 84.6% and a recall of 85.5% on an independent test data set. In addition, comparison results indicate that, when combined with the incremental feature selection (IFS) strategy, ensemble-based feature selection methods outperform individual ones in terms of how informative and compact the selected features are. Taken together, the results suggest that the ensemble-based feature selection approach is effective and that a common panel of DNA methylation markers may exist among these three types of lung cancer tissue, which would facilitate clinical diagnosis and treatment.
All Author(s) List: Cai ZH, Xu D, Zhang Q, Zhang JX, Ngai SM, Shao JL
Journal name: Molecular BioSystems
Year: 2015
Month: 1
Day: 1
Volume Number: 11
Issue Number: 3
Publisher: Royal Society of Chemistry
Pages: 791 - 800
ISSN: 1742-206X
eISSN: 1742-2051
Languages: English-United Kingdom
Web of Science Subject Categories: Biochemistry & Molecular Biology
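
The workflow summarised in the abstract (rank features with several selectors, combine the rankings, then grow the feature set incrementally and score each candidate set with LOOCV) can be illustrated with a small sketch. This is not the authors' implementation. It makes several assumptions that do not come from the record: the two rankers are a one-vs-rest AUC score and a simple Fisher-style variance ratio (standing in for the ROC, RF and mRMR selectors), rankings are combined by mean rank, the classifier is a nearest-centroid rule, and the data are a made-up toy matrix. All function names are hypothetical.

```python
from statistics import mean, pvariance


def auc_relevance(col, y):
    """Mean one-vs-rest AUC of one feature (Mann-Whitney form)."""
    scores = []
    for c in sorted(set(y)):
        pos = [v for v, label in zip(col, y) if label == c]
        neg = [v for v, label in zip(col, y) if label != c]
        wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
        auc = wins / (len(pos) * len(neg))
        scores.append(max(auc, 1 - auc))  # ignore the direction of the effect
    return mean(scores)


def fisher_relevance(col, y):
    """Variance of class means divided by mean within-class variance."""
    groups = [[v for v, label in zip(col, y) if label == c] for c in sorted(set(y))]
    between = pvariance([mean(g) for g in groups])
    within = mean([pvariance(g) for g in groups])
    return between / (within + 1e-12)  # small constant avoids division by zero


def ensemble_ranking(X, y, scorers):
    """Order features by mean rank across the scorers (assumed aggregation rule)."""
    n_features = len(X[0])
    columns = [[row[j] for row in X] for j in range(n_features)]
    mean_rank = [0.0] * n_features
    for scorer in scorers:
        scores = [scorer(col, y) for col in columns]
        order = sorted(range(n_features), key=lambda j: -scores[j])
        for rank, j in enumerate(order):
            mean_rank[j] += rank / len(scorers)
    return sorted(range(n_features), key=lambda j: mean_rank[j])


def loocv_accuracy(X, y, features):
    """Leave-one-out accuracy of a nearest-centroid classifier.

    Assumes every class keeps at least one sample after one is held out.
    """
    correct = 0
    for i in range(len(X)):
        centroids = {}
        for c in set(y):
            rows = [X[k] for k in range(len(X)) if k != i and y[k] == c]
            centroids[c] = [mean([r[j] for r in rows]) for j in features]
        sample = [X[i][j] for j in features]
        predicted = min(
            centroids,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(sample, centroids[c])),
        )
        correct += predicted == y[i]
    return correct / len(X)


def incremental_feature_selection(X, y, ranking):
    """Add features in ranked order; keep the smallest prefix with the best accuracy."""
    best_k, best_acc = 0, -1.0
    for k in range(1, len(ranking) + 1):
        acc = loocv_accuracy(X, y, ranking[:k])
        if acc > best_acc:
            best_k, best_acc = k, acc
    return ranking[:best_k], best_acc


# Toy data (invented): three classes, two informative features, one noise feature.
X = [
    [0.90, 0.50, 0.3], [0.85, 0.45, 0.7], [0.95, 0.55, 0.5],  # class 0
    [0.10, 0.90, 0.7], [0.15, 0.85, 0.3], [0.05, 0.95, 0.5],  # class 1
    [0.10, 0.10, 0.5], [0.15, 0.15, 0.7], [0.05, 0.05, 0.3],  # class 2
]
y = [0, 0, 0, 1, 1, 1, 2, 2, 2]

ranking = ensemble_ranking(X, y, [auc_relevance, fisher_relevance])
selected, accuracy = incremental_feature_selection(X, y, ranking)
print(ranking, selected, accuracy)
```

On the toy data the noise feature is ranked last and the incremental step stops at the two informative features. With real methylation data, the rankers, the aggregation rule and the classifier would all need to be replaced by the choices described in the paper itself.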

Last updated on 2020-16-09 at 00:27