A machine learning approach to mass spectra classification with unsupervised feature selection
Ceccarelli M, d'Acierno A, Facchiano A. A machine learning approach to mass spectra classification with unsupervised feature selection. In: "Computational Intelligence Methods for Bioinformatics and Biostatistics." Edited by: F Masulli, R Tagliaferri and GM Verkhiver; CINN 2008, LNBI 5488, pp.242-252, 2009. Springer-Verlag Berlin Heidelberg 2009
Mass spectra are recognized as a screening tool for detecting discriminatory protein patterns. They are, however, high-dimensional data in which a large number of local maxima (peaks) have to be analyzed. To tackle this problem we have developed a three-step strategy. After data pre-processing, we perform an unsupervised feature selection phase aimed at detecting the salient parts of the spectra that could be useful for the subsequent classification phase. The main contribution of the paper is this feature selection and extraction procedure, which is grounded in the theory of multi-scale spaces. We then use support vector machines for classification. On a data set of tumor and healthy samples, the approach correctly classified more than 95% of the samples. ROC analysis was also performed.
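To make the idea of multi-scale peak selection concrete, the following is a minimal sketch and not the procedure of the paper, whose details the abstract does not give. It assumes a generic scale-space heuristic: the spectrum is smoothed with Gaussians of increasing width, and a local maximum is kept only if it is tall enough at the finest scale and still has a nearby maximum at every coarser scale. All function names, the scale values, the tolerance `tol`, and the relative height threshold `min_rel_height` are hypothetical choices made for illustration, and the demo spectrum is synthetic.

```python
import math


def gaussian_kernel(sigma):
    """Discrete Gaussian truncated at about 3 sigma, normalized to sum to 1."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(k * k) / (2.0 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth(signal, sigma):
    """Convolve the signal with a Gaussian, clamping indices at the borders."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = min(max(i + j - radius, 0), n - 1)
            acc += w * signal[idx]
        out.append(acc)
    return out


def local_maxima(signal):
    """Indices of strict interior local maxima."""
    return [i for i in range(1, len(signal) - 1)
            if signal[i - 1] < signal[i] > signal[i + 1]]


def persistent_peaks(signal, sigmas=(1.0, 2.0, 4.0), tol=3, min_rel_height=0.1):
    """Keep finest-scale maxima that are tall enough and persist across scales.

    Assumption (not from the paper): a peak is "salient" if its smoothed
    intensity at the finest scale is at least min_rel_height times the largest
    smoothed intensity, and a maximum lies within tol samples of it at every
    coarser scale.
    """
    scales = sorted(sigmas)
    finest = smooth(signal, scales[0])
    floor = min_rel_height * max(finest)
    candidates = [i for i in local_maxima(finest) if finest[i] >= floor]
    coarser = [local_maxima(smooth(signal, s)) for s in scales[1:]]
    return [p for p in candidates
            if all(any(abs(p - q) <= tol for q in maxima) for maxima in coarser)]


# Synthetic demo: two broad peaks plus a small deterministic ripple that
# stands in for noise and creates many spurious local maxima.
spectrum = [
    math.exp(-((i - 50) ** 2) / (2 * 6.0 ** 2))
    + 0.6 * math.exp(-((i - 150) ** 2) / (2 * 6.0 ** 2))
    + 0.01 * math.sin(1.7 * i)
    for i in range(200)
]
peaks = persistent_peaks(spectrum)
print(len(local_maxima(spectrum)), "raw maxima;", "kept indices:", peaks)
```

In this sketch the selection step uses no class labels, which is what makes it unsupervised. The intensities at the retained positions could then serve as the feature vector passed to a classifier such as a support vector machine.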

