A classification model for the Leiden proteomics competition

Authors	H.C.J. Hoefsloot, S. Smit, A.K. Smilde
Publication date	2008
Journal	Statistical Applications in Genetics and Molecular Biology
Volume | Issue number	7 | 2
Pages (from-to)	8
Number of pages	10
Organisations	Faculty of Science (FNWI) - Swammerdam Institute for Life Sciences (SILS)
Abstract	A strategy is presented for building a discrimination model in proteomics studies. The model is built using cross-validation, and this cross-validation step can easily be combined with a variable selection method called rank products. The strategy is especially suitable for the case of a low samples-to-variables ratio (undersampling), as is often encountered in proteomics and metabolomics studies. Principal Component Discriminant Analysis is used as the classification method; however, the methodology can be used with any classifier. A data set containing serum samples from breast cancer patients and healthy controls is analysed. Double cross-validation shows that the sensitivity of the model is 82% and the specificity is 86%. Putative biomarkers are identified using the variable selection method. In each cross-validation loop a classification model is built, and the final classification uses a majority vote over this ensemble of classifiers.
Document type	Article
Published at	http://www.bepress.com/sagmb/vol7/iss2/art8
Downloads	293458.pdf

UvA-DARE
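
The two ensemble ideas named in the abstract, rank products for variable selection and majority voting for the final classification, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how variables are scored within a loop, so the sketch assumes each cross-validation loop yields one weight per variable and ranks variables by absolute weight. The function names and the example numbers are hypothetical.

```python
import math
from collections import Counter


def ranks_descending(scores):
    """Rank variables by absolute score; rank 1 is the largest.

    Assumption: a larger absolute weight means a more important variable.
    Ties are broken by variable index (sorted() is stable).
    """
    order = sorted(range(len(scores)), key=lambda j: -abs(scores[j]))
    ranks = [0] * len(scores)
    for position, j in enumerate(order, start=1):
        ranks[j] = position
    return ranks


def rank_products(score_lists):
    """Geometric mean of each variable's rank over all loops.

    score_lists holds one list of per-variable scores per
    cross-validation loop. A small rank product means the variable
    ranked near the top consistently.
    """
    rank_lists = [ranks_descending(scores) for scores in score_lists]
    n_loops = len(rank_lists)
    n_vars = len(rank_lists[0])
    return [
        math.exp(sum(math.log(r[j]) for r in rank_lists) / n_loops)
        for j in range(n_vars)
    ]


def majority_vote(votes):
    """Return the label predicted by most classifiers in the ensemble."""
    return Counter(votes).most_common(1)[0][0]


# Hypothetical weights for 3 variables from 3 cross-validation loops.
loop_scores = [
    [0.9, -0.1, 0.5],
    [0.8, 0.2, -0.6],
    [0.1, 0.3, 0.7],
]
rp = rank_products(loop_scores)
best_variable = min(range(len(rp)), key=lambda j: rp[j])

# Hypothetical predictions for one sample from the 3 loop models.
final_label = majority_vote(["cancer", "control", "cancer"])
```

Using an odd number of voting models, as in the example, avoids ties in a two-class problem; how ties are resolved in the paper is not stated in the abstract.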