Using Variable Precision Rough Set for Selection and Classification of Biological Knowledge Integrated in DNA Gene Expression

DOI: 10.2390/biecoll-jib-2012-199
Submitted: June 29, 2012
Published: July 24, 2012
PubMed ID: 22829570

Diego Calvo-Dmgz, Juan Francisco Gálvez, Daniel Glez-Peña, Silvana Gómez-Meire and Florentino Fdez-Riverola

Correspondence should be addressed to:
Juan Francisco Gálvez
E.S. Ingeniería Informática, University of Vigo, Ed. Politécnico, Campus Universitario As Lagoas s/n, 32004 Ourense, Spain,
galvez@uvigo.es


Abstract

DNA microarrays have contributed to the exponential growth of genomic and experimental data in the last decade. Researchers have used this large amount of gene expression data to diagnose diseases such as cancer with machine learning methods. In turn, explicit biological knowledge about gene functions has also grown tremendously over the same period. This work integrates explicit biological knowledge, provided as gene sets, into the classification process by means of Variable Precision Rough Set Theory (VPRS). The proposed model is able to highlight which part of the provided biological knowledge has been important for classification. This paper presents a novel model for microarray data classification that incorporates prior biological knowledge in the form of gene sets. Based on this knowledge, we transform the input microarray data into supergenes, and then apply rough set theory to select the most promising supergenes and to derive a set of easily interpretable classification rules. The proposed model is evaluated on three breast cancer microarray datasets, obtaining successful results compared to classical classification techniques. The experimental results show that there are no significant differences between our model and classical techniques, but that our model is able to provide a biologically interpretable explanation of how it classifies new samples.
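
The pipeline outlined in the abstract (gene sets to supergenes, then rough-set rule extraction) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how a supergene value is computed, how values are discretized, or which precision level is used, so the median aggregation, the fixed threshold, the value of `beta`, and all function and variable names below are assumptions made for illustration. The sketch uses the precision form of VPRS, in which an equivalence class supports a rule when at least a fraction `beta` (with 0.5 < `beta` <= 1) of its samples share one label.

```python
from collections import Counter, defaultdict
from statistics import median


def to_supergenes(expression, gene_sets):
    """Summarise each gene set as one value per sample.

    The median is an assumed aggregation; the abstract does not specify one.
    """
    n_samples = len(next(iter(expression.values())))
    supergenes = {}
    for name, genes in gene_sets.items():
        present = [g for g in genes if g in expression]
        if not present:
            continue  # gene set has no measured genes on this array
        supergenes[name] = [
            median(expression[g][i] for g in present) for i in range(n_samples)
        ]
    return supergenes


def discretize(values, threshold=0.0):
    """Binarise values with a fixed threshold (an assumed, simplistic scheme)."""
    return [1 if v > threshold else 0 for v in values]


def beta_rules(table, labels, attributes, beta=0.8):
    """Derive rules from equivalence classes that pass the precision level beta.

    Returns (rules, quality), where quality is the fraction of samples that
    fall inside the beta-positive region for the chosen attributes.
    """
    classes = defaultdict(list)
    for i, label in enumerate(labels):
        key = tuple(table[a][i] for a in attributes)
        classes[key].append(label)

    rules, covered = {}, 0
    for key, members in classes.items():
        label, count = Counter(members).most_common(1)[0]
        if count / len(members) >= beta:
            rules[key] = label
            covered += len(members)
    return rules, covered / len(labels)


# Toy data: four hypothetical genes measured on five samples.
expression = {
    "g1": [2.0, 1.5, -1.0, -2.0, 1.0],
    "g2": [1.0, 2.5, -0.5, -1.0, 3.0],
    "g3": [0.5, -0.5, 0.2, -0.3, 0.4],
    "g4": [-0.1, 0.3, 0.6, -0.7, 0.8],
}
gene_sets = {"set_A": ["g1", "g2"], "set_B": ["g3", "g4"]}
labels = ["tumor", "tumor", "normal", "normal", "normal"]

supergenes = to_supergenes(expression, gene_sets)
table = {name: discretize(values) for name, values in supergenes.items()}

for name in table:
    rules, quality = beta_rules(table, labels, [name], beta=0.6)
    print(name, rules, quality)
```

Ranking supergenes by the quality returned for each one gives a simple selection criterion, and the surviving rules read directly as statements about gene sets, which is the kind of interpretability the abstract describes. Lowering `beta` admits rules from noisier equivalence classes; setting it to 1 recovers the classical rough set lower approximation.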

Reference

Diego Calvo-Dmgz, Juan Francisco Gálvez, Daniel Glez-Peña, Silvana Gómez-Meire and Florentino Fdez-Riverola. Using Variable Precision Rough Set for Selection and Classification of Biological Knowledge Integrated in DNA Gene Expression. Journal of Integrative Bioinformatics, 9(3):199, 2012. Online Journal: http://journal.imbio.de/index.php?paper_id=199