Classification with binary gene expressions
Tuna, Salih and Niranjan, Mahesan (2009) Classification with binary gene expressions. Journal of Biomedical Science and Engineering, 2, (6), 390-399.
Microarray gene expression measurements are usually reported, used and archived to high numerical precision. However, properties of mRNA molecules, such as their low stability and availability in small copy numbers, and the fact that measurements correspond to a population of cells rather than a single cell, make high precision meaningless. Recent work shows that reducing measurement precision leads to very little loss of information, right down to binary levels. In this paper we show how properties of binary spaces can be useful in making inferences from microarray data. In particular, we use the Tanimoto similarity metric for binary vectors, which has been used effectively in the Chemoinformatics literature for retrieving chemical compounds with certain functional properties. This measure, when incorporated in a kernel framework, helps recover any information lost by quantization. By implementing a spectral clustering framework, we further show that a second reason for high performance from the Tanimoto metric can be traced back to a hitherto unnoticed systematic variability in array data: probe-level uncertainties are systematically lower for arrays with large numbers of expressed genes. While we offer no molecular-level explanation for this systematic variability, that it could be exploited in a suitable similarity metric is a useful observation in itself. We further show preliminary results indicating that working with binary data considerably reduces variability in the results across the choice of algorithms in the pre-processing stages of microarray analysis.
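As an illustration of the similarity measure named in the abstract, the following is a minimal sketch of the Tanimoto coefficient on binarized expression profiles, together with the corresponding Gram (kernel) matrix. It is not the authors' implementation: the function names, the thresholding rule in `binarize`, and the convention for two all-zero profiles are assumptions made for this example.

```python
import numpy as np


def binarize(expr, threshold):
    """Quantize expression values to 0/1 (1 = expressed).

    A single fixed threshold is an assumption of this sketch.
    """
    return (np.asarray(expr, dtype=float) > threshold).astype(int)


def tanimoto(a, b):
    """Tanimoto similarity of two binary vectors: |a AND b| / |a OR b|."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        # Assumed convention: two all-zero profiles count as identical.
        return 1.0
    return float(np.logical_and(a, b).sum() / union)


def tanimoto_kernel(X):
    """Pairwise Tanimoto similarities between the rows of a 0/1 matrix X."""
    X = np.asarray(X, dtype=float)
    inter = X @ X.T                      # shared "on" genes per pair of rows
    counts = X.sum(axis=1)               # number of "on" genes per row
    union = counts[:, None] + counts[None, :] - inter
    K = np.ones_like(inter)              # stays 1.0 where the union is empty
    np.divide(inter, union, out=K, where=union > 0)
    return K
```

The matrix returned by `tanimoto_kernel` is symmetric with ones on the diagonal, so it could be passed to a kernel method that accepts a precomputed Gram matrix, or used as the affinity matrix in a spectral clustering routine.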
|Divisions:||Faculty of Physical and Applied Science > Electronics and Computer Science > Comms, Signal Processing & Control|
|Date Deposited:||11 Nov 2009 14:19|
|Last Modified:||01 Mar 2012 15:51|
|Contributors:||Tuna, Salih (Author); Niranjan, Mahesan (Author)|