Cluster Subspace Identification Via Conditional Entropy Calculations
James Diggans (George Mason University), jdiggans@gmu.edu, and
Jeffrey L. Solka (George Mason University), jsolka@gmu.edu
Abstract
Few methods for high-level data exploration are robust to the noise found in microarray data. Approaches that use all of the original features to derive cluster structure can be misleading, while those that rely on trivial feature selection can miss important characteristics. We present a method, adapted from previous work in the field of geography (Guo et al., Workshop on Clustering High Dimensional Data and its Applications, 2003), that uses the conditional entropy between pairs of dimensions to uncover the underlying, native cluster structure within a dataset. Results will be presented on artificial and gene expression data sets.
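
To make the central quantity concrete, the following is a minimal sketch of one way to estimate the conditional entropy between pairs of dimensions. It is not the authors' implementation: the equal-width histogram discretization, the bin count, the function names, and the synthetic data are all assumptions chosen for illustration. It relies on the standard identity H(Y|X) = H(X,Y) - H(X), so a pair of dimensions that share cluster structure should score lower than a pair in which one dimension is noise.

```python
import numpy as np


def conditional_entropy(x, y, bins=10):
    """Estimate H(Y | X) in bits from an equal-width 2-D histogram.

    The discretization scheme and default bin count are illustrative
    assumptions, not taken from the abstract.
    """
    counts, _, _ = np.histogram2d(x, y, bins=bins)
    joint = counts / counts.sum()      # empirical joint distribution p(x, y)
    p_x = joint.sum(axis=1)            # marginal distribution p(x)

    # Skip empty cells: 0 * log(0) is taken to be 0.
    nonzero_joint = joint[joint > 0]
    nonzero_x = p_x[p_x > 0]
    h_xy = -np.sum(nonzero_joint * np.log2(nonzero_joint))
    h_x = -np.sum(nonzero_x * np.log2(nonzero_x))

    # Chain rule: H(Y | X) = H(X, Y) - H(X)
    return h_xy - h_x


def pairwise_conditional_entropy(data, bins=10):
    """Return a matrix whose entry [i, j] estimates H(dim j | dim i)."""
    n_dims = data.shape[1]
    ce = np.zeros((n_dims, n_dims))
    for i in range(n_dims):
        for j in range(n_dims):
            if i != j:
                ce[i, j] = conditional_entropy(data[:, i], data[:, j], bins)
    return ce


# Hypothetical data: dimensions 0 and 1 carry two clusters,
# dimension 2 is uniform noise with no cluster structure.
rng = np.random.default_rng(0)
cluster_a = rng.normal(0.0, 0.3, size=(200, 2))
cluster_b = rng.normal(5.0, 0.3, size=(200, 2))
informative = np.vstack([cluster_a, cluster_b])
noise = rng.uniform(0.0, 5.0, size=(400, 1))
data = np.hstack([informative, noise])

ce = pairwise_conditional_entropy(data, bins=10)
print(np.round(ce, 2))
```

In this sketch, low off-diagonal entries flag pairs of dimensions that are informative about each other, which is one plausible way to shortlist the dimensions that span a cluster subspace. How such a matrix would be thresholded or grouped is left open here.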