In feature selection we aim to reduce the dimensionality of a dataset by excluding features whose removal does not compromise, and may even enhance, the classification of a set of samples. We present a new supervised, multivariate feature selection approach that constructs proximity graphs in such a way that the number of edges connecting samples from different classes is minimised. We illustrate this general idea using the Minimum Spanning Tree as the proximity graph and an Evolutionary Algorithm to search for a feature subset. We compare the performance of our algorithm against other feature selection methods: the (alpha,beta)-k-Feature Set approach and a ranking-based method that uses CM1 scores. We employ two publicly available real-world datasets (one with training and test variants). Classification accuracy is evaluated using a total of 49 methods from WEKA, an open-source data mining and machine learning package.
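The quantity being minimised can be illustrated with a minimal sketch. It builds a Minimum Spanning Tree over the samples, using only a candidate feature subset, and counts the tree edges whose endpoints carry different class labels. The abstract does not specify the distance measure, the exact objective, or the Evolutionary Algorithm, so the Euclidean distance, the function names `mst_edges` and `inter_class_edges`, and the toy data below are all assumptions made for illustration only.

```python
import math


def mst_edges(points):
    """Return the edges (i, j) of a Minimum Spanning Tree over `points`.

    Uses Prim's algorithm on the complete graph with Euclidean edge weights
    (the distance measure is an assumption, not taken from the paper).
    """
    n = len(points)
    if n < 2:
        return []
    in_tree = [False] * n
    best_dist = [math.inf] * n   # cheapest known link from the tree to each vertex
    best_from = [-1] * n         # tree vertex providing that cheapest link
    in_tree[0] = True
    for j in range(1, n):
        best_dist[j] = math.dist(points[0], points[j])
        best_from[j] = 0
    edges = []
    for _ in range(n - 1):
        # Attach the vertex outside the tree that is closest to the tree.
        k = min((j for j in range(n) if not in_tree[j]), key=lambda j: best_dist[j])
        in_tree[k] = True
        edges.append((best_from[k], k))
        # Relax the links of the remaining vertices through the new vertex.
        for j in range(n):
            if not in_tree[j]:
                d = math.dist(points[k], points[j])
                if d < best_dist[j]:
                    best_dist[j] = d
                    best_from[j] = k
    return edges


def inter_class_edges(samples, labels, feature_subset):
    """Count MST edges joining samples of different classes.

    The tree is built on the samples restricted to `feature_subset`
    (a list of column indices); lower counts suggest better class separation.
    """
    projected = [[row[f] for f in feature_subset] for row in samples]
    return sum(1 for i, j in mst_edges(projected) if labels[i] != labels[j])


# Hypothetical toy data: feature 0 separates the classes, feature 1 is noise.
samples = [[0.0, 5.0], [0.1, 0.0], [0.2, 9.0],
           [1.0, 0.2], [1.1, 5.1], [1.2, 9.1]]
labels = [0, 0, 0, 1, 1, 1]

print(inter_class_edges(samples, labels, [0]))  # 1
print(inter_class_edges(samples, labels, [1]))  # 5
```

In this toy case the subset containing only feature 0 yields a single inter-class edge, whereas feature 1 alone makes every tree edge cross classes. A search procedure such as the Evolutionary Algorithm mentioned above could use a count of this kind as the fitness of each candidate subset, though the paper's actual fitness function may differ.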
Proceedings of the 13th Australasian Data Mining Conference (AusDM 2015), Sydney, 8-9 August 2015, pp. 129-139.