- Title
- Efficient methods of feature selection based on combinatorial optimization motivated by the analysis of large biological datasets
- Creator
- Rocha de Paula, Mateus
- Relation
- University of Newcastle Research Higher Degree Thesis
- Resource Type
- thesis
- Date
- 2013
- Description
- Research Doctorate - Doctor of Philosophy (PhD)
- Description
- Intuitively, the Feature Selection problem is to choose a subset of a given set of features that best represents the whole in a particular aspect, preserving the original semantics of the variables on the given samples and classes. In practice, the objective of finding such a subset is often to reveal a particular characteristic present in the given samples. In 2004, a new feature selection approach was proposed, based on a combinatorial optimization problem called the (α, β)-k-Feature Set Problem. The main advantage of this approach over ranking methods is that features are evaluated as groups, instead of only considering their individual performance. Its main drawback is the complexity of the combinatorial problems involved. Since some of them are NP-Complete, it is unlikely that an efficient method exists to solve them to optimality. To the best of the author’s knowledge at the time of this research, the available tools for the (α, β)-k-Feature Set Problem approach cannot solve problems of the magnitude required by many practical applications. Given the significant advantage brought by the multivariate nature of this method, its successful wide applicability, and the fact that its only known real drawback is scalability, further research to overcome this difficulty is appropriate. Even though an optimal solution is always desirable, it is often not strictly necessary in many biological applications. Therefore, this work aims to propose fast heuristics for the (α, β)-k-Feature Set Problem approach, and procedures to obtain dual bounds that do not rely on external optimization packages.
- Subject
- Feature Selection; combinatorial optimization; heuristics; bioinformatics; (α, β)-k-Feature Set
- Identifier
- http://hdl.handle.net/1959.13/938563
- Identifier
- uon:12638
- Rights
- Copyright 2013 Mateus Rocha de Paula
- Language
- eng
- Full Text
File | Description | Size | Format
---|---|---|---
ATTACHMENT01 | Abstract | 138 KB | Adobe Acrobat PDF
ATTACHMENT02 | Thesis | 1 MB | Adobe Acrobat PDF