Please use this identifier to cite or link to this item:
http://hdl.handle.net/20.500.12188/20793
Title: Feature selection and allocation to diverse subsets for multi-label learning problems with large datasets
Authors: Zdravevski, Eftim; Lameski, Petre; Kulakov, Andrea; Gjorgjevikj, Dejan
Issue Date: 7-Sep-2014
Publisher: IEEE
Conference: 2014 Federated Conference on Computer Science and Information Systems
Abstract: Feature selection is an important phase in machine learning, and in the case of multi-label classification it can be considerably challenging. Likewise, finding the best subset of good features is involved and difficult when the dataset has a very large number of features (more than a thousand). In this paper we address the problem of feature selection for multi-label classification with a large number of features. The proposed method is a hybrid of two phases: preliminary feature selection based on the information value, followed by additional correlation-based selection. We show how the first phase can reduce the feature set from tens of thousands to a couple of hundred, and how the second phase can then perform fine-grained feature selection with more sophisticated but computationally intensive methods. Finally, we analyze ways of allocating the selected features to diverse subsets that are suitable for training ensembles of classifiers.
URI: http://hdl.handle.net/20.500.12188/20793
Appears in Collections: | Faculty of Computer Science and Engineering: Conference papers |
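
The two-phase idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the quantile binning, the smoothing constant `eps`, taking the maximum information value over labels, the greedy Pearson-correlation pruning (a simple stand-in for whatever correlation-based selection the paper uses), and the round-robin allocation are all assumptions made for illustration, as are the function names.

```python
import numpy as np


def information_value(x, y, n_bins=10, eps=0.5):
    """Information value of one numeric feature x against one binary label y."""
    # Quantile bin edges; np.unique drops duplicate edges (heavily tied values).
    edges = np.unique(np.quantile(x, np.linspace(0.0, 1.0, n_bins + 1)))
    if len(edges) < 3:
        return 0.0  # fewer than two bins: the feature carries no information
    # Interior edges only, so bin ids run from 0 to len(edges) - 2.
    bins = np.digitize(x, edges[1:-1])
    n_groups = len(edges) - 1
    # eps is an assumed smoothing term that avoids log(0) in empty cells.
    pos = np.bincount(bins, weights=(y == 1).astype(float), minlength=n_groups) + eps
    neg = np.bincount(bins, weights=(y == 0).astype(float), minlength=n_groups) + eps
    p, q = pos / pos.sum(), neg / neg.sum()
    # Sum over bins of (share of positives - share of negatives) * weight of evidence.
    return float(np.sum((p - q) * np.log(p / q)))


def iv_scores(X, Y, n_bins=10):
    """Phase 1 score per feature: maximum IV over all labels (assumed aggregation)."""
    return np.array([
        max(information_value(X[:, j], Y[:, k], n_bins) for k in range(Y.shape[1]))
        for j in range(X.shape[1])
    ])


def prune_correlated(X, ranked, max_corr=0.9):
    """Phase 2 stand-in: keep a feature only if it is weakly correlated with all kept ones."""
    kept = []
    for j in ranked:
        if all(abs(np.corrcoef(X[:, j], X[:, k])[0, 1]) < max_corr for k in kept):
            kept.append(int(j))
    return kept


def allocate_round_robin(ranked, n_subsets):
    """Spread ranked features over subsets so each subset gets some strong features."""
    return [ranked[i::n_subsets] for i in range(n_subsets)]


# Synthetic data: feature 1 nearly duplicates feature 0; features 3 to 5 are noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 6))
X[:, 1] = X[:, 0] + 0.01 * rng.normal(size=500)
Y = np.column_stack([X[:, 0] > 0, X[:, 2] > 0]).astype(int)

scores = iv_scores(X, Y)
top = np.argsort(-scores)[:4]       # phase 1: cheap filter down to the top 4
kept = prune_correlated(X, top)     # phase 2: drop near-duplicate features
subsets = allocate_round_robin(kept, 2)
print(kept, subsets)
```

The cheap per-feature score runs over every column, while the pairwise correlation check only touches the short list that survives it, which mirrors the cost argument in the abstract.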
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.