Please use this identifier to cite or link to this item: http://hdl.handle.net/20.500.12188/20793
Title: Feature selection and allocation to diverse subsets for multi-label learning problems with large datasets
Authors: Zdravevski, Eftim 
Lameski, Petre 
Kulakov, Andrea 
Gjorgjevikj, Dejan
Issue Date: 7-Sep-2014
Publisher: IEEE
Conference: 2014 Federated Conference on Computer Science and Information Systems
Abstract: Feature selection is an important phase in machine learning, and in the case of multi-label classification it can be considerably challenging. Likewise, finding the best subset of good features is involved and difficult when the dataset has a very large number of features (more than a thousand). In this paper we address the problem of feature selection for multi-label classification with a large number of features. The proposed method is a hybrid of two phases: preliminary feature selection based on the information value, followed by additional correlation-based selection. We show how the first phase can reduce the feature set from tens of thousands to a couple of hundred, and how the second phase can then perform fine-grained feature selection with more sophisticated but computationally intensive methods. Finally, we analyze ways of allocating the selected features to diverse subsets that are suitable for training ensembles of classifiers.
URI: http://hdl.handle.net/20.500.12188/20793
Appears in Collections:Faculty of Computer Science and Engineering: Conference papers
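
Illustrative sketch (not taken from the paper): the two-phase idea described in the abstract can be roughed out as below. Everything here is an assumption made for illustration: features are taken to be already discretized into small integer bins, the information value (IV) is computed per binary label with additive smoothing and aggregated across labels by taking the maximum, and the second phase is a simple greedy Pearson-correlation redundancy filter standing in for whatever correlation-based method the authors actually use. The names `information_value`, `pearson`, `select_features`, `top_k`, and `max_corr` are hypothetical.

```python
import math
from collections import Counter


def information_value(feature_bins, labels, eps=0.5):
    """IV of one discretized feature against one binary (0/1) label.

    IV = sum over bins of (p - q) * ln(p / q), where p and q are the shares
    of positive and negative examples falling into the bin. `eps` is additive
    smoothing (an assumption) so that empty cells never give log(0).
    """
    pos = Counter(b for b, y in zip(feature_bins, labels) if y == 1)
    neg = Counter(b for b, y in zip(feature_bins, labels) if y == 0)
    bins = set(pos) | set(neg)
    n_pos = sum(pos.values()) + eps * len(bins)
    n_neg = sum(neg.values()) + eps * len(bins)
    iv = 0.0
    for b in bins:
        p = (pos[b] + eps) / n_pos
        q = (neg[b] + eps) / n_neg
        iv += (p - q) * math.log(p / q)
    return iv


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant column carries no correlation signal
    return sxy / math.sqrt(sxx * syy)


def select_features(columns, label_columns, top_k, max_corr=0.9):
    """Two-phase selection over a dict of {feature name: binned column}.

    Phase 1 (cheap): rank features by their best IV over all labels and
    keep the top_k.
    Phase 2 (costlier, stand-in): walk the ranking and drop any feature
    whose absolute correlation with an already kept feature exceeds max_corr.
    """
    scores = {
        name: max(information_value(col, lab) for lab in label_columns)
        for name, col in columns.items()
    }
    ranked = sorted(scores, key=scores.get, reverse=True)[:top_k]

    kept = []
    for name in ranked:
        if all(abs(pearson(columns[name], columns[k])) <= max_corr for k in kept):
            kept.append(name)
    return kept


if __name__ == "__main__":
    # Toy data: "a_copy" duplicates "a", "noise" is only weakly informative.
    label = [0, 0, 1, 1, 0, 1, 0, 1]
    columns = {
        "a": [0, 0, 1, 1, 0, 1, 0, 1],
        "a_copy": [0, 0, 1, 1, 0, 1, 0, 1],
        "noise": [0, 1, 0, 1, 0, 1, 0, 1],
    }
    print(select_features(columns, [label], top_k=3))
```

On this toy input the duplicate column is discarded in the second phase because it is perfectly correlated with a feature that was already kept. The point of the split is cost: the IV pass looks at each feature once per label, while the pairwise correlation checks are only run on the short list that survives it.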

Files in This Item:
File: 500.pdf
Size: 141.84 kB
Format: Adobe PDF

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.