Applying bagging techniques to the SA Tabu miner rule induction algorithm
Date Issued
2009-09-28
Author(s)
Andovska, Mirjana
Abstract
This paper presents an implementation of bagging techniques over
the heuristic algorithm for induction of classification rules called SA Tabu
Miner. The goal was to achieve better predictive accuracy of the derived
classification rules. Bagging (Bootstrap aggregating) is an ensemble method
that has attracted a lot of attention, both experimentally, since it behaves well
on noisy datasets, and theoretically, because of its simplicity. It is directly
related to bootstrap sampling, since it uses the bootstrap samples to train
multiple predictors. The outputs of the predictors are then combined by various
voting strategies. Bootstrap is a good solution when it is impossible, or too
expensive, to get multiple samples. In this paper we present the experimental
results of various bagging versions of the SA Tabu Miner algorithm. The SA
Tabu Miner [1] algorithm is inspired by both research on heuristic optimization
algorithms (Simulated Annealing and Tabu Search based Data Miner) and rule
induction data mining concepts and principles. The algorithm creates rules
incrementally, performing a sequential process to discover a list of
classification rules that cover as many training cases as possible with the
highest possible quality. It uses a combination of Simulated Annealing and Tabu
Search to perform the search for the optimal classification rule. Several
bootstrap methodologies were applied to SA Tabu Miner, including reducing the
repetition of instances, capping the repetition of any instance at two, and using
different percentages of the original basic training set. Various experimental
approaches and parameters yielded different results on the compared datasets.
In the paper we discuss the results and conclude that the best improvement in
predictive accuracy was achieved by using only 10 voting classifiers derived
from 90% of the basic training dataset.
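
The bagging scheme summarized above can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the `fraction` and `max_repeats` parameters, and the toy threshold learner that stands in for SA Tabu Miner are all assumptions made for illustration. The sketch draws bootstrap samples sized as a percentage of the training set, optionally caps how often one instance may repeat, and combines the trained classifiers by simple majority voting.

```python
import random
from collections import Counter


def bootstrap_sample(data, fraction=1.0, max_repeats=None, rng=None):
    """Draw a sample with replacement containing round(fraction * len(data)) items.

    max_repeats (hypothetical parameter) caps how many times any single
    instance may appear, e.g. 2 for "repetition not to exceed two".
    """
    rng = rng or random.Random()
    size = max(1, round(fraction * len(data)))
    if max_repeats is not None and max_repeats * len(data) < size:
        raise ValueError("max_repeats is too small for the requested sample size")
    counts = Counter()
    sample = []
    while len(sample) < size:
        i = rng.randrange(len(data))
        if max_repeats is not None and counts[i] >= max_repeats:
            continue  # instance already used the allowed number of times
        counts[i] += 1
        sample.append(data[i])
    return sample


def train_bagged(data, train, n_classifiers=10, fraction=0.9, max_repeats=2, seed=0):
    """Train n_classifiers base learners, each on its own bootstrap sample."""
    rng = random.Random(seed)
    return [
        train(bootstrap_sample(data, fraction, max_repeats, rng))
        for _ in range(n_classifiers)
    ]


def majority_vote(classifiers, x):
    """Return the label predicted by most classifiers.

    Ties go to the label that was voted for first.
    """
    votes = Counter(clf(x) for clf in classifiers)
    return votes.most_common(1)[0][0]


def train_stub(sample):
    """Hypothetical stand-in for a rule learner such as SA Tabu Miner.

    Learns a single threshold rule on one numeric feature: the midpoint
    between the mean of class 1 and the mean of class 0.
    """
    pos = [x for x, y in sample if y == 1]
    neg = [x for x, y in sample if y == 0]
    if not pos or not neg:
        majority = 1 if len(pos) >= len(neg) else 0
        return lambda x: majority
    threshold = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return lambda x: 1 if x > threshold else 0


# Toy data: feature values 0..9 belong to class 0, values 10..19 to class 1.
data = [(float(v), 0) for v in range(10)] + [(float(v), 1) for v in range(10, 20)]
ensemble = train_bagged(data, train_stub, n_classifiers=10, fraction=0.9, max_repeats=2)
prediction_high = majority_vote(ensemble, 19.0)
prediction_low = majority_vote(ensemble, 0.0)
```

The defaults of 10 classifiers and a 90% sample mirror the configuration the abstract reports as giving the best improvement; the voting rule here is plain majority voting, which is only one of the voting strategies the abstract alludes to.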
Subjects
File(s)
Name
Applying_Bagging_Techniques_to_the_SA_Tabu_Miner_R.pdf
Size
260.54 KB
Format
Adobe PDF
Checksum
(MD5):f63aa9b29871a4cc6ed1de399e5ecb9d
