  • Item type: Publication,
    Local Hybrid SVMDT Classifier
    (IEEE, 2011-11)
    ;
    ;
    Support vector machines are among the most precise classifiers available, but this precision comes at the cost of speed. There have been many ideas and implementations for improving the speed of support vector machines. While most existing methods focus on reducing the number of support vectors in order to gain speed, our approach additionally focuses on reducing the number of samples that must be classified by the support vector machines in order to reach the final decision about a sample's class. In this paper we propose a novel architecture that integrates decision trees and local SVM classifiers for binary classification. Results show that there is a significant improvement in speed with little or no compromise in classification accuracy.
  • Item type: Publication,
    Sarcasm and Irony Detection in English Tweets
    (Springer International Publishing, 2018)
    Dimovska, Jona
    ;
    Angelovska, Marina
    ;
    ;
    This paper describes an approach to sarcasm and irony detection in English tweets. Accurate sarcasm and irony detection in text is crucial for numerous NLP applications such as sentiment analysis, opinion mining and text summarization. Detecting irony and sarcasm in microblogging posts can be even more challenging because of the restricted message length, the informal language, and the emoticons and hashtags used. In our approach we combined a variety of standard lexical and syntactic features with specific features for capturing figurative content. All experiments were performed with supervised learning, using different approaches to text preprocessing and feature extraction and four different classifiers. The corpus was taken from the SemEval2018 challenge and contains 3834 different tweets. The performance of the different approaches is reported and discussed. The results show that text preprocessing has very little impact on the results, while word and sub-word frequencies are the most useful characteristics for determining irony in tweets. A separate experiment, including a survey, was also conducted in which human participants were challenged to label 20 given tweets from the dataset as ironic or not. The obtained results suggest that accurate irony detection in tweets can be a hard task even for humans.
  • Item type: Publication,
    Robustness of Speech Recognition System of Isolated Speech in Macedonian
    (Springer, 2015)
    Spasovski, Daniel
    ;
    Peshanski, Goran
    ;
    ;
    For over five decades, scientists have attempted to design a machine that accurately transcribes spoken words. Even though satisfactory accuracy has been achieved, machines cannot recognize every voice, in any environment, from any speaker. In this paper we tackle the problem of robustness of Automatic Speech Recognition for isolated Macedonian speech in noisy environments. The goal is to overcome the problem of changing background noise types. Five different types of noise were artificially added to the audio recordings, and the models were trained and evaluated for each one. The worst-case scenario for the speech recognition systems turned out to be babble noise, which at the higher noise levels reaches an 81.10% error rate. It is shown that the error rate increases as the noise increases, and that the model trained with clean speech gives considerably better results at lower noise levels.
  • Item type: Publication,
    Dual Layer Voting Method for Efficient Multi-label Classification
    (Springer Berlin Heidelberg, 2011)
    ;
    ;
    Džeroski, Sašo
    A common approach for solving multi-label classification problems using problem-transformation methods and dichotomizing classifiers is the pairwise decomposition strategy. One of the problems with this approach is the need to query a quadratic number of binary classifiers to make a prediction, which can be quite time consuming, especially in classification problems with a large number of labels. To tackle this problem we propose a Dual Layer Voting Method (DLVM) for efficiently adapting pair-wise multiclass voting to the multi-label setting, which is related to the calibrated label ranking method. Five different real-world datasets (enron, tmc2007, genbase, mediamill and corel5k) were used to evaluate the performance of the DLVM. The performance of this voting method was compared with the majority voting strategy used by the calibrated label ranking method and the quick weighted voting algorithm (QWeighted) for pair-wise multi-label classification. The results from the experiments suggest that the DLVM significantly outperforms the competing algorithms in terms of testing speed while keeping comparable or offering better prediction performance.
  • Item type: Publication,
    ASGRT – Automated Report Generation System
    (Springer Berlin Heidelberg, 2011)
    ;
    ;
    ;
    Angelovski, Martin
    ;
    Georgiev, Marjan
    We have come to a point in time when there is an abundance of database usage in almost all aspects of our lives. However, most end users have neither the knowledge nor the need to manage the databases. Even more important, they are unable to generate the ever-changing reports they need, based on the data in their databases. Our Applicative Solution for Generating Reports from Templates (ASGRT) tries to deal efficiently with this issue. It has a simple yet effective architectural design aimed at giving power to the more experienced administrators and simplicity to common end users, so that they can generate reports with their own criteria and design from their databases. The presented software enables the creation of templates containing text and tags that are recognized and substituted by values retrieved from the database, thereby enabling the creation of customized reports with varying ease of use and flexibility.
  • Item type: Publication,
    HYBRID DECISION TREE ARCHITECTURE UTILIZING LOCAL SVMs FOR EFFICIENT MULTI-LABEL LEARNING
    (World Scientific Pub Co Pte Lt, 2013-11)
    ;
    ;
    DŽEROSKI, SAŠO
    Multi-label learning (MLL) problems abound in many areas, including text categorization, protein function classification, and semantic annotation of multimedia. Issues that severely limit the applicability of many current machine learning approaches to MLL are large-scale problems, which have a strong impact on the computational complexity of learning. These problems are especially pronounced for approaches that transform MLL problems into a set of binary classification problems for which Support Vector Machines (SVMs) are used. On the other hand, the most efficient approaches to MLL, based on decision trees, have clearly lower predictive performance. We propose a hybrid decision tree architecture, where the leaves do not give multi-label predictions directly, but rather utilize local SVM-based classifiers giving multi-label predictions. A binary relevance architecture is employed in the leaves, where a binary SVM classifier is built for each of the labels relevant to that particular leaf. We use a broad range of multi-label datasets with a variety of evaluation measures to evaluate the proposed method against related and state-of-the-art methods, both in terms of predictive performance and time complexity. On almost every large classification problem, our hybrid architecture outperforms the competing approaches in terms of predictive performance, while its computational efficiency is significantly improved as a result of the integrated decision tree.
  • Item type: Publication,
    Two stage architecture for multi-label learning
    (Elsevier BV, 2012-03)
    ;
    ;
    Džeroski, Sašo
    A common approach to solving multi-label learning problems is to use problem transformation methods and dichotomizing classifiers, as in the pair-wise decomposition strategy. One of the problems with this strategy is the need to query a quadratic number of binary classifiers to make a prediction, which can be quite time consuming, especially in learning problems with a large number of labels. To tackle this problem, we propose a Two Stage Architecture (TSA) for efficient multi-label learning. We analyze three implementations of this architecture: the Two Stage Voting Method (TSVM), the Two Stage Classifier Chain Method (TSCCM) and the Two Stage Pruned Classifier Chain Method (TSPCCM). Eight different real-world datasets are used to evaluate the performance of the proposed methods. The performance of our approaches is compared with the performance of two algorithm adaptation methods (Multi-Label k-NN and Multi-Label C4.5) and five problem transformation methods (Binary Relevance, Classifier Chain, Calibrated Label Ranking with majority voting, the Quick Weighted method for pair-wise multi-label learning and the Label Powerset method). The results suggest that TSCCM and TSPCCM outperform the competing algorithms in terms of predictive accuracy, while TSVM has comparable predictive performance. In terms of testing speed, all three methods perform better than the pair-wise methods for multi-label learning.
  • Item type: Publication,
    Efficient Two Stage Voting Architecture for Pairwise Multi-label Classification
    (Springer Berlin Heidelberg, 2010)
    ;
    ;
    A common approach for solving multi-label classification problems using problem-transformation methods and dichotomizing classifiers is the pair-wise decomposition strategy. One of the problems with this approach is the need to query a quadratic number of binary classifiers to make a prediction, which can be quite time consuming, especially in classification problems with a large number of labels. To tackle this problem we propose a two stage voting architecture (TSVA) for efficiently adapting pair-wise multiclass voting to the multi-label setting, which is closely related to the calibrated label ranking method. Four different real-world datasets (enron, yeast, scene and emotions) were used to evaluate the performance of the TSVA. The performance of this architecture was compared with the calibrated label ranking method with majority voting strategy and the quick weighted voting algorithm (QWeighted) for pair-wise multi-label classification. The results from the experiments suggest that the TSVA significantly outperforms the competing algorithms in terms of testing speed while keeping comparable or offering better prediction performance.
  • Item type: Publication,
    Evaluation of different feature sets for gait recognition using skeletal data from Kinect
    (IEEE, 2014-05)
    Dikovski, Bojan
    ;
    ;
    Gait is a person's manner of walking. It is a biometric that can be used for identifying humans. Gait is an unobtrusive metric that can be obtained from a distance, and this is its main strength compared to other biometrics. In this paper we construct and evaluate feature sets with the purpose of finding out the role of different types of features and body parts in the recognition process. The feature sets were constructed from three-dimensional skeletal images captured with a Kinect sensor. The Kinect is a low-cost device that includes RGB, depth and audio sensors. In our work, an automated gait cycle extraction algorithm was applied to the Kinect recordings. Metrics such as angles and distances between joints were aggregated within a gait cycle, and from those aggregations the different feature datasets were constructed. Multilayer perceptron, support vector machine with sequential minimal optimization, and J48 algorithms were used for classification on these datasets. Finally, we give conclusions on which groups of features and body parts gave the best recognition rates.
  • Item type: Publication,
    Parallelization of Dynamic Programming in Nussinov RNA Folding Algorithm on the CUDA GPU
    (Springer Berlin Heidelberg, 2012)
    Stojanovski, Marina Zaharieva
    ;
    ;
    When an RNA primary sequence is folded back on itself, forming complementary base pairs, a form called RNA secondary structure is created. The first solution to the RNA secondary structure prediction problem was the Nussinov dynamic programming algorithm, developed in 1978, which is still an irreplaceable base that all other approaches rely on. In this work, the Nussinov algorithm is analyzed from the CUDA GPU programming perspective. The algorithm is radically redesigned in order to utilize the highly parallel NUMA architecture of the GPU. The implementation of the Nussinov algorithm on the CUDA architecture for the NVidia GeForce 8500 GT graphics card results in substantial acceleration compared with the sequentially executed algorithm.
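
For readers unfamiliar with the Nussinov recurrence mentioned in the last entry, the following is a minimal sequential sketch in Python. It is not the authors' CUDA implementation. The function name `nussinov`, the `min_loop` parameter, and the restriction to Watson-Crick pairs (A-U, G-C) are assumptions made for illustration. The table is filled one anti-diagonal at a time; every cell on a given diagonal depends only on cells from shorter diagonals, which is the property that generally makes this kind of dynamic program amenable to parallel execution.

```python
def nussinov(seq, min_loop=0):
    """Return the maximum number of nested complementary base pairs in seq.

    Illustrative sketch only: Watson-Crick pairs, no wobble pairs, and a
    hypothetical min_loop parameter (bases i and j may pair only if
    j - i > min_loop).
    """
    pairs = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
    n = len(seq)
    if n == 0:
        return 0

    # dp[i][j] holds the best pair count for the subsequence seq[i..j].
    dp = [[0] * n for _ in range(n)]

    # d is the distance j - i; cells with the same d are independent
    # of each other and depend only on cells with smaller d.
    for d in range(1, n):
        for i in range(n - d):
            j = i + d
            best = max(dp[i + 1][j], dp[i][j - 1])  # i or j left unpaired
            if d > min_loop and (seq[i], seq[j]) in pairs:
                inner = dp[i + 1][j - 1] if d >= 2 else 0
                best = max(best, inner + 1)  # i pairs with j
            for k in range(i + 1, j):  # split into two sub-structures
                best = max(best, dp[i][k] + dp[k + 1][j])
            dp[i][j] = best

    return dp[0][n - 1]


if __name__ == "__main__":
    print(nussinov("GGGAAAUCC"))
```

The sketch runs in O(n^3) time and O(n^2) memory. A parallel version would typically assign the cells of each diagonal to separate threads and synchronize between diagonals, but the details of the redesign described in the paper are not given in the abstract.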