Automated Structural Classification of Proteins by Using Decision Trees and Structural Protein Features
Date Issued
2009-09-28
Author(s)
Pepik, Bojan
Abstract
Protein function is closely tied to the classification of proteins into
hierarchical levels in which proteins share the same or similar functions. One
of the most relevant protein classification schemes is the Structural
Classification of Proteins (SCOP). SCOP has one major drawback: because it
relies on manual classification, new proteins are classified much more slowly
than novel protein structures appear in the Protein Data Bank (PDB). In this
work, we propose two approaches for automated protein classification. We
extract protein descriptors from the structural coordinates stored in PDB
files, then apply the C4.5 algorithm to select the most appropriate descriptor
features for classifying proteins according to the SCOP hierarchy. The two
novel approaches are a bottom-up classification flow and a multi-level
classification approach. The results show that these approaches are much faster
than other similar algorithms while achieving comparable accuracy.
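
The abstract says that C4.5 is used to select the most appropriate descriptor features. As a rough illustration only, the sketch below computes the gain-ratio criterion that C4.5 uses to rank candidate splits on numeric attributes. It is a simplified stand-in, not the paper's implementation: the two descriptors, the class labels, the function names, and the midpoint thresholds are all assumptions made for this example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def gain_ratio(values, labels, threshold):
    """Gain ratio of the binary split `value <= threshold` on one numeric descriptor."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    if not left or not right:
        return 0.0  # a degenerate split carries no information
    total = len(labels)
    w_left, w_right = len(left) / total, len(right) / total
    info_gain = entropy(labels) - (w_left * entropy(left) + w_right * entropy(right))
    split_info = -(w_left * math.log2(w_left) + w_right * math.log2(w_right))
    return info_gain / split_info


def best_descriptor(rows, labels):
    """Return (gain ratio, descriptor index, threshold) of the best single split."""
    best = (0.0, None, None)
    for j in range(len(rows[0])):
        column = [row[j] for row in rows]
        cuts = sorted(set(column))
        # Simplification: candidate thresholds are midpoints between
        # consecutive distinct values of the descriptor.
        for low, high in zip(cuts, cuts[1:]):
            threshold = (low + high) / 2
            score = gain_ratio(column, labels, threshold)
            if score > best[0]:
                best = (score, j, threshold)
    return best


# Hypothetical toy data: two made-up descriptors per protein and a
# made-up two-class labelling ("a" and "b"); not taken from the paper.
rows = [[0.875, 1.0], [0.75, 3.0], [0.125, 2.0], [0.25, 4.0]]
labels = ["a", "a", "b", "b"]
score, index, threshold = best_descriptor(rows, labels)
```

In this toy data the first descriptor separates the two classes perfectly, so it is the one selected. A full C4.5 tree would apply the same criterion recursively to each resulting subset and then prune the tree.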
File(s)
Name
2009_ICTInnovations_AutomatedStructuralClassificationofProteinsbyUsingDecisionTreesandStructuralProteinFeatures.pdf
Size
558 KB
Format
Adobe PDF
Checksum
(MD5): a3adedc3bdafb71c4a2c709e07263881
