Protein classification by using four approaches for extraction of the protein ray-based descriptor
Date Issued
2020-05-08
Author(s)
Mirceva, Georgina
Abstract
Knowledge about protein molecules and how they influence
processes in humans is very valuable, because it is needed
in order to develop new drugs for diseases. In proteomics,
one of the most important tasks is the classification of
protein molecules. The literature provides a plethora of
methods that could be used for this task. However, it remains
an open issue, and there is still a need for fast computational
methods that provide accurate classification of proteins. In
this paper, we focus on this task. First, we extract feature
vectors that hold information about the main features of the
proteins. The feature vectors used in this study are obtained
by following the procedure for extraction of our protein
ray-based descriptor, which we introduced in our former
studies. In that procedure, the skeleton of the protein is
interpolated with a predefined number of interpolation points,
and the elements of the feature vector are then extracted as
the Euclidean distances between the interpolation points and
the center of mass. Besides this approach, in this study we
also use three additional approaches for extraction of the
feature vectors, in which we focus on the change of the
Euclidean distance to the center of mass between two
consecutive interpolation points. After extracting the feature
vectors, we apply several well-known classification methods in
order to generate a classification model. We present the
results obtained with these four approaches for extraction of
the feature vectors.
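
The extraction procedure described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation, and it rests on several assumptions that the abstract does not state: the skeleton is taken as a list of 3D backbone coordinates, the interpolation is linear and uniform in arc length, and the center of mass is the unweighted centroid of the backbone points. The function names `resample_polyline`, `ray_descriptor`, and `distance_changes` are hypothetical. The abstract does not specify how the three additional approaches differ, so only the plain difference between consecutive distances is shown.

```python
import math


def resample_polyline(points, n):
    """Place n points (n >= 2) uniformly by arc length along a 3D polyline.

    Assumption: linear interpolation between consecutive backbone points.
    """
    # Cumulative arc length at each original point.
    cum = [0.0]
    for a, b in zip(points, points[1:]):
        cum.append(cum[-1] + math.dist(a, b))
    total = cum[-1]

    samples = []
    seg = 0  # index of the segment that contains the current target length
    for k in range(n):
        target = total * k / (n - 1)
        while seg < len(points) - 2 and cum[seg + 1] < target:
            seg += 1
        span = cum[seg + 1] - cum[seg]
        t = 0.0 if span == 0 else (target - cum[seg]) / span
        a, b = points[seg], points[seg + 1]
        samples.append(tuple(a[i] + t * (b[i] - a[i]) for i in range(3)))
    return samples


def ray_descriptor(backbone, n_points):
    """Euclidean distances from the interpolation points to the center of mass.

    Assumption: the center of mass is the unweighted centroid of the backbone.
    """
    center = tuple(sum(p[i] for p in backbone) / len(backbone) for i in range(3))
    return [math.dist(s, center) for s in resample_polyline(backbone, n_points)]


def distance_changes(descriptor):
    """Change of the distance between two consecutive interpolation points."""
    return [b - a for a, b in zip(descriptor, descriptor[1:])]


# Toy backbone with three points; a real input would come from a structure file.
backbone = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0)]
distances = ray_descriptor(backbone, n_points=5)
changes = distance_changes(distances)
```

The resulting vectors have a fixed length regardless of the protein's size, which is what allows them to be passed to standard classifiers.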
Subjects
File(s)
Name
CIIT2020_paper_48.pdf
Size
298.68 KB
Format
Adobe PDF
Checksum
(MD5):cc4015e61a779a8cabb4098205f3a2e6
