TMO: time and memory optimized algorithm applicable for more accurate alignment of trinucleotide repeat disorders associated genes
Journal
Biotechnology & Biotechnological Equipment
Date Issued
2016-03-03
Author(s)
Stojanov, Done
Madevska Bogdanova, Ana
Marcin Orzechowski, Tomasz
Abstract
In this study, a time and memory optimized (TMO) algorithm is presented. Compared with
the Smith-Waterman algorithm, TMO detects continuous insertions/deletions (indels) more
accurately in fragments of genes associated with disorders caused by over-repetition of a
certain codon. The improvement comes from the tendency to pinpoint indels in the
least preserved nucleotide pairs. All nucleotide pairs that occur less frequently are classified as less
preserved and are treated as mutated codons whose mid-nucleotides were deleted. Another
benefit of the proposed algorithm is its general tendency to maximize the number of matching
nucleotides included per alignment, regardless of any specific alignment metrics. With
Smith-Waterman, the structure of the solution depends on the adjustment of the
alignment parameters, so an incomplete (shortened) solution may be derived; our
algorithm, by contrast, does not reject any of the consistent matching nucleotides that can be
included in the final solution. In terms of computational aspects, our algorithm runs faster than
Smith-Waterman for very similar DNA and requires less memory than the most memory-efficient
dynamic programming algorithms. The speedup comes from the reduced number of nucleotide
comparisons that have to be performed, without compromising the completeness of the
solution. Because only four integers (16 bytes) are required to track a matching fragment,
regardless of its length, our algorithm requires less memory than Huang’s algorithm.
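
The memory claim can be illustrated with a minimal sketch. The abstract does not say which four integers are stored, so the layout below (inclusive start and end positions in each sequence, as 32-bit integers) is an assumption, and the names Fragment and extend_match are hypothetical rather than taken from the paper. The sketch only shows that a run of consecutive matching nucleotides can be recorded in 16 bytes whatever its length; it is not the TMO algorithm itself.

```python
import struct
from typing import NamedTuple


class Fragment(NamedTuple):
    """A run of consecutive matching nucleotides (assumed layout, not from the paper)."""
    start_a: int  # first matched index in sequence a
    end_a: int    # last matched index in sequence a (inclusive)
    start_b: int  # first matched index in sequence b
    end_b: int    # last matched index in sequence b (inclusive)

    def pack(self) -> bytes:
        # Four little-endian 32-bit signed integers: always 16 bytes.
        return struct.pack("<4i", *self)


def extend_match(a: str, b: str, i: int, j: int) -> Fragment:
    """Extend a match that starts at a[i], b[j] for as long as the nucleotides agree."""
    k = 0
    while i + k < len(a) and j + k < len(b) and a[i + k] == b[j + k]:
        k += 1
    if k == 0:
        raise ValueError("a[i] and b[j] do not match")
    return Fragment(i, i + k - 1, j, j + k - 1)


# Two toy CAG-repeat fragments that agree on their first 12 nucleotides.
frag = extend_match("CAGCAGCAGCAGTT", "CAGCAGCAGCAGGA", 0, 0)
print(frag, len(frag.pack()))
```

Under this assumed layout the storage cost stays at 16 bytes whether the fragment covers 3 nucleotides or 3000. A dynamic programming table, by comparison, grows with the lengths of the sequences.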
File(s)
Name
TMO time and memory optimized algorithm applicable for more accurate alignment of trinucleotide repeat disorders associated genes.pdf
Size
1.24 MB
Format
Adobe PDF
Checksum
(MD5):16e03489c4d872497b7db32b7dd7a0d6
