Short Tandem String Duplication Sequences
Date
Authors
Type
Examensarbete för masterexamen
Model builders
Journal Title
Journal ISSN
Volume Title
Publisher
Abstract
It has been shown that tandem repeats are involved in several neurological and genetic
disorders. Additionally, tandem repeats and the Tandem Duplication (TD)
operation have been studied in other areas, such as information theory and formal
languages. Given a string AXB, where A, X, and B denote substrings, a tandem duplication
operation on the substring X copies it and places its copy adjacently, which
produces the string AXXB.
Contemporary work shows that the Tandem Duplication Distance Problem (TDDP)
is NP-complete for alphabets of size five. Therefore, we study the idea of overlapping
squares and introduce a new algorithm, which solves the k-TD problem in time
O((n log n)k), where n denotes the length of the target string T, and k denotes the
number of contraction. Primarily, we prove that the Tandem Duplication Distance
Problem is fixed-parameter tractable in the natural parameter d. We introduce a dynamic
programming algorithm that locally exhausts independent disjoint substrings
and calculates the optimal edit distance through minimization. The algorithm has
a running time of O(nd3(d log d)d), where n denotes the length of the target string
T and d denotes the length difference of the two strings S and T.
Description
Keywords
algorithms, tandem duplication, squares, tandem repeats, edit distance, dynamic programming, FPT algorithms