Clustering for DNA Storage

Luo, Youjun

Download

Youjun_MSc_Thesis__DNA_clustering_final_report.pdf (1.79 MB)

Published

2023

Author

Luo, Youjun

Type

Master's Thesis

Program

Communication Engineering (MPCOM), MSc

Abstract

Deoxyribonucleic acid (DNA) has emerged as a potential storage medium due to its high information density and durability. Clustering plays a critical role in the DNA storage pipeline: it partitions similar sequenced reads into groups for the decoder. However, the synthesis, storage, and sequencing of DNA introduce insertion, deletion, and substitution (IDS) errors, which make clustering the reads harder. Moreover, because of the large number of reads in DNA storage, traditional clustering methods from the biological domain become time-consuming. Recently, a trie-based algorithm called Clover was proposed to accelerate clustering in DNA storage by fuzzy searching the input reads on a trie structure. However, it considers only substitutions during the search, while deletions and insertions are addressed afterwards through multiple tests on different regions of the input reads. In this thesis, we propose efficient clustering algorithms that optimize the trie search by taking the IDS channel into account. In our algorithm, discrete IDS errors are corrected with a depth-limited strategy, and a cluster merging method is developed to improve the search success rate. We validate the proposed methods on three real-world DNA storage datasets, achieving the lowest runtime and comparable accuracy relative to state-of-the-art DNA clustering tools.
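To make the idea of fuzzy searching reads on a trie more concrete, the following is a minimal illustrative sketch in Python. It is not the thesis's algorithm and not Clover: it swaps in a plain Levenshtein dynamic-programming row carried down the trie, so that insertions, deletions, and substitutions are all handled during the search, and it assigns each read greedily to the first matching cluster. The names `TrieNode`, `insert`, `fuzzy_search`, `cluster_reads`, and the parameter `max_edits` are hypothetical and chosen only for this example.

```python
from typing import Dict, List, Optional, Tuple


class TrieNode:
    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        # Set only on the node that ends a stored representative read.
        self.cluster_id: Optional[int] = None


def insert(root: TrieNode, read: str, cluster_id: int) -> None:
    """Store a read in the trie as the representative of a cluster."""
    node = root
    for base in read:
        node = node.children.setdefault(base, TrieNode())
    node.cluster_id = cluster_id


def fuzzy_search(root: TrieNode, read: str, max_edits: int) -> Optional[Tuple[int, int]]:
    """Return (cluster_id, distance) of the closest stored read within
    max_edits Levenshtein edits, or None if no stored read is that close."""
    best: Optional[Tuple[int, int]] = None

    def visit(node: TrieNode, base: str, prev_row: List[int]) -> None:
        nonlocal best
        # One Levenshtein row per trie level: row[j] is the distance between
        # the trie prefix ending at this node and the first j bases of the read.
        row = [prev_row[0] + 1]
        for j in range(1, len(read) + 1):
            row.append(min(
                row[j - 1] + 1,                           # read has an extra base
                prev_row[j] + 1,                          # read is missing a base
                prev_row[j - 1] + (read[j - 1] != base),  # match or substitution
            ))
        if node.cluster_id is not None and row[-1] <= max_edits:
            if best is None or row[-1] < best[1]:
                best = (node.cluster_id, row[-1])
        # Prune: if every entry exceeds the budget, no descendant can match.
        if min(row) <= max_edits:
            for next_base, child in node.children.items():
                visit(child, next_base, row)

    first_row = list(range(len(read) + 1))
    for next_base, child in root.children.items():
        visit(child, next_base, first_row)
    return best


def cluster_reads(reads: List[str], max_edits: int = 2) -> List[int]:
    """Greedy clustering: join the nearest stored representative if one lies
    within max_edits, otherwise start a new cluster with this read."""
    root = TrieNode()
    labels: List[int] = []
    next_id = 0
    for read in reads:
        hit = fuzzy_search(root, read, max_edits)
        if hit is None:
            insert(root, read, next_id)
            labels.append(next_id)
            next_id += 1
        else:
            labels.append(hit[0])
    return labels
```

For example, calling `cluster_reads(["ACGTACGT", "ACGTACGA", "ACGACGT", "ACGTTACGT", "TTTTGGGG"], max_edits=1)` places the first four reads (the original plus one substitution, one deletion, and one insertion) in one cluster and the last read in a second one. A full row-based search like this is simple but does more work per node than a depth-limited strategy would; it is shown only to illustrate the general shape of trie-based clustering under IDS errors.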

Subject/keywords

DNA storage, Indexing, Clustering, Trie, Levenshtein distance, Poucet search, Depth-limited search, Cluster merging

URI

http://hdl.handle.net/20.500.12380/306253

Collections

Master's theses
