Parallel construction of variable length Markov models for DNA sequences

dc.contributor.author: Qvick, Jan Rune
dc.contributor.department: Chalmers tekniska högskola / Institutionen för data och informationsteknik
dc.contributor.examiner: Damaschke, Peter
dc.contributor.supervisor: Schliep, Alexander
dc.date.accessioned: 2020-07-08T11:16:00Z
dc.date.available: 2020-07-08T11:16:00Z
dc.date.issued: 2020
dc.date.submitted: 2020
dc.description.abstract: Modern multi-core CPUs allow algorithms to be executed in parallel, but existing implementations do not always take advantage of this. This project investigates one such case: the construction of variable length Markov models (VLMC). The work builds upon the unpublished work of J. Gustafsson, a base implementation for the construction of VLMCs on DNA sequences. In addition to implementing a parallel variant, the project focuses on constructing models for large genomes, something not yet attempted within the base project. The report presents two potential practical parallel variants of this base implementation and early on selects the more promising one for further analysis. For the selected approach, multiple tests are performed to measure runtime, speedup and memory consumption. The load distribution is also analysed and reveals an opportunity for future improvement. The highest speedup was approximately a factor of 7 on 32 cores, compared to serial execution; this test used an input string of 22 GB. The memory footprint of the implementation, albeit high, is expected because of the adaptation to large input sizes.
dc.identifier.coursecode: DATX05
dc.identifier.uri: https://hdl.handle.net/20.500.12380/301401
dc.language.iso: eng
dc.setspec.uppsok: Technology
dc.subject: variable length Markov models
dc.subject: VLMC
dc.subject: parallel computation
dc.title: Parallel construction of variable length Markov models for DNA sequences
dc.type.degree: Master's thesis
dc.type.uppsok: H

Download

Original bundle

Name: CSE 20-19 Qvick.pdf
Size: 1.29 MB
Format: Adobe Portable Document Format
