Population Based Microsatellite Genotyping

Kristmundsdóttir, Snædís

dc.contributor.author	Kristmundsdóttir, Snædís
dc.contributor.department	Chalmers tekniska högskola / Institutionen för data- och informationsteknik (Chalmers)	sv
dc.contributor.department	Chalmers University of Technology / Department of Computer Science and Engineering (Chalmers)	en
dc.date.accessioned	2019-07-03T13:30:50Z
dc.date.available	2019-07-03T13:30:50Z
dc.date.issued	2014
dc.description.abstract	Microsatellites, also known as short tandem repeats (STRs), are short DNA sequences made up of repeated motifs of 2 to 6 bases. The number of repeats varies between individuals, and the repeat counts that occur in a population are known as the alleles of a microsatellite. Each individual carries two copies of each chromosome and hence two alleles of each microsatellite. At least 250,000 microsatellites have a known location on a human reference genome; the most common form is the dinucleotide repeat. Microsatellite analysis has a wide range of applications, including medical genetics, forensics and genetic genealogy. However, microsatellite variation is rarely considered in whole-genome sequencing studies, in large part because of a lack of tools capable of analyzing it. The goal of this thesis is to create a microsatellite genotype caller that is faster and more accurate than those previously presented. To accomplish this goal, two things were examined. First, we reduced by 87% the amount of sequencing data needed to create microsatellite profiles from previously aligned sequencing data. This was achieved by filtering the input to contain only reads aligned to known microsatellite locations, together with unaligned reads, as these should be the ones useful for profiling. The results indicate that, when profiling microsatellites from previously aligned data, running time can be reduced significantly with negligible effect on the resulting profile. Second, the accuracy of the microsatellite profiler was increased from 87.5% to 96.3%. The improvements included using population information to train microsatellite-specific and individual-specific error profiles. This was done by adding parameters to the model and by using sequencing data from multiple individuals to improve the parameter estimates.
Combining these two procedures, we were able to give a practical implementation of microsatellite genotyping that is both much faster and more accurate than previously presented solutions.
dc.identifier.uri	https://hdl.handle.net/20.500.12380/203063
dc.language.iso	eng
dc.setspec.uppsok	Technology
dc.subject	Data- och informationsvetenskap
dc.subject	Computer and Information Science
dc.title	Population Based Microsatellite Genotyping
dc.type.degree	Examensarbete för masterexamen	sv
dc.type.degree	Master Thesis	en
dc.type.uppsok	H
local.programme	Computer science – algorithms, languages and logic (MPALG), MSc
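
The read-filtering step described in the abstract (keep reads aligned to known microsatellite locations, plus unaligned reads) can be illustrated with a minimal sketch. This is not the thesis implementation: the `Read` record, the `(chrom, start, end)` locus tuples, and the function names are hypothetical, loci on one chromosome are assumed not to overlap each other, and coordinates are assumed to be 0-based half-open.

```python
from bisect import bisect_left
from typing import NamedTuple, Optional


class Read(NamedTuple):
    """Hypothetical minimal read record (not a real BAM/SAM API)."""
    name: str
    chrom: Optional[str]  # None marks an unaligned read
    start: int            # 0-based, inclusive
    end: int              # 0-based, exclusive


def build_index(loci):
    """Group loci by chromosome as parallel sorted lists of starts and ends."""
    by_chrom = {}
    for chrom, start, end in loci:
        by_chrom.setdefault(chrom, []).append((start, end))
    index = {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        index[chrom] = ([s for s, _ in intervals], [e for _, e in intervals])
    return index


def keep_read(read, index):
    """Keep unaligned reads and reads overlapping a known locus."""
    if read.chrom is None:
        return True
    if read.chrom not in index:
        return False
    starts, ends = index[read.chrom]
    # Number of loci that start before the read ends.
    i = bisect_left(starts, read.end)
    # With non-overlapping loci, only the last such locus can reach the read.
    return i > 0 and ends[i - 1] > read.start


def filter_reads(reads, loci):
    """Return the subset of reads that would be passed on to profiling."""
    index = build_index(loci)
    return [r for r in reads if keep_read(r, index)]


# Illustrative, made-up loci and reads.
loci = [("chr1", 100, 120), ("chr1", 500, 530), ("chr2", 50, 60)]
reads = [
    Read("a", "chr1", 90, 110),   # overlaps the first chr1 locus
    Read("b", "chr1", 200, 300),  # aligned, but between loci
    Read("c", None, 0, 0),        # unaligned
    Read("d", "chr2", 60, 100),   # starts exactly where the locus ends
    Read("e", "chr3", 0, 50),     # chromosome without known loci
    Read("f", "chr1", 529, 600),  # overlaps the second chr1 locus by one base
]
kept = [r.name for r in filter_reads(reads, loci)]
```

In practice the reads would come from an aligned sequencing file and the loci from a microsatellite reference set; the sketch shows only the selection rule.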

Original bundle

Name: 203063.pdf
Size: 1.16 MB
Format: Adobe Portable Document Format
Description: Fulltext

Collections

Master's theses