Bioinformatics and Statistical Methods for Identifying Enrichment of Functional Gene Classes in Telomeric Regions of Chromosomes

dc.contributor.author: Ahamed, Tanvir Mohammad
dc.contributor.department: Chalmers tekniska högskola / Institutionen för matematiska vetenskaper [sv]
dc.contributor.department: Chalmers University of Technology / Department of Mathematical Sciences [en]
dc.date.accessioned: 2019-07-03T13:20:53Z
dc.date.available: 2019-07-03T13:20:53Z
dc.date.issued: 2013
dc.description.abstract: It has been noted that the telomeric regions of Saccharomyces cerevisiae contain fewer essential genes than expected under random shuffling. Furthermore, silencing a single non-essential gene in the telomeric regions has, on average, less effect on viability than silencing a non-essential gene in other chromosomal regions. It has also been suggested that genes in the telomeric regions are less stable, with higher mutation and recombination rates. This could be an evolutionarily beneficial property for adapting genes to a changing environment, provided that backup systems exist for those genes. In this work, we examine several statistical properties of the telomeres and of the genes in the telomeric regions. The questions studied include: How dense is the coding sequence in the telomeric region compared with the rest of the genome? What length distribution do genes in the telomeric region have compared with the general length distribution? Which GO-annotated classes are over-represented in telomeres? Can we find protein sequence clusters that are over-represented in the telomeres? We found a number of interesting properties, and our results at least partly support the earlier suggestions. For future work, we suggest comparing the telomeric statistical properties found in Saccharomyces cerevisiae with those of other yeast species, such as Schizosaccharomyces pombe, which is evolutionarily distant enough to be genomically fairly reshuffled. As usual in multivariate statistics, the statistical properties are correlated (length correlates with viability, function, etc.) and causality is hard to deduce, but it may be easier to understand using more organisms. The main findings of the thesis are that there is less coding sequence in the extreme telomeric region and that, as a percentage, long essential genes are very rare in the telomeric region. Long non-essential genes are more numerous but still few compared with elsewhere. Of those that reside in the telomeric region, many are related to metal ion transport and to disaccharide and oligosaccharide metabolic and catabolic processes. The pipeline of methods used in this research also identifies gene functions related to helicase activity, which earlier research has pointed out.
dc.identifier.uri: https://hdl.handle.net/20.500.12380/193480
dc.language.iso: eng
dc.setspec.uppsok: PhysicsChemistryMaths
dc.subject: Matematisk statistik
dc.subject: Grundläggande vetenskaper
dc.subject: Mathematical statistics
dc.subject: Basic Sciences
dc.title: Bioinformatics and Statistical Methods for Identifying Enrichment of Functional Gene Classes in Telomeric Regions of Chromosomes
dc.type.degree: Examensarbete för masterexamen [sv]
dc.type.degree: Master Thesis [en]
dc.type.uppsok: H
local.programme: Bioinformatics and systems biology, MSc
Name: 193480.pdf
Size: 3.39 MB
Format: Adobe Portable Document Format
Description: Full text
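
The abstract asks which GO-annotated classes are over-represented in telomeres. One common way to pose that question is a one-sided hypergeometric test; the record does not say which test the thesis actually uses, so the following is only a minimal sketch under that assumption. The function name and all gene counts are hypothetical and are not taken from the thesis.

```python
from math import comb


def hypergeom_enrichment_pvalue(genome_size, class_size, region_size, class_in_region):
    """Upper-tail hypergeometric probability P(X >= class_in_region).

    Models the telomeric region as region_size genes drawn without
    replacement from genome_size genes, of which class_size carry the
    annotation of interest. (Assumed model, not the thesis's stated method.)
    """
    upper = min(class_size, region_size)
    total = comb(genome_size, region_size)
    tail = sum(
        comb(class_size, k) * comb(genome_size - class_size, region_size - k)
        for k in range(class_in_region, upper + 1)
    )
    return tail / total


# Hypothetical counts for illustration only: 6000 genes in the genome,
# 300 of them telomeric, 40 in the annotated class, 8 of those telomeric.
p_value = hypergeom_enrichment_pvalue(6000, 40, 300, 8)
print(f"enrichment p-value: {p_value:.3g}")
```

With these made-up counts, about 2 class members would be expected in the region by chance, so observing 8 gives a small p-value. When many GO classes are tested, the p-values would normally also need a multiple-testing correction.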