Discovery of subgroup dynamics in Glioblastoma multiforme using integrative clustering methods and multiple data types
Master's thesis
Computer science – algorithms, languages and logic (MPALG), MSc
Joint and Individual Variation Explained (JIVE), an integrative data mining method for multiple data types, and its existing sparse version, Sparse JIVE (sJIVE), are analysed and further extended. The proposed extension, called Fused Lasso JIVE (FLJIVE), integrates a Fused Lasso penalization framework into the JIVE method. A model selection tool for choosing the parameters of the JIVE model is also proposed. The new model selection algorithm and the three versions of the method, JIVE, sJIVE and FLJIVE, are analysed and compared in a simulation study and then applied to the TCGA Glioblastoma Multiforme Copy Number (CNA) data, which is known to have fused properties. The simulation study shows that the rank selection algorithm is successful and that FLJIVE is superior to JIVE and sJIVE when the data have underlying fused properties. The results of applying the methods to the TCGA data set suggest that large parts of the underlying mutational process are shared between chromosomes 7, 9 and 10. The results also suggest that chromosome 1 shares less of this process and that chromosome 15 is almost independent of it.
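For orientation, the decomposition and penalty named in the abstract can be written out as below. This is a sketch based on the standard JIVE model and the standard Fused Lasso penalty from the literature, not a formulation taken from the thesis itself. The symbols (X_i, J_i, A_i, E_i, r, r_i, v, \lambda_1, \lambda_2) are notation assumed here for illustration, and the exact way FLJIVE applies the penalty may differ.

```latex
% Standard JIVE model for k data blocks measured on the same n samples
% (assumed notation: X_i is a p_i x n data block)
X_i = J_i + A_i + E_i, \qquad i = 1, \dots, k

% J stacks the joint parts J_1, ..., J_k row-wise; A_i is the individual
% part of block i; E_i is noise. Joint and individual structure are
% low rank and orthogonal to each other:
\operatorname{rank}(J) = r, \qquad
\operatorname{rank}(A_i) = r_i, \qquad
J A_i^{\top} = 0

% Standard Fused Lasso penalty on an ordered vector v in R^p:
% the first term encourages sparsity, the second encourages
% neighbouring entries to be equal (piecewise-constant structure)
P(v) = \lambda_1 \sum_{j=1}^{p} |v_j|
     + \lambda_2 \sum_{j=2}^{p} |v_j - v_{j-1}|
```

The second term of the penalty is what makes the approach a natural fit for copy number data, where neighbouring genomic positions tend to share the same value. The ranks r and r_i are the kind of quantity a rank selection algorithm would need to choose.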
Computer and Information Science, Information & Communication Technology