Statistical Inference with Auxiliary Information under Block-Structured Missing Data
| dc.contributor.author | Holmberg, Linnéa | |
| dc.contributor.department | Chalmers tekniska högskola / Institutionen för matematiska vetenskaper | sv |
| dc.contributor.examiner | Jörnsten, Rebecka | |
| dc.contributor.supervisor | Imberg, Henrik | |
| dc.date.accessioned | 2026-01-22T14:34:57Z | |
| dc.date.issued | 2026 | |
| dc.date.submitted | ||
| dc.description.abstract | In medical research, a common challenge is missing data. Missing data can lead to biased findings and loss of precision if not handled appropriately. Common methods for handling missing data are complete-case analysis (CCA), multiple imputation (MI), and inverse probability weighting (IPW), but these methods have drawbacks. This thesis aims to compare these methods with augmented inverse probability of complete-case weighting (AIPCCW), a less established method with certain desirable theoretical properties. AIPCCW is an extension of inverse probability of complete-case weighting (IPCCW) and utilises information from both participants with fully observed data and participants with partly observed data. AIPCCW utilises two models, one for the outcome and one for missingness, and only one of them needs to be correctly specified for AIPCCW to achieve unbiased inference. This thesis implements and compares AIPCCW, CCA, MI, and IPCCW in different scenarios through a simulation study and an application study on real-world data. The scenarios cover unique combinations of sample size, proportion of missing data, levels of correlation between variables with missing values and with an auxiliary variable, and different missingness mechanisms. In our experiments, AIPCCW performs statistically significantly better than CCA, MI, and IPCCW in terms of bias and eRMSE in certain scenarios, especially simulated scenarios with a large proportion of missing data. The performance of AIPCCW improves significantly in scenarios with a higher correlation between the variable with missing values and the auxiliary variable. On the other hand, AIPCCW does not outperform CCA, MI, and IPCCW in a majority of the implemented scenarios. 
AIPCCW performed comparably to CCA and IPCCW on real-world data in this study, but AIPCCW could potentially perform better on real-world data if a stronger correlation between the variable with missing data and the auxiliary variable existed. Owing to these results, the evaluation is inconclusive as to whether AIPCCW is significantly better than CCA, MI, and IPCCW. This thesis concludes that AIPCCW is a stable method, but does not necessarily recommend it over more common methods. Further research is needed. | |
| dc.identifier.coursecode | MVEX03 | |
| dc.identifier.uri | http://hdl.handle.net/20.500.12380/310938 | |
| dc.language.iso | eng | |
| dc.setspec.uppsok | PhysicsChemistryMaths | |
| dc.subject | Augmented inverse probability weighting, block-structured missing data, coarsened data, auxiliary information, missing data, statistical inference, doubly robust methods. | |
| dc.title | Statistical Inference with Auxiliary Information under Block-Structured Missing Data | |
| dc.type.degree | Examensarbete för masterexamen | sv |
| dc.type.degree | Master's Thesis | en |
| dc.type.uppsok | H | |
| local.programme | Engineering mathematics and computational science (MPENM), MSc |
