Fracture risk prediction using multimodal neural networks
Published
Author
Type
Master's Thesis
Abstract
Osteoporosis, characterized by reduced bone mass and micro-architectural deterioration, increases the risk of fractures, particularly in elderly populations. Traditional fracture risk prediction models rely on clinical risk factors but do not incorporate imaging data, potentially overlooking structural indicators. This thesis explores the potential of multimodal learning by integrating spinal X-ray images with clinical risk factors to improve fracture risk prediction. Both full spinal radiographs and cropped vertebral images are evaluated to investigate whether focusing on anatomically relevant regions can enhance the predictive signal. Two deep learning architectures, convolutional neural networks (CNNs) and vision transformers (ViTs), are considered, alongside multiple fusion strategies for combining image and tabular data. Results demonstrate that multimodal models consistently outperform baselines, with the best performance achieved by a CNN using vertebral crops and intermediate fusion (C-index: 0.69, AUC: 0.76, Brier score: 0.14). This suggests that image data alone contain meaningful predictive information, and that combining imaging with clinical features enhances fracture risk prediction. Using vertebral crops as input generally yielded better performance than using full radiographs, highlighting the importance of localized features. However, the models were evaluated on a single dataset of elderly Caucasian women, indicating the need for future work to assess generalization across diverse populations.
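
To make the intermediate-fusion idea concrete, the sketch below shows one way such a model could be assembled in PyTorch: a CNN backbone encodes the vertebral crop, a small MLP encodes the tabular clinical risk factors, and the two embeddings are concatenated before a shared head produces a fracture risk score. The ResNet-18 backbone, layer sizes, number of clinical features, and single-logit output are illustrative assumptions, not details taken from the thesis.

# Illustrative sketch of an intermediate-fusion multimodal model (not the
# thesis's actual architecture): a CNN encodes the vertebral crop, an MLP
# encodes the clinical risk factors, and the two embeddings are concatenated
# before a shared prediction head outputs a fracture risk score.
import torch
import torch.nn as nn
from torchvision import models


class IntermediateFusionNet(nn.Module):
    def __init__(self, num_clinical_features: int, embed_dim: int = 128):
        super().__init__()
        # Image branch: ResNet-18 backbone (torchvision >= 0.13 API) with its
        # classifier replaced by a linear projection to the shared embedding size.
        backbone = models.resnet18(weights=None)
        backbone.fc = nn.Linear(backbone.fc.in_features, embed_dim)
        self.image_encoder = backbone

        # Tabular branch: small MLP over the clinical risk factors.
        self.clinical_encoder = nn.Sequential(
            nn.Linear(num_clinical_features, 64),
            nn.ReLU(),
            nn.Linear(64, embed_dim),
        )

        # Fusion head: concatenated embeddings -> single risk score (logit).
        self.head = nn.Sequential(
            nn.Linear(2 * embed_dim, 64),
            nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, image: torch.Tensor, clinical: torch.Tensor) -> torch.Tensor:
        img_emb = self.image_encoder(image)         # (B, embed_dim)
        clin_emb = self.clinical_encoder(clinical)  # (B, embed_dim)
        fused = torch.cat([img_emb, clin_emb], dim=1)
        return self.head(fused).squeeze(1)          # (B,) risk logits


# Example forward pass with dummy data (batch of 4, 10 hypothetical clinical features).
model = IntermediateFusionNet(num_clinical_features=10)
images = torch.randn(4, 3, 224, 224)   # vertebral crops resized to 224x224
clinical = torch.randn(4, 10)          # clinical risk factors
risk_logits = model(images, clinical)

Fusing learned embeddings in this way, rather than concatenating raw inputs (early fusion) or averaging separate model outputs (late fusion), is what characterizes the intermediate-fusion strategy referred to in the abstract.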