Holistic Diagnosis via Multimodal Foundation Models

Type

Master's Thesis

Abstract

Healthcare data comes in many different forms, or modalities: X-ray images, time series of measurements such as heart rate or blood pressure, textual data from notes, and so on. Medical practitioners use many of these modalities every day to make informed and sound decisions. Given the recent success of small and large language models, it is natural to try to extend them with multimodal capabilities for the healthcare domain. This thesis investigates how well small language models can perform on predictive tasks in healthcare using multimodal data. To explore this, projectors were developed that map data from different sources into the embedding space of a language model. The results show that a multimodal language model performs better than a single-source version, but it is still outperformed by an XGBoost model. Despite this, the proposed model shows promise with regard to generalizability, potentially streamlining predictive tasks in healthcare. The thesis argues that, although improvements are needed and the challenges involved can be difficult to handle, further advances could support medical practitioners very efficiently.
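
The idea of projecting non-text data into a language model's embedding space can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names, dimensions, and input values below are hypothetical, the weights are random rather than trained, and a real projector would typically sit on top of a modality-specific encoder. The sketch only shows the general shape of the technique: each modality's feature vector is mapped by a linear layer to a few "soft tokens" of the model's embedding width, and these are placed in front of the text token embeddings.

```python
import random

def make_projector(d_in, d_model, n_tokens, seed=0):
    """Create an untrained linear projector (hypothetical helper).

    Maps a d_in-dimensional feature vector to n_tokens vectors,
    each of size d_model (the language model's embedding width).
    """
    rng = random.Random(seed)
    scale = 1.0 / (d_in ** 0.5)
    # weights[t][j][i]: soft token t, output dimension j, input dimension i
    weights = [[[rng.gauss(0.0, scale) for _ in range(d_in)]
                for _ in range(d_model)]
               for _ in range(n_tokens)]
    biases = [[0.0] * d_model for _ in range(n_tokens)]
    return weights, biases

def project(features, projector):
    """Apply the linear map: one output vector per soft token."""
    weights, biases = projector
    return [
        [sum(w * x for w, x in zip(row, features)) + b
         for row, b in zip(token_w, token_b)]
        for token_w, token_b in zip(weights, biases)
    ]

def build_input_sequence(text_embeddings, modality_tokens):
    """Prepend all projected modality tokens to the text embeddings."""
    sequence = []
    for soft_tokens in modality_tokens:
        sequence.extend(soft_tokens)
    sequence.extend(text_embeddings)
    return sequence

# Illustrative values only; none of these come from the thesis.
D_MODEL = 8
vitals = [72.0, 120.0, 80.0]            # e.g. heart rate, blood pressure
image_features = [0.1, -0.4, 0.7, 0.2]  # stand-in for image encoder output
text_embeddings = [[0.0] * D_MODEL for _ in range(5)]  # 5 text tokens

vitals_proj = make_projector(d_in=3, d_model=D_MODEL, n_tokens=1, seed=1)
image_proj = make_projector(d_in=4, d_model=D_MODEL, n_tokens=2, seed=2)

sequence = build_input_sequence(
    text_embeddings,
    [project(image_features, image_proj), project(vitals, vitals_proj)],
)
print(len(sequence), len(sequence[0]))
```

In this sketch the language model would receive a sequence of 2 image tokens, 1 vitals token, and 5 text tokens, all of the same width, so it can attend over every modality in a single pass.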

Subject/keywords

ML, Language Models, Healthcare, Multi-label Classification, SHAP
