Weakly Supervised Deep Learning Classification

Claesson, Carl; Johnsson, Fredrik

dc.contributor.author	Claesson, Carl
dc.contributor.author	Johnsson, Fredrik
dc.contributor.department	Chalmers tekniska högskola / Institutionen för fysik	sv
dc.contributor.examiner	Volpe, Giovanni
dc.contributor.supervisor	Kjellberg, Magnus
dc.date.accessioned	2021-06-17T11:40:42Z
dc.date.available	2021-06-17T11:40:42Z
dc.date.issued	2021	sv
dc.date.submitted	2020
dc.description.abstract	The use of increasingly large and complex datasets is rapidly gaining traction within healthcare and life sciences. Handling these datasets calls for more sophisticated methods. A key such method is artificial intelligence (AI). There are numerous examples of successful applications of AI in healthcare, especially in diagnostic disciplines, e.g., automatic analysis of X-ray images, treatment recommendations, and adherence monitoring [25]. In some of these disciplines, AI has been demonstrated to outperform humans. AI is therefore receiving more and more attention as a way to increase efficiency and safety in healthcare. A key hindrance to the adoption of such systems is the large quantity of labeled data required to train deep learning models. One proposed method of overcoming this annotation bottleneck is weak supervision, or data programming, where the data annotation is done using labeling functions. These labeling functions, combined through statistical models, translate the annotator's expert domain knowledge into “denoised” or probabilistic labels that can be used to train deep learning algorithms without ground truth data provided by an expert annotator. This thesis investigates weak supervision for concept classification from electronic health records. We describe the development of a distant supervision method, where the external medical database MeSH is used to create labeling functions for different phenotypes (concepts) from the MIMIC-III database [20]. These labeling functions are then used to create probabilistic labels on which a few different deep learning models are trained. A deep CNN model trained on the probabilistic labels from the labeling functions achieves an F1-score of 0.93 on the test set and is clearly able to generalize beyond the probabilistic labels it is trained on.
We conclude that weak supervision is a promising approach for NLP problems within the medical field, one that could drastically decrease the need for expert annotation, which is both time-consuming and expensive.	sv
dc.identifier.coursecode	TIFX05	sv
dc.identifier.uri	https://hdl.handle.net/20.500.12380/302595
dc.language.iso	eng	sv
dc.setspec.uppsok	PhysicsChemistryMaths
dc.subject	weak supervision	sv
dc.subject	deep learning	sv
dc.subject	machine learning	sv
dc.subject	NLP	sv
dc.title	Weakly Supervised Deep Learning Classification	sv
dc.type.degree	Examensarbete för masterexamen (Master's thesis)	sv
dc.type.uppsok	H
local.programme	Complex adaptive systems (MPCAS), MSc
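
The labeling-function idea summarized in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the keyword list stands in for terms that might be drawn from a vocabulary such as MeSH, the function names are hypothetical, and the votes are combined by plain averaging rather than by the statistical label model the thesis refers to.

```python
import re

# Vote values a labeling function may return.
ABSTAIN, NEGATIVE, POSITIVE = -1, 0, 1

# Hypothetical term list standing in for concept terms from an external vocabulary.
CONCEPT_TERMS = {"diabetes", "hyperglycemia", "insulin"}
# Hypothetical negation cues; real clinical text needs far more careful handling.
NEGATION_CUES = ("no history of", "denies", "negative for")


def tokens(note):
    """Lowercase the note and return its set of alphabetic word tokens."""
    return set(re.findall(r"[a-z]+", note.lower()))


def lf_term_present(note):
    """Vote POSITIVE if any concept term appears, otherwise abstain."""
    return POSITIVE if tokens(note) & CONCEPT_TERMS else ABSTAIN


def lf_negated_term(note):
    """Vote NEGATIVE if a concept term directly follows a negation cue."""
    text = note.lower()
    for cue in NEGATION_CUES:
        for term in CONCEPT_TERMS:
            if f"{cue} {term}" in text:
                return NEGATIVE
    return ABSTAIN


def lf_no_term(note):
    """Vote NEGATIVE if no concept term appears at all, otherwise abstain."""
    return NEGATIVE if not (tokens(note) & CONCEPT_TERMS) else ABSTAIN


LABELING_FUNCTIONS = [lf_term_present, lf_negated_term, lf_no_term]


def probabilistic_label(note):
    """Average the non-abstaining votes into a soft label in [0, 1].

    This is a simple stand-in for a learned label model: it weights every
    labeling function equally and ignores correlations between them.
    """
    votes = [lf(note) for lf in LABELING_FUNCTIONS]
    active = [v for v in votes if v != ABSTAIN]
    if not active:
        return 0.5  # no evidence either way
    return sum(active) / len(active)


if __name__ == "__main__":
    for note in (
        "Patient started on insulin for diabetes.",
        "Patient denies diabetes.",
        "Chest pain on exertion.",
    ):
        print(f"{probabilistic_label(note):.2f}  {note}")
```

Soft labels of this kind could then serve as training targets for a downstream classifier, which is the role the probabilistic labels play in the abstract.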

Original bundle

Name: masters_thesis_Johnsson_Claesson.pdf
Size: 1.07 MB
Format: Adobe Portable Document Format

Collections

Master's theses