Applying Conformal Prediction for LLM Multi-Label Text Classification

dc.contributor.author: Örnbratt, Viktor
dc.contributor.department: Chalmers tekniska högskola / Institutionen för fysik [sv]
dc.contributor.department: Chalmers University of Technology / Department of Physics [en]
dc.contributor.examiner: Volpe, Giovanni
dc.contributor.supervisor: Hallberg Szabadváry, Johan
dc.date.accessioned: 2025-09-26T13:42:46Z
dc.date.issued: 2025
dc.date.submitted:
dc.description.abstract: This thesis investigates how conformal prediction can be used to improve the robustness and interpretability of multi-label text classification with large language models (LLMs). Using a dataset of Wikipedia comments annotated for multiple types of toxicity, a binary relevance approach is combined with inductive conformal prediction to produce label-wise prediction sets with formal coverage guarantees. Two data splitting strategies are explored to study the trade-off between model accuracy and calibration quality: one prioritising LLM fine-tuning, the other prioritising calibration set size. Results show that conformal prediction enables meaningful uncertainty quantification, including abstention on ambiguous inputs, while maintaining reliable coverage across a range of significance levels. The analysis also highlights challenges related to rare labels, label imbalance, and the sensitivity of validity guarantees to shifts in annotation quality and dataset distribution over time. Overall, the study supports the practical use of conformal prediction as a safeguard mechanism for LLM-based classifiers, especially in settings where predictive reliability and human oversight are both critical.
dc.identifier.coursecode: TIFX05
dc.identifier.uri: http://hdl.handle.net/20.500.12380/310557
dc.language.iso: eng
dc.setspec.uppsok: PhysicsChemistryMaths
dc.subject: Large Language Models, Conformal Prediction, Multi-label Conformal Prediction, Uncertainty Quantification, Text Classification
dc.title: Applying Conformal Prediction for LLM Multi-Label Text Classification
dc.type.degree: Examensarbete för masterexamen [sv]
dc.type.degree: Master's Thesis [en]
dc.type.uppsok: H
local.programme: Complex adaptive systems (MPCAS), MSc
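
The abstract describes combining binary relevance with inductive conformal prediction to obtain label-wise prediction sets. The following is a minimal sketch of that general idea, not the thesis's implementation. It assumes a classifier that outputs a probability per label, uses one minus the probability of the candidate value as the nonconformity score, and pools calibration scores over both classes (no class-conditional calibration). The function names `conformal_p_values` and `prediction_sets` are hypothetical.

```python
import numpy as np


def conformal_p_values(cal_probs, cal_labels, test_probs):
    """Label-wise inductive conformal p-values (illustrative sketch).

    cal_probs:  (n_cal, n_labels) predicted P(label = 1) on the calibration set
    cal_labels: (n_cal, n_labels) true 0/1 labels for the calibration set
    test_probs: (n_test, n_labels) predicted P(label = 1) on test inputs
    Returns an array of shape (n_test, n_labels, 2) holding the p-value of
    candidate value 0 and candidate value 1 for every test input and label.
    """
    cal_probs = np.asarray(cal_probs, dtype=float)
    cal_labels = np.asarray(cal_labels, dtype=int)
    test_probs = np.asarray(test_probs, dtype=float)
    n_cal, n_labels = cal_probs.shape

    # Assumed nonconformity score: 1 - probability given to the true value.
    cal_scores = np.where(cal_labels == 1, 1.0 - cal_probs, cal_probs)

    p_values = np.empty((test_probs.shape[0], n_labels, 2))
    for j in range(n_labels):
        for y in (0, 1):
            # Score of each test input if its value for label j were y.
            test_scores = test_probs[:, j] if y == 0 else 1.0 - test_probs[:, j]
            # Count calibration scores at least as large as each test score.
            n_ge = (cal_scores[None, :, j] >= test_scores[:, None]).sum(axis=1)
            p_values[:, j, y] = (n_ge + 1) / (n_cal + 1)
    return p_values


def prediction_sets(p_values, epsilon):
    """Boolean mask: candidate value y is in the set when its p-value > epsilon."""
    return p_values > epsilon
```

If calibration and test examples are exchangeable, each label's set contains the true value with probability at least 1 - epsilon. A set containing both 0 and 1 can be read as the classifier abstaining on that label; an empty set means neither value conforms at the chosen significance level.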

Download

Original bundle

Name: Viktor_Örnbratt.pdf
Size: 5.03 MB
Format: Adobe Portable Document Format
