Sample-efficient machine learning with auxiliary information

Type

Master's Thesis

Abstract

This thesis proposes a Learning using Privileged Mediating Information (LuPI) algorithm based on a directed Gaussian graphical model. By constructing a causal directed acyclic graph (DAG) that contains mediating variables, it shows that, when the causal structure is known, LuPI has better statistical properties than the Ordinary Least Squares (OLS) model. Using the Rao-Blackwell theorem, it is shown theoretically that LuPI can reduce the mean squared error (MSE) and the expected risk. In the experimental part, the improvement of LuPI over OLS is verified on a synthetic dataset across different noise levels and sample sizes; the improvement is most pronounced under high noise and small sample sizes. The experiments also investigate how bias in the estimated graph affects the algorithm's performance, and the results show that appropriately removing redundant edges from the causal graph helps reduce the variance, which in turn improves the overall performance of the model. Finally, experiments on real datasets further demonstrate the advantage of LuPI at small sample sizes and support its practical value for data with complex causal structure.
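The comparison described in the abstract can be illustrated with a small simulation. The sketch below is not the thesis's LuPI algorithm; it is a minimal stand-in that assumes a linear Gaussian chain X -> Z -> Y in which the mediators Z are observed only at training time. The dimensions, noise levels, and the function names `simulate`, `fit_ols`, `fit_two_stage`, and `compare` are all hypothetical choices made for illustration.

```python
import numpy as np


def simulate(n, A, b, sigma_z, sigma_y, rng):
    """Draw n samples from an assumed linear Gaussian chain X -> Z -> Y."""
    d, k = A.shape
    X = rng.normal(size=(n, d))
    Z = X @ A + sigma_z * rng.normal(size=(n, k))   # mediators (privileged)
    y = Z @ b + sigma_y * rng.normal(size=n)        # outcome
    return X, Z, y


def fit_ols(X, y):
    """Baseline: regress Y directly on X, ignoring the mediators."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


def fit_two_stage(X, Z, y):
    """Mediator-based estimate: regress Z on X and Y on Z, then compose."""
    A_hat = np.linalg.lstsq(X, Z, rcond=None)[0]    # shape (d, k)
    b_hat = np.linalg.lstsq(Z, y, rcond=None)[0]    # shape (k,)
    return A_hat @ b_hat                            # shape (d,), uses X only at test time


def compare(n=30, d=10, k=2, sigma_z=0.5, sigma_y=2.0, trials=200, seed=0):
    """Average squared coefficient error of both estimators over repeated draws."""
    rng = np.random.default_rng(seed)
    A = rng.normal(size=(d, k))
    b = rng.normal(size=k)
    beta = A @ b                                    # true total effect of X on Y
    err_ols, err_two = 0.0, 0.0
    for _ in range(trials):
        X, Z, y = simulate(n, A, b, sigma_z, sigma_y, rng)
        err_ols += np.sum((fit_ols(X, y) - beta) ** 2)
        err_two += np.sum((fit_two_stage(X, Z, y) - beta) ** 2)
    return err_ols / trials, err_two / trials


if __name__ == "__main__":
    mse_ols, mse_two = compare()
    print(f"OLS coefficient MSE:       {mse_ols:.3f}")
    print(f"two-stage coefficient MSE: {mse_two:.3f}")
```

With a small sample and a noisy outcome, as assumed here, the composed estimate should show the lower coefficient error, because the noisy outcome is regressed on only a few mediators rather than on all inputs. Varying `n` and `sigma_y` in `compare` gives a rough feel for the noise and sample-size dependence the abstract reports.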

Subject/keywords

Learning using Privileged Information, Directed Gaussian Graphical Model, Linear Regression, Causal Analysis
