Mixture-of-Experts Architectures Through the Lens of Continual Learning


Type

Master's Thesis

Abstract

Mixture-of-experts (MoE) architectures built on a vision transformer backbone are compared against standard architectures for image classification in continual learning challenges, under the constraints found in autonomous vehicle onboard systems. A novel routing algorithm is presented to improve MoE performance in this setting. Domain-incremental learning without domain labels and class-imbalanced datasets are evaluated with continual learning and imbalanced learning metrics to describe when MoE architectures become useful and which advantages and drawbacks should be considered. Results show that MoE should be used on highly complex datasets, with domain-focused routing to improve the architecture's natural resistance to catastrophic forgetting; with current MoE strategies, however, large gains are not yet realized. Suggestions for strategies to pair with MoE for continual learning are given, alongside guidance for MoE training in this environment.

Subject/keywords

Image classification, mixture of experts, deep learning, continual learning, domain incremental learning, new instance classification, vision transformers, geometric router
