Training Multi-Tasking Neural Network using ADMM: Analysing Autoencoder-Based Semi-Supervised Learning
Type
Master's thesis
Abstract
An autoencoder is a neural network for unsupervised learning that consists of two
parts: an encoder and a decoder. The encoder takes data as input, while the decoder
takes the encoder output as input. The learning task for the autoencoder is to reconstruct
the data at the decoder output, even though the dimensionality of the encoder output
is smaller than that of the data. In this project, a neural network for classification,
i.e., a discriminator, is trained together with an autoencoder by minimizing the
sum of the loss functions of the two networks. We also add the constraint that each
parameter of the encoder should equal the corresponding parameter of the discriminator.
This corresponds to established semi-supervised methods, which improve
classification results when only a fraction of the observations are labelled. In this
work, we implement training by employing the Alternating Direction Method of Multipliers
(ADMM), which allows the networks to be trained in a distributed manner.
Distributed training may be motivated by privacy protection or efficiency.
Since ADMM has mainly been used for convex distributed optimization, some adjustments
are proposed to make it applicable to the non-convex problem of training
neural networks. The most important change is that exact minimizations within
ADMM are replaced by a number of Stochastic Gradient Descent (SGD) steps, where the
number of steps increases linearly with the ADMM iteration count. The method is experimentally
evaluated on two datasets: the so-called two-dimensional interleaving
half-moons and instances from the MNIST database of handwritten digits. The results
show that our suggested method can improve classification results, performing at
least as well as unsupervised pretraining.
Subject/keywords
semi-supervised learning, distributed machine learning, deep learning, autoencoder, ADMM