Sub-networks and Spectral Anisotropy in Deep Neural Networks
Type
Master's Thesis
Abstract
Deep neural networks (DNNs) have achieved remarkable success across diverse domains,
yet the fundamental reasons behind their efficacy and ability to generalize
remain elusive. This thesis examines how over-parameterized DNNs learn and generalize
by investigating two interconnected phenomena: the emergence of sparse,
critical sub-networks (aligned with the Lottery Ticket Hypothesis) and structural symmetry breaking. Additionally, we explore the geometric structure of the
parameter space, with a particular focus on the anisotropy of the Fisher Information
Matrix (FIM) spectrum.
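For concreteness, the FIM referred to throughout is, in its standard form, the expected outer product of the score function (the thesis may instead work with the empirical variant averaged over training data, which shares the spectral behavior discussed below):

```latex
F(\theta) = \mathbb{E}_{x \sim \mathcal{D},\, y \sim p_\theta(\cdot \mid x)}
  \left[ \nabla_\theta \log p_\theta(y \mid x)\,
         \nabla_\theta \log p_\theta(y \mid x)^{\top} \right]
```

Anisotropy of the spectrum means that a few eigenvalues of F(θ) are much larger than the rest, so the model is far more sensitive to perturbations along the corresponding eigendirections than along all others.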
We demonstrate that different layers in a deep network exhibit varying degrees of
symmetry breaking, which we link to the presence of sub-networks that encapsulate
the model’s core representational capacity. Using two distinct criteria, magnitude-based and change-based, we identify critical sub-networks and show that, despite
the over-parameterization of DNNs, these sparse sub-networks play a central role in
achieving high performance.
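A minimal sketch of how such masks could be computed, assuming PyTorch-style weight tensors; the top-k thresholding and the function names (magnitude_mask, change_mask) are illustrative choices, not the thesis's exact procedure:

```python
import torch

def magnitude_mask(weight: torch.Tensor, sparsity: float) -> torch.Tensor:
    """Keep the (1 - sparsity) fraction of weights with the largest |w|,
    in the spirit of Lottery-Ticket-style magnitude pruning."""
    scores = weight.abs().flatten()
    k = max(1, int(scores.numel() * (1.0 - sparsity)))
    threshold = scores.topk(k).values.min()  # k-th largest magnitude
    return weight.abs() >= threshold

def change_mask(w_init: torch.Tensor, w_final: torch.Tensor,
                sparsity: float) -> torch.Tensor:
    """Keep the weights that moved the most during training,
    scored by |w_final - w_init|."""
    scores = (w_final - w_init).abs().flatten()
    k = max(1, int(scores.numel() * (1.0 - sparsity)))
    threshold = scores.topk(k).values.min()
    return (w_final - w_init).abs() >= threshold
```

Evaluating the sub-network then amounts to multiplying each weight tensor elementwise by its mask and measuring the pruned model's accuracy.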
By analyzing the spectrum of the FIM, we reveal that DNNs evolve along a limited number of dominant eigendirections, which span a low-dimensional subspace toward which the training dynamics converge. This finding highlights an intrinsic anisotropy of the parameter manifold.
Furthermore, we investigate how this anisotropy correlates with the emergence of
sub-networks and the internal structure of the subspace.
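One way such an analysis can be carried out is to estimate the empirical FIM on a batch and inspect its eigenvalue decay; the sketch below uses the empirical-Fisher approximation with a cross-entropy loss and a dense eigendecomposition, assumptions that only scale to small models and need not match the thesis's exact estimator:

```python
import torch

def empirical_fim(model: torch.nn.Module, inputs: torch.Tensor,
                  targets: torch.Tensor) -> torch.Tensor:
    """Empirical Fisher: average outer product of per-sample loss gradients.
    The dense (n_params x n_params) matrix is only tractable for small models."""
    params = [p for p in model.parameters() if p.requires_grad]
    n = sum(p.numel() for p in params)
    fim = torch.zeros(n, n)
    for x, y in zip(inputs, targets):
        loss = torch.nn.functional.cross_entropy(
            model(x.unsqueeze(0)), y.unsqueeze(0))
        grads = torch.autograd.grad(loss, params)
        g = torch.cat([gr.flatten() for gr in grads])
        fim += torch.outer(g, g)
    return fim / len(inputs)

# A sharply decaying spectrum (a few dominant eigenvalues) is the anisotropy
# signature: training is driven by a low-dimensional dominant subspace.
# eigvals = torch.linalg.eigvalsh(empirical_fim(model, x_batch, y_batch))
```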
Overall, this thesis provides a novel perspective on the roles of implicit regularization,
loss landscape geometry, and sparse substructures in modern deep neural
networks, offering insights into the geometric nature of DNNs.
Keywords
Deep Neural Networks, Information Geometry, Generalization, Spectral Analysis, Lottery Ticket Hypothesis