Sub-networks and Spectral Anisotropy in Deep Neural Networks
dc.contributor.author | Ge, Hanwen | |
dc.contributor.department | Chalmers tekniska högskola / Institutionen för matematiska vetenskaper | sv |
dc.contributor.examiner | Jonasson, Johan | |
dc.contributor.supervisor | Gerken, Jan | |
dc.date.accessioned | 2025-04-14T12:39:40Z | |
dc.date.issued | 2025 | |
dc.date.submitted | ||
dc.description.abstract | Deep neural networks (DNNs) have achieved remarkable success across diverse domains, yet the fundamental reasons behind their efficacy and ability to generalize remain elusive. This thesis examines how over-parameterized DNNs learn and generalize by investigating two interconnected phenomena: the emergence of sparse, critical sub-networks (in line with the Lottery Ticket Hypothesis) and structural symmetry breaking. In addition, we explore the geometric structure of the parameter space, with a particular focus on the anisotropy of the Fisher Information Matrix (FIM) spectrum. We demonstrate that different layers of a deep network exhibit varying degrees of symmetry breaking, which we link to the presence of sub-networks that encapsulate the model's core representational capacity. Using two distinct criteria, magnitude-based and change-based, we identify critical sub-networks and show that, despite the over-parameterization of DNNs, these sparse sub-networks play a central role in achieving high performance. By analyzing the spectrum of the FIM, we reveal that DNNs evolve along a limited number of dominant eigendirections, spanning a subspace in which the training dynamics converge. This finding highlights an intrinsic anisotropy of the parameter manifold. Furthermore, we investigate how this anisotropy correlates with the emergence of sub-networks and with the internal structure of the subspace. Overall, this thesis offers a novel perspective on the roles of implicit regularization, loss-landscape geometry, and sparse substructures in modern deep neural networks, providing insight into their geometric nature. | |
dc.identifier.coursecode | MVEX03 | |
dc.identifier.uri | http://hdl.handle.net/20.500.12380/309268 | |
dc.language.iso | eng | |
dc.setspec.uppsok | PhysicsChemistryMaths | |
dc.subject | Deep Neural Networks, Information Geometry, Generalization, Spectral Analysis, Lottery Ticket Hypothesis | |
dc.title | Sub-networks and Spectral Anisotropy in Deep Neural Networks | |
dc.type.degree | Examensarbete för masterexamen | sv |
dc.type.degree | Master's Thesis | en |
dc.type.uppsok | H | |
local.programme | Complex adaptive systems (MPCAS), MSc |
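The abstract above mentions a magnitude-based criterion for identifying critical sub-networks. As an illustration only, and not a reproduction of the thesis's exact procedure, the sketch below shows global magnitude pruning in the Lottery Ticket style using PyTorch; the function name magnitude_mask, the small MLP, and the 90% sparsity level are assumptions made for the example.

```python
import torch
import torch.nn as nn


def magnitude_mask(model: nn.Module, sparsity: float = 0.9) -> dict:
    """Return binary masks that keep the largest-magnitude weights.

    A single global threshold is computed over all weight tensors;
    weights whose absolute value falls below it are masked out,
    leaving roughly the top (1 - sparsity) fraction of weights.
    """
    all_weights = torch.cat([
        p.detach().abs().flatten()
        for name, p in model.named_parameters()
        if "weight" in name
    ])
    threshold = torch.quantile(all_weights, sparsity)
    return {
        name: (p.detach().abs() > threshold).float()
        for name, p in model.named_parameters()
        if "weight" in name
    }


# Example: a small MLP pruned to 90% global sparsity (illustrative sizes).
mlp = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
masks = magnitude_mask(mlp, sparsity=0.9)
kept = sum(m.sum().item() for m in masks.values())
total = sum(m.numel() for m in masks.values())
print(f"kept {kept:.0f} of {total} weights ({kept / total:.1%})")
```

The masks define a sparse sub-network; in a Lottery-Ticket-style experiment they would be applied to the weights (and kept fixed) while the remaining parameters are retrained or evaluated.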