Controllable Gaze and Head-Pose Redirection via Latent Disentanglement in Convolutional Autoencoders

Blohm, Eric; Harari, Nadav

dc.contributor.author	Blohm, Eric
dc.contributor.author	Harari, Nadav
dc.contributor.department	Chalmers tekniska högskola / Institutionen för elektroteknik	sv
dc.contributor.examiner	Fredriksson, Jonas
dc.contributor.supervisor	Dahl, John
dc.date.accessioned	2026-06-25T20:00:18Z
dc.date.issued	2026
dc.date.submitted
dc.description.abstract	Driver Monitoring Systems (DMS) increasingly rely on gaze and head pose estimation to assess driver attention and detect unsafe states. However, existing datasets are dominated by common driving patterns while rare yet safety-critical behaviors occur irregularly and are difficult to capture systematically. This motivates the use of synthetic and controllable image generation to improve robustness and validation. This thesis investigates whether gaze direction and head pose can be controllably manipulated in image space through autoencoder-based latent disentanglement. A custom data collection procedure is developed to enable dense and geometrically consistent supervision of gaze and head pose, supporting controlled learning of latent factors. Based on this data, convolutional autoencoders are trained using a latent-swapping strategy and explicit label supervision to encode gaze and head pose into interpretable latent dimensions. In addition, a Laplacian-based edge loss is introduced to improve preservation of high-frequency image details. The results demonstrate consistent and interpretable control of gaze and head pose within the training distribution. The model achieves high reconstruction quality and preserves fine-scale features such as corneal reflections, verified through a dedicated detection pipeline. For unseen identities, coherent eye-region structure and meaningful gaze and head pose variations are retained, though distortions in other image regions and lower evaluation scores reveal limited out-of-distribution generalization. The results highlight both the potential and the limitations of deterministic autoencoders, motivating future work on improved realism and generalization.
dc.identifier.coursecode	EENX30
dc.identifier.uri	https://hdl.handle.net/20.500.12380/311543
dc.language.iso	eng
dc.setspec.uppsok	Technology
dc.subject	Deep learning
dc.subject	Autoencoder
dc.subject	DMS
dc.subject	Latent Control
dc.subject	Gaze
dc.subject	Image Generation
dc.subject	Latent Disentanglement
dc.title	Controllable Gaze and Head-Pose Redirection via Latent Disentanglement in Convolutional Autoencoders
dc.type.degree	Examensarbete för masterexamen	sv
dc.type.degree	Master's Thesis	en
dc.type.uppsok	H
local.programme	Complex adaptive systems (MPCAS), MSc

Original bundle

Name: Master_Thesis_Nadav and Eric.pdf
Size: 74.22 MB
Format: Adobe Portable Document Format

Collections

Master's Theses