December 10, 2025
PhD defense of Zhengyang Yu
Learning to see and deep learning
Zhengyang Yu defended his PhD thesis in the group of Jochen Triesch on December 10th. “We hope to build a bridge between infant vision and deep learning,” Yu says. In his thesis he shows that representation learning can benefit from the same kind of natural, continuous visual experience that toddlers gain through active, self-directed exploration. By replacing traditional data augmentation with temporally adjacent frames from real visual play sequences, his self-supervised model learned robust object representations comparable to those achieved with full supervision.
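To make the idea concrete, here is a minimal sketch of such contrastive learning through time, assuming a SimCLR-style InfoNCE objective in which two temporally adjacent video frames stand in for the usual augmented views. The names (encoder, frames_t, frames_tp1) and the temperature value are illustrative, not taken from the thesis code.

```python
# Minimal sketch: temporal neighbors as positive pairs in an InfoNCE loss.
import torch
import torch.nn.functional as F

def time_contrastive_loss(encoder, frames_t, frames_tp1, temperature=0.1):
    """frames_t and frames_tp1: (B, C, H, W) frames a short time apart."""
    z1 = F.normalize(encoder(frames_t), dim=1)   # (B, D) embeddings
    z2 = F.normalize(encoder(frames_tp1), dim=1)
    logits = z1 @ z2.T / temperature             # pairwise cosine similarities
    targets = torch.arange(z1.size(0), device=z1.device)
    # Each frame's positive is its temporal neighbor; every other frame
    # in the batch serves as a negative.
    return F.cross_entropy(logits, targets)
```

In training, pairs of consecutive frames sampled from the play videos would be fed in as frames_t and frames_tp1, so that no synthetic augmentation is needed to form positive pairs.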
This work was made possible through strong collaborations. “We worked closely with Prof. Chen Yu at the University of Texas at Austin, and we are very grateful to him and his lab members. Their first-person toddler video datasets enabled us to train our models with realistic developmental visual input,” Yu explains.
To approximate a toddler’s central visual experience, he cropped regions of head-mounted camera images centered on gaze positions measured via eye tracking. These gaze-guided visual streams were then fed into self-supervised models built on the slowness principle. The results show that toddlers’ gaze strategies support the learning of invariant object representations. The analysis also indicated that the limited extent of the central visual field, where visual acuity is highest, is critical for learning.
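As an illustration, such gaze-centered cropping can be sketched in a few lines; the crop size and the clamping policy below are assumptions, not the parameters used in the thesis.

```python
# Illustrative gaze-centered crop, assuming gaze coordinates are given
# in the pixel space of the head-camera frame.
import numpy as np

def gaze_crop(frame, gaze_xy, size=128):
    """Return a size x size crop of `frame` (H, W, C) centered on the gaze point."""
    h, w = frame.shape[:2]
    x, y = int(gaze_xy[0]), int(gaze_xy[1])
    half = size // 2
    # Clamp the window so it stays inside the image (the frame must be
    # at least `size` pixels in each dimension).
    x0 = min(max(x - half, 0), w - size)
    y0 = min(max(y - half, 0), h - size)
    return frame[y0:y0 + size, x0:x0 + size]
```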
Yu also examined how biological mechanisms such as foveation and cortical magnification influence representation learning. He developed the biologically inspired Circle Relationship Embedding (CRE) for Vision Transformers, a positional encoding that emphasizes central vision and is compatible with traditional positional encodings, improving object recognition performance in ViTs. He further demonstrated that simulating cortical magnification can serve as an effective augmentation strategy, boosting the object recognition performance of self-supervised models.
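A rough sketch of the cortical magnification idea as an augmentation is shown below: a radial warp that over-samples the image center, loosely mimicking the foveal over-representation in visual cortex. The power-law mapping and its exponent are assumptions for illustration, not the magnification function used in the paper.

```python
# Hedged sketch: radial warp that magnifies the central image region.
import numpy as np

def cortical_magnification(img, strength=2.0):
    """Warp an (H, W, C) image so central pixels occupy more output area."""
    h, w = img.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float32)
    # Normalized coordinates in [-1, 1] around the image center.
    u = (xs - w / 2) / (w / 2)
    v = (ys - h / 2) / (h / 2)
    r = np.sqrt(u ** 2 + v ** 2) + 1e-8
    # Output pixels at radius r sample the input at the smaller radius
    # r ** strength, so the central region is expanded (magnified).
    r_src = r ** strength
    src_x = np.clip((u / r) * r_src * (w / 2) + w / 2, 0, w - 1)
    src_y = np.clip((v / r) * r_src * (h / 2) + h / 2, 0, h - 1)
    return img[src_y.astype(int), src_x.astype(int)]
```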
Yu will continue his research for a short period in the Triesch Lab while pursuing postdoctoral opportunities in related fields.
Publications:
Yu, Z., Aubret, A., Yu, C., & Triesch, J. Simulated Cortical Magnification Supports Self-Supervised Object Learning. In 2025 IEEE International Conference on Development and Learning (ICDL) (pp. 1-6). IEEE.
Yu, Z., Aubret, A., Raabe, M. C., Yang, J., Yu, C., & Triesch, J. Toddlers' Active Gaze Behavior Supports Self-Supervised Object Learning. arXiv preprint arXiv:2411.01969.
Yu, Z., & Triesch, J. CRE: Circle Relationship Embedding of Patches in Vision Transformer. In 2023 European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN).
Raabe, M. C., López, F. M., Yu, Z., Caplan, S., Yu, C., Shi, B. E., & Triesch, J. Saccade Amplitude Statistics Are Explained by Cortical Magnification. In 2023 IEEE International Conference on Development and Learning (ICDL) (pp. 300-305). IEEE.
Schneider, F., Xu, X., Ernst, M. R., Yu, Z., & Triesch, J. Contrastive Learning Through Time. In SVRHM 2021 Workshop @ NeurIPS.
