Center for Quantitative Biology, Peking University
Academic Seminar
Title: Algorithmic Phase Transitions in In-Context Learning for Markov Chains
Speaker: Wenping Cui (崔文平), Ph.D.
Postdoctoral Associate, Department of Physics, Princeton University
Time: Thursday, March 5, 13:00-14:00
Venue: Room B101, Lui Che Woo Building (吕志和楼)
Host: Zhiyuan Li (李志远), Investigator
Abstract:
Modern distributed architectures, particularly transformers, exhibit an emergent ability known as in-context learning: they can infer task-specific structure from limited input data without updating their parameters. A central theoretical challenge is to identify the architectural principles and data conditions that enable this behavior. Here, we provide a mechanistic and dynamical characterization of in-context generalization in a transformer trained on discrete stationary Markov chains. We show that training gives rise to two distinct algorithmic phases: a unigram phase and a bigram phase. Mechanistically, we demonstrate that the bigram solution is implemented through a statistical induction head. We further derive an effective theory for the learning dynamics of this induction head, explain why its formation occurs abruptly, and show that the transition time is governed by a statistical bias in the data that guides optimization toward the generalizing solution.
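For readers unfamiliar with the two phases named in the abstract, the following is a minimal illustrative sketch, not the speaker's model or code. It assumes that the "unigram" solution predicts the next token from in-context token frequencies, and that the "bigram" solution predicts it from in-context transition counts conditioned on the last token. All function names, the Dirichlet prior over transition matrices, and the smoothing constant `alpha` are assumptions made for illustration.

```python
import numpy as np

def sample_chain(P, length, rng):
    """Sample a sequence from transition matrix P, started at stationarity."""
    k = P.shape[0]
    # Stationary distribution: left eigenvector of P for the largest eigenvalue (1).
    evals, evecs = np.linalg.eig(P.T)
    pi = np.abs(np.real(evecs[:, np.argmax(np.real(evals))]))
    pi = pi / pi.sum()
    seq = np.empty(length, dtype=int)
    seq[0] = rng.choice(k, p=pi)
    for t in range(1, length):
        seq[t] = rng.choice(k, p=P[seq[t - 1]])
    return seq

def unigram_predict(context, k, alpha=1.0):
    """Next-token distribution from smoothed token frequencies in the context."""
    counts = np.bincount(context, minlength=k) + alpha
    return counts / counts.sum()

def bigram_predict(context, k, alpha=1.0):
    """Next-token distribution from smoothed counts of what followed the last token."""
    context = np.asarray(context)
    last = context[-1]
    prev, nxt = context[:-1], context[1:]
    counts = np.bincount(nxt[prev == last], minlength=k) + alpha
    return counts / counts.sum()

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    k = 3
    # Hypothetical task distribution: each row of P drawn from a flat Dirichlet.
    P = rng.dirichlet(np.ones(k), size=k)
    seq = sample_chain(P, 500, rng)
    print("true row     :", P[seq[-1]])
    print("unigram guess:", unigram_predict(seq, k))
    print("bigram guess :", bigram_predict(seq, k))
```

With a long enough context, the bigram estimate should approach the true transition row for the last token, whereas the unigram estimate approaches the stationary distribution. This is only a statistical caricature of the two solutions; how a transformer implements them is the subject of the talk.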
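About the speaker:
Dr. Wenping Cui received a bachelor's degree in astrophysics from the University of Science and Technology of China in 2011, a master's degree in physics from the University of Bonn, Germany, in 2014, and a Ph.D. in physics from Boston College, USA, in 2021. From 2021 to 2024 he was a postdoctoral researcher at the Kavli Institute for Theoretical Physics, University of California, Santa Barbara. Since 2024 he has been a postdoctoral researcher at Princeton University. His field is biophysics, with a focus on using statistical physics to explore the fundamental principles of complex systems. His research topics include ecology and evolution, cellular sensing and response mechanisms, animal behavior, and the mechanisms by which attention is learned in artificial neural networks.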
