Announcement_19

📑 New preprint on training dynamics in Transformers: Clustering Heads.