How do transformer networks encode linguistic knowledge?
Series: Program in Applied Mathematics Colloquium
Location: MATH 501
Presenter: Steven Bethard, School of Information, University of Arizona

Pre-trained transformer networks combine a neural self-attention mechanism with a training objective that is compatible with large unlabeled datasets. Models such as BERT and XLNet have followed this approach to produce new state-of-the-art results across a wide range of natural language processing tasks. But exactly why these models are so successful is not well understood. In this talk, I will present work that looks more closely at the internals of these models and investigates how they acquire and represent linguistic knowledge.
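
For readers unfamiliar with this line of analysis, the sketch below illustrates the general kind of setup such studies use: extracting per-layer hidden states from a pre-trained transformer so that a simple probe can test which layers encode a given piece of linguistic information. It is a minimal illustration, not the speaker's method; it assumes the Hugging Face transformers library and the bert-base-uncased checkpoint.

# Minimal sketch (illustrative only, not the speaker's code): extract per-layer
# hidden states from a pre-trained BERT model for a probing-style analysis.
# Assumes the Hugging Face `transformers` library and `bert-base-uncased`.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_hidden_states=True)
model.eval()

sentence = "The keys to the cabinet are on the table."
inputs = tokenizer(sentence, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# outputs.hidden_states is a tuple: the embedding layer plus one tensor per
# transformer layer, each of shape (batch, sequence_length, hidden_size).
for layer_index, layer_states in enumerate(outputs.hidden_states):
    print(f"layer {layer_index}: {tuple(layer_states.shape)}")

# A probing study would then train a small classifier on these representations
# (for example, to predict subject-verb agreement) to see at which layers such
# linguistic knowledge becomes linearly recoverable.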