The Problem Chain That Led to Transformers
Word embeddings, attention, and positional encodings and the specific problem each one was built to solve. Blogging my understanding as I go through the lecture series of “CS 639: Introduction to Foundation Models” at University of Wisconsin-Madison. Photo by Gaelle Marcel on Unsplash How Do Machines Even Understand Words? Before we get into the fancy stuff, it’s worth pausing on this simple question. Text is just characters to a machine, so how do we get it to understand meaning? A […]