From Transformers to Associative Memory: How Titans and MIRAS Rethink Long-Context Modeling
What comes after Transformers? Google Research is proposing a new way to give sequence models usable long-term memory with Titans and MIRAS, while keeping training parallel and inference close to linear. Titans is a concrete architecture that adds a deep neural memory to a Transformer-style backbone. MIRAS is a general framework that views most modern sequence models as instances of online optimization over an associative memory.

Why Titans and MIRAS?

Standard Transformers use attention over a […]
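To make the associative-memory view above concrete, here is a minimal sketch, not the Titans or MIRAS implementation: the memory is a single linear map M, and each key/value pair is written with one gradient step on the recall loss ||Mk − v||², with a momentum buffer standing in for a "surprise" signal and weight decay acting as a forget gate. Titans uses a deep MLP as the memory rather than a matrix; the class name, dimensions, and hyperparameters below are illustrative assumptions.

```python
# Minimal sketch (not Google's code) of "online optimization over an
# associative memory": store a key/value pair by taking a gradient step
# on the recall loss 0.5 * ||M k - v||^2 at write time.
import numpy as np

class LinearAssociativeMemory:
    def __init__(self, dim, lr=0.5, momentum=0.5, decay=0.01):
        self.M = np.zeros((dim, dim))   # memory: recall is v_hat = M @ k
        self.S = np.zeros((dim, dim))   # momentum buffer ("surprise")
        self.lr, self.momentum, self.decay = lr, momentum, decay

    def write(self, k, v):
        """One online gradient step on 0.5 * ||M k - v||^2."""
        grad = np.outer(self.M @ k - v, k)              # d(loss)/dM
        self.S = self.momentum * self.S - self.lr * grad
        self.M = (1.0 - self.decay) * self.M + self.S   # decay = forget gate

    def read(self, k):
        return self.M @ k

rng = np.random.default_rng(0)
mem = LinearAssociativeMemory(dim=8)
k = rng.standard_normal(8)
k /= np.linalg.norm(k)          # unit-norm key keeps this toy update stable
v = rng.standard_normal(8)

for _ in range(50):             # a few "tokens" worth of online updates
    mem.write(k, v)
print(np.linalg.norm(mem.read(k) - v))   # small recall error
```

In this toy form the per-token update is sequential; the appeal of the actual designs, as the article notes, is that the memory updates can be reorganized so training stays parallel while inference stays close to linear in sequence length.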