Language Models: A 75-Year Journey That Didn’t Start With Transformers
Introduction

Language models have existed for decades, long before today's so-called "LLMs." In the early 1990s, IBM's statistical alignment models and smoothed n-gram systems, trained on hundreds of millions of words, set performance records. By the 2000s, the internet's growth enabled "web as corpus" datasets, and statistical models came to dominate natural language processing (NLP). Yet many believe language modelling began with Google's 2017 Transformer architecture and models like BERT. In reality, Transformers revolutionized scalability but were just one step in a much […]