A History of AI

Neural Networks: The Wilderness Years

Geoffrey Hinton
Image: Canadian Institute for Advanced Research / Associated Press

“Give me another six months and I’ll prove to you that it works.” — Geoffrey Hinton

As we saw in the Perceptron post, by the end of 1960s, funding for neural network research dried up and most researchers moved on to other promising approaches. Most, but not all. A small group of researchers believed.

Geoffrey Hinton was one of them. His PhD supervisor at the University of Edinburgh urged him weekly to abandon neural networks. Hinton refused, insisting he just needed more time.

In the early 1980s, John Hopfield at Caltech published work on networks that could store and recall patterns. Show the network part of a pattern and it could reconstruct the whole. Hopfield’s work reignited interest in what neural networks might achieve.

In 1986, David Rumelhart, Geoffrey Hinton, and Ronald Williams published a paper in Nature describing a new algorithm. The Backpropagation algorithm solved the problem that had defeated Rosenblatt and showed that you could train networks with multiple layers.

As the name suggests, Backpropagation works backward. During training, the network measures output error. Then, working backward layer by layer, the algorithm calculates how much each connection contributed to that error. It adjusts each weight by a small amount to reduce the error. Repeat this across thousands of examples and, if the weights settle down, the network learns. Backpropagation lets networks learn representations at multiple levels of abstraction. Simple features in early layers combine into complex patterns in later layers.

Some notable applications emerged. In 1987, Terry Sejnowski and Charles Rosenberg built NETtalk, a neural network that learned to pronounce English text. They trained it on examples and it started like a babbling baby, gradually improving until it could read text aloud. The system learned pronunciation rules without anyone programming them explicitly. This was a demonstration of what backpropagation could do.

In 1989, Yann LeCun at AT&T Bell Labs trained a neural network to recognise handwritten digits. The US Postal Service provided 9,298 scanned postcodes from mail sorting offices. LeCun used 7,291 to train the network and 2,007 to test it. The system achieved a 1% error rate on the digits it classified, with a 9% rejection rate on ambiguous cases. This solved expensive operational problems. Postal services deployed it for sorting mail. Banks used similar systems to read cheques, processing millions per day.

But problems emerged:

By the early 1990s, a new approach emerged. It was easier to use, worked better on small datasets, and solved problems that neural networks struggled with. More on the success of SVMs in a later post.

The connectionist community that had survived now faced another exodus. Funding and researchers moved on. Working on neural networks in the 1990s became career suicide all over again.

But the faithful kept going. They had seen backpropagation work and built systems that learned things no one had programmed. They just needed the right conditions.

Those conditions were coming. They just didn’t know it yet.


Further Reading