Self-learning computer eclipses human ability at complex game Go

Scientists announce that game-playing Artificial Intelligence computer mastered the ancient game of Go in three days, and say it could teach itself more than just playing games

Go, an Asian strategic board game for two players. Photo: Fishman / ullstein bild via Getty Images

The timing was uncanny. Within days of the announcement of the UAE’s first Minister of State for Artificial Intelligence, scientists unveiled one of the biggest-ever advances in the technology – and arguably the scariest.

They have created a computer programme that learnt to play the notoriously complex game Go, and then trounced the world’s greatest player 100 wins to nil.

That might not sound like big news for those of us who struggle with checkers, let alone Go, but beware: the implications are huge.

Only last year a programme called AlphaGo created by a team from UK-based Google DeepMind made headlines by beating the human world Go champion 4-1.

At the time this was hailed as a landmark in the development of computers with literally superhuman abilities.

But a year is an eternity in AI research, and now the same team has achieved something of far greater significance.

AlphaGo consisted of a fleet of computers trained for months using tens of millions of real-life moves and input from expert players.

Now its abilities have been utterly eclipsed by AlphaGo Zero, a single machine that mastered the game in just three days – and, crucially, without any human input whatsoever.

Beginning with just random guesses, it worked out how best to play the game all by itself. And its prowess is truly astounding. That 100-0 victory was achieved not against a mere human but against AlphaGo, the programme that defeated Lee Sedol last year.

Reporting their work in the journal Nature earlier this month, the researchers point out that in the space of just a few days, AlphaGo Zero went from knowing nothing to rediscovering strategies humans have taken thousands of years to develop – and then pushing onward, finding new strategies never before seen.

This is more than just an academic triumph. Go appears simple – in essence, it’s about capturing more territory than your opponent – yet it involves no luck and offers more combinations of moves than there are particles in the known universe.

As such, it involves making the best choices in the face of an overwhelming range of possibilities – an all too familiar real-world challenge.

To tackle it, researchers in AI have developed an array of techniques, taking inspiration from many different sources.

So-called genetic algorithms cut through the myriad options using a kind of “survival of the fittest”, in which random guesses are ranked according to their success, with the best being combined and mutated to produce “offspring”. Over time, these evolve into better solutions.
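As a rough illustration of the idea, the Python sketch below evolves a string of 20 bits towards all 1s. The toy objective, the function names (fitness, evolve) and every parameter value are assumptions made for this example; they do not come from any particular AI system.

```python
import random


def fitness(bits):
    """Score a candidate: here, simply the number of 1s (a toy objective)."""
    return sum(bits)


def evolve(length=20, pop_size=30, generations=40, mutation_rate=0.05, seed=0):
    rng = random.Random(seed)
    # Start from random guesses.
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    best_per_generation = []
    for _ in range(generations):
        # Rank the candidates by success, best first.
        population.sort(key=fitness, reverse=True)
        best_per_generation.append(fitness(population[0]))
        survivors = population[: pop_size // 2]
        # Carry the single best candidate over unchanged (elitism).
        children = [survivors[0][:]]
        while len(children) < pop_size:
            mum, dad = rng.sample(survivors, 2)
            cut = rng.randrange(1, length)
            child = mum[:cut] + dad[cut:]  # combine two parents
            # Mutate: flip each bit with a small probability.
            child = [1 - b if rng.random() < mutation_rate else b for b in child]
            children.append(child)
        population = children
    population.sort(key=fitness, reverse=True)
    best_per_generation.append(fitness(population[0]))
    return population[0], best_per_generation
```

Calling evolve() returns the best candidate found along with the best score in each generation. Because the top candidate is always carried over unchanged, that score can never fall from one generation to the next.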

DeepMind has focused on so-called neural networks, which take their inspiration from the architecture of the brain. Put simply, a computer is programmed to act as if it is made up of a network of interconnected nerve cells, or neurons, each with inputs and an output. The network is then trained using lots of examples, with algorithms tweaking the connections between the neurons until they produce the appropriate response to a given input.
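The training loop can be illustrated with a single artificial neuron, which is far simpler than the deep networks DeepMind uses. In this Python sketch the neuron learns the logical AND of two inputs using the classic perceptron rule; the names train_neuron and predict, and the choice of task, are assumptions for illustration only.

```python
# Training examples: two inputs and the response we want (logical AND).
EXAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]


def predict(weights, bias, inputs):
    """The neuron 'fires' (outputs 1) if its weighted input sum is positive."""
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > 0 else 0


def train_neuron(examples, learning_rate=1, max_epochs=100):
    weights, bias = [0, 0], 0
    for _ in range(max_epochs):
        mistakes = 0
        for inputs, target in examples:
            error = target - predict(weights, bias, inputs)
            if error:
                mistakes += 1
                # Tweak each connection in the direction that reduces the error.
                weights = [w + learning_rate * error * x
                           for w, x in zip(weights, inputs)]
                bias += learning_rate * error
        if mistakes == 0:  # every example now gets the appropriate response
            break
    return weights, bias
```

After weights, bias = train_neuron(EXAMPLES), calling predict(weights, bias, inputs) reproduces the wanted response for each of the four examples. A single neuron can only learn patterns this simple; networks of many neurons are needed for anything like Go.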

With the original AlphaGo programme, the training came in the form of tens of millions of moves, from which it discovered strategies that seemed to work well. But these moves incorporated human knowledge of the game, giving the programme a considerable boost.

With AlphaGo Zero, the researchers have developed a so-called reinforcement learning algorithm that allows the programme to acquire skill by repeatedly playing against itself. Being forced to play against a closely matched “opponent” led to the programme improving with astonishing speed.
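Self-play can be sketched on a far smaller game than Go. In the toy below, two copies of the same learner take turns removing one to three stones from a pile, and whoever takes the last stone wins. The game, the table-based learner and all names and parameters (self_play, best_move, epsilon, alpha) are assumptions chosen for brevity; AlphaGo Zero itself pairs a deep neural network with tree search, which this sketch does not attempt to reproduce.

```python
import random
from collections import defaultdict

MAX_TAKE = 3  # a player may remove 1, 2 or 3 stones per turn


def legal_moves(pile):
    return list(range(1, min(MAX_TAKE, pile) + 1))


def best_move(q, pile):
    """Greedy choice: the move with the highest learnt value."""
    return max(legal_moves(pile), key=lambda m: q[(pile, m)])


def self_play(games=5000, max_pile=10, epsilon=0.2, alpha=0.1, seed=0):
    rng = random.Random(seed)
    # (pile, move) -> estimated outcome for the player making that move,
    # between -1 (loss) and +1 (win). Everything starts at 0: no knowledge.
    q = defaultdict(float)
    for g in range(games):
        pile = g % max_pile + 1  # vary the opening so every position is practised
        history = []  # moves in order; the two "players" alternate
        while pile > 0:
            if rng.random() < epsilon:
                move = rng.choice(legal_moves(pile))  # explore
            else:
                move = best_move(q, pile)  # exploit what has been learnt
            history.append((pile, move))
            pile -= move
        # Whoever made the last move took the final stone and wins.
        outcome = 1.0
        for state_move in reversed(history):
            q[state_move] += alpha * (outcome - q[state_move])
            outcome = -outcome  # the previous move belonged to the other player
    return q
```

After q = self_play(), calling best_move(q, pile) gives the learner's preferred move for a given pile. With enough games it tends to favour leaving its opponent a multiple of four stones, which is the known winning strategy for this game, though a short run with these settings is not guaranteed to find it for every pile.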

According to the researchers, it is now possible to train computers “to superhuman level, without human examples or guidance, given no knowledge of the domain beyond basic rules”.

This has sparked concern that we may soon witness something called The Singularity.

First mooted by scientists over half a century ago, this marks the point at which computers acquire the ability to improve themselves at an exponential rate, becoming unfathomably intelligent.

According to some – notably celebrity scientist Stephen Hawking and billionaire entrepreneur Elon Musk – The Singularity poses an existential threat to civilisation.

Others dismiss it as sci-fi scaremongering.

As so often with emerging technology, the reality is likely somewhere in between.

DeepMind lead researcher Dr David Silver has publicly stated that because it can learn from scratch, AlphaGo Zero can be transplanted from the game of Go into any other domain.

The idea of a computer with such an algorithm being let loose on the world certainly sounds like the plot for a sci-fi novel that doesn’t end well.

Yet it is already leading to another – and more benign – sci-fi plot: humans benefiting from the wisdom of a super-intelligence.

In the same issue of Nature, American Go experts Andy Okun and Andrew Jackson describe their excitement at seeing what AlphaGo Zero can teach them about the ancient game.

While its approach to the start and end of games fits in with what humans have learnt over the centuries, Okun and Jackson admit that AlphaGo Zero’s moves in the middle game are utterly mysterious.

Could it be that human players have fallen into the same trap for centuries – one which the programme avoided by ignoring the accepted wisdom and starting again from scratch?

That, in turn, raises the possibility of myriad insights being discovered in countless other fields by computers left to their own devices.

In centuries to come, this month may be seen as the start of a new era in human development, where we partner with electronic progeny far smarter than ourselves. Whatever happens, the UAE’s establishment of a ministry for AI is looking stunningly prescient.

Robert Matthews is Visiting Professor of Science at Aston University, Birmingham, UK