Can you beat computers at Atari games? It no longer seems possible


I think it is better to watch the YouTube video of the interview here first. Onstage at TED2014, Charlie Rose interviews Google CEO Larry Page about his far-off vision for the company. Page talks through the company’s recent acquisition of DeepMind, an AI that is learning some surprising things. Starting at about 2 minutes 30 seconds into the interview, he talks about DeepMind for about two minutes.


According to a paper from DeepMind, which Google bought for a reported 650m USD in January 2014, humans cannot beat the computer in three Atari 2600 games (Breakout, Enduro and Pong) once it has spent a couple of hours learning how each game works. The same single program is used for every game, and it is given no game-specific information in advance about how to win. In other words, one program has to learn from scratch, by itself, how to get a high score. Of the seven games tested, the computer scored higher than human experts in three. It is amazing.

Reinforcement learning, a branch of machine learning, is used in this challenge. It is different from the machine learning used in image recognition and natural language processing. In reinforcement learning, a reward function is used to decide which policy is best among many choices in the long run. In short, we can describe it as “how much of today’s lunch should we give up in order to maximize the total of lunches from tomorrow onward”. We face this kind of problem all the time, but it is difficult for computers to answer. However, DeepMind showed that reinforcement learning works well on this kind of problem when they presented their demo at the end of 2013.
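To make the lunch example a little more concrete, here is a minimal sketch in Python. It is not DeepMind's code. It only illustrates the quantity that reinforcement learning usually tries to maximize: the sum of rewards over time, where rewards further in the future count a little less (a "discount factor"). The two plans, their reward numbers, the discount values and the function names are all made up for illustration.

```python
def discounted_return(rewards, gamma):
    """Total reward of a plan, where the reward t days ahead is weighted by gamma**t."""
    total = 0.0
    for t, reward in enumerate(rewards):
        total += (gamma ** t) * reward
    return total


def best_plan(plans, gamma):
    """Name of the plan with the highest discounted return."""
    return max(plans, key=lambda name: discounted_return(plans[name], gamma))


# Hypothetical rewards per day, starting today (1.0 = one lunch).
plans = {
    "eat_now": [1.0, 0.0, 0.0, 0.0],     # enjoy lunch today, nothing afterwards
    "skip_today": [0.0, 1.0, 1.0, 1.0],  # give up today's lunch, get one on each later day
}

# A far-sighted agent (gamma close to 1) cares about future lunches.
print(best_plan(plans, gamma=0.9))

# A short-sighted agent (gamma close to 0) mostly cares about today.
print(best_plan(plans, gamma=0.1))
```

With a discount factor of 0.9 the "skip_today" plan has the higher total, while with 0.1 the "eat_now" plan wins. The hard part that a reinforcement learning program has to solve is that nobody gives it the reward lists in advance: it has to discover them by playing.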


If computers become capable of this kind of decision-making, it will have a huge impact on intellectual jobs, such as lawyers, fund managers, analysts and corporate officers, because they make decisions over a long-term horizon rather than for tomorrow's outcome. They have a lot of past experience, some of it successes and some of it failures, and they can use that experience when they make plans for the future. If computers can use the same logic as humans and make decisions by themselves, it could be a revolution for intellectual jobs. For example, at company board meetings, computers may answer board members' questions about management strategy based on a massive number of past examples, and tell them how to maximize future cash flow by using reinforcement learning. Future cash flow is the most important thing to board members because shareholders require them to maximize it.


Currently there is a lot of discussion about our future jobs, because it is probable that many jobs will be replaced by computers in the near future. If reinforcement learning keeps improving, company CEOs might be replaced by computers, and shareholders might even welcome it?!

