BERT performs near state of the art in question answering, and I have now confirmed it myself!

Today I am writing about BERT, a new natural language model, again, because it works so well on question answering tasks. In my last article, I explained how BERT works, so if you are new to BERT, I recommend reading that first.

For this experiment, I use the SQuAD v1.1 dataset, as it is very well known in the field of question answering. Here is the description from its authors.

“Stanford Question Answering Dataset (SQuAD) is a reading comprehension dataset, consisting of questions posed by crowd workers on a set of Wikipedia articles, where the answer to every question is a segment of text, or span, from the corresponding reading passage, or the question might be unanswerable.” (This description is from SQuAD 2.0, the new version of the Q&A dataset.)
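To give a sense of what the dataset looks like, here is a minimal sketch of parsing a SQuAD-style record. The tiny record below is made up for illustration; the field names follow the published SQuAD v1.1 JSON layout.

```python
import json

# A tiny, made-up record in the SQuAD v1.1 JSON layout:
# each article has paragraphs, each paragraph has a context
# passage and a list of question-answer pairs ("qas").
sample = json.loads("""
{
  "data": [
    {
      "title": "Tokyo",
      "paragraphs": [
        {
          "context": "Tokyo is the capital of Japan.",
          "qas": [
            {
              "id": "q1",
              "question": "What is the capital of Japan?",
              "answers": [{"text": "Tokyo", "answer_start": 0}]
            }
          ]
        }
      ]
    }
  ]
}
""")

for article in sample["data"]:
    for para in article["paragraphs"]:
        for qa in para["qas"]:
            ans = qa["answers"][0]
            # In SQuAD v1.1, every answer is a span of the context passage,
            # located by its character offset "answer_start".
            start = ans["answer_start"]
            span = para["context"][start:start + len(ans["text"])]
            print(qa["question"], "->", span)
```

Because the answer is always a span of the passage, the model only has to predict a start and an end position, which is exactly how BERT is fine-tuned for this task.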

This is a very challenging task for computers to answer correctly. How does BERT perform on it? As shown below, BERT reached an F1 score of 90.70 after about one hour of training on a TPU in Colab in our experiment. This is remarkable: based on the SQuAD 1.1 leaderboard below, it would rank third or fourth among top universities and companies, although the leaderboard setting may differ from our experiment. It is also worth noting that BERT is nearly as good as a human!




I tried both the Base model and the Large model with different batch sizes. The Large model outperforms the Base model by around 3 points. The Large model takes around 60 minutes to complete training, while the Base model takes around 30 minutes. I use a TPU on Google Colab for training. Here are the results. EM means “exact match”.
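For reference, the two metrics above can be sketched as follows. This is a simplified version of the official SQuAD evaluation: exact match after normalization, and token-overlap F1. The official script also strips punctuation and English articles, which I omit here for brevity.

```python
from collections import Counter

def normalize(text):
    # Simplified normalization: lowercase and collapse whitespace.
    # (The official SQuAD script also removes punctuation and articles.)
    return " ".join(text.lower().split())

def exact_match(prediction, ground_truth):
    # EM is 1.0 only if the normalized strings are identical.
    return float(normalize(prediction) == normalize(ground_truth))

def f1_score(prediction, ground_truth):
    # Token-level F1: harmonic mean of precision and recall
    # over the overlapping tokens.
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(ground_truth).split()
    common = Counter(pred_tokens) & Counter(gold_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)

print(exact_match("Denver Broncos", "denver  broncos"))  # 1.0
print(f1_score("the Denver Broncos", "Denver Broncos"))  # 0.8
```

So F1 gives partial credit when the predicted span overlaps the gold answer, which is why the reported F1 (90.70) is higher than EM.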

Question answering can be applied to many business tasks, such as information extraction from documents and automation in customer-service centers. It will be exciting when we can apply BERT to businesses in the near future.


Next, I would like to perform text classification of news titles in Japanese, because BERT provides a multilingual model that works across 104 languages. As I live in Tokyo now, it is easy to find good data for this experiment. I will update my article soon, so stay tuned!






@article{devlin2018bert,
  title={BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding},
  author={Devlin, Jacob and Chang, Ming-Wei and Lee, Kenton and Toutanova, Kristina},
  journal={arXiv preprint arXiv:1810.04805},
  year={2018}
}
Notice: Toshi Stats Co., Ltd. and I do not accept any responsibility or liability for loss or damage occasioned to any person or property through using materials, instructions, methods, algorithms or ideas contained herein, or acting or refraining from acting as a result of such use. Toshi Stats Co., Ltd. and I expressly disclaim all implied warranties, including merchantability or fitness for any particular purpose. There will be no duty on Toshi Stats Co., Ltd. and me to correct any errors or defects in the code and the software.


How can we communicate with computers in the future?


I sometimes have opportunities to teach data analysis to business people. I send emails to learners to explain how it works, and I wonder whether computers could do the same thing in the future. This is called a “question answering system”. Given the progress of technology, the answer may be “yes”, and perhaps not too far from now. Let us consider it for a while.


1.  Can computers understand our natural languages as we do?

In order to communicate with us, computers must learn how we use natural languages, such as English, Malay, Chinese, and Japanese. This is very difficult for computers to do, but one technological breakthrough might make it possible in the near future: “thought vectors”. The technology is championed by Dr. Hinton, a professor in the computer science department at the University of Toronto. His explanation 1 is a little complicated. In short, our sentences are mapped to vectors of numbers so that computers can represent and compute with their meanings. For example, “Kuala Lumpur – Malaysia + Japan = Tokyo”. This kind of calculation might be possible using “thought vectors”, according to the article 2. Translation could also become more accurate with thought vectors, because they can act as a bridge from one language to another. He said “Computers will have developed common sense within a decade” in this article. I think that is revolutionary!
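The analogy above can be sketched with toy vectors. Note that these 3-dimensional embeddings are made up purely for illustration; real models learn vectors with hundreds of dimensions from large text corpora.

```python
import numpy as np

# Made-up toy embeddings: one axis for "country-ness",
# one for "capital-ness", one for "region". Real word
# vectors are learned from data, not hand-written.
vectors = {
    "kuala_lumpur": np.array([0.1, 0.9, 0.3]),
    "malaysia":     np.array([0.9, 0.1, 0.3]),
    "japan":        np.array([0.9, 0.1, 0.8]),
    "tokyo":        np.array([0.1, 0.9, 0.8]),
    "germany":      np.array([0.9, 0.1, 0.5]),
    "berlin":       np.array([0.1, 0.9, 0.5]),
}

def most_similar(query, exclude):
    # Return the word whose vector has the highest cosine
    # similarity to the query vector, skipping the input words.
    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    return max((w for w in vectors if w not in exclude),
               key=lambda w: cos(vectors[w], query))

# Kuala Lumpur - Malaysia + Japan ≈ Tokyo
q = vectors["kuala_lumpur"] - vectors["malaysia"] + vectors["japan"]
print(most_similar(q, exclude={"kuala_lumpur", "malaysia", "japan"}))  # tokyo
```

Subtracting “malaysia” removes the country direction, and adding “japan” shifts the region, so the nearest remaining vector is “tokyo”. The same arithmetic works for other analogies, such as Berlin relative to Germany.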


2.  Would we like to talk to computers?

Some people wonder whether we would enjoy talking with computers. I think “yes”. Computers can now serve as the brains of robots, and robots have become quite cute lately. Pepper, developed by Aldebaran Robotics and SoftBank Group, is very popular in Japan. Last month, Pepper went on sale at retail stores there, and 1,000 units sold out 3 in just one minute, even though it is not cheap. I think robots can be people’s friends, just like dogs.


3. How will it impact our businesses and society?

It is very difficult to imagine all the impacts of this technology on our businesses and society. It is a kind of revolution in how our knowledge and intelligence are used in our lives. Simple tasks might be done by computers, while people, supported by computers, create new “knowledge and intelligence” that do not exist today. Through conversations with computers, people can obtain information and insights about new things, because computers can store massive amounts of data in the form of text, images, sound, and voice. That must be exciting, mustn’t it?



Do you know the humanoid robot “C-3PO” in the movie “STAR WARS”? He might appear in front of us in the near future! C-3PO can translate many languages from across the universe and answer people’s questions. I hope I can buy him someday, just like Luke Skywalker. How about you?




1. ‘Thought vectors’ could revolutionize artificial intelligence, EXTREME TECH, 27 May 2015

2. Google a step closer to developing machines with human-like intelligence, The Guardian, 21 May 2015

3. ‘Emotional’ robot sells out in a minute, CNN, 23 June 2015


Note: Toshifumi Kuga’s opinions and analyses are personal views and are intended to be for informational purposes and general interest only, and should not be construed as individual investment advice or a solicitation to buy, sell or hold any security or to adopt any investment strategy. The information in this article is rendered as at the publication date and may change without notice, and it is not intended as a complete analysis of every material fact regarding any country, region, market or investment.

Data from third-party sources may have been used in the preparation of this material, and I, the author of this article, have not independently verified or validated such data. I accept no liability whatsoever for any loss arising from the use of this information, and reliance upon the comments, opinions and analyses in the material is at the sole discretion of the user.