Is this a real human voice? It is amazing that it was generated by a computer


As I shared in this week's article, I found an exciting system that generates voices by computer. When I heard the voice I was very surprised, because it sounds so real. I recommend you listen to the samples on the website here. There are English and Mandarin versions. It was created by DeepMind, one of the best artificial-intelligence research groups in the world. What makes it possible? Let us see.

 

1. Computers learn our voices in ever more detail

According to DeepMind's explanation, they use "WaveNet, a deep neural network for generating raw audio waveforms". They also mention "PixelRNN and PixelCNN", which they invented earlier this year. (The research won one of the Best Paper awards at ICML 2016, one of the biggest international conferences on machine learning.) By applying the ideas behind PixelRNN and PixelCNN to voice generation, computers can learn audio waveforms in far more detail than with previous methods, which enables them to generate more natural voices. That is how WaveNet was born.
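The core idea can be sketched in a few lines of Python. This is only an illustrative toy, not DeepMind's actual model: it generates each audio sample from a probability distribution conditioned on the previous samples, where the made-up function `next_sample_probs` stands in for the trained neural network.

```python
import random

# Toy sketch of autoregressive audio generation, the idea behind WaveNet:
# each new sample is drawn from a distribution conditioned on past samples.
# `next_sample_probs` is a hypothetical stand-in for a trained network.

LEVELS = 4            # the real WaveNet quantizes audio to 256 levels
RECEPTIVE_FIELD = 3   # how many past samples the "model" looks at

def next_sample_probs(context):
    # Made-up "model": favors amplitude levels near the last sample's level.
    last = context[-1]
    weights = [1.0 / (1 + abs(level - last)) for level in range(LEVELS)]
    total = sum(weights)
    return [w / total for w in weights]

def generate(n_samples, seed=0):
    random.seed(seed)
    samples = [LEVELS // 2]  # start from a mid-level sample
    while len(samples) < n_samples:
        probs = next_sample_probs(samples[-RECEPTIVE_FIELD:])
        samples.append(random.choices(range(LEVELS), weights=probs)[0])
    return samples

audio = generate(20)
print(audio)
```

Because every sample depends on the samples before it, the generated waveform stays locally smooth, which is roughly why WaveNet's output sounds natural rather than like random noise.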

As a result of learning raw audio waveforms, computers can generate voices that sound very real. Look at the metrics below: the score of WaveNet is not far from the score of human speech (1). It is amazing!


2. Computers can generate male voices as well as female voices

Because computers can learn the waveforms of our voices in more detail, they can create both male and female voices. You can listen to each of them on the web. DeepMind says "Similarly, we could provide additional inputs to the model, such as emotions or accents" (2). I would like to listen to those, too!

 

3. Computers can generate not only voices but also music!

In addition, WaveNet can create music, too. I listened to the piano music generated by WaveNet, and I like it very much because it sounds so real. You can try it on the web, too. When we consider music and voice as just audio waveform data, it is natural that WaveNet can generate not only voices but also music.

 

If we can use WaveNet in digital marketing, it will be awesome! Every promotion, instruction and piece of guidance for customers could be delivered in WaveNet's voice, and customers might not even recognize that it is a computer-generated voice. Background music could be optimized for each customer by WaveNet, too! In my view, this algorithm could also be applied to many other problems, such as detecting cyber-security attacks, detecting anomalies in engine vibrations, or analyzing earthquakes, as long as the data take the form of a "wave". I want to try many things myself!

Did you listen to the voice generated by WaveNet? I believe that in the near future, computers will be able to learn how I speak and generate my voice saying whatever I want. It must be exciting!

 

 

1, 2. WaveNet: A Generative Model for Raw Audio

https://deepmind.com/blog/wavenet-generative-model-raw-audio/

 

 

Notice: TOSHI STATS SDN. BHD. and I do not accept any responsibility or liability for loss or damage occasioned to any person or property through using materials, instructions, methods, algorithms or ideas contained herein, or acting or refraining from acting as a result of such use. TOSHI STATS SDN. BHD. and I expressly disclaim all implied warranties, including merchantability or fitness for any particular purpose. There will be no duty on TOSHI STATS SDN. BHD. and me to correct any errors or defects in the codes and the software.

“DEEP LEARNING PROJECT” starts now. I believe it will work in digital marketing and economic analysis


As the new year starts, I would like to set up a new project at my company. It will benefit not only my company but also readers of this article, because the project will provide good examples of predictive analytics and of implementing new tools and platforms. The new project is called the "Deep Learning project" because "Deep Learning" is used as its core calculation engine. Through the project, I would like to create a "predictive analytics environment". Let me explain the details.

 

1.What is the goal of the project?

There are three goals of the project.

  • Obtain knowledge and expertise of predictive analytics
  • Obtain solutions for data-driven management
  • Obtain basic knowledge of Deep Learning

As big data become more and more available, we need to know how to consume them to gain insights, so that we can make better business decisions. Predictive analytics is key to data-driven management, as it answers "What comes next?" based on data. I hope you can gain expertise in predictive analytics by reading my articles about the project. I believe this is important for us, as we live in the digital economy now and will in the future.

 

2.Why is “Deep Learning” used in the project?

Since November last year, I have tried "Deep Learning" many times to perform predictive analytics, and I found that it is very accurate. It is sometimes said that it requires too much time to solve problems, but in my case I can solve many problems within three hours, so I consider that "Deep Learning" can solve problems within a reasonable time. In the project I would like to develop skills for tuning parameters effectively, as "Deep Learning" requires several parameter settings, such as the number of hidden layers. I would like to focus on how the number of layers, the number of neurons, activation functions, regularization and drop-out can be set according to the dataset. I think these are the keys to developing predictive models with good accuracy. I have tackled MNIST hand-written digit classification, and my error rate has improved to 1.9%. This was done with H2O, an awesome analytics tool, on a MacBook Air 11, which is just a normal laptop. I would like to set up a cluster on AWS in order to improve the error rate further. "Spark", which is open source, is one of the candidates for setting up a cluster.
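As a small illustration of one of these settings, here is what drop-out does during training, sketched in plain Python. This is a toy, not H2O's implementation: each hidden activation is randomly zeroed with probability `rate`, and the survivors are scaled up so that the expected activation is unchanged (so-called inverted dropout).

```python
import random

def dropout(activations, rate, seed=0):
    """Inverted dropout: zero each unit with probability `rate`,
    scale survivors by 1/(1-rate) so the expected value is unchanged."""
    random.seed(seed)
    keep = 1.0 - rate
    return [a / keep if random.random() < keep else 0.0 for a in activations]

hidden = [0.5, 1.2, -0.3, 0.8, 2.0, -1.1]
print(dropout(hidden, rate=0.5))
```

Randomly dropping units like this forces the network not to rely on any single neuron, which is why drop-out acts as regularization and often improves accuracy on unseen data.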


3. What businesses can benefit from introducing “Deep Learning “?

"Deep Learning" is very flexible, so it can be applied to many problems across industries. Healthcare, finance, retail, travel, and food and beverage might benefit from introducing "Deep Learning". Governments could benefit, too. In the project, I would like to focus on the following areas.

  • Digital marketing
  • Economic analysis

First, I would like to create a database to store the data to be analyzed. Once it is created, I will perform predictive analytics on "digital marketing" and "economic analysis". Best practices will be shared with you here, to reach our goal of "obtaining knowledge and expertise of predictive analytics". Deep Learning is relatively new to both of these problems, so I expect new insights will be obtained. For digital marketing, I would like to focus on social media and on measuring the effectiveness of digital marketing strategies. "Natural language processing" has developed at astonishing speed recently, so I believe it could be a good way to analyze text data. If you have any suggestions on predictive analytics in digital marketing, could you let me know? They are always welcome!

 

I use open-source software to create the predictive analytics environment. Therefore, it is very easy for you to create a similar environment on your own system or cloud. I believe open source is key to developing superior predictive models, as everyone can participate in the project. You do not need to pay any fees to introduce the tools used in the project, as they are open source. Ownership of the problems should be ours, rather than the software vendors'. Why don't you join us and enjoy it! If you want to receive updates on the project, could you sign up here?

 

 


How will “Deep Learning” change our daily lives in 2016?


"Deep Learning" is one of the major technologies of artificial intelligence. In April 2013, two and a half years ago, MIT Technology Review selected "Deep Learning" as one of its 10 Breakthrough Technologies 2013. Since then it has developed so rapidly that it is no longer a dream. This is the final article of 2015, so I would like to look back at the progress of "Deep Learning" this year and consider how it will change our daily lives in 2016.

 

How has "Deep Learning" progressed in 2015?

1. "Deep Learning" moved from laboratories to software developers in the real world

In 2014, the major breakthroughs in deep learning occurred in the laboratories of big IT companies and universities, because deep learning required complex programming and huge computational resources. To do it effectively, massive computational assets and many machine learning researchers were needed. But in 2015, many deep learning programs and software packages jumped out of the laboratory into the real world. Torch, Chainer, H2O and TensorFlow are examples. Anyone can develop apps with this software, as it is open source. It is also expected to be used in production. For example, H2O can automatically export trained models as POJOs (Plain Old Java Objects), and this code can be implemented in a production system. Therefore, there are fewer barriers between development and production, which will accelerate the development of apps in practice.

 

2. "Deep Learning" started to understand languages gradually

Most people use more than one social network, such as Facebook, LinkedIn, Twitter and Instagram, and these networks contain a lot of text data. That data would be a treasure if we could understand what it says immediately, but in reality there is too much of it for people to read one item at a time. Then the question comes: can computers read text data instead of us? Many top researchers are working on this area, which is called "natural language processing". For short sentences, computers can now understand the meaning. Such an app already appeared in late 2015: "Smart Reply" by Google. It can generate candidate replies based on the text of a received mail. Behind this app, an "LSTM (Long Short-Term Memory)" network, one of the deep learning algorithms, is used. In 2016, computers might understand longer sentences and paragraphs and answer questions based on their understanding. It means that computers can step closer to us in our daily lives.
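To give a feel for what an LSTM actually computes, here is a single LSTM step for one unit in plain Python. The weights below are made-up constants purely for illustration; a real network like the one behind Smart Reply learns millions of such weights from data.

```python
import math

# Toy sketch of one LSTM (Long Short-Term Memory) step for a single unit.
# The weights are invented constants; real networks learn them from data.

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, w):
    """One LSTM step: gates decide what to forget, store and output."""
    f = sigmoid(w["wf"] * x + w["uf"] * h_prev + w["bf"])    # forget gate
    i = sigmoid(w["wi"] * x + w["ui"] * h_prev + w["bi"])    # input gate
    g = math.tanh(w["wg"] * x + w["ug"] * h_prev + w["bg"])  # candidate value
    o = sigmoid(w["wo"] * x + w["uo"] * h_prev + w["bo"])    # output gate
    c = f * c_prev + i * g   # new cell state: the long-term memory
    h = o * math.tanh(c)     # new hidden state: the unit's output
    return h, c

weights = {"wf": 0.5, "uf": 0.1, "bf": 0.0,
           "wi": 0.6, "ui": 0.2, "bi": 0.0,
           "wg": 0.9, "ug": 0.3, "bg": 0.0,
           "wo": 0.4, "uo": 0.1, "bo": 0.0}

h, c = 0.0, 0.0
for x in [1.0, 0.5, -0.2]:   # a tiny "sequence" of inputs
    h, c = lstm_step(x, h, c, weights)
print(h, c)
```

The cell state `c` is what lets an LSTM carry information across many steps of a sentence, which is why it handles longer-range context better than a plain recurrent network.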

 

3. Cloud services support "Deep Learning" effectively

Once big data are obtained, infrastructure such as computational resources, storage and networking is needed. If we want to try deep learning, it is better to have fast computational resources, such as a Spark cluster. Amazon Web Services, Microsoft Azure, Google Cloud Platform and IBM Bluemix provide many services to implement deep learning at scale. Therefore, it is getting much easier to start implementing "Deep Learning" in a system. Most cloud services are "pay as you go", so there is no need to pay an up-front cost to start using them. That is good, especially for small companies and startups, as they usually have only limited budgets for infrastructure.

 

 

How will “Deep Learning” change our daily lives in 2016? 

Based on the development of "Deep Learning" in 2015, many consumer apps with "Deep Learning" might appear in the market in 2016. The difference between consumer apps with and without "Deep Learning" is that the former can behave differently for each user and situation. For example, you and your colleagues might see completely different home screens even though you use the same app, because "Deep Learning" enables the app to optimize itself to maximize customer satisfaction. In retail apps, the top page can differ by customer according to their preferences. In education apps, learners can see different content and questions as they progress through their courses. In navigation apps, a route might appear automatically based on your schedule, such as the route to the airport on the day of a business trip. These are just examples; this can be applied across industries.

In addition, an app can become more sophisticated and accurate the longer you use it, because it learns your behavior rapidly, and it can always be updated to maximize customer satisfaction. This means we will not need to choose what we want one item at a time, because computers will do that for us, and buttons and navigation menus will be needed less in such apps. All you will have to do is enter your latest schedule into your computer, and everything can be optimized based on the updated information. Are people getting lazy? Maybe yes, if apps become as sophisticated as expected, but it should be good for all of us. We may be free to do what we want!

 

 

Actually, I quit an investment bank in Tokyo to set up my startup at the same time that MIT Technology Review released its 10 Breakthrough Technologies 2013. Initially I knew the words "Deep Learning", but I could not understand how important it would be to us, because it was completely new to me. However, I am so confident now that I always say "Deep Learning" is changing the landscape of jobs, industries and societies. Do you agree? I imagine everyone will agree by the end of 2016!

 

 

 


Can computers write the sentences of documents to support you in the future?


This is amazing! It is one of the most incredible applications I have seen this year, and I am very excited about it. Let me share it with you so that you can use it, too.

This is "Smart Reply" in Inbox, an e-mail application from Google. It was announced on 3rd November, and I tried it today.

For example, I got an e-mail from Hiro. He asked me to have lunch tomorrow. On the screen, three candidate answers appear automatically: 1. Yes, what time? 2. Yes, what's up? 3. No, sorry. These candidates are created after the computer understands what Hiro said in the e-mail, so each of them sounds very natural to me.


So all I have to do is choose the first candidate and send it to Hiro. It is easy!


According to Google, the state-of-the-art technology "Long Short-Term Memory" is used in this application.

I always wonder how computers understand the meaning of words and sentences. In this application, sentences are represented as fixed-size vectors, meaning that each sentence is converted to a sequence of numbers. If two sentences have the same meaning, their vectors should be similar to each other, even though the original sentences look different.
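A very simplified way to see this idea is to turn each sentence into a vector of word counts and compare vectors with cosine similarity. Smart Reply actually uses learned LSTM representations rather than word counts; this is just a toy showing "similar meaning, similar vector":

```python
from collections import Counter
import math

def sentence_vector(sentence, vocab):
    """Represent a sentence as a vector of word counts over a vocabulary."""
    counts = Counter(sentence.lower().split())
    return [counts[w] for w in vocab]

def cosine(u, v):
    """Cosine similarity: 1.0 for identical directions, 0.0 for no overlap."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

s1 = "can we have lunch tomorrow"
s2 = "shall we have lunch tomorrow"
s3 = "the report is due next month"
vocab = sorted(set((s1 + " " + s2 + " " + s3).split()))

v1, v2, v3 = (sentence_vector(s, vocab) for s in (s1, s2, s3))
print(cosine(v1, v2), cosine(v1, v3))
```

The two lunch invitations end up with similar vectors while the unrelated sentence does not, which is the intuition behind mapping sentences to vectors, even though learned representations capture far more than shared words.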

 

This technology is a kind of machine learning. Therefore, the more people use it, the more sophisticated it can become, because it learns by itself. For now it applies to relatively short sentences like e-mail, but I am sure it will be applied to longer sentences, such as official business documents, and I wonder when that will happen. Prof. Geoffrey Hinton is expected to research this area intensely. If it happens, computers will be able to understand what documents mean and create sentences based on their understanding. I do not know how industries will change when that happens.

This kind of technology is sometimes referred to as "natural language processing" or "NLP". I want to focus on this area as a main research topic of my company in 2016. Progress will be shared through my weekly letter here.

 

I recommend you try Smart Reply in Inbox and enjoy it! Let me know your impressions. Cheers!

 

 

 

Note: Toshifumi Kuga’s opinions and analyses are personal views and are intended to be for informational purposes and general interest only and should not be construed as individual investment advice or solicitation to buy, sell or hold any security or to adopt any investment strategy.  The information in this article is rendered as at publication date and may change without notice and it is not intended as a complete analysis of every material fact regarding any country, region market or investment.

Data from third-party sources may have been used in the preparation of this material and I, Author of the article has not independently verified, validated such data. I and TOSHI STATS.SDN.BHD. accept no liability whatsoever for any loss arising from the use of this information and relies upon the comments, opinions and analyses in the material is at the sole discretion of the user. 

How can we communicate with computers in the future?


I sometimes have opportunities to teach data analysis to business people, and I send e-mails to learners in order to explain how it works. This makes me wonder whether computers will be able to do the same things as I do in the future. This is called a "question answering system". Based on the progress of technology, the answer may be "yes", and not too far from now. Let us consider it for a while.

 

1. Can computers understand our natural languages as we do?

In order to communicate with us, computers should learn how we use natural languages, such as English, Malay, Chinese, Japanese and so on. This is very difficult for computers, but with a technological breakthrough it might be possible in the near future. The technology is called "thought vectors", and its development is led by Dr. Hinton, a professor in the computer science department at the University of Toronto. His explanation (1) is a little complicated. In short, our sentences are mapped to vectors of numbers so that computers can understand and calculate their meaning. For example, "Kuala Lumpur – Malaysia + Japan = Tokyo". This kind of calculation might be possible using "thought vectors", according to the article (2). Translation could also become more accurate with "thought vectors", because they can serve as a bridge from one language to another. He said "computers will have developed common sense within a decade" in the article. I think that is revolutionary!
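The capital-city example can be illustrated with tiny hand-made vectors. Real thought vectors and word embeddings are learned from huge text corpora and have hundreds of dimensions; the 2-D numbers below are invented purely so that "capital = country + a fixed offset" holds:

```python
import math

# Hand-made 2-D word vectors, invented so that the capital of each country
# sits at country + (0.5, 0.5). Real embeddings are learned from text.
vectors = {
    "Malaysia":     (1.0, 0.0),
    "Kuala Lumpur": (1.5, 0.5),
    "Japan":        (0.0, 1.0),
    "Tokyo":        (0.5, 1.5),
}

def nearest(query, exclude):
    """Return the stored word whose vector is closest to `query`."""
    best, best_dist = None, float("inf")
    for word, vec in vectors.items():
        if word in exclude:
            continue
        dist = math.dist(vec, query)
        if dist < best_dist:
            best, best_dist = word, dist
    return best

kl, my, jp = vectors["Kuala Lumpur"], vectors["Malaysia"], vectors["Japan"]
query = (kl[0] - my[0] + jp[0], kl[1] - my[1] + jp[1])
print(nearest(query, exclude={"Kuala Lumpur", "Malaysia", "Japan"}))
```

Because the country-to-capital relationship is encoded as a consistent direction in the vector space, subtracting "Malaysia" and adding "Japan" lands near "Tokyo", which is exactly the kind of arithmetic the article describes.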

 

2.  Would we like to talk to computers?

Some people wonder whether we would actually like talking with computers. I think "yes". Now computers can be the brain in a robot, and robots have started to look so cute. Pepper, developed by Aldebaran Robotics and SoftBank Group, is very popular in Japan. Last month, Pepper went on sale at retail there, and 1,000 units were sold out (3) in just a minute, even though it is not cheap. I think robots can be people's friends, just like a dog.

 

3. How will it impact our businesses and society?

It is very difficult to imagine the impacts of this technology on our business and society. It is a kind of revolution in how our knowledge and intelligence are used in our lives. Simple tasks might be done by computers, and people, supported by computers, will create new "knowledge and intelligence" that do not exist now. Through conversations with computers, people can obtain information and insights about new things, because computers can keep massive amounts of data in the form of text, images, sound, voice and so on. It must be exciting, mustn't it?

 

 

Do you know the humanoid robot called "C-3PO" in the movie "STAR WARS"? It might appear in front of us in the near future! C-3PO can translate many kinds of languages across the universe and answer questions from people. I hope I can buy one in the future, just like Luke Skywalker. How about you?

 

 

Source

1. ‘Thought vectors’ could revolutionize artificial intelligence, EXTREME TECH, 27 May 2015

2. Google a step closer to developing machines with human-like intelligence, The Guardian, 21 May 2015

3. ‘Emotional’ robot sells out in a minute, CNN, 23 June 2015

 



Do you know how computers can read e-mails instead of us?


Hello, friends. I am Toshi. Today I update my weekly letter, and this week's topic is "e-mail". Everyone now uses e-mail to communicate with customers, colleagues and family. It is useful and efficient. However, if you try to read massive amounts of e-mail manually at once, it takes a lot of time. Recently, computers have become able to read e-mail and separate potentially relevant messages from the rest for us. So I wonder how computers can do that. Let us consider it a little.

1. Our words can become "data"

When we hear the word "data", we imagine numbers in spreadsheets. This is a kind of "traditional" data, formally called "structured data". On the other hand, text such as the words in e-mail, Twitter and Facebook can be "data", too. This kind of data is called "unstructured data", and most of the data around us exists in this form. However, computers can transform it into data that can be analyzed, and this is generally an automated process, so we do not need to check each item one by one. Once these new data are created, computers can analyze them at astonishing speed. This is one of the biggest advantages of using computers to analyze e-mail.

2. Classification comes again

Actually, there are many ways for computers to understand e-mail. These methods are sometimes called "natural language processing (NLP)". One of the most sophisticated approaches uses machine learning and understands the meaning of sentences by looking at their structure. Here I would like to introduce one of the simplest methods, so that everyone can understand how it works. It is easy to imagine that the "count of each word" can be data. For example, take "I want to meet you next week.". In this case, (I, 1), (want, 1), (to, 1), (meet, 1), (you, 1), (next, 1), (week, 1) are the data to be analyzed. The longer the sentences are, the more words appear as data. For example, suppose we analyze e-mail from customers to assess who is satisfied with our products. If the counts of positive words, such as "like", "favorite" and "satisfied", are high, it might mean the customer is satisfied with the products, and vice versa. This is a "classification" problem, so we can apply the same method I explained before: the "target" is "customer satisfied" or "not satisfied", and the "features" are the counts of each word.
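The word-counting step above can be sketched with Python's `collections.Counter`, plus a crude positive-word score as a stand-in for a trained classifier. The word list and the rule are invented for illustration; a real classifier would learn which words matter from labeled examples:

```python
from collections import Counter

# Invented positive-word list, purely for illustration.
POSITIVE = {"like", "favorite", "satisfied", "great", "love"}

def word_counts(text):
    """Turn a sentence into (word, count) features."""
    return Counter(text.lower().replace(".", "").split())

def is_satisfied(text):
    """Crude stand-in for a classifier: any positive word => 'satisfied'."""
    counts = word_counts(text)
    score = sum(counts[w] for w in POSITIVE)
    return score > 0

print(word_counts("I want to meet you next week."))
print(is_satisfied("I love this product, it is my favorite"))
print(is_satisfied("The package arrived late"))
```

A real system would replace the fixed word list with weights learned from examples of satisfied and unsatisfied customers, but the features, the counts of each word, are exactly as described above.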

3. What is the impact on businesses?

If computers understand what we say in text such as e-mail, we can make the most of it in many fields. In marketing, we can analyze the voices of customers from massive amounts of e-mail. In legal services, computers identify which e-mails are potentially relevant as evidence for litigation; this is called "e-discovery". In addition, I found that the Bank of England has started monitoring social networks such as Twitter and Facebook in order to research economies. This is a kind of "new wave" of economic analysis. These are just examples; I think you can create many more examples of business applications yourself, because we are surrounded by a lot of e-mail now.

In my view, natural language processing (NLP) will play a major role in the digital economy.   Would you like to exchange e-mail with computers?

Is it possible to raise the quality of services if computers can talk to you?


When you go to Uniqlo, the staff talk to you and advise you on how to coordinate your favorite fashion. When you go to a hospital, doctors ask what your condition is and advise you on what to do to stay healthy. So let us consider whether computers could talk to you and answer your questions instead of a human being.

Knowing the customer is the first step in service industries, just as knowing the student is in education. That is why so many people work face-to-face with customers and students. If computers could face customers and students, the quality of services could go up dramatically, because computers are cost-effective and operate 24 hours a day, 365 days a year, without rest.

 

I like taking open online courses. They are very convenient, as we can watch the lectures whenever we want, as long as an internet connection is available. But the biggest problem is that there is no teacher available for each learner when a question arises. This description explains the problem very well:

"Because of the nature of MOOC-style instruction (Massive Open Online Course), teachers cannot provide active feedback to individual learners. Most MOOCs have thousands of learners enrolled at the same time and engaging personally with each learner is not possible."

When I cannot understand the lectures or solve the exam problems by myself, it is very difficult to continue learning, because I feel powerless. This is one of the reasons why the completion rate of open online courses is very low (usually less than 10%). If you need assistance from instructors, you must pay fees that are not cheap for people in developing countries. I want to change this situation.

 

A technology called "machine learning" may enable us to enjoy conversations with computers across industries, from finance to education. Computers can understand what you ask and provide answers in real time. It will take some time to make computers sophisticated enough to answer exactly what you want. It is like childhood: at the beginning there is very little knowledge, so it may be difficult to answer questions; then computers start learning from interactions with humans, and the more knowledge they have, the more sophisticated their answers become.

So I would like to start examining how computers learn in order to provide sophisticated answers to learners and customers. If computers obtain enough knowledge effectively, they can talk to you and enjoy conversations with you. I hope computers can be good partners for us.

This new toy looks so bright! Do you know why?


Last week I found that a new toy for children called "CogniToys" will be developed through a project on Kickstarter, one of the biggest crowdfunding platforms. The developer is Elemental Path, one of the three winners of the IBM Watson competition. Let us see why it is so bright!

According to the company's website, this toy is connected to the internet. When a child talks to the toy, it can reply, because it can understand what the child says and answer the child's questions. It usually takes less than one second to answer, because the IBM Watson-powered system is powerful enough to calculate answers quickly.

 

Let us look at the description of this company's technology:

“The Elemental Path technology is built to easily license and integrate into existing product lines. Our dialog engine is able to utilize some of the most advanced language processing algorithms available driving the personalization of our platform, and keeping the interaction going between toy and child.”

The key words are: 1. Dialog, 2. Language processing, 3. Personalization.

 

1. Dialog

This toy communicates with children through conversation rather than programming. Therefore, a technology called "speech recognition" is needed. This technology is also applied in real-time machine translation, such as in Microsoft's Skype.

 

2. Language processing

In the area of machine learning, this is called "natural language processing". Based on the structure of sentences and phrases, the toy understands what children say. IBM Watson is an expert in natural language processing, because Watson had to understand the meaning of questions in its earlier Jeopardy! contests.

 

3. Personalization

It is beneficial if, when children talk to this toy, it already knows their preferences. This technology is called "personalization". Through interactions between children and the toy, it can learn what children like. This technology is often used by retailers such as Amazon and by Netflix. There is no disclosure about the method of personalization as far as I know, and I am very interested in how the personalization mechanism works.

 

In short, machine learning enables this toy to work and be smart. Machine learning functions are provided as a service by big IT companies such as IBM and Microsoft. Therefore, more applications of this kind are expected to come to market in the future. This is amazing, isn't it? I imagine the next versions of the toy will be able to see images, identify what they show and share images with children, because a technology called image recognition is also offered as a service by big companies.

I ordered one CogniToy through Kickstarter. It is expected to be delivered in November this year. I will report on how it works when I get it!

 

Note:IBM, IBM Watson Analytics, the IBM logo are trademarks of International Business Machines Corporation, registered in many jurisdictions worldwide. 

What can computers do now? They look very smart!


Lately I found that several companies, such as Microsoft and IBM, provide services based on machine learning. Let us see what is going on now.

These new services are based on recent progress in machine learning. For example, machine translation between English and Spanish is provided by Microsoft's Skype. It uses natural language processing powered by machine learning. Although the service only started in December 2014, its quality is expected to improve quickly, as a lot of people use it and the computer can learn from the users' data.

 

It is useful to explain what computers can do these days, so that you can imagine new services in the future. First, computers can see images and videos and identify what they show; this is image recognition. Second, they can listen to our speech and interpret what we mean; this is speech recognition. They can also translate one language into another; this is machine translation. Third, computers can search based on concepts rather than keywords. Fourth, they can calculate the best choice among potential options; this is optimization. In short, computers can see, listen, read, speak and think.

These functions are utilized in many products and services, even though you may not notice it. For example, IBM Watson Analytics provides these functions to developers through a platform as a service.

 

I expect these functions will enable computers to behave just like us. At the initial phase, they may not be so good, just like a baby. However, machine learning allows computers to learn from experience, which means that computers may eventually perform better than we do in many fields. As you know, in Shogi, one of the most popular Japanese board games, artificial machine players can already beat teams of human professionals. This is amazing!
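"Learning from experience" can be made concrete with a toy sketch: the computer refines a decision boundary as it sees more labeled examples, so its guess gets closer to the truth. The data and the true threshold (5.0) below are invented purely for illustration:

```python
# Toy "learning from experience": refine a yes/no threshold from examples.
# The samples and the true boundary (5.0) are invented for illustration.

def fit_threshold(samples):
    # Place the boundary midway between the largest "no" and smallest "yes".
    lo = max(x for x, label in samples if not label)
    hi = min(x for x, label in samples if label)
    return (lo + hi) / 2

data = [(1.0, False), (10.0, True), (2.5, False), (8.0, True),
        (4.8, False), (5.2, True)]

early = fit_threshold(data[:2])  # estimate after only two experiences
late = fit_threshold(data)       # estimate after all six experiences
print(early, late)               # 5.5 then 5.0: closer to the true boundary
```

With just two examples the estimate is off by 0.5; with all six it lands exactly on the true boundary. Real machine learning models are far more complex, but they improve with data in the same spirit.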

Going forward, I recommend that you understand how computers are progressing in terms of the functions above. Many companies, such as Google and Facebook, invest a great deal of money in this field. Therefore, many services are anticipated to be released in the near future. Some of these new services could greatly impact our jobs, education and society. Some of them may give rise to new industries.

 

Some day, when you enter the room, the computer will identify you by computer vision and ask if you want a cup of coffee. The computer holds a lot of data, such as temperature, weather, time, season and your preferences, and brews the best coffee for you. If you want to know how the coffee was made, the computer provides you with a detailed report. All settings are done automatically. It is the ultimate coffee maker, driven by a powerful computer algorithm. Do you want one?

 

 

Note: IBM, IBM Watson Analytics, the IBM logo are trademarks of International Business Machines Corporation, registered in many jurisdictions worldwide.

Is this message spoken by a human or a machine?!


Firstly, could you watch the video?  Our senior instructor speaks about himself.  It sounds natural to me, far better than my poor English. Then the question comes: who is really speaking?  A human or a machine?  The answer is IBM Watson, one of the most famous artificial intelligences in the world.  When I listened to his (or her?) English, I was very surprised, as it sounds very natural and fluent.  I have wanted an artificial English speaker for a long time in order to develop self-speaking apps. Finally, I found one!

This function is one of five new services provided in the IBM Watson Developer Cloud as beta services.  It now has 13 functions in total. Here are the new services.

  1. Speech to Text :  Speech can be converted to text in real time. It worked well when I tried converting a news broadcast into text.
  2. Text to Speech :  This was used to prepare the video message above without native speakers. It sounds natural for both male and female voices.  English and Spanish (male only) are currently available. One of the voices is the American English voice used by Watson in the 2011 Jeopardy match.
  3. Visual Recognition : When you input a JPG image, Watson identifies what it is, with probabilities.  I tried several images; however, it looks less accurate than I expected so far. In my view it needs improvement before it can be used in applications.
  4. Concept Insights : According to the company blog, the Concept Insights service links documents that you provide with a pre-existing graph of concepts based on Wikipedia.  I think it is useful, as it goes beyond simple keyword search.
  5. Tradeoff Analytics : According to the company blog, it helps people make better choices when faced with conflicting goals and multiple alternatives, each with its own strengths and weaknesses.  I think it has optimization algorithms in it. It may be useful for constructing investment portfolios.
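The core idea behind a tradeoff service like this can be sketched in a few lines (this is my own illustration, not the Watson API): discard every alternative that is dominated, i.e. no better on any goal and strictly worse on at least one, compared with some other alternative. The portfolios and their numbers below are invented for the example:

```python
# Tradeoff sketch: keep only non-dominated alternatives.
# Two conflicting goals: higher return is better, lower risk is better.
# The portfolios are invented for illustration only.
portfolios = {
    "A": {"return": 0.08, "risk": 0.10},
    "B": {"return": 0.05, "risk": 0.04},
    "C": {"return": 0.04, "risk": 0.09},
}

def dominates(p, q):
    # p dominates q if it is at least as good on both goals
    # and strictly better on at least one.
    return (p["return"] >= q["return"] and p["risk"] <= q["risk"]
            and (p["return"] > q["return"] or p["risk"] < q["risk"]))

frontier = [name for name, q in portfolios.items()
            if not any(dominates(p, q) for other, p in portfolios.items()
                       if other != name)]
print(sorted(frontier))  # ['A', 'B'] — C is dominated by B
```

Portfolio C is removed because B offers both a higher return and a lower risk; A and B remain because each wins on a different goal, which is exactly the "conflicting goals" situation the service targets.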

Watson can listen to speech, read text and speak it.  It can also see an image and, to some extent, understand what it shows. Therefore, with these newly added functions, Watson can do the same things humans do.  In theory, mobile applications can obtain the same capabilities as people: seeing, reading, listening and speaking.

IBM Watson Developer Cloud plans to add new functions as they become ready. Although these are currently beta services, their quality should improve gradually as the machine learning behind them keeps learning. This will enable us to develop new services with artificial intelligence in a short period.  It must be amazing. What kind of services do you want? Maybe they will be available in the near future!

Note: IBM, IBM Watson, the IBM logo are trademarks of International Business Machines Corporation, registered in many jurisdictions worldwide.