# How can computers see objects? It is done with probability

Do you know how computers can see the world? It is an important question, as self-driving cars will be available in the near future. If you do not know how they work, you may not be brave enough to ride in one. So let me explain it briefly.

1. An image can be expressed as a sequence of numbers

I believe you have heard the word “RGB“. R stands for red, G for green, and B for blue. Every color is created by mixing the three colors R, G and B. Each of R, G and B takes a value somewhere from 0 to 255. Therefore each point in an image, which is called a “pixel”, has a vector such as [255, 35, 57]. So each image can be expressed as a sequence of numbers, and this sequence is fed into the computer so it can understand what the image shows.
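As a minimal sketch of this idea in Python with NumPy (the pixel values below are made up for illustration), a tiny 2×2 image is just an array of [R, G, B] vectors, and flattening it gives the sequence of numbers a computer processes:

```python
import numpy as np

# A tiny 2x2 RGB image: each pixel is a vector [R, G, B],
# with each channel an integer from 0 to 255.
image = np.array([
    [[255,  35,  57], [  0, 128, 255]],
    [[ 12, 200,  90], [255, 255, 255]],
], dtype=np.uint8)

print(image.shape)   # (height, width, channels) -> (2, 2, 3)

# Flattening turns the image into one long sequence of numbers.
sequence = image.flatten()
print(sequence[:6])  # the values of the first two pixels
```

The shape `(height, width, 3)` is the usual layout for color images; a 1000×1000 photo is simply three million numbers in this same form.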

2. A convnet and a classifier learn and classify images

Once images are fed into a computer, a convnet is used to analyze the data. A convnet (convolutional neural network) is one of the best-known deep learning algorithms and is frequently used for computer vision. The basic process of image classification works as follows.

• The image is fed into the computer as a sequence of numbers
• The convolutional neural network extracts features that represent the object in the image
• The features are obtained as a vector
• A classifier outputs a probability for each candidate class
• The object in the image is classified as the class with the highest probability

In this case, the probability of “dog” is the highest, so the computer classifies the image as a dog. Of course, each image has a different set of probabilities, which is how computers can tell what they see.
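The last two steps above can be sketched in a few lines of Python. The class names and scores below are hypothetical, just for illustration; a softmax turns the classifier’s raw scores into probabilities, and the prediction is the class with the largest one:

```python
import numpy as np

def softmax(scores):
    # Subtract the max for numerical stability, then normalize
    # the exponentials so the outputs sum to 1 (a probability).
    exp = np.exp(scores - np.max(scores))
    return exp / exp.sum()

# Hypothetical scores the classifier produced for three candidates.
classes = ["dog", "cat", "bird"]
scores = np.array([3.1, 1.2, 0.4])

probs = softmax(scores)
for name, p in zip(classes, probs):
    print(f"{name}: {p:.3f}")

prediction = classes[int(np.argmax(probs))]
print("Predicted:", prediction)  # "dog" has the highest probability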

3. This is the basic process of computer vision. In order to achieve higher accuracy, many researchers have been intensively developing better algorithms and processing methods. I believe the most advanced computer vision algorithms are about to surpass human eyesight. Have you seen the famous experiment in which a researcher tested his own sight against a convnet? (1) His error rate was 5.1%.

Now I am very interested in computer vision and will focus on this field in my research. I hope I can share new findings in the near future.

(1) “What I learned from competing against a ConvNet on ImageNet”, Andrej Karpathy, Research Scientist at OpenAI, Sep 2, 2014

http://karpathy.github.io/2014/09/02/what-i-learned-from-competing-against-a-convnet-on-imagenet/

Notice: TOSHI STATS SDN. BHD. and I do not accept any responsibility or liability for loss or damage occasioned to any person or property through using materials, instructions, methods, algorithm or ideas contained herein, or acting or refraining from acting as a result of such use. TOSHI STATS SDN. BHD. and I expressly disclaim all implied warranties, including merchantability or fitness for any particular purpose. There will be no duty on TOSHI STATS SDN. BHD. and me to correct any errors or defects in the codes and the software

# Is this a real human voice? It is amazing, as it was generated by computers

As I shared in an article this week, I found an exciting system that generates voices by computer. When I heard the voice, I was very surprised because it sounds so real. I recommend you listen to the samples on the website here. There are English and Mandarin versions. It was created by DeepMind, one of the best artificial intelligence research labs in the world. What makes it possible? Let us see.

1. Computers learn our voices in more and more depth

According to DeepMind’s explanation, they use “WaveNet, a deep neural network for generating raw audio waveforms”. They also mention “PixelRNN and PixelCNN”, which they invented earlier this year. (Based on that research, they won one of the Best Paper Awards at ICML 2016, one of the biggest international conferences on machine learning.) By applying the ideas of PixelRNN and PixelCNN to voice generation, computers can learn the waveforms of voices in far more detail than with previous methods. This enables computers to generate more natural voices. That is how WaveNet was born.

As a result of learning raw audio waveforms, the computer can generate voices that sound very real. Could you look at the metrics below? The score of WaveNet is not so different from the score of human speech (1). It is amazing!
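One building block WaveNet uses to model raw waveforms is the dilated causal convolution: each output sample may depend only on past samples, and the dilation widens the receptive field so the network can cover long stretches of audio. Here is a minimal NumPy sketch; the filter values and toy signal are arbitrary, chosen only to show the mechanics:

```python
import numpy as np

def dilated_causal_conv(x, w, dilation):
    """1-D causal convolution: output t sees only x[t], x[t-d], x[t-2d], ..."""
    y = np.zeros_like(x, dtype=float)
    for t in range(len(x)):
        for k in range(len(w)):
            idx = t - k * dilation
            if idx >= 0:  # causal: never look into the future
                y[t] += w[k] * x[idx]
    return y

signal = np.sin(np.linspace(0, 4 * np.pi, 32))  # toy "waveform"
w = np.array([0.5, 0.3, 0.2])                   # arbitrary 3-tap filter

# Stacking layers with dilations 1, 2, 4, ... roughly doubles the
# receptive field at each layer, covering long audio contexts cheaply.
h = signal
for d in (1, 2, 4):
    h = dilated_causal_conv(h, w, d)
print(h.shape)  # same length as the input: (32,)
```

The real WaveNet adds gated activations, residual connections and a softmax over sample values on top of this, but the causal, dilated structure is the core idea.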

2. Computers can generate both men’s and women’s voices

As computers can learn the waveforms of our voices in more detail, they can create both male and female voices. You can also listen to each of them on the web. DeepMind says, “Similarly, we could provide additional inputs to the model, such as emotions or accents” (2). I would like to listen to those, too!

3. Computers can generate not only voices but also music!

In addition, WaveNet can create music, too. I listened to the piano music generated by WaveNet and I like it very much, as it sounds so real. You can try it on the web, too. When we consider music and voice as just audio waveform data, it is natural that WaveNet can generate not only voices but also music.

If we can use WaveNet in digital marketing, it will be awesome! Every promotion, instruction and piece of guidance for customers could be delivered in a WaveNet voice! Customers may not even recognize that it is a computer-generated voice. Background music could be optimized for each customer by WaveNet, too! In my view, this algorithm could be applied to many other problems, such as detecting cyber-security attacks, anomaly detection in engine vibrations, or earthquake analysis, as long as the data takes the form of a “wave”. I want to try many things myself!

Have you listened to the voices generated by WaveNet? I believe that in the near future, computers could learn how I speak and generate my voice just as I would say it. It must be exciting!

(1), (2) “WaveNet: A Generative Model for Raw Audio”, DeepMind

https://deepmind.com/blog/wavenet-generative-model-raw-audio/


# Let us overview the variations of deep learning now!

This weekend, I researched recurrent neural networks (RNNs), as I want to develop my small chatbot. I also ran a convnet program, as I want to confirm how accurate they are. So I think it is good timing to overview the variations of deep learning, because this makes it easier to learn each network in detail.

1. Fully connected network

This is the basis of deep learning. When you hear the words “deep learning”, they mean a fully connected network in most cases. Let us look at the program in my article from last week again. You can see “fully_connected” in it. This network is loosely inspired by the networks of neurons in our brain.
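A single fully connected layer is just a matrix multiplication plus a bias, followed by a nonlinearity: every input unit connects to every output unit. A minimal NumPy sketch (the weights are random, only to show the shapes involved):

```python
import numpy as np

rng = np.random.default_rng(0)

def fully_connected(x, W, b):
    # Every input feature connects to every output unit: y = relu(Wx + b)
    return np.maximum(0.0, W @ x + b)

x = rng.normal(size=4)       # 4 input features
W = rng.normal(size=(3, 4))  # 3 output units, each seeing all 4 inputs
b = np.zeros(3)

y = fully_connected(x, W, b)
print(y.shape)  # (3,)
```

A “deep” fully connected network is simply several of these layers stacked, with the output of one layer used as the input of the next.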

2. Convolutional neural network (Convnet)

This is mainly used for image recognition and computer vision. There are many variations of convnets designed to achieve higher accuracy. Do you remember the TED presentation I recommended before? Watch it again if you want to learn more about convnets.

3. Recurrent neural network (RNN)

The biggest advantage of an RNN is that it does not need a fixed-size input (a convnet does). Therefore it is frequently used in natural language processing, as our sentences are sometimes very short and sometimes very long. In other words, an RNN can handle sequences of input data effectively. To overcome the difficulties that arise when training the parameters, many kinds of RNNs have been developed and are in use now.
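The point about not needing a fixed-size input can be seen in a minimal sketch: one shared step function is applied once per element, however long the sequence is, and the hidden state always has the same size. The weights below are random, purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
hidden = 5

# One shared set of weights is reused at every time step.
W_xh = rng.normal(scale=0.1, size=(hidden, 1))
W_hh = rng.normal(scale=0.1, size=(hidden, hidden))
b = np.zeros((hidden, 1))

def rnn_encode(sequence):
    """Fold a sequence of any length into one fixed-size hidden state."""
    h = np.zeros((hidden, 1))
    for x in sequence:  # x is a scalar input at one time step
        h = np.tanh(W_xh * x + W_hh @ h + b)
    return h

# A short and a long input both yield a hidden state of the same size.
short = rnn_encode([0.1, 0.5])
long_ = rnn_encode([0.1, 0.5, -0.3, 0.8, 0.2, -0.1])
print(short.shape, long_.shape)  # (5, 1) (5, 1)
```

This is why a sentence of 3 words and a sentence of 30 words can both be fed to the same RNN without padding them to a fixed size.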

4. Reinforcement learning (RL)

• The output is an action or sequence of actions, and the only supervisory signal is an occasional scalar reward.

• The goal in selecting each action is to maximize the expected sum of the future rewards. We usually use a discount factor for delayed rewards so that we don’t have to look too far into the future.

This is a good explanation, taken from slide 46 of lecture 1 of “Neural Networks for Machine Learning” by Geoffrey Hinton on Coursera.
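The “expected sum of future rewards with a discount factor” from the slide can be computed directly. A small sketch (the reward values are made up for illustration):

```python
def discounted_return(rewards, gamma=0.9):
    """Sum of future rewards, with reward at step t weighted by gamma**t."""
    g = 0.0
    for t, r in enumerate(rewards):
        g += (gamma ** t) * r
    return g

# An agent that receives reward 1 at every step, for 3 steps:
print(discounted_return([1, 1, 1], gamma=0.9))  # 1 + 0.9 + 0.81, about 2.71
```

Because gamma < 1, rewards far in the future contribute almost nothing, which is exactly why the agent “doesn’t have to look too far into the future”.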

Many researchers all over the world have been developing new models, so new kinds of networks may be added in the near future. Until then, these models can be considered the building blocks for implementing deep learning algorithms to solve our problems. Let us use them effectively!


# We might need less energy, as artificial intelligence can enable us to save it

When I heard the news about the reduction of energy consumption in Google data centers (1), I was very surprised, because these systems have been optimized for a long time. That means it is very difficult to improve their efficiency any further.

It was done by “Google DeepMind“, which has been developing “general artificial intelligence”. Google DeepMind is an expert in “deep learning“, one of the major technologies of artificial intelligence. Their deep learning models can reduce the energy consumption of Google’s data centers dramatically. Many kinds of data are collected in the data centers, such as temperatures, power, pump speeds, etc., and the models provide more efficient control of energy consumption. This is amazing. If you are interested in the details, you can read their blog post at the link below.

It is easy to imagine that there is much room to improve efficiency outside Google’s data centers. There are many huge systems, such as factories, airports, power generators, hospitals, schools, shopping malls, etc. But few systems have the same kind of control that Google DeepMind provides. I think they can become more efficient based on the points below.

1. More data will be available from devices, sensors and social media

Most people have their own mobile devices and use them every day. Sensors are getting cheaper, and there are many sensors in factories, airplane and automobile engines, power generation plants, etc. People use social media and generate their own content every day. This means that massive amounts of data are being generated, and the volume of data is increasing dramatically. The more data is available, the more chances we have to improve energy consumption.

2. Computing resources are available anywhere, anytime

The data itself can say nothing until it is analyzed. When a massive amount of data is available, massive computing resources are needed. But do not worry: now we have cloud systems. Without buying our own computing resources, such as servers, we can start analyzing data in the “cloud“. The cloud offers a “pay as you go” model, which means we do not need a huge initial investment to start understanding data. Just start today with the cloud. Cloud providers, such as Amazon Web Services, Microsoft Azure and Google Cloud Platform, offer massive amounts of computing resources. Fast computational resources, such as GPUs (graphics processing units), are also available. So we can make the most of massive amounts of data.

3. Algorithms will be improved at astonishing speed

I have heard that more than 1,000 research papers were submitted to one major international machine learning conference. This means that many researchers are developing their own models to improve the algorithms every day. There are many international conferences on machine learning every year. I cannot imagine how many algorithmic innovations will appear in the future.

At the end of their blog post, Google DeepMind says:

“We are planning to roll out this system more broadly and will share how we did it in an upcoming publication, so that other data centre and industrial system operators — and ultimately the environment — can benefit from this major step forward.”

So let us see what they say in their next publication. Then we can discuss how to apply their technology to our own problems. It must be exciting!

(1) “DeepMind AI Reduces Google Data Center Cooling Bill by 40%”, 21 July 2016

https://deepmind.com/blog
