How can computers see objects? They do it with probability


Do you know how computers see the world? This matters because self-driving cars will be available in the near future. If you do not know how they see, you may not feel brave enough to ride in one. So let me take a moment to explain.

 

1. An image can be expressed as a sequence of numbers

I believe you have heard the term “RGB”. R stands for red, G for green, and B for blue. Every color is created by mixing these three colors. Each of R, G, and B has a value from 0 to 255. Therefore each point in an image, called a “pixel”, has a vector such as [255, 35, 57], and each image can be expressed as a sequence of numbers. This sequence of numbers is fed into the computer so that it can work out what the image shows.
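
To make this concrete, here is a minimal sketch in plain Python. The 2x2 image and the `flatten` helper are hypothetical and exist only for illustration; only the pixel [255, 35, 57] comes from the text above.

```python
# A hypothetical 2x2 image: rows of pixels, each pixel an [R, G, B] vector
# whose values are integers from 0 to 255.
image = [
    [[255, 35, 57], [0, 0, 0]],
    [[255, 255, 255], [12, 200, 90]],
]


def flatten(img):
    """Return all channel values of the image as one flat sequence."""
    return [value for row in img for pixel in row for value in pixel]


pixel = image[0][0]        # RGB vector of the top-left pixel
sequence = flatten(image)  # the whole image as one sequence of numbers

print(pixel)          # [255, 35, 57]
print(len(sequence))  # 12 numbers: 2 x 2 pixels x 3 channels
```

A real photo works the same way, only with far more pixels: a 640x480 color image becomes 640 x 480 x 3 = 921,600 numbers.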

 

2. A convnet and a classifier learn and classify images

Once an image is fed into the computer, a convnet (convolutional neural network) is used to analyze the data. The convnet is one of the best-known deep learning algorithms and is frequently used for computer vision. The basic process of image classification is as follows.

[Figure: overview of the image classification process]

  • The image is fed into the computer as a sequence of numbers
  • A convolutional neural network identifies features that represent the object in the image
  • The features are obtained as a vector
  • A classifier provides a probability for each candidate object
  • The image is classified as the candidate object with the highest probability
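
The first three steps above can be sketched in plain Python. This is only an illustration under simplifying assumptions: the 4x4 grayscale image is made up, the two filters are written by hand (a real convnet learns its filters from data and stacks many layers), and the helper names `convolve2d` and `extract_features` are hypothetical.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1).

    As in most deep learning libraries, the kernel is not flipped.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh)
                for dj in range(kw)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def extract_features(image, kernels):
    """One feature per kernel: apply ReLU, then keep the strongest response."""
    features = []
    for kernel in kernels:
        response = convolve2d(image, kernel)
        features.append(max(max(0, v) for row in response for v in row))
    return features


# Hypothetical 4x4 grayscale image with a vertical edge: dark left, bright right.
image = [
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [0, 0, 255, 255],
]

# Hand-written filters, for illustration only.
vertical_edge = [[-1, 1], [-1, 1]]
horizontal_edge = [[-1, -1], [1, 1]]

features = extract_features(image, [vertical_edge, horizontal_edge])
print(features)  # [510, 0]
```

The feature vector [510, 0] says that the image contains a strong vertical edge and no horizontal edge. A convnet builds much richer features in the same spirit, and passes the resulting vector to the classifier.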

In this case, the probability of “Dog” is the highest, so the computer classifies the image as a dog. Of course, each image produces a different set of probabilities, and that is how computers can tell what each image shows.
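
The last two steps, turning scores into probabilities and picking the most likely object, might look like the sketch below. It assumes a softmax classifier, which is a common choice but not the only one. The scores and the labels other than “Dog” are hypothetical values chosen for illustration.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def classify(scores, labels):
    """Return the label with the highest probability, plus all probabilities."""
    probs = softmax(scores)
    best = max(range(len(labels)), key=lambda i: probs[i])
    return labels[best], probs


labels = ["Dog", "Cat", "Car"]  # hypothetical candidate objects
scores = [4.0, 1.5, 0.2]        # hypothetical classifier scores for one image

label, probs = classify(scores, labels)
for name, p in zip(labels, probs):
    print(f"{name}: {p:.3f}")
print("Prediction:", label)
```

A different image would give different scores, and therefore a different set of probabilities and possibly a different prediction.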

 

3. This is the basic process of computer vision. To achieve higher accuracy, many researchers have been working intensively on better algorithms and processing methods. I believe the most advanced computer vision algorithms are about to surpass human sight. Have a look at the famous experiment in which a researcher tested his own sight against a convnet (1). His error rate was 5.1%.

I am now very interested in computer vision and focus on this field in my research. I hope to share my new findings in the near future.

 

(1) What I learned from competing against a ConvNet on ImageNet, Andrej Karpathy, a Research Scientist at OpenAI, Sep 2 2014

http://karpathy.github.io/2014/09/02/what-i-learned-from-competing-against-a-convnet-on-imagenet/

 

 

Notice: TOSHI STATS SDN. BHD. and I do not accept any responsibility or liability for loss or damage occasioned to any person or property through using materials, instructions, methods, algorithms or ideas contained herein, or acting or refraining from acting as a result of such use. TOSHI STATS SDN. BHD. and I expressly disclaim all implied warranties, including merchantability or fitness for any particular purpose. There will be no duty on TOSHI STATS SDN. BHD. and me to correct any errors or defects in the code and the software.

 
