This is incredible! Semantic segmentation with just 700 images, trained from scratch on a MacBook Air!


You may have seen pairs of images like the ones below before. The images are segmented by color, based on the objects in them. This task is called “semantic segmentation”. It is studied by many AI researchers now because it is critically important for self-driving cars and robotics.

[Figure: an urban scene and its color-labeled segmentation]

Unfortunately, however, it is not easy for startups like us to perform this task. Like other computer vision tasks, semantic segmentation needs massive numbers of images and computing resources, which is sometimes difficult in tight-budget projects. When we cannot collect many images, we are likely to give up.

 

This situation can be changed by a new algorithm called “Fully Convolutional DenseNets for Semantic Segmentation” (“Tiramisu” for short) (1). Technically, it is a network built out of many “DenseNet” blocks (2); the DenseNet paper was awarded the CVPR Best Paper Award in July 2017. This is the structure of the model as presented in the research paper (1).

[Figure: architecture of the Tiramisu (FC-DenseNet) model, from the research paper (1)]

I wanted to confirm how this model works with a small volume of images, so I obtained an urban-scene image set called the “CamVid Database” (3). It has 701 scene images and their colour-labeled images. I chose 468 images for training and 233 images for testing. This is very little data for computer vision tasks, which usually need 10,000-100,000 images to complete training from scratch. In my experiment, I do not use pre-trained models. I do not use a GPU for computation, either. My weapon is just a MacBook Air 13 (Core i5), just like many business people and students. But the new algorithm works extremely well. Here are examples of the results.

[Figures: input images, ground truth and predictions for two test scenes]

“Prediction” looks similar to “ground truth”, which is the right answer in my experiment. Overall accuracy is around 83% for classification over 33 classes (at the 45th epoch of training). This is incredible, as only a little data is available here. Although the prediction misses some parts, such as poles, I am confident we can gain more accuracy when more data and resources are available. Here is the training result; it took around 27 hours. (Technically, I use “FC-DenseNet56”. Please read the research paper (1) for details.)

[Figures: training and validation curves of FC-DenseNet56]

Added on 18 August 2017: if you are interested in the Keras code, please see this GitHub repository.
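To give a flavor of the architecture, here is a minimal sketch of a DenseNet-style dense block, the core building block of Tiramisu, written with Keras. This is only an illustration based on the paper (1), not the code in the repository above; the layer count and growth rate are my own assumptions.

```python
# A minimal sketch of a DenseNet-style dense block, the core building block
# of the Tiramisu (FC-DenseNet) architecture. Layer count and growth rate
# below are illustrative assumptions, not the FC-DenseNet56 settings.
from tensorflow.keras import Input, Model, layers

def dense_block(x, num_layers=4, growth_rate=12):
    # Each layer receives the concatenation of ALL previous feature maps,
    # which is the defining property of a DenseNet block.
    features = [x]
    for _ in range(num_layers):
        h = layers.Concatenate()(features) if len(features) > 1 else features[0]
        h = layers.BatchNormalization()(h)
        h = layers.Activation("relu")(h)
        h = layers.Conv2D(growth_rate, 3, padding="same")(h)
        features.append(h)
    return layers.Concatenate()(features)

# Toy usage: one dense block applied to a 224x224 RGB input.
inp = Input(shape=(224, 224, 3))
out = dense_block(inp)
Model(inp, out).summary()
```

The full Tiramisu stacks such blocks in a U-shaped encoder-decoder with skip connections, as shown in the figure above.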

 

This experiment was inspired by the awesome MOOC “fast.ai” by Jeremy Howard. I strongly recommend watching this course if you are interested in deep learning. No problem, as it is free. It has less math and is easy to understand for people who are not pursuing a Ph.D. in computer science.

I will continue to research this model and others in computer vision. I hope I can provide updates soon. Thanks for reading!

 

 

1. The One Hundred Layers Tiramisu: Fully Convolutional DenseNets for Semantic Segmentation (Simon Jegou, Michal Drozdzal, David Vazquez, Adriana Romero, Yoshua Bengio), 5 Dec 2016

 

2. Densely Connected Convolutional Networks (Gao Huang, Zhuang Liu, Kilian Q. Weinberger, Laurens van der Maaten), 3 Dec 2016

 

3. Segmentation and Recognition Using Structure from Motion Point Clouds (Brostow, Shotton, Fauqueur, Cipolla), ECCV 2008

 

 

Notice: TOSHI STATS SDN. BHD. and I do not accept any responsibility or liability for loss or damage occasioned to any person or property through using materials, instructions, methods, algorithm or ideas contained herein, or acting or refraining from acting as a result of such use. TOSHI STATS SDN. BHD. and I expressly disclaim all implied warranties, including merchantability or fitness for any particular purpose. There will be no duty on TOSHI STATS SDN. BHD. and me to correct any errors or defects in the codes and the software.

 


How can we track our mobile e-commerce? Google Analytics Academy is a good place to start learning!


Last week, I found that Alibaba, the biggest e-commerce company in China, announced its financial results for Q2 2016. One of the things that caught my attention is that 75% of sales come from mobile devices, rather than PCs.

This is amazing, and much bigger than I expected. Considering that many younger people use mobile devices as their main devices, this rate is expected to increase steadily going forward.

Then I wondered how we can easily track customer behavior on mobile e-commerce, because this is getting more important as more customers come to your e-commerce shop from mobile devices. What do you think?

 

I found that Google Analytics Academy, which teaches how to use Google Analytics, provides awesome online courses for free. Even if you are not a user of Google Analytics, it is very beneficial because it shares the ideas and concepts of mobile e-commerce. If you want to know which marketing generates the most valuable users, it is worth learning. Let me explain several takeaways.

 

1. “High-value user” vs “Low-value user”

When we have many users at our mobile e-commerce shop, we find that some users buy more products or subscriptions than other users. They are “high-value users”. On the other hand, some users rarely buy anything. They are “low-value users”. This idea is useful for preparing target lists for new campaigns, in order to prioritize among many customers. So our goal is to increase the number of “high-value users” effectively.

 

2. Segmentation of customers is critically important

Segmentation means preparing the correct subset of data to get insights from the data. It is popular and widely used across industries. When we analyze data, creating appropriate user segments is critically important. You may want to create the segments “buy users” and “non-buy users” and get insights into what factors influence people to buy. There are many segmentations you can imagine. You can create your own segments in Google Analytics!

 

3. How to measure customer behavior

It is also important to track the behavior of each customer. There are many kinds of data to be obtained, for example: what screens each customer visits and what actions they take; how many minutes they stay on each screen and how much they spend to buy products. The former data are “categorical” and the latter “numerical”. Note that these data should be relevant to identifying and increasing the number of “high-value users”, as that is our goal. When you identify good candidate data to use, you can add them to your own segments and analyze them more deeply to get insights. A small sketch of this kind of analysis follows below.
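To make this concrete, here is a small sketch in Python with pandas of how one might label users as high- or low-value and compare their behavior. The column names and the purchase threshold are my own illustrative assumptions, not actual Google Analytics fields.

```python
# A toy sketch of user segmentation with pandas. The column names and the
# threshold of 3 purchases are illustrative assumptions, not Google
# Analytics fields.
import pandas as pd

users = pd.DataFrame({
    "user_id":        [1, 2, 3, 4, 5],
    "purchases":      [0, 5, 1, 8, 0],              # numerical data
    "minutes_on_app": [3.5, 42.0, 10.2, 55.1, 1.2],
    "screens_viewed": [2, 30, 8, 41, 1],
})

# Segment: "high-value" users buy at least a chosen number of products.
users["segment"] = users["purchases"].apply(
    lambda n: "high-value" if n >= 3 else "low-value")

# Compare behavior (the numerical data) across the two segments.
print(users.groupby("segment")[["minutes_on_app", "screens_viewed"]].mean())
```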

 

In addition to the online courses, Google Analytics makes real data from its e-commerce shop, the “Google Merchandise Store”, available for free to everyone who wants to learn. It is called the “Google Analytics demo account”. This is also an amazing service, as real-world e-commerce data was rarely available to us before. I would like to go deeper and get insights from it in the near future. Of course, I will share the results here, as they are beneficial to everyone. Please see one of the awesome reports in the Google Analytics demo account.

[Figure: a report from the Google Analytics demo account]

 

Do you like it? I recommend starting to learn with Google Analytics Academy. Once you are familiar with mobile e-commerce data, it is much easier to learn more advanced data analytics, such as machine learning. The course is free, so you can access many awesome contents without paying any fee. Let us try and enjoy it!

 

 

 


 

 

What is the marketing strategy for the age of “everything digital”?


In July, I researched TensorFlow, a deep learning library from Google, and performed several classification tasks. Although it is open-source software and free for everyone, its performance is incredible, as I said in my last article.

When I performed an image classification task with TensorFlow, I found that computers can see our world better and better as deep learning algorithms improve dramatically. In particular, they are getting better at extracting “features”, which are what we need to classify images.

Images are just sequences of numbers to computers, so some features are difficult for us to interpret. Computers, however, can use them. It means that computers might see what we cannot see in images. This is amazing!

[Figures: an image and the array of pixel numbers that represents it]

This is an example of how images are represented as sequences of numbers. You can see many numbers above (these are just a small part of all the numbers). These numbers can be converted into the image above, which we can see. But computers cannot see the image directly; they can only see it through the numbers. We, on the other hand, cannot understand the sequence of numbers at all, as it is too complicated. It is interesting.
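You can see this for yourself with a few lines of Python. This is just a sketch; “photo.jpg” is a placeholder file name.

```python
# See an image the way a computer does: as an array of numbers.
# "photo.jpg" is a placeholder; any image file will do.
import numpy as np
from PIL import Image  # pip install pillow

img = np.asarray(Image.open("photo.jpg").convert("L"))  # grayscale

print(img.shape)    # e.g. (480, 640): rows x columns of pixels
print(img[:5, :5])  # the top-left corner, plain integers from 0 to 255
# To the computer, the whole picture is nothing but this grid of numbers.
```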

In marketing, when images of products are provided, computers might see what is needed to improve the products so that they sell more, because computers understand these products in a different way than we do. This might give us a new way to think about marketing strategy. Let us take T-shirts as an example. We usually consider things like color, shape, texture, drawings and price. Yes, they are examples of “features” of T-shirts, because T-shirts can be represented by them. But computers might extract more from the images of T-shirts than we do. Computers might create their own features of T-shirts.

 

Then, I would like to point out three things to consider for a new marketing strategy.

1. Computers might extract more information than we do from the same images

As I explained, computers can see images in a different way than we do. We can say the same thing about other data, such as text or voice mail, as they are also just sequences of numbers to computers. Therefore, computers might understand our customers’ behavior better than we do, based on customer-related data, once deep learning algorithms are improved further. We might not always understand how computers reach their conclusions, because they understand text and speech as sequences of numbers and produce many features that are difficult for us to explain.

 

2. Computers might see many kinds of data, as massive amounts of data are generated by customers

Not only images but also other data, such as text or voice mail, are available to computers, as they too are just sequences of numbers. Now everything from images to voice messages is going digital. I would like to make computers understand all of them with deep learning. We cannot say in advance what features will be used when computers see images or text, but I believe some useful and beneficial things will be found.

 

3. Computers can work on a real-time basis

As you know, computers can work 24 hours a day, 365 days a year. Therefore they can operate on a real-time basis. When new data comes in, an answer can be obtained in real time. This answer can trigger the next actions by customers. These actions are also recorded digitally and fed into computers again. Therefore, much digital data will be generated when computers operate without stopping, and the interactions with customers might trigger chain reactions. I would like to call it “digital on digital”.

 

Images, social media, e-mails from customers, voice mail, sentences in promotions, and sensor data from customers are all “digital”. So there are many things that computers can see. Computers may find many features to understand customer behaviors and preferences in real time. We need system infrastructures that enable computers to see these data and tell us the insights in them. Do you agree?

 

 

 


 

This is our new platform, provided by Google. It is amazing, as it is so accurate!


In our deep learning project for digital marketing, we need superior tools to perform data analysis and deep learning. I have watched “TensorFlow”, open-source software provided by Google, since it was published in November 2015. According to one of the latest surveys by KDnuggets, “TensorFlow” is the top-ranked tool for deep learning (H2O, which our company uses as its main AI engine, is also getting popular) (1).

I tried to perform an image recognition task with TensorFlow to see how it works. These are the results of my experiment. MNIST, a set of handwritten digits from 0 to 9, is used for the experiment. I chose a convolutional network to perform it. How well can TensorFlow classify the digits?

[Figure: sample MNIST handwritten digits]

I set up the TensorFlow program in Jupyter like this. It comes from the TensorFlow tutorials. (An equivalent sketch in today’s API follows after the figure below.)

[Figure: the TensorFlow program in a Jupyter notebook]
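For readers who cannot see the screenshot, here is a compact sketch of an equivalent convolutional network written with today’s tf.keras API. It mirrors the spirit of the TensorFlow tutorial I used; it is not the exact code in the notebook above, and the layer sizes are assumptions.

```python
# A sketch of a convolutional network for MNIST with tf.keras, in the
# spirit of the TensorFlow tutorial used in this experiment. Layer sizes
# are assumptions, not the exact settings of the notebook above.
import tensorflow as tf

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train = x_train[..., None] / 255.0  # shape (60000, 28, 28, 1), scaled to [0, 1]
x_test = x_test[..., None] / 255.0

model = tf.keras.Sequential([
    tf.keras.Input(shape=(28, 28, 1)),
    tf.keras.layers.Conv2D(32, 5, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(64, 5, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(1024, activation="relu"),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, epochs=5, validation_data=(x_test, y_test))
```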

 

This is the result. It was obtained after 80 minutes of training. My machine is a MacBook Air 11 (1.4 GHz Intel Core i5, 4 GB memory).

[Figure: training log of the experiment]

Can you see the accuracy rate? It is 0.9929, so the error rate is just 0.71%! It is amazing!

[Figure: final test accuracy]

Based on my experiment, TensorFlow is an awesome tool for deep learning. I found that many other algorithms, such as LSTM and reinforcement learning, are also available in TensorFlow. The more algorithms we have, the more flexible our solution strategies for digital marketing can be.

 

We now have this awesome tool for deep learning, and from now on we can analyze a lot of data with TensorFlow. I will share good insights from the data in this project to promote digital marketing. As I said before, “TensorFlow” is open-source software, free to use in our businesses. No fee is required. This is a big advantage for us!

I cannot say TensorFlow is a tool for beginners, as it is a programming library for deep learning (H2O can be operated without programming, through a GUI). But if you are familiar with Python or similar languages, it is for you! You can download and use it without paying any fees, so you can try it by yourself. This is my strong recommendation!

 

TensorFlow: Large-scale machine learning on heterogeneous systems

1: R, Python Duel As Top Analytics, Data Science Software – KDnuggets 2016 Software Poll Results

http://www.kdnuggets.com/2016/06/r-python-top-analytics-data-mining-data-science-software.html

 

 


 

“DEEP LEARNING PROJECT for Digital marketing” starts today. I present the probability of visiting the store here


At the beginning of this year, I set up a new project at my company. The project is called the “Deep Learning project” because “deep learning” is used as its core calculation engine. Now that I have set up a predictive system to predict customer response to a direct-mailing campaign, I would like to start a sub-project called “DEEP LEARNING PROJECT for Digital marketing”. I think the results of the project can be applied across industries, such as healthcare, finance, retail, travel and hotels, food and beverage, entertainment and so on. First, I would like to explain how we obtain the probability that each customer visits the store in our project.

 

1. What is the progress of the project so far?

There has been progress on several fronts in the project:

  • Developing the model to obtain the probability of visiting the store
  • Developing the scoring process to assign the probability to each customer
  • Implementing the predictive system using Excel as an interface

Let me explain our predictive system. We built it on the Microsoft Azure Machine Learning Studio platform. The beauty of the platform is that Excel, which is used by everyone, can serve as the interface for inputting and outputting data. This is our interface to the predictive system with online Excel. Logistic regression in MS Azure Machine Learning is used as our predictive model.

The second row (highlighted) is the window to input customer data.

[Figure: Excel input window of the predictive system]

Once customer data are input, the probability that the customer will visit the store is output (see the red characters and number below). In this case (sample data No. 1), the customer is unlikely to visit the store, as the scored probability is very low (0.06).

[Figure: scored probability for sample data No. 1]

 

On the other hand, in another case (sample data No. 5), the customer is likely to visit the store, as the scored probability is relatively high (0.28). If you want to know how it works, please see the video. A small stand-alone sketch of the same idea follows after the figures below.

[Figures: input window and scored probability for sample data No. 5]
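If you prefer to experiment outside Azure, here is a minimal stand-in for the same idea using scikit-learn. The feature names and values are illustrative assumptions; the actual model in our system runs inside Azure Machine Learning Studio.

```python
# A minimal stand-in for the Azure ML logistic regression model, using
# scikit-learn. Features and values are illustrative assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical training data: [age, past_visits, mails_opened] per customer.
X = np.array([[25, 0, 1], [44, 3, 5], [31, 1, 0], [52, 6, 8], [38, 2, 2]])
y = np.array([0, 1, 0, 1, 0])  # 1 = the customer visited the store

model = LogisticRegression().fit(X, y)

# Scoring a new customer: the second column of predict_proba is the
# "Scored Probability" of visiting the store.
new_customer = np.array([[29, 1, 2]])
print(model.predict_proba(new_customer)[0, 1])
```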

 

2. What is the next in our project?

Now that we have created the model and implemented the predictive system, we are going to the next stage to tackle more advanced topics:

  • More marketing cases with a variety of data
  • More accuracy by using many models, including Deep Learning
  • How to implement data-driven management

 

Our predictive system should be more flexible and accurate. In order to achieve that, we will perform many experiments going forward.

 

3. What data is used in the project?

There are several kinds of data that can be used for digital marketing. I would like to use this dataset for our project.

When we are satisfied with the results of our predictions on this data, the next dataset can be used for our project.

 

 

Digital marketing is getting more important to many industries, from retail to finance. I will update this article about the project on a monthly basis. Why don’t you join us and enjoy it! If you have comments or opinions, please do not hesitate to send them to us!

If you want to receive updates on the project or want to know more about the predictive system, could you sign up here?

 

 

 

Microsoft, Excel and AZURE are either registered trademarks or trademarks of Microsoft Corporation in the United States and/or other countries.


Will the age of “Brain as a Service” come to us in the near future?


On 15 March 2016, I found two things which may change the world in the future: the former is the artificial-intelligence Go player “AlphaGo”, and the latter is the automated marketing system “Google Analytics 360 Suite”. Both of them came from Google. Let me explain why I think the age of “Brain as a Service” is coming, based on these two innovations.

1. AlphaGo

You may know what AlphaGo achieved on 15 March 2016. At the Google DeepMind Challenge, an artificial-intelligence Go player played five games against a top professional. It beat Lee Sedol, one of the strongest Go players in the world, 4 to 1. Go is one of the oldest games, played mainly in China, Korea, Japan and Taiwan. At the beginning of the challenge, few people thought AlphaGo could win the games, as it was always said that Go is so complex that computers would not beat professional Go players for at least another 10 years. The result, however, was completely the opposite. Professional Go players, artificial-intelligence researchers and even people who do not play Go were shaken by the news.

AlphaGo is powered by algorithms called “deep learning” and “reinforcement learning”. It can learn from the massive number of Go patterns created by human beings over a long time. Therefore, we do not need to program it specifically, rule by rule, because the computer can learn by itself. It looks like our brains: we are born without any knowledge and start learning many things as we grow, until finally we are sophisticated enough to be “adults”. Yes, we can see AlphaGo as a brain. It can learn by itself at an astonishing speed, as it does not need to rest. It is highly likely that Google will use this brain to improve many of its products in the future.

 

2. Google Analytics 360 Suite

Data is king. But it is very difficult to feed data into computers effectively. Some data are stored on servers; others are stored on local PCs. No one knows how to organize data effectively enough to obtain insights from them. Google is strong in the consumer market: Gmail, Android and Google Search were initially very popular among consumers. But the situation is gradually changing. Data and algorithms have no borders between consumers and enterprises, so it is natural for Google to try to capture more and more of the enterprise market. One example is the “Google Analytics 360 Suite”. Although I have not tried it yet, it is very interesting to me because it can work as a perfect interface to customers. Customers may request many things, ask questions and make complaints about your services. It is very difficult to gather these data effectively when systems are not united seamlessly. But with the “Google Analytics 360 Suite”, customer data can be tracked in a timely and effective manner. For example, data from Google Analytics 360 may flow into Google Audience Center 360, which is a data management platform (DMP). That means the data are available for any analysis marketers want. “Google Audience Center 360” can also collect data from other sources and third-party data providers, which means many kinds of data can be made ready to feed into computers effectively.

 

3. Data is gasoline for “Artificial intelligence”

AlphaGo can be considered “artificial intelligence”. “Artificial intelligence” is like our brain: there is no knowledge in it initially; it has only the structures with which to learn. In order to be “intelligent”, it must learn a lot from data, which means massive amounts of data should be fed into computers. Without data, “artificial intelligence” can do nothing. Now data management tools like “Google Audience Center 360” are making progress, and data are getting organized well enough to be fed into computers. A centralized data management system can collect data automatically from many systems, making it easier to feed massive amounts of data into computers and let them learn from it. These things could be the trigger that changes the landscape of our businesses, societies and lives, because computers may suddenly become sophisticated enough to work just like our brains. AlphaGo teaches us that this may happen while few people expect it. Yes, this is why I think the age of “Brain as a Service” will come in the near future. What do you think?

 

 

Note: Toshifumi Kuga’s opinions and analyses are personal views, are intended for informational purposes and general interest only, and should not be construed as individual investment advice or a solicitation to buy, sell or hold any security or to adopt any investment strategy. The information in this article is rendered as at the publication date, may change without notice, and is not intended as a complete analysis of every material fact regarding any country, region, market or investment.

Data from third-party sources may have been used in the preparation of this material, and I, the author of the article, have not independently verified or validated such data. I and TOSHI STATS SDN. BHD. accept no liability whatsoever for any loss arising from the use of this information, and reliance upon the comments, opinions and analyses in the material is at the sole discretion of the user.

 

This is the No. 1 open online course on “Deep Learning”. It is a New Year’s present from Google!


I am very happy to have found this awesome course on “Deep Learning”. It is provided by Google through Udacity (1), one of the biggest MOOC platforms in the world. So I would like to share it with anyone who is interested in “Deep Learning”.

It is the first course that explains deep learning, from logistic regression to recurrent neural networks (RNNs), in a unified manner on a MOOC platform, as far as I know. I looked at it and was very surprised at how awesome the quality of the course is. Let me explain in more detail.

 

1. We can learn everything from Logistic regression to RNN seamlessly

This course covers many important topics, such as logistic regression, neural networks, regularization, dropout, convolutional nets, RNNs and long short-term memory (LSTM). These topics have appeared in separate articles before, but it is very rare to see each of them explained in one place. The course reads like a story of the development of deep learning, so even beginners can follow it. Please look at the path of the course. It is taken from the course video “L1: Machine Learning to Deep Learning”.

[Figure: learning path of the course, from machine learning to deep learning]

In particular, the explanations of RNNs are very easy to understand. If you do not have enough time to take the whole course, I recommend watching just the videos on RNNs and related topics. I am sure it is worth doing.

 

2. A little math is required, but it is not an obstacle to taking this course

This is a computer science course. The more you understand math, the more insight you can get from the course. However, if you are not so familiar with mathematics, all you have to do is review basic knowledge of vectors, matrices and derivatives. I do not think you need to give up the course for lack of math. Just recall high school math, and you can start this awesome course!

 

3. “Deep learning” can be implemented with “TensorFlow”, open-source software provided by Google

This is the most exciting part of the course if you are a developer or programmer. TensorFlow is a Python-based library, so many developers and programmers can become familiar with it easily. In the programming assignments, participants build everything from a simple neural net to a sequence-to-sequence net with TensorFlow. It must be good! While I have not tried TensorFlow programming yet, I would like to in the near future. It is worth doing even if you are not a programmer. Let us challenge it!

 

 

In my view, deep learning for sequence data is getting more important, as time series data are frequently used in economic analysis, customer management and the Internet of Things. Therefore, not only data scientists but also business personnel and company executives can benefit from this course. It is free and self-paced when you watch the videos; if you need a credential, a small fee is required. Why don’t you try this awesome course?

 

 

(1) Deep Learning on Udacity

https://www.udacity.com//course/viewer#!/c-ud730/l-6370362152/m-6379811815

 

 

 


 

 

“DEEP LEARNING PROJECT” starts now. I believe it works in digital marketing and economic analysis


As the new year starts, I would like to set up a new project at my company. It is beneficial not only for my company but also for readers of this article, because the project will provide good examples of predictive analytics and of implementing new tools and platforms. The new project is called the “Deep Learning project” because “deep learning” is used as its core calculation engine. Through the project, I would like to create a “predictive analytics environment”. Let me explain the details.

 

1. What is the goal of the project?

There are three goals of the project.

  • Obtain knowledge and expertise in predictive analytics
  • Obtain solutions for data-driven management
  • Obtain basic knowledge of Deep Learning

As big data become more and more available, we need to know how to consume them to get insights so that we can make better business decisions. Predictive analytics is key to data-driven management, as it can answer “What comes next?” based on data. I hope you can build expertise in predictive analytics by reading my articles about the project. I believe it is important for us, as we are in the digital economy now and will be in the future.

 

2. Why is “Deep Learning” used in the project?

Since November last year, I have tried “deep learning” many times for predictive analytics, and I have found it to be very accurate. It is sometimes said that it requires too much time to solve problems, but in my case I can solve many problems within three hours, so I consider that “deep learning” can solve problems within a reasonable time. In the project, I would like to develop the skill of tuning parameters effectively, as “deep learning” requires setting several parameters, such as the number of hidden layers. I would like to focus on how the number of layers, number of neurons, activation functions, regularization and drop-out can be set according to the dataset. I think they are the key to developing predictive models with good accuracy. I have challenged MNIST handwritten-digit classification, and my error rate has improved to 1.9%. This was done with H2O, an awesome analytics tool, and a MacBook Air 11, which is just a normal laptop PC. I would like to set up a cluster on AWS in order to improve the error rate further; “Spark”, which is open source, is one of the candidates for the cluster. (A small sketch of these tuning knobs follows after the figure below.)

[Figure: overview of the Deep Learning project]
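As a reference, here is a minimal sketch of where each of these tuning knobs lives in H2O’s Python API. The file path, column name and parameter values are my own assumptions, not the settings of the MNIST experiment.

```python
# A sketch of the tuning knobs discussed above, in H2O's Python API.
# File path, column name and parameter values are illustrative assumptions.
import h2o
from h2o.estimators.deeplearning import H2ODeepLearningEstimator

h2o.init()
train = h2o.import_file("train.csv")        # placeholder path
train["label"] = train["label"].asfactor()  # classification target

model = H2ODeepLearningEstimator(
    hidden=[200, 200],                  # number of hidden layers and neurons
    activation="RectifierWithDropout",  # activation function
    hidden_dropout_ratios=[0.5, 0.5],   # drop-out
    l1=1e-5,                            # regularization
    epochs=20,
)
model.train(x=[c for c in train.columns if c != "label"],
            y="label", training_frame=train)
print(model.logloss())
```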

3. What businesses can benefit from introducing “Deep Learning”?

“Deep learning” is very flexible, so it can be applied to many problems across industries. Healthcare, finance, retail, travel, and food and beverage might benefit from introducing “deep learning”. Governments could benefit, too. In the project, I would like to focus on the following areas:

  • Digital marketing
  • Economic analysis

First, I would like to create a database to store the data to be analyzed. Once it is created, I will perform predictive analytics on “digital marketing” and “economic analysis”. Best practices will be shared with you here, to reach our goal of “obtaining knowledge and expertise in predictive analytics”. Deep learning is relatively new to both of these problem areas, so I expect new insights to be obtained. For digital marketing, I would like to focus on social media and measuring the effectiveness of digital marketing strategies. “Natural language processing” has been developing recently at an astonishing speed, so I believe there could be a good way to analyze text data. If you have any suggestions on predictive analytics in digital marketing, could you let me know? They are always welcome!

 

I use open-source software to create the predictive analytics environment, so it is very easy for you to create a similar environment on your own system or cloud. I believe open source is key to developing superior predictive models, as everyone can participate in the project. You do not need to pay any fees to introduce the tools used in the project, as they are open source. Ownership of the problems should be ours, rather than software vendors’. Why don’t you join us and enjoy it! If you want to receive updates on the project, could you sign up here?

 

 


How will “Deep Learning” change our daily lives in 2016?


“Deep learning” is one of the major technologies of artificial intelligence. In April 2013, two and a half years ago, MIT Technology Review selected “deep learning” as one of its 10 Breakthrough Technologies 2013. Since then it has developed so rapidly that it is not a dream anymore. This is the final article of 2015, so I would like to look back at the progress of “deep learning” this year and consider how it will change our daily lives in 2016.

 

How has “Deep Learning” progressed in 2015?

1. “Deep Learning” moved from laboratories to software developers in the real world

In 2014, the major breakthroughs in deep learning occurred in the big laboratories of large IT companies and universities, because the work required complex programming and huge computational resources. To do it effectively, massive computational assets and many machine learning researchers were required. But in 2015, many deep learning programs and software packages jumped out of the laboratory into the real world. Torch, Chainer, H2O and TensorFlow are examples. Anyone can develop apps with this software, as it is open source, and it is also expected to be used in production. For example, H2O can export models to POJO (Plain Old Java Code) automatically, and this code can be implemented in a production system. Therefore, there are fewer barriers between development and production, which will accelerate the development of practical apps.

 

2. “Deep Learning” started understanding language gradually

Most people use more than one social network, such as Facebook, LinkedIn, Twitter and Instagram, and there is a lot of text data in them. It would be a treasury if we could understand what it all says immediately, but in reality there is too much for people to read one item at a time. Then the question comes: can computers read text data instead of us? Many top researchers are working on this area, which is called “natural language processing”. For short sentences, computers can now understand the meaning. An app based on this already appeared in late 2015: “Smart Reply” by Google, which can generate candidate replies based on the text of a received mail. Behind this app, “LSTM (long short-term memory)”, one of the deep learning algorithms, is used. In 2016, computers might understand longer sentences and paragraphs and answer questions based on their understanding. That means computers step closer to us in our daily lives. (A toy sketch of an LSTM model follows below.)
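To illustrate what an LSTM model looks like in code, here is a toy sketch of an LSTM text classifier in Keras. This is in no way Google’s Smart Reply system; the vocabulary size, dimensions and random data are arbitrary assumptions.

```python
# A toy LSTM text classifier in Keras, only to illustrate the kind of
# model behind such apps. Sizes and data are arbitrary assumptions.
import numpy as np
import tensorflow as tf

vocab_size, seq_len = 1000, 20
model = tf.keras.Sequential([
    tf.keras.Input(shape=(seq_len,)),
    tf.keras.layers.Embedding(vocab_size, 32),       # word ids -> vectors
    tf.keras.layers.LSTM(64),                        # reads the sentence word by word
    tf.keras.layers.Dense(1, activation="sigmoid"),  # e.g. "needs a reply?" yes/no
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Dummy data: 100 "sentences" of 20 word ids each, with random labels.
x = np.random.randint(0, vocab_size, size=(100, seq_len))
y = np.random.randint(0, 2, size=(100,))
model.fit(x, y, epochs=2)
```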

 

3. Cloud services support “Deep Learning” effectively.

Once big data are obtained, infrastructure, such as computational resources, storage and network, is needed. If we want to try deep learning, it is better to have fast computational resources, such as Spark. Amazon Web Services, Microsoft Azure, Google Cloud Platform and IBM Bluemix provide many services to implement deep learning at scale, so it is getting much easier to start implementing “deep learning” in a system. Most cloud services are “pay as you go”, so there is no initial up-front cost to start. This is good especially for small companies and startups, as they usually have only limited budgets for infrastructure.

 

 

How will “Deep Learning” change our daily lives in 2016? 

Based on the development of “deep learning” in 2015, many consumer apps with “deep learning” might appear in the market in 2016. The difference between consumer apps with and without “deep learning” is that apps can behave differently for different users and conditions. For example, you and your colleagues might see completely different home screens even though you use the same app, because “deep learning” enables the app to optimize itself to maximize customer satisfaction. In retail apps, top pages can differ by customer according to customer preferences. In education apps, learners can see different contents and questions as they progress through the courses. In navigation apps, a route might appear automatically based on your specific schedule, such as the route to the airport on the day of a business trip. These are just examples; it can be applied across industries.

In addition, an app can become more sophisticated and accurate the longer you use it, because it learns your behavior rapidly. It can always be updated to maximize customer satisfaction. That means we will not need to choose what we want one item at a time, because computers will do that instead of us. Buttons and navigation will be less needed in such apps; all you will have to do is input your latest schedule into your computer, and everything will be optimized based on the updated information. Are people getting lazy? Maybe yes, if apps become as sophisticated as expected. It must be good for all of us. We may be free to do what we want!

 

 

Actually, I quit an investment bank in Tokyo to set up my startup at the same time MIT Technology Review released the 10 Breakthrough Technologies 2013. Initially I knew the phrase “deep learning”, but I could not understand how important it is to us, because it was completely new to me. However, I am so confident now that I always say “deep learning” is changing the landscape of jobs, industries and societies. Do you agree? I imagine everyone will agree by the end of 2016!

 

 

 


These are small Christmas presents for you. Thanks for your support this year!


I started the group “Big data and digital economy” on LinkedIn on 15 April this year. Now there are over 300 participants! This is beyond my initial expectation, so I would like to thank all of you for your support.

I have prepared several small Christmas presents here. If you are interested, please let me know. I will do my best!

 

1. Your theme for my weekly letter

As you know, I write the weekly letter “Big data and digital economy” and publish it on LinkedIn every week. If you are interested in specific themes, I will research and write about them as far as I can. Anything is OK if it is about the digital economy. Please let me know!

 

2.  Applications of data analysis in 2016

In 2016, I would like to develop applications using data analysis and make them public through the internet. As long as data are “public”, we can do any analysis on them. Therefore, if you would like to see your own analysis based on public data, could you let me know what you are interested in? Below are examples of applications built with “Shiny”, a very famous tool among data scientists.

http://shiny.rstudio.com/gallery/

 

3. Announcement of the R-programming platform project

This is a project of my company for 2016. To support business personnel in learning R programming, I would like to set up a platform where participants can learn R programming interactively with ease. Contents are very important for keeping participants motivated to learn. If there are specific themes you want to learn, could you let me know? These themes may be included as programs on the platform going forward! Here is an introductory video of the platform.

http://www.toshistats.net/r-programming-platform/

 

Thanks for your support in 2015 and let us enjoy predictive analytics in 2016!