“DEEP LEARNING PROJECT” starts now. I believe it will work in digital marketing and economic analysis


As the new year starts, I would like to set up a new project at my company. This will benefit not only my company but also the readers of this article, because the project will provide good examples of predictive analytics and of implementing new tools and platforms. The project is called the “Deep Learning Project” because Deep Learning is used as its core calculation engine. Through the project, I would like to create a “predictive analytics environment”. Let me explain the details.

 

1. What is the goal of the project?

There are three goals of the project.

  • Obtain knowledge and expertise in predictive analytics
  • Obtain solutions for data-driven management
  • Obtain basic knowledge of Deep Learning

As more and more big data become available, we need to know how to consume them and extract insight from them so that we can make better business decisions. Predictive analytics is key to data-driven management because it answers the question “What comes next?” based on data. I hope you can build expertise in predictive analytics by reading my articles about the project. I believe this is important for all of us, as we live in the digital economy now and will continue to do so.

 

2. Why is “Deep Learning” used in the project?

Since November last year, I have tried “Deep Learning” many times for predictive analytics and found that it is very accurate. It is sometimes said that it requires too much time to solve problems, but in my case I can solve many problems within three hours, so I consider that Deep Learning solves them within a reasonable time. In this project I would like to develop the skill of tuning parameters effectively, as Deep Learning requires setting several parameters, such as the number of hidden layers. I would like to focus on how the number of layers, the number of neurons, the activation functions, regularization and drop-out can be set according to the dataset. I think these are the keys to developing predictive models with good accuracy. I have tried the MNIST hand-written digit classification task, and my error rate has improved to 1.9%. This was done with H2O, an awesome analytic tool, on a MacBook Air 11, which is just a normal laptop. I would like to set up my own cluster on AWS in order to improve the error rate further; “Spark”, which is open source, is one of the candidates for the cluster.
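To make these tuning targets concrete, here is a minimal sketch of how they map onto H2O's deep learning interface from R. The file name, the column indices and the particular hyperparameter values are assumptions for illustration, not the exact configuration behind the 1.9% result.

library(h2o)
h2o.init(nthreads = -1)                       # start a local H2O instance using all cores

# Hypothetical MNIST-style file: 784 pixel columns plus a label in column 785
train <- h2o.importFile("mnist_train.csv")
train[, 785] <- as.factor(train[, 785])       # a factor target makes this a classification task

model <- h2o.deeplearning(
  x = 1:784, y = 785,
  training_frame = train,
  hidden = c(1024, 1024, 2048),               # number of hidden layers and neurons per layer
  activation = "RectifierWithDropout",        # activation function
  hidden_dropout_ratios = c(0.5, 0.5, 0.5),   # drop-out rate for each hidden layer
  l1 = 1e-5,                                  # L1 regularization
  epochs = 10                                 # assumed; more epochs mean longer training
)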


3. What businesses can benefit from introducing “Deep Learning”?

“Deep Learning” is very flexible, so it can be applied to many problems across industries. Healthcare, finance, retail, travel, and food and beverage could all benefit from introducing Deep Learning. Governments could benefit, too. In the project, I would like to focus on the following areas:

  • Digital marketing
  • Economic analysis

First, I would like to create a database to store the data to be analyzed. Once it is created, I will perform predictive analytics on digital marketing and economic analysis. Best practices will be shared with you here, so that we can reach our goal of obtaining knowledge and expertise in predictive analytics. Deep Learning is relatively new to both of these problem areas, so I expect new insights to be obtained. For digital marketing, I would like to focus on social media and on measuring the effectiveness of digital marketing strategies. Natural language processing has been developing at an astonishing speed recently, so I believe there could be a good way to analyze text data. If you have any suggestions on predictive analytics in digital marketing, please let me know; suggestions are always welcome!

 

I use open source software to create the predictive analytics environment. Therefore, it is very easy for you to create a similar environment on your own system or cloud. I believe open source is key to developing superior predictive models, as everyone can participate in the project. You do not need to pay any fees to introduce the tools used in the project because they are open source. Ownership of the problems should be ours, rather than the software vendors'. Why don't you join us and enjoy it? If you want to receive updates on the project, please sign up here.

 

 

Notice: TOSHI STATS SDN. BHD. and I do not accept any responsibility or liability for loss or damage occasioned to any person or property through using materials, instructions, methods, algorithms or ideas contained herein, or acting or refraining from acting as a result of such use. TOSHI STATS SDN. BHD. and I expressly disclaim all implied warranties, including merchantability or fitness for any particular purpose. There will be no duty on TOSHI STATS SDN. BHD. and me to correct any errors or defects in the codes and the software.

How will “Deep Learning” change our daily lives in 2016?


“Deep Learning” is one of the major technologies of artificial intelligence. In April 2013, two and a half years ago, MIT Technology Review selected Deep Learning as one of its 10 Breakthrough Technologies 2013. Since then it has developed so rapidly that it is no longer a dream. This is the final article of 2015, so I would like to look back at the progress of Deep Learning this year and consider how it will change our daily lives in 2016.

 

How has “Deep Learning” progressed in 2015?

1. “Deep Learning” moves from laboratories to software developers in the real world

In 2014, the major breakthroughs in deep learning occurred in the laboratories of big IT companies and universities, because it required complex programming and huge computational resources. To do it effectively, massive computational assets and many machine learning researchers were needed. But in 2015, many deep learning programs and software packages jumped out of the laboratory into the real world. Torch, Chainer, H2O and TensorFlow are examples. Anyone can develop apps with these packages as they are open source, and they are also designed for use in production. For example, H2O can export trained models automatically as POJOs (Plain Old Java Objects), and this code can be implemented in a production system. Therefore, there are fewer barriers between development and production anymore, which will accelerate the development of apps in practice.
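As a small illustration of that path from development to production, the sketch below trains a throwaway model on the built-in iris data and exports it as a POJO from R; the data set and layer sizes are stand-ins, and any trained H2O model could be exported the same way.

library(h2o)
h2o.init()

# A tiny model on the built-in iris data, just so the export step has something to work with
iris_hex <- as.h2o(iris)
model <- h2o.deeplearning(x = 1:4, y = 5, training_frame = iris_hex,
                          hidden = c(10, 10), epochs = 5)

# Export the trained model as a POJO: plain Java source that can score new rows
# inside a production JVM without a running H2O cluster.
h2o.download_pojo(model, path = tempdir())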

 

2. “Deep Learning” starts understanding languages gradually

Most people use more than one social network, such as Facebook, LinkedIn, Twitter and Instagram, and these networks contain a lot of text data. That data would be a treasure if we could understand what it says immediately, but in reality there is far too much for people to read one item at a time. Then the question comes: can computers read text data for us? Many top researchers are working on this area, which is often called “Natural Language Processing”. For short sentences, computers can now understand the meaning, and an app based on this already appeared in late 2015: “Smart Reply” by Google. It can generate candidate replies based on the text of an incoming mail. Behind this app, “LSTM (Long Short-Term Memory)”, one of the deep learning algorithms, is used. In 2016, computers might understand longer sentences and paragraphs and answer questions based on their understanding. It means that computers can step closer to us in our daily lives.

 

3. Cloud services support “Deep Learning” effectively.

Once big data are obtained, infrastructure such as computational resources, storage and networking is needed. If we want to try deep learning, it is better to have fast computational resources, such as a Spark cluster. Amazon Web Services, Microsoft Azure, Google Cloud Platform and IBM Bluemix provide many services for implementing deep learning at scale, so it is getting much easier to start implementing Deep Learning in a system. Most cloud services are “pay as you go”, so there is no up-front cost to start using them. This is good, especially for small companies and startups, as they usually have only limited budgets for infrastructure.

 

 

How will “Deep Learning” change our daily lives in 2016? 

Based on the development of Deep Learning in 2015, many consumer apps using it might appear in the market in 2016. The difference between consumer apps with and without Deep Learning is that such apps can behave differently depending on the user and the conditions. For example, you and your colleagues might see completely different home screens even though you use the same app, because Deep Learning enables the app to optimize itself to maximize customer satisfaction. In retail apps, the top page can differ by customer according to their preferences. In education apps, learners can see different content and questions as they progress through the courses. In navigation apps, a route might appear automatically based on your schedule, such as the route to the airport on the day of a business trip. These are just examples; the approach can be applied across industries. In addition, an app can become more sophisticated and accurate the longer you use it, because it learns your behavior rapidly and is continually updated to maximize customer satisfaction. It means that we do not need to choose what we want one by one, because computers do that for us; buttons and navigation menus are needed less in such apps. All you have to do is keep your schedule up to date, and everything can be optimized based on the latest information. Are people getting lazy? Maybe yes, if apps become as sophisticated as expected. But it must be good for all of us; we will be free to do what we want!

 

 

Actually, I quit an investment bank in Tokyo to set up my start-up at the same time that MIT Technology Review released the 10 Breakthrough Technologies 2013. Initially I knew the words “Deep Learning” but could not understand how important it was to us, because it was completely new to me. Now, however, I am confident enough to say that Deep Learning is changing the landscape of jobs, industries and societies. Do you agree? I imagine everyone will agree by the end of 2016!

 

 

 

Notice: TOSHI STATS SDN. BHD. and I do not accept any responsibility or liability for loss or damage occasioned to any person or property through using materials, instructions, methods, algorithms or ideas contained herein, or acting or refraining from acting as a result of such use. TOSHI STATS SDN. BHD. and I expressly disclaim all implied warranties, including merchantability or fitness for any particular purpose. There will be no duty on TOSHI STATS SDN. BHD. and me to correct any errors or defects in the codes and the software.

“Community” accelerates the progress of machine learning all over the world!


When you start learning programming, it is recommended to visit the community sites of the languages. “R” and “python” have big communities, and these communities have been contributing to the progress of each language, which is good for all users. H2O.ai also held its annual community conference, “H2O WORLD 2015”, this month, and the videos and presentation slides are now available on the internet. I could not attend the conference as it was held in Silicon Valley in the US, but I can follow and enjoy it just by going through the websites. I recommend you have a quick look to understand how knowledge and experience can be shared at the conference. It is good for anyone who is interested in data analysis.

 

1. User communities can accelerate the progress of open source languages

When I started learning “MATLAB®” in 2001, there were few user communities in Japan as far as I knew, so I had to attend paid seminars to learn the language, and they were not cheap. Now most user communities are open without any fee, and they have been getting bigger and bigger. One of the main reasons is that the number of “open source languages” is increasing; “R” and “python” are both open source. When someone wants to try such a language, all they have to do is download it and use it, so the number of users can grow at an astonishing pace. On the other hand, if someone wants to try “proprietary software” such as MATLAB, they must buy a license before using it. I loved MATLAB for many years and recommended it to my friends, but unfortunately none of them use it privately, because it is difficult to pay the license fee out of pocket. I imagine most users of proprietary software are in organizations such as companies and universities, where the organization pays the license fees, so individuals have little freedom to choose the languages they want to use. Generally it is difficult to switch from one language to another when proprietary software is used; this is called “vendor lock-in”. Open source languages can avoid that, which is one of the reasons why I love them now. The more people who can use a language, the more progress can be achieved. New technologies such as “machine learning” can be developed through user communities, because more users will join going forward.

 

2. Real industry experience can be shared in communities

This is the most exciting part of a community. As many data scientists and engineers from industry join communities, their knowledge and experience are shared frequently, and it is difficult to find this kind of information anywhere else. The theory behind algorithms and programming methods can be found in university courses on MOOCs, but MOOCs offer little real-time industry experience. At H2O WORLD 2015, by contrast, there were sessions with many professionals and CEOs from industry sharing their knowledge and experience. It is a treasure not only for experts in data analysis, but also for business people who are interested in it. I would like to share my own experience in user communities in the future.

 

3. Big companies are supporting user communities

Recently, major IT companies have noticed the importance of user communities and have started supporting them. For example, Microsoft supports the “R Consortium” as a platinum member, and Google and Facebook support the communities around their open source projects, “TensorFlow” and “Torch”. New things are likely to happen and be developed among users outside the companies, so supporting user communities is also beneficial to the big IT companies themselves. Many other IT companies support communities, too; you can find many of their names listed as sponsors of the big user-community conferences.

 

The next big user-community conference is “useR! – International R User Conference 2016”, which will be held in June 2016. Why don't you join us? You may find a lot of interesting things there. It must be exciting!

 

Note: Toshifumi Kuga’s opinions and analyses are personal views and are intended to be for informational purposes and general interest only and should not be construed as individual investment advice or solicitation to buy, sell or hold any security or to adopt any investment strategy.  The information in this article is rendered as at publication date and may change without notice and it is not intended as a complete analysis of every material fact regarding any country, region, market or investment.

Data from third-party sources may have been used in the preparation of this material, and I, the author of the article, have not independently verified or validated such data. TOSHI STATS SDN. BHD. and I accept no liability whatsoever for any loss arising from the use of this information, and reliance upon the comments, opinions and analyses in the material is at the sole discretion of the user.

“Speed” is the first priority of data analysis in the age of big data


When I learned data analysis a long time ago, the number of samples in a dataset was between 100 and 1,000, because teachers needed to explain the data in detail, and there were only a few parameters to calculate. Therefore, most statistical tools could handle such data within a reasonable time; even spreadsheets worked well. Now, however, data volumes are huge and there are often more than 1,000 or 10,000 parameters to calculate. We have a problem analyzing the data because it takes too long to complete the analysis and obtain the results. This is the problem in the age of big data.

This is one of the biggest reasons why a new generation of machine learning tools and languages has appeared on the market. Facebook open-sourced its deep learning modules for Torch in January 2015, H2O 3.0 was released as open source in May 2015, and TensorFlow was released as open source by Google this month. Each of them describes itself as “very fast”.

 

Let us consider each of these new languages. I think each puts great importance on the speed of calculation. Torch uses LuaJIT plus C, H2O uses Java behind the scenes, and TensorFlow uses C++. LuaJIT, Java and C++ are usually much faster than scripting languages such as Python or R, so these new-generation tools should be faster when big data have to be analyzed.

Last week, I mentioned deep learning with R+H2O. Now let me check how fast H2O can run models and complete the analysis. This time, I use H2O FLOW, an awesome GUI, shown below. The deep learning model runs on my MacBook Air 11 (1.4 GHz Intel Core i5, 4 GB memory, 121 GB HD) as usual. A summary of the data is as follows:

  • Data: MNIST hand-written digits
  • Training set: 19,000 samples with 785 columns
  • Test set: 10,000 samples with 785 columns

Then I create a deep learning model with three hidden layers of 1024, 1024 and 2048 units respectively. You can see it in the red box in the screenshot below; a rough R equivalent is sketched after the screenshot. It is a fairly complex model, as it has three layers.

[Screenshot: deep learning model settings in H2O Flow (DL MNIST1 model)]
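For readers who prefer scripts to the Flow GUI, roughly the same model can be specified from R and timed with system.time(), as in the sketch below. The file names and the epoch count are assumptions; the hidden-layer sizes follow the Flow setup above.

library(h2o)
h2o.init(nthreads = -1)

train <- h2o.importFile("mnist_train_19000.csv")   # hypothetical local copies of the data
test  <- h2o.importFile("mnist_test_10000.csv")
train[, 785] <- as.factor(train[, 785])            # column 785 holds the digit label
test[, 785]  <- as.factor(test[, 785])

elapsed <- system.time(
  dl <- h2o.deeplearning(
    x = 1:784, y = 785,
    training_frame = train,
    validation_frame = test,
    hidden = c(1024, 1024, 2048),                  # three hidden layers, as in the screenshot
    epochs = 10                                    # assumed; more epochs mean a longer run
  )
)
print(elapsed)                                     # wall-clock time of the training run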

The training run took just 20 minutes to complete. That is amazing! It is very fast, given that deep learning requires many calculations to build the model. If deep learning models can be developed within 30 minutes, we can try many models with different parameter settings to understand what the data mean and obtain insight from them.

[Screenshot: training run time reported by H2O Flow (DL MNIST1 time)]

I did not stop the run before the model had fitted the data. The confusion matrices show an error rate of 2.04% on the training data (red box) and 3.19% on the test data (blue box); a sketch of how to read these matrices from R follows the screenshot. The fit looks good, which means that 20 minutes is enough to create a good model in this case.

[Screenshot: confusion matrices for the training and test data (DL MNIST1 cm)]
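If the same model were built from R rather than Flow, the two confusion matrices could be retrieved roughly as follows; this sketch reuses the model object dl and the test frame from the previous snippet.

h2o.confusionMatrix(dl)                        # training confusion matrix with per-class and total error
perf <- h2o.performance(dl, newdata = test)    # score the held-out test set
h2o.confusionMatrix(perf)                      # test confusion matrix; the bottom-right cell is the overall error rate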

 

It is now almost impossible to understand data just by looking at them carefully, because they are too big to inspect by eye. Through analytic models, however, we can understand what the data mean. The faster the analysis can be completed, the more insight can be obtained from the data. That is wonderful for all of us. Yes, we can even have enough time to relax and enjoy coffee and cake after our analyses are completed!

 

 

Note: Toshifumi Kuga’s opinions and analyses are personal views and are intended to be for informational purposes and general interest only and should not be construed as individual investment advice or solicitation to buy, sell or hold any security or to adopt any investment strategy.  The information in this article is rendered as at publication date and may change without notice and it is not intended as a complete analysis of every material fact regarding any country, region, market or investment.

Data from third-party sources may have been used in the preparation of this material, and I, the author of the article, have not independently verified or validated such data. TOSHI STATS SDN. BHD. and I accept no liability whatsoever for any loss arising from the use of this information, and reliance upon the comments, opinions and analyses in the material is at the sole discretion of the user.

 

This is my first “Deep Learning” with “R+H2O”. It is beyond my expectations!


Last Sunday, I tried deep learning in H2O, because I need this method of analysis in many cases. H2O can be called from R, so it is easy to integrate H2O into an R workflow. The result is completely beyond my expectations. Let me look at it in detail!

1. Data

The data used in this analysis are “The MNIST database of handwritten digits”. It is well known among data scientists because it is frequently used to validate the performance of statistical models. The handwritten digits look like this (1).

[Image: sample MNIST handwritten digits (MNIST)]

Each row of the data contains the 28 × 28 = 784 raw grayscale pixel values, from 0 to 255, of a digitized digit (0 to 9). The original MNIST data set is as follows:

  • Training set of 60,000 examples
  • Test set of 10,000 examples
  • Number of features: 784 (28 × 28 pixels)

The data used in this analysis can be obtained from the website (a training set of 19,000 examples and a test set of 10,000 examples).
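A minimal sketch of loading these files into H2O from R is shown below. The file names are hypothetical, and it assumes the digit label is stored in the 785th column next to the 784 pixel columns.

library(h2o)
h2o.init(nthreads = -1)

train <- h2o.importFile("mnist_train_19000.csv")   # 19,000 rows x 785 columns (hypothetical path)
test  <- h2o.importFile("mnist_test_10000.csv")    # 10,000 rows x 785 columns (hypothetical path)

# Convert the label column to a factor so H2O treats the task as classification
train[, 785] <- as.factor(train[, 785])
test[, 785]  <- as.factor(test[, 785])

dim(train)   # should report 19000 rows and 785 columns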

 

 

2. Developing models

The statistical model learns from the training set and predicts what each digit in the test set is. The error rate is calculated as “number of wrong predictions / 10,000”. The world record is 0.83% for models without convolutional layers, data augmentation (distortions) or unsupervised pre-training (2), which means that such a model makes only 83 wrong predictions out of 10,000 samples.

This is an image of RStudio, the IDE for R. I called H2O from R and wrote “h2o.deeplearning()”; the details are shown in the blue box in the screenshot below, and a rough sketch of the call follows the screenshot. I developed a model with two hidden layers of 50 units each. The error rate is 15.29% (in the red box), so the model needs more improvement.

[Screenshot: RStudio output for the two-layer model, error rate 15.29% (DL 15.2)]
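A hedged sketch of that first call is below. It continues from the data-loading snippet above, and the defaults I left untouched (epochs and so on) may differ from the run in the screenshot.

# Two hidden layers of 50 units each, as in the first attempt
dl_small <- h2o.deeplearning(
  x = 1:784, y = 785,
  training_frame = train,
  validation_frame = test,
  hidden = c(50, 50)
)

# Error rate on the test set: wrong predictions / 10,000
pred   <- as.data.frame(h2o.predict(dl_small, newdata = test))
labels <- as.data.frame(test[, 785])
error_rate <- mean(as.character(pred$predict) != as.character(labels[[1]]))
print(error_rate)

# The larger model described next only changes the hidden layers:
#   hidden = c(1024, 1024, 2048)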

Then I increased the number of layers and their sizes. This time, I developed a model with three hidden layers of 1024, 1024 and 2048 units respectively. The error rate is 3.22%, much better than before (in the red box). It took about 23 minutes to complete, so there is no need to use more powerful machines or clusters so far (I used only my MacBook Air 11 in this analysis). I think I can improve the model further if I tune the parameters carefully.

[Screenshot: RStudio output for the three-layer model, error rate 3.22% (DL 3.2)]

Usually, deep learning programming is a little complicated, but H2O lets us use deep learning without any programming at all when the graphical user interface “H2O FLOW” is used. When you would rather use R, the deep learning command that calls H2O is similar to the commands for linear models (lm) or generalized linear models (glm) in R. Therefore, it is easy to use H2O with R; a small sketch of the comparison follows.
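To make that comparison concrete, here is a hedged sketch that contrasts the base R calls with the corresponding H2O call, using the built-in mtcars data purely as a stand-in; the chosen columns and layer sizes are arbitrary.

# Base R modelling interfaces on a built-in data set
fit_lm  <- lm(mpg ~ wt + hp, data = mtcars)
fit_glm <- glm(am ~ wt + hp, data = mtcars, family = binomial)

# The H2O call has the same flavour, but takes column names (or indices)
# and an H2OFrame instead of a formula and a data.frame
library(h2o)
h2o.init()
cars_hex <- as.h2o(mtcars)
cars_hex$am <- as.factor(cars_hex$am)      # classify automatic vs manual transmission
fit_dl <- h2o.deeplearning(x = c("wt", "hp"), y = "am",
                           training_frame = cars_hex, hidden = c(10, 10))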

 

 

This was my first deep learning with R+H2O. I found that it can be used for a wide variety of data analysis cases. When I am not satisfied with traditional methods such as logistic regression, I can use deep learning without difficulty. Although it needs a little parameter tuning, such as the number of layers and their sizes, it can bring better results, as my experiment shows. I would like to try R+H2O in Kaggle competitions, where many experts compete for the best result in predictive analytics.

 

P.S.

The strongest competitor to H2O appeared on 9 November 2015: “TensorFlow” from Google. Next week, I will report on this open source software.

 

Source

1. The image is from GitHub, cazala/mnist

https://github.com/cazala/mnist

2. The Definitive Performance Tuning Guide for H2O Deep Learning, Arno Candel, February 26, 2015

http://h2o.ai/blog/2015/02/deep-learning-performance/

 

Note: Toshifumi Kuga’s opinions and analyses are personal views and are intended to be for informational purposes and general interest only and should not be construed as individual investment advice or solicitation to buy, sell or hold any security or to adopt any investment strategy.  The information in this article is rendered as at publication date and may change without notice and it is not intended as a complete analysis of every material fact regarding any country, region, market or investment.

Data from third-party sources may have been used in the preparation of this material, and I, the author of the article, have not independently verified or validated such data. TOSHI STATS SDN. BHD. and I accept no liability whatsoever for any loss arising from the use of this information, and reliance upon the comments, opinions and analyses in the material is at the sole discretion of the user.