“H2O” is an awesome “digital marketing” tool for everyone!


Last week I found an awesome tool for digital marketing as well as data analysis. It is called “H2O”. Although it is open source software, its performance is incredible and it is easy to use. I would like to introduce it to sales and marketing personnel who are interested in digital marketing.

“H2O is open-source software for big-data analysis. It is produced by the start-up H2O.ai (formerly 0xdata), which launched in 2011 in Silicon Valley. The speed and flexibility of H2O allow users to fit hundreds or thousands of potential models as part of discovering patterns in data. With H2O, users can throw models at data to find usable information, allowing H2O to discover patterns. Using H2O, Cisco estimates each month 20 thousand models of its customers’ propensities to buy while Google fits different models for each client according to the time of day.” according to Wikipedia (1).

Although its performance looks very good, it is open source software, which means that everyone can use this awesome tool without any fee. It is incredible! “H2O” won one of the “Bossie Awards 2015: The best open source big data tools” (2). This image shows the H2O user interface, “H2O Flow”.

H2O Flow

By using this interface, you can use state-of-the-art algorithms such as “deep learning” without programming. This is very important for beginners in data analysis, because they can start analyzing data without writing any code. Dr. Arno Candel, Physicist & Hacker at H2O.ai, said, “And the best thing is that the user doesn’t need to know anything about Neural Networks” (3). Once a model is developed in this user interface, Java code for the model is generated automatically, so it can be used in production systems with ease.



One of the advantages of open source is that many user cases are publicly available. Because open source is public, users’ experiences of “What is good?” and “What is bad?” are easy to share. This image shows a collection of tutorials, “H2O University”. It is also available for free. There are many other presentations and videos about H2O on the internet, too! You may find cases from your industry among them. Therefore, there are a lot of materials for learning H2O by ourselves.

H2O Univ


In addition, “H2O” can be used as an extension of “R”. R is one of the most widely used analytical languages. “H2O” can be controlled easily from the R console, so it integrates well with R. “H2O” can also be used with Python.

There are so many other functionalities in H2O that I cannot write about everything here. I am sure it is an awesome tool for both business personnel and data scientists. I would like to start using “H2O” and publish my experiences with “H2O” going forward. Why don’t you join the “H2O community”?




1. Wikipedia: H2O (software)


2. Bossie Awards 2015: The best open source big data tools


3. Interview: Arno Candel, H2O.ai on the Basics of Deep Learning to Get You Started



Note: Toshifumi Kuga’s opinions and analyses are personal views and are intended to be for informational purposes and general interest only and should not be construed as individual investment advice or solicitation to buy, sell or hold any security or to adopt any investment strategy.  The information in this article is rendered as at publication date and may change without notice and it is not intended as a complete analysis of every material fact regarding any country, region market or investment.

Data from third-party sources may have been used in the preparation of this material and I, Author of the article has not independently verified, validated such data. I and TOSHI STATS.SDN.BHD. accept no liability whatsoever for any loss arising from the use of this information and relies upon the comments, opinions and analyses in the material is at the sole discretion of the user. 

LinkedIn bought a predictive marketing company. What does it mean?


I think many people like LinkedIn as a platform for professionals and are interested in what is going on at the company. Last week, I found this announcement: “Today we are pleased to announce that we’ve acquired Fliptop, a leading provider of predictive sales and marketing software.” (1) (David Thacker wrote this on the blog on 27 August 2015.) LinkedIn bought a leading predictive marketing company. What does it mean? Let me consider it a little.


1. What does “Fliptop” do?

It is a marketing software company. Its website says “DRIVE REVENUE FASTER WITH PREDICTIVE MARKETING”, “Increase lead conversion rates and velocity.” and “Identify the companies most likely to buy”. It was established in 2009, so it is a relatively young company. The company uses a technology called “machine learning” to identify potential customers with a high probability of purchasing products and services. According to the company’s website, it has expertise in standard machine learning algorithms, such as logistic regression and decision trees. These methods are used for classification or prediction. For example, the company can identify who is likely to buy products based on data, including each customer’s past purchase history. It hires computer science experts to develop the prediction models.


2. What will LinkedIn do with Fliptop?

As you know, LinkedIn has a huge customer base, so a massive amount of data is generated by LinkedIn users every day. This data accumulates every second. Therefore LinkedIn needs the ability to make the most of the data, and needs to keep enhancing that ability. LinkedIn should analyze the data and make better business decisions to compete with other IT companies in the market. There are two options for doing that: 1. develop the technology in-house, or 2. acquire resources from outside the company. LinkedIn took option 2 this time. Doug Camplejohn, CEO of Fliptop, said, “We will continue to support our customers with existing contracts for some period of time, but have decided not to take on any new ones. We will also be reaching out to our customers shortly to discuss winding down their existing relationship with Fliptop.” (2) Therefore Fliptop will no longer be an independent service provider and will be integrated into LinkedIn’s functions. It seems that the knowledge and expertise of Fliptop will be seamlessly integrated into LinkedIn in the future. As far as I know now, it is not clear what current users of Fliptop should do.


3.  Data is “King”

This kind of acquisition has been seen in the IT industry recently. Google bought “DNNresearch” in 2013 and “DeepMind” in 2014. Microsoft also bought “Revolution Analytics” in 2015. These small or medium-sized companies have expertise in machine learning and data analysis. When they try to expand their businesses, they need massive amounts of data to analyze. However, they do not own such data. The owners of massive amounts of data are usually big IT companies, such as Google and Facebook, and it is sometimes difficult for relatively small companies to obtain that much data, including customer data. Big IT companies, including LinkedIn, on the other hand, usually own huge amounts of customer data. In addition, big IT companies are now enhancing their resources and expertise for analyzing data as well. Once they have both, new services can be created and offered in a shorter period. The more people use these services, the more accurate and effective they can be. Therefore, it sounds logical for big IT companies to acquire small companies with expertise in data analysis and machine learning. They definitely need that expertise.



From the standpoint of consumers, this is good because they can enjoy many services offered by big IT companies at lower cost. But from the standpoint of companies, competition is getting tougher, as this is occurring not only in the IT industry but in many other industries. LinkedIn now seems to be ready for the competition to come.

Machine learning is sometimes considered the engine, and data the fuel. When they are combined in one place, new knowledge and insights may be found and new products and services may be created. This accelerates changes in the landscape of industries. Mobile, cloud, big data, IoT and artificial intelligence will contribute a lot to this change. It will be exciting to see what happens next.





1. Accelerating Our Sales Solutions Efforts Through Fliptop Acquisition, David Thacker, August 27, 2015 


2. A New Chapter, Doug Camplejohn, August 27, 2015





How can we create good movies based on big data?


Last Sunday, my son came to Kuala Lumpur for his summer vacation, so I took him to the movie theater. I chose “Mission: Impossible – Rogue Nation” to entertain him. In the movie, Tom Cruise is very active. I cannot believe he is older than I am! My son and I enjoyed the movie very much, as his action scenes are amazing.

That made me wonder how we can create good movies. Every year, many movies are made, but few of them stay in people’s minds in the long term. Let me consider this here for a while.

1. How can we define what good movies are?

There are many measures for evaluating movies. Critics can assess their quality. But I would like to keep it simple. The number of customers who watch a movie, or its sales revenue such as “box office”, can be used as a measure of a “good movie”, as it is easy to collect and measure. So, according to our definition of “good movies”, the more people watch a movie, the better it is.


2. Let us consider what is related to the number of customers or sales revenue.

A lot of things relate to them. As with Mission: Impossible, the actors and actresses are very important. The director is also important. In addition, where was the movie made? Is it an action movie, a love story or a thriller? And so on. These may be closely related to the number of customers or the sales revenue of the movie. So past data about things related to the number of customers or sales revenue should be collected.


3. How can we obtain predictions of the number of customers or sales revenue of an unseen movie in advance?

Do you remember last week’s letter about “Target” and “Features”? The “Target” is something that we want to predict, and “Features” are things that are related to the “Target”. Predictions of the “Target” can be obtained by inputting “Features” into a “Statistical model”. I would like to call this unit a “module”. I summarize it as follows.


According to our definition of “good movies”, the Target is the number of customers or the sales revenue of the movie. Features are the actors and actresses, the category of the story, the locations where the movie was shot, and so on. These features are input to statistical models to obtain predictions of the target for unseen movies. Based on this analysis, we could predict the sales of movies before they are shown in theaters. It means that good movies could be created based on this prediction. When the prediction is accurate, film production companies might increase sales revenue because they can create good movies based on predictions of the Target. But in reality, we need to prepare a lot of data to predict accurately. In addition, customer preferences might change suddenly, and it is very difficult to update the statistical models in advance to follow such changes. Therefore, there is a risk that statistical models cannot follow changes in circumstances in a timely manner.



It should be noted that more features will become available as computers come to understand videos and movies. This technology is now in progress (1). It will enable computers to turn videos into text. For example, when there is a scene where a swan is on a lake, computers understand the video and automatically generate sentences that explain the scene. It means that a whole movie can be transformed into text without human intervention, so movies will be analyzed based on their stories, and more features can be identified from the results of this analysis. When new kinds of data become available to us, they may enable us to obtain more features and improve the accuracy of predictions. Would you like to make your own movie in the future?



1. A picture is worth a thousand (coherent) words: building a natural description of images, 17 Nov 2014, Google Research




“Prediction” is very important in analyzing big data for business


It is a good time to reconsider “Big data and digital economy”, because the LinkedIn group of this name now has a four-month history and more than 100 participants. I would like to thank all of you for your cooperation.

In the early 2000s, I worked in the risk management department of a Japanese consumer finance company. The company had a credit risk model that could predict who was likely to default. I studied it in more detail and understood how it worked so accurately. I found that if I collected a lot of data about customers, I could obtain accurate predictions of default events for each customer.

Now, in 2015, I have researched many algorithms and statistical models, including state-of-the-art “deep learning”. While there are many uses and objectives for such models, in my view the most important one for business people is “prediction”, just as in my experience at the consumer finance company, because they need to make good business decisions to compete in their markets.

If you are in the health care industry, you may be interested in predictions about who is likely to be cured. If you are in sales, you may be interested in predictions about who is likely to come to the shop and buy the products. If you are in marketing, you may be interested in who is likely to click on an advertisement on the web. Whatever you do, predictions are very important for your business because they enable you to take the right actions. Let me explain the key points about predictions.



What do you want to predict? The revenue of your business? The number of customers? The satisfaction rate based on client feedback? The price of wine in the near future? You can name anything you want. We call it the “Target”. So, first, the “Target” should be defined so that you can make the right business decisions.



Secondly, let us find things related to your target. We call them “Features”. For example, if you are a salesperson interested in who is likely to buy your products, the features are “attributes of each customer such as age, sex and occupation”, “behavior of each customer, such as how many times he/she comes to the shop per month and when he/she last bought the products”, “what he/she clicked in the web shop”, and so on. Based on the prediction, you can send coupons or tickets to “highly likely to buy” customers in order to increase your sales. If you are interested in the price of wine, the features may be temperature, amount of rain, locations of farms, and so on. If you can predict the price of wine, you might make good investments in wine. These are just simple examples. In reality, the number of features may be 100, 1,000 or more, depending on the data you have. Usually the more data you have, the more accurate your predictions are. This is why data is very important for obtaining predictions.


Evaluation of predictions

Finally, by inputting features into statistical models, predictions of the target can be obtained. You can therefore predict who is likely to buy the products when you think about marketing strategies. This is good for your business, as marketing strategies can be more effective. Unfortunately, customer preferences may change in the long run. When situations and environments such as customer preferences change, predictions may not be accurate anymore. So it is important to evaluate predictions and update statistical models periodically. No model can work accurately forever.
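To make the idea of periodic evaluation a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the function names, the 0.5 cut-off for “likely to buy”, the 70% minimum accuracy and the two small sets of monthly records are all hypothetical, not taken from a real model.

```python
def accuracy(predicted_probabilities, actual_outcomes, threshold=0.5):
    """Share of customers whose predicted class matches what actually happened."""
    hits = 0
    for probability, outcome in zip(predicted_probabilities, actual_outcomes):
        predicted = 1 if probability > threshold else 0
        if predicted == outcome:
            hits += 1
    return hits / len(actual_outcomes)


def needs_update(monthly_accuracy, minimum_accuracy=0.7):
    """Flag the model for an update once accuracy drops below the chosen level."""
    return monthly_accuracy < minimum_accuracy


# Hypothetical records: predicted probability of buying vs. whether the customer bought (1) or not (0).
january = accuracy([0.9, 0.8, 0.2, 0.1, 0.7], [1, 1, 0, 0, 1])
june = accuracy([0.9, 0.8, 0.2, 0.1, 0.7], [0, 1, 1, 0, 0])

for month, value in [("January", january), ("June", june)]:
    status = "update the model" if needs_update(value) else "still fine"
    print(f"{month}: accuracy {value:.0%} -> {status}")
```

In this made-up example the same predictions match every January outcome but only two of the five June outcomes, which is the kind of drift that should trigger an update of the statistical model.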


Once you can obtain predictions, you can implement the prediction process as a daily activity rather than a one-off analysis. It means that data-driven decisions are made on a daily basis. This is one of the biggest aspects of the “digital economy”. From retail shops to health care and the financial industry, predictions are already used in many fields. The methods of prediction are sometimes considered a “black box”. But I do not think it is good to use predictions without understanding the methods behind them. I would like to explain them in my weekly letter in the future. Hope you enjoy it!




Facebook, Twitter, Google and the “new wave” of economic analysis


On Saturday, I found a report from the Bank of England. The report is about economic analysis in central banks using Big data, such as data from social network services. It is useful not only for economic researchers but also for business personnel considering how Big data should be used. So I would like to consider the topic for a while based on this report.

Before considering the usage of Big data, I would like to define “Big data”. Big data are data sets that are granular, available in real time, and include non-numeric as well as numeric data. These data are completely opposite in nature to the data currently analyzed in central banks, which are usually “aggregated, periodical and numeric”. One example of the latter is the financial statements of companies. Big data are different. For example, tweets on Twitter are generated by individuals in real time, and they are usually text, images and video. Then the questions come.


1. Can we build macroeconomic models based on big data?

Central banks are responsible for the stability of the financial system in their countries. Is it possible for a central bank to collect data on each loan from private banks, assess the credit risk of each, and then confirm the financial stability of the country as a whole? The same idea can be applied to private companies, too. Is it possible for a company to collect data on each customer, forecast the amount each customer will purchase, and predict the company’s revenue for the next fiscal year? Big data may enable us to do so, even though it takes time.


2. Is the method used “theory based” or “data driven”?

Even though they cannot be clearly distinguished in practice, there are two approaches to analyzing Big data in economic analysis. Some people put importance on economic theories. Let us call this approach “theory based”. Others take the approach of “Let the data speak for themselves”. We may call it “data driven”. Their conclusions sometimes contradict each other even though they analyze the same data. So we should take a well-balanced approach between the two.


3. Should we change the processes to make business decisions?

Big data comes to us in real time. But the decision-making process in organizations is usually periodical. For example, board of directors meetings and executive committees in companies are generally held on a monthly basis. Should they be held more flexibly and in a timely manner, based on outputs from the analysis of Big data, rather than periodically? The bigger companies become, the more difficult it is to change the process in practice.


The FRB in the US is currently considering when it should raise the US interest rate. The Chairwoman of the FRB has always said, “It is based on economic data”. But I am not sure she cares about data (conversations) on social networking services in the US. What do you think?

Why computers may replace experts in many fields: a view from “feature” generation


Hi friends, I am Toshi. I have updated my weekly letter. Today I explain 1. how a classification, “do” or “do not”, can be obtained with probabilities and 2. why computers may replace experts in many fields, from legal services to retail marketing. These two things are closely related to each other. Let us start now.


1.  How can classification be obtained with probabilities?

Last week, I explained that the “target” is very important and that the “target” is expressed by “features”. For example, whether a customer will “buy” or “not buy” may be expressed by the customer’s age and the number of overseas trips he/she takes a year. So I can write it this way: “target” ← “features”. This week, I will show you that the value of the “target” can be a probability, which is a number between 0 and 1. If the “target” is closer to “1”, the customer is highly likely to buy. If the “target” is closer to “0”, the customer is less likely to buy. Here is our example of “target” and “features” in the table below.

customer data

I want Susumu’s value of the “target” to be close to “1” when it is calculated from his “features”. How can we do that? Last week we added up the “features”, each multiplied by its “weight”. For example, (-0.2)*30 + 0.3*3 + 6 = 0.9. Here “-0.2” and “0.3” are the weights for each feature respectively, and “6” is a kind of adjustment. Next, let us introduce the curve below. In the case of Susumu, the value from his features is 0.9. So let us put 0.9 on the x-axis; then what is the value of y? According to this curve, the value of y is around 0.7. It means that Susumu’s probability of buying products is around 0.7. If the probability is over 0.5, it is generally considered that the customer is likely to buy.


In the case of Tom, I want his value of the “target” to be close to “0” when it is calculated from his “features”. Let us add up his features in the same way: (-0.2)*56 + 0.3*1 + 6 = -4.9. The value from his features is -4.9. So let us put -4.9 on the x-axis; then what is the value of y? According to this curve, Tom’s probability of buying products is almost 0. Unlike Susumu, Tom is less likely to buy.


This curve is called the “logistic curve”. It is interesting that whatever value “x” takes, “y” is always between 0 and 1. By using this curve, everyone can be given a value between 0 and 1, which is regarded as the probability of the event. This curve is so simple and useful that it is used in many fields. In short, everyone has a probability of buying products, which is expressed as the value of “y”. It means that we can predict who is likely to buy in advance, as long as the “features” are obtained! The higher a customer’s value, the more likely the customer is to buy the products.



2.  Why may computers replace experts in many fields?

Now you understand what “features” are. “Features” are generally set up based on expert opinion. For example, if you want to know who will default in the future, the “features” needed are considered to be “annual income”, “age”, “job”, “past delinquency” and so on. I know them because I used to be a credit risk manager at a consumer finance company in Japan. Experts can identify the features in their own businesses and industries. That is why expert opinion has been valuable so far. However, computers are also creating their own features based on data. These are sometimes so complex that no one can understand them. For example, “-age*3 - number of jobs in the past” has no meaning for us. No one knows what it means. But computers do. Sometimes computers can predict the “target”, which means “do” or “not do”, more precisely with their own features than we can.


In the future, I am sure much more data will be available to us. It means computers will have more chances to create better “features” than experts do. So experts should use the results of predictions made by computers and incorporate them into their own insights and decisions in each field. Otherwise, we cannot compete with computers, because computers can work 24 hours a day, 365 days a year. It is very important to use the results of predictions effectively to enhance our own expertise in the future.



Notice: TOSHI STATS SDN. BHD. and I, author of the blog,  do not accept any responsibility or liability for loss or damage occasioned to any person or property through using materials, instructions, methods, algorithm or ideas contained herein, or acting or refraining from acting as a result of such use. TOSHI STATS SDN. BHD. and I expressly disclaim all implied warranties, including merchantability or fitness for any particular purpose. There will be no duty on TOSHI STATS SDN. BHD. and me to correct any errors or defects in the codes and the software.

“Classification” is significantly useful for our business, isn’t it?


Hello, I am Toshi. I hope you are doing well. I am now considering how we can apply data analysis to our daily business, so I would like to introduce “classification” to you.

If you work in a marketing or sales department, you want to know who is likely to buy your products and services. If you are in legal services, you would like to know who will win a case in court. If you are in the financial industry, you would like to know which of your loan customers will default.

These cases can all be treated as the same kind of problem: “classification”. It means that you pick out the things or events you are interested in from the whole population you have on hand. If you have data about who bought your products and services in the past, you can apply “classification” to predict who is likely to buy and make better business decisions. Based on the results of classification, you can know who is likely to win a case or who will default, together with a numerical measure of certainty, which is called “probability”. Of course, “classification” cannot be a fortune teller. But “classification” can tell us who is likely to do something, or what is likely to occur, with some probability. If a customer has a probability of 90% based on “classification”, it means that the customer is highly likely to buy your products and services.


I would like to give several examples of “classification” for each business. You may want to find clues to the questions below.

  • For sales/marketing personnel

Which movies/music will be in the Top 10 ranking in the future?

  • For personnel in legal services

Who will win the case?

  • For personnel in the financial industry or accounting firms

Who will default in the future?

  • For personnel in the healthcare industry

Who is likely to have a disease or be cured of one?

  • For personnel in asset management marketing

Who is wealthy enough to be offered investments?

  • For personnel in the sports industry

Which team will win the World Series in baseball?

  • For engineers

Why did the spaceship engine explode in the air?


We can think of many more examples, as long as data is available. When we try to solve the problems above, we need past data, including the target variable, such as who bought products, who won cases and who defaulted in the past. Without past data, we can predict nothing. So data is critically important for “classification” to make better business decisions. I think data is “King”.


Technically, several methods are used in classification: logistic regression, decision trees, support vector machines, neural networks and so on. I recommend learning logistic regression first, as it is simple, easy to apply to real problems, and provides basic knowledge for learning more complex methods such as neural networks.
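To give a feel for how simple logistic regression is, here is a minimal sketch that fits one with plain gradient descent in Python. The tiny data set (age in decades and overseas trips per year versus bought or not) is made up purely for illustration, and the learning rate and number of steps are arbitrary assumptions. In practice you would normally use a library rather than this hand-written loop.

```python
import math

# Hypothetical toy data: (age in decades, overseas trips per year) -> bought (1) or not (0).
# Age is expressed in decades so that plain gradient descent behaves well.
customers = [
    ((2.5, 4), 1),
    ((3.0, 3), 1),
    ((3.5, 2), 1),
    ((4.5, 1), 0),
    ((5.6, 1), 0),
    ((6.0, 0), 0),
]


def logistic(x):
    """Logistic curve: maps any x to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


def predict(weights, bias, features):
    """Probability of buying: weighted sum of features plus bias, put on the curve."""
    x = sum(w * f for w, f in zip(weights, features)) + bias
    return logistic(x)


def fit(data, learning_rate=0.1, steps=20000):
    """Learn the weights and bias by repeatedly reducing the prediction error."""
    weights = [0.0, 0.0]
    bias = 0.0
    n = len(data)
    for _ in range(steps):
        grad_w = [0.0, 0.0]
        grad_b = 0.0
        for features, label in data:
            error = predict(weights, bias, features) - label
            for i, f in enumerate(features):
                grad_w[i] += error * f
            grad_b += error
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias


weights, bias = fit(customers)
young_traveller = predict(weights, bias, (3.0, 3))
older_homebody = predict(weights, bias, (5.6, 1))
print(f"P(buy) for age 30, 3 trips: {young_traveller:.2f}")
print(f"P(buy) for age 56, 1 trip:  {older_homebody:.2f}")
```

The learned weights play the same role as the expert’s choice of what matters: on this toy data the model should give the younger frequent traveller a probability above 0.5 and the older customer a probability below 0.5.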


I  would like to explain how classification works in the coming weeks.  Do not miss it!  See you next week!

Is it possible to raise the quality of services if computers can talk to you?


When you go to Uniqlo, the staff talk to you and advise you on how to coordinate your favorite fashion. When you go to a hospital, doctors ask you about your condition and advise you on what you should do to stay healthy. So let us consider whether computers can talk to you and answer your questions instead of a human being.

Getting to know customers is the first step in service industries, as getting to know students is in education. That is why so many people work face to face with customers and students. If computers can deal with customers and students, the quality of services could go up dramatically, because computers are cost-effective and operate 24 hours a day, 365 days a year, without rest.


I like taking open online courses. They are very convenient, as we can watch courses whenever we want as long as an internet connection is available. But the biggest problem is that there is no teacher for each learner to ask when a question comes up. This description explains the problem very well.

“Because of the nature of MOOC-style instruction (Massive Open Online Course), teachers cannot provide active feedback to individual learners. Most MOOCs have thousands of learners enrolled at the same time and engaging personally with each learner is not possible.”

When I cannot understand the lectures or solve the exam problems by myself, it is very difficult to continue learning because I feel powerless. This is one of the reasons why the completion rate of open online courses is very low (usually less than 10%). If you need assistance from instructors, you have to pay fees, which are not cheap for people in developing countries. I want to change this situation.


A technology called “machine learning” may enable us to enjoy conversations with computers across industries, from finance to education. Computers could understand what you ask and provide answers in real time. It will take some time to make computers sophisticated enough to answer exactly what you want. This is like childhood. At the beginning, there is very little knowledge, so it may be difficult to answer questions. Then computers start learning from interactions with humans. The more knowledge they have, the more sophisticated their answers become.

So I would like to start examining how computers learn in order to provide sophisticated answers to learners and customers. If computers obtain enough knowledge effectively, they can talk to you and enjoy conversations with you.  I hope computers can be good partners to us.

Three self-paced online courses that I strongly recommend. They are awesome and free!


If you are a businessman or businesswoman, you cannot always control your own schedule.  A client may ask for a meeting at short notice.  An emergency may happen and you have to cope with it.  That is why it is difficult for business people to complete online training and courses in their limited time.

However, there is no need to worry about that.  As the number of online courses increases, the number of self-paced courses is also increasing.  Coursera, one of the biggest platforms for online courses, has 70 on-demand courses. Unlike session-based courses, self-paced courses have no deadline for completion. This is very good for busy business people because their schedules can be more flexible.

I am now enrolled in several self-paced courses that interest me, but so far I have no concrete schedule to complete them. Instead, whenever I have spare time, such as while waiting for a flight at the airport or when a meeting is suddenly cancelled, I can enjoy these courses. I think it is good!  Here is the list of self-paced online courses I recommend.


1.  Machine Learning

This is the best course for people who want to deeply understand what is going on in the digital economy.  Andrew Ng (Associate Professor, Stanford University; Chief Scientist, Baidu; Chairman and Cofounder, Coursera) teaches this course on machine learning, the science of getting computers to act without being explicitly programmed.  This state-of-the-art technology is explained in plain English so that people with high school math can understand what machine learning is and how it is used in the real world.  I always recommend this course. But the problem was that we had to complete it within three months, which is too short for most business people.  Now this course is available as a self-paced course, so we can learn at our own pace!


2. Managing Fashion and Luxury Companies

This course is about fashion trends and the fashion industry.  It says “This module is dedicated to a general introduction to fashion and luxury concepts, what they mean, how they are perceived, how they differ, and other basic information on this peculiar industry.”  Courses of this kind are very rare online, so I recommend it.  I expect we can obtain new insights into the fashion industry.


3. Chinese for Beginners

Language courses are good candidates for self-paced learning because we can repeat them as many times as we like. I am currently taking the course on Chinese.  Xiaoyu Liu, Associate Professor, School of Chinese as a Second Language, Peking University, teaches this course for beginners of Chinese.


Yes, you can go to a coffee shop with a wifi connection right now. Open your mobile, go to the Coursera website and sign up.  You can enjoy the courses you choose anytime you want!

This course is the best for beginners of data analysis. It is free, too!


Last week, I started an online course about data analysis. It is “The Analytics Edge” on edX, one of the biggest MOOC platforms in the world (www.edx.org).  The course says “Through inspiring examples and stories, discover the power of data and use analytics to provide an edge to your career and your life.”   I have now completed Units 1 and 2 of the nine units in the course and found that it is the best MOOC for beginners of data analysis. Let me tell you why.


1. There are a variety of data sets to analyze

When you start learning data analysis, the data itself is very important for keeping you motivated.  If you are in sales, sales data is the best for learning data analysis because you are interested in sales as a professional.  If you are in the financial industry, financial data is the best for you.   This course uses a variety of data, from crime rates to automobile sales.  Therefore, you can find data you are interested in. This is critically important for beginners of data analysis.


2. This course focuses on how to use analytics tools, rather than the theory behind the analysis

Many data analysis courses take a long time to explain the theory behind the analysis.  That is required if you want to be a data scientist, because you need theory to construct an analytic method by yourself. However, most business managers do not want to be data scientists.  All they need is a way to analyze data to make better business decisions. For this purpose, this course is good and well balanced between theory and practice.  First a short summary of the theory is provided, then the course moves on to practice. Most of the lectures focus on “how to use R for data analysis”. R is one of the best-known programming languages for data analysis, and it is free for everyone.  The course enables beginners to use R to analyze data step by step.


3. It covers major analytic methods of data analysis.

When you look at the schedule of the course, you find many analytic methods, from linear regression to optimization.  This course covers the major methods that beginners must know.  If you do not have enough time to complete all units, I recommend focusing on linear regression and logistic regression, because both methods are applicable to many cases in the real world.
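To give a feeling for these two methods, here is a minimal sketch. The course itself uses R; this sketch uses plain Python instead, and the function names, toy numbers and settings (`learning_rate`, `steps`) are my own assumptions, not material from the course. Linear regression fits a straight line to numeric outcomes, and logistic regression estimates the probability of a yes/no outcome.

```python
import math


def fit_linear(xs, ys):
    """Least-squares line y = slope * x + intercept for one input variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def sigmoid(z):
    """Turn any number into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, learning_rate=0.1, steps=5000):
    """Fit P(y = 1) = sigmoid(w * x + b) by plain gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradient of the average log loss with respect to w and b.
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Toy data, made up for illustration only.
slope, intercept = fit_linear([1, 2, 3, 4], [3, 5, 7, 9])
w, b = fit_logistic([1, 2, 3, 4, 5, 6], [0, 0, 0, 1, 1, 1])
```

With the toy numbers above, the fitted line can be used to predict a new value as `slope * x + intercept`, and `sigmoid(w * x + b)` gives the estimated probability of a "yes" for a new `x`.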



I think it is worth watching just the videos in Units 1 and 2.  The topics are interesting, especially for people who like baseball. If you do not have enough time to learn R programming, it is OK to skip it. The stories behind the analyses are very good and informative for beginners, so the first time through you may enjoy the story videos and skip the programming videos. If you want a certificate from edX, you need to score at least 55% overall across the homework, the competition and the final exam.  For beginners, it may be difficult to complete the whole course within the limited time (three months).  Do not worry.  I think this course can be taken again at a later time.  So the first time, please focus on Units 1 and 2, then the second time, try the whole course if you can. In addition, most edX courses, including this one, are free for anyone.   You can enjoy them anytime, anywhere, as long as you have internet access.  Would you like to try this course with me (www.toshistats.net)?