Do you know how computers can read e-mails instead of us?


Hello, friends. I am Toshi. Today I am updating my weekly letter. This week’s topic is “e-mail”. Everyone now uses e-mail to communicate with customers, colleagues and family. It is useful and efficient. However, if you try to read a massive number of e-mails manually, it takes a lot of time. Recently, computers have become able to read e-mails for us and separate the potentially relevant ones from the rest. So I wonder how computers can do that. Let us consider it a little.

1.  Our words can become “data”

When we hear the word “data”, we imagine numbers in spreadsheets. This is a kind of “traditional” data. Formally, it is called “structured data”. On the other hand, text such as the words in e-mails, Twitter and Facebook can be “data”, too. This kind of data is called “unstructured data”. Most of the data around us exists as “unstructured data”. However, computers can transform it into data that can be analyzed. This is generally an automated process, so we do not need to check each item one by one. Once this new data has been created, computers can analyze it at astonishing speed. This is one of the biggest advantages of using computers to analyze e-mails.

2. Classification comes again

Actually, there are many ways for computers to understand e-mails. These methods are collectively called “natural language processing (NLP)”. One of the most sophisticated methods uses machine learning and tries to understand the meaning of sentences by looking at their structure. Here I would like to introduce one of the simplest methods so that everyone can understand how it works. It is easy to imagine that the number of times each word appears can be data. Take, for example, “I want to meet you next week.” In this case, (I,1), (want,1), (to,1), (meet,1), (you,1), (next,1), (week,1) are the data to be analyzed. The longer the sentences are, the more words appear as data. For example, suppose we analyze e-mails from customers to assess who is satisfied with our products. If the counts of positive words, such as like, favorite and satisfy, are high, it might mean that the customer is satisfied with the products, and vice versa. This is a “classification” problem, so we can apply the same method that I explained before. The “target” is “customer satisfied” or “not satisfied”, and the “features” are the counts of each word.
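The word-count idea can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not how any real product works: the function names `count_words` and `positive_score` and the small list of positive words are made up for this example.

```python
import re
from collections import Counter

# Hypothetical list of positive words, chosen only for this illustration.
POSITIVE_WORDS = {"like", "favorite", "satisfy", "satisfied"}


def count_words(text):
    """Turn a text into word counts, ignoring case and punctuation."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)


def positive_score(text):
    """Count how many times any of the positive words appears in the text."""
    counts = count_words(text)
    return sum(counts[word] for word in POSITIVE_WORDS)


# The example sentence from the text: every word appears exactly once.
counts = count_words("I want to meet you next week.")
print(sorted(counts.items()))

# A made-up customer e-mail: a higher score hints at a satisfied customer.
print(positive_score("I like this product. It is my favorite."))
```

In a real classifier, the counts of all words would be the “features”, and the computer would learn from labeled e-mails which words matter, instead of relying on a hand-written word list.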

3. What’s the impact on businesses?

If computers understand what we say in text such as e-mails, we can make the most of it in many fields. In marketing, we can analyze the voices of customers from a massive number of e-mails. In legal services, computers identify which e-mails are potentially relevant as evidence in litigation. This is called “e-discovery”. In addition, I found that the Bank of England has started monitoring social networks such as Twitter and Facebook in order to research the economy. This is a kind of “new wave” of economic analysis. These are just examples. I think you can come up with many business applications yourself, because we are surrounded by so many e-mails now.

In my view, natural language processing (NLP) will play a major role in the digital economy. Would you like to exchange e-mails with computers?

When will self-driving cars be available in Asia? We should reconsider the regulations on them.


Last year I studied “machine learning” on Coursera and found that it is very useful for developing self-driving cars. This course was created in 2011. Since then, there has been much progress in self-driving cars. Last week I found two articles on the subject. One is about self-driving cars by Google and the other is about an autonomous truck. Let us see what they are and consider the impact of these vehicles once they are available to us.


1. Self-driving cars

This is one of the most ambitious self-driving car projects, because its goal is a car that needs no driver intervention. According to the Google website, “a few of the prototype vehicles we’ve created will leave the test track and hit the familiar roads of Mountain View, Calif., with our safety drivers aboard.” The car looks small and cute. However, with computers and sensors, it can run without human intervention. I imagine machine learning is used to control self-driving cars, as I learned on Coursera. Because the machine can “learn” new things from data, the more self-driving cars run, the safer and more sophisticated they should become. Therefore collecting a lot of data on self-driving cars is critically important. I wonder when they will be able to run without drivers.


2. Autonomous truck

The other is an autonomous truck. According to Bloomberg, “Regulatory and technological obstacles may hold back the driverless car for decades. But one of the first driverless semi-trucks is already driving, legally, on the highways of Nevada.” This truck can drive itself on highways. But for difficult tasks such as driving in parking lots, a human should take over and drive. It looks like “a truck which is supported by computers”. Unlike the self-driving cars by Google, this truck needs a human driver. But it must be helpful for truck drivers when they drive on highways for a long time.


3. What is needed to promote self-driving cars?

Firstly, we need to consider regulations on how self-driving cars are allowed to run on public roads, because the more data is available, the more sophisticated self-driving cars become. Data is like “fuel” for developing the computers that control the cars. Therefore regulations that allow self-driving cars to run in the real world and collect data are very important for accelerating their development.


4. What are the impacts on our society?

In aging societies such as Japan, older people sometimes find it difficult to drive to hospitals or shopping malls. In such cases, the self-driving car is one solution to the problem. With self-driving cars, senior citizens can go anywhere they want without driving. In emerging regions such as ASEAN, a lot of trucks are needed to build infrastructure and lifelines all over each country. So it would be very useful if self-driving trucks were permitted to run across national borders. Therefore, regulations should be considered for the region as a whole rather than country by country.

In the long run, we should prepare for the shift from the current situation to a digital economy. This means that some jobs might be replaced by computers with machine learning. The more self-driving cars are available, the fewer truck drivers and taxi drivers are needed. Andrew Ng, the famous machine learning researcher, talked about this shift in an article: “A midrange challenge might be truck-driving. Truck drivers do very similar things day after day, so computers are trying to do that too.”



No one knows exactly when self-driving cars will be available to the public. Looking at how the technology is developing, it does not seem to be far in the future. We may have a lesson to learn from self-driving cars. Andrew Ng says in the article, “Computers enhanced by machine learning are eliminating jobs long done by humans. The trend is only accelerating.”

What do you think?

This is a game changer for the MBA in a digital economy, isn’t it?


Hi friends, I am Toshi. Today I am updating my weekly letter. This week’s topic is the online MBA. If you plan to obtain an MBA, I hope this is useful information for you.

I love MOOCs (Massive Open Online Courses), because I can choose any topic, from computer programming to languages. In addition, I can learn anytime and anywhere I want. Finally, most of the courses are free, so you do not need to pay anything to take them. As the CEO of a start-up, I find this the best way to learn the new things I need. However, it might not suit people who want an MBA degree, because most MOOC certificates are not regarded as formal academic credit, even though MOOCs are provided by many famous universities. For example, I took “Machine Learning” by Associate Professor Andrew Ng of Stanford University last year and received the statement of accomplishment for the course. I think this is one of the best courses for learning state-of-the-art machine learning algorithms. But unfortunately Stanford University does not give academic credit to learners of this course.


But when I found an article about the new online MBA course from the University of Illinois at Urbana-Champaign, I thought it could be a game changer. First, it is a full MBA course with credits. Second, the cost of the MBA is around 20,000 USD, significantly lower than similar online courses. Third, we have the opportunity to take courses in the MBA program, without certain projects, before paying fees. The third point is especially important for lowering the entry barrier, particularly for beginners in MOOCs. “Digital marketing”, one part of the MBA program, is already open to everyone. So we can try it and confirm how this online MBA works before going through the admission process. Therefore, there is little risk of a mismatch between the contents of the program and the needs of students.


I think one of the reasons this new online MBA was developed may be that a lot of students in the United States cannot repay their student loans. Some financial experts warn that these bad loans might be the biggest risk in the credit market. So the high cost of higher education is not only a problem for students but also a problem for society. It is no longer sustainable. This new online MBA can be one solution to the problem. Since major MOOC platforms such as edX and Coursera opened in 2012, MOOC certificates have not been considered equivalent to academic credit. However, this new online MBA may change the situation. I would like to see what other top MBA schools do going forward.


If you are a beginner in MOOCs, how about starting with “Digital marketing”? You can take it without any fee. If you like it and want to become an MBA holder, this new online course can be one of the MBA candidates for you to consider, in addition to a residential MBA. I have already started “Digital marketing” myself in order to enhance my expertise. Would you like to join us?

Now I am taking on a data analysis competition. Would you like to join us?


Hi friends. I am Toshi. Today I am updating the weekly letter. This week’s topic is my challenge. Last Saturday and Sunday I took part in a data analysis competition on a platform called “Kaggle”. Have you heard of it? Let us find out what the platform is and how good it is for us.


This is the welcome page of Kaggle. We can participate in many challenges without any fee. In some competitions, a prize is awarded to the winner. First, after you register for a competition, data is provided to be analyzed. Based on the data, we create models to predict unknown results. Once you submit your predictions, Kaggle returns your score and your ranking among all participants.


In the competition I participated in, I had to predict what kind of news articles would be popular in the future. So the “target” is “popular” or “not popular”. You may already know that this is a “classification” problem because the “target” is of the “do” or “not do” type. So I decided to use the “logistic curve”, which I explained before, to make predictions. I always use “R” as my tool for data analysis.
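To give a feeling for how the logistic curve turns one “feature” into a prediction, here is a minimal sketch. It is written in Python rather than R, and it is not the model I submitted: the feature (number of words in a headline), the six data points, and the names `train_logistic` and `probability_popular` are all invented for illustration.

```python
import math


def sigmoid(z):
    """The logistic curve: maps any number to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(xs, ys, learning_rate=0.5, steps=1000):
    """Fit a weight and a bias for one feature by batch gradient descent."""
    weight, bias = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Difference between the predicted probability and the true label.
        errors = [sigmoid(weight * x + bias) - y for x, y in zip(xs, ys)]
        weight -= learning_rate * sum(e * x for e, x in zip(errors, xs)) / n
        bias -= learning_rate * sum(errors) / n
    return weight, bias


# Made-up toy data: headline length in words, and 1 = popular, 0 = not popular.
word_counts = [3, 4, 5, 9, 10, 12]
popular = [1, 1, 1, 0, 0, 0]

# Scale the feature to mean 0 and spread 1 so that training is stable.
mean = sum(word_counts) / len(word_counts)
spread = math.sqrt(sum((c - mean) ** 2 for c in word_counts) / len(word_counts))


def scale(count):
    return (count - mean) / spread


weight, bias = train_logistic([scale(c) for c in word_counts], popular)


def probability_popular(count):
    """Predicted probability that an article with this headline length is popular."""
    return sigmoid(weight * scale(count) + bias)


print(probability_popular(4))   # short headline, like the popular examples
print(probability_popular(11))  # long headline, like the unpopular examples
```

The output is a probability between 0 and 1. To classify, we can call an article “popular” when the probability is above 0.5 and “not popular” otherwise.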

On the first try of my challenge, I created a very simple model with only one “feature”. The performance was just average. I needed to improve my model to predict the results more accurately.


Then I converted some data from characters to factors and added more features as input. With that, I could improve the performance significantly. The score improved from 0.69608 to 0.89563.

In the final assessment, the data used for predictions was different from the data used in the interim assessments. My final score was 0.85157. Unfortunately, I could not reach 0.9. I should have tried other classification methods, such as random forest, in order to improve the score. But anyway, this is like a game, because every time I submit a result, I obtain a score. It is very exciting when the score improves!
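For readers who do not use R, a “factor” is a column of text labels that the computer treats as a fixed set of categories instead of free text. The sketch below shows the idea in Python. It is only an analogy under my own assumptions, not R’s actual implementation, and the column of article sections and the names `to_factor` and `one_hot` are made up.

```python
def to_factor(values):
    """Return the sorted distinct levels and an integer code for each value."""
    levels = sorted(set(values))
    index = {level: i for i, level in enumerate(levels)}
    codes = [index[value] for value in values]
    return levels, codes


def one_hot(values):
    """Expand a column of category names into one 0/1 column per level."""
    levels, codes = to_factor(values)
    rows = [[1 if code == i else 0 for i in range(len(levels))] for code in codes]
    return levels, rows


# Made-up column: the section in which each article appeared.
sections = ["Business", "Sports", "Business", "Arts"]

levels, codes = to_factor(sections)
print(levels)
print(codes)

levels, rows = one_hot(sections)
print(rows)
```

Once the text labels become 0/1 columns like this, a model such as the logistic curve can use them as “features” in the same way as numbers.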



The list of competitions below is for beginners. Everyone can take on these problems after signing up. I like “Titanic”. In this challenge we predict who survived the disaster. Can we tell who was likely to survive based on data, such as where passengers stayed on the ship? This is also a “classification” problem, because the “target” is “survive” or “not survive”.



You may not be interested in data science itself. But these competitions are worth trying for everyone, because most business managers will have opportunities to discuss data analysis with data scientists in the digital economy. If you know in advance how data is analyzed, you can communicate with data scientists smoothly and effectively. That enables us to obtain what we want from data in order to make better business decisions. I learned a lot from this challenge. Now it’s your turn!