Who will be the winner in the competition among convenience stores in Japan?


Today I am in Tokyo, Japan, so I would like to write about convenience stores in Japan. Since 7-Eleven opened here in 1974, many varieties of convenience stores have appeared. But as the population of Japan is decreasing and there are already too many convenience stores, competition among them is getting tougher and tougher. As a result, small and medium-sized chains are having a hard time, and some of them have been acquired by the big chains. It is now almost clear that the big three, 7-Eleven, Lawson and Family Mart, dominate the market.

There are so many convenience stores in Japan that they have a huge impact on the merchandise sold here. For example, most convenience stores serve cups of coffee at the counter, and some of them are self-service. The taste is very good even though the price is reasonable (around 100 JPY). This coffee is getting popular and competes with canned coffee from vending machines and with coffee shops. You can try it when you come to Japan, as there are many convenience stores near the stations.


In terms of using big data for business decisions, I am very interested in Lawson because it analyzes data from its stores and predicts which products will be popular. This picture is my “Ponta CARD”. When I buy products at Lawson, I present it at the counter, so Lawson knows what I buy and when. The card works all over Japan, so a huge amount of data is collected and analyzed every day.


According to the “Top Management Message October 7, 2015” by Lawson, the company is introducing a more advanced “semi-automatic ordering system”. Let us see what it is.

“We began introducing our new semi-automatic ordering system from June to improve the delivery of products to our stores. The system is designed to recommend the most appropriate product lineup and number of items for delivery based on a range of data for ready-made snack meals and other categories, such as Ponta member purchasing trends, a store’s most recent sales data and information on heavy user purchases, information from other stores with a similar customer base, the weather, and finally information on the various campaigns conducted. The semi-automatic ordering system had been introduced in approximately 7,500 stores at the end of August 2015.” (1)

It is amazing! A convenience store is usually not so big. Therefore, it is very important to know which products, and how many of them, should be on the store shelves. Data tells us how to do that accurately! Going forward, I would like to research which factors are important in this analysis. You may be interested in them, too!


When you come to Japan, you can find convenience stores on every corner of the cities. They sell many kinds of onigiri (rice balls), bento (lunch boxes), bread, beverages and sweets. Most of them are open 24 hours a day, so you can enjoy shopping anytime you want. Let us go there and see who will be the winner in the competition among convenience stores in Japan!



1. Lawson website




Note: Toshifumi Kuga’s opinions and analyses are personal views and are intended to be for informational purposes and general interest only and should not be construed as individual investment advice or solicitation to buy, sell or hold any security or to adopt any investment strategy. The information in this article is rendered as at publication date and may change without notice and it is not intended as a complete analysis of every material fact regarding any country, region, market or investment.

Data from third-party sources may have been used in the preparation of this material and I, the author of the article, have not independently verified or validated such data. I and TOSHI STATS.SDN.BHD. accept no liability whatsoever for any loss arising from the use of this information, and reliance upon the comments, opinions and analyses in the material is at the sole discretion of the user.

What will flight service be like in the future? I am writing this in the air!


Now I am in the air, flying from Kuala Lumpur to Tokyo on a business trip. I always use AirAsia because it is convenient and reasonably priced. Since AirAsia started operating, it has been getting cheaper to fly from Kuala Lumpur to Tokyo. That is very good, especially for younger generations, and I would very much like to welcome them to Japan. So I am wondering what flight service will be like in the future. Let us consider it together!


1. Service on flight

Low-cost carriers, including AirAsia, carry more customers per flight than legacy carriers in order to reduce ticket prices. Therefore the service for each customer is not the same as on legacy carriers. I think, however, that it will be improved dramatically with the support of digital technologies. Each seat might be equipped with an electronic dashboard that provides all information, such as flight schedules and emergency evacuation procedures. This information would be translated into many languages by machine translation, so there would be no need to worry about language barriers. (On my AirAsia flight, English, Japanese and Malay are used in the flight announcements.) In-flight meals will be improved, too. We might order meals on demand through the electronic dashboard whenever we want to eat. This data can be collected customer by customer, so the preferences of each customer might be known in advance. This technology is called “personalization”. Low-cost carriers might therefore predict what kinds of meals are needed on a flight based on the past orders of each customer. That would enable them to widen the variety of meals served, because there is less risk of running out of meals on the flight. To deliver the meals, robot assistants might support the cabin attendants so that meals are served smoothly. I would be excited if I could choose from many varieties of meals on demand.


2. Immigration

Before boarding, it takes time to pass through immigration. I always think it could be made more efficient with a technology called “face recognition”. Computers can identify who you are by comparing your face with the face image stored in your passport. Everyone benefits if it takes less time to pass through immigration. If the system is connected to an INTERPOL database, it can also enhance the identification of criminals.


3.  Maintenance

Airplanes have a massive number of parts. Therefore, maintenance is critically important to keep flights safe. Low-cost carriers in particular have less time to maintain airplanes between landing and taking off again. Maintenance can be enhanced by technologies called the “internet of things” and “predictive analytics”. In the internet of things, each part has sensors and provides data periodically through the internet. Data from the sensors is collected and analyzed by predictive analytics to predict in advance which parts are likely to fail. Maintenance can be more effective by using these predictions. Data from the sensors can be transmitted from airplanes to airports even while the airplanes are in the air. Therefore failed parts, or parts likely to fail, can be identified before the airplanes land. This enables us to decrease the time needed for maintenance.


Beyond low-cost carriers themselves, the airplane in the air might be connected to other industries such as hotels. For example, the flight might be delayed due to bad weather, and customers might need hotel reservations because the flight will land at midnight. In such a case, we could reserve hotels through the digital dashboard at each seat. It is good to be able to make a hotel reservation even while we are in the air!

I hope my flights will be more comfortable in the future! Do you agree?





This new toy looks so bright! Do you know why?


Last week I found that a new toy for young children called “CogniToys” is being developed through a project on Kickstarter, one of the biggest crowdfunding platforms. The developer is Elemental Path, one of the three winners of the IBM Watson competition. Let us see why it is so bright!

According to the company’s website, this toy is connected to the internet. When a child talks to the toy, it can reply because it can understand what the child says and answer the child’s question. It usually takes less than one second to answer because the IBM Watson-powered system is powerful enough to calculate answers quickly.


Let us look at the descriptions of this company’s technology.

“The Elemental Path technology is built to easily license and integrate into existing product lines. Our dialog engine is able to utilize some of the most advanced language processing algorithms available driving the personalization of our platform, and keeping the interaction going between toy and child.”

The key words are: 1. Dialog, 2. Language processing, 3. Personalization


1. Dialog

This toy communicates with children through conversation rather than programming. Therefore a technology called “speech recognition” is needed. This technology is also applied in real-time machine translation, such as in Microsoft’s Skype.


2. Language processing

In the area of machine learning, this is called “natural language processing”. Based on the structure of sentences and phrases, the toy understands what children say. IBM Watson is very strong in the field of natural language processing because it had to understand the meaning of the questions when it competed in Jeopardy.


3. Personalization

It is beneficial if the toy already knows a child’s preferences when the child talks to it. This technology is called “personalization”. Through interactions between children and the toy, it can learn what children like to know about. This technology is often used by companies such as Amazon and Netflix. As far as I know, the method of personalization has not been disclosed. I am very interested in how the personalization mechanism works.


In short, machine learning enables this toy to work and be smart. Machine learning functions are provided as a service by big IT companies such as IBM and Microsoft. Therefore, more applications of this kind are expected to come onto the market in the future. This is amazing, isn’t it? I imagine the next versions of the toy will be able to see images, identify what they are and share them with children, because image recognition technology is also offered as a service by big companies.

I ordered one CogniToy through Kickstarter. It is expected to be delivered in November this year. I will report how it works when I get it!


Note: IBM, IBM Watson Analytics and the IBM logo are trademarks of International Business Machines Corporation, registered in many jurisdictions worldwide.

Mobile services will be enhanced by machine learning dramatically in 2015, part 2


Happy new year! The beginning of 2015 is a good time to consider what will happen in the fields of machine learning and mobile services this year. Following last week’s post, we consider recommender systems and the internet of things as well as investment technologies. I hope you enjoy it!


3. Recommender systems

Recommender systems are widely used, from big companies such as amazon.com to small and medium-sized companies. Going forward, as image recognition technology progresses rapidly, consumer-generated data such as pictures and videos will surely be used to analyze consumers’ behavior and construct their preferences effectively. This means that unstructured data can be collected and analyzed by machine learning in order to make recommendations more accurate. This creates a virtuous cycle: the more people take pictures with smartphones and send them through the internet, the more accurate recommendations become. It is a good example of personalization. In 2015 a lot of mobile services will have functions for personalization so that everyone can be satisfied with them.


4. Internet of things

This is also one of the big themes of the internet. As sensors get smaller and cheaper, a lot of devices and equipment, from smartphones to automobiles, have more sensors in them. These sensors are connected to the internet and send data in real time. This will completely change the way equipment is maintained. If the fuel efficiency of your car is getting worse, it may be caused by an engine problem, so maintenance will be needed as soon as possible. By using the classification algorithms of machine learning, it should be possible to predict fatal failures of automobiles, trains and even homes. All notifications will be sent to smartphones in real time. This leads to a greener society as efficiency increases in terms of energy consumption and emission control.


5. Investment technology

As far as I know, I rarely heard of new technologies being introduced in investment and asset management in 2014. However, I imagine that some fintech companies might use reinforcement learning, one of the categories of machine learning. Unlike in image recognition and machine translation, the right answers are not so clear in the fields of investment and asset management. Reinforcement learning might address this in practice, so that machine learning can be applied to this field. Of course, the results of the analysis must be sent to smartphones in real time to support investment decisions.


Mobile services will be enhanced dramatically in 2015 because machine learning technologies will be connected to each customer’s mobile phone. Mobile services with machine learning will change the landscape of each industry sooner rather than later. Congratulations!


Mobile services will be enhanced by machine learning dramatically in 2015


Merry Christmas! The end of 2014 is approaching. It is a good time to consider what will happen in the fields of machine learning and mobile services in 2015. This week we consider machine translation and image recognition; next week, recommender systems and the internet of things as well as mobile services with machine learning. I hope you enjoy it!


1.  Machine translation / Text mining

Skype is a top innovator in this field. Microsoft has already announced that machine translation between English and Spanish is available in Skype. So in 2015 it may become possible to translate between English and other languages. Text translation is also available among 40 languages in its chat service. So language barriers are getting lower and lower. It is still difficult for computers to answer questions automatically, but this is also gradually improving. Mizuho Bank announced that it will use IBM Watson, one of the famous artificial intelligence systems, to assist call center operators. These technologies make it easier to develop global services, as manuscripts and frequently asked questions can be translated from one language to another automatically. I love that because my educational programs can be expanded all over the world!


2. Image recognition

Since computers learned to identify images of cats automatically by deep learning, image recognition technology has progressed dramatically. SoftBank announced that Pepper, a new robot for consumers, will be able to read human emotions. In my view, the most important factor in reading emotions must be image recognition of human facial expressions. Pepper could be very good at this and therefore able to read human emotions. Image recognition technology is very useful for us, as every smartphone has a nice camera and it is easy for people to take pictures and send them to the cloud and social media. Image recognition enables us to analyze the massive number of images sent through the internet. That data must be a treasure for us.


These machine learning technologies must be connected to each customer’s mobile phone in 2015. This means that mobile services will be enhanced dramatically by machine learning. All the information around us will be collected through the internet and sent to machine learning systems in real time, and machine learning will return the best answer for each individual. This will be the standard model of mobile services as the speed of calculation and communication increases rapidly.

Next week we consider recommender systems,  internet of things and investment technology.  See you next week!

Financial industry and artificial intelligence


UBS announced that it will deliver personalized advice to the bank’s wealthy clients by using artificial intelligence (AI). UBS plans to roll out a digital service in Asia next April.

I think this is one example of financial institutions moving to “digital personalized marketing” by artificial intelligence. In the future, personalized services by AI will be one of the key strategic technologies in the financial industry. Let us consider in more detail how artificial intelligence is implemented and used in marketing in the financial industry.


1. data

This is the basis for the analysis that predicts what financial products customers want. According to the article about UBS, the founders of Sqreem said in their presentation that they crawl through a wide range of openly available, unstructured data. Let me explain unstructured data. It means data that is not organized in a database in the way we usually see. So I assume a massive amount of data could be gathered automatically. Data might be gathered in real time, so final outputs such as recommendations might also be provided in real time. It is a dynamic process rather than a static one.


2. algorithm

As far as I know, there is no detailed disclosure of how the calculations are done, so these are my assumptions based on the article. This might be a kind of recommender system. As the article says, it focuses on the behavior of customers. The behavior of customers could be identified at a deeper level, and precise recommendations could be provided to individual customers effectively. In my view, this system might be an online learning system, too. This means that the algorithm could learn new things by itself, be updated based on streaming data in real time and adjust to changes in customers’ preferences.
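To make the idea of online learning a little more concrete, here is a minimal sketch in Python. It is only my illustration under simple assumptions, not the method used by UBS or Sqreem, which is not disclosed. The function name update_preferences, the product categories and the learning rate are all hypothetical, and a plain exponential moving average stands in for a real online learning algorithm.

```python
def update_preferences(preferences, event_category, learning_rate=0.2):
    """Update scores after ONE new event, without retraining on old data.

    The observed category moves toward 1, all other categories toward 0
    (an exponential moving average, used here as a simple stand-in).
    """
    updated = {}
    for category, score in preferences.items():
        target = 1.0 if category == event_category else 0.0
        updated[category] = score + learning_rate * (target - score)
    return updated


# Hypothetical stream of customer interactions arriving one by one.
prefs = {"funds": 0.5, "bonds": 0.5, "insurance": 0.5}
for event in ["funds", "funds", "bonds", "funds"]:
    prefs = update_preferences(prefs, event)

# The category with the highest score would be recommended next.
best = max(prefs, key=prefs.get)
```

Each new event changes the scores immediately, so the recommendation can follow a change in the customer’s behavior without waiting for a batch recalculation.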


3. output

These are also my assumptions based on the article. The article mentioned mobile phones and other digital devices. I think recommendations might be provided to individual customers mainly through their mobile phones. Mobile phones could be the personal interface to banks and financial institutions. One of the biggest advantages of mobile phones is that customers’ preferences could be gathered through the interaction between customers and banks without any formal inquiry to the customers.


This is not the end of the story but the beginning of it. As technology progresses, a lot of industries will try to introduce this kind of personalized recommender system. This is the marketing of the digital era, in which everyone can obtain the best products and services among a lot of choices. How wonderful it is!

What is singular value decomposition?


Last week I introduced the inner product as a simple model for recommender systems. This week I would like to introduce a more advanced model for recommender systems. It is called singular value decomposition.


According to Mining Massive Datasets on Coursera, one of the best online courses about machine learning and big data, singular value decomposition, or SVD, is defined as follows.

Matrix A=UΣV’ (V’ is the transpose of V)

U : matrix of left singular vectors

Σ : diagonal matrix of singular values

V : matrix of right singular vectors

The row vectors and column vectors of matrix A can be transformed into a lower-dimensional space. This space is called the “concept” space. In other words, row vectors and column vectors can be mapped to a concept space that has fewer dimensions than the row and column vectors of matrix A. The strength of each concept is given by Σ, a diagonal matrix whose diagonal entries (the singular values) are non-negative. When SVD is applied to recommender systems, the row vectors of matrix A can be customers’ preferences and the column vectors can be item features. For example, movies can be classified as SF movies or romance movies, which are “concepts”. Each customer may like SF movies or romance movies. We can predict unknown ratings for customers and items by using SVD.
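Here is a minimal sketch of the “concept” idea in plain Python. It is an illustration only: the ratings matrix is made up, the function names top_concept and remove_concept are hypothetical, and simple power iteration is used as a stand-in for a full SVD routine. In real work a library function should be used instead.

```python
import math


def matvec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def normalize(v):
    """Scale v to unit length."""
    length = math.sqrt(sum(x * x for x in v))
    return [x / length for x in v]


def top_concept(A, iterations=200):
    """Return (u, sigma, v) for the strongest concept of A by power iteration."""
    At = [list(col) for col in zip(*A)]
    v = normalize([1.0] * len(A[0]))
    for _ in range(iterations):
        v = normalize(matvec(At, matvec(A, v)))  # repeatedly apply A'A
    Av = matvec(A, v)
    sigma = math.sqrt(sum(x * x for x in Av))    # strength of the concept
    u = [x / sigma for x in Av]                  # customers in concept space
    return u, sigma, v


def remove_concept(A, u, sigma, v):
    """Subtract sigma * u * v' from A so that the next concept can be found."""
    return [[a - sigma * ui * vj for a, vj in zip(row, v)]
            for row, ui in zip(A, u)]


# Made-up ratings. Rows: customers. Columns: 3 SF movies, then 2 romance movies.
A = [
    [1, 1, 1, 0, 0],
    [3, 3, 3, 0, 0],
    [4, 4, 4, 0, 0],
    [5, 5, 5, 0, 0],
    [0, 0, 0, 4, 4],
    [0, 0, 0, 5, 5],
    [0, 0, 0, 2, 2],
]

u1, s1, v1 = top_concept(A)                           # the "SF" concept
u2, s2, v2 = top_concept(remove_concept(A, u1, s1, v1))  # the "romance" concept
```

In this example the two singular values come out at about 12.37 and 9.49. The vector v1 has weight only on the three SF movies and v2 only on the two romance movies, so the two concepts separate the movies as described above.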


SVD is also used for dimensionality reduction, and the advantages of SVD are as follows.

1.  find hidden correlations

2.  make visualization of data easier

3.  reduce the amount of data


Therefore SVD can be applied not only to recommender systems but also to other kinds of business applications.


Let us use R to analyze data by singular value decomposition. R has a function for singular value decomposition, svd(). Therefore we can execute singular value decomposition just by passing data to svd() in R. The IDE below is RStudio.


In this case, matrix ss is decomposed into $d, $u and $v.

$u : matrix of left singular vectors

$d : vector of singular values

$v : matrix of right singular vectors

When we look at $d, the first and second values are large, so we focus on the first and second concepts. In $u, the row vectors of ss are mapped to the concept space. In $v, the column vectors of ss are also mapped to the same concept space. The red rectangle and the blue rectangle show similarity based on “concept”. I recommend that you try svd() to analyze data in R, as it is very easy and effective.

SVD is a little more complicated than the inner product, but it is very useful when there is a lot of data with many dimensions. Let us become familiar with SVD because we would like to use this model going forward.

Recommender engines and inner product


Last week, I introduced the inner product of vectors as an essential tool for statistical models. Let us apply the inner product to recommender engines this week.


Do you remember the utility function? Let me review it a little here. The utility function is expressed as follows.


U: utility of customers, θ: customers’ preferences, x: item features, R: ratings of the items for the customers

As you know, θ (customers’ preferences) and x (item features) are both vectors. Let us take movies as an example. Movie features are expressed as follows.


A1: Science fiction movie

A2: Love romance movie

A3: Historical movie

A4: US movie

A5: Japanese movie

A6: Hong Kong movie



First let us consider a customer’s preferences. If you like a feature of movies, assign 1 to that feature. If you like it very much, assign 2, and if you do not like it, assign 0. I like science fiction movies and US movies very much and like Japanese and Hong Kong movies, while I do not like love romance movies or historical movies. These preferences can be expressed as a vector. My preference vector θ is [2,0,0,2,1,1] because A1=2, A2=0, A3=0, A4=2, A5=1, A6=1. I recommend that you make your own preference vector in the same way.


Then let us move on to item features. StarWars, A Chinese ghost story, Seven samurai and Titanic are our selection of movies. Which of them should be recommended to me?

OK, let us make the item feature vector of each movie. For example, if the movie is a US movie, A4=1, A5=0, A6=0.

StarWars : x=[1,0,0,1,0,0]

A Chinese ghost story : x=[0,1,0,0,0,1]

Seven samurai : x=[0,0,1,0,1,0]

Titanic : x=[0,1,0,1,0,0]


Finally, let us calculate the value of the utility function for each movie. A bigger value means that I like the movie more, so it should be recommended to me. The value is obtained by calculating the inner product of θ (customers’ preferences) and x (item features). In the StarWars case, the value of the utility function is [2,0,0,2,1,1]*[1,0,0,1,0,0]’ = 4.


StarWars : U=4

A Chinese ghost story : U=1

Seven samurai : U=1

Titanic : U=2


So the highest value goes to StarWars, and it should be recommended to me. The second is Titanic, so it may be recommended as well. If you prepare your own preference vector, you can calculate the values of your utility function and find which movie should be recommended to you!
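The calculation above can be checked with a few lines of code. This is a minimal sketch in Python, used here only for illustration. The function name inner_product and the variable names are my own choices. The vectors are exactly the ones listed in this post.

```python
def inner_product(theta, x):
    """Multiply each preference by the matching feature and sum the products."""
    return sum(t * f for t, f in zip(theta, x))


# My preference vector over the features A1..A6.
theta = [2, 0, 0, 2, 1, 1]

# Item feature vectors of the four movies.
movies = {
    "StarWars": [1, 0, 0, 1, 0, 0],
    "A Chinese ghost story": [0, 1, 0, 0, 0, 1],
    "Seven samurai": [0, 0, 1, 0, 1, 0],
    "Titanic": [0, 1, 0, 1, 0, 0],
}

# Utility of each movie, then the titles sorted from highest to lowest utility.
utilities = {title: inner_product(theta, x) for title, x in movies.items()}
ranking = sorted(utilities, key=utilities.get, reverse=True)
```

If you replace theta with your own preference vector, ranking shows the order in which the movies would be recommended to you.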


Anyway, this is one of the simplest models for calculating the value of utility for each movie. It uses the inner product of vectors, as I said before. The inner product can transform a lot of data into a single number. In this case, only six features are selected. Even though the number of features can be far more than six, the inner product still transforms all of that data into a single number, which can be used for better business decisions!

Logistic regression model or Matrix factorization?


When I was a risk manager in the financial industry, I liked to use the logistic regression model. This model is widely used to measure the probability of default of counterparties, so it is very famous in the financial industry. In the field of machine learning, this model is regarded as a classifier, as it enables us to classify data based on the results of calculations. Both numerical data and categorical data can be used in this model. It is simple and flexible, so I want to use it as our recommender engine.
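As a reminder of how a logistic regression model turns features into a probability, here is a minimal sketch in Python. The weights, the bias and the three features are hypothetical numbers chosen for illustration, not estimates from real data. In practice they would be fitted to historical defaults.

```python
import math


def sigmoid(z):
    """Logistic function: maps any real score to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def default_probability(weights, bias, features):
    """Probability of default = sigmoid(bias + weighted sum of the features)."""
    score = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(score)


# Hypothetical model, NOT fitted to real data.
# features = [debt ratio, years in business, unsecured loan (1) or not (0)]
weights = [3.0, -0.2, 0.8]
bias = -2.5

# One counterparty: a numerical, a numerical and a categorical (0/1) feature.
p = default_probability(weights, bias, [0.7, 5.0, 1.0])

# Using the probability as a classifier with a 0.5 threshold.
label = "high risk" if p >= 0.5 else "low risk"
```

The same function works as a recommender if the output is read as the probability that a customer likes an item instead of the probability of default.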

In addition, I found that the matrix factorization model is widely used in industry currently. It has been popular since it performed well in the Netflix Prize competition in 2009. Once we have a matrix of ratings by users and items, matrix factorization is applied to it and divides it into two matrices: one for users’ preferences and the other for item features. By using these two matrices, we can provide recommendations to users even for items they have not rated. It is simple but very powerful.
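Here is a minimal sketch of matrix factorization in Python. It is an illustration under my own assumptions, not the method of any Netflix Prize team: the ratings are made up, the function names factorize, predict and rmse are hypothetical, and the two matrices are fitted by simple stochastic gradient descent with two hidden features.

```python
import random


def predict(P, Q, u, i):
    """Predicted rating = inner product of user vector P[u] and item vector Q[i]."""
    return sum(p * q for p, q in zip(P[u], Q[i]))


def factorize(ratings, k=2, epochs=2000, lr=0.01, reg=0.02, seed=0):
    """Split a ratings matrix into user factors P and item factors Q.

    ratings: list of rows, where None marks an unknown rating.
    Only the known ratings are used for fitting.
    """
    rng = random.Random(seed)
    P = [[rng.uniform(0.1, 0.5) for _ in range(k)] for _ in ratings]
    Q = [[rng.uniform(0.1, 0.5) for _ in range(k)] for _ in ratings[0]]
    known = [(u, i, r) for u, row in enumerate(ratings)
             for i, r in enumerate(row) if r is not None]
    for _ in range(epochs):
        for u, i, r in known:
            err = r - predict(P, Q, u, i)
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                # Move both factors to reduce the error, with a small penalty.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q


def rmse(ratings, P, Q):
    """Root mean squared error over the known ratings."""
    errs = [(r - predict(P, Q, u, i)) ** 2
            for u, row in enumerate(ratings)
            for i, r in enumerate(row) if r is not None]
    return (sum(errs) / len(errs)) ** 0.5


# Made-up ratings: 5 users x 4 items, None = not rated yet.
ratings = [
    [5, 3, None, 1],
    [4, None, None, 1],
    [1, 1, None, 5],
    [1, None, None, 4],
    [None, 1, 5, 4],
]

P, Q = factorize(ratings)
final_error = rmse(ratings, P, Q)
guess = predict(P, Q, 1, 1)  # estimate for an item that user 1 never rated
```

P plays the role of the users’ preferences and Q the role of the item features, so predict() is the same inner product as in the simple model, except that the features are learned from the ratings instead of being chosen by hand.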


When we have two models, there are two advantages as follows.

1  We can compare the results of the two models with each other.

By using the same data, we can compare how effectively each model provides recommendations. I think this is good because it is very difficult to evaluate how well a model works without comparing it to other models.


2  We can combine two models into one model.

In practice, several models are sometimes combined into one model so that the results are more accurate than the results from just one model. For example, matrix factorization provides us with features automatically, and these features may be used as inputs to logistic regression models. A linear combination of the outputs of the models is another method of combining models.


Yes, we have two major models for our recommendation engines. So let us make them more accurate and effective going forward. The more experience we have in developing models, the more accurate and effective the recommendations from our models will be. These models are expected to be implemented in the R language, our primary tool for data analysis. It must be exciting! Why don’t you join us? You will become an expert in recommender systems with this blog!

What is a utility function in recommender systems?


Let us go back to recommender systems, as I did not mention them last week. Last month I found that customers’ preferences and item features are key to providing recommendations. Then I started developing the model used in recommender systems. Now I think I should explain the initial problem setting of recommender systems. This week I looked at “Mining Massive Datasets” on Coursera and found that the problem setting of recommender systems in this course is simple and easy to understand, so I decided to follow it. If you are interested in more detail, I recommend that you look at this course, an excellent MOOC on Coursera.


Let us introduce a utility function, which tells us how satisfied customers are with the items. The term “utility function” comes from microeconomics, so some of you may have learned it before. I think it is good to use a utility function here because we can use the methods of economics when we analyze the impact of recommender systems on our society going forward. I hope more people who are not data scientists will become interested in recommender systems.

The utility function is expressed as follows


U: utility of customers, θ: customers’ preferences, x: item features, R: ratings of the items for the customers

This definition is simple and easy to understand. I would like to use it going forward. Ratings may be integers such as one, two, three and so on, or they may be continuous numbers, depending on the recommender system.

When we look at simple models such as the linear regression model and the logistic regression model, the key elements are the explanatory variables, or features, and their weights, or parameters. They are represented as x and θ respectively. The product θx shows how much impact the features have on the variable we want to predict. Therefore I would like to introduce θx as a critical part of my recommender engine. “θx” means that each x is multiplied by its corresponding weight θ and all the products are summed up. This is critically important for recommender systems. Mathematically, θx is a product of vectors or matrices. It is simple but powerful enough to provide recommendations effectively. I would like to develop my recommender engine by using θx next week.
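Written out for n features, the weighted sum described above looks as follows. The index i and the number of features n are notation introduced here only for illustration.

```latex
\theta x \;=\; \sum_{i=1}^{n} \theta_i x_i \;=\; \theta_1 x_1 + \theta_2 x_2 + \cdots + \theta_n x_n
```

For instance, with θ = [2, 0, 1] and x = [1, 1, 1], θx = 2·1 + 0·1 + 1·1 = 3.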


For example, we could consider what color of shirt maximizes our utility function. In the future, the utility functions of every person might be stored in computers, and recommendations might be provided automatically in order to maximize them. Then everyone may be satisfied with everyday life. What a wonderful world it would be!