Wouldn’t it be awesome if you could create your own news broadcast?


News broadcasters are well known to everyone: CNN, the Financial Times, Bloomberg and so on. If you could run your own news broadcast, it would be awesome and amazing. But is it possible? One of the obstacles is how to collect articles and information from all over the world in real time. Of course, I do not have my own network of news correspondents around the globe. So what should we do about that?

Last week I found a blog post about “GDELT 2.0”. The GDELT Project, which monitors the events driving global society and aims to create a free, open platform for computing on the entire world, was founded and is led by Kalev H. Leetaru. GDELT stands for the Global Database of Events, Language, and Tone. The project is now moving to a new stage, “GDELT 2.0”. Compared with “GDELT 1.0”, “GDELT 2.0” makes a great deal of progress, as follows.


1. “GDELT 2.0” covers documents and information written in 65 languages

Many languages are written and spoken around the world. If we want to cover the whole Earth, we need to understand languages other than English. For example, an apple is called “Ringo” in Japanese. If computers cannot understand what “Ringo” means, it is impossible to collect information about apples in Japan, because few articles are translated from Japanese into English. There is no need to worry about this, though. “GDELT 2.0” handles it with real-time machine translation. This function is called “GDELT Translingual”. It means that the global news GDELT monitors in 65 languages, representing 98.4% of its daily non-English monitoring volume, is translated into English in real time. This is amazing because media from the non-Western world can be included in our coverage. There are no language barriers to worry about.


2. “GDELT 2.0” is updated in near real time

The “GDELT 2.0” blog says: “In essence, within 15 minutes of GDELT monitoring a news report breaking anywhere the world, it has translated it, processed it to identify all events, counts, quotes, people, organizations, locations, themes, emotions, relevant imagery, video, and embedded social media posts, placed it into global context, and made all of this available via a live open metadata firehose enabling open research on the planet itself.” These data used to be updated once a day. Now they are updated within 15 minutes. I think this is critically important when we try to create our own news broadcast.
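To get a feel for what a 15-minute update cycle means in practice, here is a minimal Python sketch that rounds a timestamp down to the most recent 15-minute boundary. The function name and the YYYYMMDDHHMMSS output format are my own assumptions for illustration, not something stated in the blog, so please check the GDELT documentation before relying on them.

```python
from datetime import datetime, timezone

UPDATE_INTERVAL_MINUTES = 15  # update interval quoted in the GDELT 2.0 blog


def latest_update_slot(now: datetime) -> str:
    """Round `now` down to the most recent 15-minute boundary.

    Returns a YYYYMMDDHHMMSS string. That timestamp format is an
    assumption made for illustration, not a confirmed GDELT convention.
    """
    floored_minute = now.minute - (now.minute % UPDATE_INTERVAL_MINUTES)
    slot = now.replace(minute=floored_minute, second=0, microsecond=0)
    return slot.strftime("%Y%m%d%H%M%S")


if __name__ == "__main__":
    # Show the slot that the current UTC time falls into.
    print(latest_update_slot(datetime.now(timezone.utc)))
```

If updates really do arrive on quarter-hour boundaries, a home-made news broadcast would only need to check for new data four times an hour.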


3. “GDELT 2.0” performs content analysis on each article in near real time

“GDELT 2.0” can also judge whether articles are positive or negative. The blog says that “GDELT 2.0” can quantify the extraordinary array of latent emotional and thematic signals subconsciously encoded in the world’s media each day. Eighteen content analysis systems, totaling more than 2,230 dimensions, are now run on each news article GDELT sees each day, and all of these scores are available. This is called the “Global Content Analysis Measures (GCAM)”.


In short, information from all over the world is updated in near real time, with machine translation and content analysis. It is definitely amazing. With the “GDELT 2.0” database, we might be able to create our own news broadcast! Why not try it now?

If you are interested in “GDELT 2.0”, this video is a nice introduction.

Thailand from the past to the present


Investors and international business people worry about what will happen in Thailand. After the coup, no solution to the turmoil seems to be in sight. Here I would like to look at some data on Thailand to help us understand the country.


DataHero Urban Population By Country

This chart shows what percentage of the total population lives in urban areas, by country. In Thailand, around 35 percent of people lived in urban areas in 2012, which means that nearly 65% lived in rural areas. This rate has increased only gradually since 2005. Compared with other Asian countries, the concentration of population is moderate in Thailand. Most mass media reports are broadcast from Bangkok. However, 65% of people live in rural areas, so I would like to see what people there think of the coup.


DataHero Internet Users By Country-2

This chart shows how many people out of 100 use the internet. In Thailand the figure was around 26.5 in 2012. As a share of the total population, internet users do not seem to be increasing as fast as in Singapore and Malaysia. There is another statistic. According to data from Social Inc., as of last month 28 million users in Thailand are on Facebook, 4.5 million have joined Twitter and 1.7 million have Instagram accounts. That means more than 40 percent of the total population is on Facebook. I think a lot of users in Thailand have started using the internet recently.
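The “more than 40 percent” figure is easy to check with a few lines of Python. This is only a rough sketch: the user counts are the Social Inc. numbers quoted above, while the total population of roughly 67 million is my own assumption and does not come from that source.

```python
# User counts quoted from Social Inc. in the text above.
FACEBOOK_USERS = 28_000_000
TWITTER_USERS = 4_500_000
INSTAGRAM_USERS = 1_700_000

# ASSUMPTION: Thailand's total population is roughly 67 million.
# This figure is not from the Social Inc. data.
ASSUMED_POPULATION = 67_000_000


def share_of_population(users: int, population: int) -> float:
    """Return users as a percentage of the total population."""
    return 100.0 * users / population


if __name__ == "__main__":
    for name, users in [
        ("Facebook", FACEBOOK_USERS),
        ("Twitter", TWITTER_USERS),
        ("Instagram", INSTAGRAM_USERS),
    ]:
        share = share_of_population(users, ASSUMED_POPULATION)
        print(f"{name}: {share:.1f}% of the assumed population")
```

Under that population assumption, the Facebook share comes out a little above 40 percent, which is well above the 26.5 internet users per 100 people reported for 2012.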

Not only in Thailand but also in other countries that limit democracy, the first step taken by the government is to control the mass media and the internet. Although the penetration rate is not so high in Thailand compared with other countries, the Thai military may control internet traffic in the country.


DataHero International Tourism Arrivals By Country-2

I am wondering how much the tourism industry in Thailand is suffering from this turmoil. Since 2009, the number of tourists from abroad has been increasing steadily. If this turmoil and instability continue, the number of tourists from abroad will probably decrease. That would be very bad for tourism in Thailand.


I have presented only three datasets about Thailand and other countries here. The more data we look at, the better we can understand the country. These data were obtained from Quandl, a data provider service. If you are interested in data about countries, please go to the company’s website. You can find a lot of information there. Once you obtain the data you are interested in, you may like to visualize them. I recommend DataHero for this, as it is very easy and efficient. The charts above were created with DataHero. It is cool, isn’t it?



Let us go surfing the sea of big data!

In the morning, I check my smartphone and iPad mini to see what happened during the night. Every time I touch them, data is generated automatically. How many devices such as smartphones and tablets are there in the world? I am sure a lot of data is being generated at this very moment.

It is also worth noting that the FRB, the World Bank, the IMF and other public institutions make their data available to the public through their websites. In addition, a lot of public data is getting easier to access thanks to data gathering services. Data is the first thing to consider when we start data analytics. Therefore, it is very important to know what kind of data is at your disposal for analysis.

I have been using a data gathering service called “Quandl”. Quandl is a “data platform” that collects numerical data published by hundreds of different sources and hosts them on a single easy-to-use website. Currently it can be used for free. Once I obtain the data, I visualize it in order to understand what it means and what mechanism lies behind it. I use “DataHero” to visualize the data I obtain. It is easy to produce many kinds of charts and graphs. With “DataHero”, I can produce a lot of graphs by following its instructions, then choose the best one to present what I want to say. Its basic functionality can be used without any fee. If you pay, you get more functionality, such as a tool to combine multiple datasets.


According to the Sun newspaper on May 20, 2014, Mr Najib Abdul Razak, prime minister of Malaysia, described the 6.2% GDP growth in the first quarter of this year as extremely outstanding. This is the highest figure on the list he presented on his Facebook page. Let us see what has been going on from the past to the present in terms of Malaysia’s economic growth. I picked up data on the real GDP growth rate, the unemployment rate and the consumer price index (CPI) in Malaysia since 1990 using Quandl and visualized them using DataHero. It is very easy and takes less than 5 minutes once you are familiar with these systems.

Source: Open Data for Africa (IMF)

DataHero Malaysia economic growth

This graph shows economic growth in Malaysia since 1990. The growth rate is over 5%, except during the economic downturns of 1998, 2001 and 2009. The unemployment rate has been around 3%, which is good for the economy. CPI inflation is also around 3% and stable. I can say that Malaysia is currently achieving economic growth without high inflation.

I would like to compare it to the situation in Japan since 1980.  Let us see the graph below.

DataHero Japan economic growth

GDP growth peaked in the late 1980s, at the height of the bubble economy. Since 1990, when the bubble burst, Japan has experienced low economic growth. CPI inflation has been very low and sometimes negative, as Japan has been in deflation. The unemployment rate gradually increased and peaked above 5%. This period is called “the lost two decades” because of Japan’s poor economic performance. It is not easy to explain why this happened in Japan. Some economists argue that monetary policy was not effective enough to revive the economy. Others criticize the fiscal stimulus as too late, too small and too short. I would like to analyze the mechanism of the “lost two decades” going forward in this blog.