I have been researching recommender systems for more than two weeks, and I am surprised by how many kinds there are. So I think I need a proper way to classify them in order to explain the characteristics of each one. Most research papers and documents focus on “collaborative filtering”. But I think it is a little difficult to explain the differences between systems using that term alone. So I would like to focus on the data used to develop the models. I am especially interested in content features: when the items are movies, these are things like the actors and actresses, the time of production, the director, the country where the movie was produced, and so on. I hope this makes recommender systems easier to understand for beginners.
1. Recommender systems with content features
This type of model is used when we have content features in our data set. It is useful when there is little data about customers’ ratings and interactions, because recommendations for a customer can be produced without other customers’ data. This means that we can reduce “cold start” issues: a new item can be recommended as soon as its features are known, although a brand-new customer still has to show a few preferences first. On the other hand, we need content feature data for every item. Preparing it may take time and money, although it is worth doing. I think this type of recommender system is used in most businesses now.
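To make the idea concrete, here is a minimal sketch of a recommender that uses only content features. Everything in it is an assumption for illustration: the movie titles, the feature tags, the `jaccard` and `recommend_by_content` helpers, and the choice of Jaccard similarity (real systems often use weighted features or learned models instead).

```python
# Hypothetical toy catalogue: each movie is described only by content features.
MOVIES = {
    "Movie A": {"director:Lee", "actor:Kim", "country:KR", "decade:2010s"},
    "Movie B": {"director:Lee", "actor:Park", "country:KR", "decade:2000s"},
    "Movie C": {"director:Smith", "actor:Jones", "country:US", "decade:2010s"},
    "Movie D": {"director:Smith", "actor:Kim", "country:US", "decade:1990s"},
}


def jaccard(a, b):
    """Share of features that two feature sets have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend_by_content(liked, movies, top_n=2):
    """Rank unseen movies by similarity to the features of the liked movies."""
    # The customer's profile is simply the union of features they liked.
    profile = set().union(*(movies[title] for title in liked))
    scores = {
        title: jaccard(profile, features)
        for title, features in movies.items()
        if title not in liked
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


# A customer who liked only "Movie A"; no other customer's data is needed.
recommendations = recommend_by_content(["Movie A"], MOVIES)
print(recommendations)
```

With this toy data, "Movie B" comes first because it shares the director and the country with "Movie A". Note that the function never looks at any other customer, which is why this approach still works when rating data is scarce.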
2. Recommender systems without content features
This type of model should be considered when we have no content features in our data set. Without content features, similarity between customers is used to produce recommendations. Similarity between items, computed from ratings rather than from content features, can also be used. This type of model is often referred to as “collaborative filtering” in documents and research papers. Its advantage is that there is no need to obtain content features. On the other hand, we need customers’ ratings and interactions in advance to develop the models. That makes it a poor fit for startups, which generally have little customer data.
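As a rough illustration of recommending from similarity between customers, here is a small sketch that uses ratings only. The customers, the ratings, the helper names, and the design choices (centring ratings on the midpoint of a 1 to 5 scale, then using cosine similarity) are all assumptions made for this example, not a standard recipe.

```python
from math import sqrt

# Hypothetical ratings on a 1-5 scale; no content features are used at all.
RATINGS = {
    "alice": {"Movie A": 5, "Movie B": 4},
    "bob": {"Movie A": 5, "Movie B": 5, "Movie C": 4},
    "carol": {"Movie A": 1, "Movie B": 2, "Movie D": 5},
}
MIDPOINT = 3  # assumed scale midpoint, so that dislikes become negative values


def centred(ratings):
    """Shift ratings so that 'liked' is positive and 'disliked' is negative."""
    return {movie: rating - MIDPOINT for movie, rating in ratings.items()}


def cosine(u, v):
    """Cosine similarity over the movies that both customers rated."""
    common = set(u) & set(v)
    dot = sum(u[m] * v[m] for m in common)
    norm = sqrt(sum(u[m] ** 2 for m in common)) * sqrt(sum(v[m] ** 2 for m in common))
    return dot / norm if norm else 0.0


def recommend_by_neighbours(target, ratings, top_n=1):
    """Score unseen movies by other customers' ratings, weighted by similarity."""
    me = centred(ratings[target])
    scores = {}
    for other, their_ratings in ratings.items():
        if other == target:
            continue
        them = centred(their_ratings)
        similarity = cosine(me, them)
        for movie, value in them.items():
            if movie not in me:
                scores[movie] = scores.get(movie, 0.0) + similarity * value
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


recommendations = recommend_by_neighbours("alice", RATINGS)
print(recommendations)
```

Here alice's tastes match bob's and are the opposite of carol's, so the movie bob liked ("Movie C") is ranked above the one carol liked. The sketch also shows the weakness mentioned above: a customer with no ratings yet has no similar customers, so nothing can be recommended.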
In practice, it is good for beginners to classify recommender systems by whether content features are used or not, because this requires no knowledge of the mechanisms inside the systems. There are many methods for developing a recommender engine: for example, regression, classification, clustering, collaborative filtering, and so on. So I think it is very difficult to classify recommender systems by how the recommendations are calculated. Once beginners become familiar with the methods above, they can understand how each method works in recommender systems.
While researching documents and papers about recommender systems, I came across the Netflix Prize, a competition in which the winner was awarded one million USD in 2009. It is very interesting to dig deeper into it, because the models discussed during the competition provided more accurate recommendations than the existing models, and they are easy to learn even for beginners. I would like to discuss these methods in the next blog post. See you next time!