Introduction to the session-based approach
Nowadays, e-commerce market players are constantly striving, among other things, to provide the best possible customer experience. One way to achieve that is to recommend products to customers in a tailored way, ideally in real time. In the previous article, we discussed why individual recommendation is a must.
It has been shown that using the most recent information from the user’s session (i.e. what users are looking at right now, clicking on, etc.) can significantly improve the resulting product recommendation (see e.g. the research on Evaluation of Session-based Recommendation Algorithms). Moreover, with this approach, products are recommended to customers gradually, according to their current behavior on the website. Last but not least, long-term data is often missing entirely, which can ruin models that rely only on that perspective. In short, short-term customer intent is a very important input for recommendation.
Therefore, in this article, we will focus on models that can effectively use information from the user’s current session, and we will compare different algorithms. We will also show an example where a model with better accuracy metrics is not necessarily the winner. The article is divided into two parts. Let’s get into it!
To easily understand the concept of the session-based approach and the important outcomes of the experiments we are going to show, it is crucial to outline the basic concepts of recommendation systems first. Feel free to skip this part if you are already a recommendation specialist.
What is the recommendation task? I bet you have already encountered a recommendation system. It does not matter whether you listen to music on Spotify, watch movies on Netflix or shop on Amazon. You have surely noticed phrases like “Content for you”, “You may also like this” or “Other customers buy”. They serve you the products you are most likely to enjoy. How do they do that? All these companies have their own recommendation engine, which is composed of one or more recommendation models.
A definition of such a recommendation system (model), or RS for short, might read like this:
“An RS is a system designed to recommend things to users to help them find the most relevant items of interest using many different features. It usually tries to know users better than they know themselves.”
An RS learns from a large amount of data, and the recommendation task is then to predict a list of items, in a way similar to a classification task: the model is successful if the user likes the list and unsuccessful otherwise.
The art of recommendation systems
There are many more or less popular ways to approach a recommendation task. You can serve products to customers according to:
1. popular items,
2. customer’s last seen/purchased items,
3. interactions made by other customers who are somehow similar* to this customer,
4. item similarity, also known as the content-based filtering approach: it shows products similar to the ones the user has already seen/bought,
5. sequence-based/session-aware/session-based approach (which we are focusing on here),
6. hybrid of any of these.
*Similarity can be measured in various ways, e.g. by similar interactions (this is known as the collaborative filtering approach), by the customer’s profile data, etc.
The first two methods are both easy to understand and implement. The main difference between 3. and 4. can be read from the following picture:
Collaborative filtering vs Content-based filtering
In collaborative filtering, only the actions of other users are needed to build the model; content-based filtering, by contrast, requires only the features of the items.
To make things harder, there are plenty of techniques for tackling each of the recommendation methods above, such as (deep) factorization machines, (un)supervised learning methods, any wide or deep learning model, etc.
In ML prediction tasks, it is usually straightforward to select the best model: depending on the underlying data, one can choose from well-known metrics such as accuracy, AUC, prediction error, F1 score, etc. For recommendation models, however, there is no single metric that lets you say that your model is good, or that it will give the customer a good recommendation 95% of the time.
Fortunately, many well-known metrics applicable to recommendation tasks have already been introduced. You may have heard, for example, of hit rate or NDCG.
Leaving aside the fact that anyone can create their own metric or modify an existing one, we can divide them into two main categories, accuracy and beyond-accuracy metrics, where @k means the metric is measured over the first k recommendations:
Accuracy:
Beyond-accuracy:
We have been using more than twenty metrics for our evaluation purposes. You can read more details in one of the previous articles.
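To make the @k notation more concrete, here is a minimal sketch of two accuracy metrics, hit rate and NDCG, in plain Python. It assumes binary relevance (an item either is or is not what the user interacted with next); the function names and the item identifiers are made up for illustration and are not taken from our evaluation code.

```python
import math

def hit_rate_at_k(recommended, relevant, k):
    """Return 1.0 if any of the first k recommended items is relevant, else 0.0."""
    return 1.0 if any(item in relevant for item in recommended[:k]) else 0.0

def ndcg_at_k(recommended, relevant, k):
    """Binary-relevance NDCG@k: relevant items near the top of the list count more."""
    dcg = sum(
        1.0 / math.log2(rank + 2)          # rank 0 gets weight 1/log2(2) = 1
        for rank, item in enumerate(recommended[:k])
        if item in relevant
    )
    ideal_hits = min(len(relevant), k)      # best case: all relevant items come first
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0

# Hypothetical recommendation list and ground truth for one session.
recommended = ["watch_series_7", "iphone_13_mini", "watch_nike"]
relevant = {"iphone_13_mini"}

hit = hit_rate_at_k(recommended, relevant, k=3)
ndcg = ndcg_at_k(recommended, relevant, k=3)
print(hit, round(ndcg, 4))
```

In practice both values are averaged over all evaluated sessions. Hit rate only asks whether the right item made it into the list, while NDCG also rewards placing it higher.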
Are these metrics all you need to say that a recommendation model is great? The answer is no. Apart from these numerical metrics, there is one more, no less important factor that should always be considered: the quality of the recommended items, in other words, whether every customer is satisfied with the recommendations all the time. Although it may not seem so, it is almost impossible to measure easily whether every person likes the recommendation list. After all, how can you calculate each person’s level of satisfaction?
The fact that a user clicks on or buys an item that was recommended to them does not necessarily mean that they liked it. There may be an item they would have liked much more, but because it was never recommended or shown to them, we will never know.
Last but not least, we can measure the performance of recommendation systems in two different settings: offline, or via a real A/B test (with other metrics such as revenue per session, CTR, average order value or other business KPIs). Both have their pros and cons. What has to be mentioned here is the following: when developing such a system (and searching for the most suitable model), you usually have no choice but to rely on offline evaluation. However, even the very best model according to the offline metrics mentioned above is no victory until you find out its real impact on real customers.
If you are creating a model for a specific scenario (for example, for one client), then evaluating it on that single dataset is usually enough. In general, however, it is not: this kind of measurement should be repeated on many kinds of datasets.
One option, for example, is to evaluate your model on several public RS-focused datasets, such as those presented at competitions on Kaggle or DrivenData.
You can also consider creating pre-specified data segments in order to understand the behavior of your model. Unfortunately, this is often neglected in the research field. You may find out, for example, that users who viewed fewer items in their session place different demands on the model. Then probably the only solution is to create a hybrid model.
Performance requirements (such as memory consumption, training time, etc.) are further parameters to take into account when deploying a model to a client’s website.
And that should finally be all. So, let’s dive into the most important part of this article that you all have been waiting for.
The session-based approach belongs to the recommendation techniques that rely solely on the user’s actions within an ongoing session and adapt their recommendations to those actions.
In contrast to traditional methods described above, it incorporates the ordering of past events when predicting the next ones.
In recent years, increased interest in session-based recommendation scenarios has been observed. Many algorithms that can be used for this task were developed quite a long time ago. As a result, they often have much simpler designs, yet they can still beat today’s more complex approaches based on deep neural networks.
You might be overwhelmed by three similar-sounding terms used in this context: sequence-aware, session-aware and session-based. The sequence-aware approach learns from historical sequences of user interactions and tries to predict the next one, without regard to sessions. Both the session-based and the session-aware approach, as their names suggest, treat the session as a key parameter and are subclasses of the sequence-aware approach. The difference is that when interactions from the user’s previous sessions are available, the recommendations can be personalized according to the user’s long-term preferences; these are called session-aware recommendations. In the session-based approach, only the user’s current session is used for the recommendation.
Let’s check the picture below, which should clear up any remaining confusion:
Source: Malte Ludewig, Advances in Session-Based and Session-Aware Recommendation, doctoral dissertation (Doktor der Naturwissenschaften), Faculty of Computer Science, Technische Universität Dortmund, 2020
To sum up, the session-based approach has many advantages:
In the rest of the article, we will talk about the session-based approach but the same also applies to the session-aware one.
Intro Example
Imagine that a user, John, is currently viewing the Apple Watch Series 6 on the web page. He also viewed the Apple iPhone 13 five minutes ago and the Samsung Galaxy S20 FE ten minutes ago. So the sequence of items he viewed looks like this:
Items view order
For simplicity, the training data will be very small, consisting of 3 users and their sessions (items are sorted chronologically):
Users and sessions
The main goal, as you would expect, is to find the next item that will interest John the most. Many different algorithms can be used for the session-based approach.
Association rules (AR)
The association rules algorithm (also known as market-basket analysis) is a technique for finding hidden associations between items that are frequently bought or seen together. Put simply, it learns rules like “Customers who bought … also bought …”. These rules and their corresponding importance are “learned” by counting how often items A and B occurred together in a session of any user. They do not necessarily extract user preferences, but rather work in a collaborative-filtering way.
In our session-based approach, we use only rules of size two. The score of such a rule (e.g. {A, B}) is derived from the number of co-occurrences of items A and B across all sessions in the whole dataset. The more often they occur together, the higher the score. The final recommendation list of length k for a given item, say C, consists of the items from the k highest-scoring rules that contain C, sorted by score in descending order.
Using our example, we would learn these rules and corresponding scores via AR:
Because the last item John saw was the Apple Watch Series 6, the recommendation list for him would be: Apple Watch Series 7 in first position, Apple iPhone 13 mini in second position and Apple Watch Nike Series in third position.
Notice that even though the recommendations are based only on the last item in the session, the AR algorithm is able to recommend items from different categories: not only other watches but also mobile phones from the same brand. This is possible thanks to learning from several sessions of various users.
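The counting behind AR can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production implementation: the three toy sessions below are hypothetical (they are not the data from the table above), the score of a rule is simply the raw co-occurrence count, and the function names are made up.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy sessions, items in chronological order.
sessions = [
    ["Apple Watch Series 7", "Apple Watch Series 6", "Apple iPhone 13 mini"],
    ["Apple Watch Series 7", "Apple Watch Series 6", "Apple iPhone 13 mini",
     "Apple Watch Nike Series"],
    ["Apple Watch Series 6", "Apple Watch Series 7"],
]

def learn_association_rules(sessions):
    """For every pair of items, count the sessions in which both occur."""
    scores = defaultdict(lambda: defaultdict(int))
    for session in sessions:
        # set() so that an item repeated within one session is counted once
        for a, b in combinations(set(session), 2):
            scores[a][b] += 1   # rules of size two are symmetric:
            scores[b][a] += 1   # {A, B} scores the same in both directions
    return scores

def recommend(scores, last_item, k):
    """Return the k items with the highest rule score for the last viewed item."""
    ranked = sorted(scores[last_item].items(), key=lambda pair: (-pair[1], pair[0]))
    return [item for item, _ in ranked[:k]]

ar_scores = learn_association_rules(sessions)
ar_list = recommend(ar_scores, "Apple Watch Series 6", k=3)
print(ar_list)
```

Because the order inside a session is ignored, an item seen before the current one counts just as much as an item seen after it.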
Sequential rules
The sequential rules (SR) algorithm is partly similar to AR: it is also designed to find hidden rules among frequently co-occurring items, but here the order of items in the session matters as well. It basically learns frequent sequences, from which the name of the algorithm is derived.
We create a rule when an item A appeared after an item B in a session, even if other events (viewing/buying items) happened between A and B. In our example with John, we now learn these rules and corresponding scores via SR:
* The score is a bit less than one because Apple Watch Nike Series was not the item immediately following Apple Watch Series 6.
We would recommend to John: Apple iPhone 13 mini in first position, Apple Watch Series 7 in second position and Apple Watch Nike Series in third position. The first two positions are swapped compared to the previous AR algorithm. By the way, which recommendation list do you think is better for John? We will talk about it later on.
Note: A special case of SR is the Markov chain algorithm, which works in a stricter way: rules are created based on how often users viewed/bought item A immediately after viewing/buying item B.
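A minimal Python sketch of SR follows. Several things here are assumptions made for illustration: the toy sessions are hypothetical (not the data from the table above), and the weight 1/steps, which gives a smaller contribution the further apart two items are, is just one possible decay; the scores shown in the article may use a different one. Setting the hypothetical `max_steps` parameter to 1 keeps only immediate successors, which corresponds to the Markov chain variant from the note.

```python
from collections import defaultdict

# Hypothetical toy sessions, items in chronological order.
sessions = [
    ["Apple Watch Series 7", "Apple Watch Series 6", "Apple iPhone 13 mini"],
    ["Apple Watch Series 7", "Apple Watch Series 6", "Apple iPhone 13 mini",
     "Apple Watch Nike Series"],
    ["Apple Watch Series 6", "Apple Watch Series 7"],
]

def learn_sequential_rules(sessions, max_steps=10):
    """Score rules 'earlier -> later'; closer successors get a higher weight."""
    scores = defaultdict(lambda: defaultdict(float))
    for session in sessions:
        for i, earlier in enumerate(session):
            window = session[i + 1 : i + 1 + max_steps]
            for steps, later in enumerate(window, start=1):
                if later != earlier:
                    scores[earlier][later] += 1.0 / steps  # assumed decay
    return scores

def recommend(scores, last_item, k):
    """Return the k highest-scoring successors of the last viewed item."""
    ranked = sorted(scores[last_item].items(), key=lambda pair: (-pair[1], pair[0]))
    return [item for item, _ in ranked[:k]]

sr_scores = learn_sequential_rules(sessions)
sr_list = recommend(sr_scores, "Apple Watch Series 6", k=3)
print(sr_list)

# Markov chain variant: only the immediately following item creates a rule.
mc_scores = learn_sequential_rules(sessions, max_steps=1)
mc_list = recommend(mc_scores, "Apple Watch Series 6", k=3)
print(mc_list)
```

On this toy data the first two positions differ from what pure co-occurrence counting would give, because items viewed before Apple Watch Series 6 no longer contribute to its rules.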
KNN (k-nearest neighbors) is a non-parametric supervised learning method that can also be used very well for session-aware (or collaborative-filtering) recommendation. It finds the K items most similar to a particular item, based on item features and a given distance or similarity measure (usually cosine similarity, Euclidean distance, etc.).
There are various adaptations of the KNN algorithm to the needs of the session-based approach:
Item-KNN (i-KNN)
This version considers only the last element in a given session and returns as recommendations the items that are most similar to it in terms of their co-occurrence in other sessions.
Technically, each item is encoded as a binary vector (see the picture below). For example, Apple Watch Series 6 would have the vector (0, 1, 1), because it was seen in two sessions (Session 2 and Session 3).
Now suppose we want to recommend an item to a user whose last viewed item is Xiaomi Mi Watch Lite. The vector of Apple Watch Series 6 is the most similar to the vector of Xiaomi Mi Watch Lite, and the user has not seen Apple Watch Series 6 yet, so we would recommend exactly that item.
i-KNN input matrix example: 1 means the item occurred in the session
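The idea can be sketched in plain Python as follows. The three sessions are hypothetical and only loosely modeled on the example (they are not the matrix from the picture), cosine similarity is one possible choice of measure, and the function names are made up. The sketch also assumes that the last viewed item occurs in the training sessions.

```python
import math

# Hypothetical training sessions (Session 1, Session 2, Session 3).
sessions = [
    ["Xiaomi Mi Watch Lite", "Apple iPhone 12 128 GB"],
    ["Apple Watch Series 6", "Apple iPhone 13"],
    ["Xiaomi Mi Watch Lite", "Apple iPhone 12 128 GB", "Apple Watch Series 6"],
]

def item_vectors(sessions):
    """Encode each item as a binary vector: 1 if the item occurred in that session."""
    items = sorted({item for session in sessions for item in session})
    return {item: [1 if item in session else 0 for session in sessions]
            for item in items}

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def item_knn_recommend(sessions, current_session, k):
    """Recommend the k unseen items most similar to the last item of the session."""
    vectors = item_vectors(sessions)
    last_item = current_session[-1]        # assumed to be present in the training data
    seen = set(current_session)
    sims = {item: cosine(vectors[last_item], vec)
            for item, vec in vectors.items() if item not in seen}
    ranked = sorted(sims.items(), key=lambda pair: (-pair[1], pair[0]))
    return [item for item, sim in ranked[:k] if sim > 0]

vectors = item_vectors(sessions)
print(vectors["Apple Watch Series 6"])     # binary session vector of the item

current = ["Apple iPhone 12 128 GB", "Xiaomi Mi Watch Lite"]
iknn_list = item_knn_recommend(sessions, current, k=2)
print(iknn_list)
```

Items with zero similarity are dropped, so the list can be shorter than k when the last item has few co-occurring neighbors.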
Session-KNN (s-KNN)
Instead of considering only the last event in the current session, the s-KNN method compares the entire current session with past sessions of other users in the training data to determine the items to recommend.
Given a current session, we obtain the recommendations in the following steps:
1. Determine K most similar sessions (neighbors) by applying a suitable session similarity measure.
2. Calculate the score for all items based on these K sessions as:
It may look complicated, but it basically means that an item gets a high score if it occurs in many similar sessions, and an even higher score if those sessions are the most similar to the one we are making the recommendation for.
Now suppose we want to recommend an item to a user who has so far viewed Xiaomi Mi Watch Lite and Apple iPhone 12 128 GB (see Session 1). Session 3 is the most similar to the user’s Session 1, and the user has not yet seen one item that occurs in Session 3, Apple Watch Series 6, so s-KNN would recommend exactly that item.
s-KNN example
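The two steps can be sketched in plain Python. Everything specific here is an assumption made for illustration: Jaccard similarity on the sets of items stands in for the "suitable session similarity measure", the score of an item is taken to be the sum of the similarities of the neighbor sessions that contain it (a commonly used formulation that matches the description above), and the past sessions are hypothetical toy data.

```python
from collections import defaultdict

# Hypothetical past sessions of other users.
past_sessions = [
    ["Apple Watch Series 6", "Apple iPhone 13"],
    ["Xiaomi Mi Watch Lite", "Apple iPhone 12 128 GB", "Apple Watch Series 6"],
    ["Samsung Galaxy S20 FE", "Apple iPhone 13"],
]

def jaccard(a, b):
    """Share of items two sessions have in common (assumed similarity measure)."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def session_knn_recommend(past_sessions, current_session, k_neighbors, top_n):
    # Step 1: find the K past sessions most similar to the current session.
    scored = [(jaccard(current_session, s), s) for s in past_sessions]
    neighbors = sorted(scored, key=lambda pair: -pair[0])[:k_neighbors]

    # Step 2: score(item) = sum of similarities of the neighbors containing it.
    seen = set(current_session)
    scores = defaultdict(float)
    for sim, session in neighbors:
        for item in set(session) - seen:   # never recommend already seen items
            scores[item] += sim

    ranked = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    return [item for item, score in ranked[:top_n] if score > 0]

current = ["Xiaomi Mi Watch Lite", "Apple iPhone 12 128 GB"]
sknn_list = session_knn_recommend(past_sessions, current, k_neighbors=2, top_n=2)
print(sknn_list)
```

An item that appears in several neighbor sessions accumulates their similarities, so both the number of similar sessions and how similar they are raise its score.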
We can also think of other modifications of these two methods, for example giving more importance to the items that were seen most recently, among many others.
Recurrent neural networks (RNNs), which are capable of learning from sequentially ordered data, are a “natural choice” for this problem.
Again, several adaptations exist. One example is the GRU4REC approach, which was specifically designed for session-based recommendations. It models the user session with the goal of predicting the probability of subsequent events, using an RNN with so-called Gated Recurrent Units. It is computationally more complex and requires quite a large volume of data.
Other algorithms include BERT4Rec, SASRec and BST.
These are currently the most popular algorithms for the session-based (and session-aware) recommendation approach, which has been shown to be an important part of any recommendation system. We have also outlined the metrics for measuring the success of such a model, together with the thought that the metrics alone are not enough. Now the question is: which of these session-based models should you use for your purpose? Which one is the best? You can look forward to that in the next part.
Co-author Simona Navrátilová https://www.linkedin.com/in/simona-navr%C3%A1tilov%C3%A1-7876a621b/