This paper, Co-Attentive Multi-Task Learning for Explainable Recommendation, is a direct extension of the multi-pointer co-attention paper: it adds an explanation-generation objective and jointly trains the rating prediction and explanation tasks.
Please refer to my other post for the details of multi-pointer co-attention learning.
Figure 1: Co-Attentive Multi-Task Learning for Explainable Recommendation
From Figure 1, the major difference from that post is Task 2, a GRU network that generates textual explanations. Let $\boldsymbol{o}_t$ denote its output distribution over words at step $t$, and let $Y=(y_1, \dots, y_T)$ be the generated explanation.
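To make the decoder concrete, here is a minimal PyTorch sketch of such a GRU generator. The module name, the dimensions, and the idea of initializing the hidden state from Task 1's co-attentive representation are my own assumptions, not details taken from the paper.

```python
import torch
import torch.nn as nn

class ExplanationDecoder(nn.Module):
    """Hypothetical GRU decoder for Task 2: emits a word distribution o_t per step."""
    def __init__(self, vocab_size, embed_dim=128, hidden_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.gru = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        self.proj = nn.Linear(hidden_dim, vocab_size)

    def forward(self, prev_words, init_state):
        # prev_words: (batch, T) ids of y_{t-1}, teacher forcing at train time
        # init_state: (1, batch, hidden_dim); plausibly derived from Task 1's
        # co-attentive user/item representation (assumption, not from the post)
        emb = self.embed(prev_words)              # (batch, T, embed_dim)
        out, _ = self.gru(emb, init_state)        # (batch, T, hidden_dim)
        logits = self.proj(out)                   # (batch, T, vocab_size)
        return torch.log_softmax(logits, dim=-1)  # log o_t for each step t
```

At training time the ground-truth previous words are fed in (teacher forcing); at inference the decoder would instead feed back its own sampled or argmax words step by step.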
There are two additional losses from Task 2.
1. Concept relevance loss $\mathcal{L}_c$. During training, $\mathcal{L}_c$ is used to increase the probability that the selected concepts appear in $Y$; one plausible form is sketched after this list.
2. Negative log-likelihood loss $\mathcal{L}_n$, which ensures that the generated words are close to the ground-truth ones; see the sketch below.
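In my own notation (a sketch, not necessarily the paper's exact formulation): with $\mathcal{C}$ the set of selected concept words, $\mathcal{L}_c$ can reward the probability mass the decoder places on each concept somewhere in the sequence, while $\mathcal{L}_n$ is the standard sequence negative log-likelihood:

$$\mathcal{L}_c = -\frac{1}{|\mathcal{C}|}\sum_{c\in\mathcal{C}} \log\Big(\max_{1\le t\le T} o_t(c)\Big), \qquad \mathcal{L}_n = -\frac{1}{T}\sum_{t=1}^{T} \log o_t(y_t).$$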
Plus, there is the original rating prediction loss $\mathcal{L}_r$ from Task 1.
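Assuming the usual squared error over the set of observed user-item pairs $\Omega$ (again my notation, not necessarily the paper's):

$$\mathcal{L}_r = \frac{1}{|\Omega|}\sum_{(u,i)\in\Omega}\left(\hat{r}_{ui} - r_{ui}\right)^2.$$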
The model is jointly trained on a weighted combination of the three losses.
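With $\lambda_r$, $\lambda_c$, and $\lambda_n$ standing in as placeholders for the paper's trade-off hyperparameters, the joint objective reads:

$$\mathcal{L} = \lambda_r \mathcal{L}_r + \lambda_c \mathcal{L}_c + \lambda_n \mathcal{L}_n.$$

Minimizing $\mathcal{L}$ end to end lets the rating and explanation tasks share the co-attentive selection layers shown in Figure 1.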