
Reading notes: Beyond Word Importance: Contextual Decomposition to Extract Interactions from LSTMs

This paper on interpretable AI introduces Contextual Decomposition (CD), a method for extracting complex interactions from LSTM networks. By decomposing the output of an LSTM, CD captures the contributions that combinations of words or variables make to the final prediction.

Recall that the LSTM has the form

$$
\begin{aligned}
o_t &= \sigma(W_o x_t + V_o h_{t-1} + b_o) \\
f_t &= \sigma(W_f x_t + V_f h_{t-1} + b_f) \\
i_t &= \sigma(W_i x_t + V_i h_{t-1} + b_i) \\
g_t &= \tanh(W_g x_t + V_g h_{t-1} + b_g) \\
c_t &= f_t \odot c_{t-1} + i_t \odot g_t \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
$$

where $\odot$ denotes element-wise multiplication.
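As a quick sanity check on the notation, here is a minimal NumPy sketch of one LSTM step following the equations above (the dictionary layout of the weights `W`, `V`, `b` is my own convention, not from the paper):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, V, b):
    """One LSTM step. For each gate k in 'i','f','o','g':
    W[k] has shape (d, d_in), V[k] has shape (d, d), b[k] has shape (d,)."""
    i_t = sigmoid(W['i'] @ x_t + V['i'] @ h_prev + b['i'])
    f_t = sigmoid(W['f'] @ x_t + V['f'] @ h_prev + b['f'])
    o_t = sigmoid(W['o'] @ x_t + V['o'] @ h_prev + b['o'])
    g_t = np.tanh(W['g'] @ x_t + V['g'] @ h_prev + b['g'])
    c_t = f_t * c_prev + i_t * g_t       # cell state update
    h_t = o_t * np.tanh(c_t)             # output state
    return h_t, c_t
```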
Contextual Decomposition 

Given an arbitrary phrase $x_q, \ldots, x_r$, where $1 \le q \le r \le T$, assume the cell and output state vectors $c_t$ and $h_t$ can each be written as a sum of two contributions:

$$h_t = \beta_t + \gamma_t, \qquad c_t = \beta_t^c + \gamma_t^c,$$

where $\beta_t$ corresponds to contributions made solely by the given phrase to $h_t$, and $\gamma_t$ corresponds to contributions involving, at least in part, elements outside of the phrase; $\beta_t^c$ and $\gamma_t^c$ are the analogous contributions to $c_t$.

Assuming that the linearized version of a non-linear function $\sigma$ can be written as $L_\sigma$, $\sigma$ can be linearly decomposed in the following way:

$$\sigma\Big(\sum_{i=1}^{N} y_i\Big) = \sum_{i=1}^{N} L_\sigma(y_i).$$

For example, for the input gate $i_t$, we have

$$i_t = \sigma(W_i x_t + V_i h_{t-1} + b_i).$$

Recall that

$$h_{t-1} = \beta_{t-1} + \gamma_{t-1}.$$

By the linearization and decomposition relations above (relations (8) and (9) in the paper), we have

$$i_t = L_\sigma(W_i x_t) + L_\sigma(V_i \beta_{t-1}) + L_\sigma(V_i \gamma_{t-1}) + L_\sigma(b_i).$$
Note that the first, second, and fourth terms contain contributions only from the given phrase ($x_t$, $\beta_{t-1}$, and the bias, which is credited to the phrase when $q \le t \le r$), while the third term involves contributions from outside the phrase ($\gamma_{t-1}$). Using this insight, we group the terms:

$$i_t = \big[L_\sigma(W_i x_t) + L_\sigma(V_i \beta_{t-1}) + L_\sigma(b_i)\big] + L_\sigma(V_i \gamma_{t-1}).$$
Similarly, because

$$g_t = \tanh(W_g x_t + V_g h_{t-1} + b_g),$$

we have

$$g_t = L_{\tanh}(W_g x_t) + L_{\tanh}(V_g \beta_{t-1}) + L_{\tanh}(V_g \gamma_{t-1}) + L_{\tanh}(b_g),$$

and the forget gate $f_t$ decomposes in the same way. And because

$$c_t = f_t \odot c_{t-1} + i_t \odot g_t = f_t \odot (\beta_{t-1}^c + \gamma_{t-1}^c) + i_t \odot g_t,$$
we can write a decomposition of $c_t$ as

$$c_t = \beta_t^c + \gamma_t^c,$$

where the phrase part collects products of phrase-only factors, e.g.

$$\beta_t^c = \big[L_\sigma(W_f x_t) + L_\sigma(V_f \beta_{t-1}) + L_\sigma(b_f)\big] \odot \beta_{t-1}^c + \beta_t^u,$$

with $\beta_t^u$ the sum of products of the phrase parts of $i_t$ and $g_t$, while $\gamma_t^c$ collects the remaining cross terms, such as $f_t \odot \gamma_{t-1}^c$ and $L_\sigma(V_f \gamma_{t-1}) \odot \beta_{t-1}^c$. Hence $h_t$ decomposes as

$$h_t = o_t \odot \tanh(c_t) = o_t \odot \big[L_{\tanh}(\beta_t^c) + L_{\tanh}(\gamma_t^c)\big] = \beta_t + \gamma_t,$$

with $\beta_t = o_t \odot L_{\tanh}(\beta_t^c)$ and $\gamma_t = o_t \odot L_{\tanh}(\gamma_t^c)$; the output gate $o_t$ is left undecomposed.
With the above update equations for $c_t$ and $h_t$, we can recursively compute the decomposition, using the initializations $\beta_0 = \gamma_0 = \beta_0^c = \gamma_0^c = 0$.
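The recursion can be sketched in NumPy. This is a simplified sketch, not the paper's exact procedure: the precise assignment of cross terms (especially bias products) differs in detail in the paper, and the weight layout `W`, `V`, `b` and the helper names are my own assumptions. The `linearize` helper implements the average-over-orderings scheme described in the Linearization section.

```python
import itertools
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def linearize(act, terms):
    """Split act(sum(terms)) into one contribution per term by averaging
    telescoping differences over all orderings of the terms."""
    contribs = [np.zeros_like(t) for t in terms]
    perms = list(itertools.permutations(range(len(terms))))
    for perm in perms:
        partial = np.zeros_like(terms[0])
        for j in perm:
            contribs[j] = contribs[j] + act(partial + terms[j]) - act(partial)
            partial = partial + terms[j]
    return [c / len(perms) for c in contribs]

def cd_step(x_t, in_phrase, beta, gamma, beta_c, gamma_c, W, V, b):
    """One Contextual Decomposition update (simplified sketch).
    For each gate k in 'i','f','o','g': W[k] is (d, d_in), V[k] is (d, d),
    b[k] is (d,). `in_phrase` is True when t lies inside the phrase x_q..x_r."""
    def split(act, k):
        # decompose the gate's pre-activation into four linearized parts
        Lx, Lbeta, Lgamma, Lbias = linearize(
            act, [W[k] @ x_t, V[k] @ beta, V[k] @ gamma, b[k]])
        if in_phrase:                 # x_t and the bias are credited to the phrase
            return Lx + Lbeta + Lbias, Lgamma
        return Lbeta, Lx + Lgamma + Lbias

    i_b, i_g = split(sigmoid, 'i')
    g_b, g_g = split(np.tanh, 'g')
    f_b, f_g = split(sigmoid, 'f')
    f_t = sigmoid(W['f'] @ x_t + V['f'] @ (beta + gamma) + b['f'])  # full forget gate
    # phrase part keeps only products of phrase-only factors; cross terms go to gamma
    beta_c_new = f_b * beta_c + i_b * g_b
    gamma_c_new = f_t * gamma_c + f_g * beta_c + i_b * g_g + i_g * (g_b + g_g)
    # the output gate is left undecomposed; tanh(c_t) is split instead
    o_t = sigmoid(W['o'] @ x_t + V['o'] @ (beta + gamma) + b['o'])
    t_b, t_g = linearize(np.tanh, [beta_c_new, gamma_c_new])
    return o_t * t_b, o_t * t_g, beta_c_new, gamma_c_new
```

Iterating `cd_step` over $t = 1, \ldots, T$ from zero initializations yields the final $\beta_T$, the phrase's contribution to the LSTM's output.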

Linearization

The remaining problem is how to define the linearization $L_\sigma$, i.e., how to attribute a portion of $\sigma\big(\sum_{i=1}^N y_i\big)$ to each individual term $y_k$.
In cases where there is a natural ordering of $\{y_i\}$, prior work (Murdoch & Szlam, 2017) has used a telescoping sum of differences of partial sums as a linearization technique:

$$L_\sigma(y_k) = \sigma\Big(\sum_{j=1}^{k} y_j\Big) - \sigma\Big(\sum_{j=1}^{k-1} y_j\Big).$$
However, since terms such as $\beta$, $\gamma$, and $x$ have no clear ordering, there is no natural way to order the sum. Hence the authors compute an average over all orderings. Letting $\pi_1, \ldots, \pi_{M_N}$ denote the set of all permutations of $1, \ldots, N$,

$$L_\sigma(y_k) = \frac{1}{M_N} \sum_{i=1}^{M_N} \Bigg[ \sigma\Big(\sum_{j=1}^{\pi_i^{-1}(k)} y_{\pi_i(j)}\Big) - \sigma\Big(\sum_{j=1}^{\pi_i^{-1}(k)-1} y_{\pi_i(j)}\Big) \Bigg].$$
For example, when there are only two terms,

$$L_\sigma(y_1) = \frac{1}{2}\Big( \big[\sigma(y_1) - \sigma(0)\big] + \big[\sigma(y_2 + y_1) - \sigma(y_2)\big] \Big).$$
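The averaging scheme above can be implemented directly in a few lines of NumPy; the function name `linearize` is my own:

```python
import itertools
import numpy as np

def linearize(act, terms):
    """Average-over-orderings linearization: returns L_act(y_k) for each
    y_k in `terms`, so the contributions sum to act(y_1+...+y_N) - act(0)."""
    contribs = [np.zeros_like(t) for t in terms]
    perms = list(itertools.permutations(range(len(terms))))
    for perm in perms:
        partial = np.zeros_like(terms[0])
        for j in perm:
            # telescoping difference credited to term j under this ordering
            contribs[j] = contribs[j] + act(partial + terms[j]) - act(partial)
            partial = partial + terms[j]
    return [c / len(perms) for c in contribs]
```

Note the factorial cost in the number of terms; in CD this is harmless, since each gate's pre-activation is only ever split into at most four parts.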


References:

Murdoch, W. J., & Szlam, A. (2017). Automatic rule extraction from long short-term memory networks. International Conference on Learning Representations.

Murdoch, W. J., Liu, P. J., & Yu, B. (2018). Beyond Word Importance: Contextual Decomposition to Extract Interactions from LSTMs. International Conference on Learning Representations.
