This paper on interpretable AI introduces Contextual Decomposition (CD), a method for extracting the interactions learned by LSTM networks. By decomposing the output of an LSTM, CD captures the contributions that combinations of words or variables make to the LSTM's final prediction.
Recall the form of the LSTM update equations.
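One standard formulation, with input x_t, previous hidden state h_{t−1}, per-gate weight matrices W and V, biases b, and ⊙ denoting elementwise multiplication (the notation here is chosen to match the decomposition below):

$$
\begin{aligned}
i_t &= \sigma(W_i x_t + V_i h_{t-1} + b_i) \\
f_t &= \sigma(W_f x_t + V_f h_{t-1} + b_f) \\
o_t &= \sigma(W_o x_t + V_o h_{t-1} + b_o) \\
g_t &= \tanh(W_g x_t + V_g h_{t-1} + b_g) \\
c_t &= f_t \odot c_{t-1} + i_t \odot g_t \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
$$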
Contextual Decomposition
Given an arbitrary phrase x_q, …, x_r, where 1 ≤ q ≤ r ≤ T, assume the output and cell state vectors h_t and c_t can each be written as a sum of two contributions, where β_t corresponds to the contributions made solely by the given phrase to h_t, and γ_t corresponds to contributions involving, at least in part, elements outside of the phrase.
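In symbols, with β^c_t and γ^c_t denoting the analogous phrase and non-phrase contributions to the cell state:

$$
h_t = \beta_t + \gamma_t, \qquad c_t = \beta_t^c + \gamma_t^c
$$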
Assuming the linearized version of a non-linear function σ can be written as L_σ, σ can be decomposed linearly across the terms of its argument in the following way.
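For additive inputs y_1, …, y_N (how L_σ is actually computed is deferred to the Linearization section below):

$$
\sigma\Big(\sum_{i=1}^{N} y_i\Big) = \sum_{i=1}^{N} L_\sigma(y_i)
$$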
For example, for the input gate i_t, we have
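From the LSTM equations above:

$$
i_t = \sigma(W_i x_t + V_i h_{t-1} + b_i)
$$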
and recall that the previous hidden state decomposes as
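Per the assumed decomposition of h_t above:

$$
h_{t-1} = \beta_{t-1} + \gamma_{t-1}
$$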
Substituting this decomposition of h_{t−1} and applying the linearization of σ, we have
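Treating W_i x_t, V_i β_{t−1}, V_i γ_{t−1}, and b_i as the summands y_i:

$$
\begin{aligned}
i_t &= \sigma(W_i x_t + V_i \beta_{t-1} + V_i \gamma_{t-1} + b_i) \\
    &= L_\sigma(W_i x_t) + L_\sigma(V_i \beta_{t-1}) + L_\sigma(V_i \gamma_{t-1}) + L_\sigma(b_i)
\end{aligned}
$$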
Note that the terms involving x_t and β_{t−1} contain contributions only from the given phrase (assuming time step t falls within the phrase x_q, …, x_r), while the term involving γ_{t−1} contains contributions that come, at least in part, from outside the phrase. With this insight, we can regroup the sum:
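One natural regrouping, shown here with the bias placed on the phrase side (the paper treats the placement of bias terms, and of products involving them, with additional care):

$$
i_t = \underbrace{\big[L_\sigma(W_i x_t) + L_\sigma(V_i \beta_{t-1}) + L_\sigma(b_i)\big]}_{\text{phrase part}} + \underbrace{L_\sigma(V_i \gamma_{t-1})}_{\text{non-phrase part}}
$$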
Similarly, the forget gate f_t and the candidate update g_t depend on x_t and h_{t−1} in the same way, so each of them splits into a phrase part and a non-phrase part.
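A sketch of the analogous decompositions, with the same caveat about where the biases are placed (note that g_t uses tanh, so its pieces are linearized with L_tanh):

$$
\begin{aligned}
f_t &= \big[L_\sigma(W_f x_t) + L_\sigma(V_f \beta_{t-1}) + L_\sigma(b_f)\big] + L_\sigma(V_f \gamma_{t-1}) \\
g_t &= \big[L_{\tanh}(W_g x_t) + L_{\tanh}(V_g \beta_{t-1}) + L_{\tanh}(b_g)\big] + L_{\tanh}(V_g \gamma_{t-1})
\end{aligned}
$$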
And because the cell state update is c_t = f_t ⊙ c_{t−1} + i_t ⊙ g_t, with the previous cell state decomposed as c_{t−1} = β^c_{t−1} + γ^c_{t−1}, we can write a decomposition of c_t into β^c_t + γ^c_t by multiplying out these sums and grouping the resulting products according to whether they involve only phrase parts.
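A sketch of the resulting update, writing β^f_t, β^i_t, β^g_t for the bracketed phrase parts of the linearized gates above (the paper's exact assignment of the cross terms, in particular products involving biases and γ factors, is spelled out there in more detail):

$$
\begin{aligned}
\beta_t^c &= \beta_t^f \odot \beta_{t-1}^c + \beta_t^i \odot \beta_t^g \\
\gamma_t^c &= c_t - \beta_t^c \quad \text{(all remaining cross terms)}
\end{aligned}
$$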
Hence h_t itself splits into β_t + γ_t.
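Applying the same linearization to tanh(c_t), while leaving the output gate o_t undecomposed (as the authors do), gives:

$$
h_t = o_t \odot \tanh(c_t) = \underbrace{o_t \odot L_{\tanh}(\beta_t^c)}_{\beta_t} + \underbrace{o_t \odot L_{\tanh}(\gamma_t^c)}_{\gamma_t}
$$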
With the above update equations for c_t and h_t, we can recursively compute the decomposition over all time steps, starting from the usual zero initial states, i.e. β_0 = γ_0 = β^c_0 = γ^c_0 = 0.
Linearization
The remaining problem is how to compute the linearizing functions L_σ and L_tanh. In cases where there is a natural ordering to {y_i}, prior work (Murdoch & Szlam, 2017) used a telescoping sum consisting of differences of partial sums as a linearization technique.
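That is, taking the terms in their given order:

$$
L_\sigma(y_k) = \sigma\Big(\sum_{j=1}^{k} y_j\Big) - \sigma\Big(\sum_{j=1}^{k-1} y_j\Big)
$$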
However, since terms such as β, γ, and x have no clear ordering, there is no natural way to order this telescoping sum. Hence the authors compute an average over all orderings. Letting π_1, …, π_{M_N} denote the set of all M_N = N! permutations of 1, …, N, the linearization becomes:
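A reconstruction of the averaged linearization described above:

$$
L_\sigma(y_k) = \frac{1}{M_N}\sum_{i=1}^{M_N}\left[\sigma\Big(\sum_{j=1}^{\pi_i^{-1}(k)} y_{\pi_i(j)}\Big) - \sigma\Big(\sum_{j=1}^{\pi_i^{-1}(k)-1} y_{\pi_i(j)}\Big)\right]
$$

Here π_i^{−1}(k) is the position at which term k appears in permutation π_i, so each summand measures the marginal effect of adding y_k at that position.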
For example, when there are only two terms, y_1 and y_2, there are just two orderings to average over.
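Working out the average over the two orderings, with the empty partial sum evaluating to σ(0):

$$
\begin{aligned}
L_\sigma(y_1) &= \tfrac{1}{2}\Big[\big(\sigma(y_1) - \sigma(0)\big) + \big(\sigma(y_2 + y_1) - \sigma(y_2)\big)\Big] \\
L_\sigma(y_2) &= \tfrac{1}{2}\Big[\big(\sigma(y_2) - \sigma(0)\big) + \big(\sigma(y_1 + y_2) - \sigma(y_1)\big)\Big]
\end{aligned}
$$

so that L_σ(y_1) + L_σ(y_2) = σ(y_1 + y_2) − σ(0), i.e. the contributions add back up to the original activation relative to the σ(0) baseline.

Below is a minimal NumPy sketch of this permutation-averaged linearization. It is not the authors' code; `linearize` and its arguments are names chosen here purely for illustration.

```python
import itertools
import numpy as np

def linearize(act, ys):
    """Permutation-averaged linearization of an elementwise nonlinearity `act`
    over a list of additive input terms `ys` (a sketch, not the paper's code).

    Returns one contribution per term; the contributions sum to
    act(sum(ys)) - act(0), since every telescoping pass starts from act(0)."""
    n = len(ys)
    perms = list(itertools.permutations(range(n)))
    contribs = [np.zeros_like(np.asarray(ys[0], dtype=float)) for _ in range(n)]
    for order in perms:
        partial = np.zeros_like(np.asarray(ys[0], dtype=float))
        for k in order:                           # add terms in this permutation's order
            before = act(partial)                 # activation of the partial sum so far
            partial = partial + ys[k]
            contribs[k] += act(partial) - before  # marginal effect of adding y_k here
    return [c / len(perms) for c in contribs]

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Two-term example: the contributions recover sigmoid(y1 + y2) - sigmoid(0).
y1, y2 = 1.0, -0.5
L1, L2 = linearize(sigmoid, [y1, y2])
assert np.allclose(L1 + L2, sigmoid(y1 + y2) - sigmoid(0.0))
```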
References:
Murdoch, W. J., & Szlam, A. (2017). Automatic rule extraction from long short term memory networks. International Conference on Learning Representations (ICLR).
Murdoch, W. J., Liu, P. J., & Yu, B. (2018). Beyond word importance: Contextual decomposition to extract interactions from LSTMs. International Conference on Learning Representations (ICLR).