Posts

Showing posts from October, 2019

Reading notes: Beyond Word Importance: Contextual Decomposition to Extract Interactions from LSTMs

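This paper on interpretable AI introduces Contextual Decomposition (CD), a method for extracting complex interactions from LSTM networks. By decomposing the output of an LSTM, CD captures the contributions of combinations of words or variables to the LSTM's final prediction.

Recall that the LSTM has the form

o_t = \sigma(W_o x_t + V_o h_{t-1} + b_o)
f_t = \sigma(W_f x_t + V_f h_{t-1} + b_f)
i_t = \sigma(W_i x_t + V_i h_{t-1} + b_i)
g_t = \tanh(W_g x_t + V_g h_{t-1} + b_g)
c_t = f_t \odot c_{t-1} + i_t \odot g_t
h_t = o_t \odot \tanh(c_t)

Contextual Decomposition

Given an arbitrary phrase x_q,...,x_r where 1 ≤ q ≤ r ≤ T, assume the output and cell state vectors h_t and c_t can each be written as a sum of two contributions,

h_t = \beta_t + \gamma_t
c_t = \beta_t^c + \gamma_t^c

where \beta_t corresponds to contributions made solely by the given phrase to h_t, and \gamma_t corresponds to contributions involving, at least in part, elements outside of the phrase; \beta_t^c and \gamma_t^c play the same roles for c_t.

Assuming that the linearized version of a non-linear function σ can be written as L_σ, σ can be linearly decomposed in the following way:

\sigma(\sum_{i=1}^{N} y_i) = \sum_{i=1}^{N} L_\sigma(y_i)

For example, for the input gate i_t, recall that

i_t = \sigma(W_i x_t + V_i h_{t-1} + b_i)

By linearization and decomposition relations (8) and (9) we have

i_t = \sigma(W_i x_t + V_i \beta_{t-1} + V_i \gamma_{t-1} + b_i)
    = L_\sigma(W_i x_t) + L_\sigma(V_i \beta_{t-1}) + L_\sigma(V_i \gamma_{t-1}) + L_\sigma(b_i)

Note tha...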
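To make the linearization L_σ mentioned above more concrete, here is a minimal Python sketch of one way to split σ applied to a sum into per-term contributions. It assumes a Shapley-style scheme: each term's contribution is the average, over all orderings of the terms, of the change in σ when that term is added to the running sum. The function name `linearize`, the scalar setting, and the example numbers are all hypothetical choices for illustration. The sketch also reports σ(0) as a separate baseline instead of assigning it to any term, which is a simplification on my part and not necessarily how the paper treats the bias.

```python
import itertools
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def linearize(f, terms):
    """Split f(sum(terms)) into one contribution per term (illustrative only).

    Each contribution is the average, over all orderings of the terms,
    of the change in f when that term is added to the running sum.
    Returns (contributions, baseline) where baseline = f(0), so that
    sum(contributions) + baseline equals f(sum(terms)) up to rounding.
    """
    n = len(terms)
    contribs = [0.0] * n
    orderings = list(itertools.permutations(range(n)))
    for order in orderings:
        running = 0.0
        for k in order:
            # Marginal effect of adding term k at this position in the ordering.
            contribs[k] += f(running + terms[k]) - f(running)
            running += terms[k]
    contribs = [c / len(orderings) for c in contribs]
    return contribs, f(0.0)


# Hypothetical scalar stand-ins for the four pre-activation terms of the
# input gate: W_i x_t, V_i beta_{t-1}, V_i gamma_{t-1}, b_i.
terms = [0.8, -0.3, 0.5, 0.1]
contribs, baseline = linearize(sigmoid, terms)
total = sum(contribs) + baseline
```

Within each ordering the increments telescope to f(sum) - f(0), so the per-term contributions plus the baseline recover the gate value. For tanh the baseline is zero and the contributions alone sum to tanh of the total. Enumerating every ordering costs N! evaluations, which is only practical for a handful of terms, as is the case for the gates here.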