This post contains my reading notes on this paper. The paper proposes a new LSTM structure that can be interpreted with the help of mixture attention, which yields both variable importance and temporal importance.

Background

RNNs trained on multi-variable data capture the nonlinear correlation between the historical values of the target and exogenous variables and the future target values. However, current RNNs fall short on interpretability for multi-variable data because their hidden states are opaque: each hidden state mixes information from all input variables, so it is hard to tell how much any single variable contributes to a prediction. Existing work on making recurrent neural networks more interpretable rarely touches the internal structure of the RNN to overcome this opacity. This paper aims for a unified framework that provides both accurate forecasting and importance interpretation.

Proposed Model

The model does two things. First, it explores the internal structure of the LSTM so that hidden states encode individual variables. Then, mixture attention is designed to summarize ...
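
To make the two kinds of importance concrete, here is a minimal sketch in plain Python. It is not the paper's formulation. It assumes each variable already has its own sequence of hidden vectors, and it scores them with two hypothetical vectors, `w_time` and `w_var`, which would be learned in a real model. The function name `mixture_attention` and all numbers are made up for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def mixture_attention(hidden, w_time, w_var):
    """Illustrative two-level attention over variable-wise hidden states.

    hidden[n][t] is the hidden vector of variable n at time step t.
    w_time and w_var are hypothetical scoring vectors (assumptions,
    standing in for learned parameters).
    Returns (temporal, variable, contexts).
    """
    temporal, contexts = [], []
    for states in hidden:  # one list of per-step vectors per variable
        # Temporal importance: softmax over this variable's time steps.
        alpha = softmax([dot(w_time, h) for h in states])
        dim = len(states[0])
        # Attention-weighted summary of this variable's history.
        context = [sum(a * h[k] for a, h in zip(alpha, states)) for k in range(dim)]
        temporal.append(alpha)
        contexts.append(context)
    # Variable importance: softmax over the per-variable summaries.
    variable = softmax([dot(w_var, c) for c in contexts])
    return temporal, variable, contexts


# Two variables, three time steps, hidden size 2 (made-up numbers).
hidden = [
    [[0.1, 0.0], [0.2, 0.1], [0.9, 0.4]],
    [[0.3, 0.3], [0.3, 0.3], [0.3, 0.3]],
]
temporal, variable, contexts = mixture_attention(hidden, [1.0, 1.0], [1.0, 0.5])
```

Each row of `temporal` sums to one and says which time steps matter for that variable, while `variable` sums to one across variables. In this toy input the second variable has identical hidden vectors at every step, so its temporal weights are uniform.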