Tricks of the trade for training Long Short-Term Memory networks.
End-to-end differentiable memory through attention mechanisms.
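
As a rough illustration of what "differentiable memory through attention" can mean, here is a minimal sketch of a soft memory read, assuming dot-product scoring and a softmax over memory slots. The function names (`softmax`, `attention_read`) and the toy keys and values are hypothetical and chosen only for this example. They are not taken from any particular paper or library.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_read(query, keys, values):
    """Soft read from memory (hypothetical helper for illustration).

    Each memory slot i has a key and a value. The read result is a
    weighted average of all values, with weights given by a softmax
    over query-key dot products. Every step is smooth, so gradients
    can flow to the query, the keys, and the values.
    """
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    read = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)]
    return weights, read


# Toy memory with three slots (assumed values, for illustration only).
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]

weights, read = attention_read([2.0, 0.0], keys, values)
print(weights)  # slots 0 and 2 match the query best and share most of the weight
print(read)     # a blend of all values, dominated by slots 0 and 2
```

Because the read is a weighted average rather than a hard lookup of a single slot, the whole operation stays differentiable and can be trained end to end with gradient descent.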