What is: Memory Network?

Source: Memory Networks
Year: 2014
Data Source: CC BY-SA - https://paperswithcode.com

A Memory Network combines the inference capabilities of a neural network model with a memory component that can be read from and written to. The motivation is that many neural networks lack a long-term memory component: their existing memory, encoded by hidden states and weights, is too small and not compartmentalized enough to accurately remember facts from the past. RNNs, for example, have difficulty memorizing and struggle with tasks as simple as copying a sequence.

A memory network consists of a memory $\textbf{m}$ (an array of objects indexed by $\textbf{m}_{i}$) and four potentially learned components:

  • Input feature map $I$ - feature representation of the data input.
  • Generalization $G$ - updates old memories given the new input.
  • Output feature map $O$ - produces a new output feature map given the new input and the current memory state.
  • Response $R$ - converts the output into the desired response.
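
One way to read this list is as four function signatures around a shared memory. The sketch below is only an illustration under stated assumptions: the type aliases `Features` and `Memory` and the `MemoryNetwork` dataclass are hypothetical names, and `G` is simplified to take the whole memory and return the updated memory rather than being applied slot by slot.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List

# Hypothetical type aliases chosen for this sketch.
Features = Any        # an internal feature representation
Memory = List[Any]    # the memory: an array of objects


@dataclass
class MemoryNetwork:
    """Container for the four (potentially learned) components."""
    I: Callable[[Any], Features]               # input feature map
    G: Callable[[Memory, Features], Memory]    # generalization: updates memories
    O: Callable[[Features, Memory], Features]  # output feature map
    R: Callable[[Features], Any]               # response
    memory: Memory = field(default_factory=list)


# Trivial placeholder components, only to show that the pieces fit together.
net = MemoryNetwork(
    I=lambda x: x.lower(),
    G=lambda m, f: m + [f],
    O=lambda f, m: m[-1],
    R=lambda o: o.upper(),
)
```

Keeping the components as plain callables reflects the point that each one can be swapped independently, whether it is hand-written or learned.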

Given an input $x$ (e.g., an input character, word or sentence depending on the granularity chosen, an image or an audio signal), the flow of the model is as follows:

  1. Convert $x$ to an internal feature representation $I\left(x\right)$.
  2. Update memories $\textbf{m}_{i}$ given the new input: $\textbf{m}_{i} = G\left(\textbf{m}_{i}, I\left(x\right), \textbf{m}\right)$, $\forall i$.
  3. Compute output features $o$ given the new input and the memory: $o = O\left(I\left(x\right), \textbf{m}\right)$.
  4. Finally, decode output features $o$ to give the final response: $r = R\left(o\right)$.
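
The four steps can be sketched end to end with deliberately simple components. Everything concrete here is an assumption made for illustration, not something the source prescribes: `I` builds a bag of words, `G` writes the new features into the next free slot and leaves older memories unchanged, `O` retrieves the earlier memory with the largest word overlap, and `R` joins the retrieved words back into a string.

```python
from typing import FrozenSet, List, Tuple

Features = FrozenSet[str]  # hypothetical feature type: a bag of words


def I(x: str) -> Features:
    """Input feature map: lowercase bag of words (assumed representation)."""
    return frozenset(x.lower().replace("?", "").replace(".", "").split())


def G(memory: List[Features], features: Features) -> List[Features]:
    """Generalization: store the new input in the next free slot."""
    return memory + [features]


def O(features: Features, memory: List[Features]) -> Features:
    """Output feature map: the earlier memory sharing the most words with the input."""
    candidates = memory[:-1]  # skip the slot that holds the current input
    if not candidates:
        return frozenset()
    return max(candidates, key=lambda m_i: len(m_i & features))


def R(o: Features) -> str:
    """Response: render the output features as text."""
    return " ".join(sorted(o))


def step(x: str, memory: List[Features]) -> Tuple[str, List[Features]]:
    features = I(x)               # 1. convert x to I(x)
    memory = G(memory, features)  # 2. update memories
    o = O(features, memory)       # 3. compute output features
    return R(o), memory           # 4. decode the response


memory: List[Features] = []
for sentence in ["Joe went to the kitchen.", "Fred picked up the milk."]:
    _, memory = step(sentence, memory)

answer, memory = step("Where is the milk?", memory)
print(answer)  # fred milk picked the up
```

The question itself is also written to memory in step 2, which is why this toy `O` excludes the most recent slot when searching.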

This process is applied at both training and test time, if there is a distinction between the two phases. That is, memories are also stored at test time, but the model parameters of $I$, $G$, $O$ and $R$ are not updated. Memory networks cover a wide class of possible implementations: the components $I$, $G$, $O$ and $R$ can potentially use any existing ideas from the machine learning literature.
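
The difference between the two phases can be made concrete with a toy model in which memory is written on every call but parameters change only during training. This is a minimal sketch under loud assumptions: `ToyMemoryNetwork`, its `word_weight` parameter and the crude perceptron-style update are all hypothetical stand-ins for real components and real training.

```python
from typing import Dict, List, Optional, Set


class ToyMemoryNetwork:
    def __init__(self) -> None:
        self.memory: List[Set[str]] = []         # written at train AND test time
        self.word_weight: Dict[str, float] = {}  # hypothetical learned parameter of O

    def _score(self, query: Set[str], m_i: Set[str]) -> float:
        # Weighted word overlap; unseen words default to weight 1.0.
        return sum(self.word_weight.get(w, 1.0) for w in query & m_i)

    def step(self, x: str, target: Optional[Set[str]] = None,
             train: bool = False, lr: float = 1.0) -> Set[str]:
        features = set(x.lower().split())   # I: bag of words
        self.memory.append(features)        # G: write to the next slot
        earlier = self.memory[:-1]
        prediction = (max(earlier, key=lambda m_i: self._score(features, m_i))
                      if earlier else set())  # O (R is the identity here)
        # Parameters are touched only in the training phase.
        if train and target is not None and prediction != target:
            for w in features & target:
                self.word_weight[w] = self.word_weight.get(w, 1.0) + lr
            for w in features & prediction:
                self.word_weight[w] = self.word_weight.get(w, 1.0) - lr
        return prediction


net = ToyMemoryNetwork()
net.step("joe is in the kitchen", train=True)
net.step("fred left", train=True)
net.step("where is fred in the house", target={"fred", "left"}, train=True)

weights_before = dict(net.word_weight)
size_before = len(net.memory)

prediction = net.step("did fred leave")  # test time: train=False

# Memory grows from 3 to 4 entries; word_weight is unchanged.
print(len(net.memory) - size_before, net.word_weight == weights_before)
```

Separating the mutable memory from the learned parameters in this way is what lets the model keep accumulating facts after training has stopped.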

Image Source: Adrian Colyer