What is: Unitary RNN?

Source: Unitary Evolution Recurrent Neural Networks
Year: 2015
Data Source: CC BY-SA - https://paperswithcode.com

A Unitary RNN is a recurrent neural network architecture that uses a unitary hidden-to-hidden matrix. Specifically, it concerns dynamics of the form:

$$h_{t} = f\left(W h_{t-1} + V x_{t}\right)$$

where $W$ is a unitary matrix ($W^{\dagger} W = I$). The product of unitary matrices is a unitary matrix, so $W$ can be parameterised as a product of simpler unitary matrices:

$$h_{t} = f\left(D_{3} R_{2} F^{-1} D_{2} P R_{1} F D_{1} h_{t-1} + V x_{t}\right)$$

where $D_{3}$, $D_{2}$, $D_{1}$ are learned diagonal complex matrices, and $R_{2}$, $R_{1}$ are learned reflection matrices. The matrices $F$ and $F^{-1}$ are the discrete Fourier transform and its inverse. $P$ is a fixed random permutation. The activation function $f(h)$ applies a rectified linear unit with a learned bias to the modulus of each complex number. Only the diagonal and reflection matrices, $D$ and $R$, are learned, so Unitary RNNs have fewer parameters than LSTMs with comparable numbers of hidden units.
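The factorised update above can be sketched in NumPy as follows. This is a minimal illustration with untrained random parameters, not a reference implementation. It assumes that each diagonal matrix has unit-modulus entries $e^{i\theta}$ (needed for it to be unitary), that each reflection is a Householder reflection $I - 2vv^{\dagger}/\lVert v \rVert^{2}$, and that the activation keeps the phase of each entry while thresholding its modulus. The names `make_params`, `reflect`, `apply_W`, `mod_relu`, and `step` are hypothetical and chosen only for this example.

```python
import numpy as np

def make_params(n, rng):
    """Draw illustrative (untrained) parameters for an n-dimensional hidden state."""
    return {
        # Phases of the diagonal matrices (assumed form: D_k = diag(exp(i * theta_k))).
        "theta": rng.uniform(-np.pi, np.pi, size=(3, n)),
        # Complex vectors defining the two reflections R_1 and R_2.
        "v": rng.normal(size=(2, n)) + 1j * rng.normal(size=(2, n)),
        # Fixed random permutation P (not learned).
        "perm": rng.permutation(n),
        # Bias of the activation, one value per hidden unit.
        "bias": rng.uniform(-0.1, 0.1, size=n),
    }

def reflect(v, h):
    """Apply R = I - 2 v v^dagger / ||v||^2 to h without forming the matrix."""
    # np.vdot conjugates its first argument, so vdot(v, h) = v^dagger h.
    return h - 2.0 * v * (np.vdot(v, h) / np.vdot(v, v))

def apply_W(p, h):
    """Compute W h = D3 R2 F^{-1} D2 P R1 F D1 h, applying factors right to left."""
    h = np.exp(1j * p["theta"][0]) * h   # D1
    h = np.fft.fft(h, norm="ortho")      # F, scaled so that it is unitary
    h = reflect(p["v"][0], h)            # R1
    h = h[p["perm"]]                     # P
    h = np.exp(1j * p["theta"][1]) * h   # D2
    h = np.fft.ifft(h, norm="ortho")     # F^{-1}
    h = reflect(p["v"][1], h)            # R2
    h = np.exp(1j * p["theta"][2]) * h   # D3
    return h

def mod_relu(z, bias, eps=1e-8):
    """ReLU with a bias applied to the modulus of z; the phase is kept (assumption)."""
    mag = np.abs(z)
    return np.maximum(mag + bias, 0.0) * z / (mag + eps)

def step(p, V, h_prev, x):
    """One recurrence step: h_t = f(W h_{t-1} + V x_t)."""
    return mod_relu(apply_W(p, h_prev) + V @ x, p["bias"])

rng = np.random.default_rng(0)
n, d = 8, 3                              # hidden size and input size (arbitrary)
params = make_params(n, rng)
V = rng.normal(size=(n, d)) + 1j * rng.normal(size=(n, d))
h0 = rng.normal(size=n) + 1j * rng.normal(size=n)

# A unitary W leaves the norm of the hidden state unchanged.
print(np.linalg.norm(h0), np.linalg.norm(apply_W(params, h0)))

h1 = step(params, V, h0, rng.normal(size=d))
```

Because every factor is applied as an elementwise product, an FFT, a permutation, or a rank-one update, the sketch never forms the full $n \times n$ matrix $W$. The learned quantities in $W$ are the three phase vectors and the two reflection vectors, each of length $n$.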

Source: Associative LSTMs