A small language model using a Recurrent Neural Network to generate text based on a corpus.
To launch the code, run the following command:
```
cargo run --release <architecture> <filepath>
```
The architecture provided must be either `rnn` (for a vanilla Recurrent Neural Network) or `lstm` (for a Long Short-Term Memory architecture). The file passed as `<filepath>` is used as the training text for the network. A showcase file containing 40k lines of Shakespeare's plays is provided in `data/shakespeare-40k.txt`.
The implementation is written in Rust and uses the `nalgebra` library for linear algebra computations. The vanilla RNN implementation is based on this blog post by Andrej Karpathy, which I adapted to support LSTM.
Hyperparameters are defined in `src/rnn.rs`; you can, for instance, change the number of neurons in the hidden layer.
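As an illustration, the tunable values could look like the constants below. The names and values here are hypothetical (only the sequence length of 25 is mentioned later in this README); check `src/rnn.rs` for the actual definitions.

```rust
// Hypothetical hyperparameter constants; the real names live in src/rnn.rs.
const HIDDEN_SIZE: usize = 100; // neurons in the hidden layer
const SEQ_LENGTH: usize = 25; // characters per training sequence
const LEARNING_RATE: f64 = 0.1; // AdaGrad step size (illustrative value)

fn main() {
    println!("hidden={HIDDEN_SIZE} seq={SEQ_LENGTH} lr={LEARNING_RATE}");
}
```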
The simple RNN uses the following recurring equations:

$$h_t = \tanh(W_{xh}\, x_t + W_{hh}\, h_{t-1} + b_h)$$

$$y_t = W_{hy}\, h_t + b_y$$

where $x_t$ is the one-hot encoded input character, $h_t$ the hidden state, and $y_t$ the output scores, which are passed through a softmax to obtain a probability distribution over the next character.
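A minimal sketch of one recurrent step is shown below. For readability it uses plain `Vec<f64>` instead of nalgebra matrices, and the helper names (`rnn_step`, `mat_vec`) are illustrative, not the ones used in this repository.

```rust
// One RNN step: h_t = tanh(W_xh * x_t + W_hh * h_{t-1} + b_h).
// Plain-Vec sketch; the actual code uses nalgebra types.
fn mat_vec(m: &[Vec<f64>], v: &[f64]) -> Vec<f64> {
    // Dense matrix-vector product, row by row.
    m.iter()
        .map(|row| row.iter().zip(v).map(|(a, b)| a * b).sum())
        .collect()
}

fn rnn_step(
    w_xh: &[Vec<f64>], // input-to-hidden weights
    w_hh: &[Vec<f64>], // hidden-to-hidden weights
    b_h: &[f64],       // hidden bias
    x: &[f64],         // one-hot input character
    h_prev: &[f64],    // previous hidden state
) -> Vec<f64> {
    mat_vec(w_xh, x)
        .iter()
        .zip(mat_vec(w_hh, h_prev))
        .zip(b_h)
        .map(|((a, b), c)| (a + b + c).tanh())
        .collect()
}
```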
The Long Short-Term Memory architecture uses the following recurring equations:

$$f_t = \sigma(W_f\, [h_{t-1}, x_t] + b_f)$$

$$i_t = \sigma(W_i\, [h_{t-1}, x_t] + b_i)$$

$$\tilde{c}_t = \tanh(W_c\, [h_{t-1}, x_t] + b_c)$$

$$c_t = f_t \circ c_{t-1} + i_t \circ \tilde{c}_t$$

$$o_t = \sigma(W_o\, [h_{t-1}, x_t] + b_o)$$

$$h_t = o_t \circ \tanh(c_t)$$

where $\circ$ denotes the element-wise (Hadamard) product, computed with `.component_mul` in the Rust code.
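The LSTM step can be sketched as follows. Again this uses plain `Vec<f64>` rather than nalgebra (so the element-wise products appear as explicit loops instead of `.component_mul`), and the function names are illustrative.

```rust
// One LSTM step over the concatenated vector z = [h_{t-1}; x_t].
// Plain-Vec sketch; the actual code uses nalgebra and .component_mul.
fn sigmoid(x: f64) -> f64 {
    1.0 / (1.0 + (-x).exp())
}

// Affine transform of z followed by an activation: act(W z + b).
fn gate(w: &[Vec<f64>], z: &[f64], b: &[f64], act: fn(f64) -> f64) -> Vec<f64> {
    w.iter()
        .zip(b)
        .map(|(row, bi)| act(row.iter().zip(z).map(|(a, v)| a * v).sum::<f64>() + bi))
        .collect()
}

fn lstm_step(
    wf: &[Vec<f64>], wi: &[Vec<f64>], wc: &[Vec<f64>], wo: &[Vec<f64>],
    bf: &[f64], bi: &[f64], bc: &[f64], bo: &[f64],
    x: &[f64], h_prev: &[f64], c_prev: &[f64],
) -> (Vec<f64>, Vec<f64>) {
    // z = [h_{t-1}; x_t]
    let z: Vec<f64> = h_prev.iter().chain(x).copied().collect();
    let f = gate(wf, &z, bf, sigmoid);      // forget gate
    let i = gate(wi, &z, bi, sigmoid);      // input gate
    let c_bar = gate(wc, &z, bc, f64::tanh); // candidate cell state
    let o = gate(wo, &z, bo, sigmoid);      // output gate
    // c_t = f ∘ c_{t-1} + i ∘ c̃ (element-wise products)
    let c: Vec<f64> = f.iter().zip(c_prev).zip(&i).zip(&c_bar)
        .map(|(((f, cp), i), cb)| f * cp + i * cb)
        .collect();
    // h_t = o ∘ tanh(c_t)
    let h: Vec<f64> = o.iter().zip(&c).map(|(o, c)| o * c.tanh()).collect();
    (h, c)
}
```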
The optimization is done with AdaGrad. For each parameter $\theta$ with gradient $g = \partial L / \partial \theta$, the update is

$$m \leftarrow m + g^2, \qquad \theta \leftarrow \theta - \eta\, \frac{g}{\sqrt{m + \epsilon}}$$

where $m$ is a running sum of squared gradients, $\eta$ is the learning rate, and $\epsilon$ is a small constant for numerical stability.
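This per-parameter adaptive update can be sketched on flat slices as below; the function name is illustrative, and the real code applies the same rule to nalgebra matrices.

```rust
// AdaGrad update: m += g^2; p -= lr * g / sqrt(m + eps).
// `mem` accumulates squared gradients across iterations.
fn adagrad_update(param: &mut [f64], grad: &[f64], mem: &mut [f64], lr: f64, eps: f64) {
    for ((p, g), m) in param.iter_mut().zip(grad).zip(mem.iter_mut()) {
        *m += g * g;                      // accumulate squared gradient
        *p -= lr * g / (*m + eps).sqrt(); // per-parameter adaptive step
    }
}
```

Parameters that keep receiving large gradients get an ever-smaller effective step size, which is what makes AdaGrad stable without per-layer learning-rate tuning.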
After ... iterations of LSTM training on sequences of length 25, I obtained the following result sample on the `shakespeare-40k.txt` dataset:
```
QUEEN ELIZABETH:
The fraine thy most a twagenest.
Dle staffold!
To you doubs,
And weal drief.

PRESBERHA:
I ungo to cursess witor'd; us lave whil, than enough.

KING RICHARD III:
That lighterde for thy dascore defol,
And with leavend my sagn of the good burtief.
Thou wor he art I. Prince, but bournon they have my lord.
```