
RNN Shakespeare example giving error #44

Open
PoLabs opened this issue Nov 13, 2017 · 7 comments
Comments

@PoLabs

PoLabs commented Nov 13, 2017

This code in the tutorial leads to the error:

model <- mx.lstm(X.train, X.val, 
                 ctx=mx.gpu(),
                 num.round=num.round, 
                 update.period=update.period,
                 num.lstm.layer=num.lstm.layer, 
                 seq.len=seq.len,
                 num.hidden=num.hidden, 
                 num.embed=num.embed, 
                 num.label=vocab,
                 batch.size=batch.size, 
                 input.size=vocab,
                 initializer=mx.init.uniform(0.1), 
                 learning.rate=learning.rate,
                 wd=wd,
                 clip_gradient=clip_gradient)

Error: Error in check.data(train.data, batch.size, TRUE) : could not find function "check.data"
Is this function from an unspecified package?

Additionally, there are several missing files referenced in the tutorial:
rnn_model.R, rnn.R, lstm.R, etc.

best,

@PoLabs PoLabs changed the title RNN Shakespeare example giving eror RNN Shakespeare example giving error Nov 13, 2017
@jeremiedb
Contributor

The RNN API has recently been reworked to facilitate the handling of diverse use cases, such as time series or seq-to-one models, and to make the symbolic graph compatible with the general feedforward model training function. I made a few examples that aim to cover broader use cases. Don't hesitate to open a request for changes or additional features that are still missing.

@PoLabs
Author

PoLabs commented Apr 15, 2018

Did this change again? I'm getting a 'could not find function "rnn.graph"' error.

I really appreciate you posting tutorials, but I always find the sine/cosine wave example absolutely batty. Just difficult to understand. Something simple like time series with medical or financial data would be easier to grasp: I'm having a hard time following your iterator setup.

@jeremiedb
Contributor

It should still be there (https://github.com/apache/incubator-mxnet/blob/master/R-package/R/rnn.graph.R#L14). What version of the package are you using?

The rnn.graph function serves as a graph-builder helper. I'm not aware of a one-size-fits-all solution, but I'd be glad to improve the tutorials based on suggestions.

@PoLabs
Author

PoLabs commented Apr 16, 2018

Right on, your tutorial ran great on my updated instance. From trying to adapt it to my project, I think some extra commenting could help a lot. For instance:

What is the 'samples' variable? Is it single points from 192 waves at a given time step? This might be analogous to patient laboratory values or financial indicators at a given time step. 'seq_len' seems to be the number of observations/time steps.

I'm having errors creating mx.io.arrayiter, so it's possible my data structure is wrong. Currently it is a dataframe with x as 68 variables and y as 20k observations. The end goal is to predict the yth observation for any given x variable.

@jeremiedb
Contributor

Hi, the documentation is admittedly very scarce on these tutorials. I've added some comments and tried to remove some ambiguities in the CPU tutorial: file:///C:/Data/GitHub/mxnet_R_bucketing/docs/TimeSeries_CPU.html

The "samples" effectively refers to the number of independent time series. It has been renamed to "n" in the update. As for data dimensions, an important difference is that whereas in a normal regression problem we would feed the network an array of size [num_features X batch_size] with a target of size [batch_size], in a time-series model the features are [num_features X seq_length X batch_size] and the target is [seq_length X batch_size], since for each time series we have seq_length observations.
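A quick sketch of those shapes in base R, with illustrative values assumed (num_features, seq_length, and batch_size here are placeholders, not from the tutorial):

num_features <- 4    # features per time step (assumed)
seq_length   <- 20   # time steps per series (assumed)
n            <- 40   # number of independent series (assumed)

# Regression-style input: [num_features x n] with target [n]
x_reg <- array(rnorm(num_features * n), dim = c(num_features, n))
y_reg <- rnorm(n)

# Time-series input: [num_features x seq_length x n]
# with target [seq_length x n] (one label per time step per series)
x_ts <- array(rnorm(num_features * seq_length * n),
              dim = c(num_features, seq_length, n))
y_ts <- array(rnorm(seq_length * n), dim = c(seq_length, n))

dim(x_ts)  # 4 20 40
dim(y_ts)  # 20 40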

@PoLabs
Author

PoLabs commented May 10, 2018

These clarifications helped a ton! I've made it to the training step with my silly minute-to-minute cryptocurrency data, but I think I'm having issues with the '@param seq_len int, number of time steps to unroll.' My thought is it would be the same as the length of the sequences it's being fed (100 for your example, 20 for mine).

Working with 1,000 sequences of 20x min-obs:

batch.size = 40
train.data <- mx.io.arrayiter(data = x[,,1:800, drop = F], label = y[, 1:800], 
                              batch.size = batch.size, shuffle = TRUE)
eval.data <- mx.io.arrayiter(data = x[,,800:1000, drop = F], label = y[, 800:1000], 
                             batch.size = batch.size, shuffle = FALSE)

Going straight from your tutorial:

symbol <- rnn.graph.unroll(seq_len = 2, 
                           num_rnn_layer =  1, 
                           num_hidden = 50,
                           input_size = NULL,
                           num_embed = NULL, 
                           num_decode = 1,
                           masking = F, 
                           loss_output = "linear",
                           dropout = 0.2, 
                           ignore_label = -1,
                           cell_type = "lstm",
                           output_last_state = F,
                           config = "one-to-one")
system.time(model <- mx.model.buckets(symbol = symbol,
                                      train.data = train.data, 
                                      eval.data = eval.data, 
                                      num.round = 250, ctx = ctx, verbose = TRUE,
                                      metric = mx.metric.mse.seq, 
                                      initializer = initializer, optimizer = optimizer, 
                                      batch.end.callback = NULL, 
                                      epoch.end.callback = epoch.end.callback))
Error in sym_ini$infer.shape(input.shape) : 
  Error in operator split13: [20:28:42] c:\jenkins\workspace\mxnet\mxnet\src\operator\./slice_channel-inl.h:216: Check failed: ishape[real_axis] == static_cast<size_t>(param_.num_outputs) (20 vs. 2) If squeeze axis is True, the size of the sliced axis must be the same as num_outputs. Input shape=[20,40,1], axis=0, num_outputs=2.

It looks like the issue is with the 'seq_len = 2', trying 'seq_len = 1' produced a similar error, but seq_len=20 gave:

Error in sym_ini$infer.shape(input.shape) : 
  Error in operator loss: Shape inconsistent, Provided=[760], inferred shape=[800,1]

(running on mxnet_1.2.0 prebuilt CPU for windows, although the documentation says it requires CUDA? Thanks again!)

@jeremiedb
Contributor

Correct, the seq_len param should be set to the sequence length, therefore to 20 in your case. In the tutorial it was shown as 2 to make it possible to visualize the resulting graph, but the model is indeed run with seq_len = 100; I'll make the doc more explicit.

Still, it seems like there's a remaining shape issue. I would look at the actual dimensions of the data and labels fed to the iterator, and at the iterator's output as well, to confirm the shapes fed to the network. Generating the graph as in the tutorial (with a small seq_len, otherwise it will render forever) should also help identify where the glitch is.
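A sketch of that shape check, assuming the mxnet R iterator interface (reset/iter.next/value) and the train.data iterator defined above:

train.data$reset()
train.data$iter.next()
batch <- train.data$value()
dim(batch$data)   # should be [num_features, seq_len, batch.size], e.g. [f, 20, 40]
dim(batch$label)  # should be [seq_len, batch.size], e.g. [20, 40]

If the second dimension of batch$data is not 20, the iterator input arrays were likely built with the axes in a different order than [num_features x seq_length x n].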
