Sample selection when trying to train a Model #527
Unanswered
azraelxuemo
asked this question in
Q&A
Replies: 0 comments
Hi, sorry to bother you, but I have a question~
Description
In Figure 5.9, the demo uses a sample length of 6.
It first selects tokens 1-6 for training,
and then selects tokens 7-12.
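To make the setup concrete, here is a minimal sketch of this kind of sampling in plain Python. The function name `make_windows` and its `stride` parameter are hypothetical names chosen for illustration, not code taken from the book; the sketch only assumes next-token prediction, where each target window is the input window shifted by one token.

```python
def make_windows(token_ids, context_length, stride):
    """Split token_ids into (input, target) pairs using a sliding window.

    Hypothetical helper for illustration only.
    """
    samples = []
    # Stop early enough that the shifted target window still fits.
    for start in range(0, len(token_ids) - context_length, stride):
        inputs = token_ids[start:start + context_length]
        targets = token_ids[start + 1:start + context_length + 1]
        samples.append((inputs, targets))
    return samples


tokens = list(range(1, 14))  # toy token IDs 1..13
samples = make_windows(tokens, context_length=6, stride=6)
for inputs, targets in samples:
    print(inputs, "->", targets)
```

With `stride` equal to `context_length`, the windows do not overlap (tokens 1-6, then 7-12). Passing a smaller stride, for example `stride=3`, would produce an extra window covering tokens 4-9.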
Problem
This creates a problem: tokens 4, 5, and 6 never appear in the same context window as tokens 8 and 9.
The same issue exists when the context length increases to 1024.
Some tokens in the first batch can never see some tokens in the second batch.
And when we train the model, we use this kind of method for a whole epoch.
So I think it may cause some problems.
Solutions I have heard of
I have heard that some people use a random starting index to select the context.
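As I understand it, the random starting index idea could look roughly like the sketch below. This is only my guess at the approach, not a confirmed method from the book or from any production pipeline; `windows_with_random_offset` is a hypothetical name. Each epoch draws a new offset, so the window boundaries fall in different places from one epoch to the next.

```python
import random


def windows_with_random_offset(token_ids, context_length, rng):
    """Non-overlapping windows that begin at a random offset.

    Hypothetical helper for illustration only.
    """
    # Pick where the first window starts, somewhere in [0, context_length).
    offset = rng.randrange(context_length)
    samples = []
    for start in range(offset, len(token_ids) - context_length, context_length):
        inputs = token_ids[start:start + context_length]
        targets = token_ids[start + 1:start + context_length + 1]
        samples.append((inputs, targets))
    return offset, samples


tokens = list(range(1, 31))  # toy token IDs 1..30
rng = random.Random(0)       # seeded only to make the demo repeatable
for epoch in range(3):
    offset, samples = windows_with_random_offset(tokens, 6, rng)
    print(f"epoch {epoch}: offset={offset}, first window={samples[0][0]}")
```

Because the offset changes per epoch, tokens that were split across two windows in one epoch can end up inside the same window in another.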
What I want to know
When you actually train a model in an enterprise setting, how do you select the data samples?
Thanks and best regards. (Sorry, I do not know how to attach the figure, so I just gave its number.)
Finally, I would like to say that this is the best book I have encountered while studying LLMs!