The first 5 tokens form one chunk.
Each position in the chunk predicts the next token, so the target is the input shifted by one position.
[Figure: Input and Target chunks, Stride, and Collected pairs]
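
The chunking described above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the function name `sliding_window_pairs`, the made-up token IDs, and the stride of 2 are all assumptions, since the source does not give a stride value. Only the chunk length of 5 comes from the text.

```python
def sliding_window_pairs(token_ids, context_length=5, stride=1):
    """Collect (input, target) chunks; each target is its input shifted by one."""
    pairs = []
    # Stop early enough that the shifted target chunk is still full length.
    for start in range(0, len(token_ids) - context_length, stride):
        input_chunk = token_ids[start:start + context_length]
        target_chunk = token_ids[start + 1:start + context_length + 1]
        pairs.append((input_chunk, target_chunk))
    return pairs


# Hypothetical token IDs 10..19, used only for illustration.
tokens = list(range(10, 20))

# stride=2 is an assumed value; the window start moves 2 tokens per step.
pairs = sliding_window_pairs(tokens, context_length=5, stride=2)
for input_chunk, target_chunk in pairs:
    print(input_chunk, "->", target_chunk)
```

With a stride of 1, consecutive chunks overlap in all but one token; a stride equal to the chunk length gives chunks that do not overlap at all. Smaller strides yield more pairs from the same text at the cost of more repetition between them.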