And more importantly batches, so taking the example from the blog post, it would be 32 tokens per each forward pass in the decoding phase.
And more importantly batches, so taking the example from the blog post, it would be 32 tokens per each forward pass in the decoding phase.