having developed a large-batch workflow for a client using gemini models, i find this a welcome improvement. however, it's a bummer that there's no news on the DSQ (dynamic shared quota) [1] issues.
at least for us, the bottleneck is the number of retries and the amount of waiting needed to max out how many requests we can run in parallel.
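for anyone wondering what the retry/wait loop looks like in practice, here's a rough sketch of the general pattern (bounded concurrency plus exponential backoff with jitter), not our actual code. everything in it is an assumption for illustration: `QuotaExceeded` is a hypothetical stand-in for whatever quota/429-style error your client raises, `call` is any zero-argument coroutine function that wraps your real request, and the default limits are arbitrary.

```python
import asyncio
import random


class QuotaExceeded(Exception):
    """Hypothetical stand-in for a quota / 429-style error from the client."""


async def with_backoff(call, *, max_attempts=6, base_delay=0.5,
                       max_delay=30.0, sleep=asyncio.sleep):
    """Await call(); on QuotaExceeded, retry with full-jitter exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return await call()
        except QuotaExceeded:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # Full jitter: wait a random time up to the capped exponential delay.
            cap = min(max_delay, base_delay * 2 ** attempt)
            await sleep(random.uniform(0, cap))


async def run_all(calls, *, concurrency=8, **backoff_kwargs):
    """Run every call with at most `concurrency` in flight; results keep input order."""
    sem = asyncio.Semaphore(concurrency)

    async def guarded(call):
        # The slot is held while backing off, so a throttled request
        # also slows down how fast new requests are started.
        async with sem:
            return await with_backoff(call, **backoff_kwargs)

    return await asyncio.gather(*(guarded(c) for c in calls))
```

usage would be something like `asyncio.run(run_all([lambda p=p: send(p) for p in prompts], concurrency=16))`, where `send` is your own (hypothetical here) async request function. holding the semaphore slot during the backoff sleep is a deliberate choice in this sketch: it trades some throughput for less pressure on a shared quota. the painful part is still tuning `concurrency` and the delays by trial and error.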
[1] https://cloud.google.com/vertex-ai/generative-ai/docs/dynami...