Training is now GA!
Since launching the beta of Baseten Training in May, we’ve introduced a ton of improvements, including:
A more robust ML Cookbook, with great starting points for:
Training a coding model with GRPO
Multi-node long-context training with Qwen3 30B A3B
A variety of examples with Qwen3, gpt-oss, Gemma3, and Llama
Resume from checkpoint: Launch jobs that pick up right where you left off (see the sketch after this list)
Many other improvements, including:
Broader checkpoint recognition across FSDP, VeRL, and Megatron checkpoint formats
More availability for InfiniBand-backed multi-node training runs
Improved management and quality-of-life fixes for the training cache
Per-GPU metric visibility and improved logs
And much more!
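To make the resume-from-checkpoint item concrete, here's a minimal sketch of the general pattern in plain PyTorch. This is illustrative only, not Baseten's API: the checkpoint path, save cadence, and toy model are all hypothetical placeholders.

```python
# Illustrative resume-from-checkpoint pattern in plain PyTorch.
# CHECKPOINT_PATH, the save cadence, and the model are hypothetical.
import os
import torch
import torch.nn as nn

CHECKPOINT_PATH = "/cache/checkpoints/latest.pt"  # hypothetical cache location

model = nn.Linear(128, 10)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
start_step = 0

# On relaunch, restore model/optimizer state and the step counter
# so training continues right where it left off.
if os.path.exists(CHECKPOINT_PATH):
    ckpt = torch.load(CHECKPOINT_PATH, map_location="cpu")
    model.load_state_dict(ckpt["model"])
    optimizer.load_state_dict(ckpt["optimizer"])
    start_step = ckpt["step"] + 1

for step in range(start_step, 1_000):
    # ... forward / backward / optimizer.step() for one batch ...
    if step % 100 == 0:  # periodically persist state so a relaunch can resume
        torch.save(
            {
                "model": model.state_dict(),
                "optimizer": optimizer.state_dict(),
                "step": step,
            },
            CHECKPOINT_PATH,
        )
```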
After months of positive feedback from early users and thousands of completed training runs, Baseten Training is now generally available to everyone on Baseten. Get started here.