Philip Kiely
Lead Developer Advocate
Model performance
Day zero benchmarks for Qwen 3 with SGLang on Baseten
Yineng Zhang
2 others
Infrastructure
Accelerating inference with NVIDIA B200 GPUs
Philip Kiely
Community
Building performant embedding workflows with Chroma and Baseten
Philip Kiely
AI engineering
The best open-source embedding models
Philip Kiely
Model performance
How we built BEI: high-throughput embedding, reranker, and classifier inference
Michael Feil
1 other
Model performance
How multi-node inference works for massive LLMs like DeepSeek-R1
Phil Howes
1 other
Infrastructure
Testing Llama 3.3 70B inference performance on NVIDIA GH200 in Lambda Cloud
Pankaj Gupta
1 other
AI engineering
Private, secure DeepSeek-R1 in production in US & EU data centers
Yineng Zhang
2 others
Model performance
How we built production-ready speculative decoding with TensorRT-LLM
Pankaj Gupta
2 others