Product

Create custom environments for deployments on Baseten

Test and deploy ML models reliably with production-ready custom environments, persistent endpoints, and seamless CI/CD.

Introducing canary deployments on Baseten

Our canary deployments feature lets you roll out new model deployments with minimal risk to your end-user experience.

Using asynchronous inference in production

Learn how async inference works, how it protects against common inference failures, how it's applied in common use cases, and more.

Baseten Chains explained: building multi-component AI workflows at scale

A delightful developer experience for building and deploying compound ML inference workflows.

New in May 2024

AI events, multicluster model serving architecture, tokenizer efficiency, and forward-deployed engineering.

New in April 2024

Use four new best-in-class LLMs, stream synthesized speech with XTTS, and deploy models with CI/CD.

New in March 2024

Fast Mistral 7B, fractional H100 GPUs, FP8 quantization, and API endpoints for model management.

New in February 2024

3x throughput with H100 GPUs, 40% lower SDXL latency with TensorRT, and multimodal open source models.

New in January 2024

A library for open source models, general availability for L4 GPUs, and performance benchmarking for ML inference.