Changelog

See our latest feature releases, product improvements and bug fixes

Apr 14, 2025

Flexible instance types per model deployment

Model deployments now support changing instance types, enabling you to experiment with different hardware configurations and use specific hardware for staging, development, and production...

Apr 10, 2025

Stream Baseten logs from the terminal

For users who love working in the terminal, we're excited to announce truss push --tail, which streams Baseten logs directly to your command line.

Apr 7, 2025

Docs refresh

We’ve overhauled the Baseten docs to make them more readable, structured, and easier to navigate for both new and returning users. Some highlights:

Mar 21, 2025

Baseten is now fully OpenAI compatible

The OpenAI SDK has become a standard for interacting with AI models, making it extremely important in the inference space. We’re happy to announce official OpenAI-compatible APIs for both chat...

Feb 10, 2025

Baseten Chains is now GA: Deploy ultra-low-latency compound AI systems

Now with improved performance, robustness, and an even more delightful DevEx since our beta launch, we’re thrilled to announce the general availability of Baseten Chains for production compound AI!

Jan 30, 2025

Health checks are now customizable

We run health checks on your deployments to ensure they’re able to run inference. Now, you can customize these checks to monitor anything, from tracking 500 errors to detecting CUDA issues and more.

Jan 21, 2025

GPU metrics now available on MIG instance types

We've expanded our metrics support to include GPU memory usage and utilization for MIG (Multi-Instance GPU) instance types. These metrics were previously unavailable for MIG configurations. This...

Dec 20, 2024

New metrics dashboard customization

We’ve revamped our metrics dashboard to make monitoring and debugging easier! Here’s what’s new:

Dec 19, 2024

Introducing the Speculative Decoding Engine Builder integration

Our new Speculative Decoding integration lets you leverage speculative decoding as part of our streamlined TensorRT-LLM Engine Builder flow. Just modify the new speculator configuration in the Engine...

Dec 13, 2024

New REST API endpoints

We’ve added several new endpoints to our REST API, giving you even more control over your deployments, environments, and resources. Here’s what’s new:

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15

Explore Baseten today

Start deploying

Talk to an engineer