A checklist for switching to open source ML models
Switching from a closed source ecosystem, where you consume ML models from API endpoints, to the world of open source ML models can seem intimidating. This checklist gives you the resources you need to make the leap.
Pick an open source model
The biggest advantage of the open source ML ecosystem is the sheer number and variety of models to choose from. But that much choice can be overwhelming. Here are some open source alternatives to popular closed-source models to get you started:
Large language models (LLMs):
Text embedding models:
Closed source: OpenAI text-embedding-3
Open source: BAAI text embedding models
Speech to text (audio transcription) models:
Closed source: Whisper from the Audio API
Open source: Whisper on your own infra
Text to speech (audio generation) models:
Closed source: Audio API text to speech endpoint
Open source: Orpheus TTS
Choose a GPU for model inference
Inference for most generative models like LLMs requires GPUs. Picking the right GPU is essential: you want the least expensive GPU powerful enough to run the model with acceptable performance.
For a 70 billion parameter LLM like Llama 3.3 70B, you need 2-4 H100 GPUs, but for the largest LLMs like DeepSeek-R1, you'll need H200 GPUs or multi-node inference. Partial H100 GPUs via multi-instance GPU (MIG) or smaller, cheaper L4 GPUs give great performance for smaller models like Whisper and embedding models.
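A quick back-of-the-envelope check makes these sizing claims concrete. The sketch below uses common rules of thumb (2 bytes per parameter for fp16/bf16 weights, roughly 20% headroom for the KV cache and activations), not exact measurements for any specific inference server:

```python
import math

# Rough VRAM sizing for LLM inference: weights dominate, so estimate
# bytes-per-parameter times parameter count, plus headroom for the
# KV cache and activations. These are rules of thumb, not guarantees.
GPU_VRAM_GB = {"L4": 24, "A100": 80, "H100": 80, "H200": 141}

def weight_footprint_gb(params_billions: float, bytes_per_param: float = 2.0) -> float:
    """Weight memory in GB (fp16/bf16 = 2 bytes per parameter)."""
    return params_billions * bytes_per_param

def gpus_needed(params_billions: float, gpu: str = "H100",
                bytes_per_param: float = 2.0, overhead: float = 1.2) -> int:
    """Minimum GPU count to hold the weights plus ~20% runtime overhead."""
    total_gb = weight_footprint_gb(params_billions, bytes_per_param) * overhead
    return math.ceil(total_gb / GPU_VRAM_GB[gpu])

# Llama 3.3 70B in fp16: 70 * 2 * 1.2 = 168 GB, or 3 H100s (80 GB each).
print(gpus_needed(70, "H100"))
```

Quantizing to 8-bit weights (`bytes_per_param=1.0`) halves the footprint, which is one reason quantization shows up in the optimization step below.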
Here are some buyer’s guides to GPUs:
Find optimizations relevant to your use case
If you’re just experimenting with open source models or you need to get something in production yesterday, you can skip this step. But one of the most powerful things that switching to open source models unlocks is the ability to optimize a balance of latency, throughput, quality, and cost to align with your use case.
Get started with:
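As a rough illustration of the cost side of this tradeoff (the GPU price and throughput figures below are made-up placeholders, not benchmarks), you can convert a GPU's hourly price and measured throughput into a cost per million tokens:

```python
def cost_per_million_tokens(gpu_price_per_hour: float, tokens_per_second: float) -> float:
    """Cost to generate one million tokens at a sustained throughput.

    Larger batch sizes usually raise aggregate tokens/sec (throughput)
    at the cost of per-request latency, which is why tuning this
    balance for your use case matters.
    """
    tokens_per_hour = tokens_per_second * 3600
    return gpu_price_per_hour / tokens_per_hour * 1_000_000

# Placeholder numbers: a $4/hr GPU sustaining 1,000 tokens/sec across
# all requests works out to about $1.11 per million tokens.
print(round(cost_per_million_tokens(4.0, 1000), 2))
```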
Deploy your model
Once you have your model and hardware configuration, it’s time to deploy. You can deploy a curated selection of models from our model library in just a couple of clicks or use Truss, our open source model packaging framework, to get any model up and running behind an API endpoint.
Dive into deployment with:
Open source models in the Baseten model library.
A quickstart guide for Truss, an open source model packaging framework.
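To give a feel for the packaging step, here is a minimal sketch of a Truss `model/model.py`. Truss calls `load()` once when a replica starts and `predict()` on each request; the echo logic here is a stand-in for real inference, and you should check the Truss quickstart for the current interface:

```python
# Hypothetical Truss model/model.py sketch, with a trivial stand-in
# for real model inference.
class Model:
    def __init__(self, **kwargs):
        self._model = None

    def load(self):
        # In a real Truss, load model weights here (runs once per replica),
        # e.g. instantiating a transformers pipeline.
        self._model = lambda text: text.upper()

    def predict(self, model_input):
        # model_input is the parsed JSON request body.
        text = model_input["text"]
        return {"output": self._model(text)}
```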
Integrate your new model endpoint
Once you’ve deployed your model, you’ll call its API endpoint to integrate it into your application.
Baseten has guides for:
If you want to dive deeper, check out our guide to open source alternatives for ML models. Wherever you are in your journey from evaluation to adoption for open source ML models, we’re here to help at support@baseten.co.