Philip Kiely, Lead Developer Advocate
- Model performance: How to double tokens per second for Llama 3 with Medusa (Abu Qader and 1 other)
- Community: SPC hackathon winners build with Llama 3.1 on Baseten (Philip Kiely)
- News: Introducing automatic LLM optimization with TensorRT-LLM Engine Builder (Abu Qader and 1 other)
- Community: Ten reasons to join Baseten (Dustin Michaels and 1 other)
- Model performance: How to serve 10,000 fine-tuned LLMs from a single GPU (Pankaj Gupta and 1 other)
- Infrastructure: Control plane vs workload plane in model serving infrastructure (Colin McGrath and 2 others)
- Model performance: Comparing tokens per second across LLMs (Philip Kiely)
- AI engineering: CI/CD for AI model deployments (Vlad Shulman and 3 others)
- AI engineering: Streaming real-time text to speech with XTTS V2 (Het Trivedi and 1 other)