Cogito v2 70B
SOTA 70B dense model trained for better outputs from shorter reasoning chains
Model details
Cogito v2 70B, currently in preview, is a frontier LLM that offers SOTA intelligence with shorter reasoning chains thanks to Iterated Distillation & Amplification (IDA), a novel research technique that distills improvements from inference-time reasoning back into the model weights.
Thanks to IDA, Cogito models arrive at strong results using fewer reasoning tokens, which reduces cost and improves speed in real-world agents and applications.
Example usage
Input
from openai import OpenAI
import os

model_url = "" # Copy in from API pane in Baseten model dashboard

client = OpenAI(
    api_key=os.environ['BASETEN_API_KEY'],
    base_url=model_url
)

# Chat completion
response_chat = client.chat.completions.create(
    model="",
    messages=[
        {"role": "user", "content": "Write FizzBuzz."}
    ],
    temperature=0.6,
    max_tokens=1000,
)
print(response_chat)
JSON output
{
    "id": "143",
    "choices": [
        {
            "finish_reason": "stop",
            "index": 0,
            "logprobs": null,
            "message": {
                "content": "[Model output here]",
                "role": "assistant",
                "audio": null,
                "function_call": null,
                "tool_calls": null
            }
        }
    ],
    "created": 1741224586,
    "model": "",
    "object": "chat.completion",
    "service_tier": null,
    "system_fingerprint": null,
    "usage": {
        "completion_tokens": 145,
        "prompt_tokens": 38,
        "total_tokens": 183,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
    }
}
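In practice you usually want the generated text and the token counts rather than the whole response object. The sketch below reads those fields from a trimmed copy of the sample response above, parsed as a plain dictionary. The helper name `summarize_response` is hypothetical and is not part of any SDK. With the OpenAI Python client the same fields are available as attributes, for example `response_chat.choices[0].message.content` and `response_chat.usage.total_tokens`.

```python
import json

# Trimmed copy of the sample response shown above (values are illustrative).
sample_response = json.loads("""
{
    "choices": [
        {
            "finish_reason": "stop",
            "index": 0,
            "message": {"content": "[Model output here]", "role": "assistant"}
        }
    ],
    "usage": {"completion_tokens": 145, "prompt_tokens": 38, "total_tokens": 183}
}
""")


def summarize_response(response: dict) -> dict:
    """Hypothetical helper: extract the text, finish reason, and token counts."""
    choice = response["choices"][0]
    usage = response.get("usage") or {}  # tolerate a missing or null usage block
    return {
        "text": choice["message"]["content"],
        "finish_reason": choice["finish_reason"],
        "prompt_tokens": usage.get("prompt_tokens"),
        "completion_tokens": usage.get("completion_tokens"),
        "total_tokens": usage.get("total_tokens"),
    }


summary = summarize_response(sample_response)
print(summary["text"])
print(summary["finish_reason"], summary["total_tokens"])
```

A `finish_reason` of `"stop"` means the model ended its answer on its own. A value of `"length"` means the output was cut off at `max_tokens`, in which case raising that limit is the usual remedy.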



