Tulu 3 8B Reward

A reward model based on Llama 3.1 8B

Model details

Developed by: Allen AI
Model family: Tulu
Use case: embedding
Version: V3
Variant: Reward
Size: 8B
Optimization: BEI
Hardware: H100 MIG 40GB
License: Llama 3.1

Example usage

allenai/Llama-3.1-Tulu-3-8B-RM is a text-classification model: for each input text it returns one or more labels, each with a score.

Text-classification models are frequently used for sentiment analysis, spam detection, and similar tasks. They are also used to deploy chat rating models, such as RLHF reward models or toxicity detection models.

Input
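import os

import requests

# The API key is read from the BASETEN_API_KEY environment variable.
headers = {
    "Authorization": f"Api-Key {os.environ['BASETEN_API_KEY']}"
}

# model-xxxxxx is a placeholder for your own model ID.
response = requests.post(
    url="https://model-xxxxxx.api.baseten.co/environments/production/sync/predict",
    headers=headers,
    json={
        # Each inner list is scored as a separate input.
        "inputs": [["Baseten is a fast inference provider"], ["classify this separately."]],
        "raw_scores": True,
        "truncate": True,
        "truncation_direction": "Right"
    }
)
response.raise_for_status()  # raise an exception on HTTP error responses
print(response.json())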
JSON output
[
    [
        {
            "label": "excitement",
            "score": 0.99
        }
    ],
    [
        {
            "label": "excitement",
            "score": 0.01
        }
    ]
]
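
Reading the scores

The following is a minimal sketch of reading scores out of a response, assuming it has the shape shown above: one list per input, each holding objects with "label" and "score" keys. The helper name top_predictions is hypothetical and is not part of any Baseten or model API. The sample data is copied from the JSON output above, not fetched from a live endpoint.

```python
from typing import Dict, List, Tuple, Union

# Assumed response shape: one list of {"label", "score"} objects per input.
Prediction = Dict[str, Union[str, float]]


def top_predictions(result: List[List[Prediction]]) -> List[Tuple[str, float]]:
    """Return the highest-scoring (label, score) pair for each input."""
    tops = []
    for predictions in result:
        # Pick the entry with the largest score for this input.
        best = max(predictions, key=lambda p: p["score"])
        tops.append((best["label"], best["score"]))
    return tops


# Sample data copied from the JSON output above.
sample = [
    [{"label": "excitement", "score": 0.99}],
    [{"label": "excitement", "score": 0.01}],
]

for label, score in top_predictions(sample):
    print(f"{label}: {score}")
```

With a live deployment, the parsed body of the request shown under Input could be passed in place of sample, provided the response follows the same shape.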