"Inference Engineering" is now available. Get your copy here

BAAI BGE Embedding ICL

BGE Embedding ICL is an excellent all-around model for text embedding.

Example usage

BAAI/bge-en-icl is a text embedding model: given an input text, it produces a one-dimensional embedding vector. These embeddings are frequently used for downstream tasks such as clustering, and are often stored in and queried from vector databases.
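
Downstream tasks such as clustering and vector search usually come down to comparing embedding vectors. The following is a minimal sketch of one common comparison, cosine similarity, in plain Python. The helper names `cosine_similarity` and `rank_by_similarity` and the toy three-dimensional vectors are assumptions made for illustration. Real embeddings from the model are much longer, and a vector database would normally do this ranking for you.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

def rank_by_similarity(query, candidates):
    """Return candidate indices sorted from most to least similar to query."""
    scores = [cosine_similarity(query, c) for c in candidates]
    return sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)

# Toy vectors standing in for real embeddings (hypothetical values).
query = [1.0, 0.0, 0.0]
candidates = [[0.0, 1.0, 0.0], [0.9, 0.1, 0.0], [-1.0, 0.0, 0.0]]
ranking = rank_by_similarity(query, candidates)
```

Here `ranking` lists the candidate positions from closest to farthest. The second candidate points in almost the same direction as the query, so it ranks first.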

This model is quantized to FP8 for deployment, a format supported by recent NVIDIA GPUs such as the H100, H100_40GB, or L4. Quantization is optional, but it leads to higher efficiency.
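
One source of that efficiency is storage: an FP8 weight takes one byte, where an FP16 weight takes two. The sketch below works through the arithmetic for weight memory only. The parameter count is a hypothetical round number chosen for illustration, not a figure from this page. The estimate ignores activations, quantization scaling factors, and runtime overhead.

```python
def weight_memory_gib(num_parameters, bytes_per_parameter):
    """Approximate memory needed to hold the weights alone, in GiB."""
    return num_parameters * bytes_per_parameter / 2**30

# Hypothetical parameter count, used only to illustrate the arithmetic.
NUM_PARAMETERS = 7_000_000_000

fp16_gib = weight_memory_gib(NUM_PARAMETERS, 2)  # 2 bytes per FP16 weight
fp8_gib = weight_memory_gib(NUM_PARAMETERS, 1)   # 1 byte per FP8 weight

print(f"FP16 weights: {fp16_gib:.1f} GiB")
print(f"FP8 weights:  {fp8_gib:.1f} GiB")
```

Under these assumptions the FP8 weights need half the memory of the FP16 weights, whatever the parameter count.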

Input
from openai import OpenAI
import os

client = OpenAI(
    api_key=os.environ['BASETEN_API_KEY'],
    base_url="https://model-xxxxxx.api.baseten.co/environments/production/sync/v1"
)

embedding = client.embeddings.create(
    input="Baseten Embeddings are fast",
    model="model"
)
JSON output
{
    "data": [
        {
            "embedding": [
                0
            ],
            "index": 0,
            "object": "embedding"
        }
    ],
    "model": "thenlper/gte-base",
    "object": "list",
    "usage": {
        "prompt_tokens": 512,
        "total_tokens": 512
    }
}
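
To use the result, you typically pull the vectors out of the `data` array. The following is a minimal sketch that parses a response of the shape shown above and returns the vectors in input order. The function name `extract_embeddings` and the sample payload are assumptions for illustration, and the numbers are placeholders, not real model output.

```python
import json

def extract_embeddings(response_text):
    """Return the embedding vectors from a response body, ordered by index."""
    payload = json.loads(response_text)
    # Sort by "index" so each vector lines up with the input at that position.
    items = sorted(payload["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]

# Hypothetical response body with two items listed out of order.
sample_response = """
{
    "data": [
        {"embedding": [0.3, 0.4], "index": 1, "object": "embedding"},
        {"embedding": [0.1, 0.2], "index": 0, "object": "embedding"}
    ],
    "object": "list"
}
"""

vectors = extract_embeddings(sample_response)
```

When you call the endpoint through the OpenAI Python client as in the Input example, the same vector is available on the returned object as `embedding.data[0].embedding`.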
