Embeddings

The embeddings endpoint generates vector representations of text. These can be used for semantic search, clustering, classification, and retrieval-augmented generation (RAG) pipelines.

Endpoint

POST /v1/embeddings

Requires authentication. See Authentication.
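As a sketch of wiring up authentication with Python's standard library — the base URL and the Bearer-token scheme are assumptions here; the Authentication page is authoritative:

```python
import json
import urllib.request

# Assumptions: a local host and Bearer-token auth. Substitute whatever
# the Authentication page specifies for your deployment.
BASE_URL = "http://localhost:8080"   # hypothetical host
API_KEY = "your-api-key-here"        # placeholder credential

payload = json.dumps({
    "model": "nomic-embed-text",
    "input": "hello world",
}).encode("utf-8")

req = urllib.request.Request(
    f"{BASE_URL}/v1/embeddings",
    data=payload,
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {API_KEY}",
    },
    method="POST",
)
# urllib.request.urlopen(req) would send the request; omitted here.
```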

Request body

| Field | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | The embedding model ID to use. Get available IDs from List Models. |
| input | string or array | Yes | The text to embed. Can be a single string or an array of strings for batch embedding. |
| encoding_format | string | No | The format for the returned vectors. "float" (default) returns an array of floating-point numbers. |

Example — single input

Request:

{
  "model": "nomic-embed-text",
  "input": "The quick brown fox jumps over the lazy dog"
}

Response:

{
  "object": "list",
  "model": "nomic-embed-text",
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": [0.0023, -0.0094, 0.0156, ...]
    }
  ],
  "usage": {
    "prompt_tokens": 10,
    "total_tokens": 10
  }
}

The embedding array contains the vector. Its length (dimensionality) depends on the model — check the dimensions field in the model listing.
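Pulling the vector out of a response is a matter of indexing into data. A quick sketch, using a hand-truncated literal in place of real model output:

```python
# A response shaped like the example above; the vector is truncated
# by hand here, not real model output.
response = {
    "object": "list",
    "model": "nomic-embed-text",
    "data": [
        {"object": "embedding", "index": 0,
         "embedding": [0.0023, -0.0094, 0.0156]},
    ],
    "usage": {"prompt_tokens": 10, "total_tokens": 10},
}

vector = response["data"][0]["embedding"]
dimensions = len(vector)  # model-dependent; compare against the model listing
```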

Example — batch input

Pass an array of strings to embed multiple texts in a single request:

Request:

{
  "model": "nomic-embed-text",
  "input": [
    "Paris is the capital of France.",
    "Berlin is the capital of Germany.",
    "Tokyo is the capital of Japan."
  ]
}

Response:

{
  "object": "list",
  "model": "nomic-embed-text",
  "data": [
    { "object": "embedding", "index": 0, "embedding": [...] },
    { "object": "embedding", "index": 1, "embedding": [...] },
    { "object": "embedding", "index": 2, "embedding": [...] }
  ],
  "usage": {
    "prompt_tokens": 27,
    "total_tokens": 27
  }
}

Results are returned in the same order as the input array.
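Because results come back in input order, pairing each text with its vector is a simple zip; sorting by index first is a cheap safeguard. A sketch with hypothetical two-dimensional vectors standing in for real embeddings:

```python
inputs = [
    "Paris is the capital of France.",
    "Berlin is the capital of Germany.",
    "Tokyo is the capital of Japan.",
]

# Hypothetical response data; real vectors have hundreds of dimensions.
data = [
    {"object": "embedding", "index": 0, "embedding": [0.1, 0.2]},
    {"object": "embedding", "index": 1, "embedding": [0.3, 0.4]},
    {"object": "embedding", "index": 2, "embedding": [0.5, 0.6]},
]

# Sort by index before zipping, in case a client ever reorders entries.
vectors = [item["embedding"] for item in sorted(data, key=lambda d: d["index"])]
pairs = dict(zip(inputs, vectors))
```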

Finding available embedding models

Use the /optima/v1/models endpoint to list available models. Embedding models appear in the embed_models array. Each entry includes the model ID and the vector dimensions. See Listing Models.
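Assuming each embed_models entry carries an id and a dimensions field (the Listing Models page documents the exact shape), picking out the available embedding models might look like:

```python
# Hypothetical fragment of a /optima/v1/models response; field names
# other than "embed_models" are assumptions based on the description above.
listing = {
    "embed_models": [
        {"id": "nomic-embed-text", "dimensions": 768},
        {"id": "all-minilm", "dimensions": 384},
    ],
}

# Map each embedding model ID to its vector dimensionality.
dims_by_id = {m["id"]: m["dimensions"] for m in listing["embed_models"]}
```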

Notes

  • Embedding models are separate from language models. You cannot use a chat model ID with this endpoint, or an embedding model ID with /v1/chat/completions.
  • Embeddings are only comparable when they come from the same model. Vectors from different models (or even the same model at different quantisations) are not compatible, even when their dimensions match, so use one embedding model consistently throughout a project.
  • For RAG pipelines, embed both your document corpus and your query with the same model, then compare using cosine similarity or dot product.
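The cosine-similarity comparison mentioned above needs no library support. A self-contained sketch with toy vectors:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy vectors; real embeddings have hundreds of dimensions.
query = [1.0, 0.0]
doc_close = [0.9, 0.1]
doc_far = [0.0, 1.0]

# The semantically closer document scores higher against the query.
assert cosine_similarity(query, doc_close) > cosine_similarity(query, doc_far)
```

For normalised vectors the dot product alone gives the same ranking, which is why either metric works as the doc suggests.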