Embeddings
The embeddings endpoint generates vector representations of text. These can be used for semantic search, clustering, classification, and retrieval-augmented generation (RAG) pipelines.
Endpoint
POST /v1/embeddings
Requires authentication. See Authentication.
Request body
| Field | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | The embedding model ID to use. Get available IDs from List Models. |
| input | string or array | Yes | The text to embed. Can be a single string or an array of strings for batch embedding. |
| encoding_format | string | No | The format for the returned vectors. "float" (default) returns an array of floating-point numbers. |
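As an illustration of the schema above, a small helper (hypothetical, not part of any official client) can build and locally validate a request body before sending it:

```python
def build_embeddings_request(model, input, encoding_format=None):
    """Build a request body for POST /v1/embeddings.

    `input` may be a single string or a list of strings for batch embedding.
    The validation here is a local convenience, not something the API mandates.
    """
    if not isinstance(model, str) or not model:
        raise ValueError("model must be a non-empty model ID string")
    is_batch = isinstance(input, list) and all(isinstance(t, str) for t in input)
    if not (isinstance(input, str) or is_batch):
        raise ValueError("input must be a string or a list of strings")
    body = {"model": model, "input": input}
    if encoding_format is not None:
        body["encoding_format"] = encoding_format  # "float" is the default
    return body
```

The returned dict can be serialised with `json.dumps` and sent as the POST body.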
Example — single input
Request:
```json
{
  "model": "nomic-embed-text",
  "input": "The quick brown fox jumps over the lazy dog"
}
```
Response:
```json
{
  "object": "list",
  "model": "nomic-embed-text",
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": [0.0023, -0.0094, 0.0156, ...]
    }
  ],
  "usage": {
    "prompt_tokens": 10,
    "total_tokens": 10
  }
}
```
The embedding array contains the vector. Its length (dimensionality) depends on the model — check the dimensions field in the model listing.
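To pull the vectors out of a parsed response, a minimal sketch (assuming the response JSON has already been decoded into a dict):

```python
def extract_embeddings(response):
    """Return the embedding vectors from a response dict, ordered by index."""
    data = sorted(response["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in data]

# The single-input response shown above (vector truncated for brevity):
response = {
    "object": "list",
    "model": "nomic-embed-text",
    "data": [{"object": "embedding", "index": 0,
              "embedding": [0.0023, -0.0094, 0.0156]}],
    "usage": {"prompt_tokens": 10, "total_tokens": 10},
}
vectors = extract_embeddings(response)
# len(vectors[0]) gives the model's dimensionality
```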
Example — batch input
Pass an array of strings to embed multiple texts in a single request:
Request:
```json
{
  "model": "nomic-embed-text",
  "input": [
    "Paris is the capital of France.",
    "Berlin is the capital of Germany.",
    "Tokyo is the capital of Japan."
  ]
}
```
Response:
```json
{
  "object": "list",
  "model": "nomic-embed-text",
  "data": [
    { "object": "embedding", "index": 0, "embedding": [...] },
    { "object": "embedding", "index": 1, "embedding": [...] },
    { "object": "embedding", "index": 2, "embedding": [...] }
  ],
  "usage": {
    "prompt_tokens": 27,
    "total_tokens": 27
  }
}
```
Results are returned in the same order as the input array.
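A short sketch pairing each input text with its vector; it sorts by `index` defensively, even though the order already matches the input array:

```python
def pair_texts_with_vectors(texts, response):
    """Map each input text to its embedding from a batch response dict."""
    data = sorted(response["data"], key=lambda item: item["index"])
    if len(data) != len(texts):
        raise ValueError("response does not cover every input text")
    return {text: item["embedding"] for text, item in zip(texts, data)}
```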
Finding available embedding models
Use the /optima/v1/models endpoint to list available models. Embedding models appear in the embed_models array. Each entry includes the model ID and the vector dimensions. See Listing Models.
Notes
- Embedding models are separate from language models. You cannot use a chat model ID with this endpoint, or an embedding model ID with /v1/chat/completions.
- Use consistent vector dimensions throughout a project: embeddings from different models (or even the same model at different quantisations) are not compatible with each other.
- For RAG pipelines, embed both your document corpus and your query with the same model, then compare using cosine similarity or dot product.
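Cosine similarity can be computed with nothing but the standard library; a sketch (real pipelines typically use numpy or a vector database instead):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors, in [-1, 1]."""
    if len(a) != len(b):
        raise ValueError("vectors must have equal dimensions (same model)")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cannot compare a zero vector")
    return dot / (norm_a * norm_b)
```

To rank documents for a query, embed the query, compute its similarity against each document vector, and sort descending.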