---
title: Create Embedding
sidebarTitle: POST /embed
description: Get embeddings. Returns a 424 status code if the model is not an embedding model.
---

Generating an embedding from a dense embedding model:

```json Request
{
  "inputs": "The model input",
  "prompt_name": null,
  "normalize": true,
  "truncate": false,
  "truncation_direction": "right"
}
```
```bash cURL
curl -X POST \
     -H "Content-Type: application/json" \
     -d '{"inputs": "test input"}' \
     --url "http://$ENDPOINT/embed"
```
```python Python
import requests

endpoint = "<your-custom-endpoint>"

# Embed a batch of inputs
response = requests.post(f"{endpoint}/embed", json={
    "inputs": ["test input", "test input 2"]
})

# or embed a single input
response = requests.post(f"{endpoint}/embed", json={
    "inputs": "test single input"
})
```
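The optional fields from the request schema above can be combined in a single call. A minimal sketch, with values mirroring the schema example rather than required settings:

```python
# Sketch: every field except "inputs" is optional; the values shown
# mirror the request schema example above.
response = requests.post(f"{endpoint}/embed", json={
    "inputs": "The model input",
    "prompt_name": None,             # no prompt applied
    "normalize": True,               # return normalized embeddings
    "truncate": False,               # error instead of truncating long inputs
    "truncation_direction": "right"
})
```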
```json 200 Embeddings
[
  [
    0.038483415,
    -0.00076982786,
    -0.020039458,
    ...
  ],
  [
    0.04496114,
    -0.039057795,
    -0.022400795,
    ...
  ]
]
```
```json 413 Batch Size Error
{
    "error": "Batch size error",
    "error_type": "validation"
}
```

```json 422 Tokenization Error
{
    "error": "Tokenization error",
    "error_type": "validation"
}
```

```json 424 Embedding Error
{
    "error": "Inference failed",
    "error_type": "backend"
}
```

```json 429 Model Overloaded
{
    "error": "Model is overloaded",
    "error_type": "overloaded"
}
```
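Every error response carries an `error` message and an `error_type`, so clients can branch on either the HTTP status or the type. A minimal sketch, assuming the `endpoint` variable from the Python example above:

```python
response = requests.post(f"{endpoint}/embed", json={"inputs": "test input"})

if response.ok:
    embeddings = response.json()  # list of embedding vectors, one per input
elif response.json().get("error_type") == "overloaded":
    pass  # model is overloaded: back off and retry later
else:
    body = response.json()
    raise RuntimeError(f"embed failed ({response.status_code}): {body['error']}")
```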
- `inputs`: Inputs that need to be embedded.
- `prompt_name`: The name of the prompt that should be used for encoding. If not set, no prompt is applied. Must be a key in the sentence-transformers configuration `prompts` dictionary. For example, if `prompt_name` is `"doc"`, the sentence "How to get fast inference?" is encoded as "doc: How to get fast inference?" because the prompt text is prepended to any text to encode (see the sketch after this list).
- `truncate`: Automatically truncate inputs that are longer than the maximum supported size.
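A minimal sketch of the `prompt_name` behavior, assuming the deployed model's sentence-transformers configuration defines a prompt named `"doc"`:

```python
# Assumption: the model's sentence-transformers config has a "doc" prompt.
# The server prepends that prompt, so the text is encoded as
# "doc: How to get fast inference?".
response = requests.post(f"{endpoint}/embed", json={
    "inputs": "How to get fast inference?",
    "prompt_name": "doc"
})
```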