---
title: Create Embedding
sidebarTitle: POST /embed
description: Get embeddings. Returns a 424 status code if the model is not an embedding model.
---
Example request body for generating an embedding from a dense embedding model:

```json
{
  "inputs": "The model input",
  "prompt_name": null,
  "normalize": true,
  "truncate": false,
  "truncation_direction": "right"
}
```
```bash
curl -X POST \
  -H "Content-Type: application/json" \
  -d '{"inputs": "test input"}' \
  --url "http://$ENDPOINT/embed"
```
```python
import requests

endpoint = "<your-custom-endpoint>"

# Batched request: a list of inputs returns one embedding per input
requests.post(f"{endpoint}/embed", json={
    "inputs": ["test input", "test input 2"]
})

# Or a single input
requests.post(f"{endpoint}/embed", json={
    "inputs": "test single input"
})
```
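The examples above discard the response. A minimal sketch of reading it, assuming the endpoint returns a JSON array with one embedding (a list of floats) per input, and reusing the request-body fields shown earlier:

```python
import requests

endpoint = "<your-custom-endpoint>"  # replace with your deployment URL

response = requests.post(f"{endpoint}/embed", json={
    "inputs": ["test input", "test input 2"],
    "normalize": True,            # unit-normalize each vector, as in the request body above
    "truncate": False,
    "truncation_direction": "right",
})
response.raise_for_status()

embeddings = response.json()      # one vector per input
print(len(embeddings))            # 2
print(len(embeddings[0]))         # the model's embedding dimension
```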
On failure, the endpoint returns a JSON error body whose `error_type` field indicates the class of failure:

```json
{
  "error": "Batch size error",
  "error_type": "validation"
}
```

```json
{
  "error": "Tokenization error",
  "error_type": "validation"
}
```

```json
{
  "error": "Inference failed",
  "error_type": "backend"
}
```

```json
{
  "error": "Model is overloaded",
  "error_type": "overloaded"
}
```
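A sketch of client-side handling for these error bodies; the retry-on-overload policy is illustrative, not part of the API, and the endpoint URL is a placeholder:

```python
import time
import requests

endpoint = "<your-custom-endpoint>"  # hypothetical deployment URL

def embed_with_retry(inputs, retries=3, backoff=1.0):
    """POST to /embed, retrying only when the model reports it is overloaded."""
    for attempt in range(retries):
        response = requests.post(f"{endpoint}/embed", json={"inputs": inputs})
        if response.ok:
            return response.json()
        body = response.json()
        if body.get("error_type") == "overloaded":
            time.sleep(backoff * (2 ** attempt))  # exponential backoff, then retry
            continue
        # validation and backend errors will not succeed on retry
        raise RuntimeError(f"{body.get('error_type')}: {body.get('error')}")
    raise RuntimeError("Model still overloaded after retries")
```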
The `prompt_name` field selects the prompt used for encoding. It must be a key in the sentence-transformers configuration `prompts` dictionary. For example, if `prompt_name` is `"doc"`, the sentence "How to get fast inference?" will be encoded as "doc: How to get fast inference?", because the prompt text is prepended to any text to encode. An example request follows below.