---
title: Create Embedding
sidebarTitle: POST /embed
description: Get embeddings. Returns a 424 status code if the model is not an embedding model.
---

Generating an embedding from a dense embedding model:

```json Request
{
  "inputs": "The model input",
  "prompt_name": null,
  "normalize": true,
  "truncate": false,
  "truncation_direction": "right"
}
```
```bash cURL
curl -X POST \
     -H "Content-Type: application/json" \
     -d '{"inputs": "test input"}' \
     --url "http://$ENDPOINT/embed"
```
```python Python
import requests

endpoint = "<your-custom-endpoint>"

# Embed a batch of inputs
response = requests.post(f"{endpoint}/embed", json={
    "inputs": ["test input", "test input 2"]
})

# or embed a single input
response = requests.post(f"{endpoint}/embed", json={
    "inputs": "test single input"
})
```
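The optional fields from the request schema above can be combined in a single call. A minimal sketch, with values mirroring the schema example rather than required settings:

```python
# Sketch: every field except "inputs" is optional; the values shown
# mirror the request schema example above.
response = requests.post(f"{endpoint}/embed", json={
    "inputs": "The model input",
    "prompt_name": None,             # no prompt applied
    "normalize": True,               # return normalized embeddings
    "truncate": False,               # error instead of truncating long inputs
    "truncation_direction": "right"
})
```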
```json 200 Embeddings
[
  [
    0.038483415,
    -0.00076982786,
    -0.020039458,
    ...
  ],
  [
    0.04496114,
    -0.039057795,
    -0.022400795,
    ...
  ]
]
```
```json 413 Batch Size Error
{
    "error": "Batch size error",
    "error_type": "validation"
}
```

```json 422 Tokenization Error
{
    "error": "Tokenization error",
    "error_type": "validation"
}
```

```json 424 Embedding Error
{
    "error": "Inference failed",
    "error_type": "backend"
}
```

```json 429 Model Overloaded
{
    "error": "Model is overloaded",
    "error_type": "overloaded"
}
```
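Every error response carries an `error` message and an `error_type`, so clients can branch on either the HTTP status or the type. A minimal sketch, assuming the `endpoint` variable from the Python example above:

```python
response = requests.post(f"{endpoint}/embed", json={"inputs": "test input"})

if response.ok:
    embeddings = response.json()  # list of embedding vectors, one per input
elif response.json().get("error_type") == "overloaded":
    pass  # model is overloaded: back off and retry later
else:
    body = response.json()
    raise RuntimeError(f"embed failed ({response.status_code}): {body['error']}")
```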
- `inputs`: Inputs that need to be embedded.
- `prompt_name`: The name of the prompt that should be used for encoding. If not set, no prompt is applied. Must be a key in the sentence-transformers configuration `prompts` dictionary. For example, if `prompt_name` is `"doc"`, the sentence "How to get fast inference?" is encoded as "doc: How to get fast inference?" because the prompt text is prepended to any text to encode (see the sketch after this list).
- `truncate`: Automatically truncate inputs that are longer than the maximum supported size.
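A minimal sketch of the `prompt_name` behavior, assuming the deployed model's sentence-transformers configuration defines a prompt named `"doc"`:

```python
# Assumption: the model's sentence-transformers config has a "doc" prompt.
# The server prepends that prompt, so the text is encoded as
# "doc: How to get fast inference?".
response = requests.post(f"{endpoint}/embed", json={
    "inputs": "How to get fast inference?",
    "prompt_name": "doc"
})
```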