title: Install Trieve vector inference
description: Install Trieve Vector Inference
icon: files

Installation Requirements

eksctl >= 0.171 (eksctl installation guide)
aws >= 2.15 (aws installation guide)
kubectl >= 1.28 (kubectl installation guide)
helm >= 3.14 (helm installation guide)
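As a quick sanity check before continuing, the sketch below reports which of the four tools are on your PATH and defines a small helper for comparing version strings. The helper name `version_ge` is hypothetical (it is not part of any of these tools), and it assumes a `sort` that supports `-V` (GNU coreutils and recent BSD/macOS versions do).

```shell
# Succeeds when version $1 is greater than or equal to version $2.
# Assumes `sort -V` (version sort) is available.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Report whether each required tool is on PATH.
for tool in eksctl aws kubectl helm; do
    if command -v "$tool" >/dev/null 2>&1; then
        echo "$tool: found"
    else
        echo "$tool: missing"
    fi
done
```

Each tool prints its version in a slightly different format (for example via `eksctl version`, `aws --version`, `kubectl version --client`, and `helm version --short`), so extract the numeric part by hand and compare it with something like `version_ge 0.171.0 0.171`.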

You'll also need a license to run Trieve Vector Inference.

Getting your license

(contact us here)

Check AWS quota

Ensure you have quota for the following:

At least 4 vCPUs for On-Demand G and VT instances in the region of choice.

Check quota for us-east-2 here

At least 1 load balancer for each model you want to deploy.

Check quota for us-east-2 here
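If you prefer the command line to the console, the vCPU quota can probably be read with the AWS CLI's `service-quotas` commands. The following is a minimal sketch, not a verified procedure: it assumes the quota is listed under the name "Running On-Demand G and VT instances" in the `ec2` service, that your credentials allow `servicequotas:ListServiceQuotas`, and it falls back to `us-east-2` when `AWS_REGION` is unset. The helper `quota_covers` is a hypothetical name introduced here.

```shell
# Succeeds when the quota value $1 (e.g. "8.0") covers the required vCPUs $2.
quota_covers() {
    awk -v have="$1" -v need="$2" 'BEGIN { exit !(have + 0 >= need + 0) }'
}

if command -v aws >/dev/null 2>&1; then
    # Assumption: the quota name below matches what your account shows.
    gpu_quota="$(aws service-quotas list-service-quotas \
        --service-code ec2 \
        --region "${AWS_REGION:-us-east-2}" \
        --query "Quotas[?QuotaName=='Running On-Demand G and VT instances'].Value" \
        --output text)" || gpu_quota=""
    if quota_covers "$gpu_quota" 4; then
        echo "G and VT quota OK: $gpu_quota vCPUs"
    else
        echo "G and VT quota too low: ${gpu_quota:-unknown} vCPUs (need at least 4)"
    fi
else
    echo "aws CLI not found; check the quota in the Service Quotas console"
fi
```

The load balancer quota is not covered by this sketch; check it through the link above.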

Deploying the Cluster

Setting up environment variables

Create the EKS cluster and install the needed plugins

Your AWS Account ID:

export AWS_ACCOUNT_ID="$(aws sts get-caller-identity --query "Account" --output text)"

Your AWS region:

export AWS_REGION=us-east-2

Your Kubernetes cluster name:

export CLUSTER_NAME=trieve-gpu

Your machine types. We recommend g4dn.xlarge for the GPU nodes, as it is the cheapest on AWS. A single small CPU node is also needed for utility workloads.

export CPU_INSTANCE_TYPE=t3.small
export GPU_INSTANCE_TYPE=g4dn.xlarge
export GPU_COUNT=1
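Before creating the cluster, it may be worth confirming that every variable above is actually set in the current shell. A minimal sketch, where `missing_vars` is a hypothetical helper name:

```shell
# Print the name of every listed variable that is unset or empty.
missing_vars() {
    for name in "$@"; do
        eval "value=\${$name:-}"
        [ -n "$value" ] || echo "$name"
    done
}

missing="$(missing_vars AWS_ACCOUNT_ID AWS_REGION CLUSTER_NAME \
    CPU_INSTANCE_TYPE GPU_INSTANCE_TYPE GPU_COUNT)"

if [ -n "$missing" ]; then
    echo "Missing variables:"
    echo "$missing"
else
    echo "All required variables are set"
fi
```

An empty `AWS_ACCOUNT_ID` usually means the `aws sts get-caller-identity` call failed, for example because no credentials are configured.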

Create your cluster

sh ./create_cluster.sh

This will take around 25 minutes to complete.
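Once the script finishes, you can check that the nodes have joined the cluster. The sketch below counts nodes whose STATUS column is exactly `Ready`. It assumes the script has already pointed your kubeconfig at the new cluster, and `count_ready_nodes` is a hypothetical helper name.

```shell
# Reads `kubectl get nodes --no-headers` output on stdin and prints
# how many nodes report exactly "Ready" in the STATUS column.
count_ready_nodes() {
    awk '$2 == "Ready" { n++ } END { print n + 0 }'
}

if command -v kubectl >/dev/null 2>&1; then
    ready="$(kubectl get nodes --no-headers 2>/dev/null | count_ready_nodes)"
    echo "Ready nodes: $ready"
else
    echo "kubectl not found; skipping node check"
fi
```

With the settings above you would presumably expect the small CPU node plus `GPU_COUNT` GPU nodes, though the exact layout depends on what the creation script provisions.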

Install Trieve Vector Inference

Specify your embedding models

Modify embedding_models.yaml to list the models that you want to use.

Install the helm chart

helm upgrade -i vector-inference \
    oci://709825985650.dkr.ecr.us-east-1.amazonaws.com/trieve/trieve-embeddings  \
    -f embedding_models.yaml

Get your model endpoints

kubectl get ingress
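Load balancers can take a few minutes to be assigned, so an ingress may show no address at first. The sketch below prints each ingress name with its load balancer hostname using kubectl's JSONPath output, then lists any ingress that has no hostname yet. It assumes the chart creates its ingresses in the current namespace and that AWS reports a hostname rather than an IP; `pending_ingresses` is a hypothetical helper name.

```shell
# Reads "name<TAB>hostname" lines on stdin and prints the names
# of ingresses that have no hostname assigned yet.
pending_ingresses() {
    awk -F '\t' '$1 != "" && $2 == "" { print $1 }'
}

if command -v kubectl >/dev/null 2>&1; then
    endpoints="$(kubectl get ingress \
        -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.loadBalancer.ingress[*].hostname}{"\n"}{end}' \
        2>/dev/null)" || endpoints=""
    printf '%s\n' "$endpoints"
    pending="$(printf '%s\n' "$endpoints" | pending_ingresses)"
    if [ -n "$pending" ]; then
        echo "Still waiting for a load balancer on:"
        echo "$pending"
    fi
else
    echo "kubectl not found; skipping endpoint check"
fi
```

Re-run the check until nothing is reported as pending, then use the hostnames as your model endpoints.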