Install Trieve Vector Inference
- eksctl >= 0.171 (eksctl installation guide)
- aws >= 2.15 (aws installation guide)
- kubectl >= 1.28 (kubectl installation guide)
- helm >= 3.14 (helm installation guide)
You'll also need a license to run Trieve Vector Inference (contact us here).
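The minimum tool versions listed above can be checked before starting. The following is a minimal sketch, not part of the official setup: the helper names (`version_ge`, `extract_version`, `check_tool`) are made up for illustration, and it assumes a `sort` that supports `-V` (GNU coreutils or a recent BSD `sort`) and a `grep` that supports `-o`.

```shell
#!/bin/sh
# Sketch: verify the CLI tools listed above meet the minimum versions.
# Helper names are illustrative and not part of any Trieve tooling.

# Succeeds when version $1 >= version $2 (relies on `sort -V`).
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

# Reads version output on stdin and prints the first X.Y or X.Y.Z number.
extract_version() {
  grep -oE '[0-9]+\.[0-9]+(\.[0-9]+)?' | head -n1
}

# Usage: check_tool NAME MINIMUM VERSION_COMMAND...
check_tool() {
  name=$1; minimum=$2; shift 2
  if ! command -v "$name" >/dev/null 2>&1; then
    echo "$name: not installed (need >= $minimum)"
    return 0
  fi
  found=$("$@" 2>&1 | extract_version)
  if version_ge "$found" "$minimum"; then
    echo "$name: $found (ok)"
  else
    echo "$name: ${found:-unknown version} does not satisfy >= $minimum"
  fi
}

check_tool eksctl  0.171 eksctl version
check_tool aws     2.15  aws --version
check_tool kubectl 1.28  kubectl version --client
check_tool helm    3.14  helm version --short
```

A version-aware comparison matters here because a plain string comparison would rank `3.9` above `3.14`.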
Ensure you have sufficient quotas for:
- At least 4 vCPUs for On-Demand G and VT instances in your region of choice.
Check quota for us-east-2 here
- At least 1 load balancer for each model you want to deploy.
Check quota for us-east-2 here
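The quota requirements above can also be checked from the command line with `aws service-quotas get-service-quota`. This is a rough sketch under several assumptions: the quota codes `L-DB2E81BA` (Running On-Demand G and VT instances) and `L-53DA6B97` (Application Load Balancers per Region) are quoted from memory and should be confirmed in the Service Quotas console; that the ingress is backed by Application Load Balancers is also an assumption; `MODEL_COUNT` is a hypothetical variable for the number of models you plan to deploy.

```shell
#!/bin/sh
# Sketch: compare AWS service quotas with the minimums listed above.
# Quota codes are assumptions; confirm them in the Service Quotas console.
REGION="${AWS_REGION:-us-east-2}"
MODEL_COUNT="${MODEL_COUNT:-1}"   # hypothetical: number of models to deploy

# Succeeds when quota value $1 (may be fractional, e.g. "4.0") is >= $2.
quota_at_least() {
  awk -v have="$1" -v need="$2" 'BEGIN { exit !(have + 0 >= need + 0) }'
}

# Usage: check_quota LABEL SERVICE_CODE QUOTA_CODE NEEDED
check_quota() {
  value=$(aws service-quotas get-service-quota \
    --service-code "$2" --quota-code "$3" --region "$REGION" \
    --query 'Quota.Value' --output text 2>/dev/null) || value=""
  if [ -z "$value" ]; then
    echo "$1: could not read quota (is the aws CLI installed and configured?)"
  elif quota_at_least "$value" "$4"; then
    echo "$1: $value (ok, need $4)"
  else
    echo "$1: $value is below the required $4"
  fi
}

check_quota "On-Demand G and VT vCPUs" ec2 L-DB2E81BA 4
check_quota "Application Load Balancers" elasticloadbalancing L-53DA6B97 "$MODEL_COUNT"
```

If a value comes back below the requirement, request an increase before creating the cluster; otherwise the GPU node group or the per-model load balancers may fail to provision.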
Create EKS cluster and install needed plugins
Your AWS Account ID:
export AWS_ACCOUNT_ID="$(aws sts get-caller-identity --query "Account" --output text)"
Your AWS region:
export AWS_REGION=us-east-2
Your Kubernetes cluster name:
export CLUSTER_NAME=trieve-gpu
Your machine types. We recommend g4dn.xlarge for the GPU nodes, as it is the cheapest on AWS. A single small CPU node is also needed for auxiliary workloads.
export CPU_INSTANCE_TYPE=t3.small
export GPU_INSTANCE_TYPE=g4dn.xlarge
export GPU_COUNT=1
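Before creating the cluster, it can help to confirm that every variable exported above is actually set in the current shell. The sketch below is illustrative only: `require_vars` is a hypothetical helper, and it assumes the cluster creation script reads these variables from the environment.

```shell
#!/bin/sh
# Sketch: report any variable from the steps above that is unset or empty.
# require_vars is a hypothetical helper, not part of the Trieve scripts.
require_vars() {
  missing=0
  for var in "$@"; do
    # Indirect lookup: read the value of the variable whose name is in $var.
    eval "value=\${$var:-}"
    if [ -z "$value" ]; then
      echo "missing: $var" >&2
      missing=1
    fi
  done
  return "$missing"
}

if require_vars AWS_ACCOUNT_ID AWS_REGION CLUSTER_NAME \
     CPU_INSTANCE_TYPE GPU_INSTANCE_TYPE GPU_COUNT; then
  echo "all variables set"
else
  echo "set the variables above before creating the cluster" >&2
fi
```

An empty `AWS_ACCOUNT_ID` usually means the `aws sts get-caller-identity` call failed, for example because no credentials are configured.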
sh ./create_cluster.sh
This takes around 25 minutes to complete.
Modify embedding_models.yaml for the models that you want to use.
helm upgrade -i vector-inference \
oci://709825985650.dkr.ecr.us-east-1.amazonaws.com/trieve/trieve-embeddings \
-f embedding_models.yaml
kubectl get ingress
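The external address reported by `kubectl get ingress` can also be read in a script using kubectl's JSONPath output. This is a small sketch under stated assumptions: `first_ingress_host` is a hypothetical helper, it looks only at the first ingress in the current namespace, and it assumes the load balancer publishes a hostname (as AWS load balancers typically do) rather than an IP.

```shell
#!/bin/sh
# Sketch: read the external hostname of an ingress once it has been assigned.
# first_ingress_host is a hypothetical helper name.

# Prints the hostname of the first ingress in the current namespace, or nothing.
first_ingress_host() {
  kubectl get ingress \
    -o jsonpath='{.items[0].status.loadBalancer.ingress[0].hostname}' 2>/dev/null
}

host=$(first_ingress_host) || host=""
if [ -n "$host" ]; then
  echo "ingress address: $host"
else
  echo "no ingress address yet; the load balancer may still be provisioning"
fi
```

With several models deployed there will be one ingress per model, so the index in `.items[0]` would need to change, or the lookup would need to target an ingress by name.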
curl -X POST -H "Content-Type: application/json" -d '{"inputs": "cancer", "model": "en"}'
CLUSTER_NAME=trieve-gpu
REGION=us-east-2
helm uninstall vector-inference
helm uninstall nvdp -n kube-system
helm uninstall aws-load-balancer-controller -n kube-system
eksctl delete cluster --region=${REGION} --name=${CLUSTER_NAME}