AI has become an essential tool for discovering insights faster from large and diverse datasets. However, setting up AI models can be challenging. Nvidia NIM (Nvidia Inference Microservices) provides prebuilt containers that simplify deploying generative AI models on your own infrastructure or in the cloud. With EDB's AI Accelerator, you can easily integrate various models with Postgres and use AI directly in your database. In this post, we will walk through setting up a NIM model with Docker and connecting it to Postgres.
Accessing NIM Models
To use NIM models, you can either:
- Use Nvidia's cloud service.
- Deploy a model using Docker.
Setting Up with Docker
For this tutorial, we will use Ubuntu 24.04 LTS on an EC2 g5.8xlarge instance (which provides a single NVIDIA A10G GPU) with 1024 GB of gp3 storage.
1. Install Nvidia CUDA Toolkit
Download and install the CUDA Toolkit from Nvidia's official page.
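On Ubuntu 24.04, Nvidia's network repository is usually the simplest route. A minimal sketch, assuming an x86_64 instance and the keyring package currently listed on that page:

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2404/x86_64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
sudo apt-get update
sudo apt-get install -y cuda-toolkit
# If the instance has no GPU driver yet, install one from the same repo as well
# Then verify the GPU is visible before continuing
nvidia-smi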
2. Install Docker
Download and install Docker; instructions can be found here.
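Docker's convenience script is the quickest option on Ubuntu. Note that the --runtime=nvidia flag used later also requires the NVIDIA Container Toolkit; a sketch, assuming its apt repository has been configured per Nvidia's install guide:

curl -fsSL https://get.docker.com | sudo sh
# Let Docker containers see the GPUs
sudo apt-get install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker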
3. Generate an NGC API Key
Obtain an API key from Nvidia NGC.
Log in to the NGC container registry:
docker login nvcr.io
Use the following credentials:
Username: $oauthtoken
Password: <NGC API KEY>
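For scripted setups, the login can also be done non-interactively; the username is the literal string $oauthtoken, so keep it single-quoted:

echo "$NGC_API_KEY" | docker login nvcr.io --username '$oauthtoken' --password-stdin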
4. Install Nvidia NGC CLI
Download and install the Nvidia NGC CLI from here.
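A sketch of the usual flow after downloading the Linux archive from that page (the file and directory names below are placeholders taken from the download instructions):

unzip ngccli_linux.zip
chmod u+x ngc-cli/ngc
export PATH="$PATH:$(pwd)/ngc-cli"
# Store the NGC API key in the CLI config; paste it when prompted
ngc config set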
5. Run the NIM Model
Save the following as a shell script and execute it (more information can be found in Nvidia's documentation):
#!/usr/bin/env bash
# Choose a container name
export CONTAINER_NAME=Llama3-8B-Instruct
export NGC_API_KEY="<NGC API KEY>"
# Define the repository and image
REPOSITORY=nim/meta/llama3-8b-instruct
export IMG_NAME="nvcr.io/${REPOSITORY}:latest"
# Set the local cache path so downloaded model weights persist across runs
export LOCAL_NIM_CACHE=~/.cache/nim
mkdir -p "$LOCAL_NIM_CACHE"
# Start the LLM NIM container
docker run -it --rm --name="$CONTAINER_NAME" \
  --runtime=nvidia \
  --gpus all \
  --shm-size=16GB \
  -e NGC_API_KEY="$NGC_API_KEY" \
  -v "$LOCAL_NIM_CACHE:/opt/nim/.cache" \
  -u "$(id -u)" \
  -p 8000:8000 \
  "$IMG_NAME"
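On first start the container downloads the model weights into the cache, which can take a while. From another terminal, you can poll the NIM health endpoint until the service reports ready (assuming port 8000 is published as above):

curl -s http://0.0.0.0:8000/v1/health/ready
# List the models this endpoint serves
curl -s http://0.0.0.0:8000/v1/models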
6. Test the Deployment
Run the following command to verify the model is working:
curl -X POST "http://0.0.0.0:8000/v1/chat/completions" \
  -H "Accept: application/json" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "meta/llama3-8b-instruct",
    "messages": [
      {
        "role": "user",
        "content": "Tell me a story"
      }
    ]
  }'
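The response follows the OpenAI-compatible chat completions schema, so the generated text sits under choices[0].message.content. If jq is installed, you can pull out just the text (same request as above, piped through jq):

curl -s -X POST "http://0.0.0.0:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -d '{"model": "meta/llama3-8b-instruct", "messages": [{"role": "user", "content": "Tell me a story"}]}' \
  | jq -r '.choices[0].message.content'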
Setting Up with Cloud
Alternatively, you can use Nvidia's cloud-based NIM. Sign up here.
1. Select a Model
Choose a model from Nvidia's model library.
2. Generate an API Key
API keys can be created from the model’s page.
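Before wiring the key into Postgres, you can sanity-check it against Nvidia's hosted endpoint (a quick test, assuming the meta/llama-3.3-70b-instruct model used in the next section):

curl -s "https://integrate.api.nvidia.com/v1/chat/completions" \
  -H "Authorization: Bearer <NIM API KEY>" \
  -H "Content-Type: application/json" \
  -d '{"model": "meta/llama-3.3-70b-instruct", "messages": [{"role": "user", "content": "Hello"}]}'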
Integrating the Model with AI Accelerator
1. Enable AI Accelerator in EDB Postgres AI
Sign up for EDB’s generative AI features here.
Enable the AI Accelerator extension:
CREATE EXTENSION aidb CASCADE;
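You can confirm the extension is installed by querying the standard Postgres catalog:

SELECT extname, extversion FROM pg_extension WHERE extname = 'aidb';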
2. Register the Model
If Using the Cloud Offering:
SELECT aidb.create_model(
  'my_nim_llm',
  'nim_completions',
  '{"model": "meta/llama-3.3-70b-instruct"}'::JSONB,
  '{"api_key": "<NIM API KEY>"}'::JSONB
);
If Running in a Docker Container:
SELECT aidb.create_model(
  'my_nim_llm',
  'nim_completions',
  '{"model": "meta/llama3-8b-instruct", "url": "http://<NIM_HOST>:8000/v1/chat/completions"}'::JSONB
);

Note that the model name must match the model the container actually serves; in this tutorial that is meta/llama3-8b-instruct.
3. Run the Model
Execute the following query to interact with the model:
SELECT aidb.decode_text('my_nim_llm', 'Tell me a short, one sentence story');
You should see results that resemble the following:
decode_text
----------------------------------------------------------------------------------------
As the clock struck midnight, a single tear fell from the porcelain doll's glassy eye.
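Because the model is now addressable from SQL, you can apply it to relational data directly. A minimal sketch, using a hypothetical support_tickets table (not part of this tutorial) to generate a summary per row:

-- Summarize each ticket body with the registered NIM model
SELECT id,
       aidb.decode_text('my_nim_llm', 'Summarize in one sentence: ' || body) AS summary
FROM support_tickets
LIMIT 5;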
And that's it: you can now use Nvidia NIM models via EDB's AI Accelerator. To explore how AI features in Postgres can enhance your workloads, you can learn more here.