Model name: nim_clip

About CLIP

CLIP (Contrastive Language-Image Pre-training) is a model that learns visual concepts from natural language supervision. It is a zero-shot learning model that can be used for a wide range of vision and language tasks.

This specific model runs on NVIDIA NIM. For more information about CLIP on NIM, see NVIDIA's NIM documentation.

Supported aidb operations

  • encode_text
  • encode_text_batch
  • encode_image
  • encode_image_batch
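
These operations are exposed as SQL functions by the aidb extension. The following is a minimal sketch, assuming each function takes the model name followed by the input (exact signatures may differ across aidb versions) and using the model created in "Creating the default model" below; my_images and image_data are hypothetical names:

-- Encode a single text string into an embedding vector.
SELECT aidb.encode_text('my_nim_clip_model', 'a photo of a cat');

-- Encode a single image, assuming the raw image bytes are passed as bytea.
SELECT aidb.encode_image('my_nim_clip_model',
    (SELECT image_data FROM my_images WHERE id = 1));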

Supported models

NVIDIA NGC

  • nvidia/nvclip (default)

Creating the default model

SELECT aidb.create_model(
    'my_nim_clip_model',
    'nim_clip',
    credentials=>'{"api_key": "<API_KEY_HERE>"}'::JSONB
);

Because nvidia/nvclip is the only available model and is the default, the model does not need to be specified in the configuration.
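
The batch operations reduce round trips to the NIM endpoint when encoding many inputs at once. A hedged sketch, assuming encode_text_batch accepts an array of strings:

-- Encode several text strings in a single call.
SELECT aidb.encode_text_batch(
    'my_nim_clip_model',
    ARRAY['a photo of a cat', 'a photo of a dog']
);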

Model configuration settings

The following configuration settings are available for CLIP models:

  • model - The NIM model to use. The default, nvidia/nvclip, is currently the only model available.
  • url - The URL of the model endpoint. Optional; use it to point at a custom deployment. Defaults to https://integrate.api.nvidia.com/v1/embeddings.
  • dimensions - The size of the model's output vectors. Defaults to 1024.
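
For example, to set these options explicitly, here is a sketch assuming aidb.create_model accepts the configuration as a JSONB argument (all values shown are the documented defaults):

SELECT aidb.create_model(
    'my_nim_clip_model',
    'nim_clip',
    '{"model": "nvidia/nvclip",
      "url": "https://integrate.api.nvidia.com/v1/embeddings",
      "dimensions": 1024}'::JSONB,
    credentials=>'{"api_key": "<API_KEY_HERE>"}'::JSONB
);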

Model credentials

The following credentials are required when executing against NVIDIA NGC:

  • api_key - The NVIDIA Cloud API key to use for authentication.
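
The credentials are passed to aidb.create_model as a JSONB object, as shown in the creation example above:

credentials=>'{"api_key": "<API_KEY_HERE>"}'::JSONB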
