In this sample we will build a multimodal retrieval pipeline that lets us search over both text and images simultaneously. To do this, we define two indexes within our KDB.AI table: one holding the text embeddings and the other holding the image embeddings. We generate both sets of embeddings with OpenAI's CLIP model, which maps text and images into the same 512-dimensional vector space, so a single query vector can be searched against either index.
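As a reference point, here is one way those CLIP embeddings could be produced with the Hugging Face transformers library. This is a minimal sketch, not the sample's own code: the checkpoint choice and the embed_image/embed_text helper names are illustrative assumptions. The ViT-B/32 checkpoint projects both modalities into a 512-dimensional space, which is why the index definitions below use dims of 512.

# Sketch of CLIP embedding helpers via Hugging Face transformers.
# The checkpoint and helper names are illustrative, not from the sample.
from PIL import Image
import torch
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def embed_image(image_path: str) -> list[float]:
    # Encode an image into CLIP's shared space (512 dims for ViT-B/32)
    image = Image.open(image_path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        features = model.get_image_features(**inputs)
    return features[0].tolist()

def embed_text(text: str) -> list[float]:
    # Encode text into the same space, so text queries can match images
    inputs = processor(text=[text], return_tensors="pt", truncation=True)
    with torch.no_grad():
        features = model.get_text_features(**inputs)
    return features[0].tolist()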
# Set up the schema and indexes for the KDB.AI table
schema = [
    {"name": "image_path", "type": "str"},
    {"name": "text_path", "type": "str"},
    {"name": "text", "type": "str"},
    {"name": "image_embedding", "type": "float32s"},
    {"name": "text_embedding", "type": "float32s"},
]

indexes = [
    {
        "name": "image_index_qFlat",
        "type": "qFlat",  # flat (exact, brute-force) index
        "column": "image_embedding",
        "params": {"dims": 512, "metric": "L2"},  # 512 dims to match CLIP output
    },
    {
        "name": "text_index_qFlat",
        "type": "qFlat",
        "column": "text_embedding",
        "params": {"dims": 512, "metric": "L2"},  # L2 = Euclidean distance
    },
]
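With the schema and indexes defined, creating the table and querying both indexes looks roughly like the sketch below, using the kdbai_client package. Treat it as an outline: the endpoint, database and table names, and the query text are placeholder assumptions, and embed_text is the illustrative helper from the sketch above.

# Sketch of table creation and search with kdbai_client; the endpoint,
# names, and query text are placeholders, not values from the sample.
import kdbai_client as kdbai

session = kdbai.Session(endpoint="http://localhost:8082")
db = session.database("default")
table = db.create_table(table="multimodal_demo", schema=schema, indexes=indexes)

# Embed a text query once; because CLIP puts text and images in a shared
# space, the same vector can be searched against either index
query_vector = embed_text("a dog playing in the snow")

# Retrieve the 5 nearest images to the text query
image_hits = table.search(vectors={"image_index_qFlat": [query_vector]}, n=5)

# Retrieve the 5 nearest text chunks to the same query
text_hits = table.search(vectors={"text_index_qFlat": [query_vector]}, n=5)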
Get hands-on and give it a try! See the code on GitHub, or open the notebook directly in Google Colab.