DashVector: Vectorize image data by using open source embedding models of ModelScope

Last Updated: Apr 11, 2024

This topic describes how to vectorize image data by using the visual embedding models of ModelScope and import the resulting vectors into DashVector for vector search.

ModelScope aims to build a next-generation open source model-as-a-service (MaaS) platform that gives AI developers flexible, easy-to-use, and cost-efficient one-stop model services.

By bringing together industry-leading pre-trained models, ModelScope aims to reduce duplicated R&D effort and to provide a more environmentally friendly and more open environment for AI development and model services, in support of the digital economy. ModelScope publishes a wide range of high-quality models as open source. Developers can download and try these models free of charge.

On ModelScope, you can:

  • Use and download pre-trained models free of charge.

  • Run model inference from the command line to validate model performance quickly and easily.

  • Fine-tune models with your own data for customization.

  • Engage in theoretical and practical training to effectively improve your R&D abilities.

  • Share your ideas with the entire community.

Prerequisites

  • DashVector:

  • ModelScope:

    • The latest version of the SDK is installed. To install it, run the pip install -U modelscope command.

Similarity search model based on product characteristics extracted from product images

Overview

This model extracts product characteristics based on product images, vectorizes the characteristics, and then allows consumers to carry out large-scale searches for the same or similar products. The model automatically performs image matting for luggage products and extracts product characteristics based on image matting results without additional input.

| Model ID | Vector dimensions | Distance metric | Vector data type | Remarks |
| --- | --- | --- | --- | --- |
| damo/cv_resnet50_product-bag-embedding-models | 512 | Cosine | Float32 | |
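
The distance metric listed for this model is cosine. As a rough illustration of what that means, the following minimal sketch computes cosine similarity in plain Python and ranks a few stored vectors against a query vector. The names `cosine_similarity` and `top_k` and the toy 3-dimensional vectors are assumptions made for this example. They are not part of the ModelScope or DashVector SDKs. DashVector performs the equivalent search on the server side, and its score convention may differ from the raw similarity shown here.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError(f"dimension mismatch: {len(a)} != {len(b)}")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def top_k(
    query: Sequence[float],
    stored: Dict[str, Sequence[float]],
    k: int = 2,
) -> List[Tuple[str, float]]:
    """Brute-force search: score every stored vector and keep the k most similar."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in stored.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    # Toy 3-dimensional vectors; real embeddings from this model have 512 dimensions.
    stored = {
        "bag-a": [1.0, 0.0, 0.0],
        "bag-b": [0.9, 0.1, 0.0],
        "shoe-c": [0.0, 1.0, 0.0],
    }
    for doc_id, score in top_k([1.0, 0.0, 0.0], stored):
        print(doc_id, round(score, 4))
```

Because cosine similarity depends only on direction, scaling a vector does not change its score, which is why the metric suits embeddings whose magnitudes vary between images.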

Example

Note

You must perform the following operations for the code to run properly:

  1. Replace {your-dashvector-api-key} in the sample code with your DashVector API key.

  2. Replace {your-dashvector-cluster-endpoint} in the sample code with the endpoint of your DashVector cluster.

from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks
from dashvector import Client


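# Build a ModelScope pipeline that extracts product embeddings from images.
product_embedding = pipeline(
    Tasks.product_retrieval_embedding,
    model='damo/cv_resnet50_product-bag-embedding-models'
)


def generate_embeddings(img: str):
    # Run the pipeline on the image (a URL in this sample) and return its embedding.
    result = product_embedding(img)
    return result['img_embedding']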


# Create a DashVector client.
client = Client(
    api_key='{your-dashvector-api-key}',
    endpoint='{your-dashvector-cluster-endpoint}'
)

# Create a DashVector collection whose dimension matches the model's 512-dimensional output.
rsp = client.create('resnet50-embedding', dimension=512)
assert rsp
collection = client.get('resnet50-embedding')
assert collection

# Convert the image into a vector and store it in DashVector.
img_url = 'https://mmsearch.oss-cn-zhangjiakou.aliyuncs.com/maas_test_img/tb_image_share_1666002161794.jpg'
collection.insert(
    ('ID1', generate_embeddings(img_url))
)

# Perform a vector search.
docs = collection.query(
    generate_embeddings(img_url)
)
print(docs)
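
Instead of pasting the API key and endpoint into the script, you might read them from environment variables. The following minimal sketch assumes two hypothetical variable names, `DASHVECTOR_API_KEY` and `DASHVECTOR_ENDPOINT`, chosen only for this example. The helper `load_client_kwargs` is likewise illustrative. It returns keyword arguments that match the `Client(api_key=..., endpoint=...)` call used in the sample.

```python
import os
from typing import Dict, Mapping, Optional

# Hypothetical variable names chosen for this sketch.
API_KEY_VAR = "DASHVECTOR_API_KEY"
ENDPOINT_VAR = "DASHVECTOR_ENDPOINT"


def load_client_kwargs(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Read the DashVector credentials from a mapping (os.environ by default)."""
    env = os.environ if env is None else env
    # Treat unset and empty variables the same way: both count as missing.
    missing = [name for name in (API_KEY_VAR, ENDPOINT_VAR) if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {"api_key": env[API_KEY_VAR], "endpoint": env[ENDPOINT_VAR]}
```

With both variables set, the client in the sample could be created with `Client(**load_client_kwargs())`, which keeps credentials out of source control.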