Embedding models convert data such as text, images, and videos into vectors for downstream tasks, including semantic search, recommendation, clustering, classification, and anomaly detection.
Prerequisites
Get an API key and export the API key as an environment variable. If you use the OpenAI SDK or DashScope SDK to make calls, install the SDK.
Get embeddings
Text embedding
Specify the text to embed and the model name in your request.
OpenAI-compatible API
Python
import os
from openai import OpenAI
input_text = "The quality of the clothes is excellent"
client = OpenAI(
    # API keys vary by region. To get an API key, see https://www.alibabacloud.com/help/en/model-studio/get-api-key
    api_key=os.getenv("DASHSCOPE_API_KEY"),  # If the environment variable is not set, replace this with your API key.
    # This base_url is for Singapore. To use a model in China (Beijing), replace the base_url with: https://dashscope.aliyuncs.com/compatible-mode/v1
    base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
)
completion = client.embeddings.create(
    model="text-embedding-v4",
    input=input_text
)
print(completion.model_dump_json())
Node.js
const OpenAI = require("openai");
// Initialize the OpenAI client.
const openai = new OpenAI({
    // If the environment variable is not set, replace this with your API key.
    // API keys vary by region. To get an API key, see https://www.alibabacloud.com/help/en/model-studio/get-api-key
    apiKey: process.env.DASHSCOPE_API_KEY,
    // This baseURL is for Singapore. To use a model in China (Beijing), replace the baseURL with: https://dashscope.aliyuncs.com/compatible-mode/v1
    baseURL: 'https://dashscope-intl.aliyuncs.com/compatible-mode/v1'
});
async function getEmbedding() {
    try {
        const inputTexts = "The quality of the clothes is excellent";
        const completion = await openai.embeddings.create({
            model: "text-embedding-v4",
            input: inputTexts,
            dimensions: 1024 // Specify the vector dimension. This parameter is supported only by text-embedding-v3 and text-embedding-v4.
        });
        console.log(JSON.stringify(completion, null, 2));
    } catch (error) {
        console.error('Error:', error);
    }
}
getEmbedding();
curl
curl --location 'https://dashscope-intl.aliyuncs.com/compatible-mode/v1/embeddings' \
--header "Authorization: Bearer $DASHSCOPE_API_KEY" \
--header 'Content-Type: application/json' \
--data '{
"model": "text-embedding-v4",
"input": "The quality of the clothes is excellent"
}'
DashScope
Python
import dashscope
from http import HTTPStatus
# To use a model in China (Beijing), replace the base_url with: https://dashscope.aliyuncs.com/api/v1
dashscope.base_http_api_url = 'https://dashscope-intl.aliyuncs.com/api/v1'
input_text = "The quality of the clothes is excellent"
resp = dashscope.TextEmbedding.call(
    model="text-embedding-v4",
    input=input_text,
)
if resp.status_code == HTTPStatus.OK:
    print(resp)
Java
import com.alibaba.dashscope.embeddings.TextEmbedding;
import com.alibaba.dashscope.embeddings.TextEmbeddingParam;
import com.alibaba.dashscope.embeddings.TextEmbeddingResult;
import com.alibaba.dashscope.exception.NoApiKeyException;
import com.alibaba.dashscope.utils.Constants;
import java.util.Collections;
public class Main {
    static {
        Constants.baseHttpApiUrl = "https://dashscope-intl.aliyuncs.com/api/v1";
        // For China (Beijing), replace it with: https://dashscope.aliyuncs.com/api/v1
    }
    public static void main(String[] args) {
        String inputTexts = "The quality of the clothes is excellent";
        try {
            // Build the request parameters.
            TextEmbeddingParam param = TextEmbeddingParam
                    .builder()
                    .model("text-embedding-v4")
                    // Input text.
                    .texts(Collections.singleton(inputTexts))
                    .build();
            // Create a model instance and call it.
            TextEmbedding textEmbedding = new TextEmbedding();
            TextEmbeddingResult result = textEmbedding.call(param);
            // Print the result.
            System.out.println(result);
        } catch (NoApiKeyException e) {
            // Catch and handle the exception for an unset API key.
            System.err.println("An exception occurred during the API call: " + e.getMessage());
            System.err.println("Check whether your API key is correctly configured.");
            e.printStackTrace();
        }
    }
}
curl
# ======= Important =======
# To use a model in China (Beijing), replace the URL with: https://dashscope.aliyuncs.com/api/v1/services/embeddings/text-embedding/text-embedding
# === Delete this comment before execution ===
curl --location 'https://dashscope-intl.aliyuncs.com/api/v1/services/embeddings/text-embedding/text-embedding' \
--header "Authorization: Bearer $DASHSCOPE_API_KEY" \
--header 'Content-Type: application/json' \
--data '{
"model": "text-embedding-v4",
"input": {
"texts": [
"The quality of the clothes is excellent"
]
}
}'
Independent multimodal vectors
Generates an independent vector for each modality (text, image, video). Ideal for processing each content type separately.
Use the DashScope SDK or call the API directly to generate independent multimodal vectors. This feature is not supported by the OpenAI compatible interface or the console.
Python
import dashscope
import json
import os
from http import HTTPStatus
dashscope.base_http_api_url = 'https://dashscope-intl.aliyuncs.com/api/v1'
# The base_url above is for Singapore. To use a model in China (Beijing), replace the base_url with: https://dashscope.aliyuncs.com/api/v1
# The input can be a video.
# video = "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/en-US/20250107/lbcemt/new+video.mp4"
# input = [{'video': video}]
# or an image.
image = "https://dashscope.oss-cn-beijing.aliyuncs.com/images/256_1.png"
input = [{'image': image}]
resp = dashscope.MultiModalEmbedding.call(
    # If the environment variable is not set, replace the following line with your Model Studio API key: api_key="sk-xxx",
    # API keys vary by region. To get an API key, see https://www.alibabacloud.com/help/en/model-studio/get-api-key
    api_key=os.getenv('DASHSCOPE_API_KEY'),
    model="tongyi-embedding-vision-plus",
    input=input
)
print(json.dumps(resp.output, indent=4))
Java
import com.alibaba.dashscope.embeddings.MultiModalEmbedding;
import com.alibaba.dashscope.embeddings.MultiModalEmbeddingItemImage;
import com.alibaba.dashscope.embeddings.MultiModalEmbeddingItemVideo;
import com.alibaba.dashscope.embeddings.MultiModalEmbeddingParam;
import com.alibaba.dashscope.embeddings.MultiModalEmbeddingResult;
import com.alibaba.dashscope.exception.ApiException;
import com.alibaba.dashscope.exception.NoApiKeyException;
import com.alibaba.dashscope.exception.UploadFileException;
import com.alibaba.dashscope.utils.Constants;
import java.util.Collections;
public class Main {
    static {
        Constants.baseHttpApiUrl = "https://dashscope-intl.aliyuncs.com/api/v1";
        // For China (Beijing), replace it with: https://dashscope.aliyuncs.com/api/v1
    }
    public static void main(String[] args) {
        try {
            MultiModalEmbedding embedding = new MultiModalEmbedding();
            // The input can be a video.
            // MultiModalEmbeddingItemVideo video = new MultiModalEmbeddingItemVideo(
            //         "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/en-US/20250107/lbcemt/new+video.mp4");
            // or an image.
            MultiModalEmbeddingItemImage image = new MultiModalEmbeddingItemImage(
                    "https://dashscope.oss-cn-beijing.aliyuncs.com/images/256_1.png");
            MultiModalEmbeddingParam param = MultiModalEmbeddingParam.builder()
                    // If the environment variable is not set, add your Model Studio API key, for example: .apiKey("sk-xxx")
                    .model("tongyi-embedding-vision-plus")
                    .contents(Collections.singletonList(image))
                    .build();
            MultiModalEmbeddingResult result = embedding.call(param);
            System.out.println(result);
        } catch (ApiException | NoApiKeyException | UploadFileException e) {
            System.err.println("An exception occurred during the API call: " + e.getMessage());
            e.printStackTrace();
        }
    }
}
Multimodal fused vectors
Combines content from different modalities (text, image, video) into a single fused vector. Suitable for text-to-image search, image-to-image search, text-to-video search, and cross-modal retrieval.
Use the Python DashScope SDK or call the API directly to generate a multimodal fused vector. This feature is not supported by the OpenAI compatible interface, the Java DashScope SDK, or the console.
qwen3-vl-embedding: Generates both fused and independent vectors. To generate a fused vector, set the Boolean parameter enable_fusion to true.
qwen2.5-vl-embedding: Generates only fused vectors.
Python
import dashscope
import json
import os
from http import HTTPStatus
# Multimodal fused vector: Combines text, image, and video into a single fused vector.
# Suitable for use cases like cross-modal retrieval and image search.
text = "This is a test text used to generate a multimodal fused vector"
image = "https://dashscope.oss-cn-beijing.aliyuncs.com/images/256_1.png"
video = "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/en-US/20250107/lbcemt/new+video.mp4"
# The input contains text, image, and video. A fused vector is generated by using the enable_fusion parameter.
input_data = [
    {"text": text},
    {"image": image},
    {"video": video}
]
# Use qwen3-vl-embedding to generate a fused vector.
resp = dashscope.MultiModalEmbedding.call(
    # If the environment variable is not set, replace the following line with your Model Studio API key: api_key="sk-xxx",
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    model="qwen3-vl-embedding",
    input=input_data,
    enable_fusion=True,
    # Optional parameter: Specifies the vector dimension. Supported values: 2560, 2048, 1536, 1024, 768, 512, and 256. Default: 2560.
    # dimension=1024
)
print(json.dumps(resp.output, indent=4))
Java (HTTP)
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
public class Main {
    public static void main(String[] args) throws Exception {
        // If the environment variable is not set, replace the following line with your Model Studio API key: String apiKey = "sk-xxx";
        String apiKey = System.getenv("DASHSCOPE_API_KEY");
        // Multimodal fused vector: Use enable_fusion to combine text, image, and video into a fused vector.
        String requestBody = "{"
                + "\"model\": \"qwen3-vl-embedding\","
                + "\"input\": {"
                + "  \"contents\": ["
                + "    {\"text\": \"This is a test text used to generate a multimodal fused vector\"},"
                + "    {\"image\": \"https://dashscope.oss-cn-beijing.aliyuncs.com/images/256_1.png\"},"
                + "    {\"video\": \"https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/en-US/20250107/lbcemt/new+video.mp4\"}"
                + "  ]"
                + "},"
                + "\"parameters\": {"
                + "  \"enable_fusion\": true"
                + "}"
                + "}";
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://dashscope.aliyuncs.com/api/v1/services/embeddings/multimodal-embedding/multimodal-embedding"))
                .header("Authorization", "Bearer " + apiKey)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(requestBody))
                .build();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}
Model selection
The model to use depends on your input data type and use case.
For plain text or code: Use text-embedding-v4. It is the highest-performing model, supports advanced features such as task instructions and sparse vectors, and covers most text processing use cases.
For multimodal content:
Fused embedding: To represent single-modal or mixed-modal inputs as one fused embedding for use cases such as cross-modal retrieval and image search, use qwen3-vl-embedding. For example, input an image of a shirt with the text "find a similar style that looks more youthful," and the model fuses the image and task instruction into a single embedding.
Independent embedding: To generate a separate embedding for each input, such as an image and its corresponding text caption, use tongyi-embedding-vision-plus, tongyi-embedding-vision-flash, or the general-purpose multimodal model multimodal-embedding-v1.
For large-scale data: Process large-scale, non-real-time text data with text-embedding-v4 and the OpenAI-compatible batch API to reduce costs.
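As a sketch of what a batch workflow consumes, each line of the uploaded JSONL file is one standalone embedding request. The layout below follows the OpenAI batch file format; the custom_id values and the file name are illustrative, and whether your endpoint accepts exactly this shape should be confirmed against the batch documentation.

```python
import json

# Texts to embed offline; the custom_id ties each result back to its source text.
texts = [
    "The quality of the clothes is excellent",
    "I like it and will buy from here again",
]

with open("embedding_batch.jsonl", "w", encoding="utf-8") as f:
    for i, text in enumerate(texts):
        request = {
            "custom_id": f"request-{i}",
            "method": "POST",
            "url": "/v1/embeddings",
            "body": {"model": "text-embedding-v4", "input": text},
        }
        f.write(json.dumps(request) + "\n")

# Each line parses back into an independent request.
with open("embedding_batch.jsonl", encoding="utf-8") as f:
    lines = [json.loads(line) for line in f]
print(len(lines))
```

The file is then uploaded and referenced when creating the batch job; results arrive asynchronously keyed by custom_id.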
Specifications for all available embedding models:
Text embedding
Singapore
Model | Embedding dimensions | Batch size | Max batch tokens (Note) | Price / 1M tokens | Language | Free quota (Note) |
text-embedding-v4 Part of the Qwen3-Embedding series | 2,048, 1,536, 1,024 (default), 768, 512, 256, 128, 64 | 10 | 8,192 | $0.07 | 100+ major languages, including Chinese, English, Spanish, French, Portuguese, Indonesian, Japanese, Korean, German, and Russian | 1 million tokens Valid for 90 days after activating Model Studio |
text-embedding-v3 | 1,024 (default), 768, 512 | 50+ major languages, including Chinese, English, Spanish, French, Portuguese, Indonesian, Japanese, Korean, German, and Russian | 500,000 tokens Valid for 90 days after activating Model Studio |
China (Beijing)
Model | Embedding dimensions | Batch size | Max batch tokens (Note) | Price / 1M tokens | Language |
text-embedding-v4 Part of the Qwen3-Embedding series | 2,048, 1,536, 1,024 (default), 768, 512, 256, 128, 64 | 10 | 8,192 | $0.072 | 100+ major languages, including Chinese, English, Spanish, French, Portuguese, Indonesian, Japanese, Korean, German, and Russian, and multiple programming languages |
China (Hong Kong)
Model | Embedding dimensions | Batch size | Max batch tokens (Note) | Price / 1M tokens | Language |
text-embedding-v4 Part of the Qwen3-Embedding series | 2,048, 1,536, 1,024 (default), 768, 512, 256, 128, 64 | 10 | 8,192 | $0.07 | 100+ major languages, including Chinese, English, Spanish, French, Portuguese, Indonesian, Japanese, Korean, German, and Russian, and multiple programming languages |
Batch size is the maximum number of texts you can process in a single API call. For example, text-embedding-v4 has a batch size of 10, meaning you can include up to 10 texts per request, each not exceeding 8,192 tokens. This limit applies to:
string array input: The array can contain up to 10 elements.
file input: The text file can contain up to 10 lines.
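Because of this limit, a corpus larger than one batch must be split client-side before calling the API. A minimal sketch, assuming text-embedding-v4's limit of 10 texts per call; the commented-out call shows where each batch would be sent.

```python
def chunk(items, size=10):
    """Split a list into consecutive batches of at most `size` elements."""
    return [items[i:i + size] for i in range(0, len(items), size)]

corpus = [f"document {i}" for i in range(23)]
batches = chunk(corpus, size=10)
# Each batch can now be passed as one `input` array:
# for batch in batches:
#     dashscope.TextEmbedding.call(model="text-embedding-v4", input=batch)
print([len(b) for b in batches])  # [10, 10, 3]
```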
Multimodal embedding
The model generates vector embeddings from inputs such as text, images, or videos. These embeddings are used for video classification, image classification, image-text retrieval, text-to-image search, and text-to-video search.
You can upload a single text segment, image, or video file. The API also supports combinations of input types, such as text and images. Some models support multiple inputs of the same type, such as multiple images. For details, refer to the limitations of each specific model.
Singapore
Model | Embedding dimensions | Text length limit | Image size limit | Video size limit | Price (per 1M input tokens) | Free quota (Note) |
tongyi-embedding-vision-plus | 1152 | 1,024 tokens | Up to 3 MB per image. Supports up to 8 images. | Up to 10 MB per video file | Image/Video: $0.09 Text: $0.09 | 1 million tokens Valid for 90 days after activating Model Studio |
tongyi-embedding-vision-flash | 768 | Image/Video: $0.03 Text: $0.09 |
China (Beijing)
Model | Embedding dimensions | Text length limit | Image size limit | Video size limit | Price (per 1M input tokens) |
qwen3-vl-embedding | 2560 (default), 2048, 1536, 1024, 768, 512, 256 | 32,000 tokens | Up to 5 images, up to 5 MB per image | Up to 50 MB per video file | Image/Video: $0.258 Text: $0.1 |
multimodal-embedding-v1 | 1024 | 512 tokens | Up to 8 images, 3 MB each | Up to 10 MB per video file | Free trial |
Input and language restrictions
Fused multimodal models | ||||
Model | Text | Image | Video | Request limit |
qwen3-vl-embedding | Supports 33 major languages, including Chinese, English, Japanese, Korean, French, and German. | JPEG, PNG, WEBP, BMP, TIFF, ICO, DIB, ICNS, SGI (URL or Base64 supported) | MP4, AVI, MOV (URL only) | Up to 20 content elements per request, with a maximum of 5 images and 1 video. |
Independent multimodal models | ||||
Model | Text | Image | Video | Request limit |
tongyi-embedding-vision-plus | Chinese and English | JPG, PNG, BMP (URL or Base64 supported) | MP4, MPEG, MOV, MPG, WEBM, AVI, FLV, MKV (URL only) | No limit on the number of content elements. The total number of input tokens must not exceed the batch processing token limit. |
tongyi-embedding-vision-flash | ||||
multimodal-embedding-v1 | JPG, PNG, BMP (URL or Base64 supported) | Up to 20 content elements per request, with a maximum of 20 text segments, 1 image, and 1 video. | ||
Core features
Custom vector dimensions
The text-embedding-v4, text-embedding-v3, tongyi-embedding-vision-plus, tongyi-embedding-vision-flash, and qwen3-vl-embedding models support custom vector dimensions. Higher dimensions preserve more semantic information but increase storage and compute costs.
General use cases (Recommended): A dimension of 1024 provides an optimal balance between performance and cost, making it ideal for most semantic search tasks.
High-precision scenarios: For high-precision applications, select a dimension of 1536 or 2048. This improves precision but significantly increases storage and compute overhead.
Resource-constrained environments: In cost-sensitive scenarios, select a dimension of 768 or lower. This significantly reduces resource consumption at the cost of some semantic information.
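The storage side of this trade-off can be estimated directly: stored as float32, each embedding occupies its dimension count × 4 bytes. A quick sketch (the one-million-vector corpus size is an arbitrary example, and real vector databases add index overhead on top):

```python
def storage_gb(num_vectors, dims, bytes_per_value=4):
    """Approximate raw storage for float32 embeddings, in GiB."""
    return num_vectors * dims * bytes_per_value / (1024 ** 3)

# Raw storage for one million vectors at several supported dimensions.
for dims in (256, 768, 1024, 2048):
    print(f"{dims:>5} dims: {storage_gb(1_000_000, dims):.2f} GiB")
```

Halving the dimension halves both storage and the cost of every similarity computation, which is why 1024 is the recommended middle ground.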
OpenAI-compatible API
import os
from openai import OpenAI
client = OpenAI(
    # API keys vary by region. To get an API key, see https://www.alibabacloud.com/help/en/model-studio/get-api-key
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    # This is the URL for Singapore. To use a model in the China (Beijing) region, replace `base_url` with: https://dashscope.aliyuncs.com/compatible-mode/v1
    base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
)
resp = client.embeddings.create(
    model="text-embedding-v4",
    input=["I like it and will buy from here again"],
    # Set the vector dimension to 256.
    dimensions=256
)
print(f"Vector dimension: {len(resp.data[0].embedding)}")
DashScope
import dashscope
# To use a model in the China (Beijing) region, replace the base_url with: https://dashscope.aliyuncs.com/api/v1
dashscope.base_http_api_url = 'https://dashscope-intl.aliyuncs.com/api/v1'
resp = dashscope.TextEmbedding.call(
    model="text-embedding-v4",
    input=["I like it and will buy from here again"],
    # Set the vector dimension to 256.
    dimension=256
)
print(f"Vector dimension: {len(resp.output['embeddings'][0]['embedding'])}")
Query vs. document text (text_type)
This parameter is only available through the DashScope SDK and API.
To achieve optimal results in search-related tasks, vectorize content differently based on its intended role. The text_type parameter is designed for this purpose:
text_type: 'query': Use for a user's query text. The model generates a "title-like" vector that is more directional and optimized for "asking" and "finding."
text_type: 'document' (default): Use for the document text stored in your knowledge base. The model generates a "body-like" vector that contains more comprehensive information and is optimized for matching.
When matching a short text against a long text, distinguish between query and document. However, for tasks such as clustering or classification where all texts have the same role, you do not need to set this parameter.
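To make the asymmetric setup concrete, the sketch below assembles the keyword arguments for both roles. build_payload is a hypothetical helper for illustration; each resulting dict would be passed to dashscope.TextEmbedding.call unchanged.

```python
def build_payload(text, text_type="document"):
    """Assemble keyword arguments for a TextEmbedding.call request."""
    return {
        "model": "text-embedding-v4",
        "input": text,
        "text_type": text_type,
    }

# The user's question is embedded as a query...
query_payload = build_payload("How do I return an item?", text_type="query")
# ...while knowledge-base passages keep the default document role.
doc_payload = build_payload("Returns are accepted within 30 days of delivery.")
print(query_payload["text_type"], doc_payload["text_type"])  # query document
```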
Task instructions (instruct)
This parameter is only available through the DashScope SDK and API.
Provide a clear English task instruction to guide the text-embedding-v4 model to optimize vector quality for specific retrieval scenarios. When using this feature, set the text_type parameter to query.
import dashscope
# This base_url is for Singapore. To use a model in China (Beijing), replace it with: https://dashscope.aliyuncs.com/api/v1
dashscope.base_http_api_url = 'https://dashscope-intl.aliyuncs.com/api/v1'
# Scenario: When embedding search queries, add an instruction to optimize vector quality for retrieval.
resp = dashscope.TextEmbedding.call(
    model="text-embedding-v4",
    input="Research papers on machine learning",
    text_type="query",
    instruct="Given a research paper query, retrieve relevant research papers"
)
Dense and sparse vectors
This parameter is only available through the DashScope SDK and API.
The text-embedding-v4 and text-embedding-v3 models support three vector output types for different retrieval strategies.
Vector type (output_type) | Core advantages | Key limitations | Typical use cases |
dense | Deep semantic understanding; identifies synonyms and context, leading to more relevant retrieval results. | Higher compute and storage costs; does not guarantee an exact match for keywords. | Semantic search, AI-powered Q&A, content recommendation. |
sparse | High computational efficiency; focuses on exact match for keywords and enables fast filtering. | Lacks semantic understanding; cannot process synonyms or context. | Log retrieval, product SKU search, precise information filtering. |
dense&sparse | Combines semantic and keyword matching for optimal search results. Generation cost remains the same, and the API call overhead is the same as for single-vector mode. | Requires more storage, and the system architecture and retrieval logic are more complex. | High-quality, production-grade hybrid search engine. |
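How the two signals combine at retrieval time can be sketched locally: score each document once with dense cosine similarity, once with sparse term overlap, and blend the two. The toy vectors, the {token: weight} dicts, and the alpha weighting below are all made-up illustrations, not API output.

```python
import numpy as np

def cosine(a, b):
    """Dense similarity: cosine of the angle between two embeddings."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def sparse_score(query_terms, doc_terms):
    """Sparse similarity: dot product over the shared terms of two {token: weight} dicts."""
    return sum(w * doc_terms[t] for t, w in query_terms.items() if t in doc_terms)

def hybrid_score(dense_q, dense_d, sparse_q, sparse_d, alpha=0.7):
    """Blend semantic and keyword relevance; alpha weights the dense side."""
    return alpha * cosine(dense_q, dense_d) + (1 - alpha) * sparse_score(sparse_q, sparse_d)

# Toy query/document pair.
dense_q, dense_d = np.array([0.1, 0.9]), np.array([0.2, 0.8])
sparse_q = {"sku": 0.8, "red": 0.5}
sparse_d = {"sku": 0.9, "shirt": 0.4}
print(round(hybrid_score(dense_q, dense_d, sparse_q, sparse_d), 3))
```

Production hybrid search usually normalizes or rank-fuses the two scores instead of adding them raw, but the blending idea is the same.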
Examples
Code is for demonstration only. For production, pre-compute and store embeddings in a vector database. At retrieval, you only need to generate the query embedding.
Semantic search
Achieve precise semantic matching by computing embedding similarity between a query and documents.
import dashscope
import numpy as np
from dashscope import TextEmbedding
# If you use a model in the China (Beijing) region, replace the base_url with: https://dashscope.aliyuncs.com/api/v1
dashscope.base_http_api_url = 'https://dashscope-intl.aliyuncs.com/api/v1'
def cosine_similarity(a, b):
    """Calculate cosine similarity."""
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

def semantic_search(query, documents, top_k=5):
    """Perform semantic search."""
    # Generate the query embedding.
    query_resp = TextEmbedding.call(
        model="text-embedding-v4",
        input=query,
        dimension=1024
    )
    query_embedding = query_resp.output['embeddings'][0]['embedding']
    # Generate the document embeddings.
    doc_resp = TextEmbedding.call(
        model="text-embedding-v4",
        input=documents,
        dimension=1024
    )
    # Calculate similarities.
    similarities = []
    for i, doc_emb in enumerate(doc_resp.output['embeddings']):
        similarity = cosine_similarity(query_embedding, doc_emb['embedding'])
        similarities.append((i, similarity))
    # Sort and return the top-k results.
    similarities.sort(key=lambda x: x[1], reverse=True)
    return [(documents[i], sim) for i, sim in similarities[:top_k]]
# Example usage
documents = [
"Artificial intelligence is a branch of computer science",
"Machine learning is an important method for achieving artificial intelligence",
"Deep learning is a subfield of machine learning"
]
query = "What is AI?"
results = semantic_search(query, documents, top_k=2)
for doc, sim in results:
    print(f"Similarity: {sim:.3f}, Document: {doc}")
Recommendation system
Analyze embeddings from a user's behavior history to identify interests and recommend similar items.
import dashscope
import numpy as np
from dashscope import TextEmbedding
# If you use a model in the China (Beijing) region, replace the base_url with: https://dashscope.aliyuncs.com/api/v1
dashscope.base_http_api_url = 'https://dashscope-intl.aliyuncs.com/api/v1'
def cosine_similarity(a, b):
    """Calculate cosine similarity."""
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

def build_recommendation_system(user_history, all_items, top_k=10):
    """Build a recommendation system."""
    # Generate user history embeddings.
    history_resp = TextEmbedding.call(
        model="text-embedding-v4",
        input=user_history,
        dimension=1024
    )
    # Calculate the user preference embedding (by averaging).
    user_embedding = np.mean([
        emb['embedding'] for emb in history_resp.output['embeddings']
    ], axis=0)
    # Generate all item embeddings.
    items_resp = TextEmbedding.call(
        model="text-embedding-v4",
        input=all_items,
        dimension=1024
    )
    # Calculate recommendation scores.
    recommendations = []
    for i, item_emb in enumerate(items_resp.output['embeddings']):
        score = cosine_similarity(user_embedding, item_emb['embedding'])
        recommendations.append((all_items[i], score))
    # Sort and return the recommendation results.
    recommendations.sort(key=lambda x: x[1], reverse=True)
    return recommendations[:top_k]
# Example usage
user_history = ["Science Fiction", "Action", "Suspense"]
all_movies = ["Future World", "Space Adventure", "Ancient War", "Romantic Journey", "Superhero"]
recommendations = build_recommendation_system(user_history, all_movies)
for movie, score in recommendations:
    print(f"Recommendation Score: {score:.3f}, Movie: {movie}")
Text clustering
Automatically group similar texts by analyzing distances between their embeddings.
# scikit-learn is required: pip install scikit-learn
import dashscope
import numpy as np
from sklearn.cluster import KMeans
# If you use a model in the China (Beijing) region, replace the base_url with: https://dashscope.aliyuncs.com/api/v1
dashscope.base_http_api_url = 'https://dashscope-intl.aliyuncs.com/api/v1'
def cluster_texts(texts, n_clusters=2):
    """Cluster a set of texts."""
    # 1. Get the embeddings for all texts.
    resp = dashscope.TextEmbedding.call(
        model="text-embedding-v4",
        input=texts,
        dimension=1024
    )
    embeddings = np.array([item['embedding'] for item in resp.output['embeddings']])
    # 2. Use the KMeans algorithm for clustering.
    kmeans = KMeans(n_clusters=n_clusters, random_state=0, n_init='auto').fit(embeddings)
    # 3. Organize and return the results.
    clusters = {i: [] for i in range(n_clusters)}
    for i, label in enumerate(kmeans.labels_):
        clusters[label].append(texts[i])
    return clusters
# Example usage
documents_to_cluster = [
"Mobile phone company A releases a new phone",
"Search engine company B launches a new system",
"World Cup final: Argentina vs. France",
"China wins another gold medal at the Olympics",
"A company releases its latest AI chip",
"European Cup match report"
]
clusters = cluster_texts(documents_to_cluster, n_clusters=2)
for cluster_id, docs in clusters.items():
    print(f"--- Cluster {cluster_id} ---")
    for doc in docs:
        print(f"- {doc}")
Text classification
Perform zero-shot text classification by computing embedding similarity between an input text and predefined labels. This approach classifies text into new categories without pre-labeled examples.
import dashscope
import numpy as np
# If you use a model in the China (Beijing) region, replace the base_url with: https://dashscope.aliyuncs.com/api/v1
dashscope.base_http_api_url = 'https://dashscope-intl.aliyuncs.com/api/v1'
def cosine_similarity(a, b):
    """Calculate cosine similarity."""
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

def classify_text_zero_shot(text, labels):
    """Perform zero-shot text classification."""
    # 1. Get the embeddings for the input text and all labels.
    resp = dashscope.TextEmbedding.call(
        model="text-embedding-v4",
        input=[text] + labels,
        dimension=1024
    )
    embeddings = resp.output['embeddings']
    text_embedding = embeddings[0]['embedding']
    label_embeddings = [emb['embedding'] for emb in embeddings[1:]]
    # 2. Calculate the similarity with each label.
    scores = [cosine_similarity(text_embedding, label_emb) for label_emb in label_embeddings]
    # 3. Return the label with the highest similarity.
    best_match_index = np.argmax(scores)
    return labels[best_match_index], scores[best_match_index]
# Example usage
text_to_classify = "The fabric of this dress is comfortable and the style is nice"
possible_labels = ["Digital Products", "Apparel & Accessories", "Food & Beverage", "Home & Living"]
label, score = classify_text_zero_shot(text_to_classify, possible_labels)
print(f"Input text: '{text_to_classify}'")
print(f"Best matching category: '{label}' (Similarity: {score:.3f})")
Anomaly detection
Identify anomalous data by computing its embedding similarity to the central embedding of normal samples. A low similarity score indicates an anomaly.
The threshold in the example code is for demonstration only. In production, similarity scores vary based on data content and distribution, so no universal threshold exists. Calibrate this value on your own dataset.
import dashscope
import numpy as np
# If you use a model in the China (Beijing) region, replace the base_url with: https://dashscope.aliyuncs.com/api/v1
dashscope.base_http_api_url = 'https://dashscope-intl.aliyuncs.com/api/v1'
def cosine_similarity(a, b):
    """Calculate cosine similarity."""
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

def detect_anomaly(new_comment, normal_comments, threshold=0.6):
    # 1. Generate embeddings for all normal comments and the new comment.
    all_texts = normal_comments + [new_comment]
    resp = dashscope.TextEmbedding.call(
        model="text-embedding-v4",
        input=all_texts,
        dimension=1024
    )
    embeddings = [item['embedding'] for item in resp.output['embeddings']]
    # 2. Calculate the center embedding (average) of the normal comments.
    normal_embeddings = np.array(embeddings[:-1])
    normal_center_vector = np.mean(normal_embeddings, axis=0)
    # 3. Calculate the similarity between the new comment's embedding and the center embedding.
    new_comment_embedding = np.array(embeddings[-1])
    similarity = cosine_similarity(new_comment_embedding, normal_center_vector)
    # 4. Determine whether it is an anomaly.
    is_anomaly = similarity < threshold
    return is_anomaly, similarity
# Example usage
normal_user_comments = [
"Today's meeting was productive",
"The project is progressing smoothly",
"The new version will be released next week",
"User feedback is positive"
]
test_comments = {
"Normal comment": "The feature works as expected",
"Anomaly - meaningless garbled text": "asdfghjkl zxcvbnm"
}
print("--- Anomaly Detection Example ---")
for desc, comment in test_comments.items():
    is_anomaly, score = detect_anomaly(comment, normal_user_comments)
    result = "Yes" if is_anomaly else "No"
    print(f"Comment: '{comment}'")
    print(f"Is anomaly: {result} (Similarity to normal samples: {score:.3f})\n")
API reference
General text embedding
Multimodal embedding
Error codes
If the model call fails and returns an error message, see Error messages for resolution.
Rate limiting
For rate limits, see Rate limiting.
Model performance (MTEB/CMTEB)
Evaluation benchmark
MTEB (Massive Text Embedding Benchmark): A comprehensive benchmark that assesses general-purpose performance of text embeddings across tasks such as classification, clustering, and retrieval.
CMTEB (Chinese Massive Text Embedding Benchmark): A large-scale benchmark that specifically evaluates Chinese text embeddings.
Scores range from 0 to 100. Higher scores indicate better performance.
Model | MTEB | MTEB (retrieval task) | CMTEB | CMTEB (retrieval task) |
text-embedding-v3 (512 dimensions) | 62.11 | 54.30 | 66.81 | 71.88 |
text-embedding-v3 (768 dimensions) | 62.43 | 54.74 | 67.90 | 72.29 |
text-embedding-v3 (1024 dimensions) | 63.39 | 55.41 | 68.92 | 73.23 |
text-embedding-v4 (512 dimensions) | 64.73 | 56.34 | 68.79 | 73.33 |
text-embedding-v4 (1024 dimensions) | 68.36 | 59.30 | 70.14 | 73.98 |
text-embedding-v4 (2048 dimensions) | 71.58 | 61.97 | 71.99 | 75.01 |