Embedding is the process of converting text data into vectors in a mathematical space. The proximity or angle between vectors indicates the similarity between the data items. This is useful for tasks such as classification, retrieval, and recommendation.
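To make "proximity or angle" concrete, the following minimal sketch compares vectors with cosine similarity (angle) and Euclidean distance (proximity). The three-dimensional vectors are made-up values standing in for real embeddings, and the helper names are illustrative rather than part of any SDK.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

def euclidean_distance(a, b):
    """Straight-line distance between two equal-length vectors (0.0 = identical)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Toy vectors with made-up values; real embeddings have hundreds of dimensions.
cat = [0.9, 0.1, 0.0]
kitten = [0.8, 0.2, 0.1]
invoice = [0.0, 0.2, 0.9]

# Related items score a higher cosine similarity and a smaller distance.
print(cosine_similarity(cat, kitten) > cosine_similarity(cat, invoice))
print(euclidean_distance(cat, kitten) < euclidean_distance(cat, invoice))
```

Cosine similarity ignores vector length and compares direction only, which is why it is a common default for comparing text embeddings.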
Scenarios
Recommendation: Recommends relevant items based on input data. For example, recommends products based on user purchase history and browsing behavior.
Clustering: Groups input data by relevance. For example, categorizes news articles into topics such as technology, sports, and entertainment.
Search: Ranks search results by relevance to input data. For example, a text embedding model returns relevant web pages based on a user query.
Anomaly detection: Detects data points that deviate from the norm. For example, in the finance sector, feature vectors extracted from transaction records can be used to identify unusual transactions as potential fraud.
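As one illustration of the anomaly detection scenario, the sketch below flags vectors that lie far from the centroid of the data set. This is only one simple approach, not a method prescribed by the service. The two-dimensional vectors and the threshold are made-up values; with real embeddings the threshold would need to be tuned on your own data.

```python
import math

def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]

def flag_outliers(vectors, threshold):
    """Return the indices of vectors farther than `threshold` from the centroid."""
    center = centroid(vectors)
    return [i for i, vec in enumerate(vectors) if math.dist(vec, center) > threshold]

# Made-up feature vectors: four similar transactions and one that stands apart.
transactions = [
    [1.0, 1.0],
    [1.1, 0.9],
    [0.9, 1.1],
    [1.0, 0.9],
    [5.0, 5.0],
]

# The threshold of 2.0 is arbitrary and chosen only for this toy data.
print(flag_outliers(transactions, threshold=2.0))
```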
Supported models
General text embedding
Vector dimension refers to the number of elements in a vector. For example, a 1,024-dimensional vector contains 1,024 numerical values. Higher-dimensional vectors can represent richer information and capture text characteristics more precisely, at the cost of more storage and computation.
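The effect of the dimension on storage can be estimated with simple arithmetic. The sketch below assumes each value is stored as a 32-bit float (4 bytes), which is an assumption about your storage layer and not something the service dictates.

```python
BYTES_PER_FLOAT32 = 4  # Assumption: each vector element is stored as a 32-bit float

def index_size_bytes(num_vectors, dimensions, bytes_per_value=BYTES_PER_FLOAT32):
    """Raw size of the vectors alone, without any index or metadata overhead."""
    return num_vectors * dimensions * bytes_per_value

# Compare the three dimensions that text-embedding-v3 supports.
for dims in (512, 768, 1024):
    mib = index_size_bytes(1_000_000, dims) / (1024 ** 2)
    print(f"{dims} dimensions: {mib:,.0f} MiB for 1,000,000 vectors")
```

Under this assumption, one million 1,024-dimensional vectors take about 3.8 GiB, and halving the dimension halves the raw storage.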
Name | Vector dimensions | Maximum rows | Maximum tokens per row | Supported languages | Price (per million input tokens) | Free quota |
text-embedding-v3 | 1,024 (default), 768, or 512 | 10 | 8,192 | Chinese, English, Spanish, French, Portuguese, Indonesian, Japanese, Korean, German, Russian, and more than 50 other languages | $0.07 | 500,000 tokens, valid for 180 days after activation |
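The price and free quota in the table translate into a simple cost estimate. The sketch below assumes the free quota is consumed before any tokens are billed. That ordering is an assumption, so check your billing details before relying on the numbers.

```python
PRICE_PER_MILLION_TOKENS_USD = 0.07  # From the table above
FREE_QUOTA_TOKENS = 500_000          # From the table above

def estimate_cost_usd(total_tokens, free_tokens_left=FREE_QUOTA_TOKENS):
    """Estimated charge for `total_tokens` input tokens.

    Assumes the remaining free quota is used up before billing starts.
    """
    billable = max(0, total_tokens - free_tokens_left)
    return billable / 1_000_000 * PRICE_PER_MILLION_TOKENS_USD

# 10,500,000 tokens with the full free quota left: 10,000,000 billable tokens.
print(f"${estimate_cost_usd(10_500_000):.2f}")
```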
Quick Start
Before you begin, obtain an API key and set it as an environment variable (the examples read it from DASHSCOPE_API_KEY). To call the service through the OpenAI SDK or DashScope SDK, install the SDK first.
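A missing environment variable otherwise only shows up as an authentication error at request time, so it can help to check for the key up front. The helper below is a hypothetical convenience function, not part of either SDK. It takes the environment as a parameter so that it can be exercised without a real key.

```python
import os

def load_api_key(env=None, name="DASHSCOPE_API_KEY"):
    """Return the API key from the environment, or raise a clear error if it is missing."""
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {name} environment variable before calling the API")
    return key

# Demonstration with an explicit mapping and a placeholder value.
# In real code, call load_api_key() with no arguments to read os.environ.
print(load_api_key({"DASHSCOPE_API_KEY": "sk-example"}))
```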
String input
OpenAI compatible
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("DASHSCOPE_API_KEY"),  # Replace with your API key if you have not configured the environment variable
    base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1"  # base_url for Model Studio
)

completion = client.embeddings.create(
    model="text-embedding-v3",
    input='The quality of the clothes is excellent, very beautiful, worth the wait, I like it and will buy here again',
    dimensions=1024,
    encoding_format="float"
)

print(completion.model_dump_json())
import OpenAI from "openai";
import process from 'process';

// Initialize OpenAI client
const openai = new OpenAI({
    apiKey: process.env.DASHSCOPE_API_KEY, // Read from environment variable
    baseURL: 'https://dashscope-intl.aliyuncs.com/compatible-mode/v1'
});

async function getEmbedding() {
    try {
        const completion = await openai.embeddings.create({
            model: "text-embedding-v3",
            input: 'The quality of the clothing is excellent, very beautiful. It was worth the long wait. I like it, and I will come back here to buy again.',
            dimensions: 1024, // Specify vector dimension (only supported by text-embedding-v3)
            encoding_format: "float"
        });
        console.log(JSON.stringify(completion, null, 2));
    } catch (error) {
        console.error('Error:', error);
    }
}

getEmbedding();
curl --location 'https://dashscope-intl.aliyuncs.com/compatible-mode/v1/embeddings' \
--header "Authorization: Bearer $DASHSCOPE_API_KEY" \
--header 'Content-Type: application/json' \
--data '{
    "model": "text-embedding-v3",
    "input": "The quality of the clothes is excellent, very beautiful, worth the wait, I like it and will buy here again",
    "dimensions": 1024,
    "encoding_format": "float"
}'
DashScope
import dashscope
from http import HTTPStatus

dashscope.base_http_api_url = 'https://dashscope-intl.aliyuncs.com/api/v1'

resp = dashscope.TextEmbedding.call(
    model=dashscope.TextEmbedding.Models.text_embedding_v3,
    input='The quality of the clothes is excellent, very beautiful, worth the wait, I like it and will buy here again',
    dimension=1024,
    output_type="dense&sparse"
)

if resp.status_code == HTTPStatus.OK:
    print(resp)
else:
    # On failure, the response carries an error code and message
    print(f"Request failed: {resp.code} - {resp.message}")
import java.util.Arrays;
import java.util.concurrent.Semaphore;
import com.alibaba.dashscope.common.ResultCallback;
import com.alibaba.dashscope.embeddings.TextEmbedding;
import com.alibaba.dashscope.embeddings.TextEmbeddingParam;
import com.alibaba.dashscope.embeddings.TextEmbeddingResult;
import com.alibaba.dashscope.exception.ApiException;
import com.alibaba.dashscope.exception.NoApiKeyException;
import com.alibaba.dashscope.utils.Constants;
public final class Main {
static {
Constants.baseHttpApiUrl="https://dashscope-intl.aliyuncs.com/api/v1";
}
public static void basicCall() throws ApiException, NoApiKeyException{
TextEmbeddingParam param = TextEmbeddingParam
.builder()
.model(TextEmbedding.Models.TEXT_EMBEDDING_V3)
.texts(Arrays.asList("Shall I compare thee to a summers day", "Thou art more lovely and more temperate", "Rough winds do shake the darling buds of May", "And summers lease hath all too short a date")).build();
TextEmbedding textEmbedding = new TextEmbedding();
TextEmbeddingResult result = textEmbedding.call(param);
System.out.println(result);
}
public static void callWithCallback() throws ApiException, NoApiKeyException, InterruptedException{
TextEmbeddingParam param = TextEmbeddingParam
.builder()
.model(TextEmbedding.Models.TEXT_EMBEDDING_V3)
.texts(Arrays.asList("Shall I compare thee to a summers day", "Thou art more lovely and more temperate", "Rough winds do shake the darling buds of May", "And summers lease hath all too short a date")).build();
TextEmbedding textEmbedding = new TextEmbedding();
Semaphore sem = new Semaphore(0);
textEmbedding.call(param, new ResultCallback<TextEmbeddingResult>() {
@Override
public void onEvent(TextEmbeddingResult message) {
System.out.println(message);
}
@Override
public void onComplete(){
sem.release();
}
@Override
public void onError(Exception err){
System.out.println(err.getMessage());
err.printStackTrace();
sem.release();
}
});
sem.acquire();
}
public static void main(String[] args){
try{
callWithCallback();
}catch(ApiException|NoApiKeyException|InterruptedException e){
e.printStackTrace();
System.out.println(e);
}
try {
basicCall();
} catch (ApiException | NoApiKeyException e) {
System.out.println(e.getMessage());
}
System.exit(0);
}
}
curl --location 'https://dashscope-intl.aliyuncs.com/api/v1/services/embeddings/text-embedding/text-embedding' \
--header "Authorization: Bearer $DASHSCOPE_API_KEY" \
--header 'Content-Type: application/json' \
--data '{
    "model": "text-embedding-v3",
    "input": {
        "texts": [
            "The quality of the clothes is excellent, very beautiful, worth the wait, I like it and will buy here again"
        ]
    },
    "parameters": {
        "dimension": 1024
    }
}'
String list input
OpenAI compatible
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("DASHSCOPE_API_KEY"),  # Replace with your API key if you have not configured the environment variable
    base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1"  # base_url for Model Studio
)

completion = client.embeddings.create(
    model="text-embedding-v3",
    input=[
        'Shall I compare thee to a summers day',
        'Thou art more lovely and more temperate',
        'Rough winds do shake the darling buds of May',
        'And summers lease hath all too short a date'
    ],
    encoding_format="float"
)

print(completion.model_dump_json())
import OpenAI from "openai";
import process from 'process';
// Initialize OpenAI client
const openai = new OpenAI({
apiKey: process.env.DASHSCOPE_API_KEY, // Read from environment variable
baseURL: 'https://dashscope-intl.aliyuncs.com/compatible-mode/v1'
});
async function getMultipleEmbeddings() {
try {
const completion = await openai.embeddings.create({
model: "text-embedding-v3",
input: [
'Shall I compare thee to a summers day?',
'Thou art more lovely and more temperate.',
'Rough winds do shake the darling buds of May,',
'And summers lease hath all too short a date.'
],
encoding_format: "float"
});
console.log(JSON.stringify(completion, null, 2));
} catch (error) {
console.error('Error:', error);
}
}
getMultipleEmbeddings();
curl --location 'https://dashscope-intl.aliyuncs.com/compatible-mode/v1/embeddings' \
--header "Authorization: Bearer $DASHSCOPE_API_KEY" \
--header 'Content-Type: application/json' \
--data '{
"model": "text-embedding-v3",
"input": [
"Shall I compare thee to a summers day",
"Thou art more lovely and more temperate",
"Rough winds do shake the darling buds of May",
"And summers lease hath all too short a date"
],
"encoding_format": "float"
}'
DashScope
import dashscope
from http import HTTPStatus

dashscope.base_http_api_url = 'https://dashscope-intl.aliyuncs.com/api/v1'

# text-embedding-v3 accepts at most 10 rows per request (see the model table)
DASHSCOPE_MAX_BATCH_SIZE = 10

inputs = [
    'Shall I compare thee to a summers day',
    'Thou art more lovely and more temperate',
    'Rough winds do shake the darling buds of May',
    'And summers lease hath all too short a date'
]

result = None      # Merged response across all batches
batch_counter = 0  # Number of texts sent in earlier batches
for i in range(0, len(inputs), DASHSCOPE_MAX_BATCH_SIZE):
    batch = inputs[i:i + DASHSCOPE_MAX_BATCH_SIZE]
    resp = dashscope.TextEmbedding.call(
        model=dashscope.TextEmbedding.Models.text_embedding_v3,
        input=batch,
        dimension=1024
    )
    if resp.status_code == HTTPStatus.OK:
        if result is None:
            result = resp
        else:
            # Shift text_index so that it refers to the position in the full input list
            for emb in resp.output['embeddings']:
                emb['text_index'] += batch_counter
                result.output['embeddings'].append(emb)
            result.usage['total_tokens'] += resp.usage['total_tokens']
    else:
        print(resp)
    batch_counter += len(batch)

print(result)
import java.util.Arrays;
import java.util.List;
import com.alibaba.dashscope.embeddings.TextEmbedding;
import com.alibaba.dashscope.embeddings.TextEmbeddingParam;
import com.alibaba.dashscope.embeddings.TextEmbeddingResult;
import com.alibaba.dashscope.exception.ApiException;
import com.alibaba.dashscope.exception.NoApiKeyException;
import com.alibaba.dashscope.utils.Constants;
public final class Main {
static {
Constants.baseHttpApiUrl="https://dashscope-intl.aliyuncs.com/api/v1";
}
private static final int DASHSCOPE_MAX_BATCH_SIZE = 10; // text-embedding-v3 accepts at most 10 rows per request
public static void main(String[] args) {
List<String> inputs = Arrays.asList(
"Shall I compare thee to a summers day",
"Thou art more lovely and more temperate",
"Rough winds do shake the darling buds of May",
"And summers lease hath all too short a date"
);
TextEmbeddingResult result = null;
int batchCounter = 0;
for (int i = 0; i < inputs.size(); i += DASHSCOPE_MAX_BATCH_SIZE) {
List<String> batch = inputs.subList(i, Math.min(i + DASHSCOPE_MAX_BATCH_SIZE, inputs.size()));
TextEmbeddingParam param = TextEmbeddingParam.builder()
.model(TextEmbedding.Models.TEXT_EMBEDDING_V3)
.texts(batch)
.build();
TextEmbedding textEmbedding = new TextEmbedding();
try {
TextEmbeddingResult resp = textEmbedding.call(param);
if (resp != null) {
if (result == null) {
result = resp;
} else {
for (var emb : resp.getOutput().getEmbeddings()) {
emb.setTextIndex(emb.getTextIndex() + batchCounter);
result.getOutput().getEmbeddings().add(emb);
}
result.getUsage().setTotalTokens(result.getUsage().getTotalTokens() + resp.getUsage().getTotalTokens());
}
} else {
System.out.println(resp);
}
} catch (ApiException | NoApiKeyException e) {
e.printStackTrace();
}
batchCounter += batch.size();
}
System.out.println(result);
}
}
curl --location 'https://dashscope-intl.aliyuncs.com/api/v1/services/embeddings/text-embedding/text-embedding' \
--header "Authorization: Bearer $DASHSCOPE_API_KEY" \
--header 'Content-Type: application/json' \
--data '{
"model": "text-embedding-v3",
"input": {
"texts": [
"Shall I compare thee to a summers day",
"Thou art more lovely and more temperate",
"Rough winds do shake the darling buds of May",
"And summers lease hath all too short a date"
]
},
"parameters": {
"dimension": 1024
}
}'
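When a list is longer than the Maximum rows value of 10 in the model table, it has to be split across several requests. The sketch below isolates that batching logic from the network call. `embed_batch` is a hypothetical caller-supplied function that takes a list of strings and returns one vector per string in the same order. The stand-in used here only makes the snippet runnable offline.

```python
MAX_ROWS_PER_REQUEST = 10  # "Maximum rows" for text-embedding-v3 in the model table

def split_into_batches(texts, batch_size=MAX_ROWS_PER_REQUEST):
    """Yield (offset, batch) pairs, where offset is the index of the batch's first text."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for offset in range(0, len(texts), batch_size):
        yield offset, texts[offset:offset + batch_size]

def embed_all(texts, embed_batch, batch_size=MAX_ROWS_PER_REQUEST):
    """Call `embed_batch` once per batch and return all vectors in input order."""
    vectors = []
    for _, batch in split_into_batches(texts, batch_size):
        result = embed_batch(batch)
        if len(result) != len(batch):
            raise ValueError("embed_batch must return one vector per input text")
        vectors.extend(result)
    return vectors

def fake_embed_batch(batch):
    """Stand-in for a real API call: 'embeds' each text as a one-element vector."""
    return [[float(len(text))] for text in batch]

texts = [f"text {i}" for i in range(23)]
print([len(batch) for _, batch in split_into_batches(texts)])
print(len(embed_all(texts, fake_embed_batch)))
```

Keeping the offset alongside each batch makes it straightforward to map a per-batch index back to the position in the full input list.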
File input
Sample file: texts_to_embedding.txt
OpenAI compatible
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("DASHSCOPE_API_KEY"),  # Replace with your API key if you have not configured the environment variable
    base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1"  # base_url for Model Studio
)

with open('texts_to_embedding.txt', 'r', encoding='utf-8') as f:
    completion = client.embeddings.create(
        model="text-embedding-v3",
        input=f
    )

print(completion.model_dump_json())
import OpenAI from "openai";
import process from 'process';
import fs from 'fs/promises';
// Initialize OpenAI client
const openai = new OpenAI({
apiKey: process.env.DASHSCOPE_API_KEY, // Read from environment variable
baseURL: 'https://dashscope-intl.aliyuncs.com/compatible-mode/v1'
});
async function getEmbeddingsFromFile() {
try {
// Read the content of the file
const fileContent = await fs.readFile('texts_to_embedding.txt', 'utf-8');
// Create embedding vectors
const completion = await openai.embeddings.create({
model: "text-embedding-v3",
input: fileContent
});
console.log(JSON.stringify(completion, null, 2));
} catch (error) {
console.error('Error:', error);
}
}
getEmbeddingsFromFile();
FILE_CONTENT=$(cat texts_to_embedding.txt | jq -Rs .)
curl --location 'https://dashscope-intl.aliyuncs.com/compatible-mode/v1/embeddings' \
--header "Authorization: Bearer $DASHSCOPE_API_KEY" \
--header 'Content-Type: application/json' \
--data '{
"model": "text-embedding-v3",
"input": ['"$FILE_CONTENT"']
}'
DashScope
from http import HTTPStatus
import dashscope
from dashscope import TextEmbedding

dashscope.base_http_api_url = 'https://dashscope-intl.aliyuncs.com/api/v1'

with open('texts_to_embedding.txt', 'r', encoding='utf-8') as f:
    resp = TextEmbedding.call(
        model=TextEmbedding.Models.text_embedding_v3,
        input=f
    )

if resp.status_code == HTTPStatus.OK:
    print(resp)
else:
    # On failure, the response carries an error code and message
    print(f"Request failed: {resp.code} - {resp.message}")
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import com.alibaba.dashscope.embeddings.TextEmbedding;
import com.alibaba.dashscope.embeddings.TextEmbeddingParam;
import com.alibaba.dashscope.embeddings.TextEmbeddingResult;
import com.alibaba.dashscope.exception.ApiException;
import com.alibaba.dashscope.exception.NoApiKeyException;
import com.alibaba.dashscope.utils.Constants;
public final class Main {
static {
Constants.baseHttpApiUrl="https://dashscope-intl.aliyuncs.com/api/v1";
}
public static void main(String[] args) {
try (BufferedReader reader = new BufferedReader(new FileReader("texts_to_embedding.txt"))) { // Adjust the path to where the sample file is stored
StringBuilder content = new StringBuilder();
String line;
while ((line = reader.readLine()) != null) {
content.append(line).append("\n");
}
TextEmbeddingParam param = TextEmbeddingParam.builder()
.model(TextEmbedding.Models.TEXT_EMBEDDING_V3)
.text(content.toString())
.build();
TextEmbedding textEmbedding = new TextEmbedding();
TextEmbeddingResult result = textEmbedding.call(param);
if (result != null) {
System.out.println(result);
} else {
System.out.println("Failed to get embedding: " + result);
}
} catch (IOException | ApiException | NoApiKeyException e) {
e.printStackTrace();
}
}
}
FILE_CONTENT=$(cat texts_to_embedding.txt | jq -Rs .)
curl --location 'https://dashscope-intl.aliyuncs.com/api/v1/services/embeddings/text-embedding/text-embedding' \
--header "Authorization: Bearer $DASHSCOPE_API_KEY" \
--header 'Content-Type: application/json' \
--data '{
"model": "text-embedding-v3",
"input": {
"texts": ['"$FILE_CONTENT"']
}
}'
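The file examples above pass the file or its whole content to the API. If you prefer to control exactly what is sent, one option is to read the file into a list of strings first. The sketch below assumes the file holds one text per line, which is an assumption about the sample file's layout, and it skips blank lines. `read_texts` is an illustrative helper, not an SDK function.

```python
import os
import tempfile

def read_texts(path):
    """Return the non-empty lines of a UTF-8 text file, stripped of surrounding whitespace."""
    with open(path, "r", encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]

# Demonstration with a temporary file so that the snippet runs anywhere.
with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = os.path.join(tmp_dir, "texts_to_embedding.txt")
    with open(sample_path, "w", encoding="utf-8") as f:
        f.write("first text\n\n  second text  \n")
    texts = read_texts(sample_path)

print(texts)
```

The resulting list can be passed as a string list input, subject to the row and token limits in the model table.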
Sample output
OpenAI compatible
{
  "data": [
    {
      "embedding": [
        0.0023064255,
        -0.009327292,
        ...
        -0.0028842222
      ],
      "index": 0,
      "object": "embedding"
    }
  ],
  "model": "text-embedding-v3",
  "object": "list",
  "usage": {"prompt_tokens": 26, "total_tokens": 26},
  "id": "f62c2ae7-0906-9758-ab34-47c5764f07e2"
}
DashScope
{
  "status_code": 200,
  "request_id": "617b3670-6f9e-9f47-ad57-997ed8aeba6a",
  "code": "",
  "message": "",
  "output": {
    "embeddings": [
      {
        "embedding": [
          0.09393704682588577,
          2.4155092239379883,
          -1.8923076391220093,
          ...
        ],
        "text_index": 0
      }
    ]
  },
  "usage": {
    "total_tokens": 26
  }
}
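The two response formats keep the vectors in different places. The sketch below pulls the vectors out of dictionaries shaped like the sample outputs above, with shortened made-up vectors. The helper names are illustrative, and the payloads here are plain dictionaries rather than SDK response objects.

```python
import json

def vectors_from_openai_response(payload):
    """Vectors from an OpenAI-compatible response, ordered by `index`."""
    items = sorted(payload["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]

def vectors_from_dashscope_response(payload):
    """Vectors from a DashScope-style response, ordered by `text_index`."""
    if payload.get("status_code") != 200:
        raise RuntimeError(f"Request failed: {payload.get('code')} - {payload.get('message')}")
    items = sorted(payload["output"]["embeddings"], key=lambda item: item["text_index"])
    return [item["embedding"] for item in items]

# Payloads shaped like the sample outputs, with two short made-up vectors each.
openai_style = json.loads("""
{"data": [{"embedding": [0.3, 0.4], "index": 1, "object": "embedding"},
          {"embedding": [0.1, 0.2], "index": 0, "object": "embedding"}],
 "model": "text-embedding-v3", "object": "list"}
""")
dashscope_style = json.loads("""
{"status_code": 200, "code": "", "message": "",
 "output": {"embeddings": [{"embedding": [0.5, 0.6], "text_index": 0},
                           {"embedding": [0.7, 0.8], "text_index": 1}]}}
""")

print(vectors_from_openai_response(openai_style))
print(vectors_from_dashscope_response(dashscope_style))
```

Sorting by the index fields keeps each vector aligned with the position of its input text, regardless of the order in which the items arrive.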
Sample code
Semantic recommendation
API reference
Error code
If a call fails and returns an error message, see Error messages.