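Bailian models - Hologres

This topic describes how Alibaba Cloud Bailian is integrated with Hologres: you deploy a Bailian model in Hologres with an API key, then call it through AI Function, so AI development happens without data leaving the database.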

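Introduction

Alibaba Cloud Bailian (Model Studio) is a one-stop platform for developing and building applications on large models. It brings together Qwen and mainstream third-party models, offers developers an OpenAI-compatible API and end-to-end model services, and provides visual application building. Its model services work out of the box: you can call the full Qwen model family directly, with nothing to deploy or operate yourself.

Hologres is deeply integrated with Bailian. With an API key you can deploy a Bailian model in Hologres and then call it through AI Function, so you can develop AI features and build AI applications without data leaving the database.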

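Billing

Network fees: Bailian is served from the Beijing and Singapore regions, so calls from a Hologres instance to Bailian may incur network connectivity fees. Calling Bailian models from Hologres is currently in beta and network fees are not charged for now. The date on which charging starts will be announced on the official website.
Model invocation fees: Bailian charges for model invocations, billed by usage. For details, see the model invocation billing documentation and the Bailian console.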

Limits

Supported instance versions: Hologres V4.0.18 and later, and Hologres V4.1.2 and later.
Supported regions: currently only Ulanqab and Beijing.
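The version rule above can be checked programmatically before attempting a deployment. The following is a minimal sketch rather than an official tool: the function names are hypothetical, and it assumes that minor release lines newer than V4.1 also qualify, which the limits above do not state explicitly.

```python
import re
from typing import Tuple

# Minimum patch release per minor line, taken from the limits above.
MIN_PATCH = {(4, 0): 18, (4, 1): 2}


def parse_version(text: str) -> Tuple[int, int, int]:
    """Parse strings such as 'V4.0.18' or '4.1.2' into (major, minor, patch)."""
    match = re.fullmatch(r"[Vv]?(\d+)\.(\d+)\.(\d+)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return major, minor, patch


def supports_bailian(version: str) -> bool:
    """Return True if the instance version meets the documented minimum."""
    major, minor, patch = parse_version(version)
    if (major, minor) in MIN_PATCH:
        return patch >= MIN_PATCH[(major, minor)]
    # ASSUMPTION: release lines newer than the listed ones qualify,
    # and older ones do not.
    return (major, minor) > max(MIN_PATCH)
```

For example, `supports_bailian("V4.0.17")` is false while `supports_bailian("V4.1.2")` is true.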

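Model list and parameters

Deploy a model

In the Hologres management console, open the instance list and go to the target instance. At the top of the instance details page, select AI Models. On the Model List page you can deploy a Bailian model in one click: select Alibaba Cloud Bailian as the model provider and fill in the related parameters. The main settings are:

Model type: the Bailian models that can currently be deployed. See the model list below. Models not in the list are not yet supported.
API_KEY: before using Bailian, you must activate it and obtain an API key as the authentication credential. Enter that API key at deployment time. For how to obtain one, see Obtain an API key.
Model parameters: after selecting a model type, you can set parameters for the model to better fit your workload. See the parameter descriptions below. Retry behavior can also be configured.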

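Parameters

The parameters supported by each model type are listed below. For complete descriptions, the Bailian console and API documentation are authoritative.

Text models:
- max_tokens: the maximum number of tokens returned for this request. For the maximum each model supports, see the Bailian documentation.
- temperature: the sampling temperature, which controls the diversity of the output. Value range: [0, 2.0).
- top_p: the probability threshold for nucleus sampling. Value range: (0, 1.0]. Both temperature and top_p control diversity, so set only one of them.
- Qwen-Omni series: in addition to the general text parameters, supports modalities (whether the output is text or audio), audio.voice (the voice for audio output), and audio.format (the audio format; wav is supported).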
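The value ranges above can be enforced before a parameter set is entered in the deployment form. The helper below is an illustrative sketch: the function name is hypothetical, and the requirement that max_tokens be a positive integer is an assumption, since the per-model upper limit is defined in the Bailian documentation rather than here.

```python
import warnings


def build_text_params(max_tokens=None, temperature=None, top_p=None):
    """Validate text-model parameters and return only the ones that were set."""
    params = {}
    if max_tokens is not None:
        # ASSUMPTION: a positive integer; the per-model cap is not checked here.
        if not isinstance(max_tokens, int) or max_tokens <= 0:
            raise ValueError("max_tokens must be a positive integer")
        params["max_tokens"] = max_tokens
    if temperature is not None:
        # Documented range: [0, 2.0)
        if not 0 <= temperature < 2.0:
            raise ValueError("temperature must be in [0, 2.0)")
        params["temperature"] = temperature
    if top_p is not None:
        # Documented range: (0, 1.0]
        if not 0 < top_p <= 1.0:
            raise ValueError("top_p must be in (0, 1.0]")
        params["top_p"] = top_p
    if temperature is not None and top_p is not None:
        warnings.warn("temperature and top_p both set; setting only one is recommended")
    return params
```

Calling `build_text_params(max_tokens=512, temperature=0.7)` returns a dictionary with just those two keys, while `temperature=2.0` or `top_p=0` is rejected because both fall on the open end of their ranges.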

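Translation models: to improve translation quality, you can set the following parameters. For full usage, see the translation model documentation.

source_lang: the source language. For details, see the list of supported languages.
terms: translation terminology. Multiple terms can be supplied in JSON format.
tm_list: translation memory. This field supplies source-text/translation sentence pairs as examples, in JSON format.
domains: a domain hint, passed in as text.

Example: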

{
  "extra_body": {
    "translation_options": {
      "source_lang": "zh", 
      "domains": "The sentence is from Ali Cloud IT domain. ", 
      "terms": [
        {"source": "生物特徵辨識感應器", "target": "biological sensor"},
        {"source": "身體健康情況", "target": "health status of the body"}
      ], 
      "tm_list":[
        {"source": "您可以通過如下方式查看叢集的核心版本資訊:", "target": "You can use one of the following methods to query the engine version of a cluster:"},
        {"source": "bla", "target": "bla"}
      ]
    }
  }
}
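When the term list or translation memory comes from application data, it can be easier to generate this JSON than to write it by hand. The sketch below builds the same structure as the example above. The function name is hypothetical, and it only assembles the document; how the result is submitted is the deployment form described earlier.

```python
import json


def build_translation_options(source_lang, domains=None, terms=None, tm_list=None):
    """Build the extra_body document from (source, target) pairs."""
    options = {"source_lang": source_lang}
    if domains:
        options["domains"] = domains
    for key, pairs in (("terms", terms), ("tm_list", tm_list)):
        if pairs:
            # Each pair becomes one {"source": ..., "target": ...} object.
            options[key] = [{"source": src, "target": tgt} for src, tgt in pairs]
    return {"extra_body": {"translation_options": options}}


payload = build_translation_options(
    "zh",
    domains="The sentence is from Ali Cloud IT domain. ",
    terms=[("身體健康情況", "health status of the body")],
)
# ensure_ascii=False keeps the Chinese source text readable in the output.
print(json.dumps(payload, ensure_ascii=False, indent=2))
```

Optional fields that are not supplied (tm_list in this call) are left out of the document rather than emitted as empty lists.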

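Embedding models: dimension is the vector dimension. Only some models allow it to be changed. For detailed usage, see the embedding model documentation.
- text-embedding-v4 supports 2,048, 1,536, 1,024 (default), 768, 512, 256, 128, and 64.
- text-embedding-v3 supports 1,024 (default), 768, 512, 256, 128, and 64.
- qwen3-vl-embedding supports 2,560 (default), 2,048, 1,536, 1,024, 768, 512, and 256.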
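Because the chosen dimension must match the vector column that stores the embeddings, it can help to validate it up front. The lookup below encodes the three lists above. The names are hypothetical, and models with a fixed dimension are deliberately left out of this sketch.

```python
# Allowed dimensions per model, copied from the list above.
SUPPORTED_DIMENSIONS = {
    "text-embedding-v4": (2048, 1536, 1024, 768, 512, 256, 128, 64),
    "text-embedding-v3": (1024, 768, 512, 256, 128, 64),
    "qwen3-vl-embedding": (2560, 2048, 1536, 1024, 768, 512, 256),
}
DEFAULT_DIMENSION = {
    "text-embedding-v4": 1024,
    "text-embedding-v3": 1024,
    "qwen3-vl-embedding": 2560,
}


def resolve_dimension(model, dimension=None):
    """Return the dimension to use, falling back to the model default."""
    if model not in SUPPORTED_DIMENSIONS:
        raise KeyError(f"{model} has no configurable dimension in this table")
    if dimension is None:
        return DEFAULT_DIMENSION[model]
    if dimension not in SUPPORTED_DIMENSIONS[model]:
        raise ValueError(f"{model} does not support dimension {dimension}")
    return dimension
```

With this table, `resolve_dimension("text-embedding-v3", 2048)` raises an error, since 2,048 is available on text-embedding-v4 but not on text-embedding-v3.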

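Model retry mechanism

At deployment time you can configure how failed calls are retried, using the following parameters.

max_retries: the maximum number of retries. Default: 2. Value range: [0, 100].
initial_retry_delay: the initial retry delay, in seconds. Default: 0.5. Value range: [0.5, 8].
max_retry_delay: the maximum retry delay, in seconds. Default: 8. Value range: [1, 60].
timeout: the timeout for a single request, in seconds. Default: 600. Value range: [1, 1200].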
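To get a feel for how these settings interact, the sketch below validates a configuration against the documented ranges and computes a delay schedule. The class and function names are hypothetical. The text above does not say how the delay grows between the initial and maximum values, so the doubling rule used here is an assumption for illustration, not the documented behavior of Hologres.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RetryConfig:
    # Defaults and ranges are taken from the parameter list above.
    max_retries: int = 2
    initial_retry_delay: float = 0.5
    max_retry_delay: float = 8.0
    timeout: float = 600.0

    def __post_init__(self):
        checks = (
            ("max_retries", self.max_retries, 0, 100),
            ("initial_retry_delay", self.initial_retry_delay, 0.5, 8),
            ("max_retry_delay", self.max_retry_delay, 1, 60),
            ("timeout", self.timeout, 1, 1200),
        )
        for name, value, low, high in checks:
            if not low <= value <= high:
                raise ValueError(f"{name} must be in [{low}, {high}], got {value}")


def backoff_delays(config):
    """Delay in seconds before each retry.

    ASSUMPTION: the delay doubles on every retry and is capped at
    max_retry_delay; the actual growth rule is not documented here.
    """
    return [
        min(config.initial_retry_delay * 2 ** attempt, config.max_retry_delay)
        for attempt in range(config.max_retries)
    ]
```

Under that assumption, the defaults give two retries after 0.5 and 1.0 seconds, and raising max_retries to 6 produces delays that grow until they reach the 8-second cap.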

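Model list

Bailian supports text generation, translation, vector embedding, and multimodal models. The table below lists each model's category, model_type, task type, input and output, notes, and cross-region support.

Category	model_type	Task type	Input/output	Notes	Cross-region support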
Text generation	qwen3-max	chat/completions	text input, text output	Supported parameters: temperature, top_p, max_tokens	Yes
	qwen3-max-2026-01-23	chat/completions	text input, text output	Supported parameters: temperature, top_p, max_tokens	Yes
	qwen3-max-preview	chat/completions	text input, text output	Supported parameters: temperature, top_p, max_tokens	Yes
	qwen-max	chat/completions	text input, text output	Supported parameters: temperature, top_p, max_tokens	Yes
	qwen-max-latest	chat/completions	text input, text output	Supported parameters: temperature, top_p, max_tokens	Yes
	qwen-plus	chat/completions	text input, text output	Supported parameters: temperature, top_p, max_tokens	Yes
	qwen-plus-latest	chat/completions	text input, text output	Supported parameters: temperature, top_p, max_tokens	Yes
	qwen-flash	chat/completions	text input, text output	Supported parameters: temperature, top_p, max_tokens	Yes
	qwen-long	chat/completions	text input, text output	Supported parameters: temperature, top_p, max_tokens	Yes
	qwen-long-latest	chat/completions	text input, text output	Supported parameters: temperature, top_p, max_tokens	Yes
	qwq-plus	chat/completions	text input, text output	Supported parameters: temperature, top_p, max_tokens	Yes
	qwq-plus-latest	chat/completions	text input, text output	Supported parameters: temperature, top_p, max_tokens	Yes
	deepseek-v3.2	chat/completions	text input, text output	Supported parameters: temperature, top_p, max_tokens	Yes
	deepseek-v3.2-exp	chat/completions	text input, text output	Supported parameters: temperature, top_p, max_tokens	Yes
	deepseek-v3.1	chat/completions	text input, text output	Supported parameters: temperature, top_p, max_tokens	Yes
	deepseek-r1	chat/completions	text input, text output	Supported parameters: temperature, top_p, max_tokens	Yes
	deepseek-r1-0528	chat/completions	text input, text output	Supported parameters: temperature, top_p, max_tokens	Yes
	deepseek-v3	chat/completions	text input, text output	Supported parameters: temperature, top_p, max_tokens	Yes
	deepseek-r1-distill-qwen-1.5b	chat/completions	text input, text output	Supported parameters: temperature, top_p, max_tokens	Yes
	deepseek-r1-distill-qwen-7b	chat/completions	text input, text output	Supported parameters: temperature, top_p, max_tokens	Yes
	deepseek-r1-distill-qwen-14b	chat/completions	text input, text output	Supported parameters: temperature, top_p, max_tokens	Yes
	deepseek-r1-distill-qwen-32b	chat/completions	text input, text output	Supported parameters: temperature, top_p, max_tokens	Yes
	kimi-k2-thinking	chat/completions	text input, text output	Supported parameters: temperature, top_p, max_tokens	Yes
	Moonshot-Kimi-K2-Instruct	chat/completions	text input, text output	Supported parameters: temperature, top_p, max_tokens	Yes
	glm-4.6	chat/completions	text input, text output	Supported parameters: temperature, top_p, max_tokens	Yes
	glm-4.7	chat/completions	text input, text output	Supported parameters: temperature, top_p, max_tokens	Yes
	glm-5	chat/completions	text input, text output	Supported parameters: temperature, top_p, max_tokens	Yes
	MiniMax-M2.1	chat/completions	text input, text output	Supported parameters: temperature, top_p, max_tokens	Yes
	MiniMax-M2.5	chat/completions	text input, text output	Supported parameters: temperature, top_p, max_tokens	Yes
	MiniMax/MiniMax-M2.1	chat/completions	text input, text output	Supported parameters: temperature, top_p, max_tokens	Yes
	MiniMax/MiniMax-M2.5	chat/completions	text input, text output	Supported parameters: temperature, top_p, max_tokens	Yes
	qwen3-vl-235b-a22b-instruct	chat/completions	image/video input, text output	Supported parameters: temperature, top_p, max_tokens	Yes
	qwen3-vl-235b-a22b-thinking	chat/completions	image/video input, text output	Supported parameters: temperature, top_p, max_tokens	Yes
	qwen3-vl-32b-instruct	chat/completions	image/video input, text output	Supported parameters: temperature, top_p, max_tokens	Yes
	qwen3-vl-32b-thinking	chat/completions	image/video input, text output	Supported parameters: temperature, top_p, max_tokens	Yes
	qwen3-vl-8b-instruct	chat/completions	image/video input, text output	Supported parameters: temperature, top_p, max_tokens	Yes
	qwen3-vl-8b-thinking	chat/completions	image/video input, text output	Supported parameters: temperature, top_p, max_tokens	Yes
	qwen3-vl-plus	chat/completions	image/video input, text output	Supported parameters: temperature, top_p, max_tokens	Yes
	qwen3-vl-flash	chat/completions	image/video input, text output	Supported parameters: temperature, top_p, max_tokens	Yes
	qwen-vl-ocr	chat/completions	image input, text output	Supported parameters: temperature, top_p, max_tokens	Yes
	qwen-vl-ocr-latest	chat/completions	image input, text output	Supported parameters: temperature, top_p, max_tokens	Yes
	qwen3-omni-flash	chat/completions	text/image/audio/video input, text/audio output	Supported parameters: temperature, top_p, max_tokens, plus modalities and audio	Yes
Translation	qwen-mt-plus	translation	ai_translate	Supported parameters: source_lang, terms, tm_list, domains	Yes
	qwen-mt-flash	translation	ai_translate	-	Yes
	qwen-mt-turbo	translation	ai_translate	-	Yes
	qwen-mt-lite	translation	ai_translate	-	Yes
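Vector embedding	text-embedding-v1	embedding	ai_embed; text input, float[] output	Vector dimension: 1,536	Yes
	text-embedding-v2	embedding	ai_embed; text input, float[] output	Vector dimension: 1,536	Yes
	text-embedding-v3	embedding	ai_embed; text input, float[] output	Vector dimensions: 1,024 (default), 768, 512, 256, 128, 64	Yes
	text-embedding-v4	embedding	ai_embed; text input, float[] output	Vector dimensions: 2,048, 1,536, 1,024 (default), 768, 512, 256, 128, 64	Yes
	tongyi-embedding-vision-plus	embedding	ai_embed; text/image/video input, float[] output	Vector dimension: 1,152; video input is not supported outside the Beijing/Singapore regions	Images: yes; video: no
	tongyi-embedding-vision-flash	embedding	ai_embed; text/image/video input, float[] output	Vector dimension: 768; video input is not supported outside the Beijing/Singapore regions	Images: yes; video: no
	multimodal-embedding-v1	embedding	ai_embed; text/image/video input, float[] output	Vector dimension: 1,024; video input is not supported outside the Beijing/Singapore regions	Images: yes; video: no
	qwen3-vl-embedding	embedding	ai_embed; text/image/video input, float[] output	Vector dimensions: 2,560 (default), 2,048, 1,536, 1,024, 768, 512, 256	Images: yes; video: no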

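Using a model

After a model is deployed successfully, you can call it in Hologres through AI Function, running inference and building AI applications without data leaving the database. For usage, see AI Function. For best practices, see Best practice: a high-performance image analysis system for autonomous driving.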