
AnalyticDB:AI Function

Last updated: Mar 12, 2026

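AI functions let you call AI services directly inside the database, for example for AI-based text generation, conditional judgment, sentiment analysis, and classification. This topic describes how to use the AI functions of AnalyticDB for MySQL and provides examples.

Important

Because AI models are inherently non-deterministic, an AI function can return different results even when the input is identical.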

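  • Text functions

    • ai_filter: makes a factuality or compliance judgment on the input text.

    • ai_translate: translates the input text into the specified target language.

    • ai_classify: performs semantic analysis on the input text and assigns it to one of the predefined categories or labels.

    • ai_extract: extracts structured information of the specified types (such as person names, dates, addresses, and keywords) from unstructured text.

    • ai_generate: generates text from the input prompt.

    • ai_sentiment: analyzes the sentiment of the input text and returns a judgment such as positive, negative, or neutral.

    • ai_similarity: computes the semantic similarity of two texts.

    • ai_mask: masks sensitive information in the input text (such as ID card numbers, mobile phone numbers, and email addresses).

    • ai_summarize: semantically condenses a long text into a concise, accurate summary.

  • Audio functions

    • ai_audio_transcribe: transcribes audio into text. You can specify the segmentation mode, target language, and other options as needed.

    • ai_audio_embed: converts audio into its embedding vector and returns the vector as an array.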

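Prerequisites

  • An AnalyticDB for MySQL cluster of Enterprise Edition, Basic Edition, or Data Lakehouse Edition, or a Data Warehouse Edition cluster in elastic mode, with ENI access enabled.

    Important
    • To enable ENI access, log on to the AnalyticDB for MySQL console, go to Cluster Management > Cluster Information, and turn on the ENI network switch in the Network Information section.

    • Enabling or disabling the ENI network interrupts database connections for about 2 minutes, during which the cluster cannot be read or written. Evaluate the impact carefully before you change this setting.

  • This feature is in public preview. To use it, submit a ticket to ask the support team to enable it.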

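Preparations

To use AI functions, configure a network link from AnalyticDB for MySQL to Alibaba Cloud Model Studio (Bailian) and create a model.

Connect over the Internet

  1. Configure an Internet NAT gateway.

    Create an Internet NAT gateway in the same region as the AnalyticDB for MySQL cluster, associate an Elastic IP Address (EIP) with it, and then create an SNAT entry. We recommend creating the SNAT entry at vSwitch granularity; any vSwitch can be specified. For details, see Internet NAT Gateway.

  2. Create a model. For the syntax of model creation, see CREATE MODEL.

    Example: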

    CREATE MODEL qwen_plus_external
    OPTIONS (
        type='external',
        provider='bailian',
        name='qwen-plus',
        interface='TEXT_TO_TEXT',
        api_key='sk-xxx' -- Replace with the api_key of the model service activated in Alibaba Cloud Model Studio (Bailian).
    )

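Connect over a private network by using PrivateLink

  1. Configure an endpoint and obtain the private domain name. For details, see Access Alibaba Cloud Model Studio models or application APIs over a private network.

    Important

    Regions where Alibaba Cloud Model Studio is available: Singapore and China (Beijing).

    The endpoint must be in the same region as the Alibaba Cloud Model Studio service. To access the service over a private network from a VPC in another region, see Cross-region private access to Alibaba Cloud Model Studio APIs.

  2. Create a model. For the syntax of model creation, see CREATE MODEL.

    Example: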

    CREATE MODEL qwen_plus_external
    OPTIONS (
        type='external',
        provider='bailian',
        name='qwen-plus',
        interface='TEXT_TO_TEXT',
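        api_key='sk-xxx', -- Replace with the api_key of the model service activated in Alibaba Cloud Model Studio (Bailian).
        endpoint='ep-xxx.com' -- Replace with the domain name of the PrivateLink endpoint.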
    )

Text functions

ai_filter

ai_filter(text)
ai_filter(model_name, text)
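  • Description: makes a factuality or compliance judgment on the input text. Returns true if the content is credible, reasonable, or compliant with the preset rules; otherwise returns false.

  • Input parameters:

    • model_name: the model name, of type VARCHAR.

    • text: the input text, of type VARCHAR.

  • Return type: BOOLEAN.

Example 1

-- Input text: "Is Zhejiang in China?"
SELECT ai_filter("浙江在中國嗎?")

Result

+---------------------------------+
| ai_filter("浙江在中國嗎?")       |
+---------------------------------+
| 1                               |
+---------------------------------+

Example 2

-- Same question, with the model specified explicitly.
SELECT ai_filter("qwen_plus_external", "浙江在中國嗎?")

Result

+--------------------------------------------------+
| ai_filter("qwen_plus_external", "浙江在中國嗎?")  |
+--------------------------------------------------+
| 1                                                |
+--------------------------------------------------+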
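
AI functions can also be applied to table columns. The following is a minimal sketch, assuming a hypothetical table product_review(id, content) that is not part of this topic. It relies on ai_filter returning a BOOLEAN, so the call can serve directly as a WHERE predicate; the prompt wording is illustrative only.

```sql
-- Hypothetical table: product_review(id BIGINT, content VARCHAR).
-- Keep only the reviews that the model judges to mention a delivery problem.
SELECT id, content
FROM product_review
WHERE ai_filter(CONCAT('Does the following review mention a delivery problem? ', content));
```

Because the predicate is evaluated against row content, narrowing the rows with ordinary conditions first is likely to reduce the number of model calls.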

ai_translate

ai_translate(text, targetLang)
ai_translate(model_name, text, targetLang)
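  • Description: translates the input text into the specified target language.

  • Input parameters:

    • model_name: the model name, of type VARCHAR.

    • text: the input text, of type VARCHAR.

    • targetLang: the target language of the translation, of type VARCHAR.

      Important

      The model must support the target language.

  • Return type: VARCHAR.

Example

SELECT ai_translate("AnalyticDB for MySQL is a data analytics platform based on a lakehouse architecture.", "cn") as translate_text

Result

+-------------------------------------------------------+
| translate_text                                        |
+-------------------------------------------------------+
| AnalyticDB for MySQL 是一個基於湖倉一體架構的資料分析平台。 |
+-------------------------------------------------------+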

ai_classify

ai_classify(text, labels)
ai_classify(model_name, text, labels)
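  • Description: performs semantic analysis on the input text and assigns it to one of the predefined categories or labels.

  • Input parameters:

    • model_name: the model name, of type VARCHAR.

    • text: the input text, of type VARCHAR.

    • labels: the list of predefined category labels, of type VARCHAR or Array<VARCHAR>.

  • Return type: VARCHAR.

Example 1 (labels passed as a VARCHAR)

-- Input text: "Last night I tried a newly bought pasta recipe in the kitchen; the sauce was rich and my family loved it!"
SELECT ai_classify('昨晚在廚房嘗試了新買的意大利麵食譜,醬汁濃鬱,家人讚不絕口!', "['travel', 'cooking', 'reading', 'driving']")
AS predicted_label;

Result

+----------------------------------------------+
| predicted_label                              |
+----------------------------------------------+
| cooking                                      |
+----------------------------------------------+

Example 2 (labels passed as an Array<VARCHAR>)

SELECT ai_classify('昨晚在廚房嘗試了新買的意大利麵食譜,醬汁濃鬱,家人讚不絕口!', ARRAY['travel', 'cooking', 'reading', 'driving'])
AS predicted_label;

Result

+----------------------------------------------+
| predicted_label                              |
+----------------------------------------------+
| cooking                                      |
+----------------------------------------------+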
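
Since ai_classify returns a single VARCHAR label, its output can be aggregated like any other column. The sketch below assumes a hypothetical table support_ticket(id, body) and an illustrative label set; neither comes from this topic.

```sql
-- Hypothetical table: support_ticket(id BIGINT, body VARCHAR).
-- Classify every ticket, then count the tickets per predicted label.
SELECT predicted_label, COUNT(*) AS ticket_count
FROM (
  SELECT ai_classify(body, ARRAY['billing', 'shipping', 'technical', 'other']) AS predicted_label
  FROM support_ticket
) AS t
GROUP BY predicted_label
ORDER BY ticket_count DESC;
```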

ai_extract

ai_extract(text, labels)
ai_extract(model_name, text, labels)
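  • Description: extracts structured information of the specified types (such as person names, dates, addresses, and keywords) from unstructured text.

  • Input parameters:

    • model_name: the model name, of type VARCHAR.

    • text: the input text, of type VARCHAR.

    • labels: the predefined list of labels that name the information to extract, of type VARCHAR or Array<VARCHAR>.

  • Return type: VARCHAR.

Example

-- Input text: "Yesterday (June 15, 2024) I bought an iPhone 15 Pro on Taobao; its titanium body and A17 chip are truly stunning!"
select ai_extract('我昨天(2024年6月15日)在淘寶買了iPhone 15 Pro,它的鈦金屬機身和A17晶片真的很驚豔!', "['product_name', 'date', 'key_feature']") as result

Result

+----------------------------------------------------------------------------------+
| result                                                                           |
+----------------------------------------------------------------------------------+
| product_name=iPhone 15 Pro, date=2024年6月15日, key_feature=鈦金屬機身和A17晶片    |
+----------------------------------------------------------------------------------+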

ai_generate

ai_generate(text)
ai_generate(model_name, text)
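  • Description: generates text based on the input prompt.

  • Input parameters:

    • model_name: the model name, of type VARCHAR.

    • text: the input text (prompt), of type VARCHAR.

  • Return type: VARCHAR.

Example

-- Prompt: "Introduce the TPC-H benchmark in one sentence."
select ai_generate('一句話介紹一下TPC-H測試集') as result

Result

+----------------------------------------------+
| result                                       |
+----------------------------------------------+
| TPC-H 是用於評估資料庫決策支援能力的標準基準測試集。 |
+----------------------------------------------+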

ai_sentiment

ai_sentiment(text)
ai_sentiment(model_name, text)
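  • Description: analyzes the sentiment of the input text. The return value is always one of four labels: positive, negative, neutral, or mixed.

  • Input parameters:

    • model_name: the model name, of type VARCHAR.

    • text: the input text, of type VARCHAR.

  • Return type: VARCHAR.

Example

-- Input text: "These headphones sound great and are comfortable to wear. Highly recommended!"
SELECT ai_sentiment('這款耳機音質出色,佩戴舒適,強烈推薦!') AS sentiment;

Result

+----------------------------------------------+
| sentiment                                    |
+----------------------------------------------+
| positive                                     |
+----------------------------------------------+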

ai_similarity

ai_similarity(text1, text2)
ai_similarity(model_name, text1, text2)
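  • Description: computes the semantic similarity of two texts. Returns a similarity score between 0 and 10; a higher score indicates greater similarity.

  • Input parameters:

    • model_name: the model name, of type VARCHAR.

    • text1 and text2: the input texts, of type VARCHAR.

  • Return type: VARCHAR.

Example

-- text1: "How do I reset my account password?"
-- text2: "I forgot my login password. How do I recover it?"
SELECT ai_similarity(
  '如何重設我的賬戶密碼?',
  '我忘記了登入密碼,該怎麼找回?'
) AS result;

Result

+----------------------------------------------+
| result                                       |
+----------------------------------------------+
| 9.3                                          |
+----------------------------------------------+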
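
Because the score is returned as a VARCHAR, comparing or sorting it as a string can mislead (for example, '9.3' sorts after '10'). The following sketch converts the score to a number before filtering and ordering. It assumes a hypothetical table faq(id, question), an arbitrary threshold of 8, and that CAST(... AS DOUBLE) is available in your cluster version.

```sql
-- Hypothetical table: faq(id BIGINT, question VARCHAR).
-- Rank stored questions by similarity to the user's question.
SELECT id, question, score
FROM (
  SELECT id,
         question,
         -- Convert the VARCHAR score to a number so that comparison and sorting are numeric.
         CAST(ai_similarity('如何重設我的賬戶密碼?', question) AS DOUBLE) AS score
  FROM faq
) AS t
WHERE score >= 8      -- illustrative threshold, not a recommended value
ORDER BY score DESC
LIMIT 5;
```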

ai_mask

ai_mask(text, labels)
ai_mask(model_name, text, labels)
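  • Description: masks sensitive information in the input text (such as ID card numbers, mobile phone numbers, and email addresses).

  • Input parameters:

    • model_name: the model name, of type VARCHAR.

    • text: the input text, of type VARCHAR.

    • labels: the predefined list of labels that name the types of information to mask, of type VARCHAR or Array<VARCHAR>.

  • Return type: VARCHAR.

Example

-- Input text: "Contact me: <phone number>, my email is user@example.com, ID card number <number>"
SELECT ai_mask(
  '聯絡我:1381234****,郵箱是user@example.com,社會安全號碼110101199003072316',
  "['phone', 'email', 'id_card']"
) AS result;

Result

+----------------------------------------------+
| result                                       |
+----------------------------------------------+
| 聯絡我:[MSKED],郵箱是[MSKED],社會安全號碼[MSKED]   |
+----------------------------------------------+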
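
The same call can be applied to a column to produce a masked copy of stored text. This is a minimal sketch that assumes a hypothetical table chat_log(id, message); it passes the labels in the Array<VARCHAR> form that the parameter description allows.

```sql
-- Hypothetical table: chat_log(id BIGINT, message VARCHAR).
-- Return each message with phone numbers, email addresses, and ID card numbers masked.
SELECT id,
       ai_mask(message, ARRAY['phone', 'email', 'id_card']) AS masked_message
FROM chat_log;
```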

ai_summarize

ai_summarize(text)
ai_summarize(model_name, text)
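  • Description: semantically condenses a long text into a concise, accurate summary.

  • Input parameters:

    • model_name: the model name, of type VARCHAR.

    • text: the input text, of type VARCHAR.

  • Return type: VARCHAR.

Example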

SELECT ai_summarize('曲曲折折的荷塘上面,彌望的是田田的葉子。葉子出水很高,像亭亭的舞女的裙。層層的葉子中間,零星地點綴著些白花,有嫋娜地開著的,有羞澀地打著朵兒的;正如一粒粒的明珠,又如碧天裡的星星,又如剛出浴的美人。微風過處,送來縷縷清香,彷彿遠處高樓上渺茫的歌聲似的。這時候葉子與花也有一絲的顫動,像閃電般,霎時傳過荷塘的那邊去了。葉子本是肩並肩密密地挨著,這便宛然有了一道凝碧的波痕。葉子底下是脈脈的流水,遮住了,不能見一些顏色;而葉子卻更見風致了。月光如流水一般,靜靜地瀉在這一片葉子和花上。薄薄的青霧浮起在荷塘裡。葉子和花彷彿在牛乳中洗過一樣;又像籠著輕紗的夢。雖然是滿月,天上卻有一層淡淡的雲,所以不能朗照;但我以為這恰是到了好處——酣眠固不可少,小睡也別有風味的。月光是隔了樹照過來的,高處叢生的灌木,落下參差的斑駁的黑影,峭楞楞如鬼一般;彎彎的楊柳的稀疏的倩影,卻又像是畫在荷葉上。塘中的月色並不均勻;但光與影有著和諧的旋律,如梵婀玲上奏著的名曲。') AS result;

Result

+---------------------------------------------------------+
| result                                                  |
+---------------------------------------------------------+
| 月光下的荷塘靜謐優美:田田荷葉如舞女裙,白花點綴其間,似明珠、星星或出浴美人;微風送香,葉花輕顫,形成碧波;流水隱於葉下,更顯風致。月光如水,青霧輕籠,葉花如洗,朦朧如夢。雲遮滿月,光影斑駁,灌木黑影峭楞,楊柳倩影如畫,光與影和諧如名曲。| 

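Audio functions

ai_audio_transcribe

Only built-in models are supported.

ai_audio_transcribe(url)
ai_audio_transcribe(model_name, url)
ai_audio_transcribe(model_name, url, options)
  • Description: transcribes audio into text. You can specify the segmentation mode, target language, and other options as needed. Returns the transcript text or the URL of the file that contains it.

  • Input parameters:

    • model_name: the model name, of type VARCHAR.

    • url: the address of the input audio file, of type VARCHAR.

    • options: optional parameters, passed as JSON-formatted text. The supported parameters are as follows.

      +------------------+-----------------------------------+--------------------------------------------+---------+
      | Parameter        | Description                       | Valid values                               | Default |
      +------------------+-----------------------------------+--------------------------------------------+---------+
      | language         | Target language of the transcript | cn (Chinese)                               | cn      |
      |                  |                                   | en (English)                               |         |
      |                  |                                   | ja (Japanese)                              |         |
      |                  |                                   | yue (Cantonese)                            |         |
      |                  |                                   | fspk (free mix of Chinese and English)     |         |
      +------------------+-----------------------------------+--------------------------------------------+---------+
      | diarization_mode | How the transcript is segmented   | word (by word)                             | word    |
      |                  |                                   | sentence (by sentence)                     |         |
      |                  |                                   | speaker (by speaker)                       |         |
      +------------------+-----------------------------------+--------------------------------------------+---------+
      | output_type      | Output format                     | json (return the transcript as JSON text)  | json    |
      |                  |                                   | url (save to OSS and return the file URL)  |         |
      +------------------+-----------------------------------+--------------------------------------------+---------+

  • Return type: VARCHAR.

Example 1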

SELECT ai_audio_transcribe("https://dashscope.oss-cn-beijing.aliyuncs.com/samples/audio/paraformer/hello_world_female2.wav") as result

Result

+----------------------------------------------+
| result                                       |
+----------------------------------------------+
| {"TaskId": "xxxx", "Transcription": {"AudioInfo": {"Size": 128480, "Duration": 3834, "SampleRate": 16000, "Language": "cn"}, "Paragraphs": [{"ParagraphId": "1768447804076500000", "SpeakerId": "1", "Words": [{"Id": 10, "SentenceId": 1, "Start": 100, "End": 595, "Text": "Hello, "}, {"Id": 20, "SentenceId": 1, "Start": 596, "End": 841, "Text": "world, "}, {"Id": 30, "SentenceId": 1, "Start": 844, "End": 1588, "Text": "這裡是"}, {"Id": 40, "SentenceId": 1, "Start": 1588, "End": 2580, "Text": "阿里巴巴"}, {"Id": 50, "SentenceId": 1, "Start": 2580, "End": 3076, "Text": "語音"}, {"Id": 60, "SentenceId": 1, "Start": 3076, "End": 3820, "Text": "實驗室。"}]}], "AudioSegments": [[100, 3820]]}}                                | 

Example 2

SELECT ai_audio_transcribe("tingwu", "https://dashscope.oss-cn-beijing.aliyuncs.com/samples/audio/paraformer/hello_world_female2.wav", "{'diarization_mode':'sentence'}") as result

Result

+----------------------------------------------+
| result                                       |
+----------------------------------------------+
| {"TaskId": "xxxx", "Transcription": {"AudioInfo": {"Size": 128480, "Duration": 3834, "SampleRate": 16000, "Language": "cn"}, "Speakers": [{"SpeakerId": "1", "Sentences": [{"SentenceId": 1, "Start": 100, "End": 3820, "Text": "Hello, world, 這裡是阿里巴巴語音實驗室。"}]}]}}| 
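
The options listed in the parameter table can presumably be combined in a single JSON string. The sketch below assumes that combination is accepted; it requests speaker-level segmentation of mixed Chinese and English speech and asks for the result as a file URL instead of inline JSON. The option names and values come from the table above, while the result_url alias is illustrative.

```sql
-- Assumption: several options may be set together in one JSON-formatted string.
SELECT ai_audio_transcribe(
  "tingwu",
  "https://dashscope.oss-cn-beijing.aliyuncs.com/samples/audio/paraformer/hello_world_female2.wav",
  "{'language':'fspk','diarization_mode':'speaker','output_type':'url'}"
) AS result_url;
```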

ai_audio_embed

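Audio vector embedding. Only built-in models are supported.

ai_audio_embed(text)
ai_audio_embed(model_name, text)
ai_audio_embed(model_name, text, options)
  • Description: converts audio into its embedding vector and returns the vector as an array. Parameters control whether the audio is first cut to a start and end time before the embedding is generated.

  • Input parameters:

    • model_name: the model name, of type VARCHAR.

    • text: the address of the input audio file (a URL, as in the example), of type VARCHAR.

    • options: optional parameters, passed as JSON-formatted text. The supported parameters are as follows.

      +---------------+----------------------------+------------------------------------------------------+
      | Parameter     | Description                | Remarks                                              |
      +---------------+----------------------------+------------------------------------------------------+
      | access_id     | AccessKey ID for OSS       | Not required when role_arn is used.                  |
      | access_secret | AccessKey secret for OSS   | Not required when role_arn is used.                  |
      | token         | Security token for OSS     | Not required when role_arn is used.                  |
      | role_arn      | Role-assumption credential | Alternative to access_id, access_secret, and token.  |
      | start_time    | Start time of the audio    | Defaults to the beginning of the audio if omitted.   |
      | end_time      | End time of the audio      | Defaults to the end of the audio if omitted.         |
      +---------------+----------------------------+------------------------------------------------------+

  • Return type: Array<FLOAT>.

Example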

SELECT ai_audio_embed("https://dashscope.oss-cn-beijing.aliyuncs.com/samples/audio/paraformer/hello_world_female2.wav")

Result

[0.4234241,0.3575562,0.022631245,-0.09829298,-0.40605977,-0.72465694,-1.8039973,2.4773192,0.91702485,0.15146817,0.7209677,0.78260005,0.8019302,2.7577147,-0.86063147,-0.8249354,0.28732437,0.9009849,0.8899265,-1.5320694,-2.0866642,0.5051579,1.3553369,2.0348034,-0.74032015,-0.66065377,0.2756823,0.15714298,-2.8250875,0.8162984,0.20878711,0.8285897,1.4863579,1.0645012,-1.2933203,0.6919463,0.87850314,1.2212433,-0.44420764,-1.0146322,-1.8209833,0.16996253,-1.534067,-0.10253409,0.07611574,2.4005992,-1.2903963,0.6613166,-0.2554945,0.97570425,-0.108424075,0.82061934,-0.18102498,-0.27137256,-1.2723781,-0.9188018,-0.027503261,-0.15346082,-0.9564006,-0.4618776,-1.077228,-0.27756396,0.28749034,0.2570488,-0.39753774,0.5270934,0.6970718,-1.2984067,2.4312186,1.485573,1.6031872,-0.6158852,-1.1512791,0.49305362,-1.1939837,0.72933173,0.2659372,2.5559616,-0.41886744,0.6304756,-0.6171451,-0.5282561,0.78426796,-0.57570314,-0.91166306,-0.531768,0.96598166,0.17423655,-0.9133532,-0.09258405,-0.26951456,-0.34667063,-0.21612991,-0.054430377,0.052798696,0.5198285,1.9431844,-0.5638291,-2.1043544,-0.47198293,0.21959576,0.5975169,0.3951007,0.28112686,1.792782,-0.15896009,-0.71152246,1.1540072,0.37563428,0.0016163457,0.6544563,0.043974634,-1.6375796,-0.3698572,1.5310435,-0.20282944,0.73641545,2.1867669,1.4115202,-0.8114418,-0.36265984,0.2191038,-0.8933491,0.6616433,1.545423,0.10970531,-0.109743424,0.44605017,-0.087265715,0.29077715,-2.4435875,-1.1484046,0.078253455,0.02283861,-0.29482847,2.0237377,-0.6009212,-1.2377249,0.05194488,-0.16573223,0.8370868,-0.43602425,-0.4706988,0.17996503,0.88799554,0.6562157,-0.7520023,0.64789253,0.36865348,-1.4820247,0.5299587,1.1884397,1.4351603,0.21073115,-0.1712549,2.9410155,1.7485802,-0.6828801,-0.4833641,-0.4477328,2.3307724,-0.35595444,-0.61682695,-0.5370858,-0.8068234,1.2195143,-1.0834758,0.45275012,1.6243625,-1.3629726,-0.2959109,0.05071621,-0.7280639,-0.1713935,-0.43650395,-0.2131698,0.25380868,-0.16288652,2.0921175,0.3555297,-0.10752705,0.5716714,-0.0994380
8,-1.9066161,0.12683412,0.8566219,-0.20323975,1.4260024,-0.28789783,0.32412016,0.88897985,0.86555415,0.5940666,0.6645551,0.046334613,-1.5640875,-0.3510718,0.55363727,-0.13503519,0.67008746,-0.9184686,0.025278697,1.0128921,-0.61241907,-0.01107134,-0.8309523,-0.51025873,1.1141272,0.28813183,-0.10634196,-0.9954202,-0.07171043,-0.5856978,-1.2660285,2.1327019,-0.60268104,-0.6227884,-0.42101067,0.059599742,-0.09640202,-1.258711,0.5054924,-1.1847625,-0.044398762,-0.98595804,0.8883682,0.6221085,-2.5484293,0.7249505,0.69930685,0.7739025,-0.8139478,-1.1988907,-1.0416493,2.0153732,-1.7091763,-0.5611238,-1.2147603,0.9113469,1.5113174,0.23810485,-1.702736,-0.3295935,-0.41867778,-1.0378691,-0.45600057,-0.43525052,-0.1078409,-1.2993969,0.12842774,0.026097976,-0.7705405,-1.0907317,0.28274077,1.236289,1.6190177,2.1874366,0.16072829,-0.33150536,2.218483,-1.1703843,0.10300327,1.1994884,0.48275462,-0.40795812,0.5020531,1.1787555,-0.082187966,0.6315653,0.36654752,-0.24940589,-0.8652801,1.6283739,0.41405886,0.6377814,0.08396838,-1.0169003,1.2100558,1.4457762,-0.07999261,-0.012102162,0.85055244,-0.09711141,-1.0452846,0.13768612,-2.0506873,-1.6474499,0.043265514,-1.0009454,-0.0111249415,-1.2523409,-0.080719866,-0.6187693,-1.398226,0.6425289,-0.4808641,-0.06030046,-0.10275636,-0.31625932,-1.5993032,-0.20966552,-1.4618409,0.34925935,-0.5034448,0.100028045,0.25327235,-1.078896,-0.23394233,1.2247928,-2.6050038,-0.71609926,-0.77765155,-1.2089496,0.8526703,-0.1358416,1.1074059,1.1545771,-0.94525933,0.41012967,0.9361201,-0.14788401,-0.29333082,1.5782444,1.1100405,-0.4074414,-0.3862537,-0.5779069,-0.88644946,0.2233385,1.3612705,1.2413827,-1.3625424,-1.3623037,0.3056319,-1.4446377,0.64613384,0.15064861,-0.61473364,-1.3611295,-0.1975697,-1.0701923,0.7591377,-1.2106745,-2.067824,0.45041704,-0.71582735,-1.743847,1.169414,2.0158787,0.4734838,-0.3133036,-1.9916989,1.1441987,0.9155275,-1.3003027,0.82898057,-0.7439868,1.2072865,0.46877453,0.6648313,0.80477613,1.6927507,0.5842916,0.36608973,2.259473,-1.162
8797,-0.3311869,0.36989415,-0.25035658,-0.28012496,1.092324,-0.40238732,0.0046352614,-2.0625768,1.161326,0.92277956,-0.20316431,-0.15164377,1.1715052,-0.7067665,-0.8608931,0.079684004,-0.89916384,0.02488108,-0.57668805,-0.879138,-1.7274998,0.824049,1.5638052,0.28415555,-0.17347385,-2.085917,0.4987632,-0.031601395,-1.0190377,-0.7815816,0.69643223,-0.33574122,2.1242745,0.19785832,0.55690974,-1.3932099,-0.44952002,-0.7719798,1.1326215,0.43839702,0.996999,-0.55692834,1.4014084,-2.3939395,-1.0112343,-1.4143248,1.1514114,1.1233637,-0.3385779,-0.23665123,-0.100857966,0.4633971,1.6215613,-0.0692888,-0.031505972,1.0472811,0.57112235,0.6015763,0.07582237,0.52702487,-1.3809607,-1.7482765,0.38386008,0.99316126,0.95603,0.40644804,-1.5072054,-0.34419048,0.63205683,-1.0854999,-0.92245156,-0.2712947,-0.75696105,0.996232,-0.10738732,-0.4674776,-1.2149413,-1.5094053,1.6796608,0.21961057,-0.35295358,-1.2609407,-0.040009048,-0.38785484,0.7788784,-0.65823495,2.0559616,-1.0074826,1.2282485,1.2540467,0.4914942,-0.47057188,-0.47061247,-0.16255763,-0.6718562,-0.53630847,0.4804698,-0.3134068,0.6407026,-0.727981,0.0481851,0.06927338,0.8321921,-0.6639807,0.74932885,0.23291564,0.76362675,-1.2966217,0.8806557,-1.2141875,0.6996881,-0.8293652,0.9085288,1.8878758,0.11363957,0.148718,0.5030497,-1.0422761,0.08673843,-0.80342984,-0.8046266,-0.18026677,0.28900644,0.76534355,-1.3163859,-0.72775376,0.36529016,-0.3660175,0.31056792,0.052575216,-1.11831,0.7895246,1.1172394,-0.31374845,0.17143561,-0.42633826,0.16579832,-0.012790448,-1.4290546,-0.47322562,0.7557427,1.0487514,-0.14331971,0.30455515,0.6938429,0.5565799]

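Configure the default model for AI functions

If you do not specify a model name when calling an AI function, the system uses the default model. You can change the default model through the following configuration items. For how to modify the configuration, see Config and Hint configuration parameters.

+--------------------------------------------+----------------------+--------------------------------------+
| Model type                                 | Default model        | Configuration item                   |
+--------------------------------------------+----------------------+--------------------------------------+
| TEXT_TO_TEXT (text model)                  | qwen-plus            | AI_FUNCTION_TEXT_TO_TEXT_MODEL       |
| TEXT_TO_EMBEDDING (text embedding model)   | text-embedding-v2    | AI_FUNCTION_TEXT_TO_EMBEDDING_MODEL  |
| AUDIO_TO_TEXT (audio-to-text model)        | tingwu               | AI_FUNCTION_AUDIO_TO_TEXT_MODEL      |
| AUDIO_TO_EMBEDDING (audio embedding model) | qwen2.5-vl-embedding | AI_FUNCTION_AUDIO_TO_EMBEDDING_MODEL |
+--------------------------------------------+----------------------+--------------------------------------+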
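
As an illustration only: AnalyticDB for MySQL configuration parameters are generally set with SET ADB_CONFIG (cluster level) or with a /*+ ... */ hint (query level). The sketch below assumes that the configuration items above follow that convention and accept the name of a model created with CREATE MODEL as the value. Verify the exact syntax against Config and Hint configuration parameters before use.

```sql
-- Assumption: the configuration item is set like other ADB_CONFIG parameters.
-- Make qwen_plus_external the default text model for the whole cluster.
SET ADB_CONFIG AI_FUNCTION_TEXT_TO_TEXT_MODEL=qwen_plus_external;

-- Assumption: the same item can be overridden for a single query with a hint.
/*+ AI_FUNCTION_TEXT_TO_TEXT_MODEL=qwen_plus_external */
SELECT ai_sentiment('這款耳機音質出色,佩戴舒適,強烈推薦!') AS sentiment;
```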

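Typical scenarios

Scenario 1: Text recall

Applicable scenario: in scenarios such as RAG and customer service systems, search a text library by the semantics of a user's question and select the best-matching results.

Use ai_embedd to encode both the search text and the texts in the library as vectors, and then sort by cosine similarity in descending order to obtain the ten most similar texts.

SELECT cosine_similarity(ai_embedd("text to search for"), ai_embedd(text)) AS cos
FROM text_table
ORDER BY cos DESC
LIMIT 10

To further optimize vector search performance, see Vector search.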

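Scenario 2: Voiceprint retrieval

Applicable scenario: for meetings, everyday conversations, and similar recordings, transcribe the audio into text and separate the dialogue by speaker.

To analyze the speakers further, split the audio by speaker and compare embeddings against a voice dataset. For the overall solution, see Voiceprint retrieval.

  • Transcription and segmentation: use the AI_AUDIO_TRANSCRIBE function to convert the audio into text, segmented by speaker. Because options is the third argument, the model name must be passed explicitly.

    SELECT AI_AUDIO_TRANSCRIBE("tingwu", "oss://xxx", "{'diarization_mode':'speaker'}")
  • Voiceprint retrieval: use the AI_AUDIO_EMBED function to compute embedding vectors for the input audio and for each audio file in the library, compute l2_distance between them, and select the 10 closest.

    SELECT l2_distance(i.input_embedding, o.origin_embedding) AS ld
    FROM (SELECT ai_audio_embed("oss://xxx") AS input_embedding) AS i
    CROSS JOIN (SELECT ai_audio_embed(url) AS origin_embedding FROM audio_table) AS o
    ORDER BY ld
    LIMIT 10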

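Scenario 3: Extract dissatisfied customer comments

Applicable scenario: pick out dissatisfied feedback from user comments.

The customer_comment table stores customer comments for each day. Use the sentiment analysis function ai_sentiment to determine each customer's sentiment and select all dissatisfied comments from January 1, 2026.

-- The qwen_plus_external model has been created in advance and maps to the qwen-plus model provided by Alibaba Cloud Model Studio (Bailian).
SELECT name, comment
FROM customer_comment
WHERE ai_sentiment("qwen_plus_external", comment)='negative'
AND dt='20260101'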
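
Because AI functions can return different results for the same input, rerunning the query above may not select exactly the same rows. One option, sketched here under stated assumptions, is to compute the labels once and store them. The target table customer_comment_sentiment is hypothetical and must be created beforehand, and the dt column is assumed to be a VARCHAR as in the query above.

```sql
-- Hypothetical target table, created beforehand:
--   customer_comment_sentiment(name VARCHAR, comment VARCHAR, dt VARCHAR, sentiment VARCHAR)
-- Compute the sentiment label once per comment and persist it.
INSERT INTO customer_comment_sentiment
SELECT name,
       comment,
       dt,
       ai_sentiment("qwen_plus_external", comment) AS sentiment
FROM customer_comment
WHERE dt = '20260101';

-- Later analysis reads the stored labels instead of calling the model again.
SELECT name, comment
FROM customer_comment_sentiment
WHERE sentiment = 'negative'
  AND dt = '20260101';
```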