Alibaba has introduced ZEROSEARCH, a new approach for training large language models (LLMs) that significantly reduces the cost of teaching AI to perform search tasks. The approach eliminates expensive API calls to commercial search engines by training models to simulate search behavior, cutting training costs by nearly 90% and making advanced AI search functionality more accessible.
"By drastically reducing the costs involved in training LLMs to simulate search engine behavior, we are enabling developers and businesses—especially small and medium-sized enterprises—to independently develop their own reinforcement learning (RL) framework without costly search engine interactions," said Huang Fei, Head of Alibaba's Tongyi Natural Language Processing Lab. "ZEROSEARCH is a major milestone in democratizing large-scale RL technologies by enhancing affordability without compromising performance."
Searching for relevant information is crucial for enhancing LLMs' reasoning and response accuracy. Traditional RL methods, however, require hundreds of thousands of interactions with live search engines through costly API requests, making training prohibitively expensive and limiting scalability. The inconsistent quality of results returned by search engines also undermines the effectiveness of training.
To address these challenges, Alibaba's ZEROSEARCH employs a two-step simulation strategy that eliminates the need for costly API calls:
First, the team applied lightweight supervised fine-tuning to transform the LLM into a retrieval module capable of generating relevant documents in response to user queries, mimicking the behavior of real search engines.
Second, during the RL phase, researchers utilized a curriculum-based rollout strategy. This method progressively reduces the quality of simulated documents generated, challenging the model to adapt and continuously improve its performance.
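The curriculum idea described above can be sketched as a simple noise ramp: as training progresses, the simulated search engine returns low-quality documents more often. This is a minimal illustration only—the linear schedule and the `gen_useful`/`gen_noisy` generator hooks are assumptions for the sake of the sketch, not ZEROSEARCH's actual schedule or interface.

```python
import random

def noise_probability(step, total_steps, p_start=0.0, p_end=0.5):
    """Linearly ramp the chance of emitting a low-quality document.
    Illustrative schedule; the real curriculum curve may differ."""
    frac = step / max(total_steps - 1, 1)
    return p_start + frac * (p_end - p_start)

def simulate_search(query, step, total_steps, gen_useful, gen_noisy):
    """Return one simulated document for `query`, degrading quality
    over the course of training so the policy must keep adapting."""
    if random.random() < noise_probability(step, total_steps):
        return gen_noisy(query)   # distractor / low-quality document
    return gen_useful(query)      # relevant document
```

In practice the two generator hooks would both be the fine-tuned retrieval LLM, prompted to produce either a relevant or a deliberately noisy document for the query.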
In testing, models trained using ZEROSEARCH matched or exceeded the performance of those trained with actual search engine APIs. For example, a Qwen2.5-7B retrieval module demonstrated performance comparable to Google Search, while a larger 14B module surpassed Google's capabilities, achieving an 88% reduction in training costs.
Beyond cost-efficient training, Alibaba has open-sourced multiple AI models across sizes, languages, and modalities, helping global developers build custom AI solutions affordably.
Independent evaluations by Artificial Analysis, a well-recognized independent benchmarking organization for AI models and API providers, placed Alibaba's latest LLM, Qwen3-235B-A22B, fifth overall in Intelligence (math, coding, reasoning, and science) and first in affordability, with pricing significantly below competing offerings.
Learn more about Alibaba Cloud for Generative AI.
This article, written by Crystal Liu, was originally published on Alizila.