Lightblue Releases Japanese-Language LLMs based on Qwen-14B for Commercial Use

Lightblue has released the most performant 14-billion parameter Japanese LLMs, suitable for on-premise commercial use.

Lightblue releases 14-billion parameter Japanese LLMs for commercial use. These LLMs achieved the highest performance among existing Japanese-language models, and are suitable for on-premise environments.

Lightblue Inc. (Lightblue), which operates LLab for research and development of generative AI and Japanese-language LLMs, has announced the release of the Japanese-language LLMs "Karasu" and "Qarasu" for commercial use. The model names come from Yatagarasu, a crow in Japanese mythology that is revered as a god of guidance.


You can access these models with the following URL:

About the Karasu/Qarasu Series

  • Karasu Series: built on Shisa, with further pre-training and fine-tuning on 7 billion tokens of Japanese and English data.
  • Qarasu Series: fine-tuned based on Qwen-14B, utilizing the know-how cultivated in the Karasu series.

Model Features

  • Karasu Series: The 7-billion-parameter model is very lightweight, and provides performance on par with comparable 13-billion-parameter models.
  • Qarasu Series: The 14-billion-parameter model achieved the highest performance among all Japanese-language models on the market, approaching the performance of GPT-3.5.

When tested on the six tasks of MT-Bench (one of the benchmarks used to evaluate the performance of Japanese-language models), Karasu achieved an average score of 6.70 and Qarasu an average score of 7.60 (Table 1).

Details of the various models released are provided in the following articles:

  • English version by a data scientist is available here
  • Japanese version is available here

You can try the chatbot demo of Qarasu with the following URL: https://lightblue-qarasu.serveo.net

Depending on the access method, you may need to wait until the request is processed.

Karasu/Qarasu Series Performance

Figure 1: Evaluation results using MT-Bench, the Japanese-language model benchmark

Table 1: List of Japanese-language model benchmark MT-Bench scores

  • Karasu Series: Released under the Apache 2.0 license and available for commercial use.
  • Qarasu Series: The license is inherited from Qwen's Tongyi Qianwen License Agreement, which permits commercial use under specific conditions. For more information, check the official license agreement here.
  • Press release details are available here

About LLab

Lightblue's LLab, a development team specializing in generative AI, supports practical LLM implementations to help customers utilize generative AI. LLab not only provides unique models tailored to each company’s on-premise infrastructure, but also fully utilizes the know-how cultivated through DX consulting and contracted development, to perform customization based on an understanding of the operations of companies, departments, and fields.

Fields suitable for independent development

1.  Fields with a high degree of specialization:

Fields involving many technical terms for different industries and corporations, such as construction and pharmaceuticals.

2.  Fields requiring a high level of information security:

Financial, medical, and other fields that require high-level data breach protection measures.

3.  Fields where it is difficult to access the Internet:

Factories and construction sites where it is difficult to maintain a communication environment.

About Lightblue

Lightblue is a University of Tokyo startup that develops AI solutions such as image analysis and natural language processing with the aim of democratizing AI. LLab, a team dedicated to the research and development of generative AI and LLMs, has been established to develop AI models with an emphasis on security and transparency. We aim to expand the use of AI technology and bring positive change to society.

Alibaba Cloud Community
