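Note:
- Before using the SDK, make sure you have read the API reference documentation.
Download and installation
The latest version of the SDK is available from the Maven repository: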
<dependency>
    <groupId>com.alibaba.nls</groupId>
    <artifactId>nls-sdk-tts</artifactId>
    <version>2.1.0</version>
</dependency>
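For usage, see the code example below. Demo source code: download link.
After unpacking the demo, run mvn package in the directory that contains the pom file. This generates the executable JAR nls-example-tts-2.0.0-jar-with-dependencies.jar in the target directory. Copy this JAR to the target server to quickly verify and stress-test the service.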
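Service verification
Run the following command and provide the parameters as prompted. After the run, logs/nls.log is generated in the directory where the command was executed.
java -cp nls-example-tts-2.0.0-jar-with-dependencies.jar com.alibaba.nls.client.SpeechSynthesizerDemo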
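Stress testing
Run the following command and provide the parameters as prompted. The Alibaba Cloud service URL parameter is wss://nls-gateway-ap-southeast-1.aliyuncs.com/ws/v1. Choose the number of concurrent connections carefully, based on the concurrency you have purchased.
java -jar nls-example-tts-2.0.0-jar-with-dependencies.jar
Note: Running your own stress test with more than 2 concurrent connections incurs fees.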
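Key interfaces
- NlsClient: The speech processing client. It acts as a factory for all speech-related processing classes. Create a single global instance. Thread-safe.
- SpeechSynthesizer: The speech synthesis class. Use it to set request parameters and send the request. Not thread-safe.
- SpeechSynthesizerListener: The speech synthesis listener class. It listens for returned results. Not thread-safe. You must implement the following two abstract methods: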
/**
 * Receives the binary audio data of the speech synthesis.
 */
public abstract void onMessage(ByteBuffer message);

/**
 * Notification that the speech synthesis has completed.
 *
 * @param response
 */
public abstract void onComplete(SpeechSynthesizerResponse response);
For more information, see the API documentation: Java API reference
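The onMessage callback delivers the audio in chunks, so an implementation typically has to drain each ByteBuffer and append it somewhere. The following is a minimal sketch of that step using only the JDK. AudioChunkCollector is a hypothetical helper name, not part of the SDK. Its append method is meant to be called from your onMessage implementation, and toByteArray from onComplete.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;

/**
 * Hypothetical helper (not part of the SDK): accumulates audio chunks
 * in memory in the order they are received.
 */
public class AudioChunkCollector {
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();

    /** Copies the readable bytes of one chunk; intended to be called from onMessage. */
    public void append(ByteBuffer message) {
        byte[] chunk = new byte[message.remaining()];
        message.get(chunk);
        buffer.write(chunk, 0, chunk.length);
    }

    /** Returns all bytes received so far; intended to be called from onComplete. */
    public byte[] toByteArray() {
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        AudioChunkCollector collector = new AudioChunkCollector();
        collector.append(ByteBuffer.wrap(new byte[] {1, 2, 3}));
        collector.append(ByteBuffer.wrap(new byte[] {4, 5}));
        System.out.println(collector.toByteArray().length); // prints 5
    }
}
```

Because the listener is not thread-safe and belongs to a single synthesis task, the sketch assumes one collector instance per listener.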
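SDK usage notes
- Create the NlsClient object once and reuse it. NlsClient is built on the Netty framework, so creating it consumes considerable time and resources, and creating a new one for every request degrades performance. We recommend tying the creation and shutdown of NlsClient to the lifecycle of your application.
- A SpeechSynthesizer object cannot be reused. Each speech synthesis task requires its own SpeechSynthesizer object. For example, synthesizing N texts means running N synthesis tasks and creating N SpeechSynthesizer objects.
- A SpeechSynthesizerListener object corresponds to exactly one SpeechSynthesizer object. Do not set the same SpeechSynthesizerListener object on multiple SpeechSynthesizer objects. Otherwise, you cannot tell which synthesis task a callback belongs to.
- The Java SDK depends on the Netty network library, version 4.1.17.Final or later. If your application also depends on Netty, make sure its version meets this requirement.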
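When checking the Netty requirement, note that version strings must be compared numerically: a plain string comparison would rank 4.1.9.Final above 4.1.17.Final. The following sketch shows one way to do the comparison. NettyVersionCheck and its methods are hypothetical names introduced for illustration, and the sketch assumes the version string starts with three dot-separated numbers, as in 4.1.17.Final. How you obtain the version string, for example from your build tool's dependency report, is left open.

```java
/**
 * Hypothetical helper (not part of the SDK or Netty): checks whether a
 * Netty version string is at least 4.1.17.Final.
 */
public class NettyVersionCheck {
    /** Minimum Netty version stated in the usage notes. */
    static final String MINIMUM = "4.1.17.Final";

    /** Parses the three leading numeric parts of a version such as "4.1.17.Final". */
    static int[] parse(String version) {
        String[] parts = version.split("\\.");
        if (parts.length < 3) {
            throw new IllegalArgumentException("Unexpected version format: " + version);
        }
        return new int[] {
            Integer.parseInt(parts[0]),
            Integer.parseInt(parts[1]),
            Integer.parseInt(parts[2])
        };
    }

    /** Returns true if the given version is numerically at or above the minimum. */
    public static boolean meetsMinimum(String version) {
        int[] actual = parse(version);
        int[] required = parse(MINIMUM);
        for (int i = 0; i < 3; i++) {
            if (actual[i] != required[i]) {
                return actual[i] > required[i];
            }
        }
        return true; // all three parts are equal
    }

    public static void main(String[] args) {
        System.out.println(meetsMinimum("4.1.17.Final")); // prints true
        System.out.println(meetsMinimum("4.1.9.Final"));  // prints false
    }
}
```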
Code example
Note 1: The demo shows how to set the URL when creating the NlsClient object:
client = new NlsClient("wss://nls-gateway-ap-southeast-1.aliyuncs.com/ws/v1", accessToken);
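Note 2: The demo saves the synthesized audio to a file. If you need to play the audio and require low latency, we recommend streaming playback, that is, playing the audio while it is still being received.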
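One possible way to structure streaming playback is a producer/consumer queue: the listener callback only enqueues each chunk, and a separate thread writes chunks to the audio output as they arrive. The following is a sketch under stated assumptions, not the SDK's own mechanism. StreamingPlaybackSketch, onAudio, and finish are hypothetical names, and the OutputStream sink stands in for a real audio device so that the example runs with the JDK alone.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.ByteBuffer;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical sketch: forwards audio chunks to a sink while they are still arriving. */
public class StreamingPlaybackSketch {
    // Sentinel object that tells the player thread to stop (compared by identity).
    private static final byte[] END = new byte[0];

    private final BlockingQueue<byte[]> queue = new LinkedBlockingQueue<>();
    private final Thread player;

    /** The sink is an assumption: a stand-in for whatever actually plays the audio. */
    public StreamingPlaybackSketch(OutputStream sink) {
        player = new Thread(() -> {
            try {
                for (byte[] chunk = queue.take(); chunk != END; chunk = queue.take()) {
                    sink.write(chunk); // "play" the chunk as soon as it is available
                }
                sink.flush();
            } catch (IOException e) {
                e.printStackTrace();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        player.start();
    }

    /** Intended to be called from onMessage: copies the chunk and returns immediately. */
    public void onAudio(ByteBuffer message) {
        byte[] chunk = new byte[message.remaining()];
        message.get(chunk);
        queue.add(chunk);
    }

    /** Intended to be called from onComplete: waits until all queued audio is written. */
    public void finish() {
        queue.add(END);
        try {
            player.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        StreamingPlaybackSketch playback = new StreamingPlaybackSketch(sink);
        playback.onAudio(ByteBuffer.wrap(new byte[] {1, 2}));
        playback.onAudio(ByteBuffer.wrap(new byte[] {3}));
        playback.finish();
        System.out.println(sink.size()); // prints 3
    }
}
```

Keeping the callback limited to a copy and an enqueue means a slow audio device does not block the thread that receives data from the network.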
Example:
package com.alibaba.nls.client;

import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;

import com.alibaba.nls.client.protocol.NlsClient;
import com.alibaba.nls.client.protocol.OutputFormatEnum;
import com.alibaba.nls.client.protocol.SampleRateEnum;
import com.alibaba.nls.client.protocol.tts.SpeechSynthesizer;
import com.alibaba.nls.client.protocol.tts.SpeechSynthesizerListener;
import com.alibaba.nls.client.protocol.tts.SpeechSynthesizerResponse;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/**
 * The demo of speech synthesis.
 * (for demo show only)
 */
public class SpeechSynthesizerDemo {
    private static final Logger logger = LoggerFactory.getLogger(SpeechSynthesizerDemo.class);
    private static long startTime;
    private String appKey;
    NlsClient client;

    public SpeechSynthesizerDemo(String appKey, String token) {
        this.appKey = appKey;
        // Create an NlsClient object. You can globally create an NlsClient object and specify the endpoint.
        client = new NlsClient("wss://nls-gateway-ap-southeast-1.aliyuncs.com/ws/v1", token);
    }

    private static SpeechSynthesizerListener getSynthesizerListener() {
        SpeechSynthesizerListener listener = null;
        try {
            listener = new SpeechSynthesizerListener() {
                File f = new File("tts_test.wav");
                FileOutputStream fout = new FileOutputStream(f);
                private boolean firstRecvBinary = true;

                // Close the output file once the task has finished or failed.
                private void closeOutput() {
                    try {
                        fout.close();
                    } catch (IOException e) {
                        e.printStackTrace();
                    }
                }

                // Speech synthesis is completed.
                @Override
                public void onComplete(SpeechSynthesizerResponse response) {
                    System.out.println("name: " + response.getName() + ", status: " + response.getStatus()
                        + ", output file: " + f.getAbsolutePath());
                    closeOutput();
                }

                // The speech binary data of speech synthesis.
                @Override
                public void onMessage(ByteBuffer message) {
                    try {
                        if (firstRecvBinary) {
                            // Log the latency of the first binary packet.
                            firstRecvBinary = false;
                            long now = System.currentTimeMillis();
                            logger.info("tts first latency : " + (now - SpeechSynthesizerDemo.startTime) + " ms");
                        }
                        byte[] bytesArray = new byte[message.remaining()];
                        message.get(bytesArray, 0, bytesArray.length);
                        //System.out.println("write array:" + bytesArray.length);
                        fout.write(bytesArray);
                    } catch (IOException e) {
                        e.printStackTrace();
                    }
                }

                @Override
                public void onFail(SpeechSynthesizerResponse response) {
                    System.out.println(
                        "task_id: " + response.getTaskId() +
                            // the code 20000000 indicates that the request is successful.
                            ", status: " + response.getStatus() +
                            // error message
                            ", status_text: " + response.getStatusText());
                    closeOutput();
                }
            };
        } catch (Exception e) {
            e.printStackTrace();
        }
        return listener;
    }

    public void process() {
        SpeechSynthesizer synthesizer = null;
        try {
            // Create an object and establish a connection.
            synthesizer = new SpeechSynthesizer(client, getSynthesizerListener());
            synthesizer.setAppKey(appKey);
            // Specify the audio coding format of the returned audio file.
            synthesizer.setFormat(OutputFormatEnum.WAV);
            // Specify the audio sampling rate of the returned audio file.
            synthesizer.setSampleRate(SampleRateEnum.SAMPLE_RATE_16K);
            // The speaker.
            synthesizer.setVoice("siyue");
            // Optional. The intonation. Value range: -500 to 500. Default value: 0.
            synthesizer.setPitchRate(100);
            // The speed. Value range: -500 to 500. Default value: 0.
            synthesizer.setSpeechRate(100);
            // Set the text to be synthesized.
            synthesizer.setText("hello world!");
            // Serialize preceding parameters to the JSON format and send them to the server for confirmation.
            long start = System.currentTimeMillis();
            synthesizer.start();
            logger.info("tts start latency " + (System.currentTimeMillis() - start) + " ms");
            SpeechSynthesizerDemo.startTime = System.currentTimeMillis();
            // Wait until the speech synthesis is completed.
            synthesizer.waitForComplete();
            logger.info("tts stop latency " + (System.currentTimeMillis() - start) + " ms");
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            // Close the connection.
            if (null != synthesizer) {
                synthesizer.close();
            }
        }
    }

    public void shutdown() {
        client.shutdown();
    }

    public static void main(String[] args) throws Exception {
        String appKey = "your appkey";
        String token = "your token";
        if (args.length == 2) {
            appKey = args[0];
            token = args[1];
        } else {
            System.err.println("run error, need params: " + "<app-key> <token>");
            System.exit(-1);
        }
        SpeechSynthesizerDemo demo = new SpeechSynthesizerDemo(appKey, token);
        demo.process();
        demo.shutdown();
    }
}