Using Real-Time Speech Recognition

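Prerequisites

Make sure the environment has been set up as described in Configuring the CPP Environment (Linux).
See SDK (websocket) to obtain the latest version of the SDK package.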

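Initializing the Client

Initialize RasrClient. Its parameters include AuthInfo, described in Table 1.

Table 1 AuthInfo
Parameter	Mandatory	Type	Description
ak	Yes	String	The user's access key (AK). See AK/SK Authentication.
sk	Yes	String	The user's secret key (SK). See AK/SK Authentication.
projectId	Yes	String	Project ID, which corresponds one-to-one with the region. See Obtaining a Project ID.
region	Yes	String	Region, for example cn-north-4. See Endpoints.
endpoint	No	String	Endpoint. See Regions and Endpoints. The default value is usually sufficient.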
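
As an illustration of how the AuthInfo fields might be gathered, the sketch below reads a value from an environment variable and derives an endpoint from the region, following the sis-ext.${region}.myhuaweicloud.com pattern mentioned in the sample code. The helper names EnvOr and DefaultEndpoint are assumptions made for this example; they are not part of the SDK.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical helper (not part of the SDK): return the value of an
// environment variable, or the fallback if the variable is not set.
std::string EnvOr(const char *name, const std::string &fallback)
{
    const char *value = std::getenv(name);
    return value != nullptr ? std::string(value) : fallback;
}

// Hypothetical helper (not part of the SDK): build an endpoint from a region
// using the pattern sis-ext.${region}.myhuaweicloud.com.
std::string DefaultEndpoint(const std::string &region)
{
    assert(!region.empty()); // a region is mandatory in AuthInfo
    return "sis-ext." + region + ".myhuaweicloud.com";
}
```

For example, EnvOr("HUAWEICLOUD_SDK_AK", "") and EnvOr("HUAWEICLOUD_SDK_SK", "") could supply the ak and sk arguments of AuthInfo, and DefaultEndpoint(region) the endpoint argument. Those environment variable names are also assumptions; use whatever your deployment defines.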

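Request Parameters

The request class is RasrRequest. See Table 2 for details.

Table 2 RasrRequest
Parameter	Mandatory	Type	Description
audioFormat	Yes	String	Audio format. pcm and other formats are supported, for example pcm8k16bit. See the section on starting recognition in the API Reference.
property	Yes	String	Property string in the form language_sampleRate_domain, for example chinese_8k_common. See the section on starting recognition in the API Reference.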
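
The property string packs three fields into one value. The sketch below shows one way to assemble and split such a string, assuming it always has exactly three underscore-separated fields as in chinese_8k_common. The Property struct and both functions are hypothetical helpers for illustration, not SDK types, and the set of valid field values is defined by the API Reference rather than by this code.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical holder for the three parts of a property string.
struct Property {
    std::string language;   // e.g. "chinese"
    std::string sampleRate; // e.g. "8k" or "16k"
    std::string domain;     // e.g. "common" or "general"
};

// Join the three fields as language_sampleRate_domain.
std::string BuildProperty(const Property &p)
{
    assert(!p.language.empty() && !p.sampleRate.empty() && !p.domain.empty());
    return p.language + "_" + p.sampleRate + "_" + p.domain;
}

// Split a property string; returns false unless it has three non-empty fields.
bool ParseProperty(const std::string &text, Property &out)
{
    std::vector<std::string> parts;
    std::stringstream stream(text);
    std::string part;
    while (std::getline(stream, part, '_')) {
        parts.push_back(part);
    }
    if (parts.size() != 3) {
        return false;
    }
    for (const std::string &field : parts) {
        if (field.empty()) {
            return false;
        }
    }
    out = Property{parts[0], parts[1], parts[2]};
    return true;
}
```

Checking the string locally only catches malformed input early; the service still decides whether a given combination is supported.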

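Specific parameters can be set through the set methods. See Table 3 for details.

Table 3 RasrRequest setter parameters
Method	Mandatory	Parameter Type	Description
SetPunc	No	String	Whether to add punctuation to the recognition result. Valid values: yes, no. Default: no.
SetDigitNorm	No	String	Whether to render spoken numbers as Arabic numerals. Valid values: yes, no. Default: yes.
SetVadHead	No	Integer	Maximum leading silence, in ms. Range: [0, 60000]. Default: 10000.
SetVadTail	No	Integer	Maximum trailing silence, in ms. Range: [0, 3000]. Default: 500.
SetMaxSeconds	No	Integer	Maximum audio duration, in seconds. Range: [1, 60]. Default: 30.
SetIntermediateResult	No	String	Whether to return intermediate results. Valid values: yes, no. Default: no.
SetVocabularyId	No	String	Hot word table ID. Leave it unset if you have none.
SetNeedWordInfo	No	String	Whether to output word segmentation information in the recognition result. Valid values: yes, no. Default: no.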
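
Because the numeric setters have documented ranges, it can help to check values before passing them to the request. The following is a minimal sketch of such a check using the ranges and defaults from Table 3. VadOptions, InRange, and Validate are hypothetical names introduced here, not SDK types, and whether the SDK itself rejects out-of-range values is not stated in this document.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical container for the numeric options, initialized to the defaults.
struct VadOptions {
    int vadHeadMs = 10000; // SetVadHead, allowed range [0, 60000]
    int vadTailMs = 500;   // SetVadTail, allowed range [0, 3000]
    int maxSeconds = 30;   // SetMaxSeconds, allowed range [1, 60]
};

// Inclusive range check.
bool InRange(int value, int low, int high)
{
    assert(low <= high);
    return value >= low && value <= high;
}

// Return one message per out-of-range field; an empty result means all are valid.
std::vector<std::string> Validate(const VadOptions &options)
{
    std::vector<std::string> errors;
    if (!InRange(options.vadHeadMs, 0, 60000)) {
        errors.push_back("vadHead must be in [0, 60000]");
    }
    if (!InRange(options.vadTailMs, 0, 3000)) {
        errors.push_back("vadTail must be in [0, 3000]");
    }
    if (!InRange(options.maxSeconds, 1, 60)) {
        errors.push_back("maxSeconds must be in [1, 60]");
    }
    return errors;
}
```

If Validate returns no errors, the three values can be handed to SetVadHead, SetVadTail, and SetMaxSeconds respectively.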

Sample Code

The following example is for reference only. For the latest code, go to the SDK (websocket) section to download and run it.

/*
 * Copyright (c) Huawei Technologies Co., Ltd. 2020-2020. All rights reserved.
 */

#include "Utils.h"
#include "RasrClient.h"
#include "gflags/gflags.h"

// auth info
// refer to https://support.huaweicloud.com/api-sis/sis_03_0051.html
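// Hard-coding the AK and SK used for authentication, or storing them in plaintext, is a serious security risk.
// Store them encrypted in a configuration file or environment variables and decrypt them when used.
DEFINE_string(ak, "", "access key");
DEFINE_string(sk, "", "secret key");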

// region, for example cn-east-3, cn-north-4
DEFINE_string(region, "cn-east-3", "project region, such as cn-east-3");
// projectId, refer to https://support.huaweicloud.com/api-sis/sis_03_0008.html
DEFINE_string(projectId, "", "project id");
// endpoint, relevant to region, sis-ext.${region}.myhuaweicloud.com
DEFINE_string(endpoint, "", "service endpoint");

DEFINE_string(audioFormat, "pcm16k16bit", "such as pcm16k16bit, alaw16k16bit, etc.");
DEFINE_string(property, "chinese_16k_general", "language_sampleRate_domain, such as chinese_16k_general");
DEFINE_string(audioPath, "xx.wav", "audio path");

DEFINE_int32(chunkSize, 3000, "bytes per send");
DEFINE_int32(sampleRate, 16000, "sample rate of audio");
DEFINE_int32(readTimeOut, 20000, "read time out, default 20s");
DEFINE_int32(connectTimeOut, 20000, "connecting time out, default 20s");
DEFINE_int32(bytesPerSecond, 32000, "32000 bytes per second");

void OnOpen()
{
    LOG(INFO) << "now rasr Connect success";
}

void OnStart(std::string text)
{
    LOG(INFO) << "now rasr receive start response: " << text;
}

void OnResp(std::string text)
{
    // text is UTF-8 encoded and may contain Chinese characters; if your terminal is not UTF-8,
    // convert the encoding before printing, otherwise the output may appear garbled
    LOG(INFO) << "rasr receive " << text;
}

void OnEnd(std::string text)
{
    LOG(INFO) << "now rasr receive end response: " << text;
}

void OnClose()
{
    LOG(INFO) << "now rasr receive Close";
}

void OnError(std::string text)
{
    LOG(ERROR) << "now rasr receive error: " << text;
}

void OnEvent(std::string text)
{
    LOG(INFO) << "now rasr receive event: " << text;
}

void RasrTest(const std::string filePath)
{
    const int sleepTime = FLAGS_bytesPerSecond / FLAGS_chunkSize;

    speech::huawei_asr::AuthInfo authInfo(FLAGS_ak, FLAGS_sk, FLAGS_region, FLAGS_projectId, FLAGS_endpoint);
    // config Connect parameter
    speech::huawei_asr::HttpConfig httpConfig;
    httpConfig.SetReadTimeout(FLAGS_readTimeOut);
    httpConfig.SetConnectTimeout(FLAGS_connectTimeOut);

    // config callbacks; each callback is optional, and if one is not set, the default in RasrListener is used
    speech::huawei_asr::WebsocketService::ptr websocketServicePtr =
        websocketpp::lib::make_shared<speech::huawei_asr::WebsocketService>();
    websocketServicePtr->SetOnConnectFunc(OnOpen); // Connect success callback
    websocketServicePtr->SetOnStartFunc(OnStart);  // receive start response callback
    websocketServicePtr->SetOnRespFunc(OnResp);    // receive transcribe result callback
    websocketServicePtr->SetOnEndFunc(OnEnd);      // receive end response callback
    websocketServicePtr->SetOnCloseFunc(OnClose);  // Close callback
    websocketServicePtr->SetOnEventFunc(OnEvent);  // receive event callback
    websocketServicePtr->SetOnErrorFunc(OnError);  // receive error callback


    // step1 create client
    std::shared_ptr<speech::huawei_asr::RasrClient> rasrClient =
        std::make_shared<speech::huawei_asr::RasrClient>(authInfo, websocketServicePtr, httpConfig);

    // step2 connect; select exactly one mode. The following uses continuous stream mode.
    rasrClient->ContinueStreamConnect();
    // short stream connect
    // rasrClient->ShortStreamConnect();
    // sentence stream connect
    // rasrClient->SentenceStreamConnect();

    // step3 construct request params
    speech::huawei_asr::RasrRequest request(FLAGS_audioFormat, FLAGS_property);
    // set whether to add punctuation, yes or no, default no, optional operation.
    request.SetPunc("no");
    // set whether to transcribe numbers into Arabic numerals, yes or no, default yes, optional operation.
    request.SetDigitNorm("yes");
    // set vad head, max silent head, [0, 60000], default 10000, optional operation.
    request.SetVadHead(10000);
    // set vad tail, max silent tail, [0, 3000], default 500, optional operation.
    request.SetVadTail(500);
    // set max seconds of one sentence, [1, 60], default 30, optional operation.
    request.SetMaxSeconds(30);
    // set whether to return intermediate result, yes or no, default no. optional operation.
    request.SetIntermediateResult("no");
    // set whether to return word_info, yes or no, default no. optional operation.
    request.SetNeedWordInfo("no");
    // set vocabulary_id only if the vocabulary exists; otherwise an error is reported
    // request.SetVocabularyId("");

    // step4 send start
    rasrClient->SendStart(request);

    // step5 send audio
    std::string audioContent;
    int ret = speech::huawei_asr::ReadBinary(filePath, audioContent);
    if (ret != 0) {
        LOG(ERROR) << "RasrDemo running failed";
        rasrClient->Close();
        return;
    }
    unsigned char *buf = (unsigned char *)(audioContent.c_str());
    rasrClient->SendBinary(buf, audioContent.size(), FLAGS_chunkSize, sleepTime);

    // step6 send end
    rasrClient->SendEnd();

    // step7 close
    rasrClient->Close();
}

int main(int argc, char *argv[])
{
    FLAGS_alsologtostderr = true;
    FLAGS_log_dir = "./logs";
    gflags::ParseCommandLineFlags(&argc, &argv, true);
    google::InitGoogleLogging(argv[0]);
    RasrTest(FLAGS_audioPath);
    return 0;
}
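
The chunkSize and bytesPerSecond flags in the sample tie the size of each send to the playback time of the audio. The arithmetic behind that relationship can be sketched as follows, assuming uncompressed PCM. The three helper functions are hypothetical and not part of the SDK, and how SendBinary interprets its sleep argument is not specified in this document.

```cpp
#include <cassert>
#include <cstddef>

// Bytes of uncompressed PCM produced per second of audio.
int BytesPerSecond(int sampleRateHz, int bitsPerSample, int channels)
{
    return sampleRateHz * (bitsPerSample / 8) * channels;
}

// Playback time covered by one chunk, in whole milliseconds (rounded down).
int ChunkDurationMs(int chunkBytes, int bytesPerSecond)
{
    assert(bytesPerSecond > 0);
    return chunkBytes * 1000 / bytesPerSecond;
}

// Number of sends needed to deliver the whole buffer; the last may be partial.
std::size_t ChunkCount(std::size_t totalBytes, std::size_t chunkBytes)
{
    assert(chunkBytes > 0);
    return (totalBytes + chunkBytes - 1) / chunkBytes;
}
```

With the sample's defaults, mono 16 kHz 16-bit audio gives BytesPerSecond(16000, 16, 1) = 32000, matching the bytesPerSecond flag, and a 3000-byte chunk covers roughly 93 ms of audio.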

Build Script

The following build script is for reference only. You can customize RasrDemo.cpp to suit your actual business needs.

cd ${project_dir}
mkdir build && cd build
mkdir logs
cmake ..
make -j
./RasrDemo --audioPath=yourAudioPath --ak=yourAk --sk=yourSk --region=yourRegion --projectId=yourProjectId
