Recording WAV audio to IBM Watson Speech-To-Text


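I am trying to record audio and send it to IBM Watson Speech-To-Text immediately for transcription. I have tested Watson with a WAV file loaded from disk, and that works. Separately, I have tested recording from the microphone and saving the result to disk, and that works too.

However, when I record audio with NAudio WaveIn and stream it, Watson returns an empty result, as if there were no audio.

Can anyone shed some light on this, or offer any ideas?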

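using System;
using System.IO;
using System.Net;
using System.Net.WebSockets;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Json;
using System.Text;
using System.Threading;
using System.Threading.Tasks;
using NAudio.Wave;

// Shared with Stop(), Close() and WaveIn_DataAvailable, so it must be a field.
private ClientWebSocket ws;

private async void StartHere()
{
    ws = new ClientWebSocket();
    ws.Options.Credentials = new NetworkCredential("*****", "*****");

    await ws.ConnectAsync(new Uri("wss://stream.watsonplatform.net/speech-to-text/api/v1/recognize?model=en-US_NarrowbandModel"), CancellationToken.None);

    // Send the opening message and wait until the service reports "listening".
    // Awaiting (rather than Task.WaitAll) avoids blocking the calling thread.
    await Task.WhenAll(ws.SendAsync(openingMessage, WebSocketMessageType.Text, true, CancellationToken.None), HandleResults(ws));

    Record();
}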

public void Record()
{
    var waveIn = new WaveInEvent
    {
        BufferMilliseconds = 50,
        DeviceNumber       = 0,
        WaveFormat         = format
    };

    // Method-group subscription; the handlers take WaveInEventArgs and StoppedEventArgs.
    waveIn.DataAvailable    += WaveIn_DataAvailable;
    waveIn.RecordingStopped += WaveIn_RecordingStopped;
    waveIn.StartRecording();
}

public async Task Stop()
{
    await ws.SendAsync(closingMessage, WebSocketMessageType.Text, true, CancellationToken.None);
}

public void Close()
{
    ws.CloseAsync(WebSocketCloseStatus.NormalClosure, "Close", CancellationToken.None).Wait();
}

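private async void WaveIn_DataAvailable(object sender, WaveInEventArgs e)
{
    // Only the first e.BytesRecorded bytes of e.Buffer hold audio from this callback.
    await ws.SendAsync(new ArraySegment<byte>(e.Buffer, 0, e.BytesRecorded), WebSocketMessageType.Binary, true, CancellationToken.None);
}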

private async Task HandleResults(ClientWebSocket ws)
{
    var buffer = new byte[1024];

    while (true)
    {
        var segment = new ArraySegment<byte>(buffer);
        var result = await ws.ReceiveAsync(segment, CancellationToken.None);

        if (result.MessageType == WebSocketMessageType.Close)
        {
            return;
        }

        int count = result.Count;
        while (!result.EndOfMessage)
        {
            if (count >= buffer.Length)
            {
                await ws.CloseAsync(WebSocketCloseStatus.InvalidPayloadData, "That's too long", CancellationToken.None);
                return;
            }

            segment = new ArraySegment<byte>(buffer, count, buffer.Length - count);
            result = await ws.ReceiveAsync(segment, CancellationToken.None);
            count += result.Count;
        }

        var message = Encoding.UTF8.GetString(buffer, 0, count);

        // you'll probably want to parse the JSON into a useful object here,
        // see ServiceState and IsDelimeter for a light-weight example of that.
        Console.WriteLine(message);

        if (IsDelimeter(message))
        {
            return;
        }
    }
}

private bool IsDelimeter(String json)
{
    MemoryStream stream = new MemoryStream(Encoding.UTF8.GetBytes(json));
    DataContractJsonSerializer ser = new DataContractJsonSerializer(typeof(ServiceState));
    ServiceState obj = (ServiceState) ser.ReadObject(stream);

    return obj.state == "listening";
}

[DataContract]
internal class ServiceState
{
    [DataMember]
    public string state = "";
}
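
The snippet above uses `openingMessage`, `closingMessage`, and `format` without showing how they are defined. The following is a minimal sketch of what the two messages might look like, assuming raw 16-bit PCM is streamed and described to the service as `audio/l16`. The `WatsonMessages` class and its method names are hypothetical and not taken from the original post. The declared rate should match the rate of the recording format, since a mismatch between the two is one plausible cause of empty results.

```csharp
using System;
using System.Text;

// Hypothetical helper: builds the JSON text frames for the
// Watson Speech to Text WebSocket interface.
public static class WatsonMessages
{
    // "start" message declaring headerless 16-bit PCM at the given sample rate.
    public static ArraySegment<byte> Start(int sampleRate, int channels = 1)
    {
        string json =
            "{\"action\":\"start\"," +
            "\"content-type\":\"audio/l16;rate=" + sampleRate + ";channels=" + channels + "\"," +
            "\"interim_results\":true}";
        return new ArraySegment<byte>(Encoding.UTF8.GetBytes(json));
    }

    // "stop" message telling the service that no more audio will follow.
    public static ArraySegment<byte> Stop()
    {
        return new ArraySegment<byte>(Encoding.UTF8.GetBytes("{\"action\":\"stop\"}"));
    }
}
```

With this in place, `openingMessage` could be `WatsonMessages.Start(format.SampleRate, format.Channels)` and `closingMessage` could be `WatsonMessages.Stop()`.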


Edit: I also tried sending a WAV "header" before StartRecording, like this:

    waveIn.DataAvailable    += WaveIn_DataAvailable;
    waveIn.RecordingStopped += WaveIn_RecordingStopped;

    /* Send WAV "header" first */
    using (var stream = new MemoryStream())
    {
        using (var writer = new BinaryWriter(stream, Encoding.UTF8))
        {
            writer.Write(Encoding.UTF8.GetBytes("RIFF"));
            writer.Write(0); // placeholder
            writer.Write(Encoding.UTF8.GetBytes("WAVE"));
            writer.Write(Encoding.UTF8.GetBytes("fmt "));

            format.Serialize(writer);

            if (format.Encoding != WaveFormatEncoding.Pcm && format.BitsPerSample != 0)
            {
                writer.Write(Encoding.UTF8.GetBytes("fact"));
                writer.Write(4);
                writer.Write(0);
            }

            writer.Write(Encoding.UTF8.GetBytes("data"));
            writer.Write(0);
            writer.Flush();
        }

        byte[] header = stream.ToArray();

        await ws.SendAsync(new ArraySegment<byte>(header), WebSocketMessageType.Binary, true, CancellationToken.None);
    }
    /* End WAV header */

    waveIn.StartRecording();
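
For comparison, the canonical header for uncompressed PCM is 44 bytes long and can be written without NAudio. The sketch below is an illustration only: `WavHeader.Build` is a hypothetical helper, and the original post does not confirm whether sending a header is what the service needs. It assumes a streaming scenario where the total length is unknown, so both size fields default to 0 as in the attempt above; some decoders may treat a zero data size as an empty file.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper: writes a canonical 44-byte RIFF/WAVE header for PCM audio.
public static class WavHeader
{
    public static byte[] Build(int sampleRate, short bitsPerSample, short channels, int dataLength = 0)
    {
        short blockAlign = (short)(channels * bitsPerSample / 8); // bytes per sample frame
        int byteRate = sampleRate * blockAlign;                   // bytes per second

        using (var stream = new MemoryStream())
        using (var writer = new BinaryWriter(stream, Encoding.ASCII))
        {
            writer.Write(Encoding.ASCII.GetBytes("RIFF"));
            writer.Write(dataLength == 0 ? 0 : 36 + dataLength); // RIFF chunk size, 0 if unknown
            writer.Write(Encoding.ASCII.GetBytes("WAVE"));
            writer.Write(Encoding.ASCII.GetBytes("fmt "));
            writer.Write(16);             // fmt chunk size for plain PCM
            writer.Write((short)1);       // format tag 1 = PCM
            writer.Write(channels);
            writer.Write(sampleRate);
            writer.Write(byteRate);
            writer.Write(blockAlign);
            writer.Write(bitsPerSample);
            writer.Write(Encoding.ASCII.GetBytes("data"));
            writer.Write(dataLength);     // data chunk size, 0 if unknown
            writer.Flush();
            return stream.ToArray();
        }
    }
}
```

A call such as `WavHeader.Build(format.SampleRate, (short)format.BitsPerSample, (short)format.Channels)` would produce the bytes to send as the first binary frame.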

I found the solution after 20 hours of trial and error. I created a GitHub Gist, since it may be handy for others. See https://gist.github.com/kboek/20476c2a03b5e9188edebaace74f9a07
