How to store the audio recording from the Azure SpeechSynthesizer API as a blob and play it in Angular

We are currently implementing a Spring Boot API that calls the Azure text-to-speech SpeechSynthesizer API.

 public PronunciationBlob callAzureToTransformTextToSpeech(String text){
        // Create an Azure Speech resource (Speech Services) and get the key from there.
        byte[] chunks = null;
        String speechSubscriptionKey = "<your-speech-key>"; // key redacted
        String serviceRegion = "eastus";
        try {
            SpeechConfig config = SpeechConfig.fromSubscription(speechSubscriptionKey, serviceRegion);

            config.setSpeechSynthesisVoiceName("en-US-AriaNeural");
            SpeechSynthesizer synth = new SpeechSynthesizer(config);

            Future<SpeechSynthesisResult> task = synth.SpeakTextAsync(text);
            SpeechSynthesisResult result = task.get();

            if (result.getReason() == ResultReason.SynthesizingAudioCompleted) {
                System.out.println("Speech synthesized for text [" + text + "]");
                chunks = result.getAudioData();
            }
            else if (result.getReason() == ResultReason.Canceled) {
                SpeechSynthesisCancellationDetails cancellation = SpeechSynthesisCancellationDetails.fromResult(result);
                System.out.println("CANCELED: Reason=" + cancellation.getReason());

                if (cancellation.getReason() == CancellationReason.Error) {
                    System.out.println("CANCELED: ErrorCode=" + cancellation.getErrorCode());
                    System.out.println("CANCELED: ErrorDetails=" + cancellation.getErrorDetails());
                    System.out.println("CANCELED: Did you set the speech resource key and region values?");
                }
            }


        } catch (InterruptedException e) {
            e.printStackTrace();
            System.out.println("InterruptedException exception: " + e.getMessage());
        } catch (ExecutionException e) {
            e.printStackTrace();
            System.out.println("ExecutionException exception: " + e.getMessage());
        }
        PronunciationBlob pronunciationBlob = new PronunciationBlob();
        pronunciationBlob.setChunks(chunks);
        return pronunciationBlob;
    }

How do we convert the speech returned by SpeechSynthesizer to a blob, return the blob to Angular, and then play it back on the Angular end?

I see that SpeechSynthesisResult has a method, result.getAudioData(), which returns a byte array (byte[]), but I don't know how to go on from there.
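Since a JSON response can't carry raw bytes directly, one common pattern (an assumption about the design here, not something the code above does) is to Base64-encode the byte[] on the Java side, e.g. with java.util.Base64, and decode it in the browser. A minimal TypeScript sketch of the browser-side decode, where `base64ToBytes` is a hypothetical helper name:

```typescript
// Hypothetical helper: turn a Base64 string (as the server would send it)
// back into the raw audio bytes, ready to wrap in a Blob.
function base64ToBytes(base64: string): Uint8Array {
  const binary = atob(base64); // one character per byte
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes;
}

// Usage, assuming response.chunks holds the Base64 text:
// const blob = new Blob([base64ToBytes(response.chunks)], { type: 'audio/wav' });
```

Unlike a plain `new String(bytes)` round trip, Base64 is lossless for arbitrary binary data.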

This is the Angular code I am using:

search(){
    this.data = {name: this.name, email: "", uid: "", audioBlob: new FormData()}
    this.namePronunciationAPIService
    .pronounceEmployee(this.data)
    .subscribe((response: audioBlob) => {
      console.log(response.chunks);
      const blob = new Blob([response.chunks], {type: 'audio/webm'});
      const audioURL = URL.createObjectURL(blob);
      let audio = new Audio(audioURL)
      audio.play();
    });
  }


export interface audioBlob{
  chunks: string
}

I am getting the error:

DOMException: Failed to load because no supported source was found.

I don't know if I am producing the audio data in the correct format. I am trying to return the audioData as chunks and then play it back in Angular by using a blob.

Update:

I just found a solution that partially works:

search(){

    this.data = {name: this.name, email: "", uid: "", audioBlob: new FormData()}
    this.namePronunciationAPIService
    .pronounceEmployee(this.data)
    .subscribe((response: audioBlob) => {

      let wavString =  response.chunks;
      let len = wavString.length;
      let buf = new ArrayBuffer(len);
      let view = new Uint8Array(buf);
      for (var i = 0; i < len; i++) {
        view[i] = wavString.charCodeAt(i) & 0xff;
      }
      let blob = new Blob([view], {type: "audio/x-wav"});

      console.log(response.chunks);
      const audioURL = URL.createObjectURL(blob);
      let audio = new Audio(audioURL)
      audio.play();
    });
  }

 

public PronunciationBlob callAzureToTransformTextToSpeech(String text){
        // Create an Azure Speech resource (Speech Services) and get the key from there.
        byte[] chunks = null;
        String speechSubscriptionKey = "<your-speech-key>"; // key redacted
        String serviceRegion = "eastus";
        try {
            SpeechConfig config = SpeechConfig.fromSubscription(speechSubscriptionKey, serviceRegion);
            // Riff24Khz16BitMonoPcm is an assumed choice here; any RIFF/WAV format matches the "audio/x-wav" blob type below.
            config.setSpeechSynthesisOutputFormat(SpeechSynthesisOutputFormat.Riff24Khz16BitMonoPcm);
            config.setSpeechSynthesisVoiceName("en-US-AriaNeural");
            SpeechSynthesizer synth = new SpeechSynthesizer(config);

            Future<SpeechSynthesisResult> task = synth.SpeakTextAsync(text);
            SpeechSynthesisResult result = task.get();

            if (result.getReason() == ResultReason.SynthesizingAudioCompleted) {
                System.out.println("Speech synthesized for text [" + text + "]");
                chunks = result.getAudioData();
            }
            else if (result.getReason() == ResultReason.Canceled) {
                SpeechSynthesisCancellationDetails cancellation = SpeechSynthesisCancellationDetails.fromResult(result);
                System.out.println("CANCELED: Reason=" + cancellation.getReason());

                if (cancellation.getReason() == CancellationReason.Error) {
                    System.out.println("CANCELED: ErrorCode=" + cancellation.getErrorCode());
                    System.out.println("CANCELED: ErrorDetails=" + cancellation.getErrorDetails());
                    System.out.println("CANCELED: Did you set the speech resource key and region values?");
                }
            }


        } catch (InterruptedException e) {
            e.printStackTrace();
            System.out.println("InterruptedException exception: " + e.getMessage());
        } catch (ExecutionException e) {
            e.printStackTrace();
            System.out.println("ExecutionException exception: " + e.getMessage());
        }
        PronunciationBlob pronunciationBlob = new PronunciationBlob();
        // Note: decoding raw audio bytes into a String is lossy; bytes that are not valid text get replaced.
        String chunkString = new String(chunks);
        pronunciationBlob.setChunks(chunkString);
        return pronunciationBlob;
}

public class PronunciationBlob {
    public String getChunks() {
        return chunks;
    }

    public void setChunks(String chunks) {
        this.chunks = chunks;
    }

    String chunks;
}

The problem is that the audio sounds muffled.
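A likely cause, though this is my assumption rather than something verified against the exact payload: `new String(chunks)` decodes the WAV bytes with a text charset, and every byte sequence that is not valid text is replaced, so the `charCodeAt(i) & 0xff` loop on the Angular side can never recover the original samples. A small TypeScript sketch of that lossy round trip:

```typescript
// Simulate the Java side: decoding arbitrary bytes as UTF-8 text replaces
// invalid sequences with U+FFFD, destroying high bytes in the PCM data.
const original = new Uint8Array([0x41, 0xf0, 0x9f, 0x99]); // 'A' plus non-text bytes
const asText = new TextDecoder('utf-8').decode(original);  // lossy decode

// Simulate the Angular side: rebuild bytes with charCodeAt & 0xff.
const rebuilt = new Uint8Array(asText.length);
for (let i = 0; i < asText.length; i++) {
  rebuilt[i] = asText.charCodeAt(i) & 0xff;
}

// rebuilt no longer matches original: that mismatch is the "muffled" audio.
```

Sending the bytes as Base64 text, or returning them as a binary HTTP response, would avoid the lossy decode entirely.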



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
