
Can someone help me configure the Voice Live SDK to receive animation blendshapes and viseme_id?

Dadong Hu 0 Reputation points
2026-02-25T17:09:02.8933333+00:00

I tried adding this, but no animation data is received.

  modalities: ["text", "audio", 'animation'],
  outputAudioTimestampYypes: ["word"],
  animation: {
    modelName: "default",
    outputs: ["viseme_id"]
  },
Azure AI Speech

An Azure service that integrates speech processing into apps and services.


2 answers

  1. Anshika Varshney 7,995 Reputation points Microsoft External Staff Moderator
    2026-02-26T15:32:17.4266667+00:00

    Hi Dadong Hu,

    I’ve seen this issue come up when trying to wire Voice Live SDK into a React application, especially for the first time. In most cases it’s not a single bug, but a combination of setup and environment gaps. Below are the steps that usually resolve it.

    Things to verify first

    1. SDK availability for React. Voice Live doesn't provide a full "drop-in" React SDK in the same way as some other Azure client libraries. For JavaScript/TypeScript, the official support is currently via preview SDKs and samples, and most working React apps use:
      • A WebSocket-based integration
      • A small Node.js proxy to handle authentication securely
      This is consistent with the official Voice Live SDK documentation and samples. [learn.microsoft.com], [github.com]
    2. Do not expose keys in the browser. If you are trying to connect directly from React using your Azure key, the connection will fail or behave unpredictably.
      • Browser apps should not call Voice Live directly with keys
      • Use a backend (Node/Express, Azure Functions, etc.) as a proxy that creates the Voice Live session
      This is a common root cause for connection and auth errors in React setups. [iloveagents.ai]
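    As a concrete sketch of the proxy idea, a backend could build the upstream connection along these lines. Note that the `/voice-live/realtime` path and `api-version` value below are assumptions for illustration; verify the exact endpoint shape against the current Voice Live documentation.

    ```javascript
    // Hypothetical backend helper: builds the upstream Voice Live WebSocket URL
    // and attaches the API key server-side, so the browser never sees the key.
    // The path and api-version here are assumptions -- check the docs.
    function buildVoiceLiveConnection(resourceName, apiKey, apiVersion = "2025-05-01-preview") {
      const url =
        `wss://${resourceName}.cognitiveservices.azure.com` +
        `/voice-live/realtime?api-version=${encodeURIComponent(apiVersion)}`;
      // The key goes into a header on the server side only; the React client
      // talks to this backend, never to Azure directly.
      return { url, headers: { "api-key": apiKey } };
    }
    ```

    The backend would open a WebSocket to `url` with those headers and relay messages to the browser over its own socket, so the key stays in an environment variable on the server.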

    Troubleshooting steps:

    1. Start from an official sample. Before integrating into React, validate your Azure resource and credentials by running one of the official samples (JavaScript/TypeScript or Node) from the Voice Live samples repository.
      • This confirms your resource, endpoint, and model are correct
      • It removes React/browser variables from the equation [github.com]
    2. Validate the endpoint and model. Double-check:
      • Endpoint format: https://<resource-name>.cognitiveservices.azure.com
      • The model name is supported by Voice Live (for example gpt-4o or supported realtime models)
      A mismatched endpoint or unsupported model is a very common failure point. [learn.microsoft.com]
    3. Add a backend proxy. If you're calling from React:
      • Create a small backend service to open the Voice Live WebSocket
      • Let React communicate only with your backend
      This pattern is explicitly recommended for browser-based Voice Live apps and avoids CORS and key-exposure issues.
    4. Check logs during session creation. Enable logging on both:
      • The backend proxy
      • The browser console
      Look for:
      • Authentication failures
      • WebSocket close codes
      • Model or modality errors (audio/text mismatch)
    5. Confirm the audio configuration. Make sure:
      • The input/output audio format matches what Voice Live expects (for example PCM16)
      • Microphone permissions are granted in the browser
      Audio format mismatches can cause silent failures even when the connection succeeds.
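    For the "WebSocket close codes" part of step 4, labelling the standard close codes makes logs much easier to read. The labels below come from the WebSocket specification (RFC 6455), not from Voice Live itself; the service may use additional codes, so treat this as a starting point.

    ```javascript
    // Map standard RFC 6455 WebSocket close codes to readable labels for logging.
    function describeCloseCode(code) {
      const known = {
        1000: "normal closure",
        1006: "abnormal closure (no close frame; often network or TLS issues)",
        1007: "invalid payload data",
        1008: "policy violation (check authentication and request headers)",
        1011: "server internal error",
      };
      return known[code] ?? `unrecognized close code ${code}`;
    }

    // In the browser or the backend proxy:
    //   ws.addEventListener("close", (e) => console.warn("closed:", describeCloseCode(e.code)));
    ```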

    After validating the setup with an official sample and moving the Voice Live connection behind a backend proxy, React integrations usually work as expected. This approach has resolved the issue in most similar cases I've seen.

    If the problem still persists, please share:

    • The exact error message
    • Where the failure happens (session creation vs. audio streaming)
    • Whether you're calling Voice Live directly or via a backend

    I hope this helps. Do let me know if you have any further queries.


    If this answers your query, please click Accept Answer and Yes for "Was this answer helpful".

    Thank you!


  2. Q&A Assist
    2026-02-25T17:09:11.0933333+00:00
    AI-generated content. This question contains an answer created with AI from Q&A Assist.

    To configure the Voice Live SDK to receive animation blendshapes and viseme IDs, ensure that your configuration is set up correctly. Based on the context provided, here's how you can do it:

    1. Modalities: Make sure you include animation in the modalities array, which you have done correctly:
         modalities: ["text", "audio", "animation"],
      
    2. Output Audio Timestamp Types: Ensure that you have the correct spelling for outputAudioTimestampTypes, as it seems there is a typo in your configuration (outputAudioTimestampYypes). It should be:
         outputAudioTimestampTypes: ["word"],
      
    3. Animation Configuration: Your animation configuration looks good, but ensure that the modelName is correctly set to a valid model that supports the outputs you want:
         animation: {
             modelName: "default",
             outputs: ["viseme_id"]
         },
      
    4. Check for Responses: After setting this up, you should receive viseme alignment in the response when the audio is generated. The service will return messages like response.animation_viseme.delta and response.animation_viseme.done when all viseme messages are returned.
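    The viseme messages named in step 4 can be consumed with a small dispatcher like the one below. The event shape (a `type` field plus a `viseme_id` payload on the delta events) is an assumption based on the message names above; verify the exact schema against the Voice Live reference before relying on it.

    ```javascript
    // Hypothetical consumer for the viseme events named above: accumulates
    // viseme IDs from response.animation_viseme.delta messages and reports
    // the full sequence when response.animation_viseme.done arrives.
    function createVisemeCollector(onDone) {
      const visemes = [];
      return function handleMessage(rawJson) {
        const event = JSON.parse(rawJson);
        switch (event.type) {
          case "response.animation_viseme.delta":
            visemes.push(event.viseme_id); // assumed payload field name
            break;
          case "response.animation_viseme.done":
            onDone([...visemes]);
            visemes.length = 0; // reset for the next response
            break;
          default:
            break; // ignore audio/text events here
        }
      };
    }
    ```

    Wire it to the session socket, e.g. `ws.addEventListener("message", (e) => handle(e.data))`, and drive your blendshape/viseme animation from the collected sequence.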

    If you are still not receiving animation data, double-check the service's documentation for any additional requirements or constraints regarding the animation outputs and ensure that your session is properly established with the Voice Live API.



