Hi GenixPRO,
Thanks for raising this question. Currently, ephemeral tokens are supported only via the Realtime API flow. Models such as gpt‑4o‑mini‑transcribe and gpt‑4o‑transcribe can generate ephemeral tokens only when used in a realtime session, not for standard (non‑realtime) transcription requests.
If your use case is batch or request‑response speech‑to‑text, ephemeral token generation isn’t available outside the Realtime API today. This is why you’re seeing references only for realtime scenarios and not for standalone transcription endpoints.
Given the pricing and architecture differences you mentioned, the recommended approach for non‑realtime transcription remains using the regular Speech / transcription APIs with standard authentication. If ephemeral tokens are required for non‑realtime transcription in the future, that would need to be a platform enhancement.
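As a rough illustration of what "standard authentication" can look like for a request‑response transcription call, here is a minimal sketch that builds (but does not send) a request against the Azure OpenAI audio transcriptions REST route using the `api-key` header. The function name `build_transcription_request`, the deployment name, and the `api_version` default are assumptions for illustration only; please check the values that apply to your own resource.

```python
import uuid
import urllib.request


def build_transcription_request(endpoint, deployment, api_key, audio_bytes,
                                filename="audio.wav",
                                api_version="2025-03-01-preview"):
    """Build (without sending) a non-realtime transcription request.

    Hypothetical helper: `deployment` and `api_version` are placeholders,
    not values confirmed in this thread.
    """
    url = (f"{endpoint.rstrip('/')}/openai/deployments/{deployment}"
           f"/audio/transcriptions?api-version={api_version}")

    # Hand-rolled multipart/form-data body with a single "file" part.
    boundary = uuid.uuid4().hex
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        (f'Content-Disposition: form-data; name="file"; '
         f'filename="{filename}"\r\n').encode(),
        b"Content-Type: application/octet-stream\r\n\r\n",
        audio_bytes,
        f"\r\n--{boundary}--\r\n".encode(),
    ])

    headers = {
        # Long-lived resource key: keep it on a trusted backend only.
        "api-key": api_key,
        "Content-Type": f"multipart/form-data; boundary={boundary}",
    }
    return urllib.request.Request(url, data=body, headers=headers,
                                  method="POST")
```

Because no ephemeral token exists for this flow, a call like this would normally run on your own backend (for example via `urllib.request.urlopen(req)`), with client apps sending audio to that backend rather than holding the key themselves.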
Please let me know if you have any remaining questions or need additional details. I’ll be glad to provide further clarification or guidance.
If this answers your query, please click Accept Answer and select Yes for "Was this answer helpful".
Thank you!