Speech to Text Action Block
The Speech to Text Action Block uses OpenAI Whisper to transcribe audio files, or translate them into English, within your workflows.
⚠️ Requires an OpenAI API key with access to the Whisper model.
Supported Actions
- Transcribe Audio - Convert audio to text in the original language.
- Translate Audio - Transcribe and translate audio into English.
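OpenAI exposes transcription and translation as two separate audio endpoints, so the two actions can be pictured as a simple lookup. The sketch below is illustrative only: the dictionary and the endpoint_for helper are hypothetical names, and it assumes the block calls these public endpoints directly, which this page does not state.

```python
# Hypothetical mapping from the block's action names to OpenAI audio endpoints.
# Assumption: the block calls the public OpenAI audio API directly.
OPENAI_AUDIO_BASE = "https://api.openai.com/v1/audio"

ACTION_ENDPOINTS = {
    "Transcribe Audio": f"{OPENAI_AUDIO_BASE}/transcriptions",  # text in the original language
    "Translate Audio": f"{OPENAI_AUDIO_BASE}/translations",     # text in English
}


def endpoint_for(action: str) -> str:
    """Return the endpoint URL for a supported action name."""
    try:
        return ACTION_ENDPOINTS[action]
    except KeyError:
        raise ValueError(f"Unsupported action: {action!r}") from None
```

Keeping the mapping in one place means an unsupported action name fails early with a clear error instead of producing a malformed request.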
Configuration
- API Key - OpenAI API key.
- Audio File - Audio file URL or uploaded file (supported formats: mp3, mp4, mpeg, mpga, m4a, wav, webm).
- Language - Optional language hint for transcription, given as an ISO 639-1 code (for example, en or fr).
- Response Format - Output format: text, JSON, SRT, or VTT.
- Temperature - Sampling temperature between 0 and 1. Lower values make the transcription more deterministic; higher values allow more variation.
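To make the constraints above concrete, here is a minimal sketch that checks a configuration before any request is sent. It is not the block's actual implementation. The function name build_request_fields, the default values, and the use of the whisper-1 model name are assumptions made for illustration. The API key is left out on purpose, since it belongs in the request header rather than the form fields.

```python
from pathlib import PurePosixPath
from typing import Dict, Optional, Union
from urllib.parse import urlparse

# Formats listed in the Audio File setting.
SUPPORTED_EXTENSIONS = {"mp3", "mp4", "mpeg", "mpga", "m4a", "wav", "webm"}

# Block option label -> lowercase value used in the request (assumed mapping).
RESPONSE_FORMATS = {"text": "text", "JSON": "json", "SRT": "srt", "VTT": "vtt"}


def build_request_fields(
    audio_file: str,
    language: Optional[str] = None,
    response_format: str = "text",   # assumed default
    temperature: float = 0.0,        # assumed default
) -> Dict[str, Union[str, float]]:
    """Validate the block settings and return request form fields."""
    # Read the extension from the path part so URLs with query strings work.
    path = urlparse(audio_file).path
    extension = PurePosixPath(path).suffix.lstrip(".").lower()
    if extension not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"Unsupported audio format: {extension or 'none'}")

    if response_format not in RESPONSE_FORMATS:
        raise ValueError(f"Unsupported response format: {response_format!r}")

    if not 0.0 <= temperature <= 1.0:
        raise ValueError("Temperature must be between 0 and 1")

    fields: Dict[str, Union[str, float]] = {
        "model": "whisper-1",  # assumption: the hosted Whisper model name
        "response_format": RESPONSE_FORMATS[response_format],
        "temperature": temperature,
    }

    # The language hint is optional; ISO 639-1 codes are two ASCII letters.
    if language is not None:
        if len(language) != 2 or not (language.isascii() and language.isalpha()):
            raise ValueError(f"Expected an ISO 639-1 code, got {language!r}")
        fields["language"] = language.lower()

    return fields
```

For example, calling build_request_fields("https://example.com/a/voice.m4a?token=1", language="EN", response_format="SRT", temperature=0.2) returns fields with the format lowercased to srt and the language normalized to en, while a file such as notes.txt is rejected before any API call is made.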
Use Cases
- Transcribe voice messages from a chatbot into text for downstream processing.
- Build a multilingual support workflow that translates audio inputs to English.
- Create meeting transcription workflows that convert recorded audio to searchable text.
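For the meeting transcription case, choosing the SRT response format keeps timestamps next to the text, which makes keyword search straightforward. The following is a rough sketch of a downstream step, not part of the block itself. The search_srt function and the sample transcript are invented for illustration.

```python
import re
from typing import List, Tuple


def search_srt(srt_text: str, keyword: str) -> List[Tuple[str, str]]:
    """Return (start_time, text) for each SRT cue that contains the keyword."""
    hits = []
    normalized = srt_text.replace("\r\n", "\n").strip()
    # SRT cues are separated by blank lines.
    for cue in re.split(r"\n\s*\n", normalized):
        lines = cue.splitlines()
        # A cue has an index line, a timing line, and one or more text lines.
        if len(lines) < 3 or "-->" not in lines[1]:
            continue
        start = lines[1].split("-->")[0].strip()
        text = " ".join(line.strip() for line in lines[2:])
        if keyword.lower() in text.lower():
            hits.append((start, text))
    return hits


# Invented sample transcript, used only to demonstrate the function.
sample = (
    "1\n00:00:00,000 --> 00:00:02,500\nWelcome to the weekly sync.\n\n"
    "2\n00:00:02,500 --> 00:00:05,000\nFirst item is the budget review.\n"
)
matches = search_srt(sample, "budget")
```

Each match pairs the cue's start time with its text, so a workflow can link a search hit back to the right moment in the recording.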