Speech to Text Action Block

The Speech to Text Action Block uses OpenAI Whisper to transcribe audio files, or to translate them into English, from within your workflows.

⚠️ Requires an OpenAI API key with access to the Whisper model.

Supported Actions

Transcribe Audio - Convert audio to text in the original language.
Translate Audio - Transcribe and translate audio into English.
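
The two actions line up with the two audio endpoints of the OpenAI API. The sketch below assumes the block forwards each action to the matching endpoint; the action keys and the endpoint_for helper are hypothetical names used for illustration, not part of Indite.

```python
# Hypothetical mapping from the block's actions to OpenAI audio endpoints.
# Assumption: the block calls these endpoints; Indite does not document this.
OPENAI_AUDIO_BASE = "https://api.openai.com/v1/audio"

ENDPOINTS = {
    "transcribe": f"{OPENAI_AUDIO_BASE}/transcriptions",  # text in the original language
    "translate": f"{OPENAI_AUDIO_BASE}/translations",     # text translated into English
}


def endpoint_for(action: str) -> str:
    """Return the endpoint URL for an action name, ignoring case and padding."""
    key = action.strip().lower()
    if key not in ENDPOINTS:
        raise ValueError(f"Unsupported action: {action!r}")
    return ENDPOINTS[key]
```

Keeping the mapping in one dictionary means an unsupported action fails early with a clear error rather than producing a malformed request.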

Configuration

API Key - OpenAI API key.
Audio File - Audio file URL or uploaded file (supported formats: mp3, mp4, mpeg, mpga, m4a, wav, webm).
Language - Optional ISO 639-1 code (for example, en or fr) identifying the language spoken in the audio; supplying it can improve transcription accuracy and speed.
Response Format - Output format: text, JSON, SRT, or VTT.
Temperature - Sampling temperature between 0 and 1; lower values make the transcription more deterministic, higher values more varied.
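
The configuration fields above can be checked before a request is sent. The following is a minimal sketch, assuming the block passes these values through as form fields to the OpenAI audio API with the whisper-1 model; build_form_fields and its parameter names are hypothetical and do not reflect Indite's internals.

```python
import re
from pathlib import PurePosixPath
from urllib.parse import urlparse

SUPPORTED_EXTENSIONS = {"mp3", "mp4", "mpeg", "mpga", "m4a", "wav", "webm"}
RESPONSE_FORMATS = {"text", "json", "srt", "vtt"}


def build_form_fields(audio_file, language=None, response_format="text", temperature=0.0):
    """Validate the block's settings and return the non-file form fields."""
    # Take the extension from the path only, so URL query strings are ignored.
    ext = PurePosixPath(urlparse(audio_file).path).suffix.lstrip(".").lower()
    if ext not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"Unsupported audio format: {ext or 'none'}")

    fmt = response_format.lower()
    if fmt not in RESPONSE_FORMATS:
        raise ValueError(f"Unsupported response format: {response_format}")

    if not 0 <= temperature <= 1:
        raise ValueError("Temperature must be between 0 and 1")

    # Assumption: the block uses the whisper-1 model name.
    fields = {"model": "whisper-1", "response_format": fmt, "temperature": temperature}

    if language is not None:
        code = language.strip().lower()
        # Shape check only: ISO 639-1 codes are two letters.
        if not re.fullmatch(r"[a-z]{2}", code):
            raise ValueError(f"Expected a two-letter ISO 639-1 code, got {language!r}")
        fields["language"] = code
    return fields
```

The language check only verifies the two-letter shape of the code; it does not confirm that the code is one Whisper supports.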

Use Cases

Transcribe voice messages from a chatbot into text for downstream processing.
Build a multilingual support workflow that translates audio inputs to English.
Create meeting transcription workflows that convert recorded audio to searchable text.
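
For the meeting transcription use case, the SRT response format is convenient because each segment carries timestamps. The sketch below shows one possible downstream step, assuming the block's output is available as an SRT string; parse_srt, search, and the sample transcript are hypothetical illustrations.

```python
import re

# Matches an SRT timing line such as "00:00:04,000 --> 00:00:09,500".
SRT_TIMING = re.compile(r"(\d{2}:\d{2}:\d{2},\d{3}) --> (\d{2}:\d{2}:\d{2},\d{3})")


def parse_srt(srt_text):
    """Split SRT text into (start, end, text) tuples, one per subtitle block."""
    segments = []
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.splitlines()
        for i, line in enumerate(lines):
            match = SRT_TIMING.match(line.strip())
            if match:
                # Everything after the timing line is the spoken text.
                text = " ".join(part.strip() for part in lines[i + 1:])
                segments.append((match.group(1), match.group(2), text))
                break
    return segments


def search(segments, term):
    """Return (start, text) for every segment containing the term, ignoring case."""
    needle = term.lower()
    return [(start, text) for start, _end, text in segments if needle in text.lower()]


# Made-up sample transcript, for illustration only.
SAMPLE = """1
00:00:00,000 --> 00:00:04,000
Welcome to the weekly sync.

2
00:00:04,000 --> 00:00:09,500
The budget review moves to Friday.
"""

hits = search(parse_srt(SAMPLE), "budget")
```

Here hits holds the start time and text of each matching segment, which a later workflow step could store or index.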


Indite Documentation v1.6.0
