Speech to Text Action Block

The Speech to Text Action Block uses OpenAI Whisper to transcribe audio files, or to translate them into English, within your workflows.

⚠️ Requires an OpenAI API key with access to the Whisper model.

Supported Actions

  • Transcribe Audio - Convert audio to text in the original language.
  • Translate Audio - Transcribe and translate audio into English.
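
As a rough illustration of how the two actions differ, the sketch below maps each action name to the corresponding public OpenAI audio endpoint. The endpoint URLs are OpenAI's documented ones; that this block calls them in exactly this way is an assumption, and `ACTION_ENDPOINTS` and `endpoint_for` are hypothetical names used only for this example.

```python
# Hypothetical mapping from the block's action names to OpenAI's audio endpoints.
# Assumption: the block forwards each action to the matching endpoint unchanged.
OPENAI_AUDIO_BASE = "https://api.openai.com/v1/audio"

ACTION_ENDPOINTS = {
    "transcribe audio": f"{OPENAI_AUDIO_BASE}/transcriptions",  # text in the original language
    "translate audio": f"{OPENAI_AUDIO_BASE}/translations",     # text in English
}


def endpoint_for(action: str) -> str:
    """Return the endpoint URL for an action name (case-insensitive)."""
    key = action.strip().lower()
    if key not in ACTION_ENDPOINTS:
        raise ValueError(f"Unsupported action: {action!r}")
    return ACTION_ENDPOINTS[key]
```

For example, `endpoint_for("Translate Audio")` returns the translations URL, while an unknown action name raises `ValueError`.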

Configuration

  • API Key - OpenAI API key.
  • Audio File - Audio file URL or uploaded file (supported formats: mp3, mp4, mpeg, mpga, m4a, wav, webm).
  • Language - Optional language hint for transcription (ISO 639-1 code).
  • Response Format - Output format: text, JSON, SRT, or VTT.
  • Temperature - Sampling temperature (0-1) that controls transcription variability; lower values give more deterministic output.
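
The constraints listed above can be checked before a request is sent. The following is a minimal sketch, not the block's actual implementation: `build_request_fields` is a hypothetical helper, the `whisper-1` model identifier is taken from OpenAI's public API and is assumed to be what the block uses, and the lowercase format names follow OpenAI's spelling.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Values taken from the Configuration list above.
SUPPORTED_EXTENSIONS = {"mp3", "mp4", "mpeg", "mpga", "m4a", "wav", "webm"}
RESPONSE_FORMATS = {"text", "json", "srt", "vtt"}


def build_request_fields(audio_file, language=None, response_format="text", temperature=0.0):
    """Validate block settings and return the non-file request fields (hypothetical helper)."""
    # Take the extension from the URL path (or plain file name), ignoring any query string.
    ext = PurePosixPath(urlparse(audio_file).path).suffix.lstrip(".").lower()
    if ext not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"Unsupported audio format: {ext or '(none)'}")

    fmt = response_format.lower()
    if fmt not in RESPONSE_FORMATS:
        raise ValueError(f"Unsupported response format: {response_format!r}")

    if not 0.0 <= temperature <= 1.0:
        raise ValueError("Temperature must be between 0 and 1")

    # Assumption: the block uses OpenAI's "whisper-1" model identifier.
    fields = {"model": "whisper-1", "response_format": fmt, "temperature": temperature}

    if language is not None:
        # ISO 639-1 codes are two ASCII letters, e.g. "en" or "de".
        if len(language) != 2 or not (language.isascii() and language.isalpha()):
            raise ValueError(f"Language must be an ISO 639-1 code, got {language!r}")
        fields["language"] = language.lower()

    return fields
```

Validating early like this surfaces a bad file type or an out-of-range temperature inside the workflow, before any API call is made with the key.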

Use Cases

  • Transcribe voice messages from a chatbot into text for downstream processing.
  • Build a multilingual support workflow that translates audio inputs to English.
  • Create meeting transcription workflows that convert recorded audio to searchable text.
Indite Documentation v1.6.0