Agentic podcast generation, end-to-end
Turn raw ideas into a polished, listenable episode.
Resound orchestrates an agentic pipeline to plan a show, write a script, and (optionally) generate speech with OpenAI TTS — then saves everything to output/.
Agentic scripting
The system plans structure, generates sections, and iterates to produce a coherent script — not just a single “dump” response.
Configurable voices
Speakers are defined in config/config_openai.json (voice model + voice id).
Artifacts included
Every run saves content, section JSON, and final audio (when enabled) inside output/<episode>/.
How it works
- 1) Input: You provide an episode name + text (or a topic). Resound treats it as source material.
- 2) Agentic planning: Agents analyze the content, choose a structure, and produce a multi-segment script.
- 3) Script → Audio (optional): With TTS enabled, each segment is voiced using OpenAI and stitched into a single episode file.
- 4) Output: Results land in output/<episode>/ for inspection, iteration, and playback.
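The four steps above can be sketched roughly as follows. This is an illustration only, not Resound's actual code: the function names, the Segment fields, and the file names content.txt, sections.json, and episode.mp3 are all assumptions, and the planning and scripting agents are replaced by fixed placeholders so the sketch runs without an API key.

```python
import json
from dataclasses import asdict, dataclass
from pathlib import Path
from typing import Callable, List, Optional


@dataclass
class Segment:
    title: str
    speaker: str
    text: str


def plan_sections(source_text: str) -> List[str]:
    # Placeholder for the planning agents: always returns a fixed three-part format.
    return ["Intro", "Discussion", "Outro"]


def write_segment(title: str, source_text: str) -> Segment:
    # Placeholder for the scripting agents: a real run would call an LLM here.
    return Segment(title=title, speaker="host", text=f"[{title}] {source_text[:80]}")


def run_episode(
    episode: str,
    source_text: str,
    out_root: Path,
    synthesize: Optional[Callable[[Segment], bytes]] = None,
) -> Path:
    out_dir = out_root / episode
    out_dir.mkdir(parents=True, exist_ok=True)

    # 1) Input: keep the source material next to the results.
    (out_dir / "content.txt").write_text(source_text, encoding="utf-8")

    # 2) Agentic planning: pick a structure, then write one segment per section.
    segments = [write_segment(t, source_text) for t in plan_sections(source_text)]
    sections_json = json.dumps([asdict(s) for s in segments], indent=2)
    (out_dir / "sections.json").write_text(sections_json, encoding="utf-8")

    # 3) Script -> audio (optional): only runs when a TTS callable is supplied.
    #    Naive byte concatenation stands in for real audio stitching.
    if synthesize is not None:
        audio = b"".join(synthesize(s) for s in segments)
        (out_dir / "episode.mp3").write_bytes(audio)

    # 4) Output: everything for this run lives under out_root/<episode>/.
    return out_dir
```

Calling run_episode("demo", "notes...", Path("output")) without a synthesize argument writes only the text artifacts. Passing a callable that returns audio bytes per segment also produces the hypothetical episode.mp3.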
Why this is different
Instead of a one-shot chat response, Resound is an orchestration layer: it enforces a show format, manages shared context across steps, and persists artifacts so you can debug, refine, and regenerate.
Fast mode
Generate script only (cheap + quick) and review before spending on TTS.
Full mode
Generate audio and get a playable episode you can share immediately.
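In full mode, each segment is sent to OpenAI TTS using the voice configured for its speaker. The sketch below shows one way that lookup could work. The JSON shape under "speakers" is a guess, since the real config/config_openai.json is not shown here, and load_speakers, tts_request, and voice_segment are hypothetical helper names. The only real API used is client.audio.speech.create from the official openai Python package, with the model name "tts-1" and the voices "alloy" and "nova" as example values.

```python
import json
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Voice:
    model: str  # TTS model name
    voice: str  # voice id


# Hypothetical config shape; the real config/config_openai.json may differ.
EXAMPLE_CONFIG = """
{
  "speakers": {
    "host":  {"model": "tts-1", "voice": "alloy"},
    "guest": {"model": "tts-1", "voice": "nova"}
  }
}
"""


def load_speakers(config_text: str) -> Dict[str, Voice]:
    # Map each speaker name to its configured model and voice id.
    raw = json.loads(config_text)["speakers"]
    return {name: Voice(model=v["model"], voice=v["voice"]) for name, v in raw.items()}


def tts_request(speaker: str, text: str, speakers: Dict[str, Voice]) -> Dict[str, str]:
    # Build the keyword arguments for one TTS call; fail early on unknown speakers.
    if speaker not in speakers:
        raise KeyError(f"no voice configured for speaker {speaker!r}")
    v = speakers[speaker]
    return {"model": v.model, "voice": v.voice, "input": text}


def voice_segment(client, speaker: str, text: str, speakers: Dict[str, Voice]) -> bytes:
    # `client` is assumed to be an openai.OpenAI() instance (reads OPENAI_API_KEY).
    response = client.audio.speech.create(**tts_request(speaker, text, speakers))
    return response.read()  # raw audio bytes for this segment
```

Fast mode would simply never call voice_segment, so no TTS cost is incurred until the script has been reviewed.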
Quick setup
- Install Python deps: pip install -e .
- Set OPENAI_API_KEY (shell or .env).
- Start this UI: cd web && npm install && npm run dev
Note: this demo runs the CLI on the server, so it’s intended for local use.
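Before a run, it can help to confirm that OPENAI_API_KEY is actually visible. The helper below is a small sketch of such a check, not part of Resound: find_api_key and read_dotenv are hypothetical names, the .env parser handles only simple KEY=value lines, and the precedence (shell variable first, then .env) is an assumption.

```python
import os
from pathlib import Path
from typing import Dict, Mapping, Optional


def read_dotenv(path: Path) -> Dict[str, str]:
    # Minimal .env reader: KEY=value per line, '#' comments, optional quotes.
    values: Dict[str, str] = {}
    if not path.is_file():
        return values
    for line in path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def find_api_key(
    env: Mapping[str, str] = os.environ,
    dotenv: Path = Path(".env"),
) -> Optional[str]:
    # Assumed precedence: the shell environment wins over the .env file.
    return env.get("OPENAI_API_KEY") or read_dotenv(dotenv).get("OPENAI_API_KEY")
```

If find_api_key() returns None, neither source provides a key, and a run with TTS enabled would be expected to fail at the first OpenAI call.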