Frequently asked
Is ViralMint's AI caption generator actually free?
Yes. The Captions tool runs entirely in the ViralMint desktop app and uses local Whisper transcription plus FFmpeg for the caption burn — both free, no API key, no per-minute cost. Download the desktop app, drag a video into the Captions tool, and render. There's no monthly cap, no watermark on the output, and no upgrade prompt.
What caption styles does ViralMint support?
Three viral-tested presets out of the box: viral (yellow word-by-word highlight, Montserrat Bold 56pt, 3 words at a time), classic (full-sentence, Arial 42pt, bottom of frame, no per-word highlight), and bold (green word-by-word highlight, Impact 64pt, 2 words at a time). All three render via FFmpeg using the ASS subtitle format, which timestamps each caption event in hundredths of a second, so word highlights stay tightly synced to the audio.
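As a rough illustration of how a word-by-word highlight preset could map onto ASS events, the sketch below groups timed words into chunks of three and emits one Dialogue line per spoken word, recolouring the active word with an inline override tag. The function names, the `Default` style name and the sample timings are assumptions made for this example, not ViralMint's actual generator.

```python
from typing import List, Tuple

Word = Tuple[str, float, float]  # (text, start_seconds, end_seconds)


def ass_time(seconds: float) -> str:
    """Format seconds as an ASS timestamp, H:MM:SS.cc (centiseconds)."""
    cs = int(round(seconds * 100))
    h, rem = divmod(cs, 360000)
    m, rem = divmod(rem, 6000)
    s, c = divmod(rem, 100)
    return f"{h}:{m:02d}:{s:02d}.{c:02d}"


def highlight_events(words: List[Word], chunk_size: int = 3,
                     highlight: str = "&H00FFFF&") -> List[str]:
    """Emit one Dialogue line per word, with the active word recoloured.

    ASS inline colours are written in BGR order, so &H00FFFF& is yellow.
    """
    events = []
    for i in range(0, len(words), chunk_size):
        chunk = words[i:i + chunk_size]
        for active, (_, start, end) in enumerate(chunk):
            parts = []
            for j, (text, _, _) in enumerate(chunk):
                if j == active:
                    # {\c...} sets the primary colour, {\r} resets to the style
                    parts.append(f"{{\\c{highlight}}}{text}{{\\r}}")
                else:
                    parts.append(text)
            events.append(
                f"Dialogue: 0,{ass_time(start)},{ass_time(end)},"
                f"Default,,0,0,0,,{' '.join(parts)}"
            )
    return events


if __name__ == "__main__":
    sample = [("never", 0.0, 0.4), ("skip", 0.4, 0.8),
              ("captions", 0.8, 1.5), ("again", 1.5, 2.0)]
    for line in highlight_events(sample):
        print(line)
```

Each event here lasts only as long as its word, so silent gaps between words would briefly show no caption. A production generator would probably extend each event to the start of the next word.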
How accurate is the transcription?
ViralMint uses faster-whisper running locally with int8 CPU quantization. On clean studio audio, word error rate is under 5% in English; for accented or noisy audio you may see occasional substitutions. Because the captions are word-level timed, you can edit any word in the generated subtitle file before re-rendering if you spot a mistake.
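For context, word error rate is conventionally computed as substitutions plus deletions plus insertions, divided by the number of words in the reference transcript. The sketch below is a generic word-level edit distance written for illustration, with a hypothetical function name. It is not ViralMint's evaluation code, and a real measurement would usually normalise casing and punctuation first.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    print(word_error_rate("the quick brown fox", "the quick brown box"))
```

One wrong word in a four-word reference gives 0.25, so a rate under 5% means fewer than one error per twenty reference words.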
Can I caption a long podcast or webinar?
Yes, there's no length cap. Processing time scales roughly linearly with clip length: a 5-minute clip takes about 30 seconds, and a 60-minute podcast about 6–8 minutes on a typical laptop CPU. Because processing is local, you can caption hours of content overnight without metered cloud costs.
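The figures above work out to roughly 7.5 to 10 times faster than real time. A back-of-the-envelope estimator under that assumption (the function name and default factor are illustrative, and actual speed depends on the CPU and the model size):

```python
def estimate_processing_seconds(clip_seconds: float,
                                realtime_factor: float = 10.0) -> float:
    """Estimate captioning time, assuming a constant speed-up over real time."""
    if realtime_factor <= 0:
        raise ValueError("realtime_factor must be positive")
    return clip_seconds / realtime_factor


if __name__ == "__main__":
    print(estimate_processing_seconds(5 * 60))        # 5-minute clip at 10x
    print(estimate_processing_seconds(60 * 60, 7.5))  # 60-minute clip at 7.5x
```

At 10x a 5-minute clip comes out at 30 seconds, and at 7.5x a 60-minute recording comes out at 480 seconds (8 minutes), matching the quoted range.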
Does ViralMint also generate the script and voiceover?
Yes, those are separate tools in the same app. The full pipeline can scout a trending video idea, generate a script, generate the voiceover (Edge TTS or the paid gpt-4o-mini-tts), assemble the video against Pexels stock footage, then run the captioning step described here. You can use any single piece on its own, or chain them.
What languages does the captioning support?
Whisper supports roughly 100 languages out of the box. The built-in caption presets currently render Latin scripts cleanly; CJK and RTL scripts work but may need a font swap. Caption position, font, and color are configurable per preset if you want to tune for a non-English audience.
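To make the font-swap idea concrete, here is a minimal sketch that builds an ASS `Style:` line with a different font, color and position. The helper names, the `viral-cjk` style name and the choice of Noto Sans CJK JP are assumptions for illustration. How ViralMint exposes these settings may differ, so treat this as a picture of the underlying ASS fields rather than the app's configuration format.

```python
def ass_colour(rgb_hex: str, alpha: int = 0) -> str:
    """Convert '#RRGGBB' to ASS '&HAABBGGRR' (alpha 0 means fully opaque)."""
    value = rgb_hex.lstrip("#")
    if len(value) != 6:
        raise ValueError("expected a colour like '#RRGGBB'")
    rr, gg, bb = value[0:2], value[2:4], value[4:6]
    return f"&H{alpha:02X}{bb}{gg}{rr}".upper()


def style_line(name: str, font: str, size: int, primary_rgb: str,
               bold: bool = True, alignment: int = 2,
               margin_v: int = 40) -> str:
    """Build a V4+ Style line; alignment uses numpad layout (2 = bottom centre)."""
    fields = [
        name, font, str(size),
        ass_colour(primary_rgb),   # PrimaryColour
        ass_colour("#FF0000"),     # SecondaryColour (used for karaoke effects)
        ass_colour("#000000"),     # OutlineColour
        ass_colour("#000000"),     # BackColour
        "-1" if bold else "0",     # Bold (-1 is true in ASS)
        "0", "0", "0",             # Italic, Underline, StrikeOut
        "100", "100", "0", "0",    # ScaleX, ScaleY, Spacing, Angle
        "1", "3", "0",             # BorderStyle, Outline, Shadow
        str(alignment), "10", "10", str(margin_v),
        "1",                       # Encoding
    ]
    return "Style: " + ",".join(fields)


if __name__ == "__main__":
    print(style_line("viral-cjk", "Noto Sans CJK JP", 56, "#FFFF00"))
```

ASS stores colors in blue-green-red order, so the yellow `#FFFF00` becomes `&H0000FFFF`.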
Is this part of the open-source ViralMint?
Yes. The Captions tool is part of the ViralMint codebase, which is open source on GitHub at github.com/openclaw-easy/ViralMint under the AGPL-3.0 license. The Whisper integration, ASS caption generator, and FFmpeg burn-in are all open code you can read, fork, and extend.
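The repository is the reference for how the burn-in is actually done. Purely as an illustration of what burning an ASS file into a video with FFmpeg generally looks like, the sketch below assembles a command using FFmpeg's `ass` video filter, which requires an FFmpeg build with libass. The function name and file names are hypothetical.

```python
import shlex
from typing import List


def burn_in_command(video: str, ass_file: str, output: str) -> List[str]:
    """Build an FFmpeg command that renders ASS captions into the video frames.

    Note: paths containing characters such as ':' or ',' need extra escaping
    inside the filter argument; this sketch assumes simple file names.
    """
    return [
        "ffmpeg", "-y",
        "-i", video,
        "-vf", f"ass={ass_file}",  # draw subtitles onto the video stream
        "-c:a", "copy",            # keep the original audio untouched
        output,
    ]


if __name__ == "__main__":
    cmd = burn_in_command("input.mp4", "captions.ass", "captioned.mp4")
    print(shlex.join(cmd))
    # To actually run it: subprocess.run(cmd, check=True)
```

Passing the command as a list rather than a shell string avoids shell quoting problems when file names contain spaces.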