1,000+ sites, including YouTube, TikTok, Bilibili, Twitter, Reddit, Twitch, Vimeo, and Instagram. ViralMint wraps it with curl-cffi browser impersonation, cookie-auth extraction, PoT fallbacks, and a Playwright-driven manifest-capture path for hard-to-extract sites.
Local int8 transcription with word-level timestamps. Bundled small.en + small (multilingual) models; medium / large-v3 selectable in Settings. ~30s to transcribe a 5-minute video on a mid-range laptop CPU.
Clip stitching, ASS subtitle burn-in (word-by-word from Whisper timestamps), audio mixing with ducking, watermark overlay, reframe (MediaPipe face-tracking), EBU R128 normalization, silence removal, multi-aspect export. All composable.
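To illustrate the word-by-word subtitle idea, here is a minimal sketch that turns word-level timestamps into ASS `Dialogue` events, one per word. The `Word` dataclass and the `ass_time` and `word_events` helpers are hypothetical names for this example, not ViralMint's actual code; the only thing taken from the ASS format itself is the event field order and the `H:MM:SS.cc` timestamp layout.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """Hypothetical container for one transcribed word (times in seconds)."""
    text: str
    start: float
    end: float


def ass_time(seconds: float) -> str:
    """Format seconds as an ASS timestamp: H:MM:SS.cc (centiseconds)."""
    cs = max(0, round(seconds * 100))
    h, rem = divmod(cs, 360_000)
    m, rem = divmod(rem, 6_000)
    s, cs = divmod(rem, 100)
    return f"{h}:{m:02d}:{s:02d}.{cs:02d}"


def word_events(words: list[Word], style: str = "Default") -> list[str]:
    """Build one Dialogue event per word so each word shows while it is spoken."""
    events = []
    for w in words:
        # Braces open ASS override tags, so swap them out of plain text
        # (a simplification chosen for this sketch).
        text = w.text.strip().replace("{", "(").replace("}", ")")
        # Field order: Layer, Start, End, Style, Name, MarginL, MarginR,
        # MarginV, Effect, Text
        events.append(
            f"Dialogue: 0,{ass_time(w.start)},{ass_time(w.end)},"
            f"{style},,0,0,0,,{text}"
        )
    return events
```

A complete `.ass` file also needs `[Script Info]`, `[V4+ Styles]`, and `[Events]` sections around these lines; the sketch only covers the per-word events.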
Every operation above is exposed as a /api/* endpoint. SQLite job tracking, async dispatch via asyncio (no Redis / Celery), WebSocket for real-time progress events.
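The SQLite-plus-asyncio pattern described above can be sketched with the standard library alone. Everything here is an assumption made for illustration: the `jobs` table layout, the status strings, and the `init_db`, `submit`, and `run_job` names are hypothetical and may not match ViralMint's schema or internals.

```python
import asyncio
import sqlite3
import uuid


def init_db(conn: sqlite3.Connection) -> None:
    """Create a hypothetical jobs table (assumed schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        "id TEXT PRIMARY KEY, kind TEXT, status TEXT, progress REAL)"
    )
    conn.commit()


def set_status(conn: sqlite3.Connection, job_id: str,
               status: str, progress: float) -> None:
    """Persist the latest state so any reader can see it."""
    conn.execute(
        "UPDATE jobs SET status = ?, progress = ? WHERE id = ?",
        (status, progress, job_id),
    )
    conn.commit()


async def run_job(conn: sqlite3.Connection, job_id: str, steps: int = 4) -> None:
    """Run a job as a plain asyncio task, recording progress after each step."""
    set_status(conn, job_id, "running", 0.0)
    try:
        for i in range(1, steps + 1):
            await asyncio.sleep(0)  # stand-in for real work (download, encode, ...)
            set_status(conn, job_id, "running", i / steps)
        set_status(conn, job_id, "done", 1.0)
    except Exception:
        set_status(conn, job_id, "failed", 0.0)
        raise


def submit(conn: sqlite3.Connection, kind: str) -> tuple[str, asyncio.Task]:
    """Insert a queued row and schedule the job on the running event loop."""
    job_id = uuid.uuid4().hex
    conn.execute("INSERT INTO jobs VALUES (?, ?, 'queued', 0.0)", (job_id, kind))
    conn.commit()
    return job_id, asyncio.create_task(run_job(conn, job_id))


async def demo() -> tuple:
    conn = sqlite3.connect(":memory:")
    init_db(conn)
    job_id, task = submit(conn, "stitch")
    await task
    return conn.execute(
        "SELECT status, progress FROM jobs WHERE id = ?", (job_id,)
    ).fetchone()
```

Running `asyncio.run(demo())` drives one job from `queued` to `done`. The sketch makes its SQLite calls synchronously inside the event loop to stay short; a real service would likely move them off the loop and push each progress update to WebSocket subscribers.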
86 MCP tools mounted at /mcp. Claude Code / Claude Desktop / Cursor can drive the entire pipeline from natural-language chat. wait_for_job helper for multi-step orchestration.
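For readers wondering what a `wait_for_job`-style helper does during multi-step orchestration, the sketch below polls a job until it reaches a terminal state or a timeout expires. It is a hypothetical re-implementation for illustration only: the `get_status` callable, the status names in `TERMINAL`, and the parameters are assumptions, not the signature of the actual MCP tool.

```python
import time
from typing import Callable

# Assumed terminal states; the real status names may differ.
TERMINAL = {"done", "failed", "cancelled"}


def wait_for_job(
    get_status: Callable[[str], dict],
    job_id: str,
    timeout: float = 300.0,
    interval: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll get_status(job_id) until the job finishes or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        job = get_status(job_id)
        if job["status"] in TERMINAL:
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still {job['status']!r} after {timeout}s")
        sleep(interval)
```

With a helper like this, an agent can chain steps (download, then transcribe, then export) by blocking on each job before starting the next, instead of guessing when the previous step finished.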