Choose a capability set and click Load.
# MP4 (H.264)
-vf "scale=1280:-2" -c:v libx264 -crf 23 -preset medium -c:a aac -b:a 128k
# WEBM (VP9)
-vf "scale=1280:-2" -c:v libvpx-vp9 -crf 32 -b:v 0 -c:a libopus -b:a 96k
# Square social crop + scale
-vf "crop=min(iw\,ih):min(iw\,ih),scale=1080:1080"

# Extract MP3
-vn -c:a libmp3lame -q:a 2
# Extract AAC audio only
-vn -c:a aac -b:a 160k
# Keep video, remove audio
-an -c:v copy
# Remux to MP4 container
-c copy

# Extract one frame at 4.2s
-ss 00:00:04.200 -frames:v 1 frame.png
# Export a frame sequence
-vf fps=12 frame%04d.png
# High-quality GIF (palette workflow)
-vf "fps=12,scale=640:-1:flags=lanczos,split[s0][s1];[s0]palettegen[p];[s1][p]paletteuse"
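The presets above are flag fragments, not full commands. If you want to run one from a terminal, it needs an input and an output around it. Here is a minimal sketch of that, assuming a local ffmpeg install. The function name `convert_preset`, the preset keys (`mp4`, `webm`, `square`), and the `FFMPEG` override variable are made up for this example and are not part of the tool.

```shell
# Hypothetical helper: run ffmpeg with one of the video presets listed above.
# Set FFMPEG=echo to print the arguments instead of running ffmpeg.
convert_preset() {
  preset=$1; src=$2; dst=$3
  case "$preset" in
    mp4)
      # MP4 (H.264): 1280px wide, even height, AAC audio
      "${FFMPEG:-ffmpeg}" -i "$src" -vf "scale=1280:-2" \
        -c:v libx264 -crf 23 -preset medium -c:a aac -b:a 128k "$dst" ;;
    webm)
      # WEBM (VP9): constant-quality mode (-crf with -b:v 0), Opus audio
      "${FFMPEG:-ffmpeg}" -i "$src" -vf "scale=1280:-2" \
        -c:v libvpx-vp9 -crf 32 -b:v 0 -c:a libopus -b:a 96k "$dst" ;;
    square)
      # Centered square crop on the shorter side, then scale to 1080x1080
      "${FFMPEG:-ffmpeg}" -i "$src" \
        -vf "crop=min(iw\,ih):min(iw\,ih),scale=1080:1080" "$dst" ;;
    *)
      echo "unknown preset: $preset" >&2
      return 1 ;;
  esac
}
```

For example, `convert_preset webm input.mp4 out.webm` runs the VP9 preset, where `input.mp4` and `out.webm` are placeholder paths.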
Use URL fetch + yt-dlp server-side and ffmpeg.wasm client-side to pull media, scrub timelines, extract frames, and convert image sequences to MP4/WEBM/GIF.
yt-dlp
ffmpeg -i input.mp4 -vf "select=eq(n\,42)" -vframes 1 frame.png
ffmpeg -framerate 12 -i frame%04d.png -c:v libx264 out.mp4
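The `select=eq(n\,42)` filter picks a frame by index, which means ffmpeg decodes every frame before it. On long videos, seeking with `-ss` is usually quicker. Below is a rough sketch of turning a frame index into a seek time. It assumes a constant frame rate, 0-based frame numbers, and a stream that starts at t=0, so it may land on a neighbouring frame for other sources. The helper names and the `FFMPEG` override variable are hypothetical.

```shell
# Hypothetical helper: timestamp in seconds of frame N at a constant FPS.
# Assumes 0-based frame numbers and a stream that starts at t=0.
frame_to_seconds() {
  awk -v n="$1" -v fps="$2" 'BEGIN { printf "%.3f\n", n / fps }'
}

# Hypothetical helper: save frame N as an image by seeking instead of filtering.
# Arguments: input file, frame number, frames per second, output image.
# Set FFMPEG=echo to print the arguments instead of running ffmpeg.
grab_frame_by_index() {
  ts=$(frame_to_seconds "$2" "$3") || return 1
  "${FFMPEG:-ffmpeg}" -ss "$ts" -i "$1" -frames:v 1 "$4"
}
```

With these assumptions, `grab_frame_by_index input.mp4 42 12 frame.png` seeks to 3.500 seconds and writes a single frame.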
No installs, no flags, no setup. Just the frame you want, fast.
The server uses yt-dlp to resolve and fetch media from supported platforms.
From there, everything runs locally in your browser with
ffmpeg.wasm: decode, scrub, export PNG.
The server never touches frames or images. It only retrieves the video.
AI tools like ChatGPT and Grok work best with still images. A clean PNG frame is easy to annotate, modify, or remix with a prompt.
Why not just use yt-dlp and ffmpeg?
If you’re set up already, do it. This is for when you want the same result with
zero friction.
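For reference, the do-it-yourself route looks roughly like this. It is a sketch, assuming yt-dlp and ffmpeg are both installed and that the source offers a single-file MP4. The function name `fetch_and_grab`, the `input.mp4` file name, and the `YTDLP`/`FFMPEG` override variables are invented for the example.

```shell
# Hypothetical helper: download a video with yt-dlp, then save one PNG frame.
# Arguments: video URL, timestamp (e.g. 00:00:04.200), output image.
# Set YTDLP=echo and FFMPEG=echo to print the arguments instead of running.
fetch_and_grab() {
  url=$1; ts=$2; out=$3
  # Ask for the best single-file MP4; this fails if the site has none.
  "${YTDLP:-yt-dlp}" -f "b[ext=mp4]" -o input.mp4 "$url" || return 1
  # Seek to the timestamp and write exactly one frame.
  "${FFMPEG:-ffmpeg}" -ss "$ts" -i input.mp4 -frames:v 1 "$out"
}
```

Called as `fetch_and_grab "<video URL>" 00:00:04.200 frame.png`, it leaves `input.mp4` and `frame.png` in the current directory.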
Why might “Load URL” fail?
Some sources need auth/geo access. If so, download the media yourself and drop it above.
What runs server-side?
Only yt-dlp to fetch the MP4. All decoding and frame extraction
happens in your browser via ffmpeg.wasm. Frames never leave your
device.
Is this free?
Yes. No account, no signup.