I Built a Local Video Processing Workstation with AI - Here's the Complete Journey
The Problem
Most video processing tools force you to:
- Upload files to the cloud (privacy concerns)
- Deal with file size limits
- Pay for premium features
I wanted a fully local, feature-rich, good-looking video processing tool. So I built ClipForge.
What is ClipForge?
ClipForge is a desktop app that handles 20+ video/audio operations locally:
- Transcode - MP4, WebM, MKV, MOV, AVI, GIF
- Visual - Crop, watermark removal, rotate, color adjust, denoise
- Speed - 0.25x to 4x, reverse, boomerang, loop, fade
- Audio - Extract, mute, volume, normalize
- Composite - Concatenate, side-by-side, picture-in-picture, overlay, subtitles
Three modes: Single operation, Stack (chain multiple ops), Batch processing.
Built with Electron, React, FFmpeg, and Zustand. Ships for Windows, macOS, and Linux.
The Tech Stack
| Layer | Tech | Why |
|---|---|---|
| Desktop | Electron 42 | Cross-platform, Node.js for FFmpeg |
| UI | React 18 + TypeScript | Component ecosystem |
| Build | Vite 5 + Electron Forge | Fast HMR, clean packaging |
| State | Zustand | Simple, no boilerplate |
| Styling | Tailwind CSS | Rapid UI development |
| Video | FFmpeg (bundled) | Industry-standard processing |
| AI | Claude Code | Pair programming assistant |
Development Journey
Step 1: Scaffold
npm create electron-app@latest clipforge -- --template=vite-typescript
Electron Forge generated the boilerplate: main process, preload script, renderer with Vite.
Step 2: UI Layout
Built a 4-panel layout:
- Left: Media pool + operation library
- Center: Real-time preview canvas
- Right: Parameter inspector
- Bottom: Stack/Batch queue + logs
Dark theme with Tailwind CSS.
Step 3: FFmpeg Integration (The Hard Part)
This is the core challenge - wrapping FFmpeg's CLI into visual operations.
Architecture:
Renderer (React)
│ invoke('process:start', request)
▼
Preload (IPC bridge)
│
▼
Main Process (Node.js)
│ composeArgs(request) → ffmpeg args array
▼
FFmpeg (child_process.spawn)
│ progress parsing from stderr
▼
Events back to renderer
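The two main-process steps in the diagram can be sketched roughly as follows. This is a minimal illustration, not ClipForge's actual code: the `ProcessRequest` shape, the two sample operations, and `parseProgress` are hypothetical, and only the name `composeArgs` comes from the diagram (its real signature may differ). The flags used (`-y`, `-i`, `-an`, `-c:v copy`, `-filter:a volume=...`) are standard FFmpeg options, and the parser assumes the usual `time=HH:MM:SS.xx` field in FFmpeg's stderr status lines.

```typescript
// Hypothetical request shape; the real request likely carries more fields.
type Operation =
  | { type: "mute" }
  | { type: "volume"; gain: number };

interface ProcessRequest {
  input: string;
  output: string;
  op: Operation;
}

// Build an argv array for spawn(). Each argument is its own array element,
// so paths with spaces need no shell quoting.
function composeArgs(req: ProcessRequest): string[] {
  const args = ["-y", "-i", req.input];
  const op = req.op;
  switch (op.type) {
    case "mute":
      // Drop the audio stream and copy the video stream unchanged.
      args.push("-an", "-c:v", "copy");
      break;
    case "volume":
      // Re-encode audio with a gain factor; video is copied unchanged.
      args.push("-filter:a", `volume=${op.gain}`, "-c:v", "copy");
      break;
  }
  args.push(req.output);
  return args;
}

// Extract progress (0..1) from one stderr status line, or null if the line
// carries no usable timestamp (for example "time=N/A").
function parseProgress(line: string, durationSec: number): number | null {
  const m = /time=(\d+):(\d{2}):(\d{2}(?:\.\d+)?)/.exec(line);
  if (!m || durationSec <= 0) return null;
  const elapsed = Number(m[1]) * 3600 + Number(m[2]) * 60 + Number(m[3]);
  return Math.min(1, Math.max(0, elapsed / durationSec));
}
```

In the main process, each chunk read from the spawned FFmpeg process's stderr would be passed through `parseProgress`, and any non-null result forwarded to the renderer as a progress event.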
Example: Watermark Removal
Instead of FFmpeg's delogo filter (which has boundary restrictions - x≥1, y≥1, no edge support), I used a crop + blur + overlay approach:
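case 'delogo': {
  // Sanitize the selected region: non-negative origin, at least 10x10 px.
  const x = Math.max(0, Math.round(Number(p.x) || 0));
  const y = Math.max(0, Math.round(Number(p.y) || 0));
  const w = Math.max(10, Math.round(Number(p.w) || 10));
  const h = Math.max(10, Math.round(Number(p.h) || 10));
  args.push('-filter_complex',
    // Duplicate the video: [a] stays untouched, [b] feeds the blurred patch.
    `[0:v]split[a][b];` +
    // Crop the region, blur it, and give it 70% opacity.
    `[b]crop=${w}:${h}:${x}:${y},gblur=sigma=30,format=rgba,colorchannelmixer=aa=0.7[b2];` +
    // Composite the patch back at its original position.
    `[a][b2]overlay=${x}:${y}[out]`
  );
  // Map the filtered video, plus the audio track if one exists.
  args.push('-map', '[out]', '-map', '0:a?');
  args.push(...videoCodec(outExt));
  break;
}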
The filter graph:
- crop - extract the watermark region
- gblur - Gaussian blur (more natural than boxblur)
- colorchannelmixer=aa=0.7 - semi-transparent blend for smooth integration
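One boundary case the snippet above does not cover is a selection that extends past the frame; FFmpeg's crop filter rejects a width or height larger than its input. A minimal sketch of clamping the region first, assuming the frame dimensions are already known (for example from the video element). `Region` and `clampRegion` are hypothetical names, not part of ClipForge:

```typescript
interface Region {
  x: number;
  y: number;
  w: number;
  h: number;
}

// Clamp a selection so it lies fully inside a frameW x frameH frame.
// Size is clamped first, then the origin is shifted so the box still fits.
function clampRegion(r: Region, frameW: number, frameH: number, minSize = 10): Region {
  const w = Math.min(frameW, Math.max(minSize, Math.round(r.w)));
  const h = Math.min(frameH, Math.max(minSize, Math.round(r.h)));
  const x = Math.min(frameW - w, Math.max(0, Math.round(r.x)));
  const y = Math.min(frameH - h, Math.max(0, Math.round(r.y)));
  return { x, y, w, h };
}
```

With a 1920x1080 frame, a 100x50 selection starting at x=1900 would be shifted left to x=1820 so the whole patch stays inside the picture.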
Step 4: Real-time Preview
Users need to see changes immediately, not after processing completes.
Solution: Canvas-based preview simulation. Instead of running FFmpeg, read frames from the <video> element and apply operations on a <canvas>:
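useEffect(() => {
  let raf = 0; // id of the pending animation frame, if any
  const render = () => {
    drawPreview(ctx, video, previewOps, {
      width: rect.width,
      height: rect.height
    });
  };
  render(); // immediate draw
  if (playing) {
    // Redraw on every animation frame while the video plays.
    const loop = () => {
      render();
      raf = requestAnimationFrame(loop);
    };
    raf = requestAnimationFrame(loop);
  }
  // Stop the loop when the effect re-runs or the component unmounts.
  return () => cancelAnimationFrame(raf);
}, [playing, playhead, JSON.stringify(previewOps)]);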
Adjusting brightness, crop region, or rotation shows instant feedback.
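`drawPreview` itself is not shown here. One way such a function could approximate color adjustments is through the 2D canvas context's `filter` property, which accepts CSS filter strings in Chromium and therefore in Electron. A sketch under that assumption; the `PreviewOp` shape and `toCanvasFilter` are hypothetical:

```typescript
// Hypothetical preview operations; a value of 1 means "unchanged".
type PreviewOp =
  | { type: "brightness"; value: number }
  | { type: "contrast"; value: number }
  | { type: "saturate"; value: number }
  | { type: "blur"; radiusPx: number };

// Translate preview operations into a CSS filter string for ctx.filter.
function toCanvasFilter(ops: PreviewOp[]): string {
  const parts = ops.map((op) => {
    switch (op.type) {
      case "blur":
        return `blur(${op.radiusPx}px)`;
      default:
        // brightness, contrast, and saturate share the same CSS syntax.
        return `${op.type}(${op.value})`;
    }
  });
  return parts.length > 0 ? parts.join(" ") : "none";
}
```

The draw step would then set `ctx.filter = toCanvasFilter(previewOps)` before calling `ctx.drawImage(video, 0, 0, width, height)`. The result only approximates FFmpeg's output, since the browser's filters and FFmpeg's are separate implementations.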
Step 5: Mouse Region Selection
For watermark removal, users drag to select the area. Screen coordinates must convert to video pixel coordinates (accounting for letterbox scaling):
function screenToVideo(localX, localY, container, videoW, videoH) {
  const { scale, ox, oy } = getVideoMapping(container, videoW, videoH);
  return {
    x: Math.max(0, Math.min(Math.round((localX - ox) / scale), videoW)),
    y: Math.max(0, Math.min(Math.round((localY - oy) / scale), videoH)),
  };
}
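`getVideoMapping` is not shown above. A plausible sketch, assuming the video is scaled uniformly to fit the container and centered (the same behavior as CSS `object-fit: contain`), and that `container` exposes `width` and `height`. The types are hypothetical:

```typescript
interface Size {
  width: number;
  height: number;
}

interface VideoMapping {
  scale: number; // displayed pixels per video pixel
  ox: number;    // left letterbox offset in displayed pixels
  oy: number;    // top letterbox offset in displayed pixels
}

// Compute how a videoW x videoH frame is letterboxed inside the container.
function getVideoMapping(container: Size, videoW: number, videoH: number): VideoMapping {
  // The tighter axis decides the scale so the whole frame stays visible.
  const scale = Math.min(container.width / videoW, container.height / videoH);
  const ox = (container.width - videoW * scale) / 2;
  const oy = (container.height - videoH * scale) / 2;
  return { scale, ox, oy };
}
```

For a 1920x1080 video in a 960x720 container this gives a scale of 0.5 with 90 px bars above and below, so the container's center point (480, 360) maps to video pixel (960, 540).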
Bug I hit: The onUp callback captured stale state from useState. Fixed by using useRef for live coordinates during drag.
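The fix can be illustrated without React. In the sketch below, a plain `{ current }` object stands in for what `useRef` returns. Handlers read `.current` at the moment they run, so `onUp` always sees the latest pointer position rather than a value captured when the handler was created. `createDragTracker` and its handler names are hypothetical, not ClipForge's actual code:

```typescript
interface Point {
  x: number;
  y: number;
}

// Same shape as the object returned by React's useRef.
interface Ref<T> {
  current: T;
}

function createDragTracker(onSelect: (start: Point, end: Point) => void) {
  const start: Ref<Point | null> = { current: null };
  const latest: Ref<Point | null> = { current: null };
  return {
    onDown(p: Point): void {
      start.current = p;
      latest.current = p;
    },
    onMove(p: Point): void {
      // Mutating the ref triggers no re-render and is visible to every handler.
      if (start.current) latest.current = p;
    },
    onUp(): void {
      // Read the live values now instead of relying on a captured snapshot.
      if (start.current && latest.current) onSelect(start.current, latest.current);
      start.current = null;
      latest.current = null;
    },
  };
}
```

In a component, the two refs would come from `useRef`, and `useState` would only hold whatever needs to be drawn, such as the selection rectangle.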
Step 6: Packaging & CI/CD
Electron packaging is tricky - FFmpeg binaries can't go inside the asar archive, and Linux needs lowercase executable names.
forge.config.ts:
packagerConfig: {
  asar: {
    unpackDir: 'src/main/ffmpeg'
  },
  extraResource: ['src/main/ffmpeg'],
  executableName: 'clipforge',
}
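With `extraResource`, the packager copies the folder into the app's resources directory, which the main process can reach through `process.resourcesPath`. A sketch of resolving the binary at runtime, assuming the binary sits directly inside that folder and is named `ffmpeg` (`ffmpeg.exe` on Windows); the actual layout in ClipForge may differ. `RuntimeEnv` and `resolveFfmpegPath` are hypothetical names:

```typescript
// Values the caller reads from Electron/Node and passes in, which keeps
// this function pure and easy to test.
interface RuntimeEnv {
  isPackaged: boolean;   // e.g. app.isPackaged
  resourcesPath: string; // e.g. process.resourcesPath
  devFfmpegDir: string;  // e.g. an absolute path to src/main/ffmpeg
  platform: string;      // e.g. process.platform
}

function resolveFfmpegPath(env: RuntimeEnv): string {
  const isWindows = env.platform === "win32";
  const sep = isWindows ? "\\" : "/";
  const binary = isWindows ? "ffmpeg.exe" : "ffmpeg";
  // Assumption: the packaged folder keeps its base name, "ffmpeg".
  const baseDir = env.isPackaged
    ? [env.resourcesPath, "ffmpeg"].join(sep)
    : env.devFfmpegDir;
  return [baseDir, binary].join(sep);
}
```

The returned path is what would be handed to `child_process.spawn` in place of a bare `ffmpeg`, so the app does not depend on FFmpeg being installed on the user's machine.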
GitHub Actions builds all three platforms in parallel:
jobs:
  build:
    strategy:
      matrix:
        os: [macos-latest, ubuntu-latest, windows-latest]
    runs-on: ${{ matrix.os }}
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
      - run: npm ci
      - run: npm run make
Push a tag → auto-build → auto-publish to GitHub Releases.
Lessons Learned
AI Coding is "Efficient Coding", Not "No Coding"
Claude Code handled tedious work (Electron packaging, FFmpeg arg mapping, IPC boilerplate), but I still needed to:
- Make architectural decisions
- Review generated code
- Debug edge cases (stale closures, boundary conditions)
Ship the Core Flow First
Got "open file → select operation → process → output" working before adding preview, batch mode, or i18n.
Packaging is the Last Minefield
Binary files, asar compression, platform-specific naming - expect to spend time here. Automate with CI early.
The Result
- 3 weeks of part-time work
- 20+ operations across 6 categories
- 3 platforms supported
- Fully local processing
- Open source: github.com/mayu888/clipforge
License: MIT + Commons Clause (free for personal use, commercial use requires authorization).
What's Next
- Drag-and-drop operations between panels
- More filter effects (LUTs, stabilization)
- Plugin system for custom operations
Built with Electron, FFmpeg, and a lot of help from AI. The future of indie development is here.