jberthom's comments

Thanks! Yeah Claude Code’s native browser is getting better. ProofShot is agent-agnostic though — works with Cursor, Codex, any agent that can run shell commands. And the proof bundle you get at the end (viewer.html with video + timeline + errors) is what I actually review, not the raw screenshots.

Thanks! Web only for now, runs headless Chromium. Desktop is the #1 request, likely through accessibility APIs or OS-level screenshots. On the roadmap.

Web only for now since it runs headless Chromium. For Flutter web builds it’d work, but native Flutter apps would need emulator integration which is on the roadmap. Feel free to open an issue on the repo.

Fair point, clearly the first question everyone has. Will add a comparison section to the README.

Web only for now. It runs headless Chromium under the hood. Desktop and mobile are the #1 request. The mobile path would be iOS Simulator or Android emulator integration. Desktop would need accessibility APIs or OS-level screenshot capture. It’s on the roadmap. Feel free to open an issue on the repo if that’s critical for you.

Right now agent-browser launches a fresh Chromium instance each time, so no persisted auth. For apps behind login, you’d need to either hit a page that doesn’t require auth, or script the login as part of your proofshot exec steps (type email, type password, click submit). Cookie/session injection is something I want to add; it would make the auth flow much smoother.
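To make the cookie/session injection idea concrete, here is a minimal sketch of the selection step such a feature might need: given cookies saved from an earlier logged-in session, pick the ones that are still valid for the host under test. This is not ProofShot or agent-browser code. The function name `cookies_to_inject` and the cookie dict shape (`name`, `value`, `domain`, `expires`) are assumptions made purely for illustration, and the domain check is a simplified version of the usual suffix-matching rule.

```python
import time


def cookies_to_inject(saved, host, now=None):
    """Return the saved cookies that are unexpired and apply to `host`.

    `saved` is a list of dicts with hypothetical keys:
    name, value, domain, expires (epoch seconds, or None for a session cookie).
    """
    now = time.time() if now is None else now
    usable = []
    for cookie in saved:
        expires = cookie.get("expires")
        # Skip cookies whose expiry time has already passed.
        if expires is not None and expires <= now:
            continue
        # A leading dot is treated the same as no dot for matching purposes.
        domain = cookie["domain"].lstrip(".")
        # Match the exact host, or any subdomain of the cookie's domain.
        if host == domain or host.endswith("." + domain):
            usable.append(cookie)
    return usable


if __name__ == "__main__":
    saved = [
        {"name": "sid", "value": "abc", "domain": ".example.com", "expires": 2000},
        {"name": "old", "value": "x", "domain": "example.com", "expires": 500},
        {"name": "other", "value": "y", "domain": "other.test", "expires": None},
    ]
    for c in cookies_to_inject(saved, "app.example.com", now=1000):
        print(c["name"])
```

Until something like this exists, scripting the login steps as described above remains the way to get past an auth wall.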

Interesting, which model were you using for the vision part? In my experience Claude Sonnet and Opus handle UI screenshots reasonably well: not perfect, but good enough that the agent can catch obvious layout issues and iterate. Definitely not at the “pixel perfect design implementation” stage yet though. For testing features it’s fine. The goal there is for the agent to check that the UX/UI flow works, not that every pixel is aligned.

agent-browser runs locally (it’s a Rust CLI + Node daemon on your machine), so there’s no cloud dependency on Vercel, it’s just built by the Vercel Labs team. Everything stays local :)

Simon’s tools are really great. Showboat is more for static screenshots though. ProofShot is the full session: recording, error capture, action timeline, PR upload. Different scope, I’d say.

The agent drives interactions through proofshot exec (clicks, typing, navigation), and each action gets logged with timestamps synced to the video. So in the viewer you can scrub through and click on action markers to jump to specific moments. It captures what happened during interaction, not just what the page looked like at rest. I had recordings where the agent struggled (for instance when it had to click toggle buttons). It was fascinating to watch: the agent just tried again and again, like a toddler figuring out how to use a keyboard, and after 3 tries figured it out on his/her own (trying not to misgender the babies of future AGI).
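For anyone curious how timestamps synced to the video can drive clickable markers, here is a rough sketch of the general idea, not ProofShot's actual implementation. It assumes each logged action carries an absolute timestamp in milliseconds and that the recording start time is known. The `Action` class and both function names are hypothetical.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class Action:
    kind: str    # e.g. "click", "type", "navigate"
    target: str  # free-form description of what was acted on
    at_ms: int   # absolute timestamp of the action, in milliseconds


def to_markers(actions, recording_start_ms):
    """Turn absolute timestamps into (offset_seconds, action) pairs, sorted by offset.

    Actions logged before the recording started are dropped.
    """
    markers = [
        ((a.at_ms - recording_start_ms) / 1000.0, a)
        for a in actions
        if a.at_ms >= recording_start_ms
    ]
    return sorted(markers, key=lambda m: m[0])


def marker_at(markers, video_time_s):
    """Return the most recent action at or before `video_time_s`, or None."""
    offsets = [offset for offset, _ in markers]
    i = bisect_right(offsets, video_time_s) - 1
    return markers[i][1] if i >= 0 else None


if __name__ == "__main__":
    log = [
        Action("click", "toggle button", 3000),
        Action("navigate", "/settings", 1500),
        Action("type", "email field", 5500),
    ]
    markers = to_markers(log, recording_start_ms=1000)
    current = marker_at(markers, 3.0)
    print(current.kind if current else "no action yet")
```

Clicking a marker is then the reverse lookup: seek the video to that marker's offset in seconds.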
