A few days ago I was playing around with local LLMs — Qwen, Gemma, the usual suspects — running them through llama.cpp on my machine. Everything was working great until I thought: wouldn’t it be nice if I could point one of these at a web page and just ask questions about it?
Claude has a browser plugin that does exactly this. It’s slick, it works, and it’s closed-source and locked to Claude only. I wanted the same thing for my local models.
So I went hunting on GitHub for an open-source equivalent. I figured someone must have built this by now. There’s OpenClaw (which is more of an agent framework than a browser extension), and there are a few half-finished reverse-engineered clones of Claude’s extension, but nothing that felt like “install this, pick any provider, go.” Nothing that treated local models as first-class citizens.
So I opened my editor and started vibe-coding.
The whole thing came together faster than I expected. Chrome Manifest V3 gives you a side panel API, which is basically a persistent UI slot next to your tab. Drop a chat interface in there, wire it up to any OpenAI-compatible API endpoint, add a content script that can read and interact with the DOM, and you’ve got most of what Claude’s extension does.
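The manifest wiring for that is small. Here’s a sketch of what a minimal Manifest V3 setup looks like — file names and values are illustrative, not WebBrain’s actual manifest:

```json
{
  "manifest_version": 3,
  "name": "WebBrain",
  "version": "0.9.0",
  "permissions": ["sidePanel", "activeTab", "storage", "scripting"],
  "side_panel": { "default_path": "sidepanel.html" },
  "content_scripts": [
    { "matches": ["<all_urls>"], "js": ["content.js"] }
  ]
}
```

The `side_panel.default_path` key plus the `sidePanel` permission is all Chrome needs to give you that persistent UI slot.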
The architecture is deliberately simple:
Two modes: Ask (read-only, just analyzes the page) and Act (full automation — clicking, typing, navigating). Ask mode is the default because agents that can click stuff on your behalf are scary until you’ve tested them.
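One clean way to enforce the Ask/Act split is to gate the tool list by mode. This is a sketch of the idea, not WebBrain’s actual API — the tool names and the `mutates` flag are illustrative:

```javascript
// Each tool declares whether it mutates the page. Ask mode (the
// default) only ever sees the read-only tools, so the agent physically
// cannot click or type no matter what the model asks for.
const TOOLS = [
  { name: "read_page", mutates: false, run: (page) => page.text },
  { name: "click",     mutates: true,  run: (page) => { page.clicks++; return "clicked"; } },
  { name: "type_text", mutates: true,  run: (page) => { page.typed = true; return "typed"; } },
];

function toolsForMode(mode) {
  // "ask" = read-only; "act" = everything
  return mode === "act" ? TOOLS : TOOLS.filter((t) => !t.mutates);
}
```

Filtering at the tool-registry level is safer than prompting the model to behave, because the restriction holds even when the model ignores instructions.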
No CDP (Chrome DevTools Protocol), no remote debugging, no relay server. Just content scripts and the extension APIs. This means some things aren’t possible — file downloads, proper shadow DOM traversal, cross-origin iframes — but it also means the thing just works when you install it. No weird setup, no “run this helper binary on your machine,” no extra permissions.
I also ported it to Firefox. The Firefox version uses sidebar_action instead of sidePanel and browser.* instead of chrome.*, but otherwise it’s the same codebase.
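Keeping one codebase across both browsers mostly comes down to a small shim at startup. A sketch, assuming a `getExtensionApi` helper (the helper name is mine, not the project’s) — globals are passed in so the logic is testable:

```javascript
// Firefox exposes a promise-based `browser.*` namespace; Chrome exposes
// callback-style `chrome.*`. Prefer `browser` when it exists, fall back
// to `chrome` otherwise, and the rest of the code uses one handle.
function getExtensionApi(globals) {
  if (typeof globals.browser !== "undefined") return globals.browser;
  return globals.chrome;
}
```

The manifest differences (`sidebar_action` vs. `side_panel`) still need two manifest files, but the runtime code can stay shared.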
A few things were harder than I thought:
Markdown rendering inside the side panel. Code blocks were a pain because I was doing single-pass regex substitution, which breaks when HTML escaping runs over code that contains < characters. Ended up doing a multi-pass approach: extract code blocks as placeholders, escape the rest, format, then stitch it back together.
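The multi-pass idea can be sketched in a few lines. This is a simplified illustration of the technique (a real renderer handles far more syntax), not the extension’s actual code:

```javascript
// Pass 1: pull fenced code blocks out into indexed placeholders.
// Pass 2: HTML-escape and format the remaining text.
// Pass 3: stitch the code back in, escaping its contents exactly once.
function escapeHtml(s) {
  return s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

function renderMarkdown(md) {
  const blocks = [];
  // Pass 1: each ```...``` block becomes a NUL-delimited placeholder
  // that no escaping or formatting pass will touch.
  let text = md.replace(/```([\s\S]*?)```/g, (_, code) => {
    blocks.push(code);
    return `\u0000${blocks.length - 1}\u0000`;
  });
  // Pass 2: escape everything else, then apply inline formatting.
  text = escapeHtml(text)
    .replace(/\*\*(.+?)\*\*/g, "<strong>$1</strong>")
    .replace(/`([^`]+)`/g, "<code>$1</code>");
  // Pass 3: restore code blocks with their own single escape pass.
  return text.replace(/\u0000(\d+)\u0000/g,
    (_, i) => `<pre><code>${escapeHtml(blocks[+i])}</code></pre>`);
}
```

Because code is lifted out before the inline passes run, a `<` inside a code block is escaped once and only once, and bold/inline-code regexes can’t mangle code contents.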
Per-tab conversation state. If you ask questions on Tab A, then switch to Tab B, you don’t want to see Tab A’s chat. I ended up keeping a Map<tabId, innerHTML> and swapping the DOM on tab switches. Hacky but works.
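The swap itself is only a few lines. A minimal sketch of the idea (class and method names are mine, and the real extension hangs this off tab-activation events):

```javascript
// Snapshot the chat container's innerHTML when leaving a tab, and
// restore the target tab's snapshot on arrival — or start fresh if
// that tab has no conversation yet.
class TabChatState {
  constructor(panel) {
    this.panel = panel;          // chat container element
    this.snapshots = new Map();  // tabId -> innerHTML
    this.currentTab = null;
  }
  switchTo(tabId) {
    if (this.currentTab !== null) {
      this.snapshots.set(this.currentTab, this.panel.innerHTML);
    }
    this.panel.innerHTML = this.snapshots.get(tabId) ?? "";
    this.currentTab = tabId;
  }
}
```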
Telling the user when the agent is actually doing something. Claude’s extension has this nice indicator that says “Claude started inspecting this page.” Mine didn’t, and it felt dead. So I added a banner + an extension badge that lights up whenever a tool that touches the page is running.
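The badge half of that can be a small reference counter. A sketch of the pattern (the badge API object is injected here so the logic is testable; in the extension it would be `chrome.action`):

```javascript
// Track how many page-touching tools are in flight. Light the badge
// when the first one starts, clear it when the last one finishes, so
// overlapping tool calls don't flicker the indicator.
function makeActivityIndicator(badgeApi) {
  let active = 0;
  return {
    toolStarted() {
      if (++active === 1) badgeApi.setBadgeText({ text: "●" });
    },
    toolFinished() {
      if (--active === 0) badgeApi.setBadgeText({ text: "" });
    },
  };
}
```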
Max agent steps. Agents can loop forever if you let them. I capped it at 25 steps by default and added a “Continue” button so the user can let it keep going when it actually needs more steps for a complex task.
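The capped loop is simple to sketch. This is an assumed shape, not WebBrain’s exact code — `step` stands in for one model turn plus tool execution:

```javascript
// Run agent steps until the model signals it's done or the cap is hit.
// Hitting the cap returns "paused" rather than failing, so the UI can
// offer a "Continue" button that grants another batch of steps.
const MAX_STEPS = 25;

function runAgent(step, maxSteps = MAX_STEPS) {
  for (let i = 0; i < maxSteps; i++) {
    const result = step(i); // one model turn + tool execution
    if (result.done) return { status: "done", steps: i + 1 };
  }
  return { status: "paused", steps: maxSteps };
}
```

Pausing instead of aborting matters: the conversation state survives, so “Continue” just calls the loop again rather than restarting the task.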
Plenty is still missing. Off the top of my head:
These are all solvable. I just haven’t gotten to them yet.
It’s called WebBrain. It works with OpenAI, Gemini, Anthropic, OpenRouter, and anything OpenAI-compatible (including your local llama.cpp server). It runs on Chrome and Firefox. It’s MIT licensed.
If you’ve been wanting a Claude-in-Chrome that works with your local model, give it a shot. And if you find bugs or have ideas, open an issue — this is very much a v0.9 and there’s a lot of room to grow.
tags: ai - open-source - browser-extension - llm - webbrain