Open-source CLI · ReAct agent · Ollama / Gemma
nip — A Local-First ReAct Code Agent
nip is the local-first coding agent I wanted to exist. A small CLI that reads a prompt, plans a sequence of tool calls, and rewrites your codebase the way a careful pair programmer would — without ever sending source code past the firewall. It runs on Ollama with Gemma as the reasoning model and implements a tight ReAct loop with structured, schema-validated tool calls. I built it because every customer contract that forbids cloud LLM calls pushes in the same direction, toward local inference, and I wanted to know how far the local-first version could go.
Why I built it
Most LLM coding agents in 2026 still assume a live connection to a hosted model. That assumption fails the moment your customer's contract forbids it, your repo holds private health data, or you're on a flight (which, hah, is when I tend to want to refactor things). I wanted to know how far a small open-weight model on a laptop could go if the loop around it was sharp enough.
nip is the answer: a tiny ReAct runtime, a clean tool-calling protocol, and a deliberately modest set of filesystem primitives. The intelligence lives in the loop and the prompt — not in cloud capacity.
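As a sketch of what "modest filesystem primitives" can mean in practice, here is a workspace-confined read in Python. The confinement rule (resolve the path, then check containment) is a standard trick; the helper names are my illustrative assumptions, not nip's actual code:

```python
# Sketch of a workspace-confined filesystem primitive, assuming a
# resolve-then-check containment rule. Names are illustrative, not nip's API.
from pathlib import Path

def resolve_in_workspace(root: Path, relative: str) -> Path:
    """Resolve a model-supplied path and refuse anything outside the workspace."""
    target = (root / relative).resolve()
    if not target.is_relative_to(root.resolve()):
        raise PermissionError(f"path escapes workspace: {relative!r}")
    return target

def read_file(root: Path, relative: str) -> str:
    """Read a file only after the path has been confined to the workspace."""
    return resolve_in_workspace(root, relative).read_text(encoding="utf-8")
```

Keeping every primitive behind one gatekeeper function is what makes a small tool set auditable: there is exactly one place where a path can be approved.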
How it works
Each iteration: the agent emits a thought, picks a tool (read_file, write_file, list_dir, run_shell, ...), observes the result, repeats. The loop terminates when the model emits a final answer or hits a guard rail. Tool calls are validated against a JSON schema and file paths are resolved against the workspace root before execution, so a hallucinated tool or an escaping path is rejected before it touches disk.
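The loop itself fits in a few lines. This is a minimal sketch, not nip's implementation: the step-dict format and the `ask_model` callable (standing in for an Ollama chat call) are my assumptions:

```python
# Minimal ReAct loop: thought -> tool call -> observation, repeated until the
# model emits a final answer or the step budget (a guard rail) runs out.
# The step dict shape is an illustrative assumption, not nip's wire format.
def react_loop(ask_model, tools, max_steps=16):
    transcript = []
    for _ in range(max_steps):
        step = ask_model(transcript)
        if "final" in step:                    # model is done reasoning
            return step["final"], transcript
        if step.get("tool") not in tools:      # reject hallucinated tools
            raise ValueError(f"unknown tool: {step.get('tool')!r}")
        observation = tools[step["tool"]](**step["args"])
        transcript.append({**step, "observation": observation})
    raise RuntimeError("guard rail: max_steps exceeded without a final answer")
```

Because the model is just a callable, the control flow can be tested with a fake model that returns canned steps — no GPU, no Ollama daemon.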
Gemma on Ollama is, surprisingly, quite competent at this scale — especially for narrow tasks like "refactor this callback to async/await and add an LRU cache." Not as competent as Claude. Not trying to be. The full session is auditable: every thought, tool call, observation, and patch goes to disk so you can replay or revert.
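The audit trail can be as simple as an append-only JSONL file, one record per loop step. The file layout and record fields below are my assumptions about what "every thought, tool call, observation, and patch goes to disk" would look like:

```python
# Append-only JSONL session log: one record per ReAct step, replayable later.
# Field names (a "ts" timestamp plus whatever the step carried) are
# illustrative assumptions, not nip's on-disk format.
import json
import time
from pathlib import Path

def log_step(session_file: Path, step: dict) -> None:
    """Append one step (thought / tool call / observation) to the session log."""
    record = {"ts": time.time(), **step}
    with session_file.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")

def replay(session_file: Path) -> list[dict]:
    """Load the full session back, in order, for audit or revert."""
    with session_file.open(encoding="utf-8") as f:
        return [json.loads(line) for line in f]
```

Append-only writes mean a crashed run still leaves a readable prefix of the session, which is most of what you need to revert a half-applied patch.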
What I take from it
Local-first agents aren't a downgrade — they're a different point on the curve. For surgical refactors, scaffolding, and codebase Q&A, the gap between Gemma-on-Ollama-with-a-good-loop and a hosted frontier model is much smaller than the marketing on either side will admit. And the local version keeps every byte of source code, secret, and customer data on your machine.
nip is the reference implementation I point at when a team asks whether AI-enabled web development can survive an enterprise security review. As far as I can tell, yes. The harder question is taste — what to give the agent permission to do — and that one is just as hard for the hosted models.