Open-source CLI · ReAct agent · Ollama / Gemma
nip — A Local-First ReAct Code Agent
nip is the local-first coding agent I wanted to exist. A small CLI that reads a prompt, plans a sequence of tool calls, and rewrites your codebase the way a careful pair programmer would — without ever sending source code past the firewall. It runs on Ollama with Gemma as the reasoning model and implements a tight ReAct loop with structured, schema-validated tool calls. I built it because every customer contract that forbids cloud LLM calls pushes in the same direction, toward local inference, and I wanted to know how far the local-first version could go.
Why I built it
Most LLM coding agents in 2026 still assume a live connection to a hosted model. That assumption fails the moment your customer's contract forbids it, your repo holds private health data, or you're on a flight (which, hah, is when I tend to want to refactor things). I wanted to know how far a small open-weight model on a laptop could go if the loop around it was sharp enough.
nip is the answer: a tiny ReAct runtime, a clean tool-calling protocol, and a deliberately modest set of filesystem primitives. The intelligence lives in the loop and the prompt — not in cloud capacity.
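As a sketch of what "modest filesystem primitives" can mean in practice, here is a workspace-confined read in Python. The confinement rule (resolve the path, then check containment) is a standard trick; the helper names are my illustrative assumptions, not nip's actual code:

```python
# Sketch of a workspace-confined filesystem primitive, assuming a
# resolve-then-check containment rule. Names are illustrative, not nip's API.
from pathlib import Path

def resolve_in_workspace(root: Path, relative: str) -> Path:
    """Resolve a model-supplied path and refuse anything outside the workspace."""
    target = (root / relative).resolve()
    if not target.is_relative_to(root.resolve()):
        raise PermissionError(f"path escapes workspace: {relative!r}")
    return target

def read_file(root: Path, relative: str) -> str:
    """Read a file only after the path has been confined to the workspace."""
    return resolve_in_workspace(root, relative).read_text(encoding="utf-8")
```

Keeping every primitive behind one gatekeeper function is what makes a small tool set auditable: there is exactly one place where a path can be approved.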
How it works
Each iteration: the agent emits a thought, picks a tool (read_file, write_file, list_dir, run_shell, ...), observes the result, repeats. The loop terminates when the model emits a final answer or hits a guard rail. Tool calls are validated against a JSON schema and file paths are resolved against the workspace root before execution, so a hallucinated tool or an escaping path is rejected before it touches disk.
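The loop itself fits in a few lines. This is a minimal sketch, not nip's implementation: the step-dict format and the `ask_model` callable (standing in for an Ollama chat call) are my assumptions:

```python
# Minimal ReAct loop: thought -> tool call -> observation, repeated until the
# model emits a final answer or the step budget (a guard rail) runs out.
# The step dict shape is an illustrative assumption, not nip's wire format.
def react_loop(ask_model, tools, max_steps=16):
    transcript = []
    for _ in range(max_steps):
        step = ask_model(transcript)
        if "final" in step:                    # model is done reasoning
            return step["final"], transcript
        if step.get("tool") not in tools:      # reject hallucinated tools
            raise ValueError(f"unknown tool: {step.get('tool')!r}")
        observation = tools[step["tool"]](**step["args"])
        transcript.append({**step, "observation": observation})
    raise RuntimeError("guard rail: max_steps exceeded without a final answer")
```

Because the model is just a callable, the control flow can be tested with a fake model that returns canned steps — no GPU, no Ollama daemon.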
Gemma on Ollama is, surprisingly, quite competent at this scale — especially for narrow tasks like "refactor this callback to async/await and add an LRU cache." Not as competent as Claude. Not trying to be. The full session is auditable: every thought, tool call, observation, and patch goes to disk so you can replay or revert.
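The audit trail can be as simple as an append-only JSONL file, one record per loop step. The file layout and record fields below are my assumptions about what "every thought, tool call, observation, and patch goes to disk" would look like:

```python
# Append-only JSONL session log: one record per ReAct step, replayable later.
# Field names (a "ts" timestamp plus whatever the step carried) are
# illustrative assumptions, not nip's on-disk format.
import json
import time
from pathlib import Path

def log_step(session_file: Path, step: dict) -> None:
    """Append one step (thought / tool call / observation) to the session log."""
    record = {"ts": time.time(), **step}
    with session_file.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")

def replay(session_file: Path) -> list[dict]:
    """Load the full session back, in order, for audit or revert."""
    with session_file.open(encoding="utf-8") as f:
        return [json.loads(line) for line in f]
```

Append-only writes mean a crashed run still leaves a readable prefix of the session, which is most of what you need to revert a half-applied patch.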
What I take from it
Local-first agents aren't a downgrade — they're a different point on the curve. For surgical refactors, scaffolding, and codebase Q&A, the gap between Gemma-on-Ollama-with-a-good-loop and a hosted frontier model is much smaller than the marketing on either side will admit. And the local version keeps every byte of source code, secret, and customer data on your machine.
nip is the reference implementation I point at when a team asks whether AI-enabled web development can survive an enterprise security review. As far as I can tell, yes. The harder question is taste — what to give the agent permission to do — and that one is just as hard for the hosted models.