
I build agentic AI systems and ship them to prod.
Bihan Banerjee
Creating with code. Details matter.
About
Full-stack AI engineer building agentic systems end-to-end. I published re-collect — a belief-centric memory layer for AI agents — and built Holdmind on top of it. I work across Python (FastAPI, LangGraph) and TypeScript (Node.js, React, Next.js), and everything goes to prod — Docker, Terraform, CI/CD. No notebook demos.
- Built a multi-agent app builder with a 6-node LangGraph pipeline: planning, code generation, validation, and auto-recovery.
- Built a 7-service trading platform handling 1,000+ price ticks/sec with in-memory liquidation and event sourcing.
Projects
Describe what you want in plain English and a 2-agent LangGraph system generates production-ready React apps in isolated E2B sandboxes. Sonnet 4.5 handles initial builds, o4-mini handles follow-up edits — with guardrail classification, prompt enhancement, template assembly, and a validation loop that auto-fixes build errors.
- Built a 2-agent LangGraph system: Sonnet 4.5 for high-quality initial generation (create_app + web_search tools) and o4-mini for fast follow-up edits (modify_app + chat_message + web_search), with automatic error fixing for up to 3 attempts.
- Created a real-time chat interface with SSE streaming that shows live terminal logs, tool call progress, and build summary cards with file lists, plus one-click deploy to Cloudflare Pages.
- Designed a base template + assembler system with locked files (vite.config.js, main.jsx) that prevents LLM overwrites, ensuring sandbox compatibility across all generated projects.
- Built a guardrail classifier that uses the edit model to distinguish build requests from general chat, preventing wasted sandbox creation and enabling project-aware conversational responses.
- Engineered a custom E2B sandbox template with pre-installed node_modules matching the base template, reducing sandbox startup from ~30s to ~5s while maintaining dependency sync.
- Implemented stateless sandbox management with Cloudflare R2 persistence, sandbox reconnection, and a validation loop that writes files, runs npm build, and auto-patches errors via an error-fix agent.
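A minimal sketch of how the locked-file assembler and the bounded fix loop described above could fit together. This is not the project's actual code: `apply_files`, `assemble`, `validate`, `BuildResult`, and the two callbacks are hypothetical names, and matching locked files by basename is an assumption. In the real system the build step would run npm build inside the sandbox and the fixer would call the error-fix agent; here both are plain callables so the control flow can be read on its own.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

# File names come from the description above; matching by basename is an assumption.
LOCKED_FILES = {"vite.config.js", "main.jsx"}
MAX_FIX_ATTEMPTS = 3  # "up to 3 attempts"

Files = Dict[str, str]  # path -> file content


def apply_files(project: Files, incoming: Files) -> None:
    """Write incoming files into the project, silently skipping locked ones."""
    for path, content in incoming.items():
        if path.rsplit("/", 1)[-1] in LOCKED_FILES:
            continue  # the template's version always wins
        project[path] = content


def assemble(base: Files, generated: Files) -> Files:
    """Overlay model-generated files on a copy of the base template."""
    project = dict(base)
    apply_files(project, generated)
    return project


@dataclass
class BuildResult:
    ok: bool
    log: str = ""


def validate(
    project: Files,
    run_build: Callable[[Files], BuildResult],   # hypothetical: runs the build
    fix_errors: Callable[[Files, str], Files],   # hypothetical: returns patched files
) -> Tuple[BuildResult, int]:
    """Build, and on failure ask the fixer for patches, at most MAX_FIX_ATTEMPTS times."""
    result = run_build(project)
    attempts = 0
    while not result.ok and attempts < MAX_FIX_ATTEMPTS:
        attempts += 1
        # Patches go through the same lock, so a fix cannot overwrite template files.
        apply_files(project, fix_errors(project, result.log))
        result = run_build(project)
    return result, attempts
```

Routing both generated files and fix patches through one write path keeps the lock in a single place, so the template files stay intact no matter which agent produced the change.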
A full-stack conversational AI that goes beyond chat history — every message is automatically parsed into typed beliefs (facts, experiences, habits) stored in a per-user belief graph. The AI queries this graph on every turn, making responses progressively more personalized. Supports multiple models (Claude, Grok, Mistral) via OpenRouter. Deployed to production on Vercel + DigitalOcean.
- Engineered a streaming chat pipeline (SSE) that retrieves semantically relevant beliefs from Qdrant before each response, then extracts and stores new claims post-response using the recollectx LLMExtractor and MemoryUpdater, with multi-model support via OpenRouter (Claude, Grok, Mistral, and more).
- Built a D3.js belief graph visualization showing confidence-weighted nodes, relationship edges (supports/contradicts/derives), and timestamped confidence history, plus a pattern detection system that surfaces behavioral insights (consistent habits, recurring themes, long-term goals).
- Designed an animated public landing page with D3.js canvas particle animations and a pattern detection insights panel, plus a communication style detector that injects style hints into the system prompt for personalized responses.
- Designed a per-user SQLite + shared Qdrant architecture: each user gets an isolated belief graph file and a scoped vector collection (beliefs_{user_id}) on the shared Qdrant instance, bypassing recollectx's global session singleton via a memory factory pattern.
- Built a dual-token auth system: short-lived JWTs in localStorage for API calls and 30-day opaque refresh tokens in httpOnly cookies, with a 401 interceptor that auto-retries after silent token refresh.
- Encrypted OpenRouter API keys at rest (AES via PyCryptodome) and rate-limited all endpoints with slowapi (20 chat requests/min per user, 5 auth requests/min per IP) without adding latency to the streaming path.
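The memory factory idea above can be sketched roughly as follows, using only the standard library. Everything here is illustrative: `MemoryFactory`, `UserMemory`, the `beliefs` table schema, and the user ID pattern are assumptions, and no recollectx or Qdrant calls are shown. Only the `beliefs_{user_id}` collection naming and the one-SQLite-file-per-user layout come from the description.

```python
import re
import sqlite3
from dataclasses import dataclass
from pathlib import Path
from typing import Dict

# Assumed ID format; also keeps user IDs from escaping the data directory.
_SAFE_ID = re.compile(r"^[A-Za-z0-9_-]+$")


@dataclass
class UserMemory:
    user_id: str
    db: sqlite3.Connection  # this user's isolated belief graph file
    collection: str         # this user's scoped vector collection name


class MemoryFactory:
    """Hands out one memory handle per user instead of a process-wide singleton."""

    def __init__(self, data_dir: str) -> None:
        self._data_dir = Path(data_dir)
        self._data_dir.mkdir(parents=True, exist_ok=True)
        self._cache: Dict[str, UserMemory] = {}

    def get(self, user_id: str) -> UserMemory:
        if not _SAFE_ID.match(user_id):
            raise ValueError(f"invalid user id: {user_id!r}")
        if user_id not in self._cache:
            db = sqlite3.connect(str(self._data_dir / f"{user_id}.db"))
            # Hypothetical schema, only to make the isolation visible.
            db.execute(
                "CREATE TABLE IF NOT EXISTS beliefs ("
                "id INTEGER PRIMARY KEY, kind TEXT NOT NULL, "
                "text TEXT NOT NULL, confidence REAL NOT NULL)"
            )
            self._cache[user_id] = UserMemory(user_id, db, f"beliefs_{user_id}")
        return self._cache[user_id]
```

A request handler would call `factory.get(current_user_id)` and pass the resulting handle down, so no code path can reach another user's graph by accident. A real server would also need to handle connection lifetime and thread affinity, which this sketch leaves out.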
Currently Building
in progress
Stack
Skills
GitHub Contributions
Research
ORCID
Monkeypox detection from skin lesion images using an amalgamation of CNN models aided with Beta function-based normalization scheme
Pramanik R., Banerjee B., Efimenko G., Kaplun D., Sarkar R.
PLOS ONE, 2023