Bihan Banerjee

I build agentic AI systems and ship them to prod.

Creating with code. Details matter.

Full-Stack AI Engineer
100xSchool, Greater Noida, India
+91 6296953887
banerjeebihan456@gmail.com
he/him
https://www.bihanbanerjee.com

About

Full-stack AI engineer building agentic systems end-to-end. I published re-collect — a belief-centric memory layer for AI agents — and built Holdmind on top of it. I work across Python (FastAPI, LangGraph) and TypeScript (Node.js, React, Next.js), and everything goes to prod — Docker, Terraform, CI/CD. No notebook demos.

  • Built a multi-agent app builder with a 6-node LangGraph pipeline — planning, code generation, validation, and auto-recovery.
  • Built a 7-service trading platform handling 1,000+ price ticks/sec with in-memory liquidation and event sourcing.

Projects

Buildable

agentic AI application

AI-powered prompt-to-app builder

Describe what you want in plain English and a 2-agent LangGraph system generates production-ready React apps in isolated E2B sandboxes. Sonnet 4.5 handles initial builds, o4-mini handles follow-up edits — with guardrail classification, prompt enhancement, template assembly, and a validation loop that auto-fixes build errors.

Built
  • Built a 2-agent LangGraph system: Sonnet 4.5 for high-quality initial generation (create_app + web_search tools) and o4-mini for fast follow-up edits (modify_app + chat_message + web_search), with automatic error fixing for up to 3 attempts.
  • Created a real-time chat interface with SSE streaming showing live terminal logs, tool call progress, build summary cards with file lists, and one-click deploy to Cloudflare Pages.
  • Designed a base template + assembler system with locked files (vite.config.js, main.jsx) that prevents LLM overwrites, ensuring sandbox compatibility across all generated projects.
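
The locked-file idea in the last bullet can be sketched as a small merge step. This is a minimal illustration, not Buildable's actual assembler: the function name `assemble_project`, the `LOCKED_FILES` set, and the path-to-content dict representation are assumptions made for the example. Only the two locked file names come from the description.

```python
from posixpath import basename

# File names owned by the base template (taken from the bullet above).
LOCKED_FILES = {"vite.config.js", "main.jsx"}


def assemble_project(template: dict[str, str], generated: dict[str, str]) -> dict[str, str]:
    """Overlay generated files on the base template, skipping locked files.

    Hypothetical helper: both arguments map a relative path to file content.
    """
    project = dict(template)
    for path, content in generated.items():
        if basename(path) in LOCKED_FILES:
            continue  # keep the template's version, ignore the LLM's rewrite
        project[path] = content
    return project


template = {"vite.config.js": "// base config", "src/main.jsx": "// base entry"}
generated = {"vite.config.js": "// LLM rewrite", "src/App.jsx": "// generated app"}
project = assemble_project(template, generated)
```

Matching on the file name rather than a full path keeps the sketch independent of where the template places `main.jsx`.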
Challenges
  • Built a guardrail classifier that distinguishes build requests from general chat using the edit model, preventing wasted sandbox creation and enabling project-aware conversational responses.
  • Engineered a custom E2B sandbox template with pre-installed node_modules matching the base template, reducing sandbox startup from ~30s to ~5s while maintaining dependency sync.
  • Implemented stateless sandbox management with Cloudflare R2 persistence, sandbox reconnection, and a validation loop that writes files, runs npm build, and auto-patches errors via an error-fix agent.
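
The validation loop (write files, build, patch on failure, retry up to 3 times) could look roughly like the sketch below. It is an assumption-laden outline rather than the project's code: `validate_with_fixes`, `run_build`, and `fix_errors` are hypothetical names, and the build and fix steps are passed in as plain callables so the example runs without a sandbox or an LLM.

```python
from typing import Callable, Optional

MAX_FIX_ATTEMPTS = 3  # matches the "up to 3 attempts" limit in the description

Files = dict[str, str]  # relative path -> file content


def validate_with_fixes(
    files: Files,
    run_build: Callable[[Files], Optional[str]],  # returns an error message, or None on success
    fix_errors: Callable[[Files, str], Files],    # returns a patched copy of the files
) -> tuple[Files, bool]:
    """Build the project; on failure, request a patch and rebuild, at most MAX_FIX_ATTEMPTS times."""
    error = run_build(files)
    attempts = 0
    while error is not None and attempts < MAX_FIX_ATTEMPTS:
        files = fix_errors(files, error)
        attempts += 1
        error = run_build(files)
    return files, error is None


# Stand-ins for the real build step and error-fix agent (illustrative only).
def fake_build(files: Files) -> Optional[str]:
    if "export default" in files["src/App.jsx"]:
        return None
    return "src/App.jsx: missing default export"


def fake_fix(files: Files, error: str) -> Files:
    patched = dict(files)
    patched["src/App.jsx"] += "\nexport default App;"
    return patched


fixed_files, ok = validate_with_fixes({"src/App.jsx": "function App() {}"}, fake_build, fake_fix)
```

Returning the final file set together with a success flag lets the caller decide whether to deploy or surface the last build error to the user.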
Technologies
Python, FastAPI, LangGraph, Next.js, React, TypeScript, PostgreSQL, E2B, Cloudflare R2, Docker, Terraform, Tailwind CSS

Holdmind

memory-augmented AI application

Conversational AI that builds a belief graph about you

A full-stack conversational AI that goes beyond chat history — every message is automatically parsed into typed beliefs (facts, experiences, habits) stored in a per-user belief graph. The AI queries this graph on every turn, making responses progressively more personalized. Supports multiple models (Claude, Grok, Mistral) via OpenRouter. Deployed to production on Vercel + DigitalOcean.

Built
  • Engineered a streaming chat pipeline (SSE) that retrieves semantically relevant beliefs from Qdrant before each response, then extracts and stores new claims post-response using the recollectx LLMExtractor and MemoryUpdater — with multi-model support via OpenRouter (Claude, Grok, Mistral, and more).
  • Built a D3.js belief graph visualization showing confidence-weighted nodes, relationship edges (supports/contradicts/derives), and timestamped confidence history — plus a pattern detection system that surfaces behavioral insights (consistent habits, recurring themes, long-term goals).
  • Designed an animated public landing page with D3.js canvas particle animations and a pattern detection insights panel, plus a communication style detector that injects style hints into the system prompt for personalized responses.
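
To make the belief graph concrete, here is a minimal data-structure sketch built only from what the bullets name: typed beliefs (fact, experience, habit), a timestamped confidence history, and supports/contradicts/derives edges. The class and method names are hypothetical and do not reflect recollectx's real data model.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

BELIEF_TYPES = {"fact", "experience", "habit"}
EDGE_TYPES = {"supports", "contradicts", "derives"}


@dataclass
class Belief:
    text: str
    kind: str
    confidence: float
    # (timestamp, confidence) pairs, oldest first
    history: list[tuple[datetime, float]] = field(default_factory=list)

    def __post_init__(self) -> None:
        if self.kind not in BELIEF_TYPES:
            raise ValueError(f"unknown belief type: {self.kind}")
        self.history.append((datetime.now(timezone.utc), self.confidence))

    def update_confidence(self, value: float) -> None:
        """Set a new confidence and record it in the history."""
        self.confidence = value
        self.history.append((datetime.now(timezone.utc), value))


@dataclass
class BeliefGraph:
    beliefs: dict[str, Belief] = field(default_factory=dict)
    edges: list[tuple[str, str, str]] = field(default_factory=list)  # (source, relation, target)

    def add(self, key: str, belief: Belief) -> None:
        self.beliefs[key] = belief

    def link(self, source: str, relation: str, target: str) -> None:
        if relation not in EDGE_TYPES:
            raise ValueError(f"unknown relation: {relation}")
        if source not in self.beliefs or target not in self.beliefs:
            raise KeyError("both endpoints must exist before linking")
        self.edges.append((source, relation, target))


graph = BeliefGraph()
graph.add("runs", Belief("Runs every morning", "habit", 0.6))
graph.add("marathon", Belief("Is training for a marathon", "fact", 0.8))
graph.link("runs", "supports", "marathon")
graph.beliefs["runs"].update_confidence(0.9)
```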
Challenges
  • Designed a per-user SQLite + shared Qdrant architecture: each user gets an isolated belief graph file and a user-scoped vector collection (beliefs_{user_id}) on the shared Qdrant instance, bypassing recollectx's global session singleton via a memory factory pattern.
  • Built a dual-token auth system: short-lived JWTs in localStorage for API calls and 30-day opaque refresh tokens in httpOnly cookies, with a 401 interceptor that auto-retries after silent token refresh.
  • Encrypted OpenRouter API keys at rest (AES via PyCryptodome) and rate-limited all endpoints (slowapi) — 20 chat requests/min per user, 5 auth requests/min per IP — without adding latency to the streaming path.
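
The memory factory pattern from the first bullet can be sketched as a per-user cache that replaces a single module-level session. This is a hedged illustration using only the standard library: `MemoryFactory`, `UserMemory`, the `.sqlite3` file naming, and the alphanumeric `user_id` check are assumptions. Only the `beliefs_{user_id}` collection name comes from the description, and no recollectx or Qdrant calls are shown.

```python
import sqlite3
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass
class UserMemory:
    user_id: str
    db: sqlite3.Connection  # isolated belief graph file for this user
    collection: str         # name of this user's vector collection


class MemoryFactory:
    """Hands out one memory handle per user instead of a global singleton."""

    def __init__(self, data_dir: Path) -> None:
        self._data_dir = data_dir
        self._cache: dict[str, UserMemory] = {}

    def for_user(self, user_id: str) -> UserMemory:
        # Assumed constraint: keeps user IDs safe to embed in file and collection names.
        if not user_id.isalnum():
            raise ValueError("user_id must be alphanumeric")
        if user_id not in self._cache:
            self._data_dir.mkdir(parents=True, exist_ok=True)
            db = sqlite3.connect(self._data_dir / f"{user_id}.sqlite3")
            self._cache[user_id] = UserMemory(user_id, db, f"beliefs_{user_id}")
        return self._cache[user_id]


factory = MemoryFactory(Path(tempfile.mkdtemp()))
alice = factory.for_user("alice")
bob = factory.for_user("bob")
```

Each request would resolve its handle through `for_user`, so no two users ever share a connection or collection name.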
Technologies
Python, FastAPI, Next.js 15, React 19, TypeScript, PostgreSQL, Qdrant, SQLite, SQLAlchemy, Alembic, D3.js, TanStack Query, recollectx, OpenRouter, shadcn/ui, Tailwind CSS

Currently Building

in progress

Stack

TypeScript, Python, React, Next.js, Node.js, FastAPI, LangGraph, LangChain, LangSmith, Langfuse, TensorFlow, PostgreSQL, Redis, Docker, Git, Tailwind CSS, Prisma, AWS, Linux

Skills

Agentic AI, Multi-agent Orchestration, RAG, Memory Systems, Tool Use, Prompt Engineering, LangChain, LangGraph, LangSmith, Langfuse, Pydantic, FastAPI, SQLAlchemy, Microservices, WebSockets, Redis Streams, Node.js, Express, React, Prisma, Zod, JWT, bcryptjs, CI/CD, Machine Learning, Deep Learning, PyTorch, TensorFlow

Research

ORCID

Monkeypox detection from skin lesion images using an amalgamation of CNN models aided with Beta function-based normalization scheme

Pramanik R., Banerjee B., Efimenko G., Kaplun D., Sarkar R.

PLOS ONE, 2023

MSENet: Mean and standard deviation based ensemble network for cervical cancer detection

Pramanik R., Banerjee B., Sarkar R.

Engineering Applications of Artificial Intelligence, Elsevier, 2023