Every week, a company reaches out asking us to "build a chatbot like ChatGPT, but for our product."
The conversation goes like this:
Client: "We want users to ask questions and get answers from our documentation."
Us: "What happens when the answer isn't in your docs?"
Client: "The bot should... figure it out?"
This is where most AI chatbot projects fail.
The Problem: Chatbots Are Dumb Retrieval Systems
Most "AI chatbots" are just:
- Embedding your docs into a vector database
- Retrieving relevant chunks based on user query
- Passing those chunks to GPT with a prompt like "Answer the question using this context"
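The retrieve-then-generate loop above fits in a few lines. In this sketch, `embed()`, `vector_search()`, and `llm()` are hypothetical stand-ins for an embedding model, a vector database, and a chat-completion API, not real SDK calls:

```python
# Minimal retrieve-then-generate loop. embed(), vector_search(), and llm()
# are hypothetical stand-ins, not real SDK calls.

def embed(text: str) -> list[float]:
    # Stand-in: a real system would call an embedding model here.
    return [float(len(text))]

def vector_search(query_vec: list[float], top_k: int = 3) -> list[str]:
    # Stand-in: a real system would query a vector database here.
    docs = [
        "Refunds are issued within 5 business days.",
        "API keys are managed under Settings > Developer.",
    ]
    return docs[:top_k]

def llm(prompt: str) -> str:
    # Stand-in: a real system would call an LLM API here.
    return "Answer based on: " + prompt[:60]

def answer(query: str) -> str:
    chunks = vector_search(embed(query))
    context = "\n".join(chunks)
    prompt = f"Answer the question using this context:\n{context}\n\nQ: {query}"
    return llm(prompt)

print(answer("How do refunds work?"))
```

That's the whole architecture. Everything the bot can do is bounded by what lands in `context`, which is exactly why the approach breaks down in the cases below.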
This works for simple Q&A. It breaks down for:
- Multi-step workflows (e.g., "Help me troubleshoot why my API key isn't working")
- Actions (e.g., "Cancel my subscription")
- Ambiguous requests (e.g., "My dashboard is broken")
- Questions requiring external data (e.g., "What's my current usage?")
Why Chatbots Fail: The Math Doesn't Lie
Let's break down typical costs for a "production-ready" chatbot:
Development: $15K-30K
- Vector database setup and optimization
- Retrieval logic and chunking strategy
- Prompt engineering and testing
- UI integration
Monthly Operating Costs: $500-2K
- LLM API calls (10K-50K messages per month)
- Vector database hosting
- Monitoring and error tracking
Hidden Costs: $10K-40K per year
- Maintaining documentation quality (the bot is only as good as your docs)
- Handling edge cases and user complaints
- Retraining when products change
- Customer support for when the bot fails
Total Year One: $25K-70K
The killer stat: 60-70% of chatbot projects are abandoned within 12 months because they don't deliver enough value to justify ongoing costs.
What Actually Works: Intelligent Agents (Not Chatbots)
The difference between a chatbot and an intelligent agent:
Chatbot:
- Retrieves documents
- Generates text response
- Cannot take actions
- No memory of past interactions
Intelligent Agent:
- Retrieves relevant context
- Reasons about what action to take
- Can execute workflows (API calls, database queries, multi-step processes)
- Maintains conversation context and user history
- Escalates to humans when uncertain
Real Example: Support Automation
Chatbot approach:
User: "Why was I charged twice?"
Bot: Searches docs, finds billing FAQ.
Bot: "Here's our refund policy..."
Result: User still frustrated, contacts human support anyway.
Agent approach:
User: "Why was I charged twice?"
Agent:
- Pulls user's transaction history via API
- Detects duplicate charge
- Identifies it's within auto-refund policy window
- Issues refund automatically
- Sends confirmation
Result: Issue resolved in 30 seconds, no human needed.
Cost comparison:
- Chatbot: $1.50 per resolution (LLM + human escalation)
- Agent: $0.15 per resolution (automated end-to-end)
At 1,000 support tickets per month, that's $1,350 per month in savings.
The Architecture Difference
Basic Chatbot Stack
- User query → Embedding → Vector search → Context retrieval → LLM generation → Response
Intelligent Agent Stack
- User query → Intent classification → Context retrieval (docs + user data + system state)
- Reasoning layer (determine if answer is actionable)
- Action executor (API calls, workflows)
- Verification (did the action succeed?)
- Response generation with citations
The agent architecture costs 3-5× more upfront but delivers 10× more value in high-volume scenarios.
Cost Optimization Strategies
If you're building an agent system, here's how to keep costs under control:
Use hybrid search:
- Semantic search for understanding intent
- Keyword search for exact matches
- Costs 40% less than pure vector search at scale
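A minimal version of hybrid search just blends the two scores. Here `semantic_score()` is a stand-in for cosine similarity over real embeddings, and the weighting `alpha` is an assumed tuning parameter:

```python
# Hybrid search sketch: blend keyword overlap with a semantic score.
# semantic_score() stands in for embedding cosine similarity.

DOCS = [
    "reset your API key",
    "billing and refunds",
    "fix API key broken errors",
]

def keyword_score(query: str, doc: str) -> float:
    # Fraction of query terms that appear verbatim in the document.
    terms = query.lower().split()
    return sum(1.0 for t in terms if t in doc.lower()) / max(len(terms), 1)

def semantic_score(query: str, doc: str) -> float:
    # Stand-in: a real system compares embedding vectors here.
    return 0.8 if "key" in query.lower() and "key" in doc.lower() else 0.1

def hybrid_search(query: str, alpha: float = 0.5) -> list[str]:
    scored = [
        (alpha * keyword_score(query, d) + (1 - alpha) * semantic_score(query, d), d)
        for d in DOCS
    ]
    return [d for _, d in sorted(scored, reverse=True)]

print(hybrid_search("API key broken")[0])  # → fix API key broken errors
```

The keyword leg catches exact matches (error codes, product names) that embeddings blur together, while the semantic leg catches paraphrases the keyword leg misses.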
Cache common queries:
- 30-40% of support queries are variations of the same 10 questions
- Cache LLM responses for frequent patterns
- Reduces API costs by 25-30%
Smart routing:
- Use a cheap model (GPT-3.5 or Claude Haiku) to classify intent
- Only use expensive models (GPT-4) when complexity is high
- Cuts costs in half for most use cases
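The routing itself is a small gate in front of the model call. In this sketch, `complexity_score()` stands in for a classification call to a small, inexpensive model, and the tier names are illustrative rather than real model identifiers:

```python
# Routing sketch: a cheap classification step picks the model tier.
# complexity_score() stands in for a call to a small, cheap model.

def complexity_score(query: str) -> float:
    # Stand-in heuristic: long queries are treated as complex.
    return 0.9 if len(query.split()) > 12 else 0.2

def route(query: str) -> str:
    return "expensive-model" if complexity_score(query) > 0.5 else "cheap-model"

print(route("What are your hours?"))  # → cheap-model
```

The savings come from the skew in real traffic: most queries are simple, so most calls never reach the expensive tier.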
Context summarization:
- Don't re-send full conversation history on every turn
- Summarize past context, keep only recent messages
- Reduces token usage by 60% in long conversations
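In practice this means keeping the last few turns verbatim and collapsing everything older into one summary message. Here `summarize()` is a stand-in for a cheap-model summarization call, and the window size is an assumed setting:

```python
# Summarization sketch: keep recent turns verbatim, compress the rest.
# summarize() stands in for a cheap-model summarization call.

KEEP_RECENT = 4  # number of recent messages sent verbatim (assumed)

def summarize(messages: list[str]) -> str:
    # Stand-in: a real system would ask a cheap model for a summary.
    return f"[summary of {len(messages)} earlier messages]"

def build_context(history: list[str]) -> list[str]:
    if len(history) <= KEEP_RECENT:
        return history
    older, recent = history[:-KEEP_RECENT], history[-KEEP_RECENT:]
    return [summarize(older)] + recent

ctx = build_context([f"msg {i}" for i in range(10)])
print(ctx[0])  # → [summary of 6 earlier messages]
```

Because the summary is a fixed-size stand-in for an ever-growing prefix, token usage per turn stays roughly flat instead of growing with conversation length.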
When a Chatbot Actually Makes Sense
Not all use cases need intelligent architecture. Basic chatbots work for:
Simple FAQ Bots
- Use case: Answer 5-10 common questions
- Why it works: Small knowledge base, no actions needed
- Example: "What are your hours?" "Where do I find my invoice?"
Lead Qualification Forms
- Use case: Collect user info before sales call
- Why it works: Structured data collection, no complex reasoning
- Example: Chatbot asks name, company, budget, schedules demo
Internal Wikis (Low Volume)
- Use case: Help 20-person team find docs
- Why it works: Small user base, higher tolerance for hallucinations
- Example: "Where's the design system?" → Links to Figma
What these have in common:
- Low message volume (less than 1K per month)
- Narrow scope (limited questions)
- No actions required
- Tolerance for imperfect answers
The Real Question: What Problem Are You Solving?
Before building any AI system, ask:
What's the ROI threshold?
- If you handle 500 support tickets per month at $10 each, automating 50% saves $2,500 per month
- Budget accordingly: spend less than 10-12 months of savings on development
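Made explicit, the budgeting rule is a two-line calculation. These are the article's example figures, not benchmarks:

```python
# The ROI arithmetic above, using the article's example numbers.
tickets_per_month = 500
cost_per_ticket = 10      # dollars, human handling
automation_rate = 0.5     # fraction of tickets the system resolves

monthly_savings = tickets_per_month * cost_per_ticket * automation_rate
max_dev_budget = monthly_savings * 12  # cap build cost at ~12 months of savings

print(monthly_savings)  # → 2500.0
print(max_dev_budget)   # → 30000.0
```

Run the same arithmetic with your own numbers before taking a single sales call.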
What's the fallback?
- When the AI fails, what happens?
- Do you have human escalation workflows?
- Can you measure when the AI is uncertain?
What's the data quality?
- Is your documentation accurate and up to date?
- Do you have structured data (APIs, databases) the AI can query?
- Can you version control your knowledge base?
What's the success metric?
- Resolution rate?
- User satisfaction?
- Cost per interaction?
- Time to resolution?
If you can't answer these, you're not ready to build an AI chatbot or agent.
What We'd Build Today
If we were starting from scratch in 2025:
For simple use cases (less than 1K messages per month):
- Use an off-the-shelf tool (Intercom, Zendesk AI, HubSpot chatbot)
- Embed your docs, let them handle infrastructure
- Cost: $50-200 per month
For medium complexity (1K-10K messages per month):
- Build a lightweight RAG system with action capabilities
- Use LangChain or LlamaIndex for orchestration
- Host on Vercel or Railway for scalability
- Cost: $500-1,500 per month (mostly LLM API calls)
For high complexity (more than 10K messages per month or mission-critical workflows):
- Build a custom agent system with observability
- Use frameworks like LangGraph or Semantic Kernel
- Implement caching, smart routing, human-in-the-loop
- Cost: $2K-5K per month, but ROI justifies it
The Bottom Line
Don't build a chatbot. Build an intelligent system that solves a real problem.
If all you need is FAQ answering, use an off-the-shelf solution.
If you need workflow automation, build an agent system—but only if the ROI is clear.
And if you're not sure? Start with a Notion doc and a link to your support email. Sometimes the best AI is no AI at all.
Need help deciding if AI makes sense for your use case? We've built agent systems for companies processing 100K+ support tickets per month. Let's talk.