TL;DR: The Decision Split
BUY IF: your support needs are FAQ-shaped, your volume is under 5K tickets/month, and speed to deploy matters more than customization.
BUILD IF: you need real multi-step workflows, real integration with your backend systems, or real accuracy on proprietary data.
The AI customer support space is flooded with SaaS options right now. Intercom Fin, Zendesk AI, Freshdesk Freddy, Tidio, Drift: every support tool has bolted on an "AI agent" in the last 18 months. Most of them are knowledge base retrieval with a chat interface layered on top. That is fine for a lot of use cases. It is not fine for all of them.
At Wizz Air, I built and deployed an AI-powered support system across a 65-person customer operations team. It handled 700K+ monthly interactions (flight changes, cancellations, compensation claims, multi-language queries). The off-the-shelf tools were evaluated and rejected. Here is the detailed breakdown of why, and how to make the right call for your situation.
The Buy Side: What SaaS Tools Actually Do Well
The major SaaS options (Intercom Fin, Zendesk AI, Freshdesk Freddy) are genuinely good at a specific thing: FAQ deflection over a knowledge base. If your support volume is primarily customers asking questions that are answered somewhere in your documentation, these tools can deflect 40–60% of tickets without custom code.
Where off-the-shelf wins
- Speed to deploy. Most tools are live in days, not weeks. You connect your knowledge base, configure escalation rules, and you are running.
- No engineering required. Your support team configures it. No ML background needed.
- Reasonable cost at low volume. Intercom Fin starts around $0.99/resolution. If you have 2K tickets/month and it deflects 40%, that is $800/month, significantly cheaper than custom.
Where they consistently fail
- Complex multi-step workflows. A flight cancellation is not one question. It is a conditional tree: was it weather? Was it the airline? Did you have insurance? Each answer triggers a different resolution path. Off-the-shelf tools cannot handle this without extensive manual configuration that breaks constantly.
- Custom sentiment escalation. If a frustrated customer is approaching a regulatory complaint threshold, you need domain-specific escalation logic, not a generic "negative sentiment detected" flag.
- Real integration with your systems. Zendesk AI does not know your booking system. It cannot pull flight status, check refund eligibility, or trigger a compensation workflow. It can only answer what your knowledge base says to answer.
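To make the "conditional tree" point concrete, here is a minimal sketch of what a flight-cancellation resolution tree looks like as explicit logic. Every name and path here is illustrative, not Wizz Air's actual policy; the point is that this branching is trivial in code and painful in a SaaS configuration UI.

```typescript
// Illustrative only: a cancellation is a conditional tree, not one FAQ answer.
type CancellationFacts = {
  cause: 'WEATHER' | 'CARRIER' | 'CUSTOMER';
  hasInsurance: boolean;
};

type Resolution =
  | 'REBOOK_NO_FEE'
  | 'COMPENSATION_CLAIM'
  | 'INSURANCE_CLAIM'
  | 'STANDARD_FEE_APPLIES';

function resolveCancellation(facts: CancellationFacts): Resolution {
  if (facts.cause === 'WEATHER') return 'REBOOK_NO_FEE'; // extraordinary circumstance
  if (facts.cause === 'CARRIER') return 'COMPENSATION_CLAIM'; // airline at fault
  // Customer-initiated: outcome depends on whether they bought insurance
  return facts.hasInsurance ? 'INSURANCE_CLAIM' : 'STANDARD_FEE_APPLIES';
}
```

Each answer the customer gives narrows the tree; each leaf triggers a different backend workflow. That is the shape of work off-the-shelf tools struggle with.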
Typical cost range for off-the-shelf: $200–$2,000/month depending on volume and feature tier.
The Build Side: What a Custom System Actually Gives You
Building a custom AI support system is genuinely harder, takes longer, and costs more upfront. It also gives you things that no SaaS tool can.
- Integration with your exact data sources. Your booking system, CRM, order database, product catalog: all queryable in real time during the conversation.
- Multi-step reasoning. The agent can ask clarifying questions, update state, trigger backend actions, and produce a resolution, not just retrieve a document.
- Custom routing logic. Route by language, customer tier, issue type, sentiment, ticket history. Build routing that matches your actual operations.
- No vendor lock-in. Your data stays yours. Your prompts stay yours. When GPT-4o gets replaced by something better, you swap the model, not the entire platform.
- Training on your domain. Fine-tune on your historical tickets. The system learns your specific resolution patterns, not generic internet knowledge.
What it costs to build properly: 4–8 weeks of engineering time for the initial system, then 10–20% of that ongoing for maintenance and improvements.
The Architecture I Would Build Today
This is the stack I would use for a production AI customer support system in 2026, based on what I deployed at Wizz Air and what I have learned since.
- LLM layer: GPT-4o or Claude 3.5 Sonnet. Both are strong at instruction-following and structured output. Pick based on cost profile and API stability.
- RAG over your knowledge base: Chunk your documentation, embed with text-embedding-3-small, store in pgvector or Pinecone. Retrieve top-k chunks per query before sending to the LLM. For a detailed look at production RAG architecture, see RAG Pipeline Architecture for Production.
- Intent classifier first: Before hitting the LLM, run a fast intent classification step. Is this a billing question? A technical issue? A cancellation? Route to the right prompt template and context window based on intent.
- Escalation logic: Explicit rules for when the AI should stop and hand off to a human. Never leave this to the LLM's judgment alone.
- Human handoff protocol: Full conversation context passed to the agent. No "the AI couldn't help you, start over" moments.
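"Never leave escalation to the LLM's judgment alone" means the rules live in code, not in a prompt. A minimal sketch of what that looks like; every threshold here is a placeholder, not a production value:

```typescript
// Illustrative escalation rules; thresholds are placeholders, not tuned values.
type ConversationState = {
  turnCount: number;
  sentimentScore: number; // -1 (angry) … 1 (happy), from a sentiment model
  mentionsRegulator: boolean; // customer referenced a regulatory complaint
  aiConfidence: number; // 0 … 1, self-reported by the main LLM step
};

const escalationRules: Array<(s: ConversationState) => boolean> = [
  (s) => s.mentionsRegulator, // regulatory threshold: always a human
  (s) => s.sentimentScore < -0.6, // strongly negative sentiment
  (s) => s.turnCount > 6, // conversation is going in circles
  (s) => s.aiConfidence < 0.5, // the model itself is unsure
];

function shouldEscalate(s: ConversationState): boolean {
  return escalationRules.some((rule) => rule(s));
}
```

Because the rules are plain predicates, they are testable, auditable, and adjustable without touching a prompt.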
Here is a simplified version of the intent classifier I would use as the first step:
```typescript
import OpenAI from 'openai';

const openai = new OpenAI();

type IntentType = 'BILLING' | 'CANCELLATION' | 'TECHNICAL' | 'COMPLAINT' | 'OTHER';

// Classify intent before hitting the main LLM
async function classifyIntent(message: string): Promise<IntentType> {
  const response = await openai.chat.completions.create({
    model: 'gpt-4o-mini', // Fast + cheap for classification
    messages: [
      {
        role: 'system',
        content:
          'Classify the customer message into one of: BILLING, CANCELLATION, TECHNICAL, COMPLAINT, OTHER. Respond with only the category.',
      },
      { role: 'user', content: message },
    ],
    temperature: 0,
    max_tokens: 10,
  });
  // Fall back to OTHER if the model returns nothing
  return (response.choices[0].message.content?.trim() ?? 'OTHER') as IntentType;
}
```

For production prompt patterns, I cover confidence scoring and chain-of-thought approaches in detail in Prompt Engineering for Production.
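The retrieval half of the RAG layer is less magical than it sounds. Once chunks are embedded, pgvector or Pinecone is doing top-k nearest-neighbor search server-side; stripped of the vector-store specifics, the operation is just cosine similarity. A self-contained sketch:

```typescript
// Top-k retrieval over pre-computed embeddings. In production the vector
// store does this for you; this shows the underlying operation.
type Chunk = { text: string; embedding: number[] };

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

function topK(queryEmbedding: number[], chunks: Chunk[], k: number): Chunk[] {
  return [...chunks]
    .sort(
      (x, y) =>
        cosineSimilarity(queryEmbedding, y.embedding) -
        cosineSimilarity(queryEmbedding, x.embedding),
    )
    .slice(0, k);
}
```

The retrieved chunks get concatenated into the prompt for the intent-specific template chosen by the classifier above.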
The Real Trade-off Matrix
| Dimension | Buy (SaaS) | Build (Custom) |
|---|---|---|
| Time to deploy | Days | 4–8 weeks |
| Cost, Year 1 | $2K – $24K | $30K – $80K |
| Cost, Year 3 | $6K – $72K+ | $35K – $90K (flat) |
| Customization | Low, within vendor limits | Complete: you own everything |
| Data privacy | Vendor processes your data | Full control, on your infra |
| Vendor risk | High: pricing, API changes | None on the platform layer |
| Complex workflows | Poor: FAQ-level only | Full support |
| Accuracy (domain-specific) | Generic baseline | Training on your data |
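The year-one numbers in the table reduce to simple arithmetic. A sketch with illustrative inputs; none of these figures are vendor quotes, and the custom-side costs are rough assumptions:

```typescript
// Back-of-envelope year-one comparison: per-resolution SaaS vs custom build.
// All inputs are illustrative assumptions, not vendor pricing.
function yearOneCost(opts: {
  monthlyTickets: number;
  deflectionRate: number; // fraction of tickets the AI resolves
  perResolutionPrice: number; // e.g. ~$0.99/resolution for Intercom Fin
  buildCost: number; // one-off engineering cost for a custom system
  monthlyRunCost: number; // hosting + LLM API spend for custom
}): { saas: number; custom: number } {
  const resolvedPerMonth = opts.monthlyTickets * opts.deflectionRate;
  return {
    saas: resolvedPerMonth * opts.perResolutionPrice * 12,
    custom: opts.buildCost + opts.monthlyRunCost * 12,
  };
}
```

At 10K tickets/month with 50% deflection and $0.99/resolution, the SaaS side lands around $59K for year one, which is already in the same range as a mid-estimate custom build, and the SaaS number recurs every year while the build cost does not.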
The Hybrid Approach (What Most Mid-Size Companies Actually Do)
The cleanest architecture for most companies at 5K–50K monthly support tickets is not pure buy or pure build. It is a deliberate hybrid.
Use Intercom or Freshdesk for simple FAQ deflection (the 40% of tickets that are "where is my order?" and "how do I reset my password?"). Build a custom system for the complex 20% (the multi-step workflows, the high-value customers, the edge cases that require real integration with your backend). Route between them based on intent classification.
This gives you fast deployment on the simple stuff, while maintaining complete control over the cases that actually matter for retention and revenue.
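The routing layer between the two halves of the hybrid can be very small. A sketch, using the intent labels from the classifier earlier; the specific intent-to-destination split and the VIP rule are illustrative:

```typescript
// Illustrative hybrid router: FAQ-shaped intents go to the SaaS layer,
// workflow-shaped intents go to the custom system. The split is an example.
type IntentType = 'BILLING' | 'CANCELLATION' | 'TECHNICAL' | 'COMPLAINT' | 'OTHER';
type Destination = 'saas-faq' | 'custom-agent' | 'human';

function routeTicket(intent: IntentType, customerTier: 'standard' | 'vip'): Destination {
  if (customerTier === 'vip') return 'custom-agent'; // high-value: full workflow support
  switch (intent) {
    case 'BILLING':
    case 'OTHER':
      return 'saas-faq'; // FAQ-shaped: let the vendor tool deflect it
    case 'CANCELLATION':
    case 'TECHNICAL':
      return 'custom-agent'; // multi-step: needs backend integration
    case 'COMPLAINT':
      return 'human'; // sentiment-sensitive: skip automation entirely
  }
}
```

The router is the one piece you always own in a hybrid: it encodes which tickets you trust the vendor with and which ones you do not.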
When to Call Me
I work on customer support AI systems when the off-the-shelf tools have hit their ceiling. Specifically:
- Complex multi-step workflows that require real integration with your backend systems
- HIPAA, GDPR, or data residency requirements that preclude sending customer data to a third-party SaaS
- Domain-specific accuracy requirements above 90%, the kind that require training on your historical data
- High volume (10K+ monthly interactions) where per-resolution SaaS pricing becomes expensive relative to a custom system
You can see more on how I approach AI integration engagements at /services/ai-integration, or book a call to walk through your specific situation.
The Decision in One Sentence
Buy if your support needs are FAQ-shaped and your volume is under 5K tickets/month. Build if you need real workflows, real integrations, or real accuracy on proprietary data.
Everything in between is a hybrid, and designing that hybrid well is where the actual engineering work lives.