The Real Cost of Running an AI Product in 2025

API pricing is one component of total cost. Infrastructure, monitoring, and overhead make up the other 60-70%. Here is the complete breakdown with real numbers.

Bear Lumen Team

AI Infrastructure Experts

#infrastructure-costs #ai-economics #tco-analysis #cost-optimization

You see the API pricing: GPT-4o at $2.50 input / $10 output per million tokens (a token is roughly ¾ of a word). You calculate your monthly OpenAI bill at $10,000. You think that's your AI cost.

Your actual spend includes roughly $16,000-22,000 more.


The AI Cost Iceberg

| Cost Category | % of Total | Monthly $ (if API = $10K) |
|---|---|---|
| API Costs (OpenAI/Anthropic) | 30-40% | $10,000 (visible) |
| Infrastructure (AWS/GCP/Azure) | 40-50% | $12,000-15,000 |
| Monitoring & Observability | 5-10% | $1,500-3,000 |
| Caching & Storage | 3-5% | $900-1,500 |
| Failed Requests & Retries | 2-3% | $600-900 |
| Development & Testing | 3-5% | $900-1,500 |
| TOTAL | 100% | $25,900-31,900 |

A $10K/month API bill becomes roughly $26K-32K in total cost. This pattern is consistent with Cursor's AWS bill doubling from $6.2M to $12.6M/month even as they optimized API usage.
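The table's arithmetic can be sketched in a few lines. This is a rough estimator, not a pricing model: it copies the dollar ranges from the table and assumes (an assumption of this sketch, not a claim from the data) that every overhead category scales linearly with the API bill. The names `OVERHEAD_PER_10K` and `estimate_total_cost` are illustrative.

```python
# Monthly overhead per $10K of API spend as (low, high), copied from the table above.
OVERHEAD_PER_10K = {
    "infrastructure": (12_000, 15_000),
    "monitoring": (1_500, 3_000),
    "caching_storage": (900, 1_500),
    "failed_requests": (600, 900),
    "development": (900, 1_500),
}


def estimate_total_cost(api_bill: float) -> tuple[float, float]:
    """Return (low, high) total monthly cost for a given API bill.

    Assumes overhead scales linearly with API spend, which is a simplification.
    """
    scale = api_bill / 10_000
    low = api_bill + scale * sum(lo for lo, _ in OVERHEAD_PER_10K.values())
    high = api_bill + scale * sum(hi for _, hi in OVERHEAD_PER_10K.values())
    return low, high


if __name__ == "__main__":
    low, high = estimate_total_cost(10_000)
    print(f"Estimated total: ${low:,.0f} - ${high:,.0f} per month")
```

Linear scaling will overstate fixed costs such as monitoring subscriptions at higher volumes, so treat the result as a starting range rather than a forecast.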


What's in the Hidden 70%

Infrastructure (40-50%): Compute for application servers and background workers. Databases for usage tracking and vector search. Storage for conversation history. Networking, load balancers, and container orchestration.

Monitoring (5-10%): LLM observability tools (LangSmith, Helicone), application performance (Datadog, Sentry), and product analytics. At 10,000 users, monitoring typically runs $2,000-3,500/month.

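Caching & Storage (3-5%): Prompt caching has write costs: Anthropic charges 1.25x the base input price to write to its default 5-minute cache and 0.1x to read from it. A cached prefix pays for itself on the first hit, so the write premium is lost only when the cache expires before it is read. Vector databases for semantic search add $200-1,000/month at scale.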

Failed Requests (2-3%): An 8% error rate where every failed request consumes 3 billed retries wastes up to 24% of API spend (0.08 × 3). On a $10K bill, that is as much as $2,400/month, and most companies don't track it.
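The 24% figure corresponds to every failed request using all three retries. A minimal sketch comparing that bound with the expected overhead when each attempt fails independently at the same rate is shown here. Both functions assume every attempt is billed in full, which depends on the provider and the failure mode; the function names are illustrative.

```python
def worst_case_overhead(error_rate: float, max_retries: int) -> float:
    """Extra spend fraction if every failed request burns all of its retries."""
    return error_rate * max_retries


def expected_overhead(error_rate: float, max_retries: int) -> float:
    """Extra spend fraction if each attempt fails independently at error_rate.

    The k-th retry only happens when the previous k attempts all failed,
    which has probability error_rate ** k.
    """
    return sum(error_rate ** k for k in range(1, max_retries + 1))


if __name__ == "__main__":
    api_bill = 10_000
    for label, fn in (("worst case", worst_case_overhead), ("expected", expected_overhead)):
        fraction = fn(0.08, 3)
        print(f"{label}: {fraction:.1%} of spend, ${api_bill * fraction:,.0f}/month")
```

Under the independence assumption the overhead is about 8.7% rather than 24%, so measuring actual retry counts matters more than either estimate.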

Development (3-5%): Local testing, CI/CD integration tests, staging environments, and prompt iteration. Typically $1,400-5,300/month depending on team size.


Cursor's Cost Discovery

Cursor reached $500M ARR and discovered their AWS costs were 79% of their Anthropic costs.

| Month | AWS Bill | Anthropic (est.) | AWS as % of API |
|---|---|---|---|
| May 2025 | $6.2M | ~$8M | 77% |
| June 2025 | $12.6M | ~$16M | 79% |

Why so high? Massive conversation history storage (200K token context windows), real-time collaboration infrastructure, code indexing, and distributed caching.

Their assumed economics: 64% gross margin based on API costs alone.

Their actual economics: 36% gross margin including infrastructure.

The 28-point margin difference led to four repricing cycles in 12 months, usage limits, a $200/month Ultra tier, and June 2025 pricing adjustments.
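The two margin figures line up with the 79% AWS ratio. The sketch below reconstructs the second from the first, assuming AWS is the only cost added on top of API spend (a simplification made for this example; the function name is illustrative).

```python
def margin_with_infrastructure(api_only_margin: float, infra_to_api_ratio: float) -> float:
    """Gross margin after adding infrastructure cost expressed as a ratio of API cost."""
    api_cost_share = 1.0 - api_only_margin               # API cost as a fraction of revenue
    infra_cost_share = api_cost_share * infra_to_api_ratio
    return 1.0 - api_cost_share - infra_cost_share


if __name__ == "__main__":
    actual = margin_with_infrastructure(api_only_margin=0.64, infra_to_api_ratio=0.79)
    print(f"Margin including infrastructure: {actual:.1%}")
```

With a 64% API-only margin, API costs are 36% of revenue; infrastructure at 79% of that adds roughly another 28 points of cost.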


Real-World TCO: AI Chatbot Example

Product: Customer support chatbot with 10,000 users and 500K conversations/month on GPT-4o.

| Category | Cost | % of Total |
|---|---|---|
| API Costs | $5,000 | 54% |
| Infrastructure | $1,850 | 20% |
| Monitoring | $1,100 | 12% |
| Storage | $600 | 6% |
| Failed Requests | $700 | 8% |
| TOTAL | $9,250 | 100% |

True cost per customer: $0.93/month

At a $15/month price point, actual gross margin is 93.8%. If you only tracked API costs ($0.50/user), you'd calculate 96.7% margin, a 2.8-point error.

Why 2.8 points matters at scale:

  • At $1M ARR: $28K/year difference
  • At $10M ARR: $280K/year difference
  • At $100M ARR: $2.8M/year difference
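The same margin check can be run against your own numbers. The sketch below uses the $15 price and $0.50 API-only cost from the example; the fully loaded costs in the loop are hypothetical values chosen for illustration, and the function names are not from any library.

```python
def gross_margin(price: float, cost_per_user: float) -> float:
    """Gross margin as a fraction of price."""
    return 1.0 - cost_per_user / price


def annual_margin_error(arr: float, margin_gap: float) -> float:
    """Dollars per year misstated when margin is off by margin_gap (0.02 = 2 points)."""
    return arr * margin_gap


if __name__ == "__main__":
    price = 15.00      # monthly price from the example
    api_cost = 0.50    # API-only cost per user from the example
    api_only = gross_margin(price, api_cost)

    for full_cost in (0.75, 1.00, 1.50):  # hypothetical fully loaded costs per user
        gap = api_only - gross_margin(price, full_cost)
        impact = annual_margin_error(10_000_000, gap)
        print(f"full cost ${full_cost:.2f}: gap {gap:.1%}, ${impact:,.0f}/year at $10M ARR")
```

Each point of margin gap is worth 1% of ARR, so the dollar impact grows in direct proportion to revenue.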

Key Takeaways

  1. API costs are 30-40% of total spend: infrastructure, monitoring, and overhead make up the other 60-70%
  2. Cursor's AWS bill was 79% of its estimated Anthropic costs: infrastructure can rival API spend at scale
  3. Failed requests waste 2-3% of budget, and most companies don't track this
  4. Margin errors compound: every point of margin error costs $1M/year at $100M ARR

Complete cost visibility enables accurate margin calculations.

Bear Lumen tracks API costs, infrastructure allocation, per-customer margins, and failed request overhead automatically.

Join our waitlist for early access.

