How to Price AI Products: A Data-Driven Framework

title: 'How to Price AI Products: A Data-Driven Framework'
tagline: 'Most AI pricing decisions are made without the one thing that matters most: actual cost data.'
description: 'A practical framework for pricing AI products using real cost-to-serve data. Covers unit economics, pricing models, margin analysis, and iteration strategies for AI startups.'
publishedAt: '2026-03-16'
updatedAt: '2026-04-12'
author: 'Bear Lumen Team'
authorRole: 'Research'
category: 'guides'
tags: ['ai-pricing', 'pricing-strategy', 'unit-economics', 'cost-to-serve', 'margins']
featured: true
reviewed: true
voice: observer

Most pricing frameworks start with the wrong question.

They ask: what will customers pay? Then they survey competitors, model willingness-to-pay, and anchor to whatever the market seems to tolerate. The price ships. Months later, the margin report arrives. Some customers are profitable. Some cost more to serve than they pay. Nobody knows which are which until the aggregate numbers turn red.

The correct starting question is simpler: what does it cost to serve each customer?

Intercom knew this when it priced Fin at $0.99 per resolved ticket. The number was not pulled from competitor analysis. It was pulled from inference cost data: $0.15 to $0.85 per resolution depending on complexity, with enough volume for the law of large numbers to stabilize margin. Cognition's Devin charges $20/month plus per-compute-unit billing. Not because $20 is what the market will bear, but because the cost of autonomous coding sessions varies by 100x depending on complexity, and flat pricing would bleed on power users.

GitHub Copilot did the opposite. Microsoft set $10/month based on market positioning. The cost to serve turned out to be roughly $30/month. That gap went unmeasured for two years across 4.7 million subscribers.

The difference between these outcomes is not pricing talent. It is the presence or absence of cost-to-serve data before the price was set.

Last Updated: April 2026

The Structural Difference

Traditional SaaS has near-zero marginal cost per user. Add a customer, and the infrastructure bill barely moves. That made it possible to price on value, competitive positioning, or gut feel and still maintain 70-90% gross margins.

AI products break that assumption. Every interaction triggers inference. Every inference costs money. Two customers on the same plan can have 10-100x cost differences depending on usage patterns. A chat feature and a document analysis feature have entirely different cost profiles. A "free" feature nobody asked for might be burning 40% of compute.

ICONIQ Growth reports that scaling AI B2B companies averaged 52% gross margins in January 2026, up from 41% in 2024. Bessemer Venture Partners documents AI gross margins at 50-60%, against 70-90% for mature SaaS. For every $1M in AI product revenue, roughly $230,000 is consumed by inference costs alone. Before engineering, sales, or support.

Dimension	Traditional SaaS	AI-First SaaS
Gross margin	70-90%	20-60% (avg 52%)
Marginal cost per user	Near zero	Variable, per-request
Cost predictability	High (fixed infra)	Low (model-dependent)
Cost variance across customers	Minimal	10-100x range
Pricing model convergence	Settled (per-seat)	Unsettled (7+ models)

The structural implication: pricing AI products requires cost-to-serve data at the customer level. Without it, margin calculations are fiction.

Step 1: Know Your Costs Before Setting a Price

Most AI companies track the monthly OpenAI or Anthropic bill. Fewer track cost per customer. Almost none track cost per feature, per workflow, or per outcome.

The unit of AI cost accounting is not the token. It is the trace: a complete workflow execution from user request to final response. A single customer interaction might involve embedding a query, retrieving context from a vector database, calling a planner model, executing tool calls, generating a response, and running safety checks. That is six cost centers from multiple providers in one interaction. Token math captures maybe two of them.
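To make the trace idea concrete, here is a minimal sketch of trace-level cost accounting. The `Span` schema, the provider names, and every dollar figure are hypothetical placeholders; real values would come from your own instrumentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """One billable step inside a trace (hypothetical schema)."""
    step: str        # what the step did, e.g. "planner"
    provider: str    # who billed for it
    cost_usd: float  # measured cost of this step


def trace_cost(spans):
    """Total cost of one workflow execution."""
    return sum(span.cost_usd for span in spans)


def cost_by_step(spans):
    """Cost per step, so each cost center is visible on its own."""
    totals = {}
    for span in spans:
        totals[span.step] = totals.get(span.step, 0.0) + span.cost_usd
    return totals


# Illustrative numbers only, not measurements.
example_trace = [
    Span("embed_query", "embedding_api", 0.0001),
    Span("vector_retrieval", "vector_db", 0.0004),
    Span("planner", "llm_a", 0.0060),
    Span("tool_calls", "external_api", 0.0020),
    Span("response", "llm_b", 0.0150),
    Span("safety_check", "llm_a", 0.0010),
]

total = trace_cost(example_trace)
for step, cost in cost_by_step(example_trace).items():
    print(f"{step:<18} ${cost:.4f}  ({cost / total:.0%} of trace)")
print(f"{'total':<18} ${total:.4f}")
```

Keeping the step name on every span is what later lets the same records roll up per feature or per customer instead of only per provider invoice.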

Current API prices illustrate the range. GPT-4o costs $2.50/$10 per million input/output tokens. Claude Sonnet 4.6 costs $3/$15. Claude Opus 4.6 costs $5/$25. A workflow that chains multiple models, uses retrieval, and includes tool calls accumulates costs across all of them. Prompt caching (up to 90% savings on cached input tokens) and batch processing (50% off) can reduce these costs substantially, but only if the architecture is designed for them.
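The per-call arithmetic can be sketched as follows. The prices are the ones quoted above. The function name, the model keys, and the discount handling are assumptions made for illustration: the cache discount is applied only to the cached share of input tokens, the batch discount to the whole call, and cache-write surcharges are ignored. Check your provider's current pricing page before relying on any of it.

```python
# USD per million tokens as (input, output), as quoted in the text.
PRICES_PER_MTOK = {
    "gpt-4o": (2.50, 10.00),
    "claude-sonnet-4.6": (3.00, 15.00),
    "claude-opus-4.6": (5.00, 25.00),
}


def call_cost(model, input_tokens, output_tokens,
              cached_fraction=0.0, cache_discount=0.90,
              batch=False, batch_discount=0.50):
    """Estimated USD cost of one model call under simplified discount rules."""
    input_price, output_price = PRICES_PER_MTOK[model]
    input_cost = input_tokens / 1_000_000 * input_price
    # Assumption: only the cached share of the input gets the cache discount.
    input_cost *= 1 - cached_fraction * cache_discount
    output_cost = output_tokens / 1_000_000 * output_price
    total = input_cost + output_cost
    if batch:
        # Assumption: the batch discount applies to the whole call.
        total *= 1 - batch_discount
    return total


plain = call_cost("claude-sonnet-4.6", 10_000, 1_000)
cached = call_cost("claude-sonnet-4.6", 10_000, 1_000, cached_fraction=0.8)
batched = call_cost("claude-sonnet-4.6", 10_000, 1_000, batch=True)
print(f"plain ${plain:.4f}, 80% cached input ${cached:.4f}, batch ${batched:.4f}")
```

In this sketch caching only helps when a large share of each prompt is a reusable prefix, which is the architectural dependency the paragraph above points to.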

Before setting any price, four questions need answers.

What does it cost to serve each customer per month? Not on average. Per customer, with variance. The distribution matters more than the mean, because averages hide the customers who cost more than they pay.

Which features drive the most cost? A chat feature and a code generation feature have very different cost profiles. Knowing which features cost what determines which ones belong in each tier.

What does a single workflow execution cost? Not the LLM call alone. The full trace: retrieval, tool calls, retries, safety checks.

How does cost correlate with value delivered? Some high-cost interactions produce high value. Some do not. This correlation determines whether outcome-based pricing is viable for your product.

If you can answer these four, you can price with confidence. If you cannot, any pricing framework is built on assumptions that may be off by an order of magnitude.
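The first question, cost per customer with variance, can be illustrated with a few lines. The customer names and figures below are invented for the sketch; the point is only that the mean can look healthy while one account is underwater.

```python
import statistics

# Hypothetical month: customer -> (revenue_usd, cost_to_serve_usd)
customers = {
    "acme": (100.0, 8.0),
    "globex": (100.0, 12.0),
    "initech": (100.0, 15.0),
    "umbrella": (100.0, 35.0),
    "hooli": (100.0, 140.0),
}

costs = [cost for _, cost in customers.values()]
mean_cost = statistics.mean(costs)
median_cost = statistics.median(costs)

# Customers who cost more to serve than they pay.
underwater = [name for name, (revenue, cost) in customers.items() if cost > revenue]

print(f"mean cost ${mean_cost:.2f}, median cost ${median_cost:.2f}")
print(f"max/min cost ratio: {max(costs) / min(costs):.1f}x")
print(f"underwater customers: {underwater}")
```

Here the mean cost of $42 implies a comfortable margin on a $100 plan, yet one customer costs $140 to serve. The same grouping by feature instead of by customer answers the second question.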

Step 2: Build Unit Economics From Real Data

With cost data in hand, the core financial model becomes straightforward.

Gross margin per customer equals revenue minus variable cost to serve. For AI products, variable cost includes LLM API costs across all models, embedding and retrieval infrastructure, tool call and external API costs, compute for self-hosted models, and a proportional share of vector databases and caching layers.

Bessemer sets the benchmark tiers. Fast-ramping AI "Supernovas" average about 25% gross margin early on. Steadier "Shooting Stars" trend closer to 60%. LLM-native companies maintaining around 65% while growing 400% year-over-year represent the current ceiling for high-growth AI businesses.

A healthy AI product targets 60-80% gross margins. Below 50%, pricing or cost structure needs attention. The structural floor for AI companies will likely settle at 60-65%, below the 80%+ traditional SaaS achieves.

Metric	Target	Why
Gross margin	60-80%	Covers fixed costs and profit
LTV/CAC	3x minimum	Validates acquisition economics
Payback period	Under 12 months	Runway sustainability
Cost variance	Measured, not averaged	Tiers should reflect reality
Inference as % of revenue	Under 23%	ICONIQ benchmark for scaling AI

A practical example. A customer pays $100/month, costs $10/month to serve, and cost $300 to acquire. Payback is 3.3 months. Three-year gross LTV is roughly $3,240, giving an LTV/CAC of 10.8x. Healthy.

But if your heaviest users cost $80/month to serve on the same $100 plan, payback stretches to 15 months and LTV/CAC drops to 2.4x. Same product. Same plan. Radically different economics depending on which customer you examine. This is the unit economics challenge that makes per-customer cost data essential, not optional.
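The two cases above can be reproduced with a small helper. This is a sketch under the same simplifications the example uses: a 36-month horizon, no churn, and no discounting. The function name and return keys are illustrative.

```python
def unit_economics(price, cost_to_serve, cac, months=36):
    """Per-customer unit economics, assuming no churn and no discounting."""
    monthly_margin = price - cost_to_serve
    # A customer with zero or negative margin never pays back acquisition cost.
    payback_months = cac / monthly_margin if monthly_margin > 0 else float("inf")
    gross_ltv = monthly_margin * months
    return {
        "monthly_margin": monthly_margin,
        "payback_months": payback_months,
        "gross_ltv": gross_ltv,
        "ltv_to_cac": gross_ltv / cac,
    }


typical = unit_economics(price=100, cost_to_serve=10, cac=300)
heavy = unit_economics(price=100, cost_to_serve=80, cac=300)

for label, result in (("typical", typical), ("heavy", heavy)):
    print(f"{label}: payback {result['payback_months']:.1f} mo, "
          f"LTV ${result['gross_ltv']:,.0f}, LTV/CAC {result['ltv_to_cac']:.1f}x")
```

Running it per customer rather than on plan averages is what exposes the spread between the 10.8x and 2.4x cases.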

Step 3: Choose a Pricing Model That Fits Your Cost Structure

The market has not converged. A 2025 industry report found 92% of AI software companies now use mixed pricing models. Seven distinct approaches have emerged, and the right choice depends on your cost structure and the specificity of your use case.

Input-Based Pricing

Tokens, API calls, compute time, storage. OpenAI and Anthropic use this model because they serve an entire market and cannot predict what gets built on their APIs.

This works when you are a horizontal platform, your customers are technical, and cost scales predictably with usage. The risk: unpredictable input pricing makes users ration their usage, as Clay discovered firsthand. When Clay introduced per-action pricing, users during onboarding chose to enrich 10 emails instead of 1,000. Not because 10 was enough, but because they were nervous about spending credits they could not predict. The onboarding trained users to be conservative instead of discovering value.

Outcome-Based Pricing

Resolved tickets, completed reports, qualified leads. Intercom Fin charges $0.99 per resolved conversation. 11x bills roughly $5 per qualified lead. Chargeflow takes 25% of recovered chargeback value.

This works when the outcome is measurable and the problem is well-defined. The risk: cost variance per outcome can be enormous. Some support tickets take 500 tokens to resolve. Some take 100,000. If Intercom's average inference cost per resolution is $0.30, they keep $0.69 of gross margin. If a harder ticket requires multiple model calls and costs $0.85, margin drops to $0.14.

Sierra AI serves 40% of the Fortune 50 with outcome-based pricing and no public rates. Year-one costs reportedly reach $200K-$350K+. Outcome pricing at enterprise scale works when volume smooths variance. For startups without that volume, a single high-cost outlier can wreck monthly margins.
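The per-resolution arithmetic, and the way ticket mix moves blended margin, can be sketched as below. The $0.99 price and the $0.30 and $0.85 costs come from the text. The two ticket mixes are hypothetical and exist only to contrast a high-volume month with a small month skewed toward hard tickets.

```python
PRICE_PER_RESOLUTION = 0.99  # price per resolved conversation, from the text


def margin_per_resolution(cost_usd, price_usd=PRICE_PER_RESOLUTION):
    """Gross margin in USD on a single resolved ticket."""
    return price_usd - cost_usd


def blended_margin(ticket_mix, price_usd=PRICE_PER_RESOLUTION):
    """ticket_mix is a list of (count, cost_per_resolution) pairs.

    Returns (margin_usd, margin_as_fraction_of_revenue).
    """
    revenue = sum(count * price_usd for count, _ in ticket_mix)
    cost = sum(count * unit_cost for count, unit_cost in ticket_mix)
    margin = revenue - cost
    return margin, margin / revenue


# Hypothetical mixes: mostly easy tickets vs. a small month of mostly hard ones.
high_volume_month = [(80, 0.30), (20, 0.85)]
small_skewed_month = [(2, 0.30), (8, 0.85)]

print(f"easy ticket margin: ${margin_per_resolution(0.30):.2f}")
print(f"hard ticket margin: ${margin_per_resolution(0.85):.2f}")
for label, mix in (("high volume", high_volume_month), ("small, skewed", small_skewed_month)):
    margin, pct = blended_margin(mix)
    print(f"{label}: margin ${margin:.2f} ({pct:.0%} of revenue)")
```

With the same unit prices, the skewed month lands at roughly a quarter of revenue in margin versus well over half for the high-volume mix, which is the exposure a low-volume startup carries under outcome pricing.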

Hybrid: Subscription Base With Usage Components

Most AI products land here. A base subscription provides revenue predictability while usage components capture value as customers scale.

GitHub Copilot charges $10/month Individual, $19/user/month Business, $39/user/month Enterprise. Microsoft Copilot for Security bills $4/hour of compute. Notion bundles AI into its $20/user/month Business tier.

Three tiers work for most products. A starter tier with low price and limited usage for low-friction entry. A growth tier with higher limits that is the obvious choice for most customers. An enterprise tier with custom limits priced on negotiation. The design principle: hide pricing complexity for 90% of users. Let heavy users buy additional capacity after hitting their plan limit. Do not make everyone a meticulous gas-meter tracker.
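A base-plus-overage bill is simple to express. The plan name, base fee, included units, and overage rate below are all hypothetical; the sketch only shows the shape of the calculation, where users under the limit see one flat number and heavy users pay for extra capacity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    """A hybrid plan: flat base fee plus metered overage (hypothetical values)."""
    name: str
    base_usd: float
    included_units: int
    overage_usd_per_unit: float


def monthly_bill(plan, units_used):
    """Base fee, plus overage only for units beyond the included allowance."""
    overage_units = max(0, units_used - plan.included_units)
    return plan.base_usd + overage_units * plan.overage_usd_per_unit


growth = Plan(name="growth", base_usd=99.0, included_units=1_000,
              overage_usd_per_unit=0.05)

print(monthly_bill(growth, 800))    # under the limit: base fee only
print(monthly_bill(growth, 1_500))  # 500 units over the limit
```

The included allowance is the lever that hides metering from most users, so it is worth setting from the measured usage distribution rather than a round number.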

Model	Example	Billing Unit	Works When
Input-based	OpenAI, Anthropic	Per token/call	Horizontal platform, technical buyers
Per-resolution	Intercom Fin ($0.99)	Resolved ticket	Binary outcomes, high volume
Per-lead	11x ($5,000/mo)	Qualified lead	Measurable pipeline value
Hourly	Copilot for Security ($4/hr)	Compute hour	Bursty, unpredictable workloads
Outcome %	Chargeflow (25%)	% of value recovered	High-value, measurable outcomes
Hybrid seat+usage	GitHub Copilot ($19-$39/seat)	Seat + overages	Most B2B SaaS products
Bundled	Notion ($20/seat)	Feature-gated tier	AI as feature, not core product

Match Model to Product Maturity

Before committing, be honest about where the product sits.

Commodity products compete on price, with pure usage pricing at the lowest unit cost. Most wrapper and thin-layer AI products live here.

Differentiated products compete on quality, with premium per-unit pricing justified by measurable quality differences. Specialized vertical AI and products with proprietary data live here.

Indispensable products compete on outcomes and can make delivery promises because their output variance is narrow enough to keep them.

Most companies believe they are differentiated. Most are commodity. The pricing model should match reality. Commodity products doing outcome pricing make promises they cannot keep. Indispensable products on pure usage pricing leave money on the table. If you do not know which level you occupy, you probably lack the cost and outcome data to tell. That is itself the answer: start with usage-based or hybrid pricing and measure your way toward outcome pricing as output variance narrows.

Step 4: Price to Value, Bounded by Alternatives

Cost data is the floor. Value is the target. But the ceiling is set by alternatives, and that ceiling is moving.

If your AI replaces a $50,000/year employee, $10,000 annually is a strong value proposition regardless of delivery cost. If you prevent a $100,000 compliance violation, $10,000 is rational risk reduction. A useful rule of thumb: aim for 5-10x ROI for the customer. At 10x, the purchase decision is obvious. Below 3x, it becomes a negotiation.
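The rule of thumb translates into a price band. This sketch assumes the loose definition of ROI used here, value delivered divided by price paid, rather than net return over cost. The function names are illustrative.

```python
def roi_multiple(annual_value_usd, annual_price_usd):
    """Customer value per dollar of price (value / price, not net ROI)."""
    return annual_value_usd / annual_price_usd


def price_band(annual_value_usd, low_multiple=5, high_multiple=10):
    """Annual price range that lands the customer between the two ROI multiples."""
    return annual_value_usd / high_multiple, annual_value_usd / low_multiple


# Examples from the text: a $50,000/year role and a $100,000 violation avoided.
print(roi_multiple(50_000, 10_000))   # 5.0
print(roi_multiple(100_000, 10_000))  # 10.0
print(price_band(50_000))             # (5000.0, 10000.0)
```

On this definition, $10,000 against a $50,000 salary sits at the 5x edge of the band, and the same price against a $100,000 violation sits at 10x.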

Customers compare your price to four alternatives: hiring someone to do the work manually, using a competitor, building it themselves, or doing nothing.

That third option is changing faster than most vendors have noticed. AI coding tools have dropped the cost of building software in-house significantly. Retool's 2026 report found that 35% of their customers have already replaced at least one SaaS tool with a custom build, and 78% expect to build more of their own tools in 2026.

The practical test: can your customer's team member, armed with Cursor and a weekend, replicate 80% of your core workflow? When that build option was $500K and 12 months, paying $50K/year for SaaS was obvious. When it costs $5,000-$15,000 and three months with AI-assisted development, the calculus changes.

This does not mean you should underprice. Price signals quality. In B2B, a $29/month AI product competing against $500/month solutions raises suspicion, not excitement. But the ceiling on what you can charge is a moving target, and it is moving down. Anchor pricing to alternatives, not to what the market would theoretically pay.

Step 5: Handle Free Tiers With Cost Awareness

Free trials and freemium reduce the first purchase decision to zero. For AI products, this comes with a specific risk: you pay real, per-interaction costs for users who have not committed to paying.

Free trials grant full access for 14-30 days and create urgency. They work best when users experience value within minutes. One design principle: the trial clock should start when the user takes their first meaningful action, not when they create an account.

Freemium offers limited features indefinitely and works when usage reveals value over time and cost-to-serve on the free tier stays low. For AI products, "low cost to serve" is the constraint. Set clear boundaries: limit AI interactions per month, restrict which models or features are available, cap compute time or output length.

The math that matters: monitor cost to serve free users and free-to-paid conversion rate. If you spend $5/month per free user with a 2% monthly conversion rate, you spend $250 in free-tier costs per converted customer. That is your effective CAC from the free tier. Compare it to your other acquisition channels. If paid ads acquire customers at $150 and the free tier acquires them at $250 with higher engagement, the free tier may still win on LTV. If the free tier acquires at $250 with lower engagement, it is a cash furnace disguised as a growth engine.
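The free-tier arithmetic can be checked in a few lines. The $5, 2%, $150, and $250 figures come from the text. The LTV values in the comparison are hypothetical placeholders, since the text does not give them, and the steady-state framing (monthly spend divided by monthly conversions) is the same simplification the paragraph uses.

```python
def free_tier_cac(cost_per_free_user_month, monthly_conversion_rate):
    """Free-tier spend per converted customer, in steady state."""
    if monthly_conversion_rate <= 0:
        return float("inf")  # no conversions: the spend never acquires anyone
    return cost_per_free_user_month / monthly_conversion_rate


def ltv_to_cac(ltv_usd, cac_usd):
    return ltv_usd / cac_usd


free_cac = free_tier_cac(cost_per_free_user_month=5.0, monthly_conversion_rate=0.02)
paid_cac = 150.0

# Hypothetical LTVs: free-tier converts assumed to be more engaged.
free_ratio = ltv_to_cac(3_000.0, free_cac)
paid_ratio = ltv_to_cac(1_500.0, paid_cac)

print(f"free-tier CAC ${free_cac:.0f}, LTV/CAC {free_ratio:.1f}x")
print(f"paid-ads CAC ${paid_cac:.0f}, LTV/CAC {paid_ratio:.1f}x")
```

Under these assumed LTVs the free tier wins despite the higher CAC; with equal or lower LTV it loses, which is why both inputs need to be measured.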

Step 6: Iterate With Data, Not Intuition

Pricing is not a launch decision. It is a continuous process. In the first year, revisit quarterly. As the product matures, semi-annual reviews are sufficient unless model costs shift sharply. They often do.

Four signals indicate price is too low: customers accept without negotiation, sales cycles are unusually fast, buyers describe it as "a steal," and churn is high among low-engagement users, which suggests they do not value what they did not invest in.

Four signals indicate price is too high: frequent discount requests, high abandonment at the payment step, long sales cycles stuck at procurement, and declining win rates despite strong product fit.

When adjusting, raise prices for new customers first and measure conversion impact before applying broadly. Grandfather existing customers or give 90+ days notice with clear communication about added value. Test packaging changes before testing price changes. Packaging often has a larger impact than the number itself.

The Foundation

Every step in this framework requires one thing: actual cost-to-serve data at the customer level. Without it, unit economics are estimates. Tier boundaries are guesses. Margin calculations are fiction.

The AI pricing discourse is full of confident, mutually exclusive claims. Seat-based pricing is dead. Seat-based pricing improved retention. Outcome-based pricing is the future. Outcome-based pricing is a buzzword. Usage-based pricing is necessary. Usage-based pricing destroys retention. Credits are a growth lever. Credits are customer-hostile.

When you lay these arguments side by side, the contradictions are not subtle. Almost every disagreement traces back to the same gap: per-request, per-customer cost-to-serve data barely exists yet. The arguments are built on intuition, anecdotes, and small samples.

The market may be converging on hybrid pricing not because it is the best model, but because it is the model you can implement without knowing your true cost-to-serve. Companies that do have per-request cost data might arrive at entirely different, more profitable answers. They might discover that certain customer segments are wildly profitable under flat pricing while others need aggressive usage gating. They might find that output variance is narrow enough in specific use cases to justify outcome pricing for those segments alone.

Cost-to-serve data is what turns pricing from guesswork into architecture. The teams that have it build pricing that adapts. The rest build pricing that corrects.

Bear Lumen gives you per-customer cost-to-serve data for AI products.