How to Rate Limit AI API Routes in Next.js: Protect Your Budget from Abuse

Source: DEV Community
The Rate Limit That Protects Your AI Budget

AI API calls are expensive. One user with a script, one bug in your frontend, one bad actor -- and your monthly bill spikes before you notice. Here's how to add per-user rate limiting to your AI routes so one user can't burn your entire Anthropic/OpenAI budget.

Why AI Routes Need Special Limits

Approximate cost per call:
    Standard API call: ~0.001 cents
    Claude Sonnet call (1k tokens in, 1k out): ~0.3 cents
    GPT-4o call (1k tokens in, 1k out): ~0.5 cents

At 1,000 requests from one user:
    Standard API: $0.01 -- irrelevant
    Claude Sonnet: $3.00 -- noticeable
    GPT-4o: $5.00 -- noticeable

At 10,000 requests (runaway script, malicious actor):
    Claude Sonnet: $30 in minutes
    GPT-4o: $50 in minutes

The Two-Layer Approach

    // lib/ai-rate-limit.ts
    // Layer 1: Short burst limit (prevents rapid-fire abuse)
    // Layer 2: Daily budget limit (prevents sustained drain)
    import { rateLimit } from './rate-limit'

    export async function checkAIRateLimit(userId: string) {
      // 10 requests per minute
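
To make the two-layer idea concrete, here is a minimal, self-contained sketch. It is not the article's `checkAIRateLimit` and does not use the `./rate-limit` helper, whose API isn't shown here. It assumes a simple fixed-window counter kept in an in-memory `Map`, which only works for a single server process. The class name `TwoLayerLimiter`, the result shape, and the daily cap of 200 are all hypothetical; only the 10-requests-per-minute burst limit comes from the text above.

```typescript
// Hypothetical two-layer limiter: fixed-window counters held in memory.
// Assumption: one server process. Multiple instances would need a shared store.

type WindowState = { count: number; resetAt: number };

type LimitResult =
  | { allowed: true }
  | { allowed: false; reason: 'burst' | 'daily'; retryAfterMs: number };

const MINUTE_MS = 60 * 1000;
const DAY_MS = 24 * 60 * 60 * 1000;

class TwoLayerLimiter {
  private burst = new Map<string, WindowState>();
  private daily = new Map<string, WindowState>();
  private burstLimit: number;
  private dailyLimit: number;
  private now: () => number;

  constructor(
    burstLimit: number = 10,   // Layer 1: 10 requests per minute (from the article)
    dailyLimit: number = 200,  // Layer 2: assumed daily cap, choose your own
    now: () => number = () => Date.now(), // injectable clock, handy for tests
  ) {
    this.burstLimit = burstLimit;
    this.dailyLimit = dailyLimit;
    this.now = now;
  }

  // Returns whether the request may proceed; counts it only when allowed.
  check(userId: string): LimitResult {
    const t = this.now();
    const b = this.current(this.burst, userId, t, MINUTE_MS);
    const d = this.current(this.daily, userId, t, DAY_MS);

    if (b.count >= this.burstLimit) {
      return { allowed: false, reason: 'burst', retryAfterMs: b.resetAt - t };
    }
    if (d.count >= this.dailyLimit) {
      return { allowed: false, reason: 'daily', retryAfterMs: d.resetAt - t };
    }
    b.count += 1;
    d.count += 1;
    return { allowed: true };
  }

  // Fetch the user's window, starting a fresh one if none exists or it expired.
  private current(
    store: Map<string, WindowState>,
    key: string,
    t: number,
    windowMs: number,
  ): WindowState {
    let w = store.get(key);
    if (!w || t >= w.resetAt) {
      w = { count: 0, resetAt: t + windowMs };
      store.set(key, w);
    }
    return w;
  }
}
```

In a route handler you would call `check(userId)` before contacting the model and, when `allowed` is false, respond with HTTP 429 and a `Retry-After` header derived from `retryAfterMs`. Rejected requests are not counted against either layer in this sketch, so a blocked user doesn't dig a deeper hole by retrying.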