Cost Control

When AI generates code using paid APIs/services, enforce strict limits.

⚠️

AI loves generating loops and recursive calls. Without limits, this can rack up massive bills.

The Math Rule

Always calculate and comment maximum costs:

// ALWAYS calculate and comment max costs
const OPENAI_COST_PER_1K_TOKENS = 0.002;
const MAX_TOKENS_PER_REQUEST = 4000;
const MAX_REQUESTS_PER_DAY = 1000;
const MAX_DAILY_COST = 
  OPENAI_COST_PER_1K_TOKENS * 
  (MAX_TOKENS_PER_REQUEST / 1000) * 
  MAX_REQUESTS_PER_DAY;
// MAX_DAILY_COST = $8.00/day = $240/month

The Limit Rule

Every paid operation needs explicit limits:

const rateLimiter = new RateLimiter({
  maxRequests: 100,
  windowMs: 60 * 1000,  // per minute
  maxCost: 10.00,       // per day
});
 
// Fail BEFORE hitting limit, not after
if (rateLimiter.wouldExceed(estimatedCost)) {
  throw new Error('Rate limit would be exceeded');
}

The Loop Prevention Rule

AI loves generating recursive/looping code. Always add circuit breakers:

let attempts = 0;
const MAX_ATTEMPTS = 10;
 
while (condition && attempts < MAX_ATTEMPTS) {
  attempts++;
  // ... operation
}
 
if (attempts >= MAX_ATTEMPTS) {
  logError('Circuit breaker triggered');
  alertOps();
}

Cost Control Checklist

Before deploying AI-generated code with paid services:

Max cost per request calculated and commented
Max cost per day/month calculated
Rate limits implemented
Circuit breakers for loops/recursion
Alerts set up for approaching limits
Kill switch ready for runaway costs

Example: OpenAI Integration

const MAX_TOKENS = 1000;
const MAX_REQUESTS_PER_MINUTE = 10;
const COST_PER_1K_TOKENS = 0.002;
 
// Estimated max cost: $0.02/minute = $1.20/hour = $28.80/day
// Comment the math so future-you understands the limit
 
class OpenAIService {
  private requestCount = 0;
  private lastResetTime = Date.now();
  
  async complete(prompt: string) {
    // Rate limit check
    if (this.requestCount >= MAX_REQUESTS_PER_MINUTE) {
      const waitTime = 60000 - (Date.now() - this.lastResetTime);
      if (waitTime > 0) {
        throw new Error(`Rate limit exceeded. Wait ${waitTime}ms`);
      }
      this.requestCount = 0;
      this.lastResetTime = Date.now();
    }
    
    this.requestCount++;
    
    return openai.complete({
      prompt,
      max_tokens: MAX_TOKENS,  // Hard limit
    });
  }
}

Alert Thresholds

Set up alerts at multiple levels:

Threshold	Action
50% of daily limit	Info alert to Slack
75% of daily limit	Warning email
90% of daily limit	Page on-call
100% of daily limit	Auto-disable service

Architecture Principles Tool Tips