cruel

api reference

all chaos options and model wrappers

for the base cruel(...) function api, see core api. this page focuses on cruel/ai-sdk.

cruelModel

wraps a language model with chaos injection

import { cruelModel } from "cruel/ai-sdk"

const model = cruelModel(openai("gpt-4o"), {
  rateLimit: 0.2,
  delay: [100, 500],
})

cruelEmbeddingModel

wraps an embedding model

import { cruelEmbeddingModel } from "cruel/ai-sdk"

const model = cruelEmbeddingModel(openai.embedding("text-embedding-3-small"), {
  rateLimit: 0.2,
})

cruelImageModel

wraps an image model

import { cruelImageModel } from "cruel/ai-sdk"

const model = cruelImageModel(openai.image("dall-e-3"), {
  rateLimit: 0.2,
})

cruelSpeechModel

wraps a speech model

import { cruelSpeechModel } from "cruel/ai-sdk"

const model = cruelSpeechModel(openai.speech("tts-1"), {
  rateLimit: 0.1,
})

cruelTranscriptionModel

wraps a transcription model

import { cruelTranscriptionModel } from "cruel/ai-sdk"

const model = cruelTranscriptionModel(openai.transcription("whisper-1"), {
  rateLimit: 0.1,
})

cruelVideoModel

wraps a video model

import { cruelVideoModel } from "cruel/ai-sdk"

const model = cruelVideoModel(google.video("veo-2.0-generate-001"), {
  rateLimit: 0.2,
})

cruelProvider

wraps an entire provider and automatically dispatches to the correct wrapper based on model type

import { cruelProvider } from "cruel/ai-sdk"

const chaos = cruelProvider(openai, {
  rateLimit: 0.1,
  models: {
    "gpt-4o": { rateLimit: 0.5 },
  },
})

chaos("gpt-4o")                    // cruelModel
chaos.embeddingModel("text-embedding") // cruelEmbeddingModel
chaos.imageModel("dall-e-3")           // cruelImageModel

cruelMiddleware

creates ai sdk middleware for chaos injection

import { cruelMiddleware } from "cruel/ai-sdk"
import { wrapLanguageModel } from "ai"

const model = wrapLanguageModel({
  model: openai("gpt-4o"),
  middleware: cruelMiddleware({ rateLimit: 0.1 }),
})

cruelTool / cruelTools

wraps tool execution with chaos

import { cruelTool, cruelTools } from "cruel/ai-sdk"

const tool = cruelTool(myTool, { toolFailure: 0.2 })
const tools = cruelTools({ search, calc }, { toolFailure: 0.1 })

model id override

if MODEL is set in the environment, cruel swaps the model id used by wrappers:

MODEL=gpt-6 bun run your-script.ts
  • gpt-4o -> gpt-6
  • openai/gpt-4o -> openai/gpt-6

chaos options

unless the type column says otherwise, options are probabilities between 0 and 1 (0 = never, 1 = always)

pre-call failures

these fire before the api request is made

option            type                            description
rateLimit         number | { rate, retryAfter? }  simulates 429 rate limit (retryable)
overloaded        number                          simulates 529 model overloaded (retryable)
modelUnavailable  number                          simulates 503 model not available (retryable)
fail              number                          simulates 500 generation failed (retryable)
invalidApiKey     number                          simulates 401 invalid key (not retryable)
quotaExceeded     number                          simulates 402 quota exceeded (not retryable)
contextLength     number                          simulates 400 context too long (not retryable)
contentFilter     number                          simulates 400 content filtered (not retryable)
emptyResponse     number                          simulates 200 with empty body (not retryable)
timeout           number                          hangs forever (never resolves)
delay             number | [min, max]             adds latency in ms before the call
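
a sketch combining several pre-call options. the object form of rateLimit carries a retryAfter hint for the simulated 429; treating the unit as seconds is an assumption, not something this page confirms:

import { cruelModel } from "cruel/ai-sdk"
import { openai } from "@ai-sdk/openai"

const model = cruelModel(openai("gpt-4o"), {
  rateLimit: { rate: 0.3, retryAfter: 5 }, // 30% of calls fail with 429 (retryAfter unit assumed to be seconds)
  overloaded: 0.05,                        // 5% of calls fail with 529
  delay: [200, 800],                       // every call waits 200-800ms before going out
})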

post-call mutations

these modify the response after a successful api call

option           type                             description
partialResponse  number                           truncates the response text randomly
finishReason     string                           overrides the finish reason
tokenUsage       { inputTokens?, outputTokens? }  overrides token counts
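
a sketch of the three mutations together. the "length" value assumes the ai sdk's standard finish reasons:

import { cruelModel } from "cruel/ai-sdk"
import { openai } from "@ai-sdk/openai"

const model = cruelModel(openai("gpt-4o"), {
  partialResponse: 0.2,                            // 20% of responses come back truncated
  finishReason: "length",                          // always report "length" as the finish reason
  tokenUsage: { inputTokens: 1, outputTokens: 1 }, // report fake token counts
})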

stream transforms

these modify the token stream in real-time

option         type                 description
slowTokens     number | [min, max]  adds delay between each token in ms
streamCut      number               kills the stream mid-transfer
corruptChunks  number               replaces random characters with the unicode replacement character (U+FFFD)
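
these only matter for streaming calls. a sketch with streamText; since streamCut aborts streams mid-transfer, consumption is wrapped in try/catch. reading corruptChunks as a per-chunk probability is an assumption:

import { streamText } from "ai"
import { cruelModel } from "cruel/ai-sdk"
import { openai } from "@ai-sdk/openai"

const model = cruelModel(openai("gpt-4o"), {
  slowTokens: [50, 200], // 50-200ms pause between tokens
  streamCut: 0.1,        // 10% of streams die mid-transfer
  corruptChunks: 0.05,   // occasionally replace characters (per-chunk rate assumed)
})

try {
  const { textStream } = streamText({ model, prompt: "write a haiku" })
  for await (const chunk of textStream) process.stdout.write(chunk)
} catch {
  // expected: streamCut kills some streams partway through
}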

tool options

option       type    description
toolFailure  number  tool execution throws an error
toolTimeout  number  tool execution hangs forever
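
a sketch with both options. search and calc stand in for your own ai sdk tool definitions:

import { cruelTools } from "cruel/ai-sdk"

const tools = cruelTools(
  { search, calc }, // your own tool definitions
  {
    toolFailure: 0.2,  // 20% of executions throw
    toolTimeout: 0.05, // 5% of executions hang forever
  },
)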

presets

import { presets } from "cruel/ai-sdk"

preset      rateLimit  overloaded  streamCut  delay
realistic   0.02       0.01        -          50-200ms
unstable    0.1        0.05        0.05       100-500ms
harsh       0.2        0.1         0.1        200-1000ms
nightmare   0.3        0.15        0.15       500-2000ms
apocalypse  0.4        0.2         0.2        1000-5000ms
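
a sketch applying a preset. this assumes presets are plain chaos-option objects, so spreading one lets you override individual options:

import { cruelModel, presets } from "cruel/ai-sdk"
import { openai } from "@ai-sdk/openai"

const model = cruelModel(openai("gpt-4o"), {
  ...presets.harsh, // 0.2 rate limits, 0.1 overloaded, 0.1 stream cuts, 200-1000ms delays
  streamCut: 0,     // keep the harsh failure rates but leave streams intact
})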

onChaos callback

every chaos event fires a callback with the event type and model id

const model = cruelModel(openai("gpt-4o"), {
  rateLimit: 0.2,
  onChaos: (event) => {
    console.log(event.type, event.modelId)
  },
})

event types: rateLimit, overloaded, contextLength, contentFilter, modelUnavailable, invalidApiKey, quotaExceeded, emptyResponse, fail, timeout, delay, streamCut, slowTokens, corruptChunk, partialResponse, toolFailure, toolTimeout
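
a sketch that tallies events by type, a lighter-weight alternative to the diagnostics api below:

import { cruelModel } from "cruel/ai-sdk"
import { openai } from "@ai-sdk/openai"

const counts: Record<string, number> = {}

const model = cruelModel(openai("gpt-4o"), {
  rateLimit: 0.2,
  streamCut: 0.1,
  onChaos: (event) => {
    counts[event.type] = (counts[event.type] ?? 0) + 1
  },
})

// after a run, counts might look like { rateLimit: 4, streamCut: 2 }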

diagnostics

programmatic chaos reporting for test suites. track events, record results, compute stats, print reports

import { generateText } from "ai"
import { cruelModel, diagnostics } from "cruel/ai-sdk"

const ctx = diagnostics.context()

const model = cruelModel(openai("gpt-4o"), {
  rateLimit: 0.3,
  delay: [100, 500],
  onChaos: diagnostics.tracker(ctx),
})

for (let i = 1; i <= 10; i++) {
  diagnostics.before(ctx, i)
  const start = performance.now()
  try {
    const result = await generateText({ model, prompt: "test", maxRetries: 2 })
    diagnostics.success(ctx, i, Math.round(performance.now() - start), result.text)
  } catch (e) {
    diagnostics.failure(ctx, i, Math.round(performance.now() - start), e)
  }
}

diagnostics.print(ctx)

raw stats for assertions

const s = diagnostics.stats(ctx)

s.total          // number of requests
s.succeeded      // number of successes
s.failed         // number of failures
s.successRate    // 0-1
s.duration       // total ms
s.totalEvents    // number of chaos events
s.events         // [{ type, count, percent }]
s.errors         // failed requests with status, retryable, event chain
s.requests       // all requests with events
s.latency.success // { avg, p50, p99, min, max }
s.latency.failure // { avg, p50, p99, min, max }

use in tests

test("survives 30% rate limits", async () => {
  const ctx = diagnostics.context()
  const model = cruelModel(openai("gpt-4o"), {
    rateLimit: 0.3,
    onChaos: diagnostics.tracker(ctx),
  })

  for (let i = 1; i <= 20; i++) {
    diagnostics.before(ctx, i)
    const start = performance.now()
    try {
      await generateText({ model, prompt: "test", maxRetries: 2 })
      diagnostics.success(ctx, i, Math.round(performance.now() - start), "ok")
    } catch (e) {
      diagnostics.failure(ctx, i, Math.round(performance.now() - start), e)
    }
  }

  const s = diagnostics.stats(ctx)
  expect(s.successRate).toBeGreaterThan(0.5)
  expect(s.latency.success.p99).toBeLessThan(5000)
})