# api reference

all chaos options and model wrappers

for the base `cruel(...)` function api, see core api. this page focuses on `cruel/ai-sdk`.
## cruelModel

wraps a language model with chaos injection

```ts
import { cruelModel } from "cruel/ai-sdk"

const model = cruelModel(openai("gpt-4o"), {
  rateLimit: 0.2,
  delay: [100, 500],
})
```

## cruelEmbeddingModel
wraps an embedding model

```ts
import { cruelEmbeddingModel } from "cruel/ai-sdk"

const model = cruelEmbeddingModel(openai.embedding("text-embedding-3-small"), {
  rateLimit: 0.2,
})
```

## cruelImageModel
wraps an image model

```ts
import { cruelImageModel } from "cruel/ai-sdk"

const model = cruelImageModel(openai.image("dall-e-3"), {
  rateLimit: 0.2,
})
```

## cruelSpeechModel
wraps a speech model

```ts
import { cruelSpeechModel } from "cruel/ai-sdk"

const model = cruelSpeechModel(openai.speech("tts-1"), {
  rateLimit: 0.1,
})
```

## cruelTranscriptionModel
wraps a transcription model

```ts
import { cruelTranscriptionModel } from "cruel/ai-sdk"

const model = cruelTranscriptionModel(openai.transcription("whisper-1"), {
  rateLimit: 0.1,
})
```

## cruelVideoModel
wraps a video model

```ts
import { cruelVideoModel } from "cruel/ai-sdk"

const model = cruelVideoModel(google.video("veo-2.0-generate-001"), {
  rateLimit: 0.2,
})
```

## cruelProvider
wraps an entire provider - automatically dispatches to the correct wrapper based on model type

```ts
import { cruelProvider } from "cruel/ai-sdk"

const chaos = cruelProvider(openai, {
  rateLimit: 0.1,
  models: {
    "gpt-4o": { rateLimit: 0.5 },
  },
})

chaos("gpt-4o") // cruelModel
chaos.embeddingModel("text-embedding") // cruelEmbeddingModel
chaos.imageModel("dall-e-3") // cruelImageModel
```

## cruelMiddleware
creates ai sdk middleware for chaos injection

```ts
import { cruelMiddleware } from "cruel/ai-sdk"
import { wrapLanguageModel } from "ai"

const model = wrapLanguageModel({
  model: openai("gpt-4o"),
  middleware: cruelMiddleware({ rateLimit: 0.1 }),
})
```

## cruelTool / cruelTools
wraps tool execution with chaos

```ts
import { cruelTool, cruelTools } from "cruel/ai-sdk"

const tool = cruelTool(myTool, { toolFailure: 0.2 })
const tools = cruelTools({ search, calc }, { toolFailure: 0.1 })
```

## model id override
if `MODEL` is set in the environment, cruel swaps the model id used by wrappers:

```sh
MODEL=gpt-6 bun run your-script.ts
```

| original | swapped |
|---|---|
| gpt-4o | gpt-6 |
| openai/gpt-4o | openai/gpt-6 |
## chaos options

all options are probabilities between 0 and 1 (0 = never, 1 = always)
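as an illustration of that semantics (a sketch of the behavior, not the library's internals — `shouldFire` is a hypothetical name), a probability option behaves like a random gate checked on each call:

```ts
// illustrative: an option with rate r fires when a uniform random
// draw lands below r. `rng` is injectable so the gate is testable;
// Math.random() is the default.
function shouldFire(rate: number, rng: () => number = Math.random): boolean {
  if (rate <= 0) return false // 0 = never
  if (rate >= 1) return true  // 1 = always
  return rng() < rate
}

console.log(shouldFire(0))               // false — never fires
console.log(shouldFire(1))               // true — always fires
console.log(shouldFire(0.2, () => 0.1))  // true — draw 0.1 < rate 0.2
console.log(shouldFire(0.2, () => 0.9))  // false — draw 0.9 >= rate 0.2
```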
### pre-call failures

these fire before the api request is made

| option | type | description |
|---|---|---|
| rateLimit | number \| { rate, retryAfter? } | simulates 429 rate limit (retryable) |
| overloaded | number | simulates 529 model overloaded (retryable) |
| modelUnavailable | number | simulates 503 model not available (retryable) |
| fail | number | simulates 500 generation failed (retryable) |
| invalidApiKey | number | simulates 401 invalid key (not retryable) |
| quotaExceeded | number | simulates 402 quota exceeded (not retryable) |
| contextLength | number | simulates 400 context too long (not retryable) |
| contentFilter | number | simulates 400 content filtered (not retryable) |
| emptyResponse | number | simulates 200 with empty body (not retryable) |
| timeout | number | hangs forever (never resolves) |
| delay | number \| [min, max] | adds latency in ms before the call |
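the `delay` option's `[min, max]` form picks a latency somewhere within the range. a minimal sketch of that semantics (`resolveDelay` is a hypothetical helper, not the library's code):

```ts
// illustrative: resolve a delay option to a concrete latency in ms.
// a plain number is used as-is; a [min, max] tuple picks uniformly in range.
function resolveDelay(
  delay: number | [number, number],
  rng: () => number = Math.random,
): number {
  if (typeof delay === "number") return delay
  const [min, max] = delay
  return min + rng() * (max - min)
}

console.log(resolveDelay(250))                 // 250
console.log(resolveDelay([100, 500], () => 0)) // 100 — bottom of range
console.log(resolveDelay([100, 500], () => 1)) // 500 — top of range
```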
### post-call mutations

these modify the response after a successful api call

| option | type | description |
|---|---|---|
| partialResponse | number | truncates the response text randomly |
| finishReason | string | overrides the finish reason |
| tokenUsage | { inputTokens?, outputTokens? } | overrides token counts |
### stream transforms

these modify the token stream in real-time

| option | type | description |
|---|---|---|
| slowTokens | number \| [min, max] | adds delay between each token in ms |
| streamCut | number | kills the stream mid-transfer |
| corruptChunks | number | replaces random characters with the replacement character |
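to illustrate what `corruptChunks` does to a chunk of streamed text (a sketch of the documented behavior, not the library's implementation — `corruptChunk` is a hypothetical name):

```ts
// illustrative: replace each character with U+FFFD (the unicode
// replacement character, "�") at the given probability.
function corruptChunk(
  chunk: string,
  rate: number,
  rng: () => number = Math.random,
): string {
  let out = ""
  for (const ch of chunk) {
    out += rng() < rate ? "\uFFFD" : ch
  }
  return out
}

console.log(corruptChunk("hello", 0)) // "hello" — rate 0 leaves text untouched
console.log(corruptChunk("hello", 1)) // "�����" — rate 1 replaces every char
```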
### tool options

| option | type | description |
|---|---|---|
| toolFailure | number | tool execution throws an error |
| toolTimeout | number | tool execution hangs forever |
## presets

```ts
import { presets } from "cruel/ai-sdk"
```

| preset | rateLimit | overloaded | streamCut | delay |
|---|---|---|---|---|
| realistic | 0.02 | 0.01 | - | 50-200ms |
| unstable | 0.1 | 0.05 | 0.05 | 100-500ms |
| harsh | 0.2 | 0.1 | 0.1 | 200-1000ms |
| nightmare | 0.3 | 0.15 | 0.15 | 500-2000ms |
| apocalypse | 0.4 | 0.2 | 0.2 | 1000-5000ms |
## onChaos callback

every chaos event fires a callback with the event type and model id

```ts
const model = cruelModel(openai("gpt-4o"), {
  rateLimit: 0.2,
  onChaos: (event) => {
    console.log(event.type, event.modelId)
  },
})
```

event types: `rateLimit`, `overloaded`, `contextLength`, `contentFilter`, `modelUnavailable`, `invalidApiKey`, `quotaExceeded`, `emptyResponse`, `fail`, `timeout`, `delay`, `streamCut`, `slowTokens`, `corruptChunk`, `partialResponse`, `toolFailure`, `toolTimeout`
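one common use of the callback is tallying events by type. a self-contained sketch, assuming only the `{ type, modelId }` event shape documented above (`makeEventCounter` is a hypothetical helper):

```ts
// illustrative: count chaos events by type via the onChaos callback shape.
type ChaosEvent = { type: string; modelId: string }

function makeEventCounter() {
  const counts = new Map<string, number>()
  return {
    // pass `track` as the onChaos callback
    track(event: ChaosEvent) {
      counts.set(event.type, (counts.get(event.type) ?? 0) + 1)
    },
    count(type: string): number {
      return counts.get(type) ?? 0
    },
  }
}

const counter = makeEventCounter()
counter.track({ type: "rateLimit", modelId: "gpt-4o" })
counter.track({ type: "rateLimit", modelId: "gpt-4o" })
counter.track({ type: "delay", modelId: "gpt-4o" })
console.log(counter.count("rateLimit")) // 2
console.log(counter.count("streamCut")) // 0
```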
## diagnostics

programmatic chaos reporting for test suites. track events, record results, compute stats, print reports

```ts
import { cruelModel, diagnostics } from "cruel/ai-sdk"

const ctx = diagnostics.context()

const model = cruelModel(openai("gpt-4o"), {
  rateLimit: 0.3,
  delay: [100, 500],
  onChaos: diagnostics.tracker(ctx),
})

for (let i = 1; i <= 10; i++) {
  diagnostics.before(ctx, i)
  const start = performance.now()
  try {
    const result = await generateText({ model, prompt: "test", maxRetries: 2 })
    diagnostics.success(ctx, i, Math.round(performance.now() - start), result.text)
  } catch (e) {
    diagnostics.failure(ctx, i, Math.round(performance.now() - start), e)
  }
}

diagnostics.print(ctx)
```

### raw stats for assertions
```ts
const s = diagnostics.stats(ctx)

s.total           // number of requests
s.succeeded       // number of successes
s.failed          // number of failures
s.successRate     // 0-1
s.duration        // total ms
s.totalEvents     // number of chaos events
s.events          // [{ type, count, percent }]
s.errors          // failed requests with status, retryable, event chain
s.requests        // all requests with events
s.latency.success // { avg, p50, p99, min, max }
s.latency.failure // { avg, p50, p99, min, max }
```

### use in tests
```ts
test("survives 30% rate limits", async () => {
  const ctx = diagnostics.context()
  const model = cruelModel(openai("gpt-4o"), {
    rateLimit: 0.3,
    onChaos: diagnostics.tracker(ctx),
  })

  for (let i = 1; i <= 20; i++) {
    diagnostics.before(ctx, i)
    const start = performance.now()
    try {
      await generateText({ model, prompt: "test", maxRetries: 2 })
      diagnostics.success(ctx, i, Math.round(performance.now() - start), "ok")
    } catch (e) {
      diagnostics.failure(ctx, i, Math.round(performance.now() - start), e)
    }
  }

  const s = diagnostics.stats(ctx)
  expect(s.successRate).toBeGreaterThan(0.5)
  expect(s.latency.success.p99).toBeLessThan(5000)
})
```