# api reference

all chaos options and model wrappers

for the base `cruel(...)` function api, see core api. this page focuses on `cruel/ai-sdk`.
## cruelModel

wraps a language model with chaos injection

```ts
import { cruelModel } from "cruel/ai-sdk"

const model = cruelModel(openai("gpt-4o"), {
  rateLimit: 0.2,
  delay: [100, 500],
})
```

## cruelEmbeddingModel
wraps an embedding model

```ts
import { cruelEmbeddingModel } from "cruel/ai-sdk"

const model = cruelEmbeddingModel(openai.embedding("text-embedding-3-small"), {
  rateLimit: 0.2,
})
```

## cruelImageModel
wraps an image model

```ts
import { cruelImageModel } from "cruel/ai-sdk"

const model = cruelImageModel(openai.image("dall-e-3"), {
  rateLimit: 0.2,
})
```

## cruelSpeechModel
wraps a speech model

```ts
import { cruelSpeechModel } from "cruel/ai-sdk"

const model = cruelSpeechModel(openai.speech("tts-1"), {
  rateLimit: 0.1,
})
```

## cruelTranscriptionModel
wraps a transcription model

```ts
import { cruelTranscriptionModel } from "cruel/ai-sdk"

const model = cruelTranscriptionModel(openai.transcription("whisper-1"), {
  rateLimit: 0.1,
})
```

## cruelVideoModel
wraps a video model

```ts
import { cruelVideoModel } from "cruel/ai-sdk"

const model = cruelVideoModel(google.video("veo-2.0-generate-001"), {
  rateLimit: 0.2,
})
```

## cruelProvider
wraps an entire provider - automatically dispatches to the correct wrapper based on model type

```ts
import { cruelProvider } from "cruel/ai-sdk"

const chaos = cruelProvider(openai, {
  rateLimit: 0.1,
  models: {
    "gpt-4o": { rateLimit: 0.5 },
  },
})

chaos("gpt-4o") // cruelModel
chaos.embeddingModel("text-embedding") // cruelEmbeddingModel
chaos.imageModel("dall-e-3") // cruelImageModel
```

## cruelMiddleware
creates ai sdk middleware for chaos injection

```ts
import { cruelMiddleware } from "cruel/ai-sdk"
import { wrapLanguageModel } from "ai"

const model = wrapLanguageModel({
  model: openai("gpt-4o"),
  middleware: cruelMiddleware({ rateLimit: 0.1 }),
})
```

## cruelTool / cruelTools
wraps tool execution with chaos

```ts
import { cruelTool, cruelTools } from "cruel/ai-sdk"

const tool = cruelTool(myTool, { toolFailure: 0.2 })
const tools = cruelTools({ search, calc }, { toolFailure: 0.1 })
```

## model id override
if `MODEL` is set in the environment, cruel swaps the model id used by wrappers:

```sh
MODEL=gpt-6 bun run your-script.ts
```

| original | swapped |
|---|---|
| gpt-4o | gpt-6 |
| openai/gpt-4o | openai/gpt-6 |
## chaos options

all options are probabilities between 0 and 1 (0 = never, 1 = always)
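as an illustration of that semantics (a sketch of the behavior, not the library's internals — `shouldFire` is a hypothetical name), a probability option behaves like a random gate checked on each call:

```ts
// illustrative: an option with rate r fires when a uniform random
// draw lands below r. `rng` is injectable so the gate is testable;
// Math.random() is the default.
function shouldFire(rate: number, rng: () => number = Math.random): boolean {
  if (rate <= 0) return false // 0 = never
  if (rate >= 1) return true  // 1 = always
  return rng() < rate
}

console.log(shouldFire(0))               // false — never fires
console.log(shouldFire(1))               // true — always fires
console.log(shouldFire(0.2, () => 0.1))  // true — draw 0.1 < rate 0.2
console.log(shouldFire(0.2, () => 0.9))  // false — draw 0.9 >= rate 0.2
```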
### pre-call failures

these fire before the api request is made

| option | type | description |
|---|---|---|
| rateLimit | number \| { rate, retryAfter? } | simulates 429 rate limit (retryable) |
| overloaded | number | simulates 529 model overloaded (retryable) |
| modelUnavailable | number | simulates 503 model not available (retryable) |
| fail | number | simulates 500 generation failed (retryable) |
| invalidApiKey | number | simulates 401 invalid key (not retryable) |
| quotaExceeded | number | simulates 402 quota exceeded (not retryable) |
| contextLength | number | simulates 400 context too long (not retryable) |
| contentFilter | number | simulates 400 content filtered (not retryable) |
| emptyResponse | number | simulates 200 with empty body (not retryable) |
| timeout | number | hangs forever (never resolves) |
| delay | number \| [min, max] | adds latency in ms before the call |
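the `delay` option's `[min, max]` form picks a latency somewhere within the range. a minimal sketch of that semantics (`resolveDelay` is a hypothetical helper, not the library's code):

```ts
// illustrative: resolve a delay option to a concrete latency in ms.
// a plain number is used as-is; a [min, max] tuple picks uniformly in range.
function resolveDelay(
  delay: number | [number, number],
  rng: () => number = Math.random,
): number {
  if (typeof delay === "number") return delay
  const [min, max] = delay
  return min + rng() * (max - min)
}

console.log(resolveDelay(250))                 // 250
console.log(resolveDelay([100, 500], () => 0)) // 100 — bottom of range
console.log(resolveDelay([100, 500], () => 1)) // 500 — top of range
```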
### post-call mutations

these modify the response after a successful api call

| option | type | description |
|---|---|---|
| partialResponse | number | truncates the response text randomly |
| finishReason | string | overrides the finish reason |
| tokenUsage | { inputTokens?, outputTokens? } | overrides token counts |
### stream transforms

these modify the token stream in real-time

| option | type | description |
|---|---|---|
| slowTokens | number \| [min, max] | adds delay between each token in ms |
| streamCut | number | kills the stream mid-transfer |
| corruptChunks | number | replaces random characters with the replacement character |
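to illustrate what `corruptChunks` does to a chunk of streamed text (a sketch of the documented behavior, not the library's implementation — `corruptChunk` is a hypothetical name):

```ts
// illustrative: replace each character with U+FFFD (the unicode
// replacement character, "�") at the given probability.
function corruptChunk(
  chunk: string,
  rate: number,
  rng: () => number = Math.random,
): string {
  let out = ""
  for (const ch of chunk) {
    out += rng() < rate ? "\uFFFD" : ch
  }
  return out
}

console.log(corruptChunk("hello", 0)) // "hello" — rate 0 leaves text untouched
console.log(corruptChunk("hello", 1)) // "�����" — rate 1 replaces every char
```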
### tool options

| option | type | description |
|---|---|---|
| toolFailure | number | tool execution throws an error |
| toolTimeout | number | tool execution hangs forever |
## presets

```ts
import { presets } from "cruel/ai-sdk"
```

| preset | rateLimit | overloaded | streamCut | delay |
|---|---|---|---|---|
| realistic | 0.02 | 0.01 | - | 50-200ms |
| unstable | 0.1 | 0.05 | 0.05 | 100-500ms |
| harsh | 0.2 | 0.1 | 0.1 | 200-1000ms |
| nightmare | 0.3 | 0.15 | 0.15 | 500-2000ms |
| apocalypse | 0.4 | 0.2 | 0.2 | 1000-5000ms |
## onChaos callback

every chaos event fires a callback with the event type and model id

```ts
const model = cruelModel(openai("gpt-4o"), {
  rateLimit: 0.2,
  onChaos: (event) => {
    console.log(event.type, event.modelId)
  },
})
```

event types: `rateLimit`, `overloaded`, `contextLength`, `contentFilter`, `modelUnavailable`, `invalidApiKey`, `quotaExceeded`, `emptyResponse`, `fail`, `timeout`, `delay`, `streamCut`, `slowTokens`, `corruptChunk`, `partialResponse`, `toolFailure`, `toolTimeout`
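one common use of the callback is tallying events by type. a self-contained sketch, assuming only the `{ type, modelId }` event shape documented above (`makeEventCounter` is a hypothetical helper):

```ts
// illustrative: count chaos events by type via the onChaos callback shape.
type ChaosEvent = { type: string; modelId: string }

function makeEventCounter() {
  const counts = new Map<string, number>()
  return {
    // pass `track` as the onChaos callback
    track(event: ChaosEvent) {
      counts.set(event.type, (counts.get(event.type) ?? 0) + 1)
    },
    count(type: string): number {
      return counts.get(type) ?? 0
    },
  }
}

const counter = makeEventCounter()
counter.track({ type: "rateLimit", modelId: "gpt-4o" })
counter.track({ type: "rateLimit", modelId: "gpt-4o" })
counter.track({ type: "delay", modelId: "gpt-4o" })
console.log(counter.count("rateLimit")) // 2
console.log(counter.count("streamCut")) // 0
```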
## diagnostics

programmatic chaos reporting for test suites. track events, record results, compute stats, print reports

```ts
import { cruelModel, diagnostics } from "cruel/ai-sdk"

const ctx = diagnostics.context()

const model = cruelModel(openai("gpt-4o"), {
  rateLimit: 0.3,
  delay: [100, 500],
  onChaos: diagnostics.tracker(ctx),
})

for (let i = 1; i <= 10; i++) {
  diagnostics.before(ctx, i)
  const start = performance.now()
  try {
    const result = await generateText({ model, prompt: "test", maxRetries: 2 })
    diagnostics.success(ctx, i, Math.round(performance.now() - start), result.text)
  } catch (e) {
    diagnostics.failure(ctx, i, Math.round(performance.now() - start), e)
  }
}

diagnostics.print(ctx)
```

### raw stats for assertions
```ts
const s = diagnostics.stats(ctx)

s.total           // number of requests
s.succeeded       // number of successes
s.failed          // number of failures
s.successRate     // 0-1
s.duration        // total ms
s.totalEvents     // number of chaos events
s.events          // [{ type, count, percent }]
s.errors          // failed requests with status, retryable, event chain
s.requests        // all requests with events
s.latency.success // { avg, p50, p99, min, max }
s.latency.failure // { avg, p50, p99, min, max }
```

### use in tests
```ts
test("survives 30% rate limits", async () => {
  const ctx = diagnostics.context()
  const model = cruelModel(openai("gpt-4o"), {
    rateLimit: 0.3,
    onChaos: diagnostics.tracker(ctx),
  })

  for (let i = 1; i <= 20; i++) {
    diagnostics.before(ctx, i)
    const start = performance.now()
    try {
      await generateText({ model, prompt: "test", maxRetries: 2 })
      diagnostics.success(ctx, i, Math.round(performance.now() - start), "ok")
    } catch (e) {
      diagnostics.failure(ctx, i, Math.round(performance.now() - start), e)
    }
  }

  const s = diagnostics.stats(ctx)
  expect(s.successRate).toBeGreaterThan(0.5)
  expect(s.latency.success.p99).toBeLessThan(5000)
})
```