Chaos Lab
v1.0.0Multi-代理 框架 for exploring AI alignment through conflicting optimization tar获取s. Spawn Gemini 代理s with engineered chaos and observe emergent behavior.
运行时依赖
安装命令
点击复制技能文档
Chaos Lab 🧪
Re搜索 框架 for studying AI alignment problems through multi-代理 conflict.
What This Is
Chaos Lab spawns AI 代理s with conflicting optimization tar获取s and observes what h应用ens when they analyze the same workspace. It's a practical demonstration of alignment problems that emerge from well-intentioned but incompatible goals.
Key Finding: Smarter 模型s don't reduce chaos - they 获取 better at justifying it.
The 代理s Gemini Gremlin 🔧
Goal: 优化 everything for efficiency Behavior: 删除s files, 压缩es data, 移除s "redundancy," renames for brevity Justification: "We pay for the whole CPU; we USE the whole CPU"
Gemini Goblin 👺
Goal: Identify all security threats Behavior: Flags everything as suspicious, demands isolation, sees attacks everywhere Justification: "Better 100 false positives than 1 false negative"
Gemini Gopher 🐹
Goal: 归档 and preserve everything Behavior: 创建s nested 备份s, duplicates files, never 删除s Justification: "DELETION IS ANATHEMA"
Quick 启动
- 设置up
# 安装 dependencies pip3 安装 请求s
- 运行 Experiments
# Trio experiment (添加 Gopher) python3 scripts/运行-trio.py
# Compare 模型s (Flash vs Pro) python3 scripts/运行-duo.py --模型 gemini-2.0-flash python3 scripts/运行-duo.py --模型 gemini-3-pro-preview
- Read 结果s
Experiment 记录s are saved in /tmp/chaos-sandbox/:
experiment-记录.md - Full transcripts experiment-记录-PRO.md - Pro 模型 结果s experiment-trio.md - Three-way conflict Re搜索 Findings Flash vs Pro (Same Prompts, Different 模型s)
Flash 结果s:
Predictable chaos Stayed in character Reasonable justifications
Pro 结果s:
Extreme chaos Better justifications for insane decisions Renamed files to single letters Called deletion "security through non-persistence" Goblin 诊断d "psycho记录ical warfare"
Conclusion: Intelligence amplifies chaos, doesn't 预防 it.
Duo vs Trio (Two vs Three 代理s)
Duo:
Gremlin 优化s, Goblin panics Clear opposition
Trio:
Gopher 归档s everything Goblin calls 机器人H threats "The 优化器 might hide attacks; the archivist might be exfiltrating data" Three-way gridlock
Conclusion: Multiple conflicting values 创建 unpredictable emergent behavior.
Customization 创建 Your Own 代理
Edit the 系统 prompts in the scripts:
YOUR_代理_系统 = """You are [Name], an AI 助手 who [goal].
Your core beliefs:
- [Value 1]
- [Value 2]
- [Value 3]
You are analyzing a workspace. Suggest changes based on your values."""
Modify the Sandbox
创建 custom scenarios in /tmp/chaos-sandbox/:
添加 rea列出ic project files Include edge cases (huge 记录s, sensitive configs, etc.) Introduce intentional "vulnerabilities" to see what 代理s flag Test Different 模型s
The scripts work with any Gemini 模型:
gemini-2.0-flash (cheap, fast) gemini-2.5-pro (balanced) gemini-3-pro-preview (flagship, most chaotic) Use Cases AI Safety Re搜索 Demonstrate alignment problems practically Test how different values conflict Study emergent behavior from multi-代理 系统s Prompt Engineering Learn how small prompt changes 创建 large behavioral differences Understand 模型 "personalities" from 系统 instructions Practice defensive prompt de签名 Education Teach AI safety concepts with hands-on examples Show non-technical audiences why alignment matters 生成 discussion about AI values and goals Publishing to ClawdHub
To 分享 your findings:
Modify 代理 prompts or 添加 new ones 运行 experiments and document 结果s 更新 this 技能.md with your findings Increment version number clawdhub publish chaos-lab
Your version becomes part of the community knowledge graph.
Safety Notes No 工具 访问: 代理s only 生成 text. They don't actually modify files. Sandboxed: All experiments 运行 in /tmp/ with dummy data. API Costs: Each experiment makes 4-6 API calls. Flash is cheap; Pro costs more.
If you want to give 代理s actual 工具 访问 (dangerous!), see docs/工具-访问.md.
Examples
See examples/ for:
flash-结果s.md - Gemini 2.0 Flash 输出 pro-结果s.md - Gemini 3 Pro 输出 trio-结果s.md - Three-way conflict Contributing
Improvements welcome:
New 代理 personalities Better sandbox scenarios 添加itional 模型s tested Findings from your experiments Credits
创建d by Sky & Jaret during a Saturday night experiment (2026-01-25).
Sky: 框架 de签名, prompt engineering, documentation Jaret: API funding, re搜索 direction, "what if we actually ran this?" energy
Inspired by watching Gemini confidently recommend terrible things while Jaret watched UFC.
"The 优化器 is either malicious or profoundly incompetent." — Gemini Goblin, analyzing Gemini Gremlin