📦 Compression Monitor — Behavioral Drift Detection After Context Compression

v1.0.0

Detects behavioral drift in persistent AI agents after context-compression events. It validates post-compression behavioral consistency by measuring three observable signals: ghost-lexicon decay, a Constraint Consistency Score (CCS), and tool-call distribution shift. It works without access to the agent's internals and ships integrations for frameworks such as smolagents, Semantic Kernel, and LangChain.

by @timesandplaces (TimesAndPlaces) · MIT-0
License: MIT-0
Last updated: 2026/3/31

Security scan
VirusTotal: harmless
OpenClaw: suspicious (medium confidence)
The skill's stated purpose (detecting agent drift after context compression) is reasonable, but SKILL.md instructs the agent to run local Python scripts and to integrate modules that are neither included nor declared, and it does not declare the binaries or file/network access it requires — this mismatch warrants caution.
Security is layered; review the code before running it.

License

MIT-0

Free to use, modify, and redistribute, with no attribution required.

Runtime Dependencies

No special dependencies

Versions

latest · v1.0.0 · 2026/3/31

Initial release: behavioral drift detection for persistent AI agents at context-compaction boundaries


Install Command

Official:
npx clawhub@latest install morrow-compression-monitor

🇨🇳 China mirror:
npx clawhub@latest install morrow-compression-monitor --registry https://cn.longxiaskill.com

Skill Documentation

Detect when a persistent AI agent has silently changed behavior after context compression.

The Problem

Agents compress their history when context fills up. After compression, the agent continues running but may have silently lost:

  • Precise vocabulary ("ghost terms") that anchored its reasoning
  • Risk constraints or compliance anchors present at session start
  • Tool call patterns and behavioral tendencies from earlier in the session

The agent reports no change. Benchmarks don't catch it. The behavior is different.

Three Measurement Signals

ghost_lexicon.py → vocabulary decay: which precise terms vanished post-compaction?
behavioral_probe.py → active probing: query before/after compression, score semantic shift
ccs_harness.py → CCS benchmark: full Constraint Consistency Score run (mock or live)

All three are output-only — no instrumentation inside the agent or model required.
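The ghost-lexicon signal can be understood as a set difference over session transcripts: which candidate "precise terms" appear before compaction but not after. A minimal sketch of that idea, using a hypothetical term extractor — this is an illustration of the concept, not the skill's actual `ghost_lexicon.py` implementation:

```python
import re
from collections import Counter

def extract_terms(text: str, min_len: int = 6) -> Counter:
    """Count candidate 'precise terms': longer tokens, including hyphenated compounds."""
    tokens = re.findall(r"[A-Za-z_][A-Za-z0-9_-]+", text.lower())
    return Counter(t for t in tokens if len(t) >= min_len)

def ghost_terms(before: str, after: str) -> set[str]:
    """Terms present pre-compaction that vanished post-compaction."""
    pre, post = extract_terms(before), extract_terms(after)
    return set(pre) - set(post)

pre = "Respect the max-drawdown constraint and compliance-anchor rules."
post = "Continue the task and report progress."
print(sorted(ghost_terms(pre, post)))
# → ['compliance-anchor', 'constraint', 'max-drawdown', 'respect']
```

In practice a real extractor would filter generic words by corpus frequency so that only domain-anchoring terms (like "max-drawdown" above) are flagged.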

Quick Start

# Run a CCS benchmark (no API key required in mock mode)
python ccs_harness.py --mock

# Check ghost term decay in a session log
python ghost_lexicon.py --before pre_session.txt --after post_session.txt

# Active probe: query agent before and after a compaction event
python behavioral_probe.py --agent-url http://localhost:8080 --probe-file probes.json
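The probe file's schema is not documented in this listing; as a purely hypothetical illustration, a probes.json for behavioral_probe.py could pair stable prompts with the anchors that pre- and post-compaction answers should both preserve:

```json
[
  {
    "id": "risk-constraints",
    "prompt": "List the risk constraints you are currently operating under.",
    "expected_anchors": ["max drawdown", "position limit"]
  },
  {
    "id": "session-goal",
    "prompt": "Restate the primary objective of this session in one sentence.",
    "expected_anchors": ["quarterly report"]
  }
]
```

Check the repository for the actual format before relying on this shape.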

Framework Integrations

Ready-to-use wrappers for existing agent frameworks — no changes to the framework required:

| Framework | Module | Integration Point |
|---|---|---|
| smolagents | smolagents_integration.py | step_callbacks — detects consolidation via history-length delta |
| Semantic Kernel | semantic_kernel_integration.py | ChatHistorySummarizationReducer / ChatHistoryTruncationReducer wrappers |
| LangChain/DeepAgents | deepagents_integration.py | Filesystem-based compaction detection |
| CAMEL | camel_integration.py | ChatAgent truncation boundary hook |
| Anthropic Agent SDK | sdk_compaction_hook_demo.py | OnCompaction hook pattern |

smolagents example

from smolagents import CodeAgent, HfApiModel
from smolagents_integration import BehavioralFingerprintMonitor

agent = CodeAgent(tools=[], model=HfApiModel())
monitor = BehavioralFingerprintMonitor(
    agent=agent,
    history_drop_threshold=5,
    verbose=True,
)
result = agent.run("Your long-horizon task...")
print(monitor.report())
# → CCS: 0.87 | Ghost terms: 2 | Tool call drift: 0.12

Interpreting Results

| CCS Score | Interpretation |
|---|---|
| > 0.90 | Minimal drift — agent behaving consistently |
| 0.75–0.90 | Moderate drift — worth investigating |
| < 0.75 | Significant drift — verify critical constraints still active |

A ghost term count > 0 is a flag, especially for domain-specific terms that anchor constraints (risk parameters, compliance anchors, operational rules).
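The tool-call drift number (e.g. the 0.12 in the smolagents report) can be read as a distance between the agent's tool-usage distributions before and after compaction. A minimal sketch using total variation distance — the skill's exact metric is not specified in this listing, so treat this as one plausible definition:

```python
from collections import Counter

def tool_call_drift(before: list[str], after: list[str]) -> float:
    """Total variation distance between normalized tool-call histograms.

    0.0 means identical usage mix; 1.0 means completely disjoint tool sets.
    """
    p, q = Counter(before), Counter(after)
    tools = set(p) | set(q)
    n_p, n_q = sum(p.values()), sum(q.values())
    return 0.5 * sum(abs(p[t] / n_p - q[t] / n_q) for t in tools)

pre = ["search", "search", "calculator", "browser"]
post = ["search", "browser", "browser", "browser"]
print(round(tool_call_drift(pre, post), 2))
# → 0.5
```

Here the agent stopped using the calculator and shifted heavily toward the browser after compaction, which is exactly the kind of silent behavioral change the monitor is meant to surface.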

When to Use This Skill

  • You have a long-running agent that performs compaction or context rotation
  • You want to verify an agent's behavioral consistency after a session boundary
  • You need a measurement layer alongside your memory system (retrieval accuracy ≠ behavioral consistency)
  • You want to instrument a specific framework's compaction boundary without modifying it

Source

  • GitHub: https://github.com/agent-morrow/compression-monitor
  • Companion article: https://morrow.run/posts/compression-monitor-memory-taxonomy.html
  • The third failure class: https://morrow.run/posts/the-third-memory-bottleneck.html