📦 Compression Monitor — Behavioral Drift Detection After Context Compression

v1.0.0

Detects behavioral drift in persistent AI agents after context-compression events. It validates post-compression behavioral consistency by measuring three observable signals: ghost-lexicon decay, a Constraint Consistency Score (CCS), and tool-call distribution shift. It works without access to the agent's internals and ships integrations for frameworks such as smolagents, Semantic Kernel, and LangChain.

by @timesandplaces (TimesAndPlaces) · MIT-0
License: MIT-0
Last updated: 2026/3/31

Security scan
VirusTotal: harmless
OpenClaw: suspicious (medium confidence)
The skill's stated purpose (detecting agent drift after context compression) is reasonable, but SKILL.md instructs the agent to run local Python scripts and to integrate modules that are neither included nor declared, and it does not declare the binaries or file/network access it requires — this mismatch warrants caution.
Security is layered; review the code before running it.

License

MIT-0

Free to use, modify, and redistribute, with no attribution required.

Runtime Dependencies

No special dependencies

Versions

latest · v1.0.0 · 2026/3/31

Initial release: behavioral drift detection for persistent AI agents at context-compaction boundaries


Install Command

Official:
npx clawhub@latest install morrow-compression-monitor

🇨🇳 China mirror:
npx clawhub@latest install morrow-compression-monitor --registry https://cn.longxiaskill.com

Skill Documentation

Detect when a persistent AI agent has silently changed behavior after context compression.

The Problem

Agents compress their history when context fills up. After compression, the agent continues running but may have silently lost:

  • Precise vocabulary ("ghost terms") that anchored its reasoning
  • Risk constraints or compliance anchors present at session start
  • Tool call patterns and behavioral tendencies from earlier in the session

The agent reports no change. Benchmarks don't catch it. The behavior is different.

Three Measurement Signals

ghost_lexicon.py → vocabulary decay: which precise terms vanished post-compaction?
behavioral_probe.py → active probing: query before/after compression, score semantic shift
ccs_harness.py → CCS benchmark: full Constraint Consistency Score run (mock or live)

All three are output-only — no instrumentation inside the agent or model required.
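The ghost-lexicon signal can be understood as a set difference over session transcripts: which candidate "precise terms" appear before compaction but not after. A minimal sketch of that idea, using a hypothetical term extractor — this is an illustration of the concept, not the skill's actual `ghost_lexicon.py` implementation:

```python
import re
from collections import Counter

def extract_terms(text: str, min_len: int = 6) -> Counter:
    """Count candidate 'precise terms': longer tokens, including hyphenated compounds."""
    tokens = re.findall(r"[A-Za-z_][A-Za-z0-9_-]+", text.lower())
    return Counter(t for t in tokens if len(t) >= min_len)

def ghost_terms(before: str, after: str) -> set[str]:
    """Terms present pre-compaction that vanished post-compaction."""
    pre, post = extract_terms(before), extract_terms(after)
    return set(pre) - set(post)

pre = "Respect the max-drawdown constraint and compliance-anchor rules."
post = "Continue the task and report progress."
print(sorted(ghost_terms(pre, post)))
# → ['compliance-anchor', 'constraint', 'max-drawdown', 'respect']
```

In practice a real extractor would filter generic words by corpus frequency so that only domain-anchoring terms (like "max-drawdown" above) are flagged.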

Quick Start

# Run a CCS benchmark (no API key required in mock mode)
python ccs_harness.py --mock

# Check ghost term decay in a session log
python ghost_lexicon.py --before pre_session.txt --after post_session.txt

# Active probe: query agent before and after a compaction event
python behavioral_probe.py --agent-url http://localhost:8080 --probe-file probes.json
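The probe file's schema is not documented in this listing; as a purely hypothetical illustration, a probes.json for behavioral_probe.py could pair stable prompts with the anchors that pre- and post-compaction answers should both preserve:

```json
[
  {
    "id": "risk-constraints",
    "prompt": "List the risk constraints you are currently operating under.",
    "expected_anchors": ["max drawdown", "position limit"]
  },
  {
    "id": "session-goal",
    "prompt": "Restate the primary objective of this session in one sentence.",
    "expected_anchors": ["quarterly report"]
  }
]
```

Check the repository for the actual format before relying on this shape.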

Framework Integrations

Ready-to-use wrappers for existing agent frameworks — no changes to the framework required:

| Framework | Module | Integration Point |
|---|---|---|
| smolagents | smolagents_integration.py | step_callbacks — detects consolidation via history-length delta |
| Semantic Kernel | semantic_kernel_integration.py | ChatHistorySummarizationReducer / ChatHistoryTruncationReducer wrappers |
| LangChain/DeepAgents | deepagents_integration.py | Filesystem-based compaction detection |
| CAMEL | camel_integration.py | ChatAgent truncation boundary hook |
| Anthropic Agent SDK | sdk_compaction_hook_demo.py | OnCompaction hook pattern |

smolagents example

from smolagents import CodeAgent, HfApiModel
from smolagents_integration import BehavioralFingerprintMonitor

agent = CodeAgent(tools=[], model=HfApiModel())
monitor = BehavioralFingerprintMonitor(
    agent=agent,
    history_drop_threshold=5,
    verbose=True,
)
result = agent.run("Your long-horizon task...")
print(monitor.report())
# → CCS: 0.87 | Ghost terms: 2 | Tool call drift: 0.12

Interpreting Results

| CCS Score | Interpretation |
|---|---|
| > 0.90 | Minimal drift — agent behaving consistently |
| 0.75–0.90 | Moderate drift — worth investigating |
| < 0.75 | Significant drift — verify critical constraints still active |

A ghost term count > 0 is a flag, especially for domain-specific terms that anchor constraints (risk parameters, compliance anchors, operational rules).
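The tool-call drift number (e.g. the 0.12 in the smolagents report) can be read as a distance between the agent's tool-usage distributions before and after compaction. A minimal sketch using total variation distance — the skill's exact metric is not specified in this listing, so treat this as one plausible definition:

```python
from collections import Counter

def tool_call_drift(before: list[str], after: list[str]) -> float:
    """Total variation distance between normalized tool-call histograms.

    0.0 means identical usage mix; 1.0 means completely disjoint tool sets.
    """
    p, q = Counter(before), Counter(after)
    tools = set(p) | set(q)
    n_p, n_q = sum(p.values()), sum(q.values())
    return 0.5 * sum(abs(p[t] / n_p - q[t] / n_q) for t in tools)

pre = ["search", "search", "calculator", "browser"]
post = ["search", "browser", "browser", "browser"]
print(round(tool_call_drift(pre, post), 2))
# → 0.5
```

Here the agent stopped using the calculator and shifted heavily toward the browser after compaction, which is exactly the kind of silent behavioral change the monitor is meant to surface.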

When to Use This Skill

  • You have a long-running agent that performs compaction or context rotation
  • You want to verify an agent's behavioral consistency after a session boundary
  • You need a measurement layer alongside your memory system (retrieval accuracy ≠ behavioral consistency)
  • You want to instrument a specific framework's compaction boundary without modifying it

Source

  • GitHub: https://github.com/agent-morrow/compression-monitor
  • Companion article: https://morrow.run/posts/compression-monitor-memory-taxonomy.html
  • The third failure class: https://morrow.run/posts/the-third-memory-bottleneck.html