Scientific Chinese Character Recognition Assessment (汉字识字量科学检测)

v1.0.4

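Designed for children aged 3-12. Uses stratified sampling and a dynamic fuse (early stop) to estimate how many of 2,500 high-frequency Chinese characters a child recognizes; supports both literacy assessment and review practice.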


by @wangwang4git (Wang Wang)

Domestic (China) service

Use case: Chinese-language content processing on platforms in China


Last updated: 2026/4/13

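Security scan

VirusTotal: harmless

OpenClaw: safe (high confidence)

The skill's code, resources, and runtime instructions are consistent with a children's literacy testing tool. It does not request unrelated credentials, install anything from external networks, or escalate persistence.

Assessment recommendations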

This skill appears to be what it says: a dialog-based Chinese-character recognition tester that uses a bundled 2500-character dataset and local Python utilities. Before installing: (1) note that it reads and uses the included assets (the character JSON ~500KB) and local scripts — review them if you want to be extra cautious; (2) it does not request or require any cloud credentials or network access, so no obvious exfiltration vectors are present in the provided files; (3) the skill can run auton...

Detailed analysis

✓ Purpose and capabilities

The name/description (汉字识字量检测) match the included assets (2500-character JSON), sampling algorithm, and scripts (generate_test_sequence.py, calculate_score.py, validators). All requested resources are local and relevant to producing the described tests and reports.

✓ Instruction scope

SKILL.md confines runtime behaviour to dialog prompts, reading the bundled assets/references, generating sampled test batches, tracking consecutiveUnknown and fuse conditions, and producing a report. It does not instruct reading unrelated system files, environment variables, or sending data to external endpoints. The included QR-code is ASCII art for a suggested mini-program and does not contain an external URL or directive to exfiltrate data.

✓ Install mechanism

There is no install spec and no network/download/install actions. Provided Python scripts operate on local files. No external package fetching, untrusted URLs, or archive extraction are present.

✓ Credential requirements

The skill declares no required environment variables, credentials, or config paths. All data access is limited to packaged assets (assets/top_2500_chars_with_words.json) and local reference docs; requested inputs are user-provided (age, answers).

✓ Persistence and permissions

Flags: always:false and normal autonomous invocation are present. The skill does not request permanent platform-wide privileges or attempt to modify other skills' configs. It only includes local scripts and assets and does not persist credentials or change system settings.

Security is layered; review the code before running it.

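Runtime dependencies

No special dependencies

Version

latest · v1.0.4 · 2026/4/11

No changes were detected in this version. Compared with the previous version, no file changes were found; functionality and documentation are unchanged.

● Harmless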

Install command

Official: npx clawhub@latest install chinese-literacy-detection

Mirror: npx clawhub@latest install chinese-literacy-detection --registry https://cn.longxiaskill.com (mirror available)

Localization notes

Installation: npx clawhub@latest install chinese-literacy-detection. This skill is intended for general use in mainland China and may require a corresponding platform account or API key.


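Skill documentation

Runs a scientific character-recognition assessment for children (ages 3-12) through dialog. Using stratified sampling plus a dynamic fuse, it tests at most 175 of the 2,500 high-frequency characters to estimate how many characters the child recognizes.

Data source: assets/top_2500_chars_with_words.json (2,500 entries with rank_id/char/words/frequency fields; each character comes with 2 common words)

Reference documents (read as needed):

references/algorithm-spec.md: mathematical specification of the algorithm, Fisher-Yates shuffle, precision analysis
references/data-schema.md: data source field descriptions and statistical characteristics
references/chatbot-workflow.md: dialog templates, output formats, rules for parsing user replies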

---

Core configuration

```javascript
const LEVEL_CONFIGS = [
  { level: 1, name: '核心字', rankStart: 1, rankEnd: 50, testCount: 50, weight: 1 },       // core characters
  { level: 2, name: '常用字', rankStart: 51, rankEnd: 200, testCount: 50, weight: 3 },     // common characters
  { level: 3, name: '扩展字', rankStart: 201, rankEnd: 500, testCount: 30, weight: 10 },   // extended characters
  { level: 4, name: '进阶字', rankStart: 501, rankEnd: 1000, testCount: 25, weight: 20 },  // intermediate characters
  { level: 5, name: '提高字', rankStart: 1001, rankEnd: 1500, testCount: 10, weight: 50 }, // advanced characters
  { level: 6, name: '拓展字', rankStart: 1501, rankEnd: 2500, testCount: 10, weight: 100 },// expansion characters
];

const FUSE_CONFIG = {
  consecutiveUnknownLimit: 5,   // consecutive unknown answers that trigger the fuse
  errorRateLimit: 0.8,          // error rate that triggers the fuse
  minTestCountForErrorRate: 5,  // minimum sample size before the error rate is evaluated
};
```

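Recognition estimate formula: W = Σ(Nᵢ × Wᵢ), where Nᵢ is the number of characters recognized in level i and Wᵢ is that level's weight. Expanded: W = N₁×1 + N₂×3 + N₃×10 + N₄×20 + N₅×50 + N₆×100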
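The formula can be checked with a few lines of Python. This is a minimal sketch, not the bundled calculate_score.py: the names `LEVEL_WEIGHTS` and `estimate_recognized` are hypothetical, and the per-level counts in the example are made up for illustration.

```python
# Weights copied from LEVEL_CONFIGS; each weight equals (characters in level) / testCount.
LEVEL_WEIGHTS = {1: 1, 2: 3, 3: 10, 4: 20, 5: 50, 6: 100}


def estimate_recognized(known_per_level):
    """Return W = sum(N_i * W_i) for a {level: recognized_count} mapping."""
    return sum(LEVEL_WEIGHTS[level] * count for level, count in known_per_level.items())


# Hypothetical result: all 50 core characters known, fewer in each later level.
example = {1: 50, 2: 45, 3: 20, 4: 10, 5: 2, 6: 0}
print(estimate_recognized(example))  # 50 + 135 + 200 + 200 + 100 + 0 = 685
```

Because each weight is the level's size divided by its testCount, a child who recognizes every sampled character scores exactly 2,500.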

---

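Test flow

Step 0: Guidance → Step 1: Collect information → Step 2: Explain → Step 3: Present questions → Step 4: Fuse check → Step 5: Report → Step 6: Follow-up interaction

Step 0: Guidance (shown on every launch)

Before the test formally begins, show the user the mini program guidance. The dialog-based test has limitations (replies must be typed by hand), while the mini program offers a smoother touch-screen experience (the child can tap the answers directly). Showing this notice up front lets the user choose whichever option suits them best.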

Guidance template: `💡 Friendly reminder: for the full feature set, scan the QR code below to open the WeChat mini program: [ASCII-art QR code; its rows and spacing were flattened in extraction and cannot be reproduced here] Of course, you can also continue with the dialog-based test here; it is just as accurate 😊`

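Display rules:

Every time the literacy assessment skill is triggered, show this guidance before the formal flow (collecting the age, and so on)

After showing the guidance, move straight on to the information collection in Step 1; no extra confirmation from the user is needed

Output the QR code as monospaced ASCII art so that it displays correctly in all kinds of terminals and chat interfaces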

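Step 1 & 2: Start

Collect the child's age (required, 3-12), briefly explain the test method, then begin.

Step 3: Present questions in batches

Read the data from assets/top_2500_chars_with_words.json and build the test sequence by stratified sampling across the levels.

The total number of questions in each level is exactly that level's testCount (L1: 50, L2: 50, L3: 30, L4: 25, L5: 10, L6: 10). Each batch contains min(10, remaining unasked questions in that level) questions.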
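The sampling and batching rules can be sketched as follows. This is an illustration under stated assumptions, not the bundled generate_test_sequence.py: the function names are hypothetical, the entries are stand-in data that only mimic the `rank_id` field of the JSON file, and the order of characters within a level is assumed to be the shuffled order.

```python
import random

# Subset of LEVEL_CONFIGS needed for sampling (snake_case keys are an assumption).
LEVEL_CONFIGS = [
    {"level": 1, "rank_start": 1, "rank_end": 50, "test_count": 50},
    {"level": 2, "rank_start": 51, "rank_end": 200, "test_count": 50},
    {"level": 3, "rank_start": 201, "rank_end": 500, "test_count": 30},
    {"level": 4, "rank_start": 501, "rank_end": 1000, "test_count": 25},
    {"level": 5, "rank_start": 1001, "rank_end": 1500, "test_count": 10},
    {"level": 6, "rank_start": 1501, "rank_end": 2500, "test_count": 10},
]


def fisher_yates_shuffle(items, rng):
    """Return a shuffled copy using the classic Fisher-Yates swap loop."""
    items = list(items)
    for i in range(len(items) - 1, 0, -1):
        j = rng.randint(0, i)  # inclusive on both ends
        items[i], items[j] = items[j], items[i]
    return items


def sample_level(entries, cfg, rng):
    """Pick exactly test_count entries whose rank falls inside the level's range."""
    pool = [e for e in entries if cfg["rank_start"] <= e["rank_id"] <= cfg["rank_end"]]
    return fisher_yates_shuffle(pool, rng)[: cfg["test_count"]]


def batches(sequence, batch_size=10):
    """Yield batches of min(batch_size, remaining) items."""
    for start in range(0, len(sequence), batch_size):
        yield sequence[start:start + batch_size]


# Stand-in data; the real file also carries char/words/frequency fields.
entries = [{"rank_id": rank} for rank in range(1, 2501)]
rng = random.Random(42)
level4 = sample_level(entries, LEVEL_CONFIGS[3], rng)
sizes = [len(batch) for batch in batches(level4)]
print(sizes)  # [10, 10, 5]
```

Level 4 has testCount 25, so its last batch holds only 5 characters, which is the min(10, remaining) rule in action.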
---  
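Question display format

Present each group of characters as a three-column table (No. | Character | Words). This format matters for three reasons:

Words aid recall by association: a young child may find a character unfamiliar on its own but recognize it on seeing it in a word ("oh, I've seen this one"). The words provide context cues, so the result reflects the child's real literacy level more accurately.

The table structure makes pointing easier: when a parent is helping, row-by-row structure is easier to point through with the child than plain text, and avoids losing one's place among many characters.

Numbers make replies easy to locate: the child or parent can simply say "number 3 is unfamiliar" without typing the character, which lowers the barrier to use.

Format requirements:

Use a Markdown table with the three columns "No. | Character | Words"

Take the words from the data source's words field, separate multiple words with the enumeration comma 「、」, and include at least 1 word per row

Render the character column in bold (**字**) so the child can pick it out easily

Numbers increase continuously across groups within the same level (group 1 is 1-10, group 2 is 11-20, and so on), because a reply such as "number N is unfamiliar" must be resolved against the level-wide number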

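Question template:

```
📝 [L{level}·{name}] Group N ({globalStart}-{globalEnd} of {testCount} characters)

| No. | Character | Words |
|:----:|:----:|------|
| 1 | 的 | 好的、是的 |
| 2 | 一 | 一个、一样 |
| ... | ... | ... |
| 10 | 不 | 不要、不去 |

👆 Which of the characters above are unfamiliar? If all are known, reply "都认识" (I know them all)
```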

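Formats to avoid (they degrade the test experience):

Plain-text runs such as 的一是了不: children easily lose track, and there are no words to aid association

Tables without the words column: the contextual cue is lost

Restarting the numbering at 1 for each group: when the user says "number 3 is unfamiliar," it is impossible to tell which group is meant

→ For detailed question examples and the rules for parsing user replies, see Section 2 of references/chatbot-workflow.md.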
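The format requirements above can be illustrated with a small renderer. This is a hedged sketch: `render_batch` and its parameters are hypothetical names, the header wording is illustrative, and the two sample entries only mimic the `char` and `words` fields of the data source.

```python
def render_batch(level, name, test_count, batch, global_start):
    """Render one batch as a three-column Markdown table with level-wide numbering."""
    global_end = global_start + len(batch) - 1
    group_no = (global_start - 1) // 10 + 1  # batches hold up to 10 characters
    lines = [
        f"📝 [L{level}·{name}] Group {group_no} "
        f"({global_start}-{global_end} of {test_count} characters)",
        "",
        "| No. | Character | Words |",
        "|:----:|:----:|------|",
    ]
    for offset, entry in enumerate(batch):
        words = "、".join(entry["words"])  # words separated by the enumeration comma
        # The number continues from global_start instead of restarting at 1.
        lines.append(f"| {global_start + offset} | **{entry['char']}** | {words} |")
    return "\n".join(lines)


batch = [
    {"char": "的", "words": ["好的", "是的"]},
    {"char": "一", "words": ["一个", "一样"]},
]
table = render_batch(1, "核心字", 50, batch, global_start=11)
print(table)
```

Passing `global_start=11` shows the second group of a level continuing at 11, so a reply such as "number 12 is unfamiliar" identifies exactly one character.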

---

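Fuse check (run after each batch of replies)

The core purpose of the fuse mechanism is to protect the child's feelings: a run of unfamiliar characters causes frustration and anxiety. Stopping the test in time protects the child's mood, and because there is already enough statistical evidence for an accurate estimate, it does not hurt the quality of the result.

Step 4.1: State tracking

After receiving the user's reply, first output a status block in the response. This keeps the character-by-character processing from going wrong (especially the consecutiveUnknown update):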

  
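📋 State tracking:
Current level: L{level}·{name}
This batch: numbers {globalStart}-{globalEnd}, {N} characters ({char1}, {char2}, ...)
User reply: {parsed result}
Per-character processing: {char1}({known/unknown}, consecutive={N}) → {char2}(...) → ...
consecutiveUnknown: {previous value} → {current value}
Tested in this level / testCount: {tested}/{testCount}
Unknown in this level: {unknown}
🔍 Fuse check A: consecutiveUnknown({value}) ≥ 5 → {yes/no}
🔍 Fuse check B: tested({tested}) ≥ 5 and error rate({unknown}/{tested}={rate}%) ≥ 80% → {yes/no}
✅ Conclusion: {continue with questions / 🔴 fuse triggered}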
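The two checks in the status block can be sketched in Python. This is an illustration only, with hypothetical names (`LevelState`, `apply_batch`, `should_fuse`). It assumes that a known character resets consecutiveUnknown to 0 and an unknown one increments it, and that both checks are evaluated once per batch after all replies are processed; the source does not confirm either detail.

```python
from dataclasses import dataclass

# Thresholds copied from FUSE_CONFIG.
CONSECUTIVE_UNKNOWN_LIMIT = 5
ERROR_RATE_LIMIT = 0.8
MIN_TEST_COUNT_FOR_ERROR_RATE = 5


@dataclass
class LevelState:
    tested: int = 0
    unknown: int = 0
    consecutive_unknown: int = 0


def apply_batch(state, answers):
    """Update the state in question order; answers holds True for known characters."""
    for known in answers:
        state.tested += 1
        if known:
            state.consecutive_unknown = 0  # assumption: a known character resets the run
        else:
            state.unknown += 1
            state.consecutive_unknown += 1
    return state


def should_fuse(state):
    """Check A (consecutive unknowns) or check B (error rate with enough samples)."""
    check_a = state.consecutive_unknown >= CONSECUTIVE_UNKNOWN_LIMIT
    check_b = (
        state.tested >= MIN_TEST_COUNT_FOR_ERROR_RATE
        and state.unknown / state.tested >= ERROR_RATE_LIMIT
    )
    return check_a or check_b


# Five known characters followed by five unknown ones: check A fires, check B does not.
state = apply_batch(LevelState(), [True] * 5 + [False] * 5)
print(state.consecutive_unknown, should_fuse(state))  # 5 True
```

Under these assumptions, a run of five unknowns that is followed by known characters inside the same batch would not trigger check A at the end of that batch; whether the skill tracks that case differently is not stated in the text.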

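Step 4.2: Update consecutiveUnknown character by character

Process the characters one at a time in the order they were presented (the order matters, because consecutiveUnknown must be updated according to "known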
