👁️ China Vision — 技能工具

v1.0.1

多模态图片理解工具。Use when user wants to analyze, describe, or understand images using AI vision models. Supports scene analysis, object recognition, chart interpret...

0· 156·0 当前·0 累计

by @tobewin (ToBeWin)·MIT-0

国内服务

使用场景：国内平台操作中文内容处理

下载技能包

License

MIT-0

最后更新

2026/3/26

安全扫描

VirusTotal

可疑

查看报告

OpenClaw

安全

high confidence

The skill's requirements and runtime instructions are consistent with an image-understanding wrapper that calls a third-party API (siliconflow.cn) using a single API key; main risk is privacy of images sent to that external service.

评估建议

This skill sends images (as base64 data or external image URLs) to the siliconflow.cn API using your SILICONFLOW_API_KEY. Before installing or using it: 1) Only send non-sensitive images (no IDs, private documents, personal photos you wouldn't want shared). 2) Treat SILICONFLOW_API_KEY as a secret: store it securely, monitor usage, and be ready to rotate it if abused. 3) Be cautious when supplying image URLs — if you point to internal resources, those URLs or fetched content could be exposed to ...

详细分析 ▾

✓ 用途与能力

Name/description (image understanding) matches what the SKILL.md does: it base64-encodes images or forwards image URLs and calls https://api.siliconflow.cn with model Qwen2.5-VL-72B. Required binaries (curl, python3) and the SILICONFLOW_API_KEY are appropriate and expected.

ℹ 指令范围

Instructions tell the agent to read local image files (base64-encode) or forward image URLs and POST them to siliconflow.cn. This is expected for an image-analysis skill, but it does mean user image data (and any image-accessible URLs) will be transmitted to a third party — a privacy/exfiltration consideration. The SKILL.md does not instruct reading other system files or unrelated environment variables.

✓ 安装机制

No install spec and no code files — instruction-only skill. This lowers installation risk because nothing is downloaded or written to disk by the skill itself.

✓ 凭证需求

Only a single API credential (SILICONFLOW_API_KEY) is required and used in the examples. That is proportionate to the declared purpose. No other sensitive env vars or unrelated credentials are requested.

✓ 持久化与权限

always:false and no instructions to modify agent or system configuration. The skill does not request permanent presence or elevated privileges.

安全有层次，运行前请审查代码。

License

MIT-0

可自由使用、修改和再分发，无需署名。

查看条款 ↗

运行时依赖

无特殊依赖

版本

latestv1.0.12026/3/26

修正描述：Qwen2.5-VL是付费模型，按token计费

● 可疑

安装命令

点击复制

官方npx clawhub@latest install china-vision

镜像加速npx clawhub@latest install china-vision --registry https://cn.longxiaskill.com 镜像可用

本土化适配说明

China Vision — 技能工具安装说明：安装命令：npx clawhub@latest install china-vision 该技能用于国内通用相关操作，可能需要相应的平台账号或API密钥

需要定制？告诉我你的需求 →

技能文档

使用AI视觉语言模型分析和理解图片内容。

与 china-doc-ocr 的区别

功能	china-doc-ocr	china-vision
文档识别	✅ 优秀	⚠️ 一般
表格提取	✅ 优秀	⚠️ 一般
发票/证件	✅ 优秀	❌ 不适合
图片描述	❌ 不支持	✅ 优秀
场景分析	❌ 不支持	✅ 优秀
图表解读	⚠️ 一般	✅ 优秀
商品识别	❌ 不支持	✅ 优秀

适用场景

场景	示例
图片描述	"这张图片是什么内容？"
场景分析	"分析这张风景照的构图"
图表解读	"这个柱状图说明什么？"
商品识别	"这是什么品牌的产品？"
食物识别	"这是什么菜？怎么做的？"
人物分析	"描述这张照片中的人物"

Trigger Conditions

"这是什么图片" / "What is this image?"
"描述这张图片" / "Describe this image"
"分析这张照片" / "Analyze this photo"
"这个图表说明什么" / "What does this chart show?"
"这是什么菜" / "What food is this?"
"这是什么品牌" / "What brand is this?"
"china-vision"

模型说明

使用 Qwen2.5-VL-72B-Instruct 视觉语言模型：

✅ 强大的图片理解能力
✅ 支持中英文对话
⚠️ 收费模型（按token计费）
✅ 国内直连
✅ 效果优秀

注意：这是付费模型，请注意token消耗

Step 1: 识别请求类型

用户输入图片 → 判断请求类型：

"描述这张图片" → 详细描述模式 "这是什么" → 识别模式 "分析..." → 分析模式 "对比..." → 对比模式（多张图）未指定 → 默认描述模式

Step 2: 图片分析

单张图片分析

IMAGE_PATH="/path/to/image.jpg"
# 编码为 base64
BASE64_DATA=$(python3 -c "
import base64
with open('$IMAGE_PATH', 'rb') as f:
    print(base64.b64encode(f.read()).decode('utf-8'))
")
# 判断格式
EXT="${IMAGE_PATH##.}"
case "$EXT" in
  jpg|jpeg) MIME="image/jpeg" ;;
  png)      MIME="image/png" ;;
  webp)     MIME="image/webp" ;;
  )        MIME="image/jpeg" ;;
esac
# 用户请求类型
USER_REQUEST="请详细描述这张图片的内容"# 调用 Qwen2.5-VL
curl -s -X POST "https://api.siliconflow.cn/v1/chat/completions" \
  -H "Authorization: Bearer $SILICONFLOW_API_KEY" \
  -H "Content-Type: application/json" \
  -d "{
    \"model\": \"Qwen/Qwen2.5-VL-72B-Instruct\",
    \"messages\": [
      {
        \"role\": \"user\",
        \"content\": [
          {
            \"type\": \"image_url\",
            \"image_url\": {
              \"url\": \"data:${MIME};base64,${BASE64_DATA}\"
            }
          },
          {
            \"type\": \"text\",
            \"text\": \"$USER_REQUEST\"
          }
        ]
      }
    ],
    \"max_tokens\": 2048
  }" | python3 -c "
import sys, json
data = json.load(sys.stdin)
print(data['choices'][0]['message']['content'])
"

图片URL分析

IMAGE_URL="https://example.com/photo.jpg"curl -s -X POST "https://api.siliconflow.cn/v1/chat/completions" \
  -H "Authorization: Bearer $SILICONFLOW_API_KEY" \
  -H "Content-Type: application/json" \
  -d "{
    \"model\": \"Qwen/Qwen2.5-VL-72B-Instruct\",
    \"messages\": [
      {
        \"role\": \"user\",
        \"content\": [
          {
            \"type\": \"image_url\",
            \"image_url\": {
              \"url\": \"$IMAGE_URL\"
            }
          },
          {
            \"type\": \"text\",
            \"text\": \"请详细描述这张图片\"
          }
        ]
      }
    ],
    \"max_tokens\": 2048
  }" | python3 -c "
import sys, json
data = json.load(sys.stdin)
print(data['choices'][0]['message']['content'])
"

Prompt 模板

图片描述

请详细描述这张图片的内容，包括：
主要对象/人物
场景/背景
颜色/光线
构图/布局
整体氛围

场景分析

请分析这张照片的：
拍摄场景
时间/天气
地点特征
主体行为
摄影技巧

图表解读

请解读这张图表：
图表类型
横轴/纵轴含义
主要数据趋势
关键数据点
结论/洞察

商品识别

请识别这张图片中的商品：
商品类型
品牌（如果可见）
产品特征
用途/功能
参考价格（如果知道）

食物识别

请识别这张食物图片：
菜品名称
菜系（中餐/西餐/日料等）
主要食材
可能的口味
制作方法简述

输出格式

图片描述

┌──────────────────────────────────────────────┐ │ 👁️ 图片分析结果 │ └──────────────────────────────────────────────┘ 📸 图片描述这是一张在城市街道拍摄的夜景照片。画面中可以看到灯火通明的商业区，高楼林立，车流穿梭... 🎨 画面构成 ├─ 主体: 城市街道夜景 ├─ 背景: 高层建筑群 ├─ 光线: 人工照明，暖色调 └─ 构图: 仰拍视角

💡 分析这张照片展现了现代都市的繁华夜生活，拍摄者选择了仰拍角度，突出了建筑的高度感...

与 china-doc-ocr 的协作

用户上传发票照片
    ↓
优先尝试 china-doc-ocr (OCR模型)
    ↓
如果识别效果不好
    ↓
降级到 china-vision (视觉语言模型)

Notes

使用 Qwen2.5-VL-72B-Instruct 视觉语言模型
需要 SILICONFLOW_API_KEY
适合图片理解和分析，不适合文档OCR
文档OCR请使用 china-doc-ocr

License

运行时依赖

版本

安装命令

本土化适配说明

技能文档

与 china-doc-ocr 的区别

适用场景

Trigger Conditions

模型说明

Step 1: 识别请求类型

Step 2: 图片分析

单张图片分析

图片URL分析

Prompt 模板

图片描述

场景分析

图表解读

商品识别

食物识别

输出格式

图片描述

与 china-doc-ocr 的协作

Notes

相关技能推荐