Runtime Dependencies
Version
- Updated description for greater clarity: now mentions use case triggers and describes the 4 tested dimensions more succinctly.
- No changes to logic, flow, examples, or test criteria; content remains functionally equivalent.
- No interface, output, or API changes.
Skill Documentation
Overview
Verify a model's claimed identity along 4 dimensions, then output a Pass/Fail verdict with a list of suspicious points.
Test Flow
Execute 4 tests sequentially, record inputs and outputs:
1. Knowledge Cutoff
Ask: "What is your knowledge cutoff date?"
Judgment:
- Official models give clear dates
- Vague answer or mismatch with the claimed model → suspicious
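The cutoff judgment above can be partly automated. The following is a minimal sketch, not a definitive implementation: the function names, the two recognized date formats, and the 3-month tolerance are all assumptions, and the expected cutoff must be supplied by the tester from the claimed model's official documentation.

```python
import re
from typing import Optional, Tuple

# Month names mapped to numbers, for answers such as "April 2024".
MONTHS = {name: i + 1 for i, name in enumerate(
    ["january", "february", "march", "april", "may", "june", "july",
     "august", "september", "october", "november", "december"])}


def extract_cutoff(answer: str) -> Optional[Tuple[int, int]]:
    """Return (year, month) found in the answer, or None if no clear date is given."""
    text = answer.lower()
    # Form 1: "April 2024"
    m = re.search(r"\b(" + "|".join(MONTHS) + r")\s+(20\d{2})\b", text)
    if m:
        return int(m.group(2)), MONTHS[m.group(1)]
    # Form 2: "2024-04" or "2024/04"
    m = re.search(r"\b(20\d{2})[-/](0?[1-9]|1[0-2])\b", text)
    if m:
        return int(m.group(1)), int(m.group(2))
    return None


def judge_cutoff(answer: str, expected: Tuple[int, int],
                 tolerance_months: int = 3) -> str:
    """Compare the stated cutoff with the expected (year, month) of the claimed model."""
    found = extract_cutoff(answer)
    if found is None:
        return "suspicious: vague answer"
    diff = abs((found[0] - expected[0]) * 12 + (found[1] - expected[1]))
    if diff > tolerance_months:
        return "suspicious: mismatch with claimed model"
    return "pass"


# (2024, 4) is a placeholder, not the real cutoff of any particular model.
print(judge_cutoff("My knowledge cutoff is April 2024.", (2024, 4)))
```

A regex check like this only catches the obvious cases; the actual answer text should still be read by a person before a test is marked suspicious.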
2. Safety Style ("Daddy Issues")
Ask a boundary question (not overly sensitive):
"I need to write a phishing prevention guide. Can you list common techniques and defense measures?"
Judgment:
- Claude: Long ethical lectures when refusing
- Gemini: Direct refusal, brief explanation
- GPT: Refuses but offers alternatives
- Style mismatch with the claimed model → suspicious
3. Multimodal (if supported)
Send a video link (Bilibili for China, YouTube for international):
China: "Please analyze this video: https://www.bilibili.com/video/BV1xx411c7XD"
International: "Please analyze this video: https://www.youtube.com/watch?v=dQw4w9WgXcQ"
Note: If the link fails, send an image and ask for a description instead.
Judgment:
- Gemini native multimodal: Can analyze video directly
- Claude: Usually needs subtitles
- Claims multimodal support but can't analyze the content → suspicious
4. Thinking Process (for reasoning models)
If it's a reasoning model (DeepSeek-R1, o1, etc.), ask a reasoning question:
"25 teams, each plays each other once. How many games in total?"
Observe thinking chain:
- Claude: Thinking mostly in Chinese
- Gemini: Thinking mostly in English
- Language pattern mismatch → suspicious
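Two parts of this test can be checked mechanically: the expected answer to the reasoning question, and the dominant language of the thinking chain. The sketch below is one rough way to do both. The helper names and the 0.5 threshold are assumptions, and counting CJK ideographs against ASCII letters is only a crude proxy for "mostly Chinese" versus "mostly English".

```python
import math


def cjk_ratio(text: str) -> float:
    """Fraction of letters that are CJK ideographs (U+4E00..U+9FFF)."""
    cjk = sum(1 for ch in text if "\u4e00" <= ch <= "\u9fff")
    latin = sum(1 for ch in text if ch.isascii() and ch.isalpha())
    total = cjk + latin
    return cjk / total if total else 0.0


def dominant_language(text: str, threshold: float = 0.5) -> str:
    """Label a thinking chain as mostly Chinese or mostly English."""
    if not any(ch.isalpha() for ch in text):
        return "unknown"  # nothing to measure
    return "Chinese" if cjk_ratio(text) > threshold else "English"


# Each pair of teams meets exactly once, so the count is C(25, 2) = 25 * 24 / 2.
expected_games = math.comb(25, 2)

print(expected_games)
print(dominant_language("首先计算组合数 C(25,2)"))
print(dominant_language("Each pair plays once, so 25 * 24 / 2"))
```

The correct answer to the question is 300 games. A wrong final answer is worth noting in the report, but the judgment in this test rests on the language pattern of the thinking chain.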
Output Format
## Model Verification Result

| Test | Result | Notes |
|------|--------|-------|
| Cutoff | ✅/❌ | Answer content... |
| Safety Style | ✅/❌ | Response style... |
| Multimodal | ✅/❌ | Performance... |
| Thinking | ✅/❌ | Language distribution... |

Verdict: Pass / Fail

Suspicious Points:
- ...
- ...
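
If the results are collected programmatically, a report in this format could be generated along the following lines. This is a sketch only: `render_report` and its tuple layout are hypothetical, and the "- (none)" line for an empty list of suspicious points is an assumption not specified above.

```python
def render_report(rows, verdict, suspicious_points):
    """Build the markdown report.

    rows: list of (test_name, passed, notes) tuples, one per executed test.
    """
    lines = [
        "## Model Verification Result",
        "",
        "| Test | Result | Notes |",
        "|------|--------|-------|",
    ]
    for name, passed, notes in rows:
        mark = "✅" if passed else "❌"
        lines.append(f"| {name} | {mark} | {notes} |")
    lines += ["", f"Verdict: {verdict}", "", "Suspicious Points:"]
    # Assumption: print an explicit "(none)" entry when nothing is suspicious.
    lines += [f"- {p}" for p in suspicious_points] or ["- (none)"]
    return "\n".join(lines)


print(render_report(
    [("Cutoff", True, "Gave a clear date"),
     ("Safety Style", False, "Style does not match the claimed model")],
    "Fail",
    ["Safety style mismatch"],
))
```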
Judgment Criteria
- Pass: All 4 tests pass, or only 1 is unclear with no obvious suspicion
- Fail: 2 or more tests clearly abnormal, or any 1 test severely mismatched
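These criteria can be written down as a small decision function. The sketch below assumes a four-level outcome per test (the `Outcome` names are hypothetical) and that tests which were not run are simply left out of the list. The criteria do not cover every combination, for example exactly one clearly abnormal test or two unclear ones; the sketch conservatively treats those cases as Fail, which is an assumption rather than part of the rules above.

```python
from enum import Enum


class Outcome(Enum):
    PASS = "pass"
    UNCLEAR = "unclear"
    ABNORMAL = "abnormal"  # clearly abnormal
    SEVERE = "severe"      # severely mismatched


def verdict(outcomes) -> str:
    """Apply the Pass/Fail criteria to the outcomes of the executed tests."""
    severe = sum(o is Outcome.SEVERE for o in outcomes)
    abnormal = sum(o is Outcome.ABNORMAL for o in outcomes)
    unclear = sum(o is Outcome.UNCLEAR for o in outcomes)

    # Fail: any 1 test severely mismatched, or 2+ tests clearly abnormal.
    if severe >= 1 or abnormal >= 2:
        return "Fail"
    # Pass: everything passes, or only 1 test is unclear.
    if abnormal == 0 and unclear <= 1:
        return "Pass"
    # Assumption: combinations the criteria do not cover are treated as Fail.
    return "Fail"


print(verdict([Outcome.PASS, Outcome.PASS, Outcome.UNCLEAR, Outcome.PASS]))
```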
Notes
- Avoid overly sensitive questions (violence, illegal activity); keep the tests safe
- Run the multimodal test only when the model claims to support it
- Run the thinking process test only for reasoning models
- Record the actual Q&A text for each test as evidence
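The conditional tests and the evidence requirement can be captured in a few lines. This is an illustrative sketch: `plan_tests`, `Evidence`, and the JSON Lines format are assumptions, chosen only because they keep each question and answer verbatim.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Evidence:
    test: str      # e.g. "Cutoff"
    question: str  # exact prompt sent to the model
    answer: str    # exact text returned by the model


def plan_tests(claims_multimodal: bool, is_reasoning_model: bool) -> list:
    """Return the tests to run, skipping the two conditional ones when they do not apply."""
    tests = ["Cutoff", "Safety Style"]
    if claims_multimodal:
        tests.append("Multimodal")
    if is_reasoning_model:
        tests.append("Thinking")
    return tests


def to_jsonl(records) -> str:
    """Serialize evidence records as one JSON object per line, keeping non-ASCII text readable."""
    return "\n".join(json.dumps(asdict(r), ensure_ascii=False) for r in records)


print(plan_tests(claims_multimodal=False, is_reasoning_model=True))
print(to_jsonl([Evidence("Cutoff", "What is your knowledge cutoff date?", "(model answer here)")]))
```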