Perfect Layout OCR

v1.0.0

Full OCR pipeline for scanned PDFs with layout preservation. Use this skill whenever the user wants to OCR a PDF, convert a scanned document to searchable text, or preserve the original layout of a scanned book/document. Triggers on: "OCR this PDF", "用PaddleOCR处理", "识别这个PDF", "扫描版PDF转文字", "把这个PDF做OCR", or when a PDF path is provided alongside any mention of OCR, text recognition, or layout preservation.


by @biabia-55 (gamhtoi)·MIT-0



License

MIT-0


Free to use, modify, and redistribute; no attribution required.


Runtime dependencies

No special dependencies

Install command

Official: npx clawhub@latest install pdf-ocr-layout-free

Mirror: npx clawhub@latest install pdf-ocr-layout-free --registry https://cn.longxiaskill.com


Skill documentation

PDF OCR with Layout Preservation

Automated pipeline: Split → OCR API → Layout PDF → Merge

Each original page becomes one PDF page, with text placed at exact bounding-box positions and font sizes calibrated to fill the original block dimensions.

Quick Start

python ~/.claude/skills/pdf-ocr-layout/scripts/pipeline.py "/path/to/input.pdf"

Output: input_ocr.pdf in the same directory. Intermediate files go in input_ocr_work/.

Full Options

python ~/.claude/skills/pdf-ocr-layout/scripts/pipeline.py \
  "/path/to/input.pdf" \
  --output "/path/to/output.pdf" \
  --work-dir "/path/to/workdir" \
  --chunk-size 90
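The --chunk-size option and the default output names suggest the input is split into fixed-size page ranges before OCR. The sketch below illustrates that idea only; the function names chunk_ranges and default_paths are hypothetical and are not taken from pipeline.py, and the zero-based half-open ranges are an assumption.

```python
from pathlib import Path


def chunk_ranges(num_pages: int, chunk_size: int = 90) -> list[tuple[int, int]]:
    """Return zero-based, half-open (start, end) page ranges covering num_pages."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [
        (start, min(start + chunk_size, num_pages))
        for start in range(0, num_pages, chunk_size)
    ]


def default_paths(input_pdf: str) -> tuple[Path, Path]:
    """Derive <stem>_ocr.pdf and <stem>_ocr_work next to the input file."""
    src = Path(input_pdf)
    return src.with_name(f"{src.stem}_ocr.pdf"), src.with_name(f"{src.stem}_ocr_work")


if __name__ == "__main__":
    # A 200-page scan with the default chunk size yields three chunks.
    print(chunk_ranges(200))
    print(default_paths("/path/to/input.pdf"))
```

A smaller chunk size means more API jobs but less work lost if one of them fails, which is probably why the value is exposed as an option.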

Steps for Claude

1. Ask for the PDF path if not already provided in the conversation.
2. Check dependencies (install only what's missing):
   pip install pypdf reportlab Pillow requests -q
3. Run the pipeline and stream output to the user:
   python ~/.claude/skills/pdf-ocr-layout/scripts/pipeline.py "{input_pdf}"
4. Monitor progress: the script prints step-by-step progress, including API polling. API jobs typically take 1–5 minutes per 90-page chunk.
5. Report the output path when done.

Resume / Retry

The pipeline saves state to the work directory and is fully resumable:

jobs.json: API job IDs (prevents re-submitting already-queued chunks)
chunk__results.jsonl: cached OCR results (skip re-downloading)
chunk__ocr.pdf: completed chunk PDFs (skip re-rendering)

If interrupted, simply re-run the same command. It picks up where it left off.
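The resume behavior described above amounts to checking which artifacts already exist for a chunk and skipping the stages they cover. The following is a rough sketch of that decision, not the script's actual logic: the per-chunk file names (chunk_<N>_results.jsonl, chunk_<N>_ocr.pdf) and the jobs.json layout (a mapping from chunk index to job ID) are assumptions made for illustration.

```python
import json
from pathlib import Path


def next_step(work_dir: Path, chunk: int) -> str:
    """Return the stage a chunk still needs: submit, poll, render, or done."""
    # Hypothetical naming: the real per-chunk file names may differ.
    if (work_dir / f"chunk_{chunk}_ocr.pdf").exists():
        return "done"      # chunk PDF already rendered
    if (work_dir / f"chunk_{chunk}_results.jsonl").exists():
        return "render"    # OCR results cached, only layout is left
    jobs_file = work_dir / "jobs.json"
    # Hypothetical schema: {"<chunk index>": "<job id>"}
    jobs = json.loads(jobs_file.read_text()) if jobs_file.exists() else {}
    return "poll" if str(chunk) in jobs else "submit"
```

Checking the latest artifact first means a re-run never repeats a stage whose output is already on disk.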

Common Issues

Problem | Fix
ModuleNotFoundError | Run the pip install command above
API 4xx error | Check that the PDF isn't password-protected
Job stuck in running | Normal for large chunks; wait up to 10 min
Missing images in output | Images are left blank by design (API images are optional)
Font too small/large | The font size auto-calibrates; the first page may look different if it's a cover

Output Quality

Block positions: exact (scaled from 812×1269px OCR space to A4)
Font sizes: auto-calibrated using fs = min(√(h×w / n×0.65), h×0.72), verified to recover the original ~13–14pt body text
Page numbers, headers, footers: included (all block types preserved)
Images: embedded if the URL is accessible, blank if not
1 OCR page = 1 PDF page: always maintained
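The block-position scaling and the font-size formula can be sketched as follows. Several things here are assumptions rather than facts from the script: that h and w are the block height and width in points, that n is the block's character count, that n×0.65 is the denominator under the square root, that each axis is scaled independently, and that OCR coordinates use a top-left origin. The function names are hypothetical.

```python
import math

A4_W, A4_H = 595.28, 841.89  # A4 in PDF points (210 x 297 mm at 72 pt/inch)
OCR_W, OCR_H = 812, 1269     # OCR pixel space quoted in the text


def scale_bbox(x0: float, y0: float, x1: float, y1: float) -> tuple[float, ...]:
    """Map a top-left-origin pixel bbox to bottom-left-origin PDF points."""
    sx, sy = A4_W / OCR_W, A4_H / OCR_H  # assumed: independent per-axis scaling
    # Flip y because the PDF origin is at the bottom-left corner.
    return (x0 * sx, A4_H - y1 * sy, x1 * sx, A4_H - y0 * sy)


def font_size(h: float, w: float, n: int) -> float:
    """fs = min(sqrt(h*w / (n*0.65)), h*0.72), read with n*0.65 as denominator."""
    if n <= 0:
        return h * 0.72  # no text: fall back to the height cap
    return min(math.sqrt(h * w / (n * 0.65)), h * 0.72)


if __name__ == "__main__":
    # A 400 x 100 pt paragraph block holding 350 characters.
    print(round(font_size(100, 400, 350), 2))
    # A single short line is limited by the h*0.72 cap instead.
    print(round(font_size(14, 400, 5), 2))
```

Under this reading, a 400×100pt block with 350 characters comes out at roughly 13.3pt, in line with the ~13–14pt body text mentioned above, while short single-line blocks such as headers are bounded by the height cap.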
