PDF Toolkit
v0.0.3
Run a local script to work with PDF files, DOCX documents, OCR, and text-to-speech. Use the read tool to load this SKILL.md, then exec the uv run command inside it. Do NOT use sessions_spawn. Triggers: read pdf, extract text from pdf, merge pdfs, split pdf, rotate pdf, ocr pdf, read docx, create docx, text to speech, convert to mp3, pdf info, pdf pages.
System Dependencies

- uv must already be installed, because this skill is executed with uv run, and uv installs the Python dependencies declared in src/main.py.
- ffmpeg is needed for tts, because the speech output is normalized and written as an .mp3 file through ffmpeg.
- tesseract is needed for ocr, because it performs the optical character recognition on scanned page images.
- pdfimages (from poppler) is also needed for ocr, because it extracts page images from PDFs before those images are passed to tesseract.
- pandoc is optional for convert. It can convert between many document formats when text-based conversion is possible.
- libreoffice is an optional alternative to pandoc for convert. It can handle document conversions that pandoc may not support well.

File Access And Network Behavior

This skill operates on the file paths provided by the caller. It can read from and write to any host path the caller supplies; it is not limited to the OpenClaw workspace. The /root/.openclaw/workspace/... paths in the command examples show where the skill entrypoint lives. They do not restrict which files the skill can access.

The tts command uses edge-tts, which sends the input text to an external text-to-speech service over the network to generate audio. Do not use tts with sensitive or private text unless you are comfortable sending that text off-host. All other commands run locally on the host, subject to the local binaries listed under System Dependencies.

Skill: PDF Toolkit

When to use

- User wants to extract text, tables, or images from a PDF.
- User wants to get metadata or page count from a PDF.
- User wants to merge, split, or rotate a PDF.
- User wants to create a new PDF from plain text or Markdown.
- User wants to read or write a DOCX file.
- User wants to OCR a scanned PDF (requires tesseract on host).
- User wants to convert text or a document to an MP3 audio file (requires ffmpeg on host).
- User wants to convert between document formats (requires pandoc or libreoffice on host).
- User wants to check which optional system tools are available.
When NOT to use

- User wants to view or render a PDF visually: use a PDF viewer.
- User wants to fill in PDF form fields: this skill does not support AcroForms.
- User wants to edit an existing PDF's text in-place: use a dedicated PDF editor.

Commands

Check available tools
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py doctor
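The host binaries this document names can also be probed directly from Python. The following is a minimal sketch, not the skill's own doctor implementation: the `TOOLS` table and the `check_tools` helper are hypothetical names introduced here, and the binary names are taken from the dependency notes in this document (on some systems LibreOffice is installed under a different executable name, such as `soffice`).

```python
import shutil

# Binaries named in this document and the command each one supports.
# Hypothetical table for illustration; not read from the skill itself.
TOOLS = {
    "uv": "running the skill",
    "ffmpeg": "tts",
    "tesseract": "ocr",
    "pdfimages": "ocr (shipped with poppler)",
    "pandoc": "convert (optional)",
    "libreoffice": "convert (optional alternative to pandoc)",
}


def check_tools(names=TOOLS):
    """Map each binary name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in check_tools().items():
        status = path or "MISSING"
        print(f"{name:12} {status:40} used for: {TOOLS[name]}")
```

`shutil.which` only confirms that an executable is discoverable on `PATH`; it does not verify the version or that the tool runs correctly.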
Get PDF metadata and page count
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py info
Extract text from a PDF
# All pages
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py extract-text
# Specific pages (1-indexed, comma-separated or ranges)
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py extract-text --pages 1,3,5-8
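A page specification such as `1,3,5-8` mixes single pages and inclusive ranges, all 1-indexed. The sketch below shows one way such a string could be expanded into page numbers. It is an illustration only: `parse_pages` and its `page_count` parameter are hypothetical, and whether the skill itself sorts, de-duplicates, or bounds-checks pages in this way is an assumption.

```python
def parse_pages(spec, page_count=None):
    """Expand a spec such as "1,3,5-8" into sorted, unique 1-indexed page numbers."""
    pages = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            raise ValueError(f"empty entry in page spec {spec!r}")
        if "-" in part:
            # Inclusive range, e.g. "5-8" covers pages 5, 6, 7, and 8.
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
        else:
            start = end = int(part)
        if start < 1 or end < start:
            raise ValueError(f"invalid page range {part!r}")
        if page_count is not None and end > page_count:
            raise ValueError(f"page {end} exceeds page count {page_count}")
        pages.update(range(start, end + 1))
    return sorted(pages)


print(parse_pages("1,3,5-8"))  # [1, 3, 5, 6, 7, 8]
```

Libraries that index pages from zero would need each value reduced by one before use.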
Extract tables from a PDF
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py extract-tables
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py extract-tables --pages 2-4
Extract images from a PDF
# Saves images to current directory by default
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py extract-images
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py extract-images --output-dir /path/to/output
Merge PDFs
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py merge [ ...] --output merged.pdf
Split a PDF
# Split into individual pages
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py split --output-dir /path/to/output
# Extract a page range into a new PDF
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py split --pages 2-5 --output extracted.pdf
Rotate pages in a PDF
# Rotate all pages 90 degrees clockwise
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py rotate --degrees 90 --output rotated.pdf
# Rotate specific pages
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py rotate --degrees 180 --pages 1,3 --output rotated.pdf
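PDF page rotation is stored as a clockwise angle that must be a multiple of 90 degrees, so a rotation request combines with whatever rotation a page already has. The sketch below illustrates that arithmetic. The `rotated_angle` helper is hypothetical, and whether the skill accepts negative values or rejects non-multiples of 90 in the same way is an assumption.

```python
def rotated_angle(current, degrees):
    """Return the page rotation after turning `degrees` clockwise from `current`."""
    if degrees % 90 != 0:
        raise ValueError("rotation must be a multiple of 90 degrees")
    # Wrap into the 0-359 range; negative input is treated as counter-clockwise.
    return (current + degrees) % 360


print(rotated_angle(0, 90))     # 90
print(rotated_angle(270, 180))  # 90
print(rotated_angle(0, -90))    # 270
```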
Create a PDF from text
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py create-pdf --text "Hello, world!" --output hello.pdf
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py create-pdf --file input.txt --output document.pdf
Read a DOCX file
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py read-docx
Write a DOCX file
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py write-docx --text "Content here" --output document.docx
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py write-docx --file input.txt --output document.docx
OCR a scanned PDF (requires tesseract and pdfimages)
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py ocr
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py ocr --pages 1-3 --lang eng
Convert text or document to speech (requires ffmpeg)
uv run /root/.openc