PDF to Markdown converter
v0.1.0
Convert PDF and image documents to clean Markdown via the PDF2Markdown CLI. Use when the user wants to extract text from PDFs, convert PDFs to Markdown, parse document structure, or process images (JPEG, PNG, GIF, WebP, TIFF, BMP) into structured content. Also use when they say "convert this PDF", "parse this document", "extract text from PDF", "parse async", or "large file" (up to 100MB). The CLI must be pre-installed and authenticated.
PDF2Markdown CLI
Convert PDF and image documents to Markdown. Supports both the pdf2markdown and pdf2md commands.
Run pdf2markdown --help or pdf2md --help for options.
Prerequisites
The CLI must be installed and authenticated. Check with pdf2markdown --status.
pdf2markdown login # or set PDF2MARKDOWN_API_KEY
If not ready, see rules/install.md. For output handling, see rules/security.md.
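As a rough illustration of the prerequisite check, the sketch below reports whether either command name is on PATH and whether PDF2MARKDOWN_API_KEY is set in the environment. The function name pdf2md_ready and its status strings are hypothetical. It cannot see a session created by pdf2markdown login, so pdf2markdown --status remains the authoritative check.

```shell
# Sketch only: inspects local shell state and never contacts the service.
# pdf2md_ready and its output strings are illustrative, not part of the CLI.
pdf2md_ready() {
  # Either command name is acceptable, per the section above.
  if ! command -v pdf2markdown >/dev/null 2>&1 &&
     ! command -v pdf2md >/dev/null 2>&1; then
    echo "missing-cli"
    return 1
  fi
  # Assumption: with no key in the environment, authentication depends on
  # a prior `pdf2markdown login`, which this function cannot detect.
  if [ -z "${PDF2MARKDOWN_API_KEY:-}" ]; then
    echo "no-api-key-in-env"
    return 2
  fi
  echo "ready"
}
```

If this prints missing-cli, rules/install.md is the place to go next; if it prints no-api-key-in-env, run pdf2markdown --status to see whether a login session already exists.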
Workflow

| Need | Command | When |
| --- | --- | --- |
| Convert PDF/image | parse | File under ~30MB, have path or URL |
| Large file (async) | parse-async | File over ~30MB, or sync returns file_too_large error |

Quick start
parse (sync, up to ~30MB):
pdf2markdown document.pdf -o .pdf2markdown/output.md
pdf2markdown parse --url "https://example.com/doc.pdf" -o .pdf2markdown/doc.md
pdf2markdown parse file1.pdf file2.png -o .pdf2markdown/
# JSON output
pdf2markdown parse document.pdf --format json -o .pdf2markdown/result.json
parse-async (large files, up to 100MB):
# Submit and wait
pdf2markdown parse-async large.pdf --wait -o .pdf2markdown/output.md
pdf2markdown parse-async --url "https://cdn.example.com/big.pdf" --wait -o .pdf2markdown/doc.md
# Submit only (poll later)
pdf2markdown parse-async large.pdf # returns task_id
pdf2markdown parse-async --status
pdf2markdown parse-async --result -o .pdf2markdown/output.md
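The choice between parse and parse-async for a local file can be sketched as a size check. The helper name pick_parse_command and the exact 30 MiB cutoff are assumptions for illustration; the text above only says "~30MB", and the service's file_too_large error remains the real signal that a file needs the async path.

```shell
# Assumed cutoff: "~30MB" is approximated as 30 MiB here.
SYNC_LIMIT_BYTES=$((30 * 1024 * 1024))

# Print "parse" or "parse-async" for a local file (hypothetical helper).
pick_parse_command() {
  [ -f "$1" ] || return 1
  # wc -c pads its output with spaces on some systems, so strip them.
  size=$(wc -c < "$1" | tr -d ' ')
  if [ "$size" -le "$SYNC_LIMIT_BYTES" ]; then
    echo "parse"
  else
    echo "parse-async"
  fi
}
```

A caller could branch on the printed name and add --wait when it is parse-async, so that -o receives the finished result.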
Options

| Command | Key options |
| --- | --- |
| parse | -u, --url, -o, --output, -f, --format (markdown, json, all), --page-images, --json, --pretty |
| parse-async | -u, --url, -o, --output, --wait, --status, --result, --poll-interval, --timeout |

Run pdf2markdown --help for full details.
Output & Organization
Write results to .pdf2markdown/ with -o. Add .pdf2markdown/ to .gitignore.
pdf2markdown document.pdf -o .pdf2markdown/doc.md
pdf2markdown parse file1.pdf file2.pdf -o .pdf2markdown/
Naming: .pdf2markdown/{name}.md. For large outputs, use grep, head, or incremental reads. Always quote URLs, because the shell interprets ? and & as special characters.
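The naming and .gitignore conventions above might be scripted as follows. Both helper names are hypothetical, and the sketch assumes {name} means the input's base name without its final extension, which the text does not spell out.

```shell
# Map an input path or URL to .pdf2markdown/{name}.md.
# Assumption: {name} is the base name minus its last extension.
out_path() {
  name=${1##*/}      # drop directories, or the URL up to the last slash
  name=${name%%\?*}  # drop a URL query string, if any
  name=${name%.*}    # drop the final extension
  echo ".pdf2markdown/${name}.md"
}

# Add .pdf2markdown/ to a .gitignore file once; repeated calls do not
# duplicate the entry. Defaults to ./.gitignore.
ignore_output_dir() {
  gitignore=${1:-.gitignore}
  touch "$gitignore"
  grep -qxF '.pdf2markdown/' "$gitignore" || echo '.pdf2markdown/' >> "$gitignore"
}
```

Note that the URL passed to out_path has to be quoted for the same reason as on the pdf2markdown command line: unquoted, the shell would treat & as a command separator and ? as a glob character.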
Documentation
PDF2Markdown API Docs