ReceiptExtract - OCR, Photo/PDF to CSV
v1.0.1
Extract structured transaction data from image or PDF receipts using the ReceiptExtract API (https://www.receiptextract.com). Use when the user wants merchant/date/items/tax/total parsed from a receipt photo, scan, or PDF; wants to test ReceiptExtract on sample files; wants JSON/CSV output; or wants help wiring ReceiptExtract into automations, scripts, or agent workflows.
Receipts
Extract transaction data from receipt images or PDFs with ReceiptExtract.
Keep the workflow simple: locate the API token, upload one receipt file (or a directory for bulk mode), inspect the JSON, then present either raw JSON or a cleaned summary. Prefer the bundled helper script for repeatable usage.
Quick workflow
Identify the input file
Accept common image formats (.jpg, .jpeg, .png, .webp) and PDFs. If the file came from chat, use the attached local path.
Locate the API token
Set RECEIPTEXTRACT_API_TOKEN in your environment before running commands. Do not paste the token back into chat.
Call the upload endpoint
Endpoint: POST https://www.receiptextract.com/api/receipt/upload
Auth header: Authorization: Bearer <token>
Multipart form field: file
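The upload step can be sketched in Python with only the standard library. This is a minimal illustration, not the bundled helper script: the function name `build_upload_request` is hypothetical, and the URL is written with a lowercase path on the assumption that this is what the endpoint above denotes. The function only builds the request, so it can be inspected without spending credits.

```python
import mimetypes
import urllib.request
import uuid
from pathlib import Path

# Assumption: the endpoint given in this step, written with a lowercase path.
UPLOAD_URL = "https://www.receiptextract.com/api/receipt/upload"


def build_upload_request(path, token):
    """Build (but do not send) a multipart POST carrying one receipt file."""
    path = Path(path)
    boundary = uuid.uuid4().hex
    # Fall back to a generic type when the extension is not recognized.
    content_type = mimetypes.guess_type(path.name)[0] or "application/octet-stream"
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{path.name}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    return urllib.request.Request(
        UPLOAD_URL,
        data=head + path.read_bytes() + tail,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

Sending the request would then be `urllib.request.urlopen(req)`, with the token read from the environment variable described in the previous step. Note that `urlopen` raises `urllib.error.HTTPError` for 4xx and 5xx responses, so the caller needs to catch it to see the status code.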
Parse the response
A successful response typically includes:
- success
- data.merchant
- data.date
- data.items[]
- data.tax
- data.total
- data.correctnessCheck
- data.taxBreakdown[]
- creditInfo
- savedReceiptId
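A defensive way to read that shape is sketched below. It assumes the camelCase spellings `correctnessCheck`, `taxBreakdown`, `creditInfo`, and `savedReceiptId`, and that the last two sit at the top level of the payload rather than under `data`. The function name `parse_receipt_response` is hypothetical. Missing fields come back as `None` or an empty list instead of being guessed.

```python
def parse_receipt_response(payload):
    """Flatten the documented fields, or raise if the upload did not succeed."""
    if not isinstance(payload, dict) or not payload.get("success"):
        raise ValueError(f"upload did not succeed: {payload!r}")
    data = payload.get("data") or {}
    return {
        "merchant": data.get("merchant"),
        "date": data.get("date"),
        "items": data.get("items") or [],
        "tax": data.get("tax"),
        "total": data.get("total"),
        # Assumed camelCase key names; adjust if the live API differs.
        "correctness_check": data.get("correctnessCheck"),
        "tax_breakdown": data.get("taxBreakdown") or [],
        "credit_info": payload.get("creditInfo"),
        "saved_receipt_id": payload.get("savedReceiptId"),
    }
```

Raising on a falsy `success` keeps the caller from presenting partial data as if it were a real extraction.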
Present the result
For humans: summarize merchant, date, items, tax, total, and any anomalies.
For integrations: return raw JSON or convert to CSV.
Preferred command
Use the helper script:
    export RECEIPTEXTRACT_API_TOKEN="your-token"
    python3 scripts/extract_receipt.py /path/to/receipt.png
Optional flags:
    python3 scripts/extract_receipt.py /path/to/receipt.pdf --format summary
    python3 scripts/extract_receipt.py /path/to/receipt.jpg --format csv
    python3 scripts/extract_receipt.py --input-dir /path/to/receipts --format summary
    python3 scripts/extract_receipt.py --input-dir /path/to/receipts --recursive --format json
Bulk processing
Use --input-dir to process multiple receipts in one run. The helper script sends one API request per file and continues even if some files fail.
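The per-file loop described above might look roughly like the following. This is a sketch under assumptions, not the helper script's actual code: `find_receipts` and `process_all` are hypothetical names, the case-insensitive suffix match is an assumption, and `upload_one` stands in for whatever function performs a single upload.

```python
from pathlib import Path

# Extensions this document lists as supported.
SUPPORTED_SUFFIXES = {".jpg", ".jpeg", ".png", ".webp", ".pdf"}


def find_receipts(input_dir, recursive=False):
    """Return supported files in input_dir, optionally including nested folders."""
    root = Path(input_dir)
    candidates = root.rglob("*") if recursive else root.glob("*")
    return sorted(
        p for p in candidates
        if p.is_file() and p.suffix.lower() in SUPPORTED_SUFFIXES
    )


def process_all(paths, upload_one):
    """Call upload_one(path) once per file, recording failures instead of stopping."""
    results = []
    for path in paths:
        try:
            results.append({"source_file": str(path), "success": True,
                            "result": upload_one(path)})
        except Exception as exc:  # keep going so one bad file does not end the run
            results.append({"source_file": str(path), "success": False,
                            "error": str(exc)})
    # Non-zero exit code when one or more files failed.
    exit_code = 1 if any(not r["success"] for r in results) else 0
    return results, exit_code
```

Keeping the source path on every result is what makes a failed file easy to find and retry later.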
- Supported file types: .jpg, .jpeg, .png, .webp, .pdf
- Use --recursive to include nested folders
- Exit code is non-zero when one or more files fail
- Each receipt consumes credits independently
Fallback command
Use curl when the helper script is unnecessary:
    curl -sS -X POST "https://www.receiptextract.com/api/receipt/upload" \
      -H "Authorization: Bearer $RECEIPTEXTRACT_API_TOKEN" \
      -F "file=@/path/to/receipt.png"
Output handling
JSON
Prefer JSON when the user wants the full extracted payload or when another tool will consume the result. In bulk mode, JSON includes processed, succeeded, failed, and per-file results.
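As an illustration of the bulk JSON shape just described, the sketch below rolls a list of per-file records into one report. The key name `results` and the per-record `success` flag are assumptions made for the example; the helper script's real output may name or nest these differently.

```python
import json


def build_bulk_report(per_file):
    """Summarize per-file records into processed/succeeded/failed counts."""
    succeeded = sum(1 for record in per_file if record.get("success"))
    return {
        "processed": len(per_file),
        "succeeded": succeeded,
        "failed": len(per_file) - succeeded,
        "results": per_file,  # assumed key name for the per-file entries
    }


if __name__ == "__main__":
    demo = [
        {"source_file": "a.png", "success": True},
        {"source_file": "b.pdf", "success": False, "error": "HTTP 402"},
    ]
    print(json.dumps(build_bulk_report(demo), indent=2))
```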
Summary
In bulk mode, summary prints one status line per file followed by total counts.
Use a concise format like:
Merchant: Walmart
Date: 2023-06-09
Total: 76.37
Tax: 8.18
Items:
- BEDDING — 39.97
- STEAMER — 27.97
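A small formatter that produces this layout could look like the sketch below. The item keys `description` and `total_price` are borrowed from the CSV column names in this document and are assumptions about the API's item objects; `format_summary` is a hypothetical name.

```python
ITEM_SEPARATOR = "\u2014"  # the dash used between item name and price in the sample


def format_summary(data):
    """Render one receipt's data in the concise summary layout."""
    lines = [
        f"Merchant: {data.get('merchant')}",
        f"Date: {data.get('date')}",
        f"Total: {data.get('total')}",
        f"Tax: {data.get('tax')}",
        "Items:",
    ]
    for item in data.get("items") or []:
        # Assumed item keys; missing values print as None rather than being invented.
        lines.append(
            f"- {item.get('description')} {ITEM_SEPARATOR} {item.get('total_price')}"
        )
    return "\n".join(lines)
```

Values are printed exactly as received, which keeps the summary from hiding odd amounts or dates that should be flagged.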
CSV
When the user asks for CSV, output line-item rows with these columns when available:
- source_file (bulk mode)
- merchant
- date
- description
- quantity
- total_price
- item_tax
- sku
- receipt_tax
- receipt_total
- saved_receipt_id
- http_status (bulk mode)
- success (bulk mode)
- error (bulk mode)
Error handling
Interpret common failures like this:
- 400: invalid input, missing file, unsupported type, or file too large
- 401: missing or invalid token
- 402: insufficient credits
- 429: rate limited; retry with backoff
- 500: server error; safe to retry carefully
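The status codes above can be turned into a lookup plus a retry schedule, as in this sketch. The names `STATUS_HINTS`, `RETRYABLE`, `describe_failure`, and `backoff_delays` are hypothetical, and the exponential schedule with its default base and cap is one common choice rather than anything the API prescribes.

```python
STATUS_HINTS = {
    400: "invalid input, missing file, unsupported type, or file too large",
    401: "missing or invalid token",
    402: "insufficient credits",
    429: "rate limited",
    500: "server error",
}

# Only these are worth retrying; the others need the caller to fix something first.
RETRYABLE = {429, 500}


def describe_failure(status):
    """Return a plain one-line explanation for an HTTP status code."""
    return f"HTTP {status}: {STATUS_HINTS.get(status, 'unexpected status')}"


def backoff_delays(attempts=3, base=1.0, cap=30.0):
    """Exponential delays in seconds (base, 2*base, 4*base, ...), each capped."""
    return [min(cap, base * 2 ** i) for i in range(attempts)]
```

A caller would sleep for each delay in turn between retries of a 429 or 500 and give up once the list is exhausted. Keeping the attempt count small matters because each successful upload consumes credits.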
If the response is malformed or success is false:
- show the error plainly
- do not invent extracted data
- mention likely causes if obvious (bad token, no credits, unsupported file)
Practical notes
- Treat the API result as the source of truth, but sanity-check obvious issues.
- Flag suspicious output instead of silently "fixing" it. Example: Canadian receipt with tax currency labeled USD.
- correctnessCheck: true is a useful confidence signal, not a guarantee.
- Preserve the original file path and savedReceiptId when useful for traceability.
- In bulk mode, keep one request per file and preserve each source file path for traceability.
Security
- Keep the token out of chat replies.
- Prefer environment variables or secret managers over embedding tokens in scripts.
- Do not commit tokens, raw headers, or secret-bearing examples into git.
Resources
- Helper script: scripts/extract_receipt.py
- API reference notes: references/api.md