Detailed analysis
- SKILL.md requires BROWSERBASE_API_KEY, BROWSERBASE_PROJECT_ID, and GOOGLE_GENERATIVE_AI_API_KEY, but the registry metadata lists none. Ask the publisher to correct the metadata so the required credentials are visible.
- Use scoped or disposable API keys (from a test account) rather than production credentials.
- Confirm the source and owner: the skill has no homepage and its source is unknown. Because it ships no code files, you rely entirely on its instructions; request the referenced example script (scripts/example_scraper.js) before running anything.
- Scraping Cloudflare-protected sites can violate terms of service or laws; make sure you have permission.
- Run initial tests in an isolated environment, and rotate keys if you expose them during testing.
If the publisher fixes the metadata or provides the example scripts, the skill looks coherent; until then, treat it cautiously.
Skill documentation
Bypass Cloudflare and bot protection using Stagehand + Browserbase cloud browsers with AI-powered extraction.
When to Use
- Website blocks curl/fetch with Cloudflare "Just a moment..." page
- Playwright headless gets detected and blocked
- Need structured data extraction from dynamic content
- Scraping auction sites, marketplaces, or other protected pages
Prerequisites
npm install @browserbasehq/stagehand zod
Required environment variables:
- BROWSERBASE_API_KEY: from the browserbase.com dashboard
- BROWSERBASE_PROJECT_ID: from browserbase.com
- GOOGLE_GENERATIVE_AI_API_KEY: for Gemini extraction (or use OpenAI)
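Since a missing key only surfaces once a session is being opened, it may help to check the environment up front. The following is a minimal sketch, not part of the skill: `REQUIRED_ENV_VARS` and `missingEnvVars` are hypothetical names, and the list assumes Gemini is the extraction model (swap the last entry if you use OpenAI instead).

```javascript
// Names taken from the "Required environment variables" list above.
// Assumption: Gemini is used for extraction; adjust if you use another provider.
const REQUIRED_ENV_VARS = [
  'BROWSERBASE_API_KEY',
  'BROWSERBASE_PROJECT_ID',
  'GOOGLE_GENERATIVE_AI_API_KEY',
];

// Hypothetical helper: returns the names of required variables that are
// unset or blank in the given environment object.
function missingEnvVars(env = process.env) {
  return REQUIRED_ENV_VARS.filter((name) => {
    const value = env[name];
    return typeof value !== 'string' || value.trim() === '';
  });
}
```

A script could call `missingEnvVars()` before creating the Stagehand instance and exit with a message listing whatever names come back.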
Quick Start
import { Stagehand } from '@browserbasehq/stagehand';

const stagehand = new Stagehand({
  env: 'BROWSERBASE',
  apiKey: process.env.BROWSERBASE_API_KEY,
  projectId: process.env.BROWSERBASE_PROJECT_ID,
  model: {
    modelName: 'google/gemini-3-flash-preview',
    apiKey: process.env.GOOGLE_GENERATIVE_AI_API_KEY,
  },
});
await stagehand.init();
const page = stagehand.context.pages()[0];

// Navigate (Cloudflare bypass is automatic)
await page.goto('https://protected-site.com/search?q=term');
await page.waitForTimeout(5000); // Let the page fully load

// AI-powered extraction (instruction-only works best).
// The instruction is a template literal so it can span several lines.
const data = await stagehand.extract(`
  Extract all product listings as JSON array:
  [{ "title": "...", "price": 123, "url": "..." }]
  Return ONLY the JSON array.
`);

// Release the Browserbase session
await stagehand.close();
Key Patterns
1. Instruction-Only Extraction (Recommended)
Schema-based extraction often returns empty results. Use natural-language instructions instead:
const extraction = await stagehand.extract(`
  Look at this page and extract:
  - All item titles
  - Prices as numbers
  - URLs
  Return as JSON array.
`);
2. Handle Cloudflare Delays
Sometimes the challenge takes longer:
const title = await page.title();
if (title.toLowerCase().includes('moment')) {
  await page.waitForTimeout(10000); // Wait for the challenge to clear
}
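A single fixed wait either wastes time or is too short. One possible refinement, sketched here under assumptions, is to poll the title until it no longer contains "moment". `waitForChallenge` and its options are hypothetical names; the sketch relies only on the `page.title()` and `page.waitForTimeout()` calls already used above.

```javascript
// Hypothetical helper: polls the page title until it no longer looks like the
// Cloudflare "Just a moment..." interstitial, or until maxWaitMs has elapsed.
// Resolves to true if the challenge cleared, false if it was still showing.
async function waitForChallenge(page, { maxWaitMs = 15000, intervalMs = 1000 } = {}) {
  let waited = 0;
  while (true) {
    const title = await page.title();
    if (!title.toLowerCase().includes('moment')) return true; // challenge cleared
    if (waited >= maxWaitMs) return false; // gave up, still on the interstitial
    await page.waitForTimeout(intervalMs);
    waited += intervalMs;
  }
}
```

The 15-second default mirrors the 10-15s suggestion in the troubleshooting table; a `false` result is a signal to stop rather than to extract from the challenge page.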
3. Scroll to Load More
Many sites lazy-load content:
for (let i = 0; i < 5; i++) {
  await page.evaluate(() => window.scrollBy(0, window.innerHeight));
  await page.waitForTimeout(800);
}
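A fixed count of five scrolls may stop too early on long pages or keep scrolling after the content has ended. As a rough alternative, the sketch below jumps to the bottom and stops once the document height no longer grows. `scrollUntilStable` and its options are hypothetical, and the height comparison is a heuristic that will not suit every site (for example, pages that load content behind a "Load more" button).

```javascript
// Hypothetical helper: scrolls to the bottom repeatedly until the document
// height stops changing or maxScrolls is reached. Returns the number of
// scrolls performed. Assumes page.evaluate() and page.waitForTimeout()
// behave as in the loop above.
async function scrollUntilStable(page, { maxScrolls = 20, pauseMs = 800 } = {}) {
  let lastHeight = await page.evaluate(() => document.body.scrollHeight);
  for (let i = 0; i < maxScrolls; i++) {
    await page.evaluate(() => window.scrollTo(0, document.body.scrollHeight));
    await page.waitForTimeout(pauseMs); // give lazy-loaded items time to render
    const height = await page.evaluate(() => document.body.scrollHeight);
    if (height === lastHeight) return i + 1; // nothing new was appended
    lastHeight = height;
  }
  return maxScrolls;
}
```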
4. Parse Extraction Results
The result's extraction field is a string that needs parsing:
let listings = [];
try {
  // Grab everything from the first "[" to the last "]" in the response
  const jsonMatch = extraction?.extraction?.match(/\[[\s\S]*\]/);
  if (jsonMatch) listings = JSON.parse(jsonMatch[0]);
} catch (e) {
  console.log('Parse error:', e.message);
}
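Model output can include surrounding prose or entries that lack the requested fields, so it can be worth validating after parsing. The helper below is an illustrative sketch: `parseListings` is a hypothetical name, and the `title` and `url` checks assume the item shape requested in the Quick Start prompt.

```javascript
// Hypothetical helper: extracts the outermost JSON array from a model
// response and keeps only entries with a string title and a string url.
// Returns an empty array when nothing usable is found.
function parseListings(text) {
  if (typeof text !== 'string') return [];
  const start = text.indexOf('[');
  const end = text.lastIndexOf(']');
  if (start === -1 || end <= start) return [];
  let parsed;
  try {
    parsed = JSON.parse(text.slice(start, end + 1));
  } catch {
    return []; // the bracketed span was not valid JSON
  }
  if (!Array.isArray(parsed)) return [];
  return parsed.filter(
    (item) => item !== null && typeof item === 'object' &&
      typeof item.title === 'string' && typeof item.url === 'string'
  );
}
```

With the pattern above, this would be called as `parseListings(extraction?.extraction)`.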
Browserbase Free Tier Limits
- 1 concurrent session — cron jobs can conflict with interactive use
- Sessions auto-close after inactivity
- Use stagehand.close() to release the session immediately
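With only one concurrent session available, a script that throws before reaching stagehand.close() can leave the session occupied until it times out. One way to guard against that, sketched here with a hypothetical `withSession` wrapper, is a try/finally around the work. The wrapper assumes only the `init()` and `close()` methods shown in Quick Start.

```javascript
// Hypothetical wrapper: initialises the session, runs `task`, and always
// closes the session afterwards, even if `task` throws.
async function withSession(stagehand, task) {
  await stagehand.init();
  try {
    return await task(stagehand);
  } finally {
    await stagehand.close(); // release the Browserbase session
  }
}
```

Usage would look like `await withSession(stagehand, async (sh) => { /* goto, extract */ })`, with the navigation and extraction steps placed inside the callback.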
Cron Integration
For scheduled scraping, use OpenClaw cron with isolated sessions:
openclaw cron add \
--name "Daily Scrape" \
--cron "0 6 * * *" \
--session isolated \
--message "Run: node ~/scripts/scraper.js"
Troubleshooting
| Issue | Solution |
|---|---|
| Empty extraction | Use instruction-only (no schema), increase wait time |
| Cloudflare loop | Wait 10-15s, check if title contains "moment" |
| Session limit | Close other Browserbase sessions, check dashboard |
| 429 errors | Wait for session to complete, don't retry immediately |
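For the 429 row, "don't retry immediately" can be expressed as a retry loop with growing delays. This is a generic sketch rather than Browserbase guidance: `retryWithBackoff`, its options, and the default delay are all assumptions, and the `sleep` parameter exists only so the delay can be swapped out in tests.

```javascript
// Hypothetical helper: calls fn, and on failure waits baseDelayMs, then
// 2x, then 4x, and so on before trying again. Rethrows the last error once
// `retries` additional attempts have been used up.
async function retryWithBackoff(fn, {
  retries = 3,
  baseDelayMs = 5000, // illustrative default, not a documented limit
  sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
} = {}) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await fn(attempt);
    } catch (err) {
      lastError = err;
      if (attempt === retries) break; // no attempts left
      await sleep(baseDelayMs * 2 ** attempt);
    }
  }
  throw lastError;
}
```

In practice the callback should rethrow errors that are not rate-limit related, so that only 429-style failures are retried.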
Example: Full Scraper
See scripts/example_scraper.js for a complete working example.