2026-03-24 01:49:31 +08:00

The container is running normally; it fetched the keywords successfully and has started crawling.
Created 3 files:
┌───────────────────────────┬────────────────────────────────────────────────────┐
│ File                      │ Purpose                                            │
├───────────────────────────┼────────────────────────────────────────────────────┤
│ spider.Dockerfile         │ Crawler-only image, lightweight, no frontend/nginx │
├───────────────────────────┼────────────────────────────────────────────────────┤
│ spider-entrypoint.sh      │ Multiple processes per container (INSTANCES=3)     │
├───────────────────────────┼────────────────────────────────────────────────────┤
│ docker-compose.spider.yml │ One-command startup for all three platforms        │
└───────────────────────────┴────────────────────────────────────────────────────┘
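How spider-entrypoint.sh runs several processes in one container is not shown here. A minimal sketch of one common approach follows, assuming the entrypoint reads INSTANCES and starts that many background copies of the crawler command. The function name launch_instances and the variable SPIDER_INSTANCE_ID are hypothetical and not taken from the actual script.

```shell
#!/bin/sh
# Hypothetical sketch only; this is NOT the real spider-entrypoint.sh.

# Start INSTANCES background copies of the given command, then wait for all of them.
launch_instances() {
    count="${INSTANCES:-1}"
    i=1
    while [ "$i" -le "$count" ]; do
        # SPIDER_INSTANCE_ID is an assumed variable for telling workers apart.
        SPIDER_INSTANCE_ID="$i" "$@" &
        i=$((i + 1))
    done
    # Block until every background worker has exited.
    wait
}

# Example (replace the echo with the actual crawler command):
# INSTANCES=3 launch_instances sh -c 'echo "worker $SPIDER_INSTANCE_ID started"'
```

Running the workers as background jobs and then calling wait keeps the container alive for as long as any worker is running.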
Usage
Quick start on a single machine (3 platforms x 3 instances each = 9 crawlers):
docker compose -f docker-compose.spider.yml up -d
Deploying across multiple machines:
# 1. Build the image and push it to your registry
docker build -f spider.Dockerfile -t your-registry/jobdata-spider:latest .
docker push your-registry/jobdata-spider:latest
# 2. Pull and run it on each machine
docker run -d --name spider-boss \
  -e PLATFORM=boss \
  -e INSTANCES=3 \
  -e API_BASE_URL=http://124.222.106.226:9999 \
  -e SLEEP_MIN_SECONDS=5 \
  -e SLEEP_MAX_SECONDS=12 \
  -e INLINE_COMPANY=0 \
  your-registry/jobdata-spider:latest
Tuning:
# Speed up: lower the delays and run more instances
-e SLEEP_MIN_SECONDS=3 -e SLEEP_MAX_SECONDS=8 -e INSTANCES=5
# If a proxy is needed
-e PROXY_TUNNEL=proxy.example.com:8080 -e PROXY_USERNAME=xxx -e PROXY_PASSWORD=xxx
# Boss requires a token (fetched automatically from the backend API; can also be set manually)
-e BOSS_MPT=xxx -e BOSS_WT2=xxx
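SLEEP_MIN_SECONDS and SLEEP_MAX_SECONDS presumably bound a random pause between requests. A minimal sketch of picking such a delay in POSIX shell follows, assuming a roughly uniform choice of whole seconds. The function name random_delay is hypothetical, and the crawler's real delay logic is not shown in this log.

```shell
#!/bin/sh
# Hypothetical sketch; the crawler's actual delay logic may differ.

# Print a random whole number of seconds in [min, max] (inclusive).
random_delay() {
    min="${1:-5}"    # defaults mirror SLEEP_MIN_SECONDS=5
    max="${2:-12}"   # and SLEEP_MAX_SECONDS=12
    span=$((max - min + 1))
    # Read two random bytes as an unsigned integer (0-65535).
    r="$(od -An -N2 -tu2 /dev/urandom | tr -d ' \n')"
    echo $((min + r % span))
}

# Example: sleep "$(random_delay "$SLEEP_MIN_SECONDS" "$SLEEP_MAX_SECONDS")"
```

Lowering both bounds shortens the average pause, which is what the speed-up setting above relies on.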
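The default configuration is already tuned: delays of 5-12 seconds (about twice as fast
as the previous 10-20), and INLINE_COMPANY=0 turns off inline company crawling. With
3 machines each running INSTANCES=3, Boss throughput could go from the current
5 items/hour to ~270 items/hour.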
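As a rough sanity check on that estimate, the arithmetic can be sketched as below. The figures come from the text; the helper names total_instances and per_instance_rate are hypothetical.

```shell
#!/bin/sh
# Back-of-the-envelope check of the throughput estimate (figures from the text).

# Total crawler processes across all machines: machines x instances per machine.
total_instances() {
    echo $(( $1 * $2 ))
}

# Items per hour each instance must sustain to reach a target rate.
per_instance_rate() {
    echo $(( $1 / $2 ))
}

n="$(total_instances 3 3)"              # 3 machines x INSTANCES=3
rate="$(per_instance_rate 270 "$n")"    # ~270 items/hour target
echo "instances=$n per_instance=$rate/hour seconds_per_item=$((3600 / rate))"
# prints: instances=9 per_instance=30/hour seconds_per_item=120
```

Under these numbers each instance would need to finish about one item every 120 seconds.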