Technology Stack
Analysis Date: 2026-03-21
Languages
Primary:
- Python 3.13 - Backend API, crawlers, ECS pipeline scripts
- JavaScript (ES Modules) - Vue 3 frontend (no strict TypeScript; TS installed but JS used in .vue and .js files)
Secondary:
- TypeScript 5.1.6 - Type definitions referenced in frontend toolchain only
Runtime
Environment:
- Python 3.13 (required: >=3.13 per Pipfile; Dockerfile uses python:3.11-slim-bullseye for container build)
- Node.js 18 (Dockerfile node:18-alpine for frontend build stage)
Package Manager:
- Python: pipenv (development), pip/requirements.txt (Docker production)
- Frontend: pnpm (lockfile: web/pnpm-lock.yaml present)
- Lockfiles: Pipfile.lock (Python), web/pnpm-lock.yaml (frontend), uv.lock (uv-compatible)
Frameworks
Core Backend:
- FastAPI 0.111.0 - REST API framework (app/__init__.py factory via create_app())
- Starlette 0.37.2 - ASGI underpinning (middleware, static files)
- Uvicorn 0.34.0 - ASGI server (20 workers default, configurable via UVICORN_WORKERS)
- Uvloop 0.21.0 - High-performance event loop (non-Windows only)
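The worker default and the non-Windows restriction on uvloop could be combined at startup roughly as follows. This is a minimal sketch, not the project's actual launcher: the helper name `uvicorn_options` and the `0.0.0.0` host default are assumptions, while the 20-worker default, `UVICORN_WORKERS`, `APP_HOST`, `APP_PORT`, and port 9999 come from this document.

```python
import importlib.util
import os
import sys


def uvicorn_options(env=None):
    """Build keyword arguments for uvicorn.run() from environment variables.

    Hypothetical helper: the name and the host default are illustrative only.
    """
    env = os.environ if env is None else env
    # 20 workers is the documented default; UVICORN_WORKERS overrides it.
    workers = int(env.get("UVICORN_WORKERS", "20"))
    # uvloop does not support Windows, so use the stdlib asyncio loop there
    # or whenever the package is not installed.
    has_uvloop = (
        sys.platform != "win32"
        and importlib.util.find_spec("uvloop") is not None
    )
    return {
        "host": env.get("APP_HOST", "0.0.0.0"),  # assumed default
        "port": int(env.get("APP_PORT", "9999")),
        "workers": workers,
        "loop": "uvloop" if has_uvloop else "asyncio",
    }
```

The resulting dictionary could be passed as `uvicorn.run("app:app", **uvicorn_options())`, where `"app:app"` is a placeholder import string; Uvicorn needs a string target rather than an app object when more than one worker is requested.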
ORM / Database:
- Tortoise-ORM 0.23.0 - Async ORM for MySQL (app/models/)
- Aerich 0.8.1 - Database migrations for Tortoise-ORM (migrations/ directory)
- aiomysql - Async MySQL driver (Tortoise backend)
- clickhouse-connect 0.7.19 - ClickHouse async client (app/core/clickhouse.py)
- asyncpg - Async PostgreSQL driver (listed in Pipfile but not actively used)
Validation / Serialization:
- Pydantic 2.10.5 - Request/response schemas (app/schemas/)
- pydantic-settings 2.7.1 - Settings management via env vars (app/settings/config.py)
- orjson 3.10.14 - Fast JSON serialization
- ujson 5.10.0 - Alternative JSON library
Authentication / Security:
- PyJWT 2.10.1 - JWT token generation/verification (HS256, 7-day expiry)
- passlib 1.7.4 - Password hashing
- argon2-cffi 23.1.0 - Argon2 password hasher
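To make the "HS256, 7-day expiry" note concrete, here is a standard-library sketch of what such a token involves. It is not the project's code, which uses PyJWT; the function names and the `sub` claim are assumptions for illustration. With PyJWT the same steps would be `jwt.encode(payload, secret, algorithm="HS256")` and `jwt.decode(token, secret, algorithms=["HS256"])`.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

SEVEN_DAYS = 7 * 24 * 60 * 60  # documented expiry window, in seconds


def _b64url(data: bytes) -> str:
    # JWTs use URL-safe base64 without "=" padding.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    # Restore the padding that _b64url stripped before decoding.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: str) -> bytes:
    # HS256 is HMAC-SHA256 over "<header>.<payload>".
    return hmac.new(secret.encode(), signing_input.encode("ascii"), hashlib.sha256).digest()


def issue_token(user_id: int, secret: str, now: Optional[float] = None) -> str:
    """Create a signed token that expires seven days after `now`."""
    now = time.time() if now is None else now
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"sub": str(user_id), "exp": int(now) + SEVEN_DAYS}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode())
        for part in (header, payload)
    )
    return f"{signing_input}.{_b64url(_sign(signing_input, secret))}"


def verify_token(token: str, secret: str, now: Optional[float] = None) -> dict:
    """Return the payload if the signature matches and the token is unexpired."""
    now = time.time() if now is None else now
    signing_input, _, signature = token.rpartition(".")
    # compare_digest avoids leaking timing information about the signature.
    if not hmac.compare_digest(_sign(signing_input, secret), _b64url_decode(signature)):
        raise ValueError("bad signature")
    payload = json.loads(_b64url_decode(signing_input.split(".")[1]))
    if payload["exp"] <= now:
        raise ValueError("token expired")
    return payload
```

A token issued at time `t` verifies until `t + 604800` seconds and is rejected afterwards, or at any time if checked against a different secret.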
Scheduling:
- APScheduler - Async job scheduler (app/core/scheduler.py, 6 registered cron tasks)
HTTP Clients:
- httpx 0.28.1 - Async HTTP client (backend service-to-service calls)
- requests - Sync HTTP client (spider scripts in jobs_spider/)
Logging:
- loguru 0.7.3 - Structured logging (unified throughout backend and spiders)
Core Frontend:
- Vue 3.3.4 - SPA framework (web/src/main.js)
- Vue Router 4.2.4 - Client-side routing (web/src/router/index.js)
- Pinia 2.1.6 - State management (web/src/store/)
- Naive UI 2.34.4 - Component library (admin UI)
- ECharts 6.0.0 - Data visualization charts (web/src/views/analytics/)
- axios 1.4.0 - HTTP client (web/src/utils/http/)
- vue-i18n 9 - Internationalization
Build / Dev Tools:
- Vite 4.4.6 - Frontend bundler (web/vite.config.js)
- UnoCSS 66.5.10 - Atomic CSS engine
- unplugin-auto-import 20.3.0 - Auto-imports for Vue APIs
- unplugin-vue-components 30.0.0 - Auto-imports for components
- @iconify/vue + @iconify/json - Icon library
Spider-specific:
- playwright 1.57.0 - Browser automation (anti-detection in crawlers)
- PyExecJS 1.5.1 - Execute JavaScript signing algorithms from Python (jobs_spider/boss/)
- PySocks - SOCKS proxy support for spider requests
- tenacity - Retry logic in crawlers
- pandas + openpyxl - Data processing and Excel export
Cloud:
- alibabacloud_ecs20140526 - Alibaba Cloud ECS SDK (ecs_full_pipeline.py)
- alibabacloud_credentials - Alibaba Cloud credential management
Code Quality:
- ruff 0.9.1 - Python linter (pyproject.toml config: line-length 120, ignores F403/F405)
- black 24.10.0 - Python formatter (line-length 120, target py310/py311)
- isort 5.13.2 - Import sorter
- ESLint 8.46.0 - Frontend linter (@zclzone + @unocss rule sets)
- prettier - Frontend formatter
Key Dependencies
Critical:
- clickhouse-connect==0.7.19 - All analytics and job data storage; loss means no data read/write
- tortoise-orm==0.23.0 - All business data (users, roles, keywords, tokens); paired with aerich for migrations
- fastapi==0.111.0 - API layer; version-pinned for stability
- APScheduler - 6 scheduled tasks including ECS pipeline and IP alerting
- alibabacloud_ecs20140526 - ECS node management; required for crawler scaling
Infrastructure:
- uvicorn==0.34.0 + uvloop==0.21.0 - Production ASGI server stack
- pydantic==2.10.5 - All input validation; v2 API used throughout
- loguru==0.7.3 - Unified logging across all modules
- redis - Optional distributed lock backend (app/core/locks.py; falls back to file locks if Redis unavailable)
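The Redis-with-file-fallback locking noted for app/core/locks.py could look roughly like the sketch below. The contents of that module are not shown in this document, so every class and function name here is hypothetical. The Redis branch assumes a redis-py style client and uses only its `ping()`, `set(..., nx=True, ex=...)`, and `delete()` calls.

```python
import os
import tempfile


class FileLock:
    """Single-host lock backed by exclusive creation of a lock file."""

    def __init__(self, name, directory=None):
        self.path = os.path.join(directory or tempfile.gettempdir(), f"{name}.lock")

    def acquire(self):
        try:
            # O_CREAT | O_EXCL fails atomically if the file already exists.
            fd = os.open(self.path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            return False
        os.write(fd, str(os.getpid()).encode())
        os.close(fd)
        return True

    def release(self):
        try:
            os.remove(self.path)
        except FileNotFoundError:
            pass


class RedisLock:
    """Distributed lock using SET NX EX on a redis-py style client."""

    def __init__(self, client, name, ttl_seconds=60):
        self.client, self.key, self.ttl = client, f"lock:{name}", ttl_seconds

    def acquire(self):
        # The key is written only if absent and expires after ttl seconds.
        return bool(self.client.set(self.key, "1", nx=True, ex=self.ttl))

    def release(self):
        # Simplification: a production lock would store a unique token and
        # delete the key only if it still holds that token.
        self.client.delete(self.key)


def make_lock(name, redis_client=None, directory=None):
    """Prefer Redis when a reachable client is supplied, else use a file lock."""
    if redis_client is not None:
        try:
            redis_client.ping()
            return RedisLock(redis_client, name)
        except Exception:
            pass  # Redis unreachable: fall through to the file lock
    return FileLock(name, directory)
```

The file lock only coordinates processes on one host, which is presumably why Redis is described as the distributed option.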
Configuration
Environment:
- All settings in app/settings/config.py via pydantic-settings.BaseSettings
- Environment variables override defaults at startup
- No .env file detected (not committed); variables set at OS/container level
- Key variables: APP_HOST, APP_PORT, UVICORN_WORKERS, CLICKHOUSE_HOST, CLICKHOUSE_USER, CLICKHOUSE_PASS, SECRET_KEY, SMTP_HOST, SMTP_USER, SMTP_PASS, REDIS_HOST, REPORT_ENDPOINT
- Security warning: config.py contains hardcoded default values for MySQL, ClickHouse, and SMTP credentials; must be overridden in production
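One way to act on the security warning is a startup guard that refuses to run unless the sensitive variables are actually set in the environment. The sketch below is an illustration only: the helper names are hypothetical, and the variable list is limited to the secrets named among the key variables above (the MySQL variable names are not listed in this document, so they are left out).

```python
import os

# Secrets named in this document whose in-code defaults should never reach production.
SENSITIVE_VARS = ("SECRET_KEY", "CLICKHOUSE_PASS", "SMTP_PASS")


def missing_overrides(env=None, required=SENSITIVE_VARS):
    """Return the required variables that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


def assert_production_ready(env=None):
    """Raise if any sensitive setting would fall back to its hardcoded default."""
    missing = missing_overrides(env)
    if missing:
        raise RuntimeError(
            "Refusing to start with default credentials; set: " + ", ".join(missing)
        )
```

Calling `assert_production_ready()` before the settings object is built would turn a silent fallback to a hardcoded default into an explicit startup failure.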
Build:
- pyproject.toml - Python project metadata, black/ruff tool config
- Pipfile / Pipfile.lock - Development dependency management
- requirements.txt - Production pip install (used in Dockerfile)
- web/vite.config.js - Vite build config with proxy support via VITE_USE_PROXY env var
- Dockerfile - Multi-stage build: Node 18 (frontend) + Python 3.11 (backend) + nginx
Platform Requirements
Development:
- Python 3.13+ (Pipfile requirement)
- Node.js 18+ with pnpm
- MySQL server (Tortoise-ORM connection)
- ClickHouse server (analytics data)
- Optional: Redis (distributed locking upgrade)
Production:
- Docker-based deployment (see Dockerfile, deploy/entrypoint.sh, deploy/web.conf)
- Nginx serves frontend static files and proxies API requests
- Alibaba Cloud ECS (cn-shanghai-b zone) for crawler nodes
- Ports: 80 (nginx), 9999 (FastAPI backend direct)
Stack analysis: 2026-03-21