Technology Stack

Analysis Date: 2026-03-21

Languages

Primary:

  • Python 3.13 - Backend API, crawlers, ECS pipeline scripts
  • JavaScript (ES Modules) - Vue 3 frontend (TypeScript is installed, but the .vue and .js files are written in plain JavaScript)

Secondary:

  • TypeScript 5.1.6 - Type definitions referenced in frontend toolchain only

Runtime

Environment:

  • Python 3.13 (Pipfile requires >=3.13; note that the Dockerfile builds the container from python:3.11-slim-bullseye, which is below that floor)
  • Node.js 18 (Dockerfile node:18-alpine for frontend build stage)
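
Because the Pipfile floor (>=3.13) and the Dockerfile base image (Python 3.11) disagree, a startup guard could surface the mismatch early. The following is a minimal sketch, not code from the repository: the helper name `meets_minimum` is hypothetical, and the (3, 13) floor is taken from the Pipfile note above.

```python
import sys

# Minimum interpreter version, taken from the Pipfile requirement (>=3.13).
MIN_PYTHON = (3, 13)


def meets_minimum(version=None, minimum=MIN_PYTHON):
    """Return True if `version` (major, minor, ...) satisfies `minimum`.

    `meets_minimum` is a hypothetical helper, not part of the project.
    """
    if version is None:
        version = sys.version_info
    # Compare only (major, minor) so patch releases do not matter.
    return tuple(version[:2]) >= tuple(minimum)


if __name__ == "__main__":
    if not meets_minimum():
        found = f"{sys.version_info.major}.{sys.version_info.minor}"
        print(f"Python {found} is below the required {MIN_PYTHON}")
```

Under these assumptions, the 3.11 container image would fail the check while a local 3.13 development environment would pass it.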

Package Manager:

  • Python: pipenv (development), pip / requirements.txt (Docker production)
  • Frontend: pnpm
  • Lockfiles: Pipfile.lock (pipenv), web/pnpm-lock.yaml (pnpm), uv.lock (uv)

Frameworks

Core Backend:

  • FastAPI 0.111.0 - REST API framework (app/__init__.py factory via create_app())
  • Starlette 0.37.2 - ASGI underpinning (middleware, static files)
  • Uvicorn 0.34.0 - ASGI server (20 workers default, configurable via UVICORN_WORKERS)
  • Uvloop 0.21.0 - High-performance event loop (non-Windows only)
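
The worker count and event-loop choice described above could be resolved along these lines. This is a sketch under stated assumptions: the helper name `uvicorn_options` is hypothetical, and only the UVICORN_WORKERS variable, the default of 20, and the non-Windows restriction on uvloop come from this document.

```python
import os
import sys

DEFAULT_WORKERS = 20  # default worker count stated above


def uvicorn_options(env=None, platform=None):
    """Build keyword arguments for uvicorn.run() (hypothetical helper)."""
    env = os.environ if env is None else env
    platform = sys.platform if platform is None else platform
    # UVICORN_WORKERS overrides the default when it is set.
    workers = int(env.get("UVICORN_WORKERS", DEFAULT_WORKERS))
    # uvloop is not available on Windows, so fall back to asyncio there.
    loop = "asyncio" if platform.startswith("win") else "uvloop"
    return {"workers": workers, "loop": loop}
```

The result could then be passed as `uvicorn.run("app:create_app", factory=True, **uvicorn_options())`; the import string `"app:create_app"` is an assumption based on the factory mentioned above, not a confirmed entry point.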

ORM / Database:

  • Tortoise-ORM 0.23.0 - Async ORM for MySQL (app/models/)
  • Aerich 0.8.1 - Database migrations for Tortoise-ORM (migrations/ directory)
  • aiomysql - Async MySQL driver (Tortoise backend)
  • clickhouse-connect 0.7.19 - ClickHouse async client (app/core/clickhouse.py)
  • asyncpg - Async PostgreSQL driver (listed in Pipfile but not actively used)

Validation / Serialization:

  • Pydantic 2.10.5 - Request/response schemas (app/schemas/)
  • pydantic-settings 2.7.1 - Settings management via env vars (app/settings/config.py)
  • orjson 3.10.14 - Fast JSON serialization
  • ujson 5.10.0 - Alternative JSON library

Authentication / Security:

  • PyJWT 2.10.1 - JWT token generation/verification (HS256, 7-day expiry)
  • passlib 1.7.4 - Password hashing
  • argon2-cffi 23.1.0 - Argon2 password hasher
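
To make the HS256 scheme and the 7-day expiry concrete, here is a standard-library stand-in for what PyJWT's `jwt.encode` and `jwt.decode` do. It is illustrative only and is not the project's code: the function names and the `sub`/`iat`/`exp` claim layout are assumptions, and the real service presumably calls PyJWT rather than signing by hand.

```python
import base64
import hashlib
import hmac
import json
import time

SEVEN_DAYS = 7 * 24 * 60 * 60  # expiry window stated above, in seconds


def _b64url(data: bytes) -> str:
    # JWTs use unpadded URL-safe base64.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: str) -> str:
    digest = hmac.new(secret.encode(), signing_input.encode(), hashlib.sha256).digest()
    return _b64url(digest)


def issue_token(subject: str, secret: str, now=None) -> str:
    """Return an HS256-signed token that expires after seven days."""
    now = int(time.time()) if now is None else int(now)
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"sub": subject, "iat": now, "exp": now + SEVEN_DAYS}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode())
        for part in (header, payload)
    )
    return f"{signing_input}.{_sign(signing_input, secret)}"


def verify_token(token: str, secret: str, now=None) -> dict:
    """Check signature and expiry; return the claims or raise ValueError."""
    now = int(time.time()) if now is None else int(now)
    signing_input, _, signature = token.rpartition(".")
    expected = _sign(signing_input, secret)
    # Constant-time comparison avoids leaking signature bytes via timing.
    if not hmac.compare_digest(expected.encode(), signature.encode()):
        raise ValueError("bad signature")
    payload = json.loads(_b64url_decode(signing_input.split(".")[1]))
    if payload["exp"] <= now:
        raise ValueError("token expired")
    return payload
```

A token issued with one secret fails verification under any other secret, and the same token is rejected once `now` reaches the `exp` claim.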

Scheduling:

  • APScheduler - Async job scheduler (app/core/scheduler.py, 6 registered cron tasks)

HTTP Clients:

  • httpx 0.28.1 - Async HTTP client (backend service-to-service calls)
  • requests - Sync HTTP client (spider scripts in jobs_spider/)

Logging:

  • loguru 0.7.3 - Structured logging (unified throughout backend and spiders)

Core Frontend:

  • Vue 3.3.4 - SPA framework (web/src/main.js)
  • Vue Router 4.2.4 - Client-side routing (web/src/router/index.js)
  • Pinia 2.1.6 - State management (web/src/store/)
  • Naive UI 2.34.4 - Component library (admin UI)
  • ECharts 6.0.0 - Data visualization charts (web/src/views/analytics/)
  • axios 1.4.0 - HTTP client (web/src/utils/http/)
  • vue-i18n 9 - Internationalization

Build / Dev Tools:

  • Vite 4.4.6 - Frontend bundler (web/vite.config.js)
  • UnoCSS 66.5.10 - Atomic CSS engine
  • unplugin-auto-import 20.3.0 - Auto-imports for Vue APIs
  • unplugin-vue-components 30.0.0 - Auto-imports for components
  • @iconify/vue + @iconify/json - Icon library

Spider-specific:

  • playwright 1.57.0 - Browser automation (anti-detection in crawlers)
  • PyExecJS 1.5.1 - Execute JavaScript signing algorithms from Python (jobs_spider/boss/)
  • PySocks - SOCKS proxy support for spider requests
  • tenacity - Retry logic in crawlers
  • pandas + openpyxl - Data processing and Excel export
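
The retry behavior that tenacity provides to the crawlers can be pictured with a small standard-library stand-in. tenacity expresses the same idea declaratively through its `retry` decorator with `stop_after_attempt` and `wait_exponential`. The helper below is hypothetical, and its attempt count and delays are illustrative values, not the crawlers' actual settings.

```python
import time


def retry(func, attempts=3, base_delay=0.5, exceptions=(Exception,), sleep=time.sleep):
    """Call func() until it succeeds, doubling the wait after each failure.

    Hypothetical helper: waits base_delay * 2**n seconds after the n-th
    failed attempt and re-raises the last error once attempts run out.
    """
    for attempt in range(attempts):
        try:
            return func()
        except exceptions:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)
```

Passing `sleep` as a parameter keeps the helper testable without real delays. A spider would typically narrow `exceptions` to network errors so that parsing bugs are not retried.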

Cloud:

  • alibabacloud_ecs20140526 - Alibaba Cloud ECS SDK (ecs_full_pipeline.py)
  • alibabacloud_credentials - Alibaba Cloud credential management

Code Quality:

  • ruff 0.9.1 - Python linter (pyproject.toml config: line-length 120, ignores F403/F405)
  • black 24.10.0 - Python formatter (line-length 120; target versions py310/py311, which lag the Python 3.13 runtime)
  • isort 5.13.2 - Import sorter
  • ESLint 8.46.0 - Frontend linter (@zclzone + @unocss rule sets)
  • prettier - Frontend formatter

Key Dependencies

Critical:

  • clickhouse-connect==0.7.19 - Stores all analytics and job data; if it is unavailable, that data can be neither read nor written
  • tortoise-orm==0.23.0 - All business data (users, roles, keywords, tokens); paired with aerich for migrations
  • fastapi==0.111.0 - API layer; version-pinned for stability
  • APScheduler - 6 scheduled tasks including ECS pipeline and IP alerting
  • alibabacloud_ecs20140526 - ECS node management; required for crawler scaling

Infrastructure:

  • uvicorn==0.34.0 + uvloop==0.21.0 - Production ASGI server stack
  • pydantic==2.10.5 - All input validation; v2 API used throughout
  • loguru==0.7.3 - Unified logging across all modules
  • redis - Optional distributed lock backend (app/core/locks.py; falls back to file locks if Redis unavailable)
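
The Redis-or-file-lock fallback noted above might look roughly like this. It is a sketch under stated assumptions, not the contents of app/core/locks.py: `FileLock` and `make_lock` are hypothetical names, and the 60-second Redis timeout is an illustrative value.

```python
import os


class FileLock:
    """Non-blocking lock backed by exclusive creation of a lock file."""

    def __init__(self, path):
        self.path = path
        self._held = False

    def acquire(self):
        try:
            # O_EXCL makes creation fail if the file already exists.
            fd = os.open(self.path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            return False
        os.close(fd)
        self._held = True
        return True

    def release(self):
        # Only the holder removes the file, so a failed acquire is harmless.
        if self._held:
            os.remove(self.path)
            self._held = False


def make_lock(name, redis_client=None, lock_dir="."):
    """Prefer a Redis lock when a client is supplied, else use a file lock."""
    if redis_client is not None:
        # redis-py's client.lock() returns an object with acquire()/release();
        # note that its acquire() blocks by default, unlike FileLock.acquire().
        return redis_client.lock(name, timeout=60)
    return FileLock(os.path.join(lock_dir, f"{name}.lock"))
```

A file lock only coordinates processes that share a filesystem, which is why Redis is the better choice once work is spread across several hosts.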

Configuration

Environment:

  • All settings in app/settings/config.py via pydantic-settings.BaseSettings
  • Environment variables override defaults at startup
  • No .env file is committed to the repository; variables are set at the OS or container level
  • Key variables: APP_HOST, APP_PORT, UVICORN_WORKERS, CLICKHOUSE_HOST, CLICKHOUSE_USER, CLICKHOUSE_PASS, SECRET_KEY, SMTP_HOST, SMTP_USER, SMTP_PASS, REDIS_HOST, REPORT_ENDPOINT
  • Security warning: config.py contains hardcoded default values for MySQL, ClickHouse, and SMTP credentials; must be overridden in production
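
Given the hardcoded defaults flagged above, a deployment could refuse to start unless the sensitive variables are actually present in the environment. The sketch below is hypothetical and is not part of config.py. It checks only the secret-bearing names from the key-variables list; the MySQL variable names are not given in this document, so they are omitted.

```python
import os

# Sensitive variables named in the list above; each should come from the
# environment rather than from a default baked into config.py.
REQUIRED_SECRETS = ("SECRET_KEY", "CLICKHOUSE_PASS", "SMTP_PASS")


def missing_secrets(env=None, required=REQUIRED_SECRETS):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


def assert_production_ready(env=None):
    """Raise if any sensitive variable would fall back to a default."""
    missing = missing_secrets(env)
    if missing:
        raise RuntimeError("Override before deploying: " + ", ".join(missing))
```

Calling `assert_production_ready()` from the container entrypoint would turn a silent fallback to default credentials into an immediate startup failure.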

Build:

  • pyproject.toml - Python project metadata, black/ruff tool config
  • Pipfile / Pipfile.lock - Development dependency management
  • requirements.txt - Production pip install (used in Dockerfile)
  • web/vite.config.js - Vite build config with proxy support via VITE_USE_PROXY env var
  • Dockerfile - Multi-stage build: Node 18 (frontend) + Python 3.11 (backend) + nginx

Platform Requirements

Development:

  • Python 3.13+ (Pipfile requirement)
  • Node.js 18+ with pnpm
  • MySQL server (Tortoise-ORM connection)
  • ClickHouse server (analytics data)
  • Optional: Redis (enables distributed locks in place of the file-lock fallback)

Production:

  • Docker-based deployment (see Dockerfile, deploy/entrypoint.sh, deploy/web.conf)
  • Nginx serves frontend static files and proxies API requests
  • Alibaba Cloud ECS (cn-shanghai-b zone) for crawler nodes
  • Ports: 80 (nginx), 9999 (FastAPI backend direct)

Stack analysis: 2026-03-21