win 76085ac403 docs(01-02): complete sign algorithms plan — SUMMARY, STATE, ROADMAP updates
- Create 01-02-SUMMARY.md with implementation details, test counts, deviation docs
- STATE.md: mark plan 02 complete, add decisions from plan 02, update position
- ROADMAP.md: mark phase 1 plans 2/2 complete, update progress table
- REQUIREMENTS.md: mark QUAL-01 complete (41 sign algorithm unit tests)
2026-03-21 18:27:35 +08:00


phase: 01-shared-core
plan: 02
subsystem: crawler_core
tags: python, sign-algorithms, unit-tests, pure-functions, boss, qcwy, zhilian
dependency_graph:
  requires: 01-01
  provides: crawler_core/boss/sign.py, crawler_core/qcwy/sign.py, crawler_core/zhilian/sign.py
  affects: spiderJobs, app/services/crawler, tests/crawler_core
tech_stack (added, patterns): pytest-for-sign-tests, pure-function-testing, sys-path-conftest, try-except-import-guard
key_files:
  created: crawler_core/boss/sign.py, crawler_core/qcwy/sign.py, crawler_core/zhilian/sign.py, tests/crawler_core/test_boss_sign.py, tests/crawler_core/test_qcwy_sign.py, tests/crawler_core/test_zhilian_sign.py, conftest.py
  modified: pyproject.toml, crawler_core/__init__.py
decisions:
  • Sign algorithm files ported verbatim from spiderJobs/platforms/; only the module docstring was updated to reference crawler_core
  • tests/crawler_core/__init__.py NOT created (it would shadow the crawler_core package and break imports)
  • conftest.py at project root adds the project root to sys.path for reliable crawler_core subpackage imports
  • pyproject.toml [tool.pytest.ini_options] pythonpath=['.'] added for pytest runner path setup
  • crawler_core/__init__.py wrapped in try/except ImportError so sign subpackages can be imported in environments without requests_go
metrics:
  duration: ~15 minutes
  completed: 2026-03-21T10:22:46Z
  tasks_completed: 2
  files_created: 7
  files_modified: 2

Phase 01 Plan 02: Sign Algorithm Port + Unit Tests — Summary

One-liner: Ported Boss/51Job/Zhilian sign algorithms verbatim from spiderJobs/platforms/ to crawler_core/{boss,qcwy,zhilian}/sign.py and wrote 41 pure-function unit tests (no HTTP, no mocks) covering all exported classes and helpers.

What Was Created

| File | Lines | Purpose |
| --- | --- | --- |
| crawler_core/boss/sign.py | 108 | BossSign traceid generator; pure stdlib (random, time), ported from spiderJobs |
| crawler_core/qcwy/sign.py | 89 | Job51Sign HMAC-SHA256 signing; pure stdlib (hmac, hashlib), ported from spiderJobs |
| crawler_core/zhilian/sign.py | 87 | ZhilianSign header/param signing; pure stdlib (math, random), ported from spiderJobs |
| tests/crawler_core/test_boss_sign.py | 85 | 13 tests: BossSign, _compute_checksum, _generate_uuid, _CHARS |
| tests/crawler_core/test_qcwy_sign.py | 73 | 12 tests: Job51Sign init, build_sign_path, generate_uuid |
| tests/crawler_core/test_zhilian_sign.py | 91 | 16 tests: ZhilianSign init, sign_headers, sign_params, generate_uuid |
| conftest.py | 4 | Project root sys.path setup for pytest import resolution |

Exported Interfaces

crawler_core/boss/sign.py:

from crawler_core.boss.sign import BossSign, _compute_checksum, _generate_uuid, _CHARS

BossSign.generate_traceid(prefix="M-W") -> str  # "M-W" + 19-char uuid + 3-char checksum = 25 chars
BossSign(mpt="", wt2="")  # token holders, no HTTP
_compute_checksum(uuid_str: str) -> str  # 3-char base62 deterministic checksum
_generate_uuid() -> str  # 13-char hex timestamp + 6-char base62 random = 19 chars
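
The 25-character layout described above (3-char prefix, 19-char uuid, 3-char checksum) can be sketched as follows. This is only an illustration of the shape, not the ported code: the base62 alphabet order, the microsecond time source, and the weighted-sum checksum are assumptions, and every `sketch_*` name is hypothetical.

```python
import random
import string
import time

# Assumed base62 alphabet; the real _CHARS constant may use a different order.
BASE62 = string.digits + string.ascii_letters


def sketch_uuid() -> str:
    """13 hex chars of timestamp followed by 6 random base62 chars (19 total)."""
    # Microseconds since the epoch currently fit in exactly 13 hex digits;
    # the time source used by the real module is an assumption here.
    stamp = format(time.time_ns() // 1000, "013x")[-13:]
    tail = "".join(random.choice(BASE62) for _ in range(6))
    return stamp + tail


def sketch_checksum(uuid_str: str) -> str:
    """Hypothetical deterministic 3-char base62 checksum (NOT the real algorithm)."""
    total = sum((i + 1) * ord(ch) for i, ch in enumerate(uuid_str)) % (62 ** 3)
    out = ""
    for _ in range(3):
        total, rem = divmod(total, 62)
        out = BASE62[rem] + out
    return out


def sketch_traceid(prefix: str = "M-W") -> str:
    """prefix + 19-char uuid + 3-char checksum."""
    uid = sketch_uuid()
    return prefix + uid + sketch_checksum(uid)
```

Because the checksum depends only on the uuid string, a test can assert determinism and length without touching the clock or the random generator, which is the property the pure-function tests rely on.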

crawler_core/qcwy/sign.py:

from crawler_core.qcwy.sign import Job51Sign, SIGN_KEY

SIGN_KEY = "abfc8f9d..."  # 64-char HMAC key
Job51Sign(sign_key=SIGN_KEY)
Job51Sign.generate_uuid() -> str  # 13-char ms timestamp + 10-char random int = 23 digits
Job51Sign.build_sign_path(endpoint, method, params, body) -> tuple[str, str]  # (url_path, sign_hex)
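
A minimal sketch of what an HMAC-SHA256 path signer with this return shape might look like. The source confirms only the signature and the `(url_path, sign_hex)` tuple; which string is signed, how the body is serialized, and the `DEMO_KEY` placeholder are assumptions, and the `sketch_*` names are hypothetical.

```python
import hashlib
import hmac
import json
import random
import time
from urllib.parse import urlencode

DEMO_KEY = "demo-key-not-the-real-one"  # placeholder; not the real SIGN_KEY


def sketch_build_sign_path(endpoint, method="GET", params=None, body=None, key=DEMO_KEY):
    """Return (url_path, sign_hex) using HMAC-SHA256 over an assumed message."""
    url_path = endpoint
    if params:
        url_path += "?" + urlencode(params)
    # Assumption: the path is signed, with the compact JSON body appended for POST.
    message = url_path
    if method.upper() == "POST" and body is not None:
        message += json.dumps(body, separators=(",", ":"), ensure_ascii=False)
    sign_hex = hmac.new(key.encode("utf-8"), message.encode("utf-8"), hashlib.sha256).hexdigest()
    return url_path, sign_hex


def sketch_generate_uuid() -> str:
    """13-digit millisecond timestamp followed by a 10-digit random integer."""
    millis = str(int(time.time() * 1000))
    rand = str(random.randint(10 ** 9, 10 ** 10 - 1))
    return millis + rand
```

Since the signature is a pure function of key, path, and body, the same inputs always yield the same 64-character hex digest, so it can be tested without HTTP or mocks.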

crawler_core/zhilian/sign.py:

from crawler_core.zhilian.sign import ZhilianSign

ZhilianSign(at="", rt="", device_id=None, version="4.1.259", channel="wxxiaochengxu", platform="12")
ZhilianSign.generate_uuid() -> str  # UUID4-format uppercase hex, 36 chars
ZhilianSign.sign_headers(page_code="0") -> dict  # 9 headers including x-zp-device-id, x-zp-business-system="73"
ZhilianSign.sign_params() -> dict  # 6 params: at, rt, channel, platform, version, d
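
The source states only that `generate_uuid()` returns a 36-character, UUID4-format, uppercase hex string and that the module uses `math` and `random`. One common way to get that shape with those two modules is a template fill, sketched below; whether the ported code works this way is an assumption, and both function names are hypothetical.

```python
import math
import random
import re

# 'x' is any hex digit, '4' is the version nibble, 'y' is a variant digit (8, 9, A or B).
TEMPLATE = "xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx"
UUID4_UPPER = re.compile(
    r"[0-9A-F]{8}-[0-9A-F]{4}-4[0-9A-F]{3}-[89AB][0-9A-F]{3}-[0-9A-F]{12}"
)


def sketch_uuid4_upper() -> str:
    """Fill the template with random uppercase hex digits."""
    def fill(ch: str) -> str:
        r = math.floor(random.random() * 16)        # 0..15
        v = r if ch == "x" else (r & 0x3) | 0x8     # 'y' is forced into 8..11
        return format(v, "X")

    return "".join(fill(c) if c in "xy" else c for c in TEMPLATE)


def looks_like_uuid4_upper(value: str) -> bool:
    """Shape check of the kind a pure-function test could assert."""
    return UUID4_UPPER.fullmatch(value) is not None
```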

Test Counts

| File | Tests | Classes |
| --- | --- | --- |
| test_boss_sign.py | 13 | TestBossSignGenerateTraceid (6), TestComputeChecksum (4), TestGenerateUuid (3) |
| test_qcwy_sign.py | 12 | TestJob51SignInit (2), TestJob51SignBuildSignPath (7), TestJob51SignGenerateUuid (3) |
| test_zhilian_sign.py | 16 | TestZhilianSignInit (4), TestZhilianSignHeaders (6), TestZhilianSignParams (3), TestZhilianGenerateUuid (3) |
| Total | 41 | 10 test classes |

Decisions Made

| Decision | Rationale |
| --- | --- |
| Port verbatim, change only docstring | Ensures behavioral parity with originals; no regression risk |
| No tests/crawler_core/__init__.py | Creating it would make tests/crawler_core/ a Python package named crawler_core, shadowing the real package and causing ModuleNotFoundError: No module named 'crawler_core.boss' |
| conftest.py at project root | Pytest loads conftest.py before test collection; sys.path.insert here ensures crawler_core is findable |
| try/except ImportError in crawler_core/__init__.py | requests_go is not installed in the lightweight pyenv Python used for tests; wrapping allows sign subpackages to be imported without the heavy HTTP client |
| pyproject.toml pythonpath=["."] | Belt-and-suspenders path setup for pytest in addition to conftest.py |
| Old spiderJobs files preserved | Plan explicitly states not to delete/modify until Phase 4 cleanup |
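
The two import-related decisions above (root-level sys.path setup and the ImportError guard) can be sketched as small helpers. This is an illustration under assumptions, not the contents of the real conftest.py or crawler_core/__init__.py; `ensure_on_sys_path` and `optional_import` are hypothetical names.

```python
import importlib
import sys
from pathlib import Path
from types import ModuleType
from typing import Optional


def ensure_on_sys_path(root: Path) -> bool:
    """Put `root` at the front of sys.path once; return True if it was added."""
    entry = str(root.resolve())
    if entry in sys.path:
        return False
    sys.path.insert(0, entry)
    return True


def optional_import(name: str) -> Optional[ModuleType]:
    """Import a module if it is installed, otherwise return None instead of raising."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None
```

In a root-level conftest.py the first helper would typically be called with `Path(__file__).parent`. The second mirrors the guard idea: a package `__init__` that depends on a heavy optional client can still be imported, so pure-stdlib submodules underneath it remain reachable.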

Commits

| Hash | Task | Description |
| --- | --- | --- |
| bd1e50e | Task 1 | Port sign algorithms to crawler_core/ platform directories |
| 333a6d1 | Task 2 | Write sign algorithm unit tests for crawler_core |

Deviations from Plan

Auto-fixed Issues

1. [Rule 2 - Missing Critical Functionality] Removed tests/crawler_core/__init__.py

  • Found during: Task 2 verification
  • Issue: Creating tests/crawler_core/__init__.py (as the plan specified) caused pytest to treat tests/crawler_core/ as the crawler_core package, shadowing the actual crawler_core/ package at project root. Result: ModuleNotFoundError: No module named 'crawler_core.boss'
  • Fix: Removed tests/crawler_core/__init__.py; added conftest.py at project root; added [tool.pytest.ini_options] pythonpath=["."] to pyproject.toml
  • Files modified: conftest.py (new), pyproject.toml (modified), crawler_core/__init__.py (try/except guard; this had already been added by the 01-01 executor)
  • Commits: 333a6d1
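
The shadowing failure described above can be reproduced in isolation. Under pytest's default prepend import mode, the first directory upward that lacks an `__init__.py` is inserted at the front of sys.path, so with `tests/crawler_core/__init__.py` present, `tests/` ends up ahead of the project root. The sketch below imitates that ordering by hand in a temporary directory; `demo_core_pkg` is a hypothetical stand-in for crawler_core.

```python
import importlib
import sys
import tempfile
from pathlib import Path

PKG = "demo_core_pkg"  # hypothetical stand-in for crawler_core


def build_layout(root: Path) -> None:
    """Create root/PKG/boss/ (real package) and root/tests/PKG/ (accidental package)."""
    real_sub = root / PKG / "boss"
    real_sub.mkdir(parents=True)
    (root / PKG / "__init__.py").write_text("")
    (real_sub / "__init__.py").write_text("NAME = 'real'\n")
    shadow = root / "tests" / PKG
    shadow.mkdir(parents=True)
    (shadow / "__init__.py").write_text("")  # the file that causes the shadowing


def _forget_pkg() -> None:
    """Drop any cached copies of PKG so each attempt starts clean."""
    for name in [m for m in sys.modules if m == PKG or m.startswith(PKG + ".")]:
        del sys.modules[name]


def try_import(front_entries: list) -> str:
    """Import PKG.boss with the given entries placed at the front of sys.path."""
    saved = list(sys.path)
    _forget_pkg()
    importlib.invalidate_caches()
    sys.path[:0] = front_entries
    try:
        importlib.import_module(PKG + ".boss")
        return "ok"
    except ModuleNotFoundError as exc:
        return f"ModuleNotFoundError: {exc}"
    finally:
        sys.path[:] = saved
        _forget_pkg()


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    build_layout(root)
    shadowed = try_import([str(root / "tests"), str(root)])  # tests/ wins the lookup
    fixed = try_import([str(root)])                          # only the real package is visible
```

In the first attempt the empty package under `tests/` is found first and has no `boss` subpackage, which matches the error reported in the issue; with only the project root on the path the import succeeds.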

Known Stubs

None — all three sign algorithm files are fully implemented pure functions. The test files are complete with all specified test cases.

Self-Check: PASSED

All created files exist:

  • crawler_core/boss/sign.py — FOUND
  • crawler_core/qcwy/sign.py — FOUND
  • crawler_core/zhilian/sign.py — FOUND
  • tests/crawler_core/test_boss_sign.py — FOUND
  • tests/crawler_core/test_qcwy_sign.py — FOUND
  • tests/crawler_core/test_zhilian_sign.py — FOUND
  • conftest.py — FOUND

All commits exist:

  • bd1e50e — FOUND (feat(01-02): port sign algorithms)
  • 333a6d1 — FOUND (test(01-02): write sign algorithm unit tests)

Cross-contamination checks:

  • grep -r "from spiderJobs" crawler_core/ → PASSED: no spiderJobs imports
  • grep -r "import requests" crawler_core/boss/sign.py crawler_core/qcwy/sign.py crawler_core/zhilian/sign.py → PASSED: no HTTP imports
  • git diff --name-only spiderJobs/ → PASSED: spiderJobs files untouched
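
The two grep checks above could also be expressed as a small Python helper, for example inside a test. This is a sketch under assumptions: `forbidden_imports` is a hypothetical name, and unlike grep it only matches statements at the start of a line (after optional indentation), so mentions inside comments or strings are ignored.

```python
import re

# Patterns mirroring the grep strings "from spiderJobs" and "import requests".
# As with grep, "import requests" also matches "import requests_go".
FORBIDDEN = (
    re.compile(r"^[ \t]*from[ \t]+spiderJobs", re.MULTILINE),
    re.compile(r"^[ \t]*import[ \t]+requests", re.MULTILINE),
)


def forbidden_imports(source: str) -> list:
    """Return the matched import fragments found in a module's source text."""
    hits = []
    for pattern in FORBIDDEN:
        hits.extend(match.group(0).strip() for match in pattern.finditer(source))
    return hits
```

An empty list for each sign module's source would correspond to the PASSED results listed above.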

Test count: 41 tests across 3 files (requirement: 24+ tests)

  • test_boss_sign.py: 13 tests (requirement: 8+)
  • test_qcwy_sign.py: 12 tests (requirement: 7+). Note: the plan specifies 5+, but the acceptance criteria say 7+.
  • test_zhilian_sign.py: 16 tests (requirement: 9+)

Import verification: All three sign modules import successfully from crawler_core with project root on sys.path.