- Create 01-02-SUMMARY.md with implementation details, test counts, deviation docs
- STATE.md: mark plan 02 complete, add decisions from plan 02, update position
- ROADMAP.md: mark phase 1 plans 2/2 complete, update progress table
- REQUIREMENTS.md: mark QUAL-01 complete (41 sign algorithm unit tests)
| phase | plan | subsystem | tags | dependency_graph | tech_stack | key_files | decisions | metrics |
|---|---|---|---|---|---|---|---|---|
| 01-shared-core | 02 | crawler_core | | | | | | |
Phase 01 Plan 02: Sign Algorithm Port + Unit Tests — Summary
One-liner: Ported Boss/51Job/Zhilian sign algorithms verbatim from spiderJobs/platforms/ to crawler_core/{boss,qcwy,zhilian}/sign.py and wrote 41 pure-function unit tests (no HTTP, no mocks) covering all exported classes and helpers.
What Was Created
| File | Lines | Purpose |
|---|---|---|
| `crawler_core/boss/sign.py` | 108 | BossSign traceid generator; pure stdlib (random, time), ported from spiderJobs |
| `crawler_core/qcwy/sign.py` | 89 | Job51Sign HMAC-SHA256 signing; pure stdlib (hmac, hashlib), ported from spiderJobs |
| `crawler_core/zhilian/sign.py` | 87 | ZhilianSign header/param signing; pure stdlib (math, random), ported from spiderJobs |
| `tests/crawler_core/test_boss_sign.py` | 85 | 13 tests: BossSign, _compute_checksum, _generate_uuid, _CHARS |
| `tests/crawler_core/test_qcwy_sign.py` | 73 | 12 tests: Job51Sign init, build_sign_path, generate_uuid |
| `tests/crawler_core/test_zhilian_sign.py` | 91 | 16 tests: ZhilianSign init, sign_headers, sign_params, generate_uuid |
| `conftest.py` | 4 | Project root sys.path setup for pytest import resolution |
Exported Interfaces
crawler_core/boss/sign.py:
from crawler_core.boss.sign import BossSign, _compute_checksum, _generate_uuid, _CHARS
BossSign.generate_traceid(prefix="M-W") -> str # "M-W" + 19-char uuid + 3-char checksum = 25 chars
BossSign(mpt="", wt2="") # token holders, no HTTP
_compute_checksum(uuid_str: str) -> str # 3-char base62 deterministic checksum
_generate_uuid() -> str # 13-char hex timestamp + 6-char base62 random = 19 chars
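The traceid layout described above (prefix, 13-char hex timestamp, 6-char base62 random part, 3-char checksum) can be checked structurally without knowing the checksum algorithm. The following is a minimal sketch under that assumption; `split_traceid` is a hypothetical helper, not part of `crawler_core`, and it accepts either hex case because the summary does not state which one the port emits.

```python
import re


def split_traceid(traceid: str, prefix: str = "M-W") -> dict:
    """Split a traceid into its documented parts, or raise ValueError.

    Hypothetical helper: it only validates the documented shape
    (prefix + 13 hex + 6 base62 + 3 base62); it does not recompute
    the checksum, whose algorithm is not described in this summary.
    """
    pattern = re.compile(
        re.escape(prefix)
        + r"(?P<timestamp_hex>[0-9a-fA-F]{13})"
        + r"(?P<random_part>[0-9A-Za-z]{6})"
        + r"(?P<checksum>[0-9A-Za-z]{3})"
    )
    match = pattern.fullmatch(traceid)
    if match is None:
        raise ValueError(f"not a well-formed traceid: {traceid!r}")
    return match.groupdict()
```

A shape check like this pairs naturally with a length assertion: with the default prefix, 3 + 13 + 6 + 3 gives the documented 25 characters.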
crawler_core/qcwy/sign.py:
from crawler_core.qcwy.sign import Job51Sign, SIGN_KEY
SIGN_KEY = "abfc8f9d..." # 64-char HMAC key
Job51Sign(sign_key=SIGN_KEY)
Job51Sign.generate_uuid() -> str # 13-char ms timestamp + 10-char random int = 23 digits
Job51Sign.build_sign_path(endpoint, method, params, body) -> tuple[str, str] # (url_path, sign_hex)
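To illustrate the two building blocks named above, here is a rough sketch of a 23-digit identifier and an HMAC-SHA256 hex signature using only the standard library. Both function names are hypothetical, and the summary does not say which string `build_sign_path` actually signs (path, query, body, or a combination) or how the random part is drawn, so treat those details as assumptions.

```python
import hashlib
import hmac
import random
import time


def sketch_generate_uuid() -> str:
    """13-digit millisecond timestamp followed by a 10-digit random integer."""
    timestamp_ms = str(int(time.time() * 1000))  # 13 digits for current dates
    random_part = str(random.randint(10**9, 10**10 - 1))  # always 10 digits
    return timestamp_ms + random_part


def sketch_sign(message: str, key: str) -> str:
    """HMAC-SHA256 of `message` under `key`, as a 64-char lowercase hex string.

    Assumption: the message and key are UTF-8 encoded before signing.
    """
    return hmac.new(
        key.encode("utf-8"), message.encode("utf-8"), hashlib.sha256
    ).hexdigest()
```

Because HMAC is deterministic for a fixed key and message, a signing helper like this can be unit-tested without HTTP or mocks, which matches the pure-function testing approach in this plan.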
crawler_core/zhilian/sign.py:
from crawler_core.zhilian.sign import ZhilianSign
ZhilianSign(at="", rt="", device_id=None, version="4.1.259", channel="wxxiaochengxu", platform="12")
ZhilianSign.generate_uuid() -> str # UUID4-format uppercase hex, 36 chars
ZhilianSign.sign_headers(page_code="0") -> dict # 9 headers including x-zp-device-id, x-zp-business-system="73"
ZhilianSign.sign_params() -> dict # 6 params: at, rt, channel, platform, version, d
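A UUID4-format uppercase string can be produced with just `math` and `random`, which are the modules the summary lists for this file. The template-based approach below is an assumption about how such a generator might look, not the ported implementation; `sketch_uuid4_upper` is a hypothetical name.

```python
import math
import random

HEX_UPPER = "0123456789ABCDEF"


def sketch_uuid4_upper() -> str:
    """Return a 36-char, uppercase, UUID4-shaped string.

    'x' positions get a random hex digit; the 'y' position gets one of
    8, 9, A, B (the RFC 4122 variant bits); '4' marks the version.
    """
    template = "xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx"
    out = []
    for ch in template:
        if ch == "x":
            out.append(HEX_UPPER[math.floor(random.random() * 16)])
        elif ch == "y":
            out.append("89AB"[math.floor(random.random() * 4)])
        else:
            out.append(ch)  # hyphens and the literal version digit
    return "".join(out)
```

Tests for a generator of this kind typically assert the length, the hyphen positions, and the case, since the random content itself cannot be pinned down.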
Test Counts
| File | Tests | Classes |
|---|---|---|
| `test_boss_sign.py` | 13 | TestBossSignGenerateTraceid (6), TestComputeChecksum (4), TestGenerateUuid (3) |
| `test_qcwy_sign.py` | 12 | TestJob51SignInit (2), TestJob51SignBuildSignPath (7), TestJob51SignGenerateUuid (3) |
| `test_zhilian_sign.py` | 16 | TestZhilianSignInit (4), TestZhilianSignHeaders (6), TestZhilianSignParams (3), TestZhilianGenerateUuid (3) |
| Total | 41 | 10 test classes |
Decisions Made
| Decision | Rationale |
|---|---|
| Port verbatim, change only docstring | Ensures behavioral parity with originals; no regression risk |
| No `tests/crawler_core/__init__.py` | Creating it would make `tests/crawler_core/` a Python package named `crawler_core`, shadowing the real package and causing `ModuleNotFoundError: No module named 'crawler_core.boss'` |
| `conftest.py` at project root | Pytest loads `conftest.py` before test collection; the `sys.path.insert` call there ensures `crawler_core` is findable |
| try/except ImportError in `crawler_core/__init__.py` | `requests_go` is not installed in the lightweight pyenv Python used for tests; wrapping the import allows the sign subpackages to be imported without the heavy HTTP client |
| `pyproject.toml` `pythonpath=["."]` | Belt-and-suspenders path setup for pytest in addition to `conftest.py` |
| Old spiderJobs files preserved | Plan explicitly states not to delete/modify until Phase 4 cleanup |
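The two import-related decisions above (path setup in `conftest.py` and the guarded import in `crawler_core/__init__.py`) can be sketched as follows. This is an illustration under assumptions, not the committed code: `ensure_on_sys_path` is a hypothetical helper, and setting `requests_go` to `None` on failure is one common fallback that the summary does not confirm.

```python
import sys
from pathlib import Path


def ensure_on_sys_path(root: Path) -> None:
    """Put `root` at the front of sys.path unless it is already listed."""
    root_str = str(root)
    if root_str not in sys.path:
        sys.path.insert(0, root_str)


# In a root-level conftest.py this would be called roughly as:
#     ensure_on_sys_path(Path(__file__).resolve().parent)

# Guarded import: keep the package importable when the optional
# HTTP client is missing from the test interpreter.
try:
    import requests_go  # may be absent in a lightweight environment
except ImportError:
    requests_go = None  # assumed fallback; callers must check before use
```

The guard matters because importing any `crawler_core.<platform>.sign` module first executes `crawler_core/__init__.py`; an unguarded failing import there would break every sign test even though the sign modules themselves need no HTTP client.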
Commits
| Hash | Task | Description |
|---|---|---|
| `bd1e50e` | Task 1 | Port sign algorithms to crawler_core/ platform directories |
| `333a6d1` | Task 2 | Write sign algorithm unit tests for crawler_core |
Deviations from Plan
Auto-fixed Issues
1. [Rule 2 - Missing Critical Functionality] Removed `tests/crawler_core/__init__.py`
- Found during: Task 2 verification
- Issue: Creating `tests/crawler_core/__init__.py` (as the plan specified) caused pytest to treat `tests/crawler_core/` as the `crawler_core` package, shadowing the actual `crawler_core/` package at project root. Result: `ModuleNotFoundError: No module named 'crawler_core.boss'`
- Fix: Removed `tests/crawler_core/__init__.py`; added `conftest.py` at project root; added `[tool.pytest.ini_options] pythonpath=["."]` to `pyproject.toml`
- Files modified: `conftest.py` (new), `pyproject.toml` (modified), `crawler_core/__init__.py` (try/except guard, which the 01-01 executor had already added)
- Commits: `333a6d1`
Known Stubs
None — all three sign algorithm files are fully implemented pure functions. The test files are complete with all specified test cases.
Self-Check: PASSED
All created files exist:
- `crawler_core/boss/sign.py`: FOUND
- `crawler_core/qcwy/sign.py`: FOUND
- `crawler_core/zhilian/sign.py`: FOUND
- `tests/crawler_core/test_boss_sign.py`: FOUND
- `tests/crawler_core/test_qcwy_sign.py`: FOUND
- `tests/crawler_core/test_zhilian_sign.py`: FOUND
- `conftest.py`: FOUND
All commits exist:
- `bd1e50e`: FOUND (feat(01-02): port sign algorithms)
- `333a6d1`: FOUND (test(01-02): write sign algorithm unit tests)
Cross-contamination checks:
- `grep -r "from spiderJobs" crawler_core/` → PASSED: no spiderJobs imports
- `grep -r "import requests" crawler_core/boss/sign.py crawler_core/qcwy/sign.py crawler_core/zhilian/sign.py` → PASSED: no HTTP imports
- `git diff --name-only spiderJobs/` → PASSED: spiderJobs files untouched
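The two grep-based checks could also be expressed as a small Python scan, which is convenient if they should run as part of the test suite. This is a sketch only; `find_forbidden_imports` is a hypothetical helper, and plain substring matching is assumed to be sufficient, as it is for the grep commands above.

```python
from pathlib import Path


def find_forbidden_imports(root: str, needles: list[str]) -> list[tuple[str, int, str]]:
    """Return (file, line number, line) for every line containing a needle.

    Scans all *.py files under `root` recursively, like `grep -r`.
    """
    hits = []
    for path in sorted(Path(root).rglob("*.py")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if any(needle in line for needle in needles):
                hits.append((str(path), lineno, line.strip()))
    return hits
```

An empty result for `find_forbidden_imports("crawler_core", ["from spiderJobs"])` would correspond to the first PASSED check; the `git diff` check has no equivalent here and still needs git.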
Test count: 41 tests across 3 files (requirement: 24+ tests)
- test_boss_sign.py: 13 tests (requirement: 8+)
- test_qcwy_sign.py: 12 tests (requirement: 7+). Note: the plan text specifies 5+, but the acceptance criteria say 7+
- test_zhilian_sign.py: 16 tests (requirement: 9+)
Import verification: All three sign modules import successfully from crawler_core with project root on sys.path.