22 Commits

Author SHA1 Message Date
win
a177516c23 docs(phase-6): complete — QUAL-02 QUAL-06 QUAL-07 done, 146/146 tests 2026-03-21 22:56:38 +08:00
win
c58c7ee5c2 docs(phase-6): add research and 2 plans for quality and frontend 2026-03-21 21:27:11 +08:00
win
6f9d4df3e2 docs(06): UI design contract 2026-03-21 21:25:01 +08:00
win
9756df6044 docs(phase-5): complete execution — 2/2 plans done
Summary files and roadmap/STATE updated.
Phase 5 complete: DATA-01 (30-day dedup window) + DATA-04 (company channel)
DATA-02 + DATA-03 confirmed already complete.
2026-03-21 19:50:41 +08:00
win
9ef31cc87e docs(phase-5): add research and 2 plans for data pipeline optimization 2026-03-21 19:44:56 +08:00
win
6a2f0bfb58 docs(phase-4): complete execution — 2/2 plans, architecture corrected
- Plan 01: facade uses private _boss/job51/zhilian_api/client files
  Private files now depend on crawler_core directly (not spiderJobs)
  Added asyncio.to_thread async_* methods for ARCH-06
- Plan 02: 11 private files marked DEPRECATED, jobs_spider/ deleted
- Architecture: facade→private→crawler_core; spiderJobs→crawler_core (independent)
- Full regression: 106 passed
2026-03-21 19:40:03 +08:00
win
2e11edcef8 docs(phase-4): add research and 2 plans for backend facade migration 2026-03-21 19:26:01 +08:00
win
00a727519f docs(phase-3): complete execution — 2/2 plans, 98 tests passing
- ARCH-04: job51 migrated to crawler_core (no old deps)
- ARCH-05: zhilian migrated to crawler_core (no old deps)
- 34 new mock tests (17 job51 + 17 zhilian)
- Added _parse_zhilian_response custom parser for zhilian API format
- Fixed POST Searcher _request() overrides for job51/zhilian
- Full regression: 98 passed in 0.12s
2026-03-21 19:19:17 +08:00
win
024c2bcd49 docs(phase-3): add research and 2 plans for job51+zhilian migration 2026-03-21 19:10:59 +08:00
win
f6913ffdde docs(phase-2): complete phase execution — 2/2 plans done, verification passed
- ARCH-03: Boss crawler migrated to crawler_core (no inline signatures or HTTP boilerplate)
- QUAL-03: 22 mock tests pass covering all Boss API classes
- Anti-crawl mechanisms preserved (TLS fingerprint, proxy rotation, 10s delay)
- Phase 1 regression: 41 tests still passing
2026-03-21 19:04:55 +08:00
win
919ed9f799 docs(02-01): add plan execution summary 2026-03-21 19:00:58 +08:00
win
b20f77fa19 docs(phase-2): add research and 2 plans for Boss crawler migration 2026-03-21 18:48:58 +08:00
win
76085ac403 docs(01-02): complete sign algorithms plan — SUMMARY, STATE, ROADMAP updates
- Create 01-02-SUMMARY.md with implementation details, test counts, deviation docs
- STATE.md: mark plan 02 complete, add decisions from plan 02, update position
- ROADMAP.md: mark phase 1 plans 2/2 complete, update progress table
- REQUIREMENTS.md: mark QUAL-01 complete (41 sign algorithm unit tests)
2026-03-21 18:27:35 +08:00
win
d7c8bec287 docs(01-01): complete crawler_core package plan — SUMMARY, STATE, ROADMAP updates
- Create 01-01-SUMMARY.md with implementation details and interface contracts
- STATE.md: advance to plan 2, record metrics, add decisions from plan 01
- ROADMAP.md: update phase 1 plan progress (1/2 plans complete)
- REQUIREMENTS.md: mark ARCH-01, ARCH-02, QUAL-04, QUAL-05 complete
- crawler_core/__init__.py: preserve linter-added try/except ImportError guard
2026-03-21 18:14:19 +08:00
win
fe9a6d1403 docs(phase-1): create plans (2 plans, 2 waves) with checker revision 2026-03-21 17:53:13 +08:00
win
b27686a409 docs(01-shared-core): create phase 1 plans for crawler_core shared package
Plan 01-01 (Wave 1): Package scaffold with HTTPClient + tenacity retry (min=10s)
+ stdlib logging + BaseFetcher/BaseSearcher base classes + pyproject.toml.
Covers ARCH-01, ARCH-02, QUAL-04, QUAL-05.

Plan 01-02 (Wave 2): Sign algorithm migration (Boss/Job51/Zhilian) to
crawler_core/ + comprehensive unit tests — no HTTP, no mocks, pure functions.
Covers QUAL-01. 24+ test cases across 3 test files.

ROADMAP updated: Phase 1 now shows 2 concrete plans instead of TBD.
2026-03-21 17:45:14 +08:00
win
81b9305568 docs: gather Phase 1 context (shared core package) 2026-03-21 17:08:26 +08:00
win
44b5f390aa docs: create roadmap (6 phases) 2026-03-21 17:00:12 +08:00
win
5e9102148a docs: define v1 requirements 2026-03-21 16:39:05 +08:00
win
f3005ef525 docs: add research findings (stack, features, architecture, pitfalls, summary) 2026-03-21 16:36:37 +08:00
win
030da1ce53 chore: add project config 2026-03-21 16:19:53 +08:00
win
9166c4f7bc docs: initialize project 2026-03-21 16:17:12 +08:00
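
Note on 6a2f0bfb58: that commit message says async_* methods were added via asyncio.to_thread for ARCH-06. The log does not show the code itself, so the following is only a minimal sketch of that general pattern. The class name JobFacade, the method names search and async_search, and the return shape are all hypothetical and are not taken from the repository.

```python
import asyncio
import time


class JobFacade:
    """Hypothetical facade; the real class and method names are not shown in the log."""

    def search(self, keyword: str) -> list[str]:
        # Stand-in for a blocking crawler call (for example, a synchronous HTTP request).
        time.sleep(0.05)
        return [f"{keyword}-result"]

    async def async_search(self, keyword: str) -> list[str]:
        # Run the blocking method in a worker thread so the event loop is not blocked.
        return await asyncio.to_thread(self.search, keyword)


async def main() -> list[list[str]]:
    facade = JobFacade()
    # The two blocking calls run in separate worker threads; gather keeps argument order.
    return await asyncio.gather(
        facade.async_search("python"),
        facade.async_search("golang"),
    )


if __name__ == "__main__":
    print(asyncio.run(main()))
```

asyncio.to_thread requires Python 3.9 or later. Wrapping the existing synchronous method this way leaves the blocking code path untouched, which would fit a migration where the synchronous API has to keep working.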