4 Commits

Author SHA1 Message Date
win
04d6303da2 feat(01-01): create crawler_core/base.py with Result[T] and crawler_core/__init__.py
- Define generic Result[T] dataclass (7 fields: success, status_code, data, list, count, is_end_page, error)
- Port parse_response() from spiderJobs/core/base.py returning Result[Any]
- BaseFetcher: 4 template methods (_build_params, _parse required; _build_headers, _check_blocked optional)
- BaseSearcher: 4 template methods with load_all() paginator using stdlib logging
- crawler_core/__init__.py exports BaseFetcher, BaseSearcher, Result, HTTPClient, parse_response
- No ApiResult, no loguru, no spiderJobs/app imports
2026-03-21 18:10:40 +08:00
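
For readers unfamiliar with the pattern described in this commit, the following is a minimal sketch of a generic Result[T] dataclass and a template-method searcher with a load_all() paginator. Only the seven field names and the four hook names come from the commit message. The field types and defaults, the `search()` method, the `transport` callable (a stand-in for HTTPClient), the `max_pages` parameter, and `DemoSearcher` are all assumptions made for illustration and may differ from the real crawler_core/base.py.

```python
import logging
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Generic, List, Optional, TypeVar

T = TypeVar("T")
logger = logging.getLogger("crawler_core.example")  # stdlib logging, no loguru


@dataclass
class Result(Generic[T]):
    # Field names are from the commit message; types and defaults are assumed.
    success: bool = False
    status_code: int = 0
    data: Optional[T] = None
    list: List[T] = field(default_factory=list)
    count: int = 0
    is_end_page: bool = False
    error: Optional[str] = None


# Hypothetical transport signature: (params, headers) -> decoded response body.
Transport = Callable[[Dict[str, Any], Dict[str, str]], Dict[str, Any]]


class BaseSearcher(ABC):
    def __init__(self, transport: Transport) -> None:
        self._transport = transport

    @abstractmethod
    def _build_params(self, page: int) -> Dict[str, Any]:
        """Required hook: request parameters for one page."""

    @abstractmethod
    def _parse(self, raw: Dict[str, Any]) -> Result[Any]:
        """Required hook: turn a raw response into a Result."""

    def _build_headers(self) -> Dict[str, str]:
        """Optional hook: extra request headers."""
        return {}

    def _check_blocked(self, raw: Dict[str, Any]) -> bool:
        """Optional hook: detect anti-bot or rate-limit responses."""
        return False

    def search(self, page: int) -> Result[Any]:
        raw = self._transport(self._build_params(page), self._build_headers())
        if self._check_blocked(raw):
            return Result(error="blocked")
        return self._parse(raw)

    def load_all(self, max_pages: int = 50) -> List[Any]:
        """Collect items page by page until is_end_page or a failure."""
        items: List[Any] = []
        for page in range(1, max_pages + 1):
            result = self.search(page)
            if not result.success:
                logger.warning("page %d failed: %s", page, result.error)
                break
            items.extend(result.list)
            if result.is_end_page:
                break
        return items


# Usage with an in-memory fake instead of a real HTTP client.
PAGES = {1: ["a", "b"], 2: ["c"]}


def fake_transport(params: Dict[str, Any], headers: Dict[str, str]) -> Dict[str, Any]:
    page = params["page"]
    return {"items": PAGES.get(page, []), "last": page >= max(PAGES)}


class DemoSearcher(BaseSearcher):
    def _build_params(self, page: int) -> Dict[str, Any]:
        return {"page": page}

    def _parse(self, raw: Dict[str, Any]) -> Result[Any]:
        found = raw["items"]
        return Result(success=True, status_code=200, list=found,
                      count=len(found), is_end_page=raw["last"])


all_items = DemoSearcher(fake_transport).load_all()
```

Keeping the paging loop in the base class means a platform subclass only supplies the two required hooks, which is the usual motivation for a template-method layout.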
win
ceb359d535 feat(01-01): create crawler_core/http_client.py with tenacity retry and stdlib logging
- Port HTTPClient from spiderJobs/core/http_client.py
- Add tenacity @retry decorator on post() and get() (3 attempts, min=10s wait)
- Use stdlib logging.getLogger('crawler_core.http_client') — no loguru
- No imports from spiderJobs.* or app.*
- TLS fingerprint and proxy logic preserved unchanged
2026-03-21 18:08:59 +08:00
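
The retry behaviour described in this commit (3 attempts, a wait of at least 10s between them) can be approximated with the stdlib-only sketch below. This is not tenacity and not the code in crawler_core/http_client.py: the real module uses tenacity's `@retry` decorator, and the commit does not say which wait strategy it configures. The `retry` helper, its `sleep` parameter, and `flaky_get` are hypothetical names introduced for illustration. The sleep function is injectable so the example runs without actually waiting.

```python
import functools
import logging
import time

logger = logging.getLogger("crawler_core.http_client")


def retry(attempts=3, min_wait=10.0, sleep=time.sleep):
    """Stdlib stand-in for a retry decorator: fixed wait, re-raise on last failure."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, attempts + 1):
                try:
                    return func(*args, **kwargs)
                except Exception as exc:
                    if attempt == attempts:
                        raise  # out of attempts: surface the original error
                    logger.warning("attempt %d/%d failed: %s; waiting %.0fs",
                                   attempt, attempts, exc, min_wait)
                    sleep(min_wait)
        return wrapper
    return decorator


# Usage: record the waits instead of sleeping, and fail the first two calls.
waits = []
calls = {"n": 0}


@retry(attempts=3, min_wait=10.0, sleep=waits.append)
def flaky_get(url):
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("simulated network failure")
    return "ok:" + url


response = flaky_get("https://example.invalid/jobs")
```

With tenacity itself, the equivalent configuration would presumably combine `stop_after_attempt(3)` with one of its wait strategies; which one the author chose is not visible from the commit message.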
win
bd1e50e410 feat(01-02): port sign algorithms to crawler_core/ platform directories
- Add crawler_core/boss/sign.py: BossSign traceid generator (pure stdlib)
- Add crawler_core/qcwy/sign.py: Job51Sign HMAC-SHA256 signing (pure stdlib)
- Add crawler_core/zhilian/sign.py: ZhilianSign header/param signing (pure stdlib)
- Add __init__.py for all three crawler_core platform directories
- Update module docstrings to reference crawler_core; all logic unchanged
- No imports from spiderJobs or app; no HTTP dependencies
2026-03-21 18:08:53 +08:00
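
As background for the "HMAC-SHA256 signing (pure stdlib)" item in this commit, here is a generic sketch of signing request parameters with Python's `hmac` and `hashlib` modules. It does not reproduce Job51Sign: the canonicalization (sorted, URL-encoded key/value pairs), the `sign_params` function, the parameter names, and the demo secret are all assumptions, since the commit message does not describe the real algorithm or key.

```python
import hashlib
import hmac
from urllib.parse import urlencode


def sign_params(params, secret):
    """Return a hex HMAC-SHA256 over the parameters in a stable order."""
    # Sorting makes the signature independent of dict insertion order.
    canonical = urlencode(sorted(params.items()))
    return hmac.new(secret, canonical.encode("utf-8"), hashlib.sha256).hexdigest()


def verify(params, secret, signature):
    """Constant-time comparison to avoid leaking timing information."""
    return hmac.compare_digest(sign_params(params, secret), signature)


# Usage with a placeholder key (not a real platform secret).
DEMO_SECRET = b"not-a-real-key"
demo_params = {"keyword": "python", "pageNum": 1}
signature = sign_params(demo_params, DEMO_SECRET)
is_valid = verify(demo_params, DEMO_SECRET, signature)
```

Because this relies only on the standard library, it matches the commit's stated constraint that the sign modules carry no HTTP dependencies.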
win
4932177f7c feat(01-01): create crawler_core package scaffold and pyproject.toml
- Create crawler_core/pyproject.toml with setuptools build config
- Add platform namespace __init__.py files for boss, qcwy, zhilian
- Add requests_go==1.0.9 and tenacity>=8.0 to Pipfile [packages]
- Add pytest, pytest-cov, pytest-anyio to Pipfile [dev-packages]
2026-03-21 18:07:54 +08:00