---
phase: 3
plan: 2
wave: 1
title: "Migrate the Zhilian Zhaopin (zhilian) layer to crawler_core + mock tests"
depends_on: []
files_modified:
  - spiderJobs/platforms/zhilian/client.py
  - spiderJobs/platforms/zhilian/api.py
  - spiderJobs/platforms/zhilian/main.py
  - spiderJobs/platforms/zhilian/sign.py
  - tests/zhilian/__init__.py
  - tests/zhilian/test_zhilian_client.py
autonomous: true
requirements:
  - ARCH-05
---

# Phase 3 Plan 02: Migrate the Zhilian Zhaopin (zhilian) layer to crawler_core + mock tests

## Objective

Switch `spiderJobs/platforms/zhilian/` from depending on `spiderJobs.core` (the old base classes) to depending on `crawler_core` (the new base classes), and add mock tests in `tests/zhilian/test_zhilian_client.py` to satisfy ARCH-05.

**Key differences from job51:**

- zhilian `api.py` uses the default `parse_response` (there is no custom `_parse_response` function), so there is no `ApiResult` to replace.
- zhilian `client.py` must preserve the `sign_headers()` and `sign_params()` interface of `ZhilianSign`.
- `SearchCompanyPositions._build_params()` reaches the signer through `self._client.signer.sign_params()`; the migration does not affect this.

## Must Haves

- [ ] `client.py` inherits from `crawler_core.http_client.HTTPClient` and uses `crawler_core.zhilian.sign.ZhilianSign`
- [ ] `api.py` uses `crawler_core.base.BaseFetcher/BaseSearcher`
- [ ] The single `self._http.get(` call in `api.py` (line 200) is replaced with `self.http_client.get(`
- [ ] The import in `main.py` is updated to `crawler_core.base`
- [ ] `sign.py` becomes a backward-compatibility stub (re-exporting `crawler_core.zhilian.sign.ZhilianSign`)
- [ ] `grep -rn "spiderJobs.core" spiderJobs/platforms/zhilian/{client,api,main,sign}.py` prints nothing
- [ ] `tests/zhilian/__init__.py` exists
- [ ] `tests/zhilian/test_zhilian_client.py` exists
- [ ] `pytest tests/zhilian/ -v` passes in full (>= 12 tests)

---

## Wave 1

### Task 2.1: Update client.py

**Context files:**

- `spiderJobs/platforms/zhilian/client.py` (current contents)
- `crawler_core/http_client.py` (target base class)
- `crawler_core/zhilian/sign.py` (new source of `ZhilianSign`)

Modify `spiderJobs/platforms/zhilian/client.py`:

1. Change line 10 to:
   ```python
   from crawler_core.http_client import HTTPClient
   ```
2. Change line 11 to:
   ```python
   from crawler_core.zhilian.sign import ZhilianSign
   ```

These two changes remove `from spiderJobs.core.http_client import HTTPClient` and `from spiderJobs.platforms.zhilian.sign import ZhilianSign`.

**Note:** `ZhilianClient.get/post` override the parent methods and call `self.signer.sign_headers(page_code)`. This is part of the `ZhilianSign` interface and is unaffected by the migration, because the method signatures are identical.

Everything else stays unchanged.

**Acceptance checks:**

- `grep "from crawler_core.http_client import HTTPClient" spiderJobs/platforms/zhilian/client.py` prints a match
- `grep "from crawler_core.zhilian.sign import ZhilianSign" spiderJobs/platforms/zhilian/client.py` prints a match
- `grep "spiderJobs.core" spiderJobs/platforms/zhilian/client.py` prints nothing
- `python -c "from spiderJobs.platforms.zhilian.client import ZhilianClient"` raises no ImportError

---

### Task 2.2: Update api.py

**Context files:**

- `spiderJobs/platforms/zhilian/api.py` (current full contents, 229 lines)
- `crawler_core/base.py` (the `BaseFetcher` and `BaseSearcher` interfaces)

Modify `spiderJobs/platforms/zhilian/api.py`:

1. Change line 10 to:
   ```python
   from crawler_core.base import BaseFetcher, BaseSearcher
   ```
   This removes `from spiderJobs.core.base import BaseFetcher, BaseSearcher`.
2. Change line 200 (in `SearchCompanyPositions._request()`) to:
   ```python
   return self.http_client.get(self.ENDPOINT, params)
   ```
   The original line is `return self._http.get(self.ENDPOINT, params)`.

**Note:** zhilian `api.py` has no `ApiResult` (it uses the default parser), so no `ApiResult` replacement is needed. The `self._client.signer.sign_params()` call in `SearchCompanyPositions._build_params()` does not need to change.

**Acceptance checks:**

- `grep "from crawler_core.base import BaseFetcher, BaseSearcher" spiderJobs/platforms/zhilian/api.py` prints a match
- `grep "spiderJobs.core" spiderJobs/platforms/zhilian/api.py` prints nothing
- `grep "self\._http" spiderJobs/platforms/zhilian/api.py` prints nothing
- `python -c "from spiderJobs.platforms.zhilian.api import SearchPositions, GetPositionDetail, SearchCompanyPositions"` raises no ImportError

---

### Task 2.3: Update main.py

**Context files:**

- `spiderJobs/platforms/zhilian/main.py` (current contents, 113 lines)

Modify `spiderJobs/platforms/zhilian/main.py`:

1. Change line 32 to:
   ```python
   from crawler_core.base import BaseFetcher, BaseSearcher
   ```
   This removes `from spiderJobs.core.base import BaseFetcher, BaseSearcher`.

Everything else stays unchanged. `main.py` has no sign import; signing is injected automatically through `ZhilianClient`.

**Acceptance checks:**

- `grep "from crawler_core.base import BaseFetcher, BaseSearcher" spiderJobs/platforms/zhilian/main.py` prints a match
- `grep "spiderJobs.core" spiderJobs/platforms/zhilian/main.py` prints nothing

---

### Task 2.4: Turn sign.py into a backward-compatibility stub

**Context files:**

- `spiderJobs/platforms/zhilian/sign.py` (current contents, an 87-line standalone implementation)
- `crawler_core/zhilian/sign.py` (authoritative implementation)

Replace the entire contents of `spiderJobs/platforms/zhilian/sign.py` with:

```python
"""
Backward-compatibility stub: Zhilian Zhaopin signing.

The implementation has moved to crawler_core.zhilian.sign.
It is re-exported here directly from crawler_core so that downstream code
does not hit an ImportError.
"""
from crawler_core.zhilian.sign import ZhilianSign  # noqa: F401

__all__ = ["ZhilianSign"]
```

**Acceptance checks:**

- `grep "from crawler_core.zhilian.sign import ZhilianSign" spiderJobs/platforms/zhilian/sign.py` prints a match
- `python -c "from spiderJobs.platforms.zhilian.sign import ZhilianSign; print(ZhilianSign().generate_uuid())"` prints a UUID

---

### Task 2.5: Create tests/zhilian/__init__.py

Create `tests/zhilian/__init__.py` with the content `# tests/zhilian/`.

---

### Task 2.6: Write tests/zhilian/test_zhilian_client.py

**Context files:**

- `spiderJobs/platforms/zhilian/api.py` (post-migration version)
- `spiderJobs/platforms/zhilian/client.py` (post-migration version)
- `crawler_core/zhilian/sign.py` (the `ZhilianSign` interface)
- `tests/boss/test_boss_client.py` (style reference)

Create `tests/zhilian/test_zhilian_client.py` with the following tests:

```python
"""
Mock tests for the Zhilian Zhaopin HTTP layer (QUAL-03 / ARCH-05).

A MagicMock stands in for the real HTTP client, so no network access is needed.
"""
from __future__ import annotations

from unittest.mock import MagicMock

from crawler_core.zhilian.sign import ZhilianSign
from spiderJobs.platforms.zhilian.api import (
    SearchPositions,
    GetPositionDetail,
    GetCompanyExtDetail,
    GetCompanyDetail,
    SearchCompanyPositions,
)
from spiderJobs.platforms.zhilian.client import ZhilianClient
from crawler_core.base import Result


# ── 1. SearchPositions (POST cgate) ─────────────────────

class TestSearchPositions:
    def _make_client(self, status_code=200, data=None):
        # The mocked client returns a (status_code, payload) tuple from post().
        mock_client = MagicMock()
        mock_client.post.return_value = (status_code, data or {})
        return mock_client

    def test_search_success_returns_list(self):
        data = {
            "data": {"list": [{"title": "Python 工程师"}], "numFound": 1},
            "pageInfo": {"pageNum": 1, "pageSize": 15, "totalNum": 1}
        }
        searcher = SearchPositions(keyword="Python", city_code=538,
                                   client=self._make_client(200, data))
        result = searcher.search(page_index=1)
        assert result.success is True

    def test_search_http_error(self):
        searcher = SearchPositions(client=self._make_client(403, {}))
        result = searcher.search(page_index=1)
        assert result.success is False
        assert result.status_code == 403


# ── 2. GetPositionDetail (GET cgate) ────────────────────

class TestGetPositionDetail:
    def test_fetch_success(self):
        mock_client = MagicMock()
        mock_client.get.return_value = (200, {"data": {"jobName": "高级工程师"}})
        fetcher = GetPositionDetail(number="CC123456", client=mock_client)
        result = fetcher.fetch()
        assert result.success is True

    def test_fetch_404(self):
        mock_client = MagicMock()
        mock_client.get.return_value = (404, {})
        fetcher = GetPositionDetail(number="notexist", client=mock_client)
        result = fetcher.fetch()
        assert result.success is False
        assert result.status_code == 404


# ── 3. GetCompanyExtDetail (GET cgate) ──────────────────

class TestGetCompanyExtDetail:
    def test_fetch_success(self):
        mock_client = MagicMock()
        mock_client.get.return_value = (200, {"data": {"companyName": "测试公司"}})
        fetcher = GetCompanyExtDetail(
            company_name="测试公司", company_number="CZ123", client=mock_client)
        result = fetcher.fetch()
        assert result.success is True


# ── 4. GetCompanyDetail (GET cgate) ─────────────────────

class TestGetCompanyDetail:
    def test_fetch_success(self):
        mock_client = MagicMock()
        mock_client.get.return_value = (200, {"data": {"companyNumber": "CZ123"}})
        fetcher = GetCompanyDetail(number="CZ123", client=mock_client)
        result = fetcher.fetch()
        assert result.success is True

    def test_fetch_http_error(self):
        mock_client = MagicMock()
        mock_client.get.return_value = (500, {})
        fetcher = GetCompanyDetail(number="CZ123", client=mock_client)
        result = fetcher.fetch()
        assert result.success is False


# ── 5. SearchCompanyPositions (GET capi) ────────────────

class TestSearchCompanyPositions:
    def test_search_success(self):
        # spec=ZhilianSign restricts the mock to the real signer's attributes.
        mock_signer = MagicMock(spec=ZhilianSign)
        mock_signer.sign_params.return_value = {"at": "", "rt": ""}
        mock_client = MagicMock()
        mock_client.signer = mock_signer
        mock_client.get.return_value = (200, {"data": {"list": [{"jobName": "测试岗位"}]}, "pageInfo": {}})
        searcher = SearchCompanyPositions(company_id="CZ123", client=mock_client)
        result = searcher.search(page_index=1)
        assert result.success is True
        assert mock_signer.sign_params.called

    def test_search_http_error(self):
        mock_signer = MagicMock(spec=ZhilianSign)
        mock_signer.sign_params.return_value = {}
        mock_client = MagicMock()
        mock_client.signer = mock_signer
        mock_client.get.return_value = (403, {})
        searcher = SearchCompanyPositions(company_id="CZ123", client=mock_client)
        result = searcher.search(page_index=1)
        assert result.success is False


# ── 6. ZhilianClient: signature header injection ────────

class TestZhilianClientHeaders:
    def test_sign_headers_injects_at_rt(self):
        signer = ZhilianSign(at="mock_at", rt="mock_rt")
        client = ZhilianClient(signer=signer)
        headers = client.signer.sign_headers()
        assert headers["x-zp-at"] == "mock_at"
        assert headers["x-zp-rt"] == "mock_rt"

    def test_sign_headers_has_required_keys(self):
        client = ZhilianClient()
        headers = client.signer.sign_headers()
        for key in ["x-zp-at", "x-zp-rt", "x-zp-action-id", "x-zp-device-id"]:
            assert key in headers

    def test_default_signer_empty_tokens(self):
        client = ZhilianClient()
        headers = client.signer.sign_headers()
        assert headers["x-zp-at"] == ""
        assert headers["x-zp-rt"] == ""
```

**Acceptance checks:**

- `test -f tests/zhilian/test_zhilian_client.py && echo "OK"` prints OK
- `pipenv run python -m pytest tests/zhilian/ -v` passes in full (>= 12 tests)

---

## Verification

```bash
# 1. Verify that every zhilian module imports correctly
pipenv run python -c "
from spiderJobs.platforms.zhilian.client import ZhilianClient, create_cgate_client, create_capi_client
from spiderJobs.platforms.zhilian.api import SearchPositions, GetPositionDetail, GetCompanyDetail, SearchCompanyPositions
from spiderJobs.platforms.zhilian.main import main, create_searcher
from crawler_core.base import BaseFetcher, BaseSearcher
assert issubclass(SearchPositions, BaseSearcher)
assert issubclass(GetPositionDetail, BaseFetcher)
print('✅ All zhilian modules import successfully and the inheritance is correct')
"

# 2. Confirm that no legacy dependency remains
grep -rn "spiderJobs.core" spiderJobs/platforms/zhilian/client.py spiderJobs/platforms/zhilian/api.py spiderJobs/platforms/zhilian/main.py spiderJobs/platforms/zhilian/sign.py && echo "❌ Legacy dependencies remain" || echo "✅ No legacy dependencies"

# 3. Run the mock tests
pipenv run python -m pytest tests/zhilian/ -v

# 4. Full regression across all three platforms
pipenv run python -m pytest tests/ -v --tb=short
```