| phase | plan | wave | title | depends_on | files_modified | autonomous | requirements |
|---|---|---|---|---|---|---|---|
| 3 | 2 | 1 | Migrate the Zhilian Zhaopin (zhilian) layer to crawler_core + mock tests | | | true | |
Phase 3 Plan 02: Migrate the Zhilian Zhaopin (zhilian) layer to crawler_core + mock tests
Objective
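Switch `spiderJobs/platforms/zhilian/` from depending on `spiderJobs.core` (the old base classes) to depending on `crawler_core` (the new base classes), and add mock tests in `tests/zhilian/test_zhilian_client.py` to satisfy ARCH-05.
Key differences from job51:
- zhilian `api.py` uses the default `parse_response` (there is no custom `_parse_response` function), so there is no `ApiResult` to replace.
- zhilian `client.py` must preserve the `sign_headers()` and `sign_params()` interfaces of `ZhilianSign`.
- `SearchCompanyPositions._build_params()` reaches the signer through `self._client.signer.sign_params()`; this is unaffected by the migration.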
Must Haves
- `client.py` inherits from `crawler_core.http_client.HTTPClient` and uses `crawler_core.zhilian.sign.ZhilianSign`
- `api.py` uses `crawler_core.base.BaseFetcher` / `BaseSearcher`
- The single occurrence of `self._http.get(` in `api.py` is replaced with `self.http_client.get(` (line 200)
- The import in `main.py` is updated to `crawler_core.base`
- `sign.py` becomes a backward-compatible stub (re-exporting `crawler_core.zhilian.sign.ZhilianSign`)
- `grep -rn "spiderJobs.core" spiderJobs/platforms/zhilian/{client,api,main,sign}.py` produces no output
- `tests/zhilian/__init__.py` exists
- `tests/zhilian/test_zhilian_client.py` exists
- `pytest tests/zhilian/ -v` passes in full (>= 12 tests)
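The grep-based must-have could also be expressed as a small Python check, which may be convenient if the same rule is later enforced from a test. This is only a sketch: `find_legacy_imports` is a hypothetical helper that does not exist in the project, and the sample strings are illustrative. Unlike the plain grep pattern, the regular expression here escapes the dot and requires a word boundary, so it is slightly stricter.

```python
from __future__ import annotations

import re

# Matches the module path "spiderJobs.core" but not e.g. "spiderJobs.core_utils"
LEGACY_PATTERN = re.compile(r"\bspiderJobs\.core\b")


def find_legacy_imports(source: str) -> list[tuple[int, str]]:
    """Return (line_number, stripped_line) pairs that still mention spiderJobs.core.

    Hypothetical helper for illustration; not part of the project.
    """
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        if LEGACY_PATTERN.search(line):
            hits.append((lineno, line.strip()))
    return hits


# Illustrative inputs: one line before the migration, one after
before = "from spiderJobs.core.http_client import HTTPClient\n"
after = "from crawler_core.http_client import HTTPClient\n"

print(find_legacy_imports(before))
print(find_legacy_imports(after))
```

An empty list corresponds to the "no output" expectation of the grep command.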
Wave 1
Task 2.1: Update client.py
<read_first>
- `spiderJobs/platforms/zhilian/client.py` (current contents)
- `crawler_core/http_client.py` (target base class)
- `crawler_core/zhilian/sign.py` (new source of ZhilianSign)
</read_first>
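- Change line 10 to: `from crawler_core.http_client import HTTPClient`
- Change line 11 to: `from crawler_core.zhilian.sign import ZhilianSign`

(This removes `from spiderJobs.core.http_client import HTTPClient` and `from spiderJobs.platforms.zhilian.sign import ZhilianSign`.)
Note: `ZhilianClient.get` and `ZhilianClient.post` override the parent class and call `self.signer.sign_headers(page_code)`. That is the `ZhilianSign` interface, and it is unaffected by the migration because the method signatures are identical.
Everything else stays unchanged.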
<acceptance_criteria>
- `grep "from crawler_core.http_client import HTTPClient" spiderJobs/platforms/zhilian/client.py` produces output
- `grep "from crawler_core.zhilian.sign import ZhilianSign" spiderJobs/platforms/zhilian/client.py` produces output
- `grep "spiderJobs.core" spiderJobs/platforms/zhilian/client.py` produces no output
- `python -c "from spiderJobs.platforms.zhilian.client import ZhilianClient"` raises no ImportError
</acceptance_criteria>
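To illustrate why swapping the imports leaves the overridden `get` intact, the following sketch models the override pattern described in the note for this task. Both `HTTPClient` and `ZhilianSign` here are stand-ins written for this example; their real signatures in `crawler_core` are assumptions, as are the `page_code` keyword on `get` and the endpoint path.

```python
class HTTPClient:
    """Stand-in for crawler_core.http_client.HTTPClient (interface assumed)."""

    def get(self, endpoint, params=None, headers=None):
        # Echo the request instead of touching the network
        return 200, {"endpoint": endpoint, "params": params or {}, "headers": headers or {}}


class ZhilianSign:
    """Stand-in for crawler_core.zhilian.sign.ZhilianSign (interface assumed)."""

    def __init__(self, at="", rt=""):
        self.at = at
        self.rt = rt

    def sign_headers(self, page_code=""):
        # Only the two token headers are modelled in this sketch
        return {"x-zp-at": self.at, "x-zp-rt": self.rt}


class ZhilianClient(HTTPClient):
    """Sketch of the subclass: only the base-class import changes in the migration."""

    def __init__(self, signer=None):
        self.signer = signer or ZhilianSign()

    def get(self, endpoint, params=None, page_code=""):
        # Inject signed headers, then defer to the base class
        headers = self.signer.sign_headers(page_code)
        return super().get(endpoint, params=params, headers=headers)


client = ZhilianClient(signer=ZhilianSign(at="mock_at", rt="mock_rt"))
status, payload = client.get("/demo/endpoint", {"q": "python"})  # hypothetical endpoint
print(status, payload["headers"])
```

Because the subclass only relies on `sign_headers` and on the parent's `get`, any base class and signer that keep those two signatures should be interchangeable.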
Task 2.2: Update api.py
<read_first>
- `spiderJobs/platforms/zhilian/api.py` (current full contents, 229 lines)
- `crawler_core/base.py` (BaseFetcher and BaseSearcher interfaces)
</read_first>
- Change line 10 to: `from crawler_core.base import BaseFetcher, BaseSearcher` (removing `from spiderJobs.core.base import BaseFetcher, BaseSearcher`).
- Change line 200 (`SearchCompanyPositions._request()`) to: `return self.http_client.get(self.ENDPOINT, params)` (previously `return self._http.get(self.ENDPOINT, params)`).

Note: zhilian `api.py` has no `ApiResult` (it uses the default parser), so no `ApiResult` replacement is needed.
The `self._client.signer.sign_params()` call inside `SearchCompanyPositions._build_params()` does not need to change.
<acceptance_criteria>
- `grep "from crawler_core.base import BaseFetcher, BaseSearcher" spiderJobs/platforms/zhilian/api.py` produces output
- `grep "spiderJobs.core" spiderJobs/platforms/zhilian/api.py` produces no output
- `grep "self\._http" spiderJobs/platforms/zhilian/api.py` produces no output
- `python -c "from spiderJobs.platforms.zhilian.api import SearchPositions, GetPositionDetail, SearchCompanyPositions"` raises no ImportError
</acceptance_criteria>
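The attribute rename in `_request()` can be pictured with the sketch below. `Result` and `BaseSearcher` are stand-ins invented for this example: the real classes in `crawler_core.base` may differ, and the rule "success means HTTP 200", the `page` parameter, and the endpoint path are all assumptions. Only the `(status_code, data)` tuple convention and the `self.http_client.get(self.ENDPOINT, params)` call come from this plan.

```python
from dataclasses import dataclass, field
from unittest.mock import MagicMock


@dataclass
class Result:
    """Stand-in for crawler_core.base.Result; the fields are assumptions."""
    success: bool
    status_code: int
    data: dict = field(default_factory=dict)


class BaseSearcher:
    """Stand-in for crawler_core.base.BaseSearcher (assumed to expose http_client)."""
    ENDPOINT = ""

    def __init__(self, client):
        # New attribute name; the old base class exposed the client as _http
        self.http_client = client

    def search(self, page_index=1):
        status_code, data = self._request({"page": page_index})
        # Assumed success rule for this sketch: HTTP 200 only
        return Result(success=status_code == 200, status_code=status_code, data=data)


class SearchCompanyPositions(BaseSearcher):
    ENDPOINT = "/demo/company/positions"  # hypothetical path

    def _request(self, params):
        # The one-line change from this task
        return self.http_client.get(self.ENDPOINT, params)


mock_client = MagicMock()
mock_client.get.return_value = (403, {})
result = SearchCompanyPositions(client=mock_client).search(page_index=1)
print(result)
```

If a `self._http` reference were left behind, a subclass built on the new base would fail with `AttributeError` on the first request, which is what the `grep "self\._http"` criterion guards against.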
Task 2.3: Update main.py
<read_first>
- `spiderJobs/platforms/zhilian/main.py` (current contents, 113 lines)
</read_first>
- Change line 32 to: `from crawler_core.base import BaseFetcher, BaseSearcher` (removing `from spiderJobs.core.base import BaseFetcher, BaseSearcher`).

Everything else stays unchanged (`main.py` has no sign import; signing is injected automatically through `ZhilianClient`).
<acceptance_criteria>
- `grep "from crawler_core.base import BaseFetcher, BaseSearcher" spiderJobs/platforms/zhilian/main.py` produces output
- `grep "spiderJobs.core" spiderJobs/platforms/zhilian/main.py` produces no output
</acceptance_criteria>
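Task 2.4: Turn sign.py into a backward-compatible stub
<read_first>
- `spiderJobs/platforms/zhilian/sign.py` (current contents, an 87-line standalone implementation)
- `crawler_core/zhilian/sign.py` (authoritative implementation)
</read_first>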
"""
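Backward-compatible stub: Zhilian Zhaopin signing.
The implementation has moved to crawler_core.zhilian.sign.
It is re-exported directly from crawler_core so that downstream code does not hit an ImportError.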
"""
from crawler_core.zhilian.sign import ZhilianSign # noqa: F401
__all__ = ["ZhilianSign"]
<acceptance_criteria>
- `grep "from crawler_core.zhilian.sign import ZhilianSign" spiderJobs/platforms/zhilian/sign.py` produces output
- `python -c "from spiderJobs.platforms.zhilian.sign import ZhilianSign; print(ZhilianSign().generate_uuid())"` prints a UUID successfully
</acceptance_criteria>
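The re-export stub works because an import through the old path yields the very same class object as an import through the new one. The sketch below demonstrates this with two throwaway in-memory modules; the names `demo_core_sign` and `demo_legacy_sign` are hypothetical, and the `generate_uuid` body is a placeholder that simply wraps `uuid.uuid4()`, not the real implementation.

```python
import sys
import types
import uuid


class ZhilianSign:
    """Placeholder for the authoritative class; the real logic is not modelled."""

    def generate_uuid(self):
        return str(uuid.uuid4())


# Register a fake "new home" module (hypothetical name)
core = types.ModuleType("demo_core_sign")
core.ZhilianSign = ZhilianSign
sys.modules["demo_core_sign"] = core

# Build the stub module with the same two statements the plan's sign.py uses
legacy = types.ModuleType("demo_legacy_sign")
exec(
    "from demo_core_sign import ZhilianSign  # noqa: F401\n"
    "__all__ = ['ZhilianSign']\n",
    legacy.__dict__,
)
sys.modules["demo_legacy_sign"] = legacy

# Downstream code that still imports from the old location
from demo_legacy_sign import ZhilianSign as LegacySign

same_class = LegacySign is ZhilianSign
print(same_class, LegacySign().generate_uuid())
```

Since both names refer to one class, `isinstance` checks and `MagicMock(spec=...)` behave identically regardless of which import path a caller uses.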
Task 2.5: Create tests/zhilian/__init__.py
Create `tests/zhilian/__init__.py` with the content: `# tests/zhilian/`
Task 2.6: Write tests/zhilian/test_zhilian_client.py
<read_first>
- `spiderJobs/platforms/zhilian/api.py` (post-migration version)
- `spiderJobs/platforms/zhilian/client.py` (post-migration version)
- `crawler_core/zhilian/sign.py` (ZhilianSign interface)
- `tests/boss/test_boss_client.py` (style reference)
</read_first>
"""
Mock tests for the Zhilian Zhaopin HTTP layer (QUAL-03 / ARCH-05).
MagicMock stands in for the real HTTP client, so there is no network dependency.
"""
from __future__ import annotations
from unittest.mock import MagicMock
from crawler_core.zhilian.sign import ZhilianSign
from spiderJobs.platforms.zhilian.api import (
    SearchPositions, GetPositionDetail, GetCompanyExtDetail,
    GetCompanyDetail, SearchCompanyPositions,
)
from spiderJobs.platforms.zhilian.client import ZhilianClient
from crawler_core.base import Result

# ── 1. SearchPositions (POST cgate) ─────────────────────
class TestSearchPositions:
    def _make_client(self, status_code=200, data=None):
        # The mocked client returns a (status_code, data) tuple
        mock_client = MagicMock()
        mock_client.post.return_value = (status_code, data or {})
        return mock_client

    def test_search_success_returns_list(self):
        data = {
            "data": {"list": [{"title": "Python Engineer"}], "numFound": 1},
            "pageInfo": {"pageNum": 1, "pageSize": 15, "totalNum": 1}
        }
        searcher = SearchPositions(keyword="Python", city_code=538,
                                   client=self._make_client(200, data))
        result = searcher.search(page_index=1)
        assert result.success is True

    def test_search_http_error(self):
        searcher = SearchPositions(client=self._make_client(403, {}))
        result = searcher.search(page_index=1)
        assert result.success is False
        assert result.status_code == 403

# ── 2. GetPositionDetail (GET cgate) ────────────────────
class TestGetPositionDetail:
    def test_fetch_success(self):
        mock_client = MagicMock()
        mock_client.get.return_value = (200, {"data": {"jobName": "Senior Engineer"}})
        fetcher = GetPositionDetail(number="CC123456", client=mock_client)
        result = fetcher.fetch()
        assert result.success is True

    def test_fetch_404(self):
        mock_client = MagicMock()
        mock_client.get.return_value = (404, {})
        fetcher = GetPositionDetail(number="notexist", client=mock_client)
        result = fetcher.fetch()
        assert result.success is False
        assert result.status_code == 404

# ── 3. GetCompanyExtDetail (GET cgate) ──────────────────
class TestGetCompanyExtDetail:
    def test_fetch_success(self):
        mock_client = MagicMock()
        mock_client.get.return_value = (200, {"data": {"companyName": "Test Company"}})
        fetcher = GetCompanyExtDetail(
            company_name="Test Company", company_number="CZ123", client=mock_client)
        result = fetcher.fetch()
        assert result.success is True

# ── 4. GetCompanyDetail (GET cgate) ─────────────────────
class TestGetCompanyDetail:
    def test_fetch_success(self):
        mock_client = MagicMock()
        mock_client.get.return_value = (200, {"data": {"companyNumber": "CZ123"}})
        fetcher = GetCompanyDetail(number="CZ123", client=mock_client)
        result = fetcher.fetch()
        assert result.success is True

    def test_fetch_http_error(self):
        mock_client = MagicMock()
        mock_client.get.return_value = (500, {})
        fetcher = GetCompanyDetail(number="CZ123", client=mock_client)
        result = fetcher.fetch()
        assert result.success is False

# ── 5. SearchCompanyPositions (GET capi) ────────────────
class TestSearchCompanyPositions:
    def test_search_success(self):
        # spec=ZhilianSign restricts the mock to the real signer's attributes
        mock_signer = MagicMock(spec=ZhilianSign)
        mock_signer.sign_params.return_value = {"at": "", "rt": ""}
        mock_client = MagicMock()
        mock_client.signer = mock_signer
        mock_client.get.return_value = (200, {"data": {"list": [{"jobName": "Test Position"}]},
                                              "pageInfo": {}})
        searcher = SearchCompanyPositions(company_id="CZ123", client=mock_client)
        result = searcher.search(page_index=1)
        assert result.success is True
        assert mock_signer.sign_params.called

    def test_search_http_error(self):
        mock_signer = MagicMock(spec=ZhilianSign)
        mock_signer.sign_params.return_value = {}
        mock_client = MagicMock()
        mock_client.signer = mock_signer
        mock_client.get.return_value = (403, {})
        searcher = SearchCompanyPositions(company_id="CZ123", client=mock_client)
        result = searcher.search(page_index=1)
        assert result.success is False

# ── 6. ZhilianClient: signed header injection ───────────
class TestZhilianClientHeaders:
    def test_sign_headers_injects_at_rt(self):
        signer = ZhilianSign(at="mock_at", rt="mock_rt")
        client = ZhilianClient(signer=signer)
        headers = client.signer.sign_headers()
        assert headers["x-zp-at"] == "mock_at"
        assert headers["x-zp-rt"] == "mock_rt"

    def test_sign_headers_has_required_keys(self):
        client = ZhilianClient()
        headers = client.signer.sign_headers()
        for key in ["x-zp-at", "x-zp-rt", "x-zp-action-id", "x-zp-device-id"]:
            assert key in headers

    def test_default_signer_empty_tokens(self):
        client = ZhilianClient()
        headers = client.signer.sign_headers()
        assert headers["x-zp-at"] == ""
        assert headers["x-zp-rt"] == ""
<acceptance_criteria>
- `test -f tests/zhilian/test_zhilian_client.py && echo "OK"` prints OK
- `pipenv run python -m pytest tests/zhilian/ -v` passes in full (>= 12 tests)
</acceptance_criteria>
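The `SearchCompanyPositions` tests build the signer with `MagicMock(spec=ZhilianSign)` rather than a bare `MagicMock()`. The sketch below shows what that buys: a spec'd mock raises `AttributeError` for attributes the class does not define, so a misspelled method name fails loudly instead of silently returning another mock. The `ZhilianSign` class here is a stand-in with placeholder bodies, its `sign_params` signature is an assumption, and the `companyId` key is illustrative.

```python
from unittest.mock import MagicMock


class ZhilianSign:
    """Stand-in exposing the two methods this plan names; bodies are placeholders."""

    def sign_headers(self, page_code=""):
        return {}

    def sign_params(self, params=None):
        return {}


mock_signer = MagicMock(spec=ZhilianSign)
mock_signer.sign_params.return_value = {"at": "", "rt": ""}

# A call to a method that exists on the spec works and is recorded
signed = mock_signer.sign_params({"companyId": "CZ123"})  # illustrative key

# A misspelled method is rejected because it is not on the spec
try:
    mock_signer.sign_parms()  # deliberate typo
    typo_caught = False
except AttributeError:
    typo_caught = True

print(signed, typo_caught)
```

With an unspecced `MagicMock()`, the misspelled call would succeed and the `assert mock_signer.sign_params.called` check would be the only line standing between a typo in `api.py` and a green test run.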
Verification
# 1. Verify that every zhilian module imports correctly
pipenv run python -c "
from spiderJobs.platforms.zhilian.client import ZhilianClient, create_cgate_client, create_capi_client
from spiderJobs.platforms.zhilian.api import SearchPositions, GetPositionDetail, GetCompanyDetail, SearchCompanyPositions
from spiderJobs.platforms.zhilian.main import main, create_searcher
from crawler_core.base import BaseFetcher, BaseSearcher
assert issubclass(SearchPositions, BaseSearcher)
assert issubclass(GetPositionDetail, BaseFetcher)
print('✅ All zhilian modules import successfully and the inheritance is correct')
"
# 2. Confirm that no legacy dependency remains
grep -rn "spiderJobs.core" spiderJobs/platforms/zhilian/client.py spiderJobs/platforms/zhilian/api.py spiderJobs/platforms/zhilian/main.py spiderJobs/platforms/zhilian/sign.py && echo "❌ Legacy dependency still present" || echo "✅ No legacy dependency"
# 3. Run the mock tests
pipenv run python -m pytest tests/zhilian/ -v
# 4. Full regression across all three platforms
pipenv run python -m pytest tests/ -v --tb=short
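
As a complement to step 3, the ">= 12 tests" threshold could be checked statically before running pytest. The sketch below counts functions whose names start with `test_` by walking the syntax tree. `count_tests` is a hypothetical helper, and the count is only an approximation of what pytest would collect (parametrized tests, for instance, would be counted once).

```python
import ast


def count_tests(source: str) -> int:
    """Count function definitions named test_* anywhere in the given source.

    Hypothetical helper; approximates pytest collection without running it.
    """
    tree = ast.parse(source)
    return sum(
        1
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        and node.name.startswith("test_")
    )


# Illustrative input: two test methods, one helper, one module-level test
sample = '''
class TestA:
    def _make_client(self):
        pass

    def test_one(self):
        pass

    def test_two(self):
        pass


def test_three():
    pass
'''

print(count_tests(sample))  # prints 3
```

Applied to the real file, the check would read the source from `tests/zhilian/test_zhilian_client.py` and compare the result against 12.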