
phase: 3
plan: 2
wave: 1
title: Migrate the Zhilian Zhaopin (zhilian) layer to crawler_core + mock tests
depends_on: (none listed)
files_modified:
  • spiderJobs/platforms/zhilian/client.py
  • spiderJobs/platforms/zhilian/api.py
  • spiderJobs/platforms/zhilian/main.py
  • spiderJobs/platforms/zhilian/sign.py
  • tests/zhilian/__init__.py
  • tests/zhilian/test_zhilian_client.py
autonomous: true
requirements: ARCH-05

Phase 3 Plan 02: Migrate the Zhilian Zhaopin (zhilian) layer to crawler_core + mock tests

Objective

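Switch spiderJobs/platforms/zhilian/ from depending on spiderJobs.core (the old base classes) to depending on crawler_core (the new base classes), and add mock tests in tests/zhilian/test_zhilian_client.py to satisfy ARCH-05.

Key differences from job51:

  • zhilian api.py uses the default parse_response (it defines no custom _parse_response function), so there is no ApiResult to replace
  • zhilian client.py must explicitly preserve the ZhilianSign sign_headers() / sign_params() interface
  • SearchCompanyPositions._build_params() reaches the signer via self._client.signer.sign_params(); this is unaffected by the migration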

Must Haves

  • client.py inherits from crawler_core.http_client.HTTPClient and uses crawler_core.zhilian.sign.ZhilianSign
  • api.py uses crawler_core.base.BaseFetcher/BaseSearcher
  • In api.py, the single occurrence of self._http.get( is replaced with self.http_client.get( (line 200)
  • main.py imports are updated to crawler_core.base
  • sign.py becomes a backward-compatibility stub (re-exporting crawler_core.zhilian.sign.ZhilianSign)
  • grep -rn "spiderJobs.core" spiderJobs/platforms/zhilian/{client,api,main,sign}.py produces no output
  • tests/zhilian/__init__.py exists
  • tests/zhilian/test_zhilian_client.py exists
  • pytest tests/zhilian/ -v passes in full (>= 12 tests)

Wave 1

Task 2.1: Update client.py

<read_first>

  • spiderJobs/platforms/zhilian/client.py (current contents)
  • crawler_core/http_client.py (target base class)
  • crawler_core/zhilian/sign.py (new source of ZhilianSign) </read_first>
Modify `spiderJobs/platforms/zhilian/client.py`
  1. Change line 10 to:
    from crawler_core.http_client import HTTPClient
    
  2. Change line 11 to:
    from crawler_core.zhilian.sign import ZhilianSign
    
    (removing from spiderJobs.core.http_client import HTTPClient and from spiderJobs.platforms.zhilian.sign import ZhilianSign)

Note: ZhilianClient.get/post override the parent methods and call self.signer.sign_headers(page_code). That is the ZhilianSign interface, and the migration does not affect it (the method signatures are identical). Everything else stays unchanged.
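
The override pattern described in the note can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's code: StubHTTPClient, StubSigner, SketchClient, the request path, and the x-sketch-page-code header are all hypothetical stand-ins, and the real HTTPClient and ZhilianSign signatures may differ. Only the x-zp-at / x-zp-rt header names and the page_code argument come from this plan.

```python
from typing import Any, Dict, Optional, Tuple


class StubHTTPClient:
    """Hypothetical stand-in for the real base class; it never touches the network."""

    def get(self, url: str, params: Optional[dict] = None,
            headers: Optional[dict] = None) -> Tuple[int, Dict[str, Any]]:
        # Echo the request back so the caller can inspect what would be sent.
        return 200, {"url": url, "params": params or {}, "headers": headers or {}}


class StubSigner:
    """Hypothetical signer exposing a sign_headers(page_code) entry point."""

    def __init__(self, at: str = "", rt: str = "") -> None:
        self.at = at
        self.rt = rt

    def sign_headers(self, page_code: str = "") -> Dict[str, str]:
        # "x-sketch-page-code" is an invented header name, used only for illustration.
        return {"x-zp-at": self.at, "x-zp-rt": self.rt, "x-sketch-page-code": page_code}


class SketchClient(StubHTTPClient):
    """Overrides get() to merge signed headers into every request."""

    def __init__(self, signer: Optional[StubSigner] = None) -> None:
        self.signer = signer or StubSigner()

    def get(self, url: str, params: Optional[dict] = None,
            headers: Optional[dict] = None,
            page_code: str = "") -> Tuple[int, Dict[str, Any]]:
        # Signed headers win over caller-supplied headers with the same name.
        merged = {**(headers or {}), **self.signer.sign_headers(page_code)}
        return super().get(url, params=params, headers=merged)


status, echoed = SketchClient(StubSigner(at="mock_at", rt="mock_rt")).get(
    "/hypothetical/path", params={"q": "python"}, page_code="demo")
```

Because the subclass only relies on the signer offering sign_headers(page_code), swapping where the signer class is imported from should not change the client's behavior as long as that method keeps the same signature.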

<acceptance_criteria>

  • grep "from crawler_core.http_client import HTTPClient" spiderJobs/platforms/zhilian/client.py produces output
  • grep "from crawler_core.zhilian.sign import ZhilianSign" spiderJobs/platforms/zhilian/client.py produces output
  • grep "spiderJobs.core" spiderJobs/platforms/zhilian/client.py produces no output
  • python -c "from spiderJobs.platforms.zhilian.client import ZhilianClient" raises no ImportError </acceptance_criteria>

Task 2.2: Update api.py

<read_first>

  • spiderJobs/platforms/zhilian/api.py (full current contents, 229 lines)
  • crawler_core/base.py (BaseFetcher and BaseSearcher interfaces) </read_first>
Modify `spiderJobs/platforms/zhilian/api.py`
  1. Change line 10 to:

    from crawler_core.base import BaseFetcher, BaseSearcher
    

    (removing from spiderJobs.core.base import BaseFetcher, BaseSearcher)

  2. Change line 200 (SearchCompanyPositions._request()) to:

    return self.http_client.get(self.ENDPOINT, params)
    

    (previously return self._http.get(self.ENDPOINT, params))

Note: zhilian api.py has no ApiResult; it uses the default parser, so no ApiResult replacement is needed. The self._client.signer.sign_params() call in SearchCompanyPositions._build_params() does not need to change.
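
To illustrate why line 200 has to change, here is a small sketch under the assumption that the new base class exposes the client only as http_client. SketchBaseSearcher, SketchCompanySearcher, the endpoint path, and the pageIndex parameter are hypothetical; they are not taken from crawler_core.

```python
from unittest.mock import MagicMock


class SketchBaseSearcher:
    """Hypothetical base class that stores the client as `http_client`."""

    def __init__(self, client) -> None:
        self.http_client = client


class SketchCompanySearcher(SketchBaseSearcher):
    ENDPOINT = "/hypothetical/company/positions"  # placeholder path

    def _request(self, params: dict):
        # Post-migration form of the call.
        return self.http_client.get(self.ENDPOINT, params)

    def _request_legacy(self, params: dict):
        # Pre-migration form; fails here because `_http` is never set.
        return self._http.get(self.ENDPOINT, params)


mock_client = MagicMock()
mock_client.get.return_value = (200, {"data": {"list": []}})
searcher = SketchCompanySearcher(mock_client)

new_result = searcher._request({"pageIndex": 1})

try:
    searcher._request_legacy({"pageIndex": 1})
    legacy_failed = False
except AttributeError:
    legacy_failed = True
```

Under that assumption the old attribute name would fail only at call time, not at import time, which is why the acceptance criteria grep for self._http rather than relying on the import check alone.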

<acceptance_criteria>

  • grep "from crawler_core.base import BaseFetcher, BaseSearcher" spiderJobs/platforms/zhilian/api.py produces output
  • grep "spiderJobs.core" spiderJobs/platforms/zhilian/api.py produces no output
  • grep "self\._http" spiderJobs/platforms/zhilian/api.py produces no output
  • python -c "from spiderJobs.platforms.zhilian.api import SearchPositions, GetPositionDetail, SearchCompanyPositions" raises no ImportError </acceptance_criteria>

Task 2.3: Update main.py

<read_first>

  • spiderJobs/platforms/zhilian/main.py (current contents, 113 lines) </read_first>
Modify `spiderJobs/platforms/zhilian/main.py`
  1. Change line 32 to:
    from crawler_core.base import BaseFetcher, BaseSearcher
    
    (removing from spiderJobs.core.base import BaseFetcher, BaseSearcher)

Everything else stays unchanged (there is no sign import; in main.py, signing is injected automatically through ZhilianClient).

<acceptance_criteria>

  • grep "from crawler_core.base import BaseFetcher, BaseSearcher" spiderJobs/platforms/zhilian/main.py produces output
  • grep "spiderJobs.core" spiderJobs/platforms/zhilian/main.py produces no output </acceptance_criteria>

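Task 2.4: Turn sign.py into a backward-compatibility stub

<read_first>

  • spiderJobs/platforms/zhilian/sign.py (current contents: a standalone 87-line implementation)
  • crawler_core/zhilian/sign.py (authoritative implementation) </read_first>
Replace `spiderJobs/platforms/zhilian/sign.py` entirely with:
"""
Backward-compatibility stub: Zhilian Zhaopin signing

Moved to crawler_core.zhilian.sign.
Re-exported directly from crawler_core so that downstream code does not hit an ImportError.
"""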

from crawler_core.zhilian.sign import ZhilianSign  # noqa: F401

__all__ = ["ZhilianSign"]

<acceptance_criteria>

  • grep "from crawler_core.zhilian.sign import ZhilianSign" spiderJobs/platforms/zhilian/sign.py produces output
  • python -c "from spiderJobs.platforms.zhilian.sign import ZhilianSign; print(ZhilianSign().generate_uuid())" prints a UUID successfully </acceptance_criteria>
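
The re-export stub can be checked for identity as well as importability. The sketch below simulates both modules in memory so it runs without the real packages; the module names sketch_core_sign and sketch_legacy_sign and the empty placeholder class are hypothetical, and only the shape of the stub (one import plus __all__) mirrors the file above.

```python
import sys
import types

# Stand-in for the new, authoritative module (hypothetical name).
new_mod = types.ModuleType("sketch_core_sign")


class ZhilianSign:
    """Placeholder class; the real implementation is not reproduced here."""


new_mod.ZhilianSign = ZhilianSign
sys.modules["sketch_core_sign"] = new_mod

# Stand-in for the legacy module, whose whole body is the re-export.
STUB_SOURCE = (
    "from sketch_core_sign import ZhilianSign  # noqa: F401\n"
    "__all__ = ['ZhilianSign']\n"
)
legacy_mod = types.ModuleType("sketch_legacy_sign")
exec(STUB_SOURCE, legacy_mod.__dict__)
sys.modules["sketch_legacy_sign"] = legacy_mod

# Both import paths should resolve to the very same class object.
from sketch_legacy_sign import ZhilianSign as LegacySign
from sketch_core_sign import ZhilianSign as CoreSign

same_object = LegacySign is CoreSign
```

Identity matters because isinstance checks and MagicMock(spec=ZhilianSign) in downstream code keep working regardless of which path the class was imported from.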

Task 2.5: Create tests/zhilian/__init__.py

Create `tests/zhilian/__init__.py` with the contents: `# tests/zhilian/`

Task 2.6: Write tests/zhilian/test_zhilian_client.py

<read_first>

  • spiderJobs/platforms/zhilian/api.py (post-migration version)
  • spiderJobs/platforms/zhilian/client.py (post-migration version)
  • crawler_core/zhilian/sign.py (ZhilianSign interface)
  • tests/boss/test_boss_client.py (style reference) </read_first>
Create `tests/zhilian/test_zhilian_client.py` containing the following tests:
"""
Zhilian Zhaopin HTTP-layer mock tests (QUAL-03 / ARCH-05)

Uses MagicMock in place of the real HTTP client, so there is no network dependency.
"""

from __future__ import annotations
from unittest.mock import MagicMock
from crawler_core.zhilian.sign import ZhilianSign
from spiderJobs.platforms.zhilian.api import (
    SearchPositions, GetPositionDetail, GetCompanyExtDetail,
    GetCompanyDetail, SearchCompanyPositions,
)
from spiderJobs.platforms.zhilian.client import ZhilianClient
from crawler_core.base import Result


# ── 1. SearchPositions (POST cgate) ─────────────────────

class TestSearchPositions:

    def _make_client(self, status_code=200, data=None):
        mock_client = MagicMock()
        mock_client.post.return_value = (status_code, data or {})
        return mock_client

    def test_search_success_returns_list(self):
        data = {
            "data": {"list": [{"title": "Python 工程师"}], "numFound": 1},
            "pageInfo": {"pageNum": 1, "pageSize": 15, "totalNum": 1}
        }
        searcher = SearchPositions(keyword="Python", city_code=538,
                                   client=self._make_client(200, data))
        result = searcher.search(page_index=1)
        assert result.success is True

    def test_search_http_error(self):
        searcher = SearchPositions(client=self._make_client(403, {}))
        result = searcher.search(page_index=1)
        assert result.success is False
        assert result.status_code == 403


# ── 2. GetPositionDetail (GET cgate) ────────────────────

class TestGetPositionDetail:

    def test_fetch_success(self):
        mock_client = MagicMock()
        mock_client.get.return_value = (200, {"data": {"jobName": "高级工程师"}})
        fetcher = GetPositionDetail(number="CC123456", client=mock_client)
        result = fetcher.fetch()
        assert result.success is True

    def test_fetch_404(self):
        mock_client = MagicMock()
        mock_client.get.return_value = (404, {})
        fetcher = GetPositionDetail(number="notexist", client=mock_client)
        result = fetcher.fetch()
        assert result.success is False
        assert result.status_code == 404


# ── 3. GetCompanyExtDetail (GET cgate) ──────────────────

class TestGetCompanyExtDetail:

    def test_fetch_success(self):
        mock_client = MagicMock()
        mock_client.get.return_value = (200, {"data": {"companyName": "测试公司"}})
        fetcher = GetCompanyExtDetail(
            company_name="测试公司", company_number="CZ123", client=mock_client)
        result = fetcher.fetch()
        assert result.success is True


# ── 4. GetCompanyDetail (GET cgate) ─────────────────────

class TestGetCompanyDetail:

    def test_fetch_success(self):
        mock_client = MagicMock()
        mock_client.get.return_value = (200, {"data": {"companyNumber": "CZ123"}})
        fetcher = GetCompanyDetail(number="CZ123", client=mock_client)
        result = fetcher.fetch()
        assert result.success is True

    def test_fetch_http_error(self):
        mock_client = MagicMock()
        mock_client.get.return_value = (500, {})
        fetcher = GetCompanyDetail(number="CZ123", client=mock_client)
        result = fetcher.fetch()
        assert result.success is False


# ── 5. SearchCompanyPositions (GET capi) ────────────────

class TestSearchCompanyPositions:

    def test_search_success(self):
        mock_signer = MagicMock(spec=ZhilianSign)
        mock_signer.sign_params.return_value = {"at": "", "rt": ""}
        mock_client = MagicMock()
        mock_client.signer = mock_signer
        mock_client.get.return_value = (200, {"data": {"list": [{"jobName": "测试岗位"}]},
                                               "pageInfo": {}})
        searcher = SearchCompanyPositions(company_id="CZ123", client=mock_client)
        result = searcher.search(page_index=1)
        assert result.success is True
        assert mock_signer.sign_params.called

    def test_search_http_error(self):
        mock_signer = MagicMock(spec=ZhilianSign)
        mock_signer.sign_params.return_value = {}
        mock_client = MagicMock()
        mock_client.signer = mock_signer
        mock_client.get.return_value = (403, {})
        searcher = SearchCompanyPositions(company_id="CZ123", client=mock_client)
        result = searcher.search(page_index=1)
        assert result.success is False


# ── 6. ZhilianClient: signed-header injection ───────────

class TestZhilianClientHeaders:

    def test_sign_headers_injects_at_rt(self):
        signer = ZhilianSign(at="mock_at", rt="mock_rt")
        client = ZhilianClient(signer=signer)
        headers = client.signer.sign_headers()
        assert headers["x-zp-at"] == "mock_at"
        assert headers["x-zp-rt"] == "mock_rt"

    def test_sign_headers_has_required_keys(self):
        client = ZhilianClient()
        headers = client.signer.sign_headers()
        for key in ["x-zp-at", "x-zp-rt", "x-zp-action-id", "x-zp-device-id"]:
            assert key in headers

    def test_default_signer_empty_tokens(self):
        client = ZhilianClient()
        headers = client.signer.sign_headers()
        assert headers["x-zp-at"] == ""
        assert headers["x-zp-rt"] == ""

<acceptance_criteria>

  • test -f tests/zhilian/test_zhilian_client.py && echo "OK" prints OK
  • pipenv run python -m pytest tests/zhilian/ -v passes in full (>= 12 tests) </acceptance_criteria>
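
The tests above rely on the mocked client returning a (status_code, data) tuple. The sketch below shows that contract in isolation, together with an assertion on the arguments the mock received. SketchResult, SketchFetcher, and the endpoint path are hypothetical, and treating only HTTP 200 as success is an assumption about the default parser, not something this plan confirms.

```python
from dataclasses import dataclass, field
from unittest.mock import MagicMock


@dataclass
class SketchResult:
    """Hypothetical result object with the fields the tests assert on."""
    success: bool
    status_code: int
    data: dict = field(default_factory=dict)


class SketchFetcher:
    ENDPOINT = "/hypothetical/position/detail"  # placeholder path

    def __init__(self, number: str, client) -> None:
        self.number = number
        self.http_client = client

    def fetch(self) -> SketchResult:
        status_code, body = self.http_client.get(self.ENDPOINT, {"number": self.number})
        # Assumption: only HTTP 200 counts as success.
        return SketchResult(success=status_code == 200, status_code=status_code, data=body)


def make_client(status_code: int = 200, data=None) -> MagicMock:
    """Build a mock client whose get() returns a (status_code, data) tuple."""
    client = MagicMock()
    client.get.return_value = (status_code, data or {})
    return client


ok_client = make_client(200, {"data": {"jobName": "demo"}})
ok = SketchFetcher("CC123456", ok_client).fetch()
missing = SketchFetcher("notexist", make_client(404)).fetch()
```

Checking the call arguments with assert_called_once_with is optional, but it catches a fetcher that succeeds against the mock while sending the wrong endpoint or parameters.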

Verification

# 1. Verify that all zhilian modules import correctly
pipenv run python -c "
from spiderJobs.platforms.zhilian.client import ZhilianClient, create_cgate_client, create_capi_client
from spiderJobs.platforms.zhilian.api import SearchPositions, GetPositionDetail, GetCompanyDetail, SearchCompanyPositions
from spiderJobs.platforms.zhilian.main import main, create_searcher
from crawler_core.base import BaseFetcher, BaseSearcher
assert issubclass(SearchPositions, BaseSearcher)
assert issubclass(GetPositionDetail, BaseFetcher)
print('✅ All zhilian modules imported successfully; inheritance is correct')
"

# 2. Confirm that no old dependencies remain
grep -rn "spiderJobs.core" spiderJobs/platforms/zhilian/client.py spiderJobs/platforms/zhilian/api.py spiderJobs/platforms/zhilian/main.py spiderJobs/platforms/zhilian/sign.py && echo "❌ Old dependencies remain" || echo "✅ No old dependencies"

# 3. Run the mock tests
pipenv run python -m pytest tests/zhilian/ -v

# 4. Full regression across all three platforms
pipenv run python -m pytest tests/ -v --tb=short
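
Step 2 could also be expressed as a Python check, for example inside a test. The helper below is a hypothetical sketch, not part of the plan: find_stale_imports is an invented name, and it does a plain substring match, which is slightly stricter than the grep pattern (where the dot matches any character).

```python
import tempfile
from pathlib import Path
from typing import Iterable, List, Tuple


def find_stale_imports(paths: Iterable[Path],
                       needle: str = "spiderJobs.core") -> List[Tuple[str, int, str]]:
    """Return (file name, line number, stripped line) for each line containing `needle`."""
    hits: List[Tuple[str, int, str]] = []
    for path in paths:
        lines = path.read_text(encoding="utf-8").splitlines()
        for lineno, line in enumerate(lines, start=1):
            if needle in line:
                hits.append((path.name, lineno, line.strip()))
    return hits


# Demonstrate on two throwaway files: one migrated, one still on the old import.
with tempfile.TemporaryDirectory() as tmp:
    clean = Path(tmp) / "client.py"
    clean.write_text("from crawler_core.http_client import HTTPClient\n", encoding="utf-8")
    stale = Path(tmp) / "api.py"
    stale.write_text("import os\nfrom spiderJobs.core.base import BaseFetcher\n",
                     encoding="utf-8")
    hits = find_stale_imports([clean, stale])
```

An empty result corresponds to the "no old dependencies" branch of the shell check.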