| phase | plan | wave | title | depends_on | files_modified | autonomous | requirements |
|---|---|---|---|---|---|---|---|
| 4 | 1 | 1 | Migrate facade-layer imports to spiderJobs.platforms.* + asyncio.to_thread bridge | | | true | |
Phase 4 Plan 01: Migrate facade-layer imports to spiderJobs.platforms.* + asyncio.to_thread bridge
Objective
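Migrate the three facade files in app/services/crawler/ (boss.py, qcwy.py, zhilian.py) from
importing the internal private copies (_boss_api.py, _boss_client.py, etc.) to importing
spiderJobs.platforms.* directly (already built on crawler_core), satisfying ARCH-06/ARCH-07.
The public interface (set_proxy(), get_job_detail(), etc.) stays exactly the same.
In addition, give each Service async wrapper methods built on asyncio.to_thread() (ARCH-06).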
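The asyncio.to_thread() bridge named in the objective can be sketched as follows. This is a minimal illustration, not the project's code: `DemoService` and its `get_job_detail` body are hypothetical stand-ins for a synchronous facade service, and only `asyncio.to_thread` (Python 3.9+) is a real API.

```python
import asyncio
import time
from typing import Dict, Optional


class DemoService:
    """Hypothetical stand-in for a synchronous facade service."""

    def get_job_detail(self, job_id: str) -> Optional[Dict]:
        # Simulate a blocking HTTP call made by a synchronous client.
        time.sleep(0.05)
        return {"job_id": job_id}

    async def async_get_job_detail(self, job_id: str) -> Optional[Dict]:
        # Run the blocking method in a worker thread so the event loop stays free.
        return await asyncio.to_thread(self.get_job_detail, job_id)


async def main() -> Optional[Dict]:
    return await DemoService().async_get_job_detail("demo-1")
```

The synchronous method is left untouched, which is how the public interface can stay unchanged while async callers gain an awaitable entry point.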
Must Haves
- boss.py imports from spiderJobs.platforms.boss.{api,client,sign}
- qcwy.py imports from spiderJobs.platforms.job51.{api,client}
- zhilian.py imports from spiderJobs.platforms.zhilian.{api,client,sign}
- Each of the three Services gains async methods (asyncio.to_thread wrappers)
- python -c "from app.services.crawler.boss import BossService" raises no ImportError
- pytest tests/ -v passes in full (no regressions)
Wave 1
Task 1.1: Update boss.py
<read_first>
- app/services/crawler/boss.py (current contents, 116 lines)
- spiderJobs/platforms/boss/api.py (exports GetBrandDetail/GetJobDetail/SearchBrandJobs/SearchRecJobs)
- spiderJobs/platforms/boss/client.py (exports BossClient/create_client, including batch())
- spiderJobs/platforms/boss/sign.py (BossSign → crawler_core stub)
</read_first>
- Replace the import block (lines 12-19) with:

      from spiderJobs.platforms.boss.api import (
          GetBrandDetail,
          GetJobDetail,
          SearchBrandJobs,
          SearchRecJobs,
      )
      from spiderJobs.platforms.boss.client import BossClient, create_client
      from spiderJobs.platforms.boss.sign import BossSign

- Append the async wrapper methods to the end of the BossService class:

      # ── asyncio.to_thread bridge (ARCH-06) ────────
      async def async_get_job_detail(
          self, job_id: str, lid: str = "", security_id: str = ""
      ) -> Optional[Dict]:
          import asyncio
          return await asyncio.to_thread(self.get_job_detail_by_id, job_id, lid, security_id)

      async def async_get_company_detail(self, company_id: str) -> Optional[Dict]:
          import asyncio
          return await asyncio.to_thread(self.get_company_detail_by_id, company_id)

      async def async_get_company_jobs(
          self, company_id: str, page: int = 1
      ) -> Optional[Dict]:
          import asyncio
          return await asyncio.to_thread(self.get_company_jobs_by_id, company_id, page)

      async def async_search_jobs(
          self, keyword: str, city_code: str = "101010100", page: int = 1
      ) -> Optional[Dict]:
          import asyncio
          return await asyncio.to_thread(self.search_jobs, keyword, city_code, page)
<acceptance_criteria>
- grep "from spiderJobs.platforms.boss" app/services/crawler/boss.py produces output
- grep "app.services.crawler._boss" app/services/crawler/boss.py produces no output
- grep "asyncio.to_thread" app/services/crawler/boss.py produces output
- pipenv run python -c "from app.services.crawler.boss import BossService" succeeds
</acceptance_criteria>
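The grep-based criteria above match on text only. As a complementary sketch, the same "no private copies imported" check can be done on the parsed import statements, which would also catch a relative import such as `from ._boss_api import ...` that the `app.services.crawler._boss` pattern does not match. The helper names and the `SAMPLE` source are hypothetical; only the standard-library `ast` module is used, and whether the facades ever use relative imports is not stated in this plan.

```python
import ast
from typing import List


def imported_modules(source: str) -> List[str]:
    """Return the module names referenced by import statements in `source`."""
    modules: List[str] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ImportFrom) and node.module:
            modules.append(node.module)
        elif isinstance(node, ast.Import):
            modules.extend(alias.name for alias in node.names)
    return modules


def stale_imports(source: str, private_prefix: str) -> List[str]:
    """List imported modules whose last path segment starts with `private_prefix`."""
    return [
        module for module in imported_modules(source)
        if module.rsplit(".", 1)[-1].startswith(private_prefix)
    ]


# Hypothetical facade source mixing a migrated import with two stale ones.
SAMPLE = """
from spiderJobs.platforms.boss.api import GetJobDetail
from app.services.crawler._boss_client import BossClient
from ._boss_api import GetBrandDetail
"""
```

Calling `stale_imports(SAMPLE, "_boss")` reports the two private-copy modules and ignores the migrated `spiderJobs.platforms.boss.api` import.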
Task 1.2: Update qcwy.py
<read_first>
- app/services/crawler/qcwy.py (current contents, 103 lines)
- spiderJobs/platforms/job51/api.py (exports GetCompanyInfo/GetJobDetail/SearchCompanyJobs/SearchRecommendJobs)
- spiderJobs/platforms/job51/client.py (exports Job51Client/create_client)
</read_first>
- Replace the import block (lines 12-18) with:

      from spiderJobs.platforms.job51.api import (
          GetCompanyInfo,
          GetJobDetail,
          SearchCompanyJobs,
          SearchRecommendJobs,
      )
      from spiderJobs.platforms.job51.client import Job51Client, create_client

- Append the async wrapper methods to the end of the QcwyService class:

      # ── asyncio.to_thread bridge (ARCH-06) ────────
      async def async_get_job_detail(self, job_id: str) -> Dict:
          import asyncio
          return await asyncio.to_thread(self.get_job_detail, job_id)

      async def async_get_company_info(self, company_id: str) -> Dict:
          import asyncio
          return await asyncio.to_thread(self.get_company_info, company_id)

      async def async_get_company_jobs(
          self, company_id: str, page: int = 1, page_size: int = 30, **kwargs
      ) -> Dict:
          import asyncio
          # Note: **kwargs is accepted here but not forwarded to get_company_jobs_by_id.
          return await asyncio.to_thread(
              self.get_company_jobs_by_id, company_id, page, page_size
          )

      async def async_search_jobs(
          self, keyword: str, job_area: str = "020000", page: int = 1
      ) -> List:
          import asyncio
          return await asyncio.to_thread(self.search_jobs, keyword, job_area, page)
<acceptance_criteria>
- grep "from spiderJobs.platforms.job51" app/services/crawler/qcwy.py produces output
- grep "app.services.crawler._job51" app/services/crawler/qcwy.py produces no output
- grep "asyncio.to_thread" app/services/crawler/qcwy.py produces output
- pipenv run python -c "from app.services.crawler.qcwy import QcwyService" succeeds
</acceptance_criteria>
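The async_get_company_jobs wrapper in this task accepts **kwargs but passes only the three positional arguments on. Whether the real get_company_jobs_by_id takes extra keyword arguments is not stated in this plan; if it does, asyncio.to_thread can forward them, as in the sketch below. `DemoCompanyService`, its method names, and the `salary` filter are hypothetical.

```python
import asyncio
from typing import Any, Dict


class DemoCompanyService:
    """Hypothetical synchronous service whose method accepts extra keyword filters."""

    def get_company_jobs(
        self, company_id: str, page: int = 1, page_size: int = 30, **kwargs: Any
    ) -> Dict:
        return {
            "company_id": company_id,
            "page": page,
            "page_size": page_size,
            "filters": kwargs,
        }

    async def async_get_company_jobs(
        self, company_id: str, page: int = 1, page_size: int = 30, **kwargs: Any
    ) -> Dict:
        # asyncio.to_thread forwards both positional and keyword arguments.
        return await asyncio.to_thread(
            self.get_company_jobs, company_id, page, page_size, **kwargs
        )
```

If the synchronous method does not accept extra keywords, dropping **kwargs from the wrapper signature would make a mistyped argument fail loudly instead of being silently ignored.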
Task 1.3: Update zhilian.py
<read_first>
- app/services/crawler/zhilian.py (current contents, 143 lines)
- spiderJobs/platforms/zhilian/api.py (exports GetCompanyDetail/GetPositionDetail/SearchCompanyPositions/SearchPositions)
- spiderJobs/platforms/zhilian/client.py (exports ZhilianClient/create_capi_client/create_cgate_client)
- spiderJobs/platforms/zhilian/sign.py (ZhilianSign → crawler_core stub)
</read_first>
- Replace the import block (lines 12-23) with:

      from spiderJobs.platforms.zhilian.api import (
          GetCompanyDetail,
          GetPositionDetail,
          SearchCompanyPositions,
          SearchPositions,
      )
      from spiderJobs.platforms.zhilian.client import (
          ZhilianClient,
          create_capi_client,
          create_cgate_client,
      )
      from spiderJobs.platforms.zhilian.sign import ZhilianSign

- Append the async wrapper methods to the end of the ZhilianService class:

      # ── asyncio.to_thread bridge (ARCH-06) ────────
      async def async_get_job_detail(self, job_number: str) -> Optional[Dict]:
          import asyncio
          return await asyncio.to_thread(self.get_job_detail, job_number)

      async def async_get_company_detail(self, company_number: str) -> Optional[Dict]:
          import asyncio
          return await asyncio.to_thread(self.get_company_detail, company_number)

      async def async_get_company_jobs(
          self,
          company_number: str,
          page_index: int = 1,
          page_size: int = 30,
          work_city: Optional[int] = None,
      ) -> Optional[Dict]:
          import asyncio
          return await asyncio.to_thread(
              self.get_company_jobs_by_id, company_number, page_index, page_size, work_city
          )

      async def async_search_jobs(
          self,
          city_id: int = 801,
          page_size: int = 15,
          page_index: int = 1,
          job_level3_code: Optional[str] = None,
      ) -> List:
          import asyncio
          return await asyncio.to_thread(
              self.search_jobs, city_id, page_size, page_index, job_level3_code
          )
<acceptance_criteria>
- grep "from spiderJobs.platforms.zhilian" app/services/crawler/zhilian.py produces output
- grep "app.services.crawler._zhilian" app/services/crawler/zhilian.py produces no output
- grep "asyncio.to_thread" app/services/crawler/zhilian.py produces output
- pipenv run python -c "from app.services.crawler.zhilian import ZhilianService" succeeds
</acceptance_criteria>
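Once all three services expose async wrappers, an async caller can query them concurrently. The sketch below shows that pattern with asyncio.gather; `SlowService`, its `name` and `delay` attributes, and `search_all` are hypothetical and exist only for this demo, and the real services take different search parameters.

```python
import asyncio
import time
from typing import Dict, List


class SlowService:
    """Hypothetical blocking service; `name` and `delay` exist only for this demo."""

    def __init__(self, name: str, delay: float) -> None:
        self.name = name
        self.delay = delay

    def search_jobs(self, keyword: str) -> Dict:
        time.sleep(self.delay)  # stands in for a blocking network request
        return {"platform": self.name, "keyword": keyword}

    async def async_search_jobs(self, keyword: str) -> Dict:
        return await asyncio.to_thread(self.search_jobs, keyword)


async def search_all(keyword: str) -> List[Dict]:
    services = [SlowService(name, 0.1) for name in ("boss", "qcwy", "zhilian")]
    # gather() returns results in input order; the blocking calls overlap
    # because each one runs in its own worker thread.
    return await asyncio.gather(*(s.async_search_jobs(keyword) for s in services))
```

Because the blocking calls run in separate worker threads, the total wall time should be close to one delay rather than the sum of all three, subject to the size of the default thread pool.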
Verification
# 1. Verify that the three facade modules import correctly
pipenv run python -c "
from app.services.crawler.boss import BossService
from app.services.crawler.qcwy import QcwyService
from app.services.crawler.zhilian import ZhilianService
print('✅ All three facade modules imported successfully')
# Verify that no old imports remain
import inspect
for svc in [BossService, QcwyService, ZhilianService]:
    src = inspect.getsourcefile(svc)
    with open(src, encoding='utf-8') as f:
        content = f.read()
    assert '_boss_' not in content and '_job51_' not in content and '_zhilian_' not in content, f'{src} still contains old imports'
print('✅ No old imports remain')
# Verify that the async methods exist
assert hasattr(BossService, 'async_get_job_detail')
assert hasattr(QcwyService, 'async_get_company_info')
assert hasattr(ZhilianService, 'async_get_company_detail')
print('✅ asyncio bridge methods exist')
"
# 2. Verify that no old imports remain
grep -rn "from app.services.crawler._" app/services/crawler/boss.py app/services/crawler/qcwy.py app/services/crawler/zhilian.py && echo "❌ Old imports remain" || echo "✅ No old imports"
# 3. Full regression run
pipenv run python -m pytest tests/ -v --tb=short
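The hasattr checks in step 1 confirm that the names exist but not that they are coroutine functions. A slightly stricter check, sketched here with a hypothetical helper and a hypothetical `DemoService`, uses inspect.iscoroutinefunction; it could be pointed at BossService, QcwyService, and ZhilianService in the same way.

```python
import asyncio
import inspect
from typing import Iterable, List


def missing_async_wrappers(cls: type, names: Iterable[str]) -> List[str]:
    """Return the names that are absent on `cls` or are not `async def` methods."""
    return [
        name for name in names
        if not inspect.iscoroutinefunction(getattr(cls, name, None))
    ]


class DemoService:
    """Hypothetical class used only to exercise the helper."""

    def get_job_detail(self, job_id: str) -> dict:
        return {"job_id": job_id}

    async def async_get_job_detail(self, job_id: str) -> dict:
        return await asyncio.to_thread(self.get_job_detail, job_id)
```

An empty list from `missing_async_wrappers` means every listed name is defined with `async def`; a plain synchronous method or a missing attribute is reported by name.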