2 Commits

Author SHA1 Message Date
win
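45fba5697e fix: scheduled-task jobs on the company channel now also go through push_mapper + enrichment
company_cleaning_job → sync_company_jobs → store_batch(channel="company")
Previously the channel="company" job configs had no push_mapper, so:
- no push_data_list was generated → push_to_remote was never called
- company_desc completion was never triggered

Add the corresponding push_mapper to the channel="company" config on all three platforms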
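
A minimal sketch of the failure mode described above, assuming the push list is only built when the channel config carries a push_mapper. The names `build_push_data_list` and `company_push_mapper` and the record fields are hypothetical and are not taken from the repository; only `push_mapper` and `push_data_list` come from this message.

```python
from typing import Callable, Optional

Record = dict
PushMapper = Callable[[Record], Record]


def build_push_data_list(
    records: list[Record], push_mapper: Optional[PushMapper]
) -> list[Record]:
    """Return the payloads to push; empty when the config has no push_mapper."""
    if push_mapper is None:
        # Assumed old behaviour for channel="company": nothing to push,
        # so the remote push and the enrichment step are both skipped.
        return []
    return [push_mapper(record) for record in records]


def company_push_mapper(record: Record) -> Record:
    """Hypothetical mapper: keep only the fields the remote side needs."""
    return {"title": record["title"], "company": record["company"]}


records = [{"title": "Engineer", "company": "Acme", "raw": "..."}]
before_fix = build_push_data_list(records, None)
after_fix = build_push_data_list(records, company_push_mapper)
```

With no mapper the list is empty, which would explain why push_to_remote was never reached; with a mapper registered, each stored record yields one payload.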
2026-03-22 22:01:06 +08:00
win
3d202c3486 feat(05): data pipeline optimization (DATA-01, DATA-04)
Plan 01 - DATA-01: 30-day window dedup fix:
- dedup.py: both single-field and double-field SQL queries now include
  AND created_at > now() - INTERVAL 30 DAY
- tests/ingest/test_dedup.py: 6 mock tests validating 30-day window
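
A rough sketch of how the 30-day window might be attached to both query shapes. Only the window clause is quoted from this message; the helper `build_dedup_query`, the table name, the column names, and the `%s` placeholder style are assumptions for illustration, and the actual dedup.py may be structured differently.

```python
from typing import Sequence

# Window clause as quoted in the commit message (MySQL-style syntax).
WINDOW_CLAUSE = "AND created_at > now() - INTERVAL 30 DAY"


def build_dedup_query(table: str, fields: Sequence[str]) -> str:
    """Build a parameterized existence check limited to the last 30 days.

    `table` and `fields` are hypothetical identifiers supplied by the caller;
    the compared values are left as %s placeholders for the DB driver.
    """
    if len(fields) not in (1, 2):
        raise ValueError("expected one or two dedup fields")
    conditions = " AND ".join(f"{name} = %s" for name in fields)
    return f"SELECT 1 FROM {table} WHERE {conditions} {WINDOW_CLAUSE} LIMIT 1"


single = build_dedup_query("jobs", ["job_id"])
double = build_dedup_query("jobs", ["company_name", "job_title"])
```

Sharing one clause constant between the single-field and double-field queries keeps the two from drifting apart, which is the kind of property the mock tests can assert on.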

Plan 02 - DATA-04: company vs search job channel separation:
- schemas/ingest.py: ChannelType.COMPANY = 'company'
- configs/boss.py: register channel='company' config
- configs/qcwy.py: register channel='company' config
- configs/zhilian.py: register channel='company' config
- company_jobs_sync.py: store_batch(..., 'mini', ...) → (..., 'company', ...)

DATA-02: confirmed already complete (job.py has /data/batch-async endpoint)
DATA-03: confirmed already complete (company_cleaner.py full pipeline)

Full regression: 112 passed (106 existing + 6 new)
2026-03-21 19:50:06 +08:00