24 KiB

phase, plan, type, wave, depends_on, files_modified, autonomous, requirements, must_haves
phase plan type wave depends_on files_modified autonomous requirements must_haves
01-shared-core 02 execute 2
01-01
crawler_core/boss/sign.py
crawler_core/qcwy/sign.py
crawler_core/zhilian/sign.py
tests/crawler_core/__init__.py
tests/crawler_core/test_boss_sign.py
tests/crawler_core/test_qcwy_sign.py
tests/crawler_core/test_zhilian_sign.py
true
QUAL-01
truths artifacts key_links
Three sign algorithm files exist under crawler_core/{boss,qcwy,zhilian}/sign.py
`pytest tests/crawler_core/` exits 0 — all sign algorithm tests pass
Sign tests require NO network access, NO tokens, NO mocks — pure function assertions
BossSign.generate_traceid() returns a string matching the pattern M-W[0-9a-zA-Z]{22}
Job51Sign.build_sign_path() returns a tuple of (str, str) — (url_path, sign_hex)
ZhilianSign.sign_headers() returns a dict with key 'x-zp-device-id'
Old spiderJobs/platforms/*/sign.py files are NOT deleted or modified
path provides exports
crawler_core/boss/sign.py Boss traceid generation — pure functions, no I/O
BossSign
path provides exports
crawler_core/qcwy/sign.py 51Job HMAC-SHA256 signing — pure functions, no I/O
Job51Sign
SIGN_KEY
path provides exports
crawler_core/zhilian/sign.py Zhilian header/param signing — pure functions, no I/O
ZhilianSign
path provides
tests/crawler_core/test_boss_sign.py BossSign unit tests — 6+ test cases
path provides
tests/crawler_core/test_qcwy_sign.py Job51Sign unit tests — 5+ test cases
path provides
tests/crawler_core/test_zhilian_sign.py ZhilianSign unit tests — 5+ test cases
from to via
tests/crawler_core/test_boss_sign.py crawler_core/boss/sign.py from crawler_core.boss.sign import BossSign, _compute_checksum, _generate_uuid
from to via
crawler_core/boss/sign.py stdlib only import random, time — no external deps
Migrate all three platform sign algorithm files into crawler_core/ and write comprehensive unit tests for each. Sign algorithms are pure functions — they are the highest-value, lowest-cost tests in the codebase.

Purpose: QUAL-01 requires unit test coverage for core signing algorithms. Tests validate that the ported algorithms match the originals. Pure functions mean no HTTP mocking needed.

Output: Three sign.py files in crawler_core/{boss,qcwy,zhilian}/ and three test files in tests/crawler_core/ — all tests pass with pytest tests/crawler_core/.

<execution_context> @/.claude/get-shit-done/workflows/execute-plan.md @/.claude/get-shit-done/templates/summary.md </execution_context>

@.planning/phases/01-shared-core/01-01-SUMMARY.md @.planning/phases/01-shared-core/1-CONTEXT.md

From spiderJobs/platforms/boss/sign.py:

# Module-level constants:
_CHARS = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

# Module-level private functions:
def _to_u32(n: int) -> int: ...          # Masks to 32-bit unsigned
def _compute_checksum(uuid_str: str) -> str: ...  # 3-char checksum from 19-char uuid
def _generate_uuid() -> str: ...         # 13-char hex timestamp + 6-char base62 random

class BossSign:
    def __init__(self, *, mpt: str = "", wt2: str = ""): ...
    @staticmethod
    def generate_traceid(prefix: str = "M-W") -> str: ...
    # Result format: "{prefix}{19-char uuid}{3-char checksum}" e.g. "M-W0019d0a8af5f32gtVvnD4M"

From spiderJobs/platforms/job51/sign.py:

SIGN_KEY = "abfc8f9dcf8c3f3d8aa294ac5f2cf2cc7767e5592590f39c3f503271dd68562b"

class Job51Sign:
    def __init__(self, *, sign_key: str = SIGN_KEY): ...
    @staticmethod
    def generate_uuid() -> str: ...   # 13-char timestamp + 10-char random int
    def build_sign_path(
        self,
        endpoint: str,
        method: str = "GET",
        params: dict | None = None,
        body: dict | None = None,
    ) -> tuple[str, str]: ...   # Returns (url_path, sign_hex)
    # url_path format: "/{endpoint}?api_key=51job&timestamp={ts}[&param=val...]"
    # sign_hex: HMAC-SHA256 hex string, 64 chars

From spiderJobs/platforms/zhilian/sign.py:

class ZhilianSign:
    def __init__(self, *, at="", rt="", device_id=None, version="4.1.259",
                 channel="wxxiaochengxu", platform="12"): ...
    @staticmethod
    def generate_uuid() -> str: ...   # UUID4-format with uppercase hex
    def sign_headers(self, page_code: str = "0") -> dict: ...
    # Returns dict with keys: x-zp-at, x-zp-rt, x-zp-action-id, x-zp-page-code,
    #   x-zp-version, x-zp-channel, x-zp-platform, x-zp-device-id, x-zp-business-system
    def sign_params(self) -> dict: ...
    # Returns dict with keys: at, rt, channel, platform, version, d
Task 1: Port sign algorithms to crawler_core/ platform directories - /Users/win/2025/AICoding/JobData/spiderJobs/platforms/boss/sign.py (source — read every line before writing) - /Users/win/2025/AICoding/JobData/spiderJobs/platforms/job51/sign.py (source — read every line before writing) - /Users/win/2025/AICoding/JobData/spiderJobs/platforms/zhilian/sign.py (source — read every line before writing) - /Users/win/2025/AICoding/JobData/crawler_core/boss/__init__.py (confirm it exists from Plan 01) crawler_core/boss/sign.py crawler_core/qcwy/sign.py crawler_core/zhilian/sign.py Copy the three sign algorithm files to their new locations under crawler_core/, making only one change per file: update the module docstring to reference crawler_core.

crawler_core/boss/sign.py: Copy spiderJobs/platforms/boss/sign.py verbatim. Update the module docstring first line to:

Boss直聘 Traceid 生成算法 (crawler_core)

No other changes. ALL private functions (_to_u32, _compute_checksum, _generate_uuid) and the BossSign class stay exactly as-is.

crawler_core/qcwy/sign.py: Copy spiderJobs/platforms/job51/sign.py verbatim. Update the module docstring first line to:

前程无忧 (51Job) 签名算法 (crawler_core)

No other changes. SIGN_KEY constant and Job51Sign class stay exactly as-is. The import json inside build_sign_path stays inside the method (do not hoist it).

crawler_core/zhilian/sign.py: Copy spiderJobs/platforms/zhilian/sign.py verbatim. Update the module docstring first line to:

智联招聘签名算法 (crawler_core)

No other changes. ZhilianSign class stays exactly as-is.

Verification after writing each file:

  • No file imports from spiderJobs.*
  • No file imports from app.*
  • No file uses requests or any HTTP library
  • Each file is entirely self-contained with only stdlib imports (random, time, math, hmac, hashlib, urllib.parse)

DO NOT delete or modify:

  • spiderJobs/platforms/boss/sign.py
  • spiderJobs/platforms/job51/sign.py
  • spiderJobs/platforms/zhilian/sign.py

These old files remain in place until Phase 4 cleanup. cd /Users/win/2025/AICoding/JobData && python -c " import sys sys.path.insert(0, '.') import re

from crawler_core.boss.sign import BossSign, _compute_checksum, _generate_uuid from crawler_core.qcwy.sign import Job51Sign, SIGN_KEY from crawler_core.zhilian.sign import ZhilianSign

Boss

tid = BossSign.generate_traceid() assert re.match(r'^M-W[0-9a-f]{13}[0-9a-zA-Z]{6}[0-9a-zA-Z]{3}$', tid), f'Traceid format wrong: {tid}' print(f'BossSign OK: {tid}')

Job51

signer = Job51Sign() path, sign = signer.build_sign_path('open/test', 'GET', params={'key': 'val'}) assert path.startswith('/open/test?api_key=51job&timestamp='), f'Path format wrong: {path}' assert len(sign) == 64, f'Sign length wrong: {len(sign)}' print(f'Job51Sign OK: sign={sign[:8]}...')

Zhilian

zs = ZhilianSign(at='token123', rt='refresh456') headers = zs.sign_headers() assert 'x-zp-device-id' in headers, 'device-id missing' assert headers['x-zp-business-system'] == '73', 'business-system wrong' assert len(headers) == 9, f'Expected 9 header keys, got {len(headers)}' params = zs.sign_params() assert params['at'] == 'token123', f'at field wrong: {params["at"]}' assert len(params) == 6, f'Expected 6 param keys, got {len(params)}' print('ZhilianSign OK') print('All sign algorithms imported and validated') " <acceptance_criteria> - crawler_core/boss/sign.py exists: ls crawler_core/boss/sign.py exits 0 - crawler_core/qcwy/sign.py exists: ls crawler_core/qcwy/sign.py exits 0 - crawler_core/zhilian/sign.py exists: ls crawler_core/zhilian/sign.py exits 0 - grep -r "from spiderJobs" crawler_core/ returns empty (no cross-imports) - grep -r "import requests" crawler_core/boss/sign.py crawler_core/qcwy/sign.py crawler_core/zhilian/sign.py returns empty (no HTTP imports in sign files) - BossSign.generate_traceid() returns a string of length 25 (3-char prefix + 19-char uuid + 3-char checksum) - Job51Sign().build_sign_path("test", "GET") returns tuple where [1] is 64-char string - ZhilianSign().sign_headers() returns dict with x-zp-business-system == "73" - Old files UNCHANGED: diff spiderJobs/platforms/boss/sign.py crawler_core/boss/sign.py shows only docstring difference </acceptance_criteria> Three sign.py files in crawler_core/ — pure functions, no HTTP, no cross-imports from app or spiderJobs.

Task 2: Write sign algorithm unit tests - /Users/win/2025/AICoding/JobData/crawler_core/boss/sign.py (just created — read to understand exact exports) - /Users/win/2025/AICoding/JobData/crawler_core/qcwy/sign.py (just created) - /Users/win/2025/AICoding/JobData/crawler_core/zhilian/sign.py (just created) - /Users/win/2025/AICoding/JobData/.planning/research/STACK.md (testing section: "Sign algorithm tests — Pure functions, no HTTP, no mocks needed") - /Users/win/2025/AICoding/JobData/tests/ (check what already exists to avoid conflicts) tests/crawler_core/__init__.py tests/crawler_core/test_boss_sign.py tests/crawler_core/test_qcwy_sign.py tests/crawler_core/test_zhilian_sign.py Create `tests/crawler_core/__init__.py` (empty file) and three test files.

tests/crawler_core/init.py: Empty file (just creates the package).

Important: Add sys.path setup to each test file since crawler_core is not yet pip-installed:

import sys
import os
sys.path.insert(0, os.path.join(os.path.dirname(__file__), '..', '..'))

Place this at the TOP of each test file before any crawler_core imports.


tests/crawler_core/test_boss_sign.py:

"""Unit tests for crawler_core.boss.sign — BossSign and helper functions.

All tests are pure function assertions: no HTTP, no network, no mocks.
"""
import sys
import os
sys.path.insert(0, os.path.join(os.path.dirname(__file__), '..', '..'))

import re
import pytest
from crawler_core.boss.sign import BossSign, _compute_checksum, _generate_uuid, _CHARS


class TestBossSignGenerateTraceid:
    def test_traceid_format(self):
        tid = BossSign.generate_traceid()
        assert re.match(r'^M-W[0-9a-f]{13}[0-9a-zA-Z]{6}[0-9a-zA-Z]{3}$', tid), \
            f"Traceid format wrong: {tid}"

    def test_traceid_length(self):
        tid = BossSign.generate_traceid()
        assert len(tid) == 25, f"Expected 25 chars, got {len(tid)}: {tid}"

    def test_traceid_custom_prefix(self):
        tid = BossSign.generate_traceid(prefix="X-Y")
        assert tid.startswith("X-Y"), f"Expected X-Y prefix, got: {tid}"

    def test_traceid_uniqueness(self):
        t1 = BossSign.generate_traceid()
        t2 = BossSign.generate_traceid()
        assert t1 != t2, "Two calls should return different traceids"

    def test_bosssign_init_defaults(self):
        sign = BossSign()
        assert sign.mpt == ""
        assert sign.wt2 == ""

    def test_bosssign_init_with_tokens(self):
        sign = BossSign(mpt="mpt_token", wt2="wt2_token")
        assert sign.mpt == "mpt_token"
        assert sign.wt2 == "wt2_token"


class TestComputeChecksum:
    def test_checksum_length(self):
        checksum = _compute_checksum("1234567890abc456789")  # 19 chars
        assert len(checksum) == 3, f"Expected 3 chars, got {len(checksum)}"

    def test_checksum_chars_in_base62(self):
        checksum = _compute_checksum("1234567890abc456789")
        for ch in checksum:
            assert ch in _CHARS, f"Char {ch!r} not in base62 set"

    def test_checksum_deterministic(self):
        uuid_str = "1234567890abc456789"
        c1 = _compute_checksum(uuid_str)
        c2 = _compute_checksum(uuid_str)
        assert c1 == c2, "Same input must produce same checksum"

    def test_checksum_differs_for_different_input(self):
        # Different inputs should (almost always) produce different checksums
        c1 = _compute_checksum("1234567890abc456789")
        c2 = _compute_checksum("9876543210xyz456789")
        # Not guaranteed to differ but extremely likely
        # We test at least that they are valid 3-char strings
        assert len(c1) == 3 and len(c2) == 3


class TestGenerateUuid:
    def test_generate_uuid_length(self):
        uuid = _generate_uuid()
        assert len(uuid) == 19, f"Expected 19 chars, got {len(uuid)}: {uuid}"

    def test_generate_uuid_hex_prefix(self):
        uuid = _generate_uuid()
        hex_part = uuid[:13]
        assert re.match(r'^[0-9a-f]{13}$', hex_part), \
            f"First 13 chars should be hex: {hex_part}"

    def test_generate_uuid_base62_suffix(self):
        uuid = _generate_uuid()
        rand_part = uuid[13:]
        for ch in rand_part:
            assert ch in _CHARS, f"Char {ch!r} in random suffix not in base62"

tests/crawler_core/test_qcwy_sign.py:

"""Unit tests for crawler_core.qcwy.sign — Job51Sign.

All tests are pure function assertions: no HTTP, no network, no mocks.
"""
import sys
import os
sys.path.insert(0, os.path.join(os.path.dirname(__file__), '..', '..'))

import re
import pytest
from crawler_core.qcwy.sign import Job51Sign, SIGN_KEY


class TestJob51SignInit:
    def test_default_sign_key(self):
        signer = Job51Sign()
        assert signer.sign_key == SIGN_KEY
        assert len(SIGN_KEY) == 64  # 64-char hex key

    def test_custom_sign_key(self):
        custom_key = "a" * 64
        signer = Job51Sign(sign_key=custom_key)
        assert signer.sign_key == custom_key


class TestJob51SignBuildSignPath:
    def setup_method(self):
        self.signer = Job51Sign()

    def test_returns_tuple_of_two_strings(self):
        result = self.signer.build_sign_path("open/test")
        assert isinstance(result, tuple)
        assert len(result) == 2
        assert all(isinstance(s, str) for s in result)

    def test_get_path_format(self):
        path, sign = self.signer.build_sign_path("open/test", "GET")
        assert path.startswith("/open/test?api_key=51job&timestamp="), \
            f"Path format wrong: {path}"

    def test_sign_hex_length(self):
        _, sign = self.signer.build_sign_path("open/test")
        assert len(sign) == 64, f"Sign should be 64-char hex, got {len(sign)}: {sign}"

    def test_sign_hex_format(self):
        _, sign = self.signer.build_sign_path("open/test")
        assert re.match(r'^[0-9a-f]{64}$', sign), f"Sign not hex: {sign}"

    def test_get_vs_post_different_sign(self):
        _, get_sign = self.signer.build_sign_path("open/test", "GET")
        _, post_sign = self.signer.build_sign_path("open/test", "POST", body={"k": "v"})
        assert get_sign != post_sign, "GET and POST should produce different signatures"

    def test_get_with_params_includes_params_in_path(self):
        path, _ = self.signer.build_sign_path("open/test", "GET", params={"city": "shanghai"})
        assert "city" in path and "shanghai" in path, \
            f"Params should appear in path: {path}"

    def test_sign_key_in_path(self):
        path, _ = self.signer.build_sign_path("open/jobs")
        assert "api_key=51job" in path, f"api_key=51job missing from path: {path}"


class TestJob51SignGenerateUuid:
    def test_generate_uuid_is_string(self):
        uuid = Job51Sign.generate_uuid()
        assert isinstance(uuid, str)

    def test_generate_uuid_length(self):
        uuid = Job51Sign.generate_uuid()
        # 13-char ms timestamp + 10-char random int = 23 chars
        assert len(uuid) == 23, f"Expected 23 chars, got {len(uuid)}: {uuid}"

    def test_generate_uuid_numeric(self):
        uuid = Job51Sign.generate_uuid()
        assert uuid.isdigit(), f"UUID should be all digits: {uuid}"

tests/crawler_core/test_zhilian_sign.py:

"""Unit tests for crawler_core.zhilian.sign — ZhilianSign.

All tests are pure function assertions: no HTTP, no network, no mocks.
"""
import sys
import os
sys.path.insert(0, os.path.join(os.path.dirname(__file__), '..', '..'))

import re
import pytest
from crawler_core.zhilian.sign import ZhilianSign

EXPECTED_HEADER_KEYS = {
    "x-zp-at", "x-zp-rt", "x-zp-action-id", "x-zp-page-code",
    "x-zp-version", "x-zp-channel", "x-zp-platform", "x-zp-device-id",
    "x-zp-business-system",
}

EXPECTED_PARAM_KEYS = {"at", "rt", "channel", "platform", "version", "d"}


class TestZhilianSignInit:
    def test_defaults(self):
        sign = ZhilianSign()
        assert sign.at == ""
        assert sign.rt == ""
        assert sign.version == "4.1.259"
        assert sign.channel == "wxxiaochengxu"
        assert sign.platform == "12"
        assert sign.device_id  # auto-generated, not empty

    def test_custom_tokens(self):
        sign = ZhilianSign(at="at_token", rt="rt_token")
        assert sign.at == "at_token"
        assert sign.rt == "rt_token"

    def test_custom_device_id(self):
        sign = ZhilianSign(device_id="CUSTOM-DEVICE-ID")
        assert sign.device_id == "CUSTOM-DEVICE-ID"

    def test_auto_device_id_is_uuid4_format(self):
        sign = ZhilianSign()
        uuid_pattern = r'^[0-9A-F]{8}-[0-9A-F]{4}-4[0-9A-F]{3}-[89AB][0-9A-F]{3}-[0-9A-F]{12}$'
        assert re.match(uuid_pattern, sign.device_id), \
            f"device_id not UUID4 format: {sign.device_id}"


class TestZhilianSignHeaders:
    def setup_method(self):
        self.sign = ZhilianSign(at="at123", rt="rt456")

    def test_keys_exactly_nine(self):
        headers = self.sign.sign_headers()
        assert set(headers.keys()) == EXPECTED_HEADER_KEYS, \
            f"Header keys wrong: {set(headers.keys())}"

    def test_business_system_is_73(self):
        headers = self.sign.sign_headers()
        assert headers["x-zp-business-system"] == "73"

    def test_tokens_reflected(self):
        headers = self.sign.sign_headers()
        assert headers["x-zp-at"] == "at123"
        assert headers["x-zp-rt"] == "rt456"

    def test_action_id_is_uuid4_format(self):
        headers = self.sign.sign_headers()
        action_id = headers["x-zp-action-id"]
        uuid_pattern = r'^[0-9A-F]{8}-[0-9A-F]{4}-4[0-9A-F]{3}-[89AB][0-9A-F]{3}-[0-9A-F]{12}$'
        assert re.match(uuid_pattern, action_id), \
            f"action_id not UUID4 format: {action_id}"

    def test_action_id_unique_per_call(self):
        h1 = self.sign.sign_headers()
        h2 = self.sign.sign_headers()
        assert h1["x-zp-action-id"] != h2["x-zp-action-id"], \
            "action_id must be freshly generated on each call"

    def test_device_id_in_headers(self):
        headers = self.sign.sign_headers()
        assert headers["x-zp-device-id"] == self.sign.device_id


class TestZhilianSignParams:
    def setup_method(self):
        self.sign = ZhilianSign(at="at789", rt="rt012", device_id="DEV-ID")

    def test_keys_exactly_six(self):
        params = self.sign.sign_params()
        assert set(params.keys()) == EXPECTED_PARAM_KEYS, \
            f"Param keys wrong: {set(params.keys())}"

    def test_device_id_as_d(self):
        params = self.sign.sign_params()
        assert params["d"] == "DEV-ID"

    def test_tokens_reflected(self):
        params = self.sign.sign_params()
        assert params["at"] == "at789"
        assert params["rt"] == "rt012"


class TestZhilianGenerateUuid:
    def test_uuid4_format(self):
        uuid = ZhilianSign.generate_uuid()
        uuid_pattern = r'^[0-9A-F]{8}-[0-9A-F]{4}-4[0-9A-F]{3}-[89AB][0-9A-F]{3}-[0-9A-F]{12}$'
        assert re.match(uuid_pattern, uuid), \
            f"UUID not UUID4 format: {uuid}"

    def test_uuid_length(self):
        uuid = ZhilianSign.generate_uuid()
        assert len(uuid) == 36, f"Expected 36 chars, got {len(uuid)}"

    def test_uuid_version_4(self):
        uuid = ZhilianSign.generate_uuid()
        assert uuid[14] == "4", f"Version digit should be 4, got: {uuid[14]}"

After creating all files, run the tests to verify they pass. If any test fails due to a mismatch between the test expectation and the actual sign algorithm behavior, fix the TEST (not the sign algorithm) to match the actual behavior — the sign algorithms are the ground truth. cd /Users/win/2025/AICoding/JobData && python -m pytest tests/crawler_core/ -v --tb=short 2>&1 | tail -30 <acceptance_criteria> - tests/crawler_core/__init__.py exists (even if empty) - tests/crawler_core/test_boss_sign.py exists with at least 8 test functions - tests/crawler_core/test_qcwy_sign.py exists with at least 7 test functions - tests/crawler_core/test_zhilian_sign.py exists with at least 9 test functions - python -m pytest tests/crawler_core/ -v exits 0 — ALL tests pass - python -m pytest tests/crawler_core/ -v output contains "passed" and zero "failed" or "error" - No test in any file makes HTTP requests or reads from files (pure function tests only) - grep -r "requests\|httpx\|mock\|patch" tests/crawler_core/ returns empty (no mocking needed) - Test count: python -m pytest tests/crawler_core/ --collect-only -q reports at least 24 tests collected </acceptance_criteria> Three test files cover all sign algorithm edge cases. pytest tests/crawler_core/ exits 0. No network access required.

Run the complete verification suite to confirm Phase 1 is done:
cd /Users/win/2025/AICoding/JobData

# 1. All sign algorithms importable from crawler_core
python -c "
import sys
sys.path.insert(0, '.')
from crawler_core.boss.sign import BossSign
from crawler_core.qcwy.sign import Job51Sign
from crawler_core.zhilian.sign import ZhilianSign
print('All sign imports OK')
"

# 2. All tests pass
python -m pytest tests/crawler_core/ -v

# 3. No cross-contamination
grep -r "from spiderJobs" crawler_core/ && echo "FAIL" || echo "OK: no spiderJobs imports"
grep -r "import requests" crawler_core/boss/sign.py crawler_core/qcwy/sign.py crawler_core/zhilian/sign.py && echo "FAIL" || echo "OK: sign files have no HTTP imports"

# 4. Old files untouched
git diff --name-only spiderJobs/platforms/ && echo "FAIL: spiderJobs modified" || echo "OK: spiderJobs untouched"

<success_criteria>

  1. python -m pytest tests/crawler_core/ -v exits 0 with at least 24 tests collected and passing
  2. from crawler_core.boss.sign import BossSign succeeds (with repo root on sys.path)
  3. from crawler_core.qcwy.sign import Job51Sign succeeds
  4. from crawler_core.zhilian.sign import ZhilianSign succeeds
  5. grep -r "from spiderJobs" crawler_core/ returns empty
  6. grep -r "import requests" crawler_core/boss/sign.py crawler_core/qcwy/sign.py crawler_core/zhilian/sign.py returns empty
  7. git diff --name-only spiderJobs/ returns empty — old files untouched
  8. All three sign.py files in crawler_core/ are under 120 lines each (no bloat) </success_criteria>
After completion, create `.planning/phases/01-shared-core/01-02-SUMMARY.md` with: - List of files created (sign.py files + test files with line counts) - Test counts per file and total - Any behavioral differences discovered between spiderJobs/ originals and crawler_core/ copies - Confirmation that old spiderJobs/ files were not modified