Take over backend social scraper and Modal guardrail updates #146
- Decodo/auth public-only HARD GUARD (assert_public_comments_isolation + job_runner chokepoint)
- bug#1 foundation: counts.expected_reply_count + public child-fetch decision wired
- bug#4: unknown-count + zero-recovered no longer silently marked complete
- 23 new/relevant unit tests green (isolation/counts/completeness/proxy)

NOTE: bundles pre-existing in-progress public-comments WIP already present in the working tree (not authored this session); committed on a feature branch as a recoverable baseline before subagent-driven implementation of the full plan.
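The bug#4 item can be read as a small decision rule. The sketch below is illustrative only: `is_scrape_complete` and its parameters are hypothetical names, not the functions in this PR, and the handling of "unknown count with some comments recovered" is an assumption rather than something the commit message states.

```python
from typing import Optional


def is_scrape_complete(expected_reply_count: Optional[int], recovered: int) -> bool:
    """Hypothetical completeness check; not the PR's actual guard.

    expected_reply_count is None when the expected count is unknown.
    """
    if expected_reply_count is None:
        # Unknown expected count: recovering zero comments proves nothing,
        # so the post must stay retryable instead of being marked complete.
        # Assumption: recovering at least one comment is treated as complete.
        return recovered > 0
    # Known expected count: complete once everything expected was recovered.
    return recovered >= expected_reply_count
```

Under this rule a post with an unknown count and nothing recovered is never reported as complete, which is the silent failure the commit describes.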
…ils' into codex/publish/trr-backend/20260616-takeover-all
…dex/publish/trr-backend/20260616-takeover-all
WS1a (fetcher.py, counts.py):
- T3 zero-reply-probe limit (env SOCIAL_INSTAGRAM_PUBLIC_COMMENTS_ZERO_REPLY_PROBE_LIMIT, default 0 = skip reply-less parents); biggest public-lane per-post saving
- T4 public fast-fail timeouts (post 20s / child 10s, independent of auth deadlines)
- C1 bug#1: authenticated reply gates + fetch target + terminal clamp use expected_reply_count
- C2 bug#5/#9: 429 sleeps max(backoff, cooldown); cooldown record wrapped in except OSError
- C3 bug#10b: memory guardrail <= -> < (inclusive cap)

WS3 (persistence.py, social_season_analytics_impl.py):
- bug#3: _build_upsert_update_clause + coalesce_preserve_cols; author/url cols COALESCE-preserved (no NULL clobber on metadata-poor re-scrape)
- bug#10c: no-season reply parent-link seeded from DB + depth-ordered upsert

Tests: reconciled 3 status-only tests to new T3/C3 contract + added default-skip test; new env-knob + upsert-clause unit tests. Full comments_scrapling suite 89 passed; the 5 job_runner failures are pre-existing DB-unavailable env failures (verified independent of these changes by revert-retest).
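For readers unfamiliar with the COALESCE-preserve idea in bug#3, here is a minimal sketch of how such an update clause could be built for a PostgreSQL `ON CONFLICT ... DO UPDATE SET`. The function name, signature, and table/column names are hypothetical and are not taken from `_build_upsert_update_clause` in this PR; identifiers are assumed to be trusted, since they are interpolated directly.

```python
from typing import Iterable, Sequence


def build_update_clause_sketch(
    table: str,
    columns: Sequence[str],
    coalesce_preserve_cols: Iterable[str] = (),
) -> str:
    """Build the SET list for an upsert (hypothetical helper).

    Columns listed in coalesce_preserve_cols keep their stored value when
    the incoming row carries NULL, so a metadata-poor re-scrape cannot
    overwrite good data with NULL.
    """
    preserve = set(coalesce_preserve_cols)
    parts = []
    for col in columns:
        if col in preserve:
            # EXCLUDED is the row proposed for insertion; table.col is the
            # value already stored in the conflicting row.
            parts.append(f"{col} = COALESCE(EXCLUDED.{col}, {table}.{col})")
        else:
            parts.append(f"{col} = EXCLUDED.{col}")
    return ", ".join(parts)
```

The result is meant to follow `ON CONFLICT (...) DO UPDATE SET` in the insert statement; non-preserved columns are still overwritten unconditionally.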
Important: Review skipped. Too many files! This PR contains 168 files, which is 18 over the limit of 150. To get a review, narrow the scope.
Run configuration: Configuration used: Path: .coderabbit.yaml; Review profile: CHILL; Plan: Pro Plus
Files selected for processing (168)
Codex Exhaustive Code Review

Findings

Validation

I did not modify files. I ran the requested diff boundary commands and targeted inspection.
…etadata

Completes WS2's bug#2 wiring: fetch_comments_for_shortcode now accepts the degraded-expected-counts signal threaded from the job runner and stamps result.diagnostic_metadata['expected_count_unknown']=True, which the completeness guard reads to keep the post retryable. A thin wrapper delegates to _fetch_comments_for_shortcode_impl so all return paths are covered.

Suite: 100 passed, 5 pre-existing DB-unavailable failures (unchanged).
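The "thin wrapper so all return paths are covered" pattern can be illustrated as follows. This is a sketch under assumptions: `FetchResult`, `_fetch_impl`, and `fetch_comments` are hypothetical stand-ins, and only the `diagnostic_metadata['expected_count_unknown']` key comes from the commit message.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class FetchResult:
    """Hypothetical result type standing in for the real one."""
    comments: List[str] = field(default_factory=list)
    diagnostic_metadata: Dict[str, Any] = field(default_factory=dict)


def _fetch_impl(shortcode: str) -> FetchResult:
    # Stand-in implementation with more than one return path.
    if not shortcode:
        return FetchResult()  # early-exit path
    return FetchResult(comments=[f"comment-for-{shortcode}"])


def fetch_comments(shortcode: str, expected_count_unknown: bool = False) -> FetchResult:
    """Wrapper: stamp the flag once, after the implementation returns."""
    result = _fetch_impl(shortcode)
    if expected_count_unknown:
        # Stamping here covers every return path inside _fetch_impl,
        # so no individual return statement can forget the flag.
        result.diagnostic_metadata["expected_count_unknown"] = True
    return result
```

Stamping in the wrapper rather than at each `return` keeps the implementation unchanged and makes the flag hard to miss when a new early-exit path is added later.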
- _normalize_comment_date_window (ISO 8601, inclusive start / exclusive end, UTC) + _comment_date_window_predicate helpers
- thread date_start/date_end through preview, target/incomplete shortcodes, start, resume; store date_start/date_end/target_window in run + shard job config
- inject 'p.posted_at >= %s AND p.posted_at < %s' into owner + collaborator/catalog target SQL (unchanged when no window) so enumeration only scans the requested window
- request schema: date_start/date_end/comments_worker_count/comments_target_batch_size
- 11 new date-window unit tests; 43 socials route tests pass
Resolved the Codex exhaustive review blockers in
## Summary
- Captures backend SocialBlade, Chrome profile preflight, scraper, Modal guardrail, and social ingestion updates from the takeover branch set.
- Includes fixes for unit-test DB isolation, explicit comment-anchor launch auth probing, Modal maintenance owner env isolation, and Bravo core account expectations.

## Local validation
- PASS: previously failing backend subset, 23 tests passed.
- PARTIAL: targeted backend suite reached 480 passed before interrupting an unbounded live Postgres wait; no failures had appeared.
- PASS: Modal follow-through was already completed for the SocialBlade/profile slice before PR orchestration, including API canary and readiness probe.

## Notes
- SQL ownership changed through included migrations; ledger/inventory should be reviewed with the backend diff.
- The broad local backend command was interrupted only because it was blocked inside a live Postgres cursor execution.