feat(grantrecs): grant-history join + daily opportunity projection (#1218) #1237

Open: paulalbert1 wants to merge 1 commit into master from feat/funding-matcher-1218



Funding matcher: grant-history join + daily opportunity projection (#1218)

Two related follow-ups from docs/funding-matcher-redesign-handoff.md, bundled under the #1218 umbrella. Review only — do not merge.

Step 3 — grant-history join (app)

Extends the reverse matcher ("rank researchers for this opportunity") with each scholar's grant/degree history:

  • deriveGrantSignals() (pure, unit-tested) → esiEligible / yearsSinceDegree / fundingStatus, attached post-ranking.
    • fundingStatus uses the canonical isFundingActive (12-month NCE grace), so it agrees with the profile's "Active funding" badge — not a bespoke endDate >= now.
    • esiEligible dates from the terminal research/clinical degree (PhD/MD/…), not the most recent credential, so a later MPH/cert doesn't falsely reset the ESI clock.
  • ESI clause in researcherBlurb; funding-status filter dropdown; funding + "Also in their Grants for me" row badges; two new CSV columns.
  • Cross-ref (opportunitiesInTopMatches): a cheap MySQL-only topic-affinity top-N over the open-opportunity corpus — gated to the same status/deadline the forward matcher uses, so a closed/past-due opp (reachable via the unfiltered browse list) can never over-claim "in their Grants for me." Validated against the real forward matcher in scripts/funding-crossref-compare.ts: 100% top-10 agreement, past-due guard clean.
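
A minimal sketch of the signal derivation described above, for reviewers who want the rules in one place. It is not the PR's code: the `Degree`/`Grant` shapes, the `*Sketch` function names, the two-entry terminal-degree list, the 10-year ESI window, the "latest terminal degree wins" tie-break, and the `"active" | "inactive"` values are all assumptions made for illustration. Only the 12-month NCE grace and the terminal-degree rule come from the description. Real ESI eligibility also depends on prior award history, which this sketch ignores.

```typescript
// Hypothetical shapes: the PR's actual types and signature are not shown.
interface Degree { kind: string; year: number }
interface Grant { endDate: Date }
interface GrantSignals {
  esiEligible: boolean;
  yearsSinceDegree: number | null;
  fundingStatus: "active" | "inactive";
}

const TERMINAL_DEGREES = ["PhD", "MD"]; // assumption: the real list is longer
const ESI_WINDOW_YEARS = 10;            // assumption: NIH-style 10-year window
const NCE_GRACE_MONTHS = 12;            // from the PR description

// A grant counts as active until 12 months past its end date (NCE grace).
function isFundingActiveSketch(grants: Grant[], now: Date): boolean {
  return grants.some((g) => {
    const graceEnd = new Date(g.endDate.getTime());
    graceEnd.setUTCMonth(graceEnd.getUTCMonth() + NCE_GRACE_MONTHS);
    return graceEnd.getTime() >= now.getTime();
  });
}

function deriveGrantSignalsSketch(
  degrees: Degree[],
  grants: Grant[],
  now: Date
): GrantSignals {
  // Only terminal degrees start the ESI clock; a later MPH or cert is ignored.
  const terminalYears = degrees
    .filter((d) => TERMINAL_DEGREES.indexOf(d.kind) !== -1)
    .map((d) => d.year);
  // Assumption: with several terminal degrees, the latest one wins.
  const yearsSinceDegree =
    terminalYears.length > 0
      ? now.getUTCFullYear() - terminalYears.reduce((a, b) => Math.max(a, b))
      : null;
  return {
    esiEligible: yearsSinceDegree !== null && yearsSinceDegree <= ESI_WINDOW_YEARS,
    yearsSinceDegree,
    fundingStatus: isFundingActiveSketch(grants, now) ? "active" : "inactive",
  };
}
```

With a 2018 PhD followed by a 2023 MPH and "now" in mid-2025, the sketch reports 7 years since degree rather than 2, which is the behavior the terminal-degree bullet is guarding.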

Step 2 — daily opportunity projection (cdk)

etl:dynamodb (DynamoDB→MySQL opportunity projection) is a nightly step but sits after etl:ed, which #443 blocks — so the nightly aborts before reaching it and newly-published opportunities 404 until re-projected by hand.

  • Adds a standalone daily etl:dynamodb schedule (06:30 UTC) in Sps-Etl, mirroring the curation-backup pattern (task → retry → catch → state machine → rule → cadence alarm).
  • Creation-gated on a new opportunityProjectionScheduleEnabled flag: staging true, prod false. Prod creates no resources until the corpus is published / #443 ("Staging Session B: populate data + build search index") lands and the nightly reaches the step again, at which point the stopgap retires.
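
To make "creation-gated" concrete, here is a dependency-free model of the gate. It deliberately avoids the CDK API: `EtlStackConfigSketch`, `planProjectionSchedule`, and the piece names are hypothetical, and the real stack presumably instantiates constructs rather than returning a plan. The cron string is the standard EventBridge expression for 06:30 UTC daily; whether the PR writes it in this form is an assumption.

```typescript
// Hypothetical config shape; only the flag name comes from the PR.
interface EtlStackConfigSketch {
  envName: string;
  opportunityProjectionScheduleEnabled: boolean;
}

interface ProjectionPlan {
  cron: string;
  pieces: string[];
}

// EventBridge cron fields: minutes hours day-of-month month day-of-week year.
const DAILY_PROJECTION_CRON = "cron(30 6 * * ? *)";

// The chain listed in the PR description, in creation order.
const PROJECTION_PIECES = [
  "task",
  "retry",
  "catch",
  "stateMachine",
  "rule",
  "cadenceAlarm",
];

// Returns null when the flag is off, so a disabled environment creates nothing
// at all, rather than creating a schedule that is merely switched off.
function planProjectionSchedule(config: EtlStackConfigSketch): ProjectionPlan | null {
  if (!config.opportunityProjectionScheduleEnabled) {
    return null;
  }
  return { cron: DAILY_PROJECTION_CRON, pieces: PROJECTION_PIECES.slice() };
}

const stagingPlan = planProjectionSchedule({
  envName: "staging",
  opportunityProjectionScheduleEnabled: true,
});
const prodPlan = planProjectionSchedule({
  envName: "prod",
  opportunityProjectionScheduleEnabled: false,
});
```

Gating creation instead of enablement is what lets a prod synth assert zero projection resources.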

Review & verification

  • Step 3 went through a multi-agent adversarial review; the 3 confirmed issues (funding-active divergence, ESI terminal-degree, cross-ref over-claim) are fixed in this PR and re-verified.
  • App: affected unit suites green (grant-signal/ESI/funding/CSV coverage added); tsc + eslint clean.
  • cdk: etl-stack 130 tests green; staging snapshot updated; prod synth creates zero projection resources (creation-gated).
  • Cross-ref comparison + field population verified end-to-end against the local DB + OpenSearch.
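
For anyone re-running the comparison, one plausible reading of "top-10 agreement" is set overlap between the two matchers' top-k lists. The sketch below assumes that definition. scripts/funding-crossref-compare.ts may compute it differently (for example, order-sensitive), and `topKAgreement` is a hypothetical name.

```typescript
// Fraction of ids shared by the first k entries of two ranked lists.
// Order within the top k is ignored; only set membership counts.
function topKAgreement(cheap: string[], reference: string[], k: number): number {
  const cheapTop = cheap.slice(0, k);
  const refTop = reference.slice(0, k);
  const denom = Math.max(cheapTop.length, refTop.length);
  if (denom === 0) {
    return 1; // two empty lists agree trivially
  }
  const hits = cheapTop.filter((id) => refTop.indexOf(id) !== -1).length;
  return hits / denom;
}

// Same three ids in a different order: full agreement under this definition.
const full = topKAgreement(["a", "b", "c"], ["c", "a", "b"], 3);
// One id differs: two of three shared.
const partial = topKAgreement(["a", "b", "x"], ["a", "b", "c"], 3);
```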

Deploy (operator, gated — not in this PR)

  • Step 2: cdk deploy Sps-Etl-staging to activate the daily schedule.
  • Step 3: app image roll carries the code.
  • Staging visual verify (handoff Step 1) and prod rollout (Step 4) remain follow-ups.

Accepted residuals

  • Cross-ref skips per-scholar stage/US eligibility flags (status/deadline gate removes the dominant false-positive source) — documented in code.
  • Comparison numbers are from the local 12-opp corpus; re-run the harness on staging's 653 to confirm at scale.
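
The status/deadline gate referenced in the first residual can be sketched as a filter applied before the top-N cut. The row shape, the `"open"` status literal, the treatment of a null deadline as rolling (still open), and the function names are assumptions for illustration; the actual query runs in MySQL and is not shown here.

```typescript
// Hypothetical row shape for an opportunity with a precomputed affinity score.
interface OpportunityRowSketch {
  id: string;
  status: string;
  deadline: Date | null;
  affinity: number;
}

// Gate: only open, not-yet-due opportunities may appear in the cross-ref.
// Assumption: a null deadline means a rolling deadline and passes the gate.
function isOpenForCrossRef(opp: OpportunityRowSketch, now: Date): boolean {
  if (opp.status !== "open") {
    return false;
  }
  return opp.deadline === null || opp.deadline.getTime() >= now.getTime();
}

// Apply the gate first, then rank by affinity and keep the top n ids.
function topOpenOpportunities(
  rows: OpportunityRowSketch[],
  now: Date,
  n: number
): string[] {
  return rows
    .filter((r) => isOpenForCrossRef(r, now))
    .sort((a, b) => b.affinity - a.affinity)
    .slice(0, n)
    .map((r) => r.id);
}
```

Filtering before ranking matters: a closed opportunity with the highest affinity never occupies a top-N slot, so it cannot earn the "Also in their Grants for me" badge.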

…1218)

Step 3 (app) — extend the reverse funding matcher with grant history: deriveGrantSignals() attaches esiEligible / yearsSinceDegree / fundingStatus per researcher (funding via the canonical isFundingActive NCE-grace rule; ESI dated from the terminal research/clinical degree, not the latest credential). Adds the ESI blurb clause, a funding-status filter, funding + 'Also in their Grants for me' row badges, and two CSV columns. The cross-ref is a cheap MySQL-only topic-affinity top-N over the open-opportunity corpus (same status/deadline gate the forward matcher uses, so closed/past-due opps can't over-claim); validated against the real forward matcher in scripts/funding-crossref-compare.ts (100% top-10 agreement, past-due guard clean).

Step 2 (cdk) — standalone daily etl:dynamodb schedule (06:30 UTC) so newly-published opportunities stop 404-ing while the nightly is blocked at etl:ed (#443). Creation-gated on a new opportunityProjectionScheduleEnabled flag (staging on, prod off), modeled on the curationBackup schedule: task -> retry -> catch -> state machine -> rule -> cadence alarm.

Tests: +grant-signal/ESI/funding/CSV unit coverage; cdk staging rules 8->9 + projection schedule test + prod-absence test (130 cdk tests). Operator deploys: cdk deploy Sps-Etl-staging (Step 2) + app image roll (Step 3).