- Date: 2026-05-19
- Scope: TRR-Backend admin social/profile routes, database pool behavior, social catalog freshness
- DebugPro status: INCONCLUSIVE root cause, OPEN bug
The backend is intermittently exhausting its local database pool and hitting statement timeouts while serving admin social/profile reads. The most visible failure is the Instagram catalog freshness route for thetraitorsus, but the same pressure window also degrades the live-status stream and unrelated admin show reads.
The bounded cause is query and pool pressure in admin read paths that share the backend process. The exact root cause is not fully proven yet because the current evidence is from live logs and source inspection, not an isolated query plan or load reproduction.
- `api/routers/socials/__init__.py:5255` exposes `POST /profiles/{platform}/{account_handle}/catalog/freshness` with a default `statement_timeout_ms=3000`.
- `trr_backend/socials/social_season_analytics_impl.py:59123` opens the `catalog-freshness` connection from the `social_profile` pool, sets a local statement timeout, and then calls `_catalog_recent_runs`.
- `trr_backend/socials/social_season_analytics_impl.py:31515` builds `_catalog_recent_runs` as a multi-stage query over `social.scrape_runs` and lateral joins into `social.scrape_jobs`.
- `.logs/workspace/trr-backend.log` shows `catalog-freshness` timing out with `psycopg2.errors.QueryCanceled: canceling statement due to statement timeout`, followed by `POST /api/v1/admin/socials/profiles/instagram/thetraitorsus/catalog/freshness` returning `503 Service Unavailable`.
- The same log window shows pool saturation: `acquire_failed ... error=PoolError in_use=3 available=0` for `fetch_one`, `fetch_all`, and `read`.
- `api/routers/socials/__init__.py:6486` runs `/live-status/stream`; the log shows `Social live-status stream tick degraded` after `asyncio.wait_for` times out around the same pool-pressure window.
- `.logs/workspace/runtime-reconcile.json` reports `overall_state=ok`, database history in sync, Modal readiness ok, and Instagram remote auth ready. That bounds this away from migration drift, missing Modal functions, or missing Instagram cookies.
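The live-status degradation noted in the evidence can be illustrated with a minimal sketch of a tick that runs under an `asyncio.wait_for` budget and reports degradation instead of raising. The names `read_live_status` and `stream_tick`, the payload shape, and the timings are hypothetical and are not taken from the backend source; the sleep only stands in for pool wait plus query time.

```python
import asyncio


async def read_live_status(delay_s: float) -> dict:
    # Hypothetical stand-in for a pooled database read; the sleep models
    # time spent waiting for a connection plus running the query.
    await asyncio.sleep(delay_s)
    return {"status": "ok"}


async def stream_tick(delay_s: float, budget_s: float) -> dict:
    """Run one tick under a time budget; report degradation instead of raising."""
    try:
        return await asyncio.wait_for(read_live_status(delay_s), timeout=budget_s)
    except asyncio.TimeoutError:
        # The read did not finish in time, for example because no pool
        # connection was free during the pressure window.
        return {"status": "degraded", "reason": "timeout"}


fast_tick = asyncio.run(stream_tick(delay_s=0.0, budget_s=0.5))
slow_tick = asyncio.run(stream_tick(delay_s=0.5, budget_s=0.05))
```

Under these assumptions, a tick that cannot get its read done within the budget degrades on its own, which is consistent with the stream degrading in the same window as the pool errors without being the source of the pressure.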
Start the workspace in the normal TRR browser-verification lane, open the Instagram account profile, and trigger the catalog freshness check while the page is also loading catalog posts, review queue, cookie health, and live status.
Observed failing signals from the current log:
- Backend route: `POST /api/v1/admin/socials/profiles/instagram/thetraitorsus/catalog/freshness`
- Backend response: `503 Service Unavailable`
- Backend exception: `psycopg2.errors.QueryCanceled: canceling statement due to statement timeout`
- Pool pressure: `PoolError in_use=3 available=0`
Practical fix direction:
- Make `catalog-freshness` cheap before it reaches the request timeout. The first target is `_catalog_recent_runs`, because the failing stack enters that query before the route returns 503.
- Add a focused index or query rewrite for the `social.scrape_runs` to `social.scrape_jobs` lookup pattern used by account catalog recent-run reads.
- Keep the route bounded: if the recent-run query times out, return a partial freshness payload with `latest_run` marked unavailable instead of failing the entire freshness check.
- Add a narrow regression test that simulates timeout from `_catalog_recent_runs` and verifies that the route returns a degraded freshness response rather than a 503.
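The bounded-route idea can be sketched as follows. This is an illustration under stated assumptions, not the backend's implementation: `RecentRunsTimeout` is a hypothetical stand-in for `psycopg2.errors.QueryCanceled` so the example runs without a database, and `build_freshness_payload` plus the payload keys (`degraded`, `available`, `reason`) are invented for the example.

```python
from typing import Any, Callable, Dict


class RecentRunsTimeout(Exception):
    """Hypothetical stand-in for psycopg2.errors.QueryCanceled."""


def build_freshness_payload(
    fetch_recent_runs: Callable[[], Dict[str, Any]],
    base_payload: Dict[str, Any],
) -> Dict[str, Any]:
    """Return a full payload, or a degraded one if the recent-run read times out."""
    payload = dict(base_payload)
    try:
        payload["latest_run"] = fetch_recent_runs()
        payload["degraded"] = False
    except RecentRunsTimeout:
        # Keep the rest of the freshness data; only the recent-run part is missing.
        payload["latest_run"] = {"available": False, "reason": "statement_timeout"}
        payload["degraded"] = True
    return payload


def timing_out_fetch() -> Dict[str, Any]:
    # Simulates the recent-run query being canceled by the statement timeout.
    raise RecentRunsTimeout("canceling statement due to statement timeout")


degraded = build_freshness_payload(timing_out_fetch, {"platform": "instagram"})
healthy = build_freshness_payload(lambda: {"run_id": "example"}, {"platform": "instagram"})
```

The same shape doubles as the narrow regression test: inject a fetcher that raises and assert on the degraded payload. In a real handler the connection would also need a rollback after the canceled statement, since PostgreSQL leaves the transaction aborted until then.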
Likely files:
- `trr_backend/socials/social_season_analytics_impl.py`
- `api/routers/socials/__init__.py`
- `tests/api/routers/test_socials_season_analytics.py`
- `tests/repositories/test_socialblade_growth.py` or a new social catalog freshness test
- possibly a Supabase migration under `supabase/migrations/` if the query needs a new index
- Passed: `.venv/bin/python -m pytest tests/api/test_admin_socialblade.py tests/repositories/test_socialblade_growth.py -q`
- Result: `22 passed in 2.50s`
- Not yet run: live browser reproduction after a backend fix, because this report only documents the bug.
- Not yet run: query plan capture for `_catalog_recent_runs`; that is the next evidence step before choosing index versus query rewrite.
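One way the pending query-plan capture could look is sketched below. It assumes a DB-API cursor such as the one psycopg2 provides; `capture_query_plan` and `FakeCursor` are hypothetical names, and the fake cursor returns placeholder rows only so the sketch runs without a database. The real SQL for `_catalog_recent_runs` is not reproduced here.

```python
from typing import Any, List, Optional, Sequence, Tuple


def capture_query_plan(
    cursor: Any, sql: str, params: Optional[Sequence[Any]] = None
) -> List[str]:
    """Run EXPLAIN (ANALYZE, BUFFERS) for `sql` and return the plan lines."""
    # ANALYZE actually executes the statement, so use this on read-only queries.
    cursor.execute("EXPLAIN (ANALYZE, BUFFERS) " + sql, params)
    # In the default text format, each result row holds one line of the plan.
    return [row[0] for row in cursor.fetchall()]


class FakeCursor:
    """In-memory stand-in for a DB-API cursor (placeholder rows, not a real plan)."""

    def __init__(self) -> None:
        self.executed: List[str] = []

    def execute(self, sql: str, params: Optional[Sequence[Any]] = None) -> None:
        self.executed.append(sql)

    def fetchall(self) -> List[Tuple[str]]:
        return [("plan line 1",), ("plan line 2",)]


fake = FakeCursor()
plan_lines = capture_query_plan(fake, "SELECT 1")
```

Because `EXPLAIN ANALYZE` runs the query, it is subject to the same statement timeout as the route, so the capture session would probably need a higher limit than 3000 ms to finish on the slow case.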
- Add a backend test for catalog freshness timeout degradation.
- Add a query-plan check or documented index rationale for the account catalog recent-run lookup.
- Keep `make status` or a focused pressure probe in the verification path so future fixes prove both route behavior and pool pressure.
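A focused pressure probe could be as small as a log scan. The sketch below assumes the pool logs `in_use=<n>` and `available=<n>` as adjacent key=value tokens on `acquire_failed` lines, which matches the excerpt quoted in this report but has not been checked against the full log format; `find_pool_exhaustion` is a hypothetical helper, not an existing tool in the repository.

```python
import re
from typing import Dict, List

# Assumption: the counters appear as `in_use=<n> available=<n>` on the same line.
_POOL_FIELDS = re.compile(r"\bin_use=(\d+)\s+available=(\d+)")


def find_pool_exhaustion(lines: List[str]) -> List[Dict[str, int]]:
    """Return in_use/available counts for acquire failures with no free connection."""
    hits: List[Dict[str, int]] = []
    for line in lines:
        if "acquire_failed" not in line:
            continue
        match = _POOL_FIELDS.search(line)
        if match and int(match.group(2)) == 0:
            hits.append({"in_use": int(match.group(1)), "available": 0})
    return hits


# The first sample line is the excerpt from this report, elision included.
sample = [
    "acquire_failed ... error=PoolError in_use=3 available=0",
    "unrelated log line",
]
exhausted = find_pool_exhaustion(sample)
```

Running such a probe over the backend log before and after a fix would show whether the route change also relieved the pool, rather than only hiding the 503.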
Bounded sweep completed by log/source inspection only:
- Catalog freshness route
- Live-status stream
- Admin show/social read routes affected in the same pressure window
- Runtime reconcile and Modal readiness
No code sweep or fix was applied.
- Does `_catalog_recent_runs` need a new composite/expression index, or can the query be split into cheaper run and job lookups?
- Should catalog freshness return partial data when recent-run lookup times out?
- Are the current local backend pool caps intentionally low for this page load, or should the account profile page defer more secondary reads?