Skip to content

Story 2451: AI-Assisted Description for Link Posts type#2477

Open
javiercoronadonarvaez wants to merge 41 commits into
developfrom
javiercoronarv/2451-ai-assisted-description-link-post
Open

Story 2451: AI-Assisted Description for Link Posts type#2477
javiercoronadonarvaez wants to merge 41 commits into
developfrom
javiercoronarv/2451-ai-assisted-description-link-post

Conversation

@javiercoronadonarvaez

@javiercoronadonarvaez javiercoronadonarvaez commented Jun 2, 2026

Copy link
Copy Markdown
Collaborator

Issue: #2451

Summary & Context

Adds an Auto-Generate Description flow to the Link post type for any public web URL: the backend fetches the page, extracts the main article text (title + body) with trafilatura, and runs it through the same summarizer used by Blog/News so the user gets an editable draft instead of a blank Description field.

  • Figma link: N/A
  • Link to components/page: localhost:8000/news/add (select Link as the post type)

⚠️ Built on top of #2473.
You'll need to update your local .env so the summarization endpoint can reach OpenRouter:

  • OPENROUTER_API_KEY=<key> (required)
  • SUMMARIZATION_MODEL=<model> (optional, defaults to gpt-oss-120b — see config/settings.py:658)

Changes

  • New endpoint POST /v3/news/generate-link-description/ that accepts any public http(s) URL, fetches the page (10s timeout) through an SSRF-guarded helper, extracts title + body, and returns a summarized description as JSON. Three distinct failure modes (invalid/non-public URL → 400, fetch/extraction failure → 502, summarization failure → 502) map to separate inline messages.
  • New helper extract_article in news/helpers.py: trafilatura isolates the main article and strips boilerplate (navigation, footers, comments, ads), keeping the summarizer focused and cheaper than feeding it the whole page. Falls back to a naive visible-text dump (extract_content) when trafilatura can't parse a page, and returns ("", "") only when even the fallback is empty. (Replaces the previous cppalliance.org-only extract_cppalliance_post CSS selectors.)
  • New SSRF-safe fetch safe_get in news/helpers.py, used by both the endpoint and the background set_summary_for_link_entry task: rejects non-http(s) schemes and hosts resolving to loopback/private/link-local/reserved IPs (e.g. 127.0.0.1, 10.x, 169.254.169.254), and follows redirects manually so each hop's host is re-validated.
  • Background set_summary_for_link_entry task now uses trafilatura extraction + the SSRF guard, and skips gracefully (logs and returns) when no readable text can be extracted.
  • Link post type mirrors Blog/News' Description UX: Auto-Generate button (enabled once the link field holds a valid http(s) URL), 1000-char counter, Saving/Saved indicator, per-post-type localStorage draft, inline help (hidden once the URL is valid) / error states, and the same spinner + "generating…" placeholder shown while the request is in flight.
  • description field is persisted to Link.summary on submit (same path Blog/News already use), so the generated text is available immediately on the post page without waiting for the summary_dispatcher task.

‼️ Risks & Considerations ‼️

  • Extraction is not universal: trafilatura is a solid, widely-used main-content extractor, but no extractor handles every page. JS-rendered SPAs (content not in the fetched HTML), paywalled pages, anti-bot blocks, and non-article URLs (PDFs, etc.) won't yield usable text. When extraction comes back empty the user sees the inline "We couldn't read that link" error and can still write the description manually — failure is graceful, not a crash.
  • SSRF surface: removing the cppalliance.org allowlist means the server now fetches arbitrary user-supplied URLs. This is mitigated by safe_get, which blocks loopback/private/link-local/reserved targets (including the cloud-metadata address 169.254.169.254) and re-validates each redirect hop. Known residual gap: DNS rebinding (a host that resolves to a public IP at check time and a private one at connect time) is not closed — documented in safe_get; fully closing it would require pinning the validated IP into the actual connection.
  • Endpoint is not login-gated and not rate-limited yet (matches the existing generate_description endpoint — flagged in a NOTE: in news/views.py). Both should get @login_required + throttling before this is exposed externally — more important now that the fetch target is fully user-controlled.
  • Synchronous outbound fetch from a web worker with a 10s timeout — a hung upstream still ties up a worker for up to 10s.
  • Stacked on Story 2428: AI-Assisted Description for Blog and News Posts #2473: the diff against develop includes that PR's work; review against the parent branch (javiercoronarv/2428-ai-assisted-desc-blog-news) for a clean view of the changes unique to this PR.

Screenshots

Persistence and Functionality

Functionality.Persistence.mov

Dimensionality and Dark/Light Modes

Dimensionality.DarkLightMode.mov

Self-review Checklist

  • Tag at least one team member from each team to review this PR
  • Link this PR to the related GitHub Project ticket

Frontend

  • UI implementation matches Figma design
  • Tested in light and dark mode
  • Responsive / mobile verified
  • Accessibility checked (keyboard navigation, etc.)
  • Ensure design tokens are used for colors, spacing, typography, etc. – No hardcoded values
  • Test without JavaScript (if applicable)
  • No console errors or warnings

Summary by CodeRabbit

  • New Features

    • AI-powered description endpoints for posts and external links
    • Per-post-type draft persistence in the create-post form
    • WYSIWYG editor bridge emitting live updates and accepting programmatic content
  • Improvements

    • Summary field added to post forms with live character counters and generate/save-state UI
    • Safer external link fetching and more robust main-content extraction
    • Configurable AI model settings and input-size guardrails
  • Chores

    • Tooling and dependency updates; formatting hook bumped

@javiercoronadonarvaez javiercoronadonarvaez force-pushed the javiercoronarv/2451-ai-assisted-description-link-post branch 2 times, most recently from f72233e to e57a705 Compare June 3, 2026 15:23
@javiercoronadonarvaez javiercoronadonarvaez changed the title Javiercoronarv/2451 ai assisted description link post Story 2451: AI-Assisted Description for Link Posts type Jun 3, 2026
@javiercoronadonarvaez javiercoronadonarvaez linked an issue Jun 3, 2026 that may be closed by this pull request
@javiercoronadonarvaez javiercoronadonarvaez force-pushed the javiercoronarv/2451-ai-assisted-description-link-post branch from eda4415 to ebd76ef Compare June 4, 2026 19:38
- Add description box alongside 'Saving' + Saving Icon section
- Does not display anything if the user hasn't started typing
- Display Saving text + Icon while user types and display Saved text + Icon once he's finished typing
- Corroborate auto generate description works with wysiwyg component
- Fix in accordance to GitHub Actions failure
…ated

- Add 'Hold on! We are generating a description for your content, it may take a few seconds' as placeholder in description box
…ry.summary

- Persist user-typed summary
- Skip background regeneration when set
- Auto-Generate button disabled unless a properly composed URL is input
…ded Link is invalid

- 'Auto-Generate only works for .html links from cppalliance.org' message displayed only when Link does not follow this pattern
@javiercoronadonarvaez javiercoronadonarvaez force-pushed the javiercoronarv/2451-ai-assisted-description-link-post branch from 16566d2 to 671b8a7 Compare June 4, 2026 21:27
javiercoronadonarvaez and others added 4 commits June 8, 2026 11:58
- Truncation to comply with 1000 character limit from Design templates
- Optimize prompt
- Update pre commit config file to match black's version in requirements (26.1.0)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented Jun 8, 2026

Copy link
Copy Markdown

Review Change Stack

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds AI-assisted description generation: safe link fetching/extraction, centralized summarization (sync + Celery wrapper), new backend endpoints, form/template/JS wiring for create-post UI with draft persistence, and supporting dependency/settings and maintenance edits.

Changes

News description generation and persistence

Layer / File(s) Summary
Settings, constants, and dependencies
config/settings.py, news/constants.py, requirements.in, requirements.txt
New SUMMARIZATION_MODEL and WHATS_NEW_MODEL env settings, DESCRIPTION_SUMMARY_MAX_LENGTH constant, and HTML-extraction dependency additions/updates.
Safe URL fetching and article extraction
news/helpers.py
Introduced UnsafeURLError, safe_get() with SSRF protection and redirect re-validation, and extract_article() using trafilatura with fallback extraction.
Synchronous summarization and Celery wrapper
news/tasks.py
Added _truncate_to_length() and generate_summary() (prompting, timeout, truncation) and refactored summarize_content() to wrap the sync function and accept max_length.
Form summary field additions
news/forms.py
Added summary to BlogPostForm, LinkForm, and NewsForm Meta.fields.
Backend generation endpoints and wiring
config/urls.py, news/views.py
Registered v3/news/generate-description/ and v3/news/generate-link-description/ POST endpoints; pre-populate summary from description on V3 create submit.
WYSIWYG editor custom event bridge
frontend/wysiwyg-editor.js
Dispatches wysiwyg-update with id, char count, value, and programmatic flag; listens for wysiwyg-set-content to restore content programmatically.
Description section styling and WYSIWYG adjustments
static/css/v3/create-post-page.css
Adds description UI styles, char counter, save-state indicators/spin animation, and adjusts editor resize behavior.
Icon template and textarea field enhancements
templates/includes/icon.html, templates/v3/includes/_field_textarea.html
Adds saving/saved inline SVGs; adds alpine_x_model, alpine_at_input, and char_counter_max with live counter and maxlength wiring.
Create post template with Alpine draft persistence and generation
templates/news/v3/create.html
Extends window.createPostForm() with per-type localStorage drafts, WYSIWYG bridging, link validation, and async generateDescription()/generateLinkDescription() flows that POST to backend endpoints.
Test and fixture updates & misc maintenance
news/tests/test_forms.py, various files
Updated form-field tests to expect the summary field; added module-level logging/clients, pre-commit bump, whitespace/dedent/test-skip edits, and library/test formatting tweaks.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related issues

Poem

🐰 A rabbit hops through code so neat,
Fetches links safe, then prunes and sweet.
From WYSIWYG taps to model art,
Drafts persist and summaries start.
A little hop — new posts get heart!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 48.28% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title 'Story 2451: AI-Assisted Description for Link Posts type' clearly and specifically summarizes the main change: adding AI-assisted description generation for the Link post type.
Description check ✅ Passed The PR description is comprehensive and follows the template structure with Issue, Summary & Context, Changes, Risks & Considerations, Screenshots, and Self-review Checklist sections. It provides detailed technical context and implementation details.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch javiercoronarv/2451-ai-assisted-description-link-post

Comment @coderabbitai help to get the list of available commands and usage tips.

@javiercoronadonarvaez javiercoronadonarvaez marked this pull request as ready for review June 9, 2026 13:52

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 9

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
frontend/wysiwyg-editor.js (1)

824-827: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Markdown-mode edits are not propagated to the host state.

Line 824 updates state.markdownText, but Line 916 only emits wysiwyg-update on ProseMirror updates. In Markdown mode, the host page’s draft persistence/save indicator/counter can become stale (Line 908 also reads characters from editor state only).

Proposed fix
   mdTextarea.addEventListener("input", () => {
     state.markdownText = mdTextarea.value;
     debouncedMdPreview();
+    dispatchState(false);
   });
@@
+  const currentCharacters = () =>
+    state.mode === "markdown"
+      ? state.markdownText.length
+      : editor.state.doc.textContent.length;
+
   const dispatchState = (programmatic) => {
     editorEl.dispatchEvent(
       new CustomEvent("wysiwyg-update", {
         detail: {
           id: textareaId,
-          characters: editor.state.doc.textContent.length,
+          characters: currentCharacters(),
           value: currentValue(),
           programmatic: !!programmatic,
         },
         bubbles: true,
       }),
     );
   };

Also applies to: 903-909, 916-917

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@frontend/wysiwyg-editor.js` around lines 824 - 827, The markdown input
handler for mdTextarea currently sets state.markdownText and calls
debouncedMdPreview but does not notify the host that the document changed;
update the mdTextarea "input" listener (the block that sets state.markdownText
and calls debouncedMdPreview) to also emit the same wysiwyg-update event used by
the ProseMirror change handler, including the current markdownText and computed
character/count payload the host expects (the same shape used where
wysiwyg-update is emitted for ProseMirror updates), so the host-side draft
persistence/indicator and character counter stay in sync; ensure the emitted
payload mirrors what the existing ProseMirror emitter builds (character count
and document text) and reuse the same emit function or helper if one exists.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@libraries/forms.py`:
- Around line 371-383: The tuple unpacking from mailinglist_wordcloud_task.get()
includes an unused first element; change the variable name mailinglist_words to
_mailinglist_words (or just _) in the unpacking expression inside the function
where mailinglist_wordcloud_task.get() is called so the unused value is explicit
and linter warnings are resolved (leave mailinglist_wordcloud_base64 and
mailinglist_wordcloud_top_words unchanged).
- Around line 442-444: The call to determine_versions in generate_context
currently unpacks three values into report_before_release, prior_version,
version but report_before_release is never used; change the unpack to use a
throwaway name (e.g. _report_before_release or _) for the first element while
keeping prior_version and version so lint stops flagging an unused
binding—locate the determine_versions call inside generate_context and replace
report_before_release with the chosen underscore variable.

In `@news/helpers.py`:
- Around line 73-77: The loop in safe_get currently downloads the full response
body (requests.get(..., allow_redirects=False)) which can exhaust memory; change
to request with stream=True inside safe_get, read the response body
incrementally (e.g., via resp.iter_content) and enforce a hard byte cap (e.g.,
MAX_RESPONSE_BYTES) across reads, raising an error or closing the connection if
the cap is exceeded; ensure you still check _url_host_is_safe(url) and handle
redirects via max_redirects/timeout and propagate or wrap errors with
UnsafeURLError or a new ResponseTooLargeError as appropriate.

In `@news/tasks.py`:
- Line 180: The log at logger.info currently includes the raw fetched markup
(markup[:100]) which can leak third-party content/PII and create
high-cardinality noise; update the log in news/tasks.py to remove the markup
snippet and instead log only safe metadata such as entry.external_url (or a
derived safe identifier like its hostname) and/or the markup length (e.g.,
len(markup)) so you preserve useful telemetry without exposing content.
- Around line 99-104: The code currently returns a falsy summary value which can
be an empty string and then get persisted, potentially clearing an existing
Entry.summary; change the falsy-check so that when summary is empty or invalid
you return None (not the empty summary) to signal “do not save” to the saver
task, and add a brief logger.debug or logger.warning with context (e.g., include
response metadata) before returning; update the branch that checks summary (the
variable summary from response.choices[0].message.content) to return None on
falsy values so the persistence path can skip saving.

In `@news/views.py`:
- Around line 588-627: The generate_description view is currently
unauthenticated and exposes a paid LLM endpoint; wrap it with Django's
login_required decorator so authentication runs before the POST check by adding
`@login_required` as the outer decorator above generate_description (i.e., ensure
`@login_required` appears above `@require_POST`), and consider adding a
rate-limiting decorator or middleware around generate_description to prevent
abuse and limit calls/timeouts.
- Around line 634-697: Add the missing authentication decorator to the
generate_link_description view: update the function decorated with `@require_POST`
in news/views.py by placing Django's `@login_required` decorator (from
django.contrib.auth.decorators import login_required) immediately above the view
so the symbol generate_link_description is only callable by authenticated users;
ensure the import for login_required is added if not present and keep
`@login_required` above `@require_POST` to enforce login before method check.

In `@requirements.txt`:
- Around line 321-329: Update the vulnerable pinned packages: change the lxml
pin (currently lxml[html-clean]==6.0.2) to at least lxml>=6.1.0 and bump urllib3
(currently urllib3==2.6.3) to at least urllib3>=2.7.0, then regenerate the
flattened requirements file from the input pins using the repo's compile command
(e.g., run the same pip-compile/uv pip compile command used elsewhere to produce
./requirements.txt) so the new versions and resolved transitive pins are written
to requirements.txt.

In `@templates/news/v3/create.html`:
- Around line 96-111: The textarea with id "field-post_description"
(x-model="description", `@input`="onDescriptionEdited") is left with
name="description" even when hidden for link/video posts, causing duplicate form
keys; update the template to only submit the write-up value when active — for
example bind the name or disabled attribute to the write-up state (use
:name="isWriteUp ? 'description' : null" or :disabled="!isWriteUp") or remove
the static name and conditionally set it when isWriteUp is true so the hidden
textarea does not send a stale value.

---

Outside diff comments:
In `@frontend/wysiwyg-editor.js`:
- Around line 824-827: The markdown input handler for mdTextarea currently sets
state.markdownText and calls debouncedMdPreview but does not notify the host
that the document changed; update the mdTextarea "input" listener (the block
that sets state.markdownText and calls debouncedMdPreview) to also emit the same
wysiwyg-update event used by the ProseMirror change handler, including the
current markdownText and computed character/count payload the host expects (the
same shape used where wysiwyg-update is emitted for ProseMirror updates), so the
host-side draft persistence/indicator and character counter stay in sync; ensure
the emitted payload mirrors what the existing ProseMirror emitter builds
(character count and document text) and reuse the same emit function or helper
if one exists.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: fae02e15-aa7c-46e5-b45b-c37eb300a3f9

📥 Commits

Reviewing files that changed from the base of the PR and between 00a8e72 and 3dd05d6.

📒 Files selected for processing (45)
  • .pre-commit-config.yaml
  • ak/views.py
  • config/settings.py
  • config/urls.py
  • core/context_processors.py
  • core/tests/test_managers.py
  • core/tests/test_tasks.py
  • core/views.py
  • frontend/wysiwyg-editor.js
  • gunicorn.conf.py
  • libraries/admin.py
  • libraries/forms.py
  • libraries/tests/fixtures.py
  • libraries/tests/test_commands.py
  • libraries/tests/test_tasks.py
  • mailing_list/management/commands/sync_mailinglist_stats.py
  • mailing_list/tasks.py
  • marketing/tests.py
  • news/acl.py
  • news/constants.py
  • news/forms.py
  • news/helpers.py
  • news/tasks.py
  • news/tests/test_forms.py
  • news/urls.py
  • news/views.py
  • requirements.in
  • requirements.txt
  • slack/management/commands/clear_slack_activity.py
  • slack/management/commands/fetch_slack_activity.py
  • static/css/v3/create-post-page.css
  • static/js/v3/wysiwyg-editor.js
  • templates/includes/icon.html
  • templates/news/v3/create.html
  • templates/v3/includes/_field_textarea.html
  • users/forms.py
  • users/tests/test_urls.py
  • users/tests/test_views.py
  • users/urls.py
  • users/views.py
  • versions/releases.py
  • versions/tasks.py
  • versions/tests/test_ai_tasks.py
  • versions/tests/test_api.py
  • versions/views.py
💤 Files with no reviewable changes (18)
  • gunicorn.conf.py
  • users/forms.py
  • slack/management/commands/clear_slack_activity.py
  • ak/views.py
  • users/tests/test_views.py
  • core/context_processors.py
  • core/tests/test_managers.py
  • mailing_list/tasks.py
  • core/tests/test_tasks.py
  • news/acl.py
  • libraries/tests/test_commands.py
  • slack/management/commands/fetch_slack_activity.py
  • users/urls.py
  • versions/releases.py
  • users/tests/test_urls.py
  • news/urls.py
  • versions/tests/test_api.py
  • versions/tests/test_ai_tasks.py

Comment thread libraries/forms.py
Comment on lines +371 to +383
mailinglist_contributor_release_count, mailinglist_contributor_new_count = (
mailing_list_contributors_task.get()
)
(mailinglist_post_stats, total_mailinglist_count) = (
mailing_list_stats_task.get()
)
(commit_contributors_release_count, commit_contributors_new_count) = (
mailinglist_post_stats, total_mailinglist_count = mailing_list_stats_task.get()
commit_contributors_release_count, commit_contributors_new_count = (
commit_contributors_task.get()
)
(
mailinglist_words,
mailinglist_wordcloud_base64,
mailinglist_wordcloud_top_words,
) = mailinglist_wordcloud_task.get()
(search_wordcloud_base64, search_wordcloud_top_words, search_stats) = (
search_wordcloud_base64, search_wordcloud_top_words, search_stats = (

@coderabbitai coderabbitai Bot Jun 9, 2026

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Unused unpacked value in task result handling.

mailinglist_words is unpacked but never used. Rename it to _mailinglist_words (or _) to make intent explicit and satisfy lint rules.

🧰 Tools
🪛 Ruff (0.15.15)

[warning] 379-379: Unpacked variable mailinglist_words is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@libraries/forms.py` around lines 371 - 383, The tuple unpacking from
mailinglist_wordcloud_task.get() includes an unused first element; change the
variable name mailinglist_words to _mailinglist_words (or just _) in the
unpacking expression inside the function where mailinglist_wordcloud_task.get()
is called so the unused value is explicit and linter warnings are resolved
(leave mailinglist_wordcloud_base64 and mailinglist_wordcloud_top_words
unchanged).

Source: Linters/SAST tools

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

out of scope for this PR. They're pre-existing lint, unrelated to anything changed.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understood, thanks for the context. Would you like me to open a follow-up GitHub issue to track the cleanup of the unused mailinglist_words variable so it doesn't get lost?

(ᴖ̥̥̥̥ ‸ ᴖ̥̥̥̥)

Comment thread libraries/forms.py
Comment on lines +442 to 444
report_before_release, prior_version, version = determine_versions(
report_configuration.version
)

@coderabbitai coderabbitai Bot Jun 9, 2026

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Unused value from determine_versions unpacking.

report_before_release is assigned but unused in generate_context. Replace with _report_before_release (or _) to avoid dead bindings and keep lint clean.

🧰 Tools
🪛 Ruff (0.15.15)

[warning] 442-442: Unpacked variable report_before_release is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@libraries/forms.py` around lines 442 - 444, The call to determine_versions in
generate_context currently unpacks three values into report_before_release,
prior_version, version but report_before_release is never used; change the
unpack to use a throwaway name (e.g. _report_before_release or _) for the first
element while keeping prior_version and version so lint stops flagging an unused
binding—locate the determine_versions call inside generate_context and replace
report_before_release with the chosen underscore variable.

Source: Linters/SAST tools

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

out of scope for this PR. They're pre-existing lint, unrelated to anything changed.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understood, @javiercoronadonarvaez — the formatting change here is purely cosmetic (parens removal) and the RUF059 warning on report_before_release is indeed pre-existing. I'll dismiss this comment.

Would you like me to open a follow-up GitHub issue to track fixing the unused report_before_release binding in generate_context (and the similar one in get_stats) so it doesn't get lost?

(´• ω •`) 🐇

Comment thread news/helpers.py Outdated
Comment thread news/tasks.py
Comment thread news/tasks.py Outdated
Comment thread news/views.py
Comment on lines +588 to +627
@require_POST
def generate_description(request):
"""Generate an AI description from submitted content (synchronous).

Backs the "Auto-Generate Description" button on the v3 create-post page.
Runs the summarization model inline and returns the result as JSON so the
browser can drop it into the Description field.

NOTE: intentionally not login-gated yet (local testing). This endpoint calls
a paid LLM, so add @login_required (and rate limiting) before it ships.
"""
title = request.POST.get("title", "").strip()
content = request.POST.get("content", "").strip()

if not content:
return JsonResponse({"error": "Add some content first."}, status=400)

try:
# Call the plain helper
summary = generate_summary(
content,
title,
settings.SUMMARIZATION_MODEL,
DESCRIPTION_SUMMARY_MAX_LENGTH,
timeout=30,
)
except Exception:
logger.exception("generate_description: summarization failed")
return JsonResponse(
{"error": "Could not generate a description. Please try again."},
status=502,
)

if not summary:
return JsonResponse(
{"error": "Could not generate a description. Please try again."},
status=502,
)

return JsonResponse({"description": summary.strip()})

@coderabbitai coderabbitai Bot Jun 9, 2026

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Missing authentication exposes paid LLM endpoint to abuse.

The docstring notes this endpoint should have @login_required before shipping, but it's being merged without authentication. Unauthenticated access to an endpoint that calls paid LLM APIs creates:

  • Cost exposure from repeated calls
  • Resource exhaustion (30s timeout ties up workers)

Add @login_required before the @require_POST decorator (order matters: @login_required should be outer so it runs first). Consider rate limiting as a follow-up.

🛡️ Proposed fix
+from django.contrib.auth.decorators import login_required
+
+@login_required
 `@require_POST`
 def generate_description(request):
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@news/views.py` around lines 588 - 627, The generate_description view is
currently unauthenticated and exposes a paid LLM endpoint; wrap it with Django's
login_required decorator so authentication runs before the POST check by adding
`@login_required` as the outer decorator above generate_description (i.e., ensure
`@login_required` appears above `@require_POST`), and consider adding a
rate-limiting decorator or middleware around generate_description to prevent
abuse and limit calls/timeouts.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of scope for this ticket

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understood, @javiercoronadonarvaez. The docstring already captures the intent (add @login_required (and rate limiting) before it ships), so the risk is documented in-code.

Would you like me to open a follow-up GitHub issue to track adding authentication and rate limiting to both generate_description and generate_link_description before they ship to production?

Comment thread news/views.py
Comment on lines +634 to +697
@require_POST
def generate_link_description(request):
"""Fetch the linked page, extract its main text, and summarize it.

Works for any public http(s) URL (trafilatura isolates the main article and
falls back to a visible-text dump for unusual templates). The fetch goes
through ``safe_get``, which blocks SSRF to internal/private hosts and
re-validates redirect targets.

Failure modes return separate JSON errors:
- URL is missing, malformed, or points at a non-public host (400).
- Fetch failed or no readable text could be extracted (502, "couldn't
read that link").
- Summarization failed or returned empty (502, "couldn't generate").

NOTE: intentionally not login-gated yet (matches `generate_description`).
Add @login_required + rate-limiting before this ships.
"""
url = request.POST.get("url", "").strip()
if not url:
return JsonResponse({"error": _LINK_INVALID_ERROR}, status=400)

try:
resp = safe_get(url, timeout=10)
resp.raise_for_status()
except UnsafeURLError:
logger.warning("generate_link_description: blocked unsafe url", url=url)
return JsonResponse({"error": _LINK_INVALID_ERROR}, status=400)
except requests.RequestException:
logger.exception("generate_link_description: fetch failed", url=url)
return JsonResponse({"error": _LINK_FETCH_ERROR}, status=502)

title, body = extract_article(resp.text, url=url)
if not body:
# Page fetched OK but no readable text could be isolated.
logger.warning("generate_link_description: extraction empty", url=url)
return JsonResponse({"error": _LINK_FETCH_ERROR}, status=502)

# Feed the extracted body through the same summarizer used by the Blog/News
# path — synchronously, with a real timeout so a hung upstream doesn't tie
# up a web worker (autoretry_for on the Celery task is a no-op when called
# inline; see news/tasks.py).
try:
summary = generate_summary(
body,
title,
settings.SUMMARIZATION_MODEL,
DESCRIPTION_SUMMARY_MAX_LENGTH,
timeout=30,
)
except Exception:
logger.exception("generate_link_description: summarization failed", url=url)
return JsonResponse(
{"error": "Could not generate a description. Please try again."},
status=502,
)

if not summary:
return JsonResponse(
{"error": "Could not generate a description. Please try again."},
status=502,
)

return JsonResponse({"description": summary.strip()})

@coderabbitai coderabbitai Bot Jun 9, 2026

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Same authentication gap as generate_description.

This endpoint also lacks @login_required despite calling the paid LLM API. Additionally, the total potential worker tie-up is ~40s (10s fetch + 30s summarization), making unauthenticated abuse more impactful.

Apply the same fix: add @login_required before @require_POST.

🛡️ Proposed fix
+@login_required
 `@require_POST`
 def generate_link_description(request):
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@news/views.py` around lines 634 - 697, Add the missing authentication
decorator to the generate_link_description view: update the function decorated
with `@require_POST` in news/views.py by placing Django's `@login_required`
decorator (from django.contrib.auth.decorators import login_required)
immediately above the view so the symbol generate_link_description is only
callable by authenticated users; ensure the import for login_required is added
if not present and keep `@login_required` above `@require_POST` to enforce login
before method check.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of scope for this ticket

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understood, @javiercoronadonarvaez — the docstring already notes this as a known gap to address before shipping. Would you like me to open a follow-up GitHub issue to track adding @login_required and rate-limiting to both generate_description and generate_link_description?

Comment thread requirements.txt Outdated
Comment thread templates/news/v3/create.html

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
requirements.in (1)

78-78: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Pin trafilatura to a specific version.

The trafilatura dependency lacks a version constraint, which creates security and reproducibility risks. Since this library parses arbitrary external web content (a high-risk surface), unpinned versions can silently introduce vulnerabilities or breaking changes.

📌 Recommended version constraint
 # Main-content extraction from arbitrary web pages (Link post Auto-Generate)
-trafilatura
+trafilatura>=2.0.0,<3.0.0  # pin major version for stability

Adjust the specific version range based on current testing. At minimum, pin the major version to prevent breaking changes.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@requirements.in` at line 78, Replace the unpinned dependency entry
"trafilatura" in requirements.in with a specific, constrained version to ensure
reproducible and secure installs; update the line that currently contains
trafilatura to either an exact pin (e.g., trafilatura==<stable-version>) or a
bounded range that pins the major version (e.g., trafilatura>=<min>,<max>) after
verifying compatibility in tests.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@requirements.in`:
- Line 78: Replace the unpinned dependency entry "trafilatura" in
requirements.in with a specific, constrained version to ensure reproducible and
secure installs; update the line that currently contains trafilatura to either
an exact pin (e.g., trafilatura==<stable-version>) or a bounded range that pins
the major version (e.g., trafilatura>=<min>,<max>) after verifying compatibility
in tests.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: d7fa9b30-4af4-476d-bd5e-d78a1d41e795

📥 Commits

Reviewing files that changed from the base of the PR and between cce6fa3 and 2aebc6b.

📒 Files selected for processing (2)
  • requirements.in
  • requirements.txt
✅ Files skipped from review due to trivial changes (1)
  • requirements.txt

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
news/helpers.py (1)

74-77: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Docstring claims auth + rate-limiting that don't exist yet.

The docstring states the DNS-rebinding gap is "acceptable here given the endpoint is also gated by auth + rate limiting", but per the PR notes the endpoint is currently not login-gated or rate-limited. This documentation/implementation mismatch could mislead future maintainers about the actual threat model.

Update the comment to reflect reality, or add a TODO referencing the planned auth/rate-limiting work.

Suggested fix
-    Residual gap: DNS rebinding (host resolves public at check time, private at
-    connect time) is not closed — that needs pinning the validated IP into the
-    connection. Acceptable here given the endpoint is also gated by auth + rate
-    limiting; revisit if the threat model tightens.
+    Residual gap: DNS rebinding (host resolves public at check time, private at
+    connect time) is not closed — that needs pinning the validated IP into the
+    connection. TODO: add auth + rate-limiting to the calling endpoints before
+    production use; revisit if the threat model tightens.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@news/helpers.py` around lines 74 - 77, The comment in news/helpers.py that
says the DNS-rebinding gap is "acceptable here given the endpoint is also gated
by auth + rate limiting" is inaccurate because auth/rate-limiting aren't
implemented; update that docstring/comment to reflect reality by removing or
revising the claim and either (a) stating explicitly that the endpoint is
currently not login-gated or rate-limited and documenting the current risk, or
(b) adding a TODO with the planned auth/rate-limiting ticket reference and a
short note that the DNS-pin mitigation should be added when that work lands;
locate the existing DNS-rebinding text (the lines mentioning "Residual gap: DNS
rebinding...") and edit it accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@news/helpers.py`:
- Around line 74-77: The comment in news/helpers.py that says the DNS-rebinding
gap is "acceptable here given the endpoint is also gated by auth + rate
limiting" is inaccurate because auth/rate-limiting aren't implemented; update
that docstring/comment to reflect reality by removing or revising the claim and
either (a) stating explicitly that the endpoint is currently not login-gated or
rate-limited and documenting the current risk, or (b) adding a TODO with the
planned auth/rate-limiting ticket reference and a short note that the DNS-pin
mitigation should be added when that work lands; locate the existing
DNS-rebinding text (the lines mentioning "Residual gap: DNS rebinding...") and
edit it accordingly.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 24a2b839-ec8d-4672-90d2-1017e73a757b

📥 Commits

Reviewing files that changed from the base of the PR and between 2aebc6b and 9729724.

📒 Files selected for processing (1)
  • news/helpers.py

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
templates/news/v3/create.html (3)

236-243: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Use the same http(s) rule for submit validation.

linkUrlValid already treats only http:/https: as valid, but prepareSubmit() accepts any absolute URL that new URL() can parse. With native validation disabled, schemes like ftp: still pass submit-time checks even though the rest of this form treats them as invalid.

Suggested fix
             } else {
-              try { new URL(url); } catch (_) {
+              try {
+                const parsed = new URL(url);
+                if (parsed.protocol !== 'http:' && parsed.protocol !== 'https:') {
+                  throw new Error('unsupported scheme');
+                }
+              } catch (_) {
                 this.errors.external_url = 'Please enter a valid URL.';
               }
             }

Also applies to: 580-586

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@templates/news/v3/create.html` around lines 236 - 243, linkUrlValid enforces
only http/https but prepareSubmit currently accepts any absolute URL; update
prepareSubmit to reuse the same validation logic. In prepareSubmit (the submit
handler), instead of only trying new URL(externalUrl), call or replicate
linkUrlValid's checks: trim externalUrl, ensure it's non-empty, parse with new
URL(), and require parsed.protocol === 'http:' || parsed.protocol === 'https:';
if it fails, treat the URL as invalid and block submission. Reference:
linkUrlValid, prepareSubmit, externalUrl.

564-615: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Cancel pending autosave timers before clearing drafts.

If the user submits within the 800ms debounce window, the scheduled save*Draft() callbacks can run after clear*Draft() and recreate stale drafts in localStorage. That makes old content reappear on the next visit even though the submit path was successful.

Suggested fix
           } else {
             // Validation passed; the form will POST — drop the saved drafts.
             // (On a server-side reject, the description re-initializes from
             //  `form.summary.value` so the user's text stays visible.)
+            clearTimeout(this._titleSaveTimer);
+            clearTimeout(this._externalUrlSaveTimer);
+            clearTimeout(this._contentSaveTimer);
+            clearTimeout(this._descriptionSaveTimer);
+            clearTimeout(this._linkDescriptionSaveTimer);
             this.clearTitleDraft();
             this.clearExternalUrlDraft();
             this.clearContentDraft();
             this.clearDescriptionDraft();
             this.clearLinkDescriptionDraft();
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@templates/news/v3/create.html` around lines 564 - 615, prepareSubmit
currently clears drafts (clearTitleDraft, clearExternalUrlDraft,
clearContentDraft, clearDescriptionDraft, clearLinkDescriptionDraft) but does
not cancel pending debounced autosave callbacks, allowing save*Draft timers to
run after clearing and recreate stale drafts; before calling the clear*Draft
methods, cancel any pending autosave timers (e.g. clearTimeout on whatever
timeout IDs the debounced save handlers store, or call a new
this.cancelPendingAutosaveTimers() helper that clears those timeout IDs and/or
aborts in-flight saves), and if such timeout IDs or a cancel helper do not
exist, add them to the corresponding
saveTitleDraft/saveContentDraft/saveExternalUrlDraft/saveDescriptionDraft/saveLinkDescriptionDraft
implementations so prepareSubmit can reliably cancel outstanding autosave
callbacks before clearing drafts.

138-165: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Persist video descriptions too.

This textarea is active for both video and link, but the draft helpers only save/restore when this.isLink is true. For video posts the field shows a “Saved” state without ever writing to localStorage, so the description is lost on refresh.

Suggested fix
-        // ── Link description draft: localStorage, keyed per post type ─────────
+        // ── Link/Video description draft: localStorage, keyed per post type ───
         linkDescriptionDraftKey() {
           return 'boost:createPost:linkDescriptionDraft:' + (this.postType || '');
         },
@@
         saveLinkDescriptionDraft() {
-          if (!this.isLink) return;
+          if (!this.isLinkType) return;
           try {
             if (this.linkDescription) localStorage.setItem(this.linkDescriptionDraftKey(), this.linkDescription);
             else localStorage.removeItem(this.linkDescriptionDraftKey());
           } catch (_) {}
         },
 
         restoreLinkDescriptionDraft({ allowEmpty = false } = {}) {
-          if (!this.isLink) {
+          if (!this.isLinkType) {
             if (allowEmpty) this.linkDescriptionSaveStatus = '';
             return;
           }

Also applies to: 460-498

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@templates/news/v3/create.html` around lines 138 - 165, The description
textarea is used for both video and link posts (x-model="linkDescription") but
the draft save/restore logic and "saved" indicator only run when this.isLink, so
video descriptions never get written to localStorage; update the draft helpers
to run for the same condition that shows the textarea (use isLinkType or include
isVideo) and/or rename/refactor onLinkDescriptionEdited to a generic
onDescriptionEdited that saves/restores linkDescription for both video and link
posts; also adjust any checks around generateLinkDescription() usage so only the
auto-generate button remains link-only but persistence uses the unified
condition (isLinkType || isVideo or a single isDescriptionVisible flag) to
ensure video descriptions are persisted and restored.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@templates/news/v3/create.html`:
- Around line 236-243: linkUrlValid enforces only http/https but prepareSubmit
currently accepts any absolute URL; update prepareSubmit to reuse the same
validation logic. In prepareSubmit (the submit handler), instead of only trying
new URL(externalUrl), call or replicate linkUrlValid's checks: trim externalUrl,
ensure it's non-empty, parse with new URL(), and require parsed.protocol ===
'http:' || parsed.protocol === 'https:'; if it fails, treat the URL as invalid
and block submission. Reference: linkUrlValid, prepareSubmit, externalUrl.
- Around line 564-615: prepareSubmit currently clears drafts (clearTitleDraft,
clearExternalUrlDraft, clearContentDraft, clearDescriptionDraft,
clearLinkDescriptionDraft) but does not cancel pending debounced autosave
callbacks, allowing save*Draft timers to run after clearing and recreate stale
drafts; before calling the clear*Draft methods, cancel any pending autosave
timers (e.g. clearTimeout on whatever timeout IDs the debounced save handlers
store, or call a new this.cancelPendingAutosaveTimers() helper that clears those
timeout IDs and/or aborts in-flight saves), and if such timeout IDs or a cancel
helper do not exist, add them to the corresponding
saveTitleDraft/saveContentDraft/saveExternalUrlDraft/saveDescriptionDraft/saveLinkDescriptionDraft
implementations so prepareSubmit can reliably cancel outstanding autosave
callbacks before clearing drafts.
- Around line 138-165: The description textarea is used for both video and link
posts (x-model="linkDescription") but the draft save/restore logic and "saved"
indicator only run when this.isLink, so video descriptions never get written to
localStorage; update the draft helpers to run for the same condition that shows
the textarea (use isLinkType or include isVideo) and/or rename/refactor
onLinkDescriptionEdited to a generic onDescriptionEdited that saves/restores
linkDescription for both video and link posts; also adjust any checks around
generateLinkDescription() usage so only the auto-generate button remains
link-only but persistence uses the unified condition (isLinkType || isVideo or a
single isDescriptionVisible flag) to ensure video descriptions are persisted and
restored.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 7e11b967-ab55-4876-ab66-5cd84e231215

📥 Commits

Reviewing files that changed from the base of the PR and between 9729724 and c3979af.

📒 Files selected for processing (2)
  • news/tasks.py
  • templates/news/v3/create.html
🚧 Files skipped from review as they are similar to previous changes (1)
  • news/tasks.py

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Task: AI-Assisted Description for Link Posts type

1 participant