Skip to content

fix(job): handle access-log errors and resolve IP-limit clients relationally#5496

Closed
n0ctal wants to merge 2 commits into
MHSanaei:mainfrom
n0ctal:pr-0016-iplimit-job-hardening
Closed

fix(job): handle access-log errors and resolve IP-limit clients relationally#5496
n0ctal wants to merge 2 commits into
MHSanaei:mainfrom
n0ctal:pr-0016-iplimit-job-hardening

Conversation

@n0ctal

@n0ctal n0ctal commented Jun 22, 2026

Copy link
Copy Markdown
Contributor

Summary

The client IP-limit job ignored errors when opening/copying/truncating the Xray access log, and located a client's inbound with a fragile settings LIKE '%email%' substring query. This adds error handling with early returns, verifies the access log is actually readable, and resolves the client-to-inbound mapping via the clients/client_inbounds relation, keeping the substring scan only as a fallback.

Why

  • Swallowed errors (j.checkError(err) with no return) let the job keep running after a failed log open/copy/truncate, operating on inconsistent state.
  • WHERE settings LIKE '%email%' can match the wrong inbound: an email that is a substring of another email, or text that merely appears elsewhere in the settings JSON. The relational lookup is exact; the JSON scan remains as a fallback so nothing regresses when the relation has no row.

Scope

  • internal/web/job/check_client_ip_job.go:
    • clearAccessLog, processLogFile: check each file error and return early instead of continuing.
    • checkAccessLogAvailable: actually open the log to confirm readability (not just stat).
    • getInboundByEmail: join client_inbounds/clients on clients.email; on miss fall back to the previous LIKE + JSON parse.
  • internal/web/job/check_client_ip_job_integration_test.go: integration coverage.

Validation

go vet ./internal/web/job/, go build ./..., and the full internal/web/job package under -race all pass.

Risk

getInboundByEmail now prefers the relational tables and falls back to the prior substring+JSON scan, so the result is unchanged when the join finds nothing. No schema change; no behavioral change for the common single-match case.

@github-actions github-actions Bot added bug Something isn't working go Pull requests that update Go code labels Jun 22, 2026
@github-actions

Copy link
Copy Markdown

This PR contains three real fixes and one correctness bug introduced in the new code. The fixes for swallowed errors, the nil-pointer-on-failed-open panic, and the substring-match false-positive are all sound. One error-variable mistake in the fallback path needs attention before merge.


Blocking issue

Wrong error returned in getInboundByEmail when the LIKE query fails

File: internal/web/job/check_client_ip_job.go, new getInboundByEmail (diff +735)

var candidates []model.Inbound
if listErr := db.Model(&model.Inbound{}).Where("settings LIKE ?", "%"+clientEmail+"%").Find(&candidates).Error; listErr \!= nil {
    return nil, err   // ← returns the JOIN error, not listErr
}

err here is the error from the First(inbound) JOIN query — typically gorm.ErrRecordNotFound when no relational row exists. If the LIKE query itself then fails (e.g., a transient DB error), the function returns nil, err (i.e., nil, gorm.ErrRecordNotFound) rather than nil, listErr.

The caller in processObserved (line 267) checks errors.Is(err, gorm.ErrRecordNotFound) and, on a true result, calls j.delInboundClientIps(email) and skips enforcement — treating the client as stale or deleted. So a transient DB error on the LIKE query causes the job to silently delete a valid client's tracking row and suppress enforcement for that client on this run.

The fix is one word: return nil, listErr on that branch.

The same variable is used at the end of the function (return nil, err) when the LIKE scan finds no exact match; that instance is correct because returning the original JOIN error (usually ErrRecordNotFound) is exactly the signal processObserved needs.


Confirmed fixes

clearAccessLog: nil-pointer panic on failed open (resolved)

Pre-PR, defer logAccessP.Close() was registered unconditionally. When os.OpenFile returned an error, logAccessP was nil and the deferred call would panic. The PR correctly moves the defer inside the success branch.

processLogFile: ignored open error (resolved)

Pre-PR, file, _ := os.Open(accessLogPath) discarded the error and passed a potentially-nil *os.File to bufio.NewScanner, which would dereference it on the first Scan() call. The PR adds a proper error check with early return.

checkAccessLogAvailable: path-only check replaced by actual open (resolved)

Pre-PR, the function only verified that the configured path string was non-empty — it did not confirm the file was actually readable. The PR opens and immediately closes the file to verify readability. The handle is properly closed (_ = file.Close()), so there is no leak.

getInboundByEmail: substring false-positive eliminated (resolved)

The prior WHERE settings LIKE '%email%' … First() call could return an inbound whose settings JSON merely contained the queried email as a substring of a different client's email. The PR promotes the relational join to the primary path and tightens the fallback to an exact client.Email == clientEmail comparison after the LIKE candidate scan. Confirmed correct via the new TestGetInboundByEmailRejectsSubstringFallbackMatch integration test.


Observations (non-blocking)

  • Existing tests cover the fallback path implicitly. The PR refactors seedInboundWithClient into a thin wrapper around seedInboundOnlyWithClient (which has no relational link). All existing integration tests therefore exercise the LIKE+JSON fallback, which is the right regression coverage.

  • checkAccessLogAvailable opens the file a second time. processLogFile opens the same file immediately after the availability check passes. The double-open is harmless (the check is a defensive probe, not a lock) but does introduce a narrow TOCTOU window. This is a pre-existing design constraint and not a regression introduced here.

  • seedInboundWithClient now has t.Helper() at two levels (the new wrapper and seedInboundOnlyWithClient). This is harmless.


This review was generated automatically; a maintainer may follow up with additional context.

@n0ctal n0ctal force-pushed the pr-0016-iplimit-job-hardening branch from 8d45599 to daa9cea Compare June 22, 2026 20:33
…ionally

The IP-limit job swallowed access-log open/copy/truncate errors and
matched a client's inbound with a fragile settings LIKE '%email%'
substring query. Handle those errors with early returns, verify the
access log is readable, and resolve the client to its inbound via the
clients/client_inbounds join, keeping the substring scan only as a
fallback.
@n0ctal n0ctal force-pushed the pr-0016-iplimit-job-hardening branch from daa9cea to 69c10f9 Compare June 22, 2026 20:36
@n0ctal n0ctal marked this pull request as ready for review June 22, 2026 22:10
@MHSanaei MHSanaei closed this Jun 23, 2026
MHSanaei added a commit that referenced this pull request Jun 23, 2026
The IP-limit job tracks per-client IPs via the core's online-stats API; the access-log parser only ran as a fallback for cores predating that API (which the panel never bundles). Remove the parser, the availability check, and the hourly rotation that truncated a log the job no longer reads.

Move the user-enabled access-log wipe to the daily clear-logs job, guarded so a disabled ('none') or missing log is left alone. Retire the now-unwritten 3xipl-ap persistent-log machinery.

Also resolve IP-limit clients via the exact clients/client_inbounds relation instead of a fragile settings LIKE '%email%' substring, keeping the JSON scan only as a fallback (carried from #5496).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working go Pull requests that update Go code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants