
fix(posix): make warmupSpaceRootCache errors non-fatal #678

Open
flash7777 wants to merge 1 commit into opencloud-eu:main from flash7777:fix/warmup-non-fatal


Summary

  • Make warmupSpaceRootCache errors non-fatal during tree initialization
  • Cache write failures (e.g. NATS KV temporarily unavailable during upgrades) are logged instead of aborting the service
  • The subsequent async WarmupIDCache will retry and fill in any missing entries

Problem

Since commits 0ae322f and c5a3bca, warmupSpaceRootCache runs synchronously during tree.New() and aborts on cache write errors. During upgrades (e.g. 5.x → 7.x) the NATS KV store may be temporarily unavailable, causing the storage-users service to fail startup entirely. This cascades: proxy never binds its port → 502 for all requests.

Changes

Two lines changed in pkg/storage/fs/posix/tree/tree.go:

  1. Caller: return nil, errors.Wrap(...) → t.log.Error()... (log and continue)
  2. Function: return errors.Wrap(...) → t.log.Error()... (log and continue per space)
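
The log-and-continue pattern described in the two changes can be sketched roughly as follows. This is a minimal illustration, not the code from tree.go: the spaceCache interface, the flakyCache type, and the warmupSpaceRoots function are hypothetical stand-ins, and the standard library logger replaces the project's t.log.

```go
package main

import (
	"errors"
	"fmt"
	"log"
)

// spaceCache is a hypothetical stand-in for the ID cache backed by NATS KV.
type spaceCache interface {
	Set(spaceID, path string) error
}

// flakyCache rejects every write while unavailable is true.
type flakyCache struct {
	unavailable bool
	entries     map[string]string
}

func (c *flakyCache) Set(spaceID, path string) error {
	if c.unavailable {
		return errors.New("kv store unavailable")
	}
	c.entries[spaceID] = path
	return nil
}

// warmupSpaceRoots writes every space root to the cache. A failed write is
// logged and the loop moves on to the next space instead of returning the
// error. The return value is the number of spaces that could not be cached.
func warmupSpaceRoots(c spaceCache, roots map[string]string) int {
	failed := 0
	for id, path := range roots {
		if err := c.Set(id, path); err != nil {
			log.Printf("could not cache space root, continuing: space=%s err=%v", id, err)
			failed++
		}
	}
	return failed
}

func main() {
	roots := map[string]string{"space-a": "/data/a", "space-b": "/data/b"}
	c := &flakyCache{unavailable: true, entries: map[string]string{}}

	// The caller also treats the outcome as non-fatal: it reports and proceeds.
	failed := warmupSpaceRoots(c, roots)
	fmt.Printf("startup continues; %d of %d space roots not cached yet\n", failed, len(roots))
}
```

The trade-off in this sketch is that a cold cache is accepted at startup in exchange for the service coming up; correctness then depends on a later pass filling in the missing entries.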

Test plan

  • Verified that service starts successfully when NATS KV is temporarily unavailable
  • Verified that spaces are populated once NATS becomes available (via async WarmupIDCache)
  • Tested upgrade scenario from 5.1.0 to 7.x with existing NATS data
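
The first two checks in the test plan could be approximated in isolation with something like the sketch below. It assumes a hypothetical in-memory kvStore in place of NATS KV and a hypothetical warmup function in place of both warmupSpaceRootCache and WarmupIDCache; the real functions may retry differently.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// kvStore is a hypothetical in-memory stand-in for the NATS KV-backed cache.
type kvStore struct {
	mu      sync.Mutex
	up      bool
	entries map[string]string
}

// setAvailable toggles whether writes succeed.
func (s *kvStore) setAvailable(up bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.up = up
}

// Set stores an entry, or fails while the store is down.
func (s *kvStore) Set(id, path string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if !s.up {
		return errors.New("kv store unavailable")
	}
	s.entries[id] = path
	return nil
}

// Len reports how many entries are cached.
func (s *kvStore) Len() int {
	s.mu.Lock()
	defer s.mu.Unlock()
	return len(s.entries)
}

// warmup tries to cache every root and returns the IDs that failed.
func warmup(s *kvStore, roots map[string]string) []string {
	var missing []string
	for id, path := range roots {
		if err := s.Set(id, path); err != nil {
			missing = append(missing, id)
		}
	}
	return missing
}

func main() {
	roots := map[string]string{"space-a": "/data/a", "space-b": "/data/b"}
	s := &kvStore{entries: map[string]string{}}

	// Synchronous pass at startup: the store is down, nothing is cached,
	// and startup carries on regardless.
	missing := warmup(s, roots)
	fmt.Println("after startup pass:", s.Len(), "cached,", len(missing), "missing")

	// The store comes back; a later asynchronous pass fills in the gaps.
	s.setAvailable(true)
	done := make(chan []string)
	go func() { done <- warmup(s, roots) }()
	missing = <-done
	fmt.Println("after async pass:", s.Len(), "cached,", len(missing), "missing")
}
```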

The warmupSpaceRootCache function runs synchronously during tree
initialization. If the NATS KV cache is temporarily unavailable
(e.g. during upgrades or migrations), the hard error abort prevents
the storage-users service from starting entirely, which cascades
into proxy startup failures (502).

Change both the cache write error and the caller to log-and-continue
instead of aborting. The subsequent async WarmupIDCache will retry
and fill in any missing entries.