[bug]: LogAnchorTxConfirm writes irreversibly at 1-conf #2179

Description

@jtobin

The chain porter's confirmation-success path
(LogAnchorTxConfirm in tapfreighter/chain_porter.go) performs
several irreversible DB writes on a single-confirmation observation.

The conf notification is registered with numConfs = 1 in
waitForConfEventOnce. The ReOrgWatcher handles the
same-tx-different-block reorg case (it patches the proof's
BlockHeight/BlockHeader/TxMerkleProof when the same txid
re-confirms), but does not handle the case where a different
transaction replaces our anchor entirely. The watcher waits
indefinitely for the original txid to re-confirm; meanwhile the DB
side effects from the orphaned 1-conf observation stay in place.
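
One way to picture the gap is as a three-way classification of what the new best chain says about the anchor's input. The sketch below is illustrative only: `reorgOutcome` and `classify` are hypothetical names, and the "spender on the new chain" lookup is assumed to come from some chain backend that is not modelled here. Per the description above, the current watcher effectively acts on the second case and stays in the first one forever when the third occurs.

```go
package main

import "fmt"

// reorgOutcome is a hypothetical classification of what happened to a
// watched anchor after a reorg. No such type is claimed to exist in tapd.
type reorgOutcome int

const (
	stillPending      reorgOutcome = iota // input unspent on the new chain
	sameTxReconfirmed                     // our txid landed in a new block
	replacedByOtherTx                     // a different txid spent our input
)

// String describes the action each outcome would call for.
func (o reorgOutcome) String() string {
	switch o {
	case sameTxReconfirmed:
		return "same tx re-confirmed: patch proof fields"
	case replacedByOtherTx:
		return "different tx confirmed: anchor is dead"
	default:
		return "still pending: keep waiting"
	}
}

// classify compares the anchor txid with the txid that spends the anchor's
// input on the new best chain. An empty spender means the input is unspent.
func classify(anchorTxid, spenderOnNewChain string) reorgOutcome {
	switch spenderOnNewChain {
	case "":
		return stillPending
	case anchorTxid:
		return sameTxReconfirmed
	default:
		return replacedByOtherTx
	}
}

func main() {
	fmt.Println(classify("T", ""))
	fmt.Println(classify("T", "T"))
	fmt.Println(classify("T", "T'"))
}
```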

Concrete failure scenario

Most relevant for force-close sweep parcels, where lnd's sweeper
publishes several RBF candidates with different txids:

  1. Sweep candidates T and T' both attempt to confirm the
    same logical sweep (same inputs, different fee/witness).
  2. T confirms first at 1-conf.
  3. T's parcel runs LogAnchorTxConfirm:
    • T marked confirmed.
    • Its inputs marked spent=true.
    • Its outputs materialized as asset rows.
    • T' (and any other live candidates) marked
      superseded=true via SupersedeConflictingTransfers.
  4. A reorg drops the block carrying T; T' confirms in the
    reorganised chain instead.
  5. The dominant chain has T' as the actual confirmed anchor.
    The DB still has:
    • T recorded as confirmed (at an orphaned txid).
    • T' flagged superseded=true, so its eventual
      confirmation never triggers LogAnchorTxConfirm either.
  6. Net: the DB points at an orphaned anchor; the chain points at a
    different one we now treat as dead.
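
The six steps above can be replayed against a toy in-memory model to make the end state explicit. Everything here is a hypothetical stand-in: `store`, `transfer`, and `logAnchorTxConfirm` only mimic the side effects listed in step 3 and are not the real tapdb schema or tapfreighter code. For brevity the toy supersedes every other transfer, whereas the real path only targets conflicting ones.

```go
package main

import "fmt"

// transfer is a simplified, hypothetical stand-in for a parcel's DB row.
type transfer struct {
	confirmed  bool
	superseded bool
}

// store is a toy model of the state mutated on confirmation.
type store struct {
	transfers map[string]*transfer
	spentBy   map[string]string // input outpoint -> txid recorded as spender
}

// logAnchorTxConfirm mimics the 1-conf side effects: mark the winner
// confirmed, mark its inputs spent, and supersede every rival.
func (s *store) logAnchorTxConfirm(txid string, inputs []string) {
	t := s.transfers[txid]
	if t == nil || t.superseded {
		return // superseded parcels are never resumed
	}
	t.confirmed = true
	for _, in := range inputs {
		s.spentBy[in] = txid
	}
	for id, other := range s.transfers {
		if id != txid {
			other.superseded = true
		}
	}
}

// runScenario replays steps 1 to 5 and returns the resulting state.
func runScenario() *store {
	s := &store{
		transfers: map[string]*transfer{"T": {}, "T'": {}},
		spentBy:   map[string]string{},
	}
	inputs := []string{"in0"} // T and T' spend the same input

	s.logAnchorTxConfirm("T", inputs) // steps 2-3: T seen at 1-conf
	// Step 4: the reorg drops T's block. Nothing in the store changes.
	s.logAnchorTxConfirm("T'", inputs) // T' confirms but is ignored
	return s
}

func main() {
	s := runScenario()
	for _, id := range []string{"T", "T'"} {
		t := s.transfers[id]
		fmt.Printf("%-2s confirmed=%v superseded=%v\n", id, t.confirmed, t.superseded)
	}
	fmt.Println("in0 recorded as spent by:", s.spentBy["in0"])
}
```

In this model the store ends with T confirmed and T' superseded, which is the inverse of what the reorganised chain says.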

For non-sweep transfers, a different-tx reorg requires either a
fee-bumped replacement or an attacker constructing a conflicting
spend of one of our inputs. For sweep parcels it requires only that
the sweeper's chosen candidate reorgs out and a different
candidate it had already broadcast wins — concrete, not theoretical.

Likely downstream impact

  • Proof verification: assets reference the recorded anchor; if
    that's the orphaned txid, verification against the dominant chain
    fails.
  • Stuck "live" candidate: the actually-confirmed T' is flagged
    superseded and its parcel is never resumed, so the porter
    doesn't surface the (chain-true) confirmation event.
  • Coin selection: inputs are marked spent at an orphaned tx; the
    chain says they were spent at a different one. The coin selection
    result is the same (the input is unavailable either way), but the
    record is internally inconsistent.

Two structural fixes

Option A — defer everything until SafeDepth. Register the conf
ntfn in waitForConfEventOnce with numConfs = SafeDepth instead
of 1. Drop the ReOrgWatcher's role for this path entirely.

  • Pros: clean; no inverse logic needed.
  • Cons: changes the latency profile of every confirmed transfer
    (delay = SafeDepth blocks before the on-disk state catches up to
    the chain). Affects RPC consumers that read transfer state.
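
A rough sketch of the gating Option A implies, and of the latency it adds. The helper names are hypothetical, and `safeDepth = 6` is an assumed value chosen for illustration; the issue does not state what SafeDepth is. In the real code the depth would be passed as numConfs when registering the notification rather than checked by hand.

```go
package main

import "fmt"

// safeDepth is an ASSUMED value for illustration only.
const safeDepth = 6

// confirmations returns how many confirmations a tx mined at confHeight has
// when the chain tip is at tipHeight. A confHeight of 0 means unconfirmed.
func confirmations(confHeight, tipHeight uint32) uint32 {
	if confHeight == 0 || tipHeight < confHeight {
		return 0
	}
	return tipHeight - confHeight + 1
}

// shouldPersist reports whether the irreversible writes may run, given the
// number of confirmations the caller is willing to wait for.
func shouldPersist(confHeight, tipHeight, numConfs uint32) bool {
	return confirmations(confHeight, tipHeight) >= numConfs
}

func main() {
	const confHeight = 100
	for tip := uint32(100); tip <= 105; tip++ {
		fmt.Printf("tip=%d confs=%d persist(numConfs=1)=%v persist(numConfs=safeDepth)=%v\n",
			tip,
			confirmations(confHeight, tip),
			shouldPersist(confHeight, tip, 1),
			shouldPersist(confHeight, tip, safeDepth),
		)
	}
}
```

With numConfs = 1 the writes fire as soon as the tx is mined; with numConfs = safeDepth they wait until the tx is buried safeDepth blocks deep, which is the latency cost noted under Cons.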

Option B — add inverse operations to the ReOrgWatcher for the
different-tx case. When the watched anchor txid doesn't
re-confirm within some bound, treat the orphaned anchor as dead and
roll back LogAnchorTxConfirm's side effects: un-spend inputs,
un-supersede conflicts, remove output asset rows, clear the chain
tx's block hash.

  • Pros: preserves current latency.
  • Cons: needs an "un-confirm" code path that doesn't exist; has
    schema implications around asset row removal; fiddly enough to
    warrant its own audit.
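
To give a feel for what the inverse path in Option B would have to undo, here is a minimal sketch over a toy in-memory state. The `state` and `confirmEffects` types and the `confirm`/`unconfirm` methods are hypothetical; the sketch assumes the set of touched rows is journalled at confirm time so the same set can be reverted later.

```go
package main

import "fmt"

// confirmEffects is a hypothetical journal of what one confirmation touched.
type confirmEffects struct {
	inputs    []string // outpoints marked spent
	conflicts []string // rival transfers marked superseded
	outputs   []string // asset rows created
}

// state is a toy in-memory model, not the real tapdb schema.
type state struct {
	blockHash  string // empty means the chain tx is unconfirmed
	spent      map[string]bool
	superseded map[string]bool
	assetRows  map[string]bool
}

func newState() *state {
	return &state{
		spent:      map[string]bool{},
		superseded: map[string]bool{},
		assetRows:  map[string]bool{},
	}
}

// confirm applies the forward side effects of a confirmation.
func (s *state) confirm(blockHash string, e confirmEffects) {
	s.blockHash = blockHash
	for _, in := range e.inputs {
		s.spent[in] = true
	}
	for _, c := range e.conflicts {
		s.superseded[c] = true
	}
	for _, out := range e.outputs {
		s.assetRows[out] = true
	}
}

// unconfirm reverts the journalled effects in reverse order: remove output
// rows, un-supersede conflicts, un-spend inputs, clear the block hash.
func (s *state) unconfirm(e confirmEffects) {
	for _, out := range e.outputs {
		delete(s.assetRows, out)
	}
	for _, c := range e.conflicts {
		delete(s.superseded, c)
	}
	for _, in := range e.inputs {
		delete(s.spent, in)
	}
	s.blockHash = ""
}

func main() {
	s := newState()
	e := confirmEffects{
		inputs:    []string{"in0"},
		conflicts: []string{"T'"},
		outputs:   []string{"asset0"},
	}
	s.confirm("blockA", e)
	fmt.Printf("after confirm:   %+v\n", *s)
	s.unconfirm(e)
	fmt.Printf("after unconfirm: %+v\n", *s)
}
```

The toy ignores the hard parts flagged under Cons: a real rollback would need to run atomically, and it would need to know each row's prior value, since a conflict might already have been superseded for an unrelated reason.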

Related
