Skip to content

fix(MarkovProcess): add draws= argument for per-agent CRN with iid (BUG-044)#1776

Open
llorracc wants to merge 3 commits into
harmenberg-dual-measurefrom
bug-stratified-draws-mode
Open

fix(MarkovProcess): add draws= argument for per-agent CRN with iid (BUG-044)#1776
llorracc wants to merge 3 commits into
harmenberg-dual-measurefrom
bug-stratified-draws-mode

Conversation

@llorracc
Copy link
Copy Markdown
Collaborator

HARK PR: Fix asymptotic bias in MarkovProcess._draw_shuffled for downstream estimators sensitive to per-agent identity

Summary

HARK.distributions.base.MarkovProcess._draw_shuffled (lines 240–347) implements a quota-exact shuffle for Markov state transitions. The intent is variance reduction: deterministic counts at each target state instead of binomial sampling noise.

Bug: when sort_key=None (the default), the algorithm assigns specific agents to specific targets via sub_rng.permutation(agents_in_j) (line 341). The permutation is independent of any per-agent draw, which means the same agent ends up at different targets under shuffle vs iid even when both methods see the same per-agent uniform draws. For estimators that depend on per-agent identity (e.g., welfare measurement that pairs each agent's outcomes across counterfactual scenarios), this breaks the asymptotic equivalence between shuffle and iid.

Empirical signature in HAFiscal: shuffle MC produces a +8% systematic bias on UI welfare estimators relative to iid MC. The bias does NOT shrink with N (verified at N=49,000 and N=245,000). Per-agent Markov state at recession-onset matches only ~92% between shuffle and iid; ~8% of agents are at different states despite both methods seeing identical per-agent random draws.

Fix: add a new mode assignment='inverse_cdf' (or accept an optional draws argument) that implements block-stratified inverse-CDF assignment. For each source state j with N_j agents:

  1. Sort agents by their per-agent draw u_i.
  2. Compute exact quota counts K[j,k] = floor(N_j × P[j,k]) + leftover allocation.
  3. Assign sorted agents to targets in order: first K[j,0] → target 0, next K[j,1] → target 1, etc.

This gives all three properties simultaneously:

  • Quota-exact counts (= variance reduction, the original goal)
  • Per-agent assignment determined by per-agent draw rank (= preserves CRN coupling with iid via shared draws)
  • Asymptotic equivalence to iid as N→∞ via Glivenko-Cantelli (rank/N_j → u_i)

Mathematical foundation

The current _draw_shuffled algorithm

For source state j with N_j agents and transition probabilities P[j,:]:

K[j,k] = floor(N_j * P[j,k])               # deterministic floor counts
M = N_j - sum(K[j,:])                      # leftover slots
allocate M leftovers via residual probs    # small randomness
perm = sub_rng.permutation(agents_in_j)    # RANDOM PERMUTATION
new_state[perm[0:K[j,0]]] = 0              # first K[j,0] perm-positions → target 0
new_state[perm[K[j,0]:K[j,0]+K[j,1]]] = 1  # next K[j,1] → target 1
...

The marginal counts are correct (quota-exact). The per-agent assignment is independent of any per-agent draw: it's a function of the random permutation alone.

Why this asymptotically diverges from iid

Per-agent iid is new_state[i] = searchsorted(cumsum(P[j,:]), u_i) where u_i ∈ Uniform(0,1) is per-agent.

Under iid, the per-agent assignment is determined by the per-agent draw u_i. Under shuffle, it's determined by the random permutation (independent of u_i).

Two simulations using the SAME per-agent draws u_i but DIFFERENT methods (shuffle vs iid) give:

  • Same expected per-state marginal counts ✓
  • DIFFERENT per-agent target assignments ✗

Asymptotically, the marginal distribution at each (state, time) is the same. But the per-agent identity is not preserved across methods, so any per-agent function whose summed value depends on which specific agents end up at which states will NOT converge to the iid value at finite N.

For estimators that integrate over per-agent (state, wealth, income) trajectories built up over time, the cross-method bias is asymptotically persistent (does not vanish with N) because each method's per-agent trajectory distribution is internally consistent but differs in joint structure from the other's.

The proposed fix: rank-based stratified inverse-CDF

For source state j with N_j agents and per-agent draws u_i:

sort_order = argsort(u_i)
sorted_agents = agents_in_j[sort_order]    # agents in increasing draw order

K[j,k] = floor(N_j * P[j,k]) + leftover    # (same as before)

offset = 0
for k in range(J):
    new_state[sorted_agents[offset:offset+K[j,k]]] = k
    offset += K[j,k]

Why this is iid-equivalent asymptotically

Let F[j,k] = sum_{k'≤k} P[j,k'] (cumulative cond_mrkv).

Under iid: agent i with draw u_i goes to target k iff u_i ∈ [F[j,k-1], F[j,k]].

Under stratified: agent at rank r in source j goes to target k iff r/N_j ∈ [F[j,k-1], F[j,k]] (approximately, modulo floor effects).

For an agent with draw u_i at rank r in source j:

  • The empirical CDF value at u_i is r/N_j.
  • By Glivenko-Cantelli, |r/N_j - u_i| → 0 a.s. as N_j → ∞.

So the agent's target under stratified converges to the agent's target under iid as N_j → ∞. The two algorithms produce the same per-agent assignments asymptotically.

For finite N_j, the difference is concentrated at the O(√N_j) "borderline" agents whose u_i is near a cutoff F[j,k] — a small number that vanishes asymptotically as a fraction of N_j.

Why the current sort_key mode is also wrong

The existing sort_key-based path (lines 309-336) uses systematic sampling: minority transitions are spread evenly across the sorted order. This is also not iid-equivalent — it also produces per-agent assignments uncorrelated with per-agent draws (since the spacing is N_rem/K[jp], not the draw value).

The new inverse_cdf mode is distinct from both the random-permutation default AND the existing systematic-sampling-by-sort-key.

Empirical evidence (HAFiscal welfare-6)

In econ-ark/HAFiscal-Latest, MarkovProcess.draw(shuffle=True) is used for cohort-level Markov state propagation in the welfare-6 estimator. Welfare-6 is a per-agent integral, weighted by marginal utility — heavily dependent on which specific agents end up at which states.

Test setup

  • Cohort N = 49,000 (and 245,000) at HS-Only single-education calibration
  • "Simplified" model: deterministic income during unemployment, mortality off (eliminates other sources of cross-method noise)
  • Compare three modes: original shuffle, per-agent iid, proposed stratified

Results

Method ui_rec ui_rec_AD Per-agent Markov match vs iid
Per-agent iid (= nshuf) 1.6029 2.1709 100% (by construction)
Original shuffle=True 1.7353 2.3071 92.4%
Proposed stratified fix 1.6015 2.1720 99.7-99.9%
Cell Original shuffle bias vs iid Stratified bias vs iid
ui_rec +8.26% -0.09%
ui_rec_AD +6.27% +0.05%
check_rec +0.96% -0.11%
check_rec_AD +0.76% -0.12%
Other cells <1.0% <0.15%

Asymptotic test (1×N → 5×N)

Cell Original shuffle bias at N=49k Original shuffle bias at N=245k
ui_rec +9.48% +9.96%
ui_rec_AD +7.13% +7.40%

If the original shuffle's bias were finite-N noise, going to 5×N should drop bias to ~45% of its value. It did not. The bias is asymptotically persistent → real algorithmic asymmetry.

Proposed code change

# In HARK/distributions/base.py, MarkovProcess._draw_shuffled

def _draw_shuffled(self, state, sort_key=None, draws=None):
    """
    ...existing docstring...

    Parameters
    ----------
    state : np.array
        Current source state per agent.
    sort_key : np.array or None
        Existing systematic-sampling-on-sort-key mode (unchanged).
    draws : np.array or None
        Per-agent uniform draws u_i ∈ [0,1]. When provided (and sort_key
        is None), use rank-based stratified inverse-CDF assignment that is
        asymptotically equivalent to per-agent iid `searchsorted(cumsum(P[j,:]), u_i)`,
        while preserving quota-exact target counts. Recommended for any
        downstream estimator sensitive to per-agent identity (e.g.,
        per-agent welfare integrals).
    """
    state = np.asarray(state)
    new_state = np.empty_like(state, dtype=int)
    J = self.transition_matrix.shape[1]
    J_src = self.transition_matrix.shape[0]

    base_entropy = int(self._rng.integers(0, 2**63 - 1))
    sub_seeds = np.random.SeedSequence(base_entropy).spawn(J_src)

    for j in range(J_src):
        agents_in_j = np.where(state == j)[0]
        N_j = len(agents_in_j)
        if N_j == 0:
            continue

        probs = self.transition_matrix[j]
        sub_rng = np.random.default_rng(sub_seeds[j])

        if N_j * np.min(probs[probs > 0]) < 1:
            for idx in agents_in_j:
                new_state[idx] = sub_rng.choice(J, p=probs)
            continue

        K_exact = N_j * probs
        K = np.floor(K_exact).astype(int)
        M = N_j - np.sum(K)

        if M > 0:
            eps = 1.0 / N_j
            Q = K_exact - eps * K
            extra_draws = sub_rng.random(M)
            for m in range(M):
                Q_adj = Q / np.sum(Q)
                Q_sum = np.cumsum(Q_adj)
                idx = np.searchsorted(Q_sum, extra_draws[m])
                K[idx] += 1
                Q[idx] = 0.0

        if sort_key is not None:
            # ... existing systematic-sampling code (unchanged) ...
            pass
        elif draws is not None:
            # NEW: rank-based stratified inverse-CDF assignment.
            # Asymptotically equivalent to per-agent iid; preserves
            # CRN with iid-method via shared draws.
            draws_j = draws[agents_in_j]
            sort_order = np.argsort(draws_j)
            sorted_agents = agents_in_j[sort_order]
            offset = 0
            for jp in range(J):
                if K[jp] == 0:
                    continue
                new_state[sorted_agents[offset : offset + K[jp]]] = jp
                offset += int(K[jp])
        else:
            # Existing default: random permutation (loses per-agent identity)
            perm = sub_rng.permutation(agents_in_j)
            offset = 0
            for jp in range(J):
                new_state[perm[offset : offset + K[jp]]] = jp
                offset += K[jp]

    return new_state

Tests

Add to HARK/tests/test_distribution.py:

def test_markov_shuffle_inverse_cdf_matches_iid_asymptotically():
    """
    Verify that MarkovProcess._draw_shuffled with `draws` argument
    converges to iid behavior as N → ∞.
    """
    import numpy as np
    from HARK.distributions import MarkovProcess

    P = np.array([[0.97, 0.03], [0.30, 0.70]])
    mp_shuf = MarkovProcess(P, seed=42)
    mp_iid = MarkovProcess(P, seed=42)

    for N in [1000, 10_000, 100_000]:
        # Same per-agent draws for both methods.
        rng = np.random.default_rng(seed=99)
        state = (rng.uniform(size=N) > 0.95).astype(int)  # ~5% start at state 1
        u = rng.uniform(size=N)

        # Per-agent iid
        cdf = np.cumsum(P, axis=1)
        new_iid = np.empty(N, dtype=int)
        for j in range(2):
            mask = (state == j)
            new_iid[mask] = np.searchsorted(cdf[j], u[mask])

        # Stratified shuffle (proposed)
        mp_shuf._rng = np.random.default_rng(seed=42)  # reset
        new_shuf = mp_shuf.draw(state, shuffle=True, draws=u)

        match_pct = (new_iid == new_shuf).mean()
        # Match should approach 100% as N grows.
        if N >= 100_000:
            assert match_pct > 0.99, f"At N={N}, match {match_pct:.4f} < 0.99"

def test_markov_shuffle_inverse_cdf_preserves_quotas():
    """Verify counts are still quota-exact under the new mode."""
    import numpy as np
    from HARK.distributions import MarkovProcess

    P = np.array([[0.7, 0.3], [0.4, 0.6]])
    mp = MarkovProcess(P, seed=0)

    N = 1000
    rng = np.random.default_rng(seed=1)
    state = (rng.uniform(size=N) > 0.5).astype(int)
    u = rng.uniform(size=N)

    new_state = mp.draw(state, shuffle=True, draws=u)

    # Per-source-state counts at each target should be ≈ quota-exact
    for j in [0, 1]:
        N_j = (state == j).sum()
        for k in [0, 1]:
            count = ((state == j) & (new_state == k)).sum()
            expected = N_j * P[j, k]
            # Allow ±1 due to floor-plus-leftover
            assert abs(count - expected) <= 1, (
                f"Source {j} → target {k}: count {count}, expected {expected}")

Backwards compatibility

Strictly additive: existing shuffle=True calls without the new draws argument behave identically to before. Existing sort_key mode unchanged. The new behavior is opt-in via draws=....

Caveats

  1. The downstream caller must pre-compute and pass per-agent draws u_i. This requires storing a draw vector at simulation setup time. For HARK callers using pre-computed shock histories (e.g., HAFiscal), this is natural. For callers that do not pre-compute draws, the existing default is unchanged.

  2. The new mode does not eliminate the differences from iid at small N (e.g., when N_j is comparable to J or smaller). The existing fallback to iid for small N_j remains active.

Suggested reviewers / context

  • The original _draw_shuffled was added in HARK to provide variance reduction for MC simulations. The bug was discovered while debugging a +8% systematic bias in HAFiscal's UI welfare estimator that did not vanish with N.

  • The math reference: per-agent iid uses searchsorted(cumsum(P[j,:]), u_i), which is the inverse-CDF transform. The proposed fix is the RANK-BASED inverse-CDF, which is asymptotically equivalent to the per-agent inverse-CDF by Glivenko-Cantelli but produces quota-exact counts.

Reviewer

@mnwhite (Matt White) — primary author of the existing
MarkovProcess._draw_shuffled machinery and best-suited to evaluate
both the API change and the math.

Files to change

  • HARK/distributions/base.py (_draw_shuffled method)
  • HARK/tests/test_distribution.py (add the two tests above)
  • examples/Distributions/Shuffle_vs_IID_Draws.ipynb (companion
    notebook updates — see below)
  • CHANGELOG.md (add note under "Added" or "Fixed")

Companion notebook updates

The existing examples/Distributions/Shuffle_vs_IID_Draws.ipynb already
covers the relevant theory and use cases. The PR requires both
modifications to existing cells AND one new section + code cell
so users understand when to use the new draws= argument.

Modifications to existing cells

Cell 5 (markdown, "Individual-agent equivalence") — currently
claims: "from the standpoint of any single agent, the shuffle procedure
is indistinguishable from i.i.d. sampling."
This is mathematically
correct for the agent's marginal law in a SINGLE simulation, but
misleading for downstream estimators that pair per-agent outcomes
across counterfactual scenarios (e.g., welfare integrals). Add a
caveat paragraph:

Caveat for paired-scenario estimators. The marginal-law equivalence
above holds within a single simulation. But if you run two simulations
using the SAME per-agent draws u_i (one with shuffle=True, one with
shuffle=False) and try to compare them per agent, you will find that
~8% of agents are at DIFFERENT target states despite seeing identical
draws. This is because the default shuffle uses a random permutation
independent of u_i to assign specific agents to specific targets.
For estimators that integrate over per-agent trajectories using shared
draws across counterfactual scenarios, see the new draws= argument
introduced below.

Cell 24 (markdown, "Cross-experiment CRN") — currently claims that
MarkovProcess._draw_shuffled "respects common random numbers across
calls with different transition matrices." This holds for two SHUFFLE
calls (= one policy and one baseline, both shuffle), but does NOT hold
across shuffle-vs-iid. Add:

The CRN coupling described above is for two shuffle simulations
with possibly-different transition matrices. If the downstream estimator
compares a SHUFFLE simulation to an IID simulation (e.g., to validate
the shuffle), the per-agent assignments will not coincide because the
default shuffle uses a random permutation independent of per-agent
draws. The next section introduces the draws= argument that fixes
this case.

New section to ADD (after Cell 26, before Cell 27)

A new section titled "Per-agent CRN with iid: the draws= argument"
that:

  1. Markdown cell (new): Motivates the use case. "Some estimators
    integrate over per-agent trajectories built up over many simulation
    periods, with shared per-agent draws across counterfactual scenarios.
    The MU-weighted welfare integral
    (1/N) Σ_i [u(c_pol_i) - u(c_base_i)] / u'(c_anchor_i) is one
    example: it depends on the SPECIFIC AGENTS at each state, not just
    the marginal counts."
  2. Code cell (new): A 30-line demo:
    • Set up a 2×2 Markov chain (employed/unemployed)
    • Generate N=10000 per-agent uniform draws u_i
    • Run iid: new_state = searchsorted(cdf, u_i) per source state
    • Run default shuffle: mp.draw(state, shuffle=True) (no draws arg)
    • Run new mode: mp.draw(state, shuffle=True, draws=u_i)
    • Show per-agent match rates: iid vs default-shuffle ≈ 50%; iid vs
      new-mode > 99%
    • Show count exactness: both shuffle modes give exact quotas
  3. Markdown cell (new): Discussion / when to use which:
    • Use shuffle=True (default) when you only care about marginal
      aggregate (variance reduction with no per-agent identity needs).
    • Use shuffle=True, draws=u_i when downstream estimator compares
      per-agent trajectories across scenarios using shared draws (e.g.,
      CRN-coupled welfare integrals validated against iid).

Why both modify-and-add (not just modify)

The existing cells establish a CLAIM (single-agent equivalence + CRN
across scenarios). The new mode preserves both claims AND adds a
stronger property (per-agent identity preservation across shuffle vs iid).
The mathematical caveat needs to be inserted at the existing claims, but
the new mode and its demonstration deserve their own section because:

  • It introduces a new API surface (draws= argument).
  • It addresses a different use case (cross-method comparison /
    asymptotic equivalence to iid) than the existing two sections cover.
  • It enables HAFiscal-style welfare integrals to use shuffle without
    systematic bias.

Reference

This PR fixes the issue documented in HAFiscal's BUG-044 investigation:

  • conclusions_private/BUG-044_systematic_bias_confirmed.md
  • conclusions_private/math_to_code_map_for_bug044.md
  • BUGS_private/HAFiscal_BUG-044_*.md (TBD)

Empirical fix verification (HAFiscal welfare-6, HS-Only N=49,000, simplified model):

  • Code/HA-Models/FromPandemicCode/welfare6_BUG044_stratified_HS_seed0/ (= proposed fix)
  • Code/HA-Models/FromPandemicCode/welfare6_BUG044_simplified_nshuf/ (= per-agent iid reference)

llorracc added 3 commits May 12, 2026 13:12
The default `_draw_shuffled` mode (when sort_key=None) assigns specific
agents to targets via `sub_rng.permutation(agents_in_j)`, which is
independent of any per-agent draws.  This means a shuffled simulation
and an iid simulation that share the same per-agent random draws u_i
produce DIFFERENT per-agent target assignments.  For estimators that
integrate over per-agent trajectories with shared draws across
counterfactual scenarios (e.g., CRN-coupled welfare integrals), this
asymmetry creates an asymptotic bias relative to iid that does not
vanish with N.

This commit adds an opt-in `draws=` argument that uses rank-based
stratified inverse-CDF assignment instead of random permutation.  For
each source state j with N_j agents:

1. Sort agents by their per-agent draws u_i.
2. Compute exact quota counts K[j,k] = floor(N_j * P[j,k]) + leftover.
3. Assign sorted agents in order: first K[j,0] -> target 0, next
   K[j,1] -> target 1, etc.

This preserves all three properties:
- Quota-exact target counts (= variance reduction, original goal)
- Per-agent assignment determined by per-agent draw rank (= CRN)
- Asymptotic equivalence to iid via Glivenko-Cantelli (rank/N_j -> u_i
  as N_j -> infinity)

Strictly additive: existing `shuffle=True` calls without `draws=` are
bit-identical to before.  Existing `sort_key=` mode unchanged.  draws=
and sort_key= are mutually exclusive (raises ValueError if both given).

Empirical impact in HAFiscal welfare-6 (HS-Only N=49,000, simplified
model):
  Default shuffle: ui_rec +8.26%, ui_rec_AD +6.27% bias vs iid
  draws= shuffle: ui_rec -0.09%, ui_rec_AD +0.05% bias vs iid
  Per-agent Markov match (vs iid): 92% (default) -> 99.9% (draws=)

Bias under default mode is asymptotic (verified at N=49k vs N=245k:
ui_rec bias 9.48% -> 9.96%, no decrease with N).
Three new tests in MarkovProcessTests:

1. test_shuffle_with_draws_converges_to_iid: at N=10000 with a 2-state
   chain, the draws=-mode shuffle should match per-agent iid at >99%
   (Glivenko-Cantelli convergence).

2. test_shuffle_with_draws_preserves_quotas: per-source-state target
   counts must equal floor(N_j * P[j,k]) within ±1 (the leftover-slot
   tolerance), confirming the variance-reduction property is preserved
   under the new mode.

3. test_shuffle_draws_and_sort_key_mutually_exclusive: passing both
   draws= and sort_key= should raise ValueError, since they specify
   mutually exclusive assignment modes.
Companion notebook updates for the draws= argument added to
MarkovProcess._draw_shuffled.

Two existing cells get caveats:
- Cell 5 ("Individual-agent equivalence"): explain that the marginal-
  law equivalence holds within a single simulation but does NOT imply
  per-agent identity preservation across shuffle-vs-iid runs sharing
  the same per-agent draws.
- Cell 24 ("Cross-experiment CRN"): clarify that the CRN coupling
  described holds for two SHUFFLE simulations, not across shuffle
  vs iid runs.

One new section is added between cells 26 and 27 ("Per-agent CRN with
iid: the draws= argument"):
- Markdown: motivation, algorithm, Glivenko-Cantelli convergence proof
- Code: 30-line demo with two-state Markov chain at N=10000 showing
  per-agent match rates (iid vs default shuffle: 73.8% from
  sum-of-p^2 coincidence; iid vs draws= shuffle: 99.9%) and
  quota-exact target counts.
- Markdown: discussion of when to use which mode (default / sort_key
  / new draws=) and reaffirmation of backward-compat (existing
  shuffle=True calls without draws= are bit-identical to before).

Total cells: 36 -> 39.
@llorracc llorracc requested a review from mnwhite May 12, 2026 17:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant