feat(typologies): add structuring (smurfing) pattern generator by AllanSevilla05 · Pull Request #17 · SantanderAI/gen-fraud-graph

AllanSevilla05 · 2026-06-25T04:31:46Z

Summary

Adds StructuringGenerator, a second fraud typology implementing
BSA/FinCEN structuring (smurfing) patterns, where multiple smurf
accounts each send sub-$10,000 amounts to a single coordinator to
avoid Currency Transaction Report (CTR) filing.

The repo previously only generated cyclic ring patterns. Structuring
is the most commonly filed SAR typology and has a structurally
distinct graph signature (fan-in star vs cycle), which stresses
different subgraph detection algorithms.
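
To make the difference in graph signature concrete, here is a minimal sketch. It is not the StructuringGenerator from this PR; the helper names `make_cycle`, `make_fan_in`, and `in_degrees` and the account labels are hypothetical and exist only for illustration.

```python
from collections import Counter


def make_cycle(accounts):
    """Edges of a ring: each account pays the next, the last pays the first."""
    n = len(accounts)
    return [(accounts[i], accounts[(i + 1) % n]) for i in range(n)]


def make_fan_in(smurfs, coordinator):
    """Edges of a fan-in star: every smurf pays the same coordinator."""
    return [(smurf, coordinator) for smurf in smurfs]


def in_degrees(edges):
    """Count incoming edges per destination account."""
    return Counter(dst for _, dst in edges)


cycle_edges = make_cycle(["A", "B", "C", "D"])
star_edges = make_fan_in(["S1", "S2", "S3", "S4"], "COORD")

cycle_in = in_degrees(cycle_edges)  # every account receives exactly one edge
star_in = in_degrees(star_edges)    # one account receives every edge
print(cycle_in)
print(star_in)
```

In the ring every node has in-degree 1, while in the star the coordinator's in-degree equals the number of smurfs. A detector tuned to find closed walks would not flag the star, which is why the two typologies exercise different algorithms.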

Changes

- typologies.py: StructuringGenerator dataclass with
  STRUCTURING_DESCRIPTIONS (8 realistic smurfing descriptions)
- config.py: three new fields: num_structuring_patterns,
  structuring_smurfs_range (default 3–10 smurfs),
  structuring_amount_range (default $8,000–$9,900 sub-threshold)
- generator.py: wired into Phase 3; tx IDs chain from ring
  generator's next_tx_id to avoid collisions
- verify.py: dispatches on pattern_type so fan-in star
  patterns verify correctly alongside cycle rings
- tests/test_generator.py: 8 new tests in
  TestStructuringGenerator
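
The config fields and the ID chaining described above could look roughly like the sketch below. The three field names and the two range defaults come from this description; everything else is an assumption made for illustration, including the class name `StructuringConfigSketch`, the function `generate_structuring`, the default of 1 for `num_structuring_patterns`, and the dictionary layout of a transaction.

```python
import random
from dataclasses import dataclass


@dataclass
class StructuringConfigSketch:
    # Field names and range defaults follow the PR description.
    # The default of 1 pattern is an assumption, not the repo's value.
    num_structuring_patterns: int = 1
    structuring_smurfs_range: tuple = (3, 10)
    structuring_amount_range: tuple = (8_000.00, 9_900.00)


def generate_structuring(cfg, accounts, start_tx_id, rng):
    """Return (transactions, next_tx_id); IDs begin at start_tx_id."""
    txs = []
    tx_id = start_tx_id
    for _ in range(cfg.num_structuring_patterns):
        lo, hi = cfg.structuring_smurfs_range
        # Cap the smurf count so a small account pool still works.
        n_smurfs = min(rng.randint(lo, hi), len(accounts) - 1)
        members = rng.sample(accounts, n_smurfs + 1)
        coordinator, smurfs = members[0], members[1:]
        for smurf in smurfs:
            amount = round(rng.uniform(*cfg.structuring_amount_range), 2)
            txs.append(
                {"tx_id": tx_id, "src": smurf, "dst": coordinator, "amount": amount}
            )
            tx_id += 1
    return txs, tx_id


rng = random.Random(0)
accounts = [f"ACC{i}" for i in range(20)]
ring_next_tx_id = 500  # stand-in for the value returned by the ring generator
cfg = StructuringConfigSketch(num_structuring_patterns=2)
txs, next_id = generate_structuring(cfg, accounts, ring_next_tx_id, rng)
```

Passing the previous generator's next ID in and returning the new one keeps ID allocation explicit, so a later typology can chain from `next_id` in the same way.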

Test results

50 passed, 0 failed — coverage 97.93%

Domain note

Structuring thresholds modelled on 31 U.S.C. § 5324 and FinCEN
guidance. Default amount range ($8,000–$9,900) reflects common
real-world smurfing behaviour documented in SARs.
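
A quick arithmetic check on the defaults shows why the pattern matters: each transfer stays below the threshold, but the amount reaching the coordinator never does. The variable names below are illustrative only.

```python
CTR_THRESHOLD = 10_000.00  # reporting threshold referenced in this PR

smurfs_range = (3, 10)               # default smurfs per pattern
amount_range = (8_000.00, 9_900.00)  # default per-transfer amount range

# Smallest and largest total a coordinator can receive with the defaults.
min_total = smurfs_range[0] * amount_range[0]
max_total = smurfs_range[1] * amount_range[1]

every_transfer_below = amount_range[1] < CTR_THRESHOLD
aggregate_always_above = min_total > CTR_THRESHOLD
print(min_total, max_total, every_transfer_below, aggregate_always_above)
```

With the default ranges the coordinator receives between $24,000 and $99,000 per pattern, while no single transfer reaches $10,000.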

github-actions · 2026-06-25T04:31:55Z

All contributors have signed the CLA ✍️ ✅
Posted by the CLA Assistant Lite bot.

AllanSevilla05 · 2026-06-25T04:37:51Z

I have read the CLA Document and I hereby sign the CLA

Adds StructuringGenerator, a fan-in star typology where multiple smurf accounts each send sub-threshold amounts to a single coordinator, modelling BSA/FinCEN structuring (smurfing) patterns.

Changes:
- typologies.py: add StructuringGenerator and STRUCTURING_DESCRIPTIONS
- config.py: add num_structuring_patterns, structuring_smurfs_range, structuring_amount_range with auto-scaling defaults
- generator.py: wire StructuringGenerator into Phase 3 pipeline
- verify.py: dispatch verification on pattern_type so structuring fan-in patterns are validated correctly alongside cycle rings
- test_generator.py: 8 new tests in TestStructuringGenerator
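
Dispatching verification on pattern_type can be sketched as a lookup table of verifier functions. This is not the code in verify.py: the function names, the `"cycle"` and `"structuring"` type strings, and the pattern dictionary layout are all assumptions chosen for illustration.

```python
def verify_cycle(edges):
    """Valid if following the edges from the start visits every node once and returns."""
    nxt = dict(edges)
    if not edges or len(nxt) != len(edges):
        return False  # empty, or some source has two outgoing edges
    start = edges[0][0]
    node, seen = start, set()
    while node not in seen:
        seen.add(node)
        if node not in nxt:
            return False  # dead end: the walk cannot close
        node = nxt[node]
    return node == start and len(seen) == len(edges)


def verify_fan_in(edges):
    """Valid if all edges share one destination and the sources are distinct smurfs."""
    if not edges:
        return False
    dsts = {dst for _, dst in edges}
    srcs = [src for src, _ in edges]
    return len(dsts) == 1 and len(set(srcs)) == len(srcs) and dsts.isdisjoint(srcs)


# Hypothetical type strings; the repo's actual values may differ.
VERIFIERS = {"cycle": verify_cycle, "structuring": verify_fan_in}


def verify_pattern(pattern):
    """Pick the verifier that matches the pattern's declared type."""
    verifier = VERIFIERS.get(pattern["pattern_type"])
    if verifier is None:
        raise ValueError(f"unknown pattern_type: {pattern['pattern_type']!r}")
    return verifier(pattern["edges"])


ring = {"pattern_type": "cycle", "edges": [("A", "B"), ("B", "C"), ("C", "A")]}
star = {"pattern_type": "structuring", "edges": [("S1", "C"), ("S2", "C")]}
mislabeled = {"pattern_type": "cycle", "edges": [("S1", "C"), ("S2", "C")]}
print(verify_pattern(ring), verify_pattern(star), verify_pattern(mislabeled))
```

A table keeps the dispatch in one place, so adding a third typology means registering one more verifier rather than extending an if/elif chain.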

+
+
+
+    def test_transaction_count_matches_smurfs(self, tmp_dir):


+        )
+        assert n_tx == num_patterns * fixed_smurfs
+
+    def test_amounts_are_sub_threshold(self, tmp_dir):


+            amounts = [float(r["amount"]) for r in reader]
+        assert all(a < 10_000.00 for a in amounts), "Found amount >= CTR threshold"
+
+    def test_all_transactions_fan_into_coordinator(self, tmp_dir):


+                        f"src {src} not a registered smurf of coordinator {dst}"
+                    )
+
+    def test_tx_ids_do_not_collide_with_start(self, tmp_dir):


+        assert min(ids) == start
+        assert next_id == start + len(ids)
+
+    def test_neptune_format(self, tmp_dir):


+            headers = next(csv.reader(fh))
+        assert "embedding" not in headers
+
+    def test_tiny_account_pool(self, tmp_dir):


opensource-SantanderAI

Thanks @AllanSevilla05 — the structuring/smurfing typology itself is a great, well-grounded addition (fan-in star vs cycle, FinCEN/31 U.S.C. § 5324 thresholds), and we'd like to merge it. Two things need fixing first:

1. Lint (Lint & format & type-check is failing on ruff check .) — all auto-fixable:

src/gen_fraud_graph/generator.py:6 — import block unsorted (I001).
Trailing whitespace (W291): generator.py:19, typologies.py:31 (also a typo there: #Description specififc → # Description specific).
Please run ruff check --fix . && ruff format . (or black .) and push.

2. Duplicated test methods (the blocker we care most about). CodeQL flagged 6 "variable defined multiple times" alerts, and they're correct: every test in TestStructuringGenerator is defined twice, so the first copy of each is shadowed and never runs:

test_transaction_count_matches_smurfs — lines 265 and 371
test_amounts_are_sub_threshold — 282 and 388
test_all_transactions_fan_into_coordinator — 297 and 403
test_tx_ids_do_not_collide_with_start — 328 and 434
test_neptune_format — 344 and 450
test_tiny_account_pool — 360 and 466

This looks like an accidental copy/paste (or a bad merge). Please remove the duplicate definitions so each test exists once and actually executes, then re-confirm the pass/coverage numbers. If the two copies differ, keep the intended one.
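
For reference, the shadowing behaviour described above is easy to reproduce, and duplicates can be found with the standard `ast` module. The sketch below is illustrative only: `SOURCE`, `TestExample`, and `duplicate_methods` are hypothetical names, and this is not how CodeQL implements its check.

```python
import ast
from collections import Counter

SOURCE = '''
class TestExample:
    def test_a(self):
        assert False, "first copy: never runs"

    def test_b(self):
        pass

    def test_a(self):
        pass
'''


def duplicate_methods(source):
    """Return {class_name: [method names defined more than once]}."""
    dupes = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef):
            names = Counter(
                item.name
                for item in node.body
                if isinstance(item, (ast.FunctionDef, ast.AsyncFunctionDef))
            )
            repeated = sorted(name for name, count in names.items() if count > 1)
            if repeated:
                dupes[node.name] = repeated
    return dupes


found = duplicate_methods(SOURCE)
print(found)

# The later definition replaces the earlier one in the class namespace,
# so calling test_a does not hit the failing assert in the first copy.
namespace = {}
exec(SOURCE, namespace)
namespace["TestExample"]().test_a()
```

Because the class body is executed top to bottom, the second `def test_a` rebinds the name and the test runner only ever collects that copy.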

Once ruff is green and the duplicate tests are removed, ping us and we'll merge. Thanks!

AllanSevilla05 · 2026-06-29T17:31:32Z

@opensource-SantanderAI Edits have been made. Ruff check green and duplicates removed.

AllanSevilla05 requested review from a team as code owners June 25, 2026 04:31

github-actions Bot added a commit that referenced this pull request Jun 25, 2026

@AllanSevilla05 has signed the CLA in #17

7b8cca2

AllanSevilla05 force-pushed the feat/structuring-typology branch from 6740f55 to 67d2191 Compare June 25, 2026 04:54

Merge branch 'main' into feat/structuring-typology

14cc481

github-advanced-security AI found potential problems Jun 29, 2026

opensource-SantanderAI requested changes Jun 29, 2026

AllanSevilla05 and others added 2 commits June 29, 2026 13:27

fix(lint): resolve ruff trailing whitespace and formatting errors

18ae92e

Merge branch 'main' into feat/structuring-typology

c6f52a3

AllanSevilla05 wants to merge 4 commits into SantanderAI:main from AllanSevilla05:feat/structuring-typology
