Direction: native DrawingML diagrams as an opt-in, bring-your-own-licensed reuse tool (not core) #167

@hugohe3 — per your note on #157 ("let's open an issue to discuss [the native diagram] direction first") and your #156 review, this is the dedicated thread for the native-diagram track, kept separate from the self-evolution / visual thread (#163). I'll follow exactly the four-step order you laid out in #156, and end with a single, self-contained PR you can run to test the real effect yourself. The five stacked PRs (#156, #158, #159, #160, #161) stay held until we've agreed the direction here.

## TL;DR — the shape

Native DrawingML diagrams are an **opt-in, experimental tool for one narrow niche**: keeping a complex *produced* figure (3D / gradient / skeuomorphic) **editable and brand-recolorable** when the user already owns the source. The repo ships **only the mechanism + one CC0/synthetic demo component** — never a lifted library, never wired into the core flow. For the large majority of figures, hand-authored SVG stays the right tool, and I'll say so explicitly below.

---

## 1. What the existing modes already cover — and the one cell DrawingML adds

PPT Master already has several visual-production modes. The honest question isn't "is SVG bad" (it isn't) — it's "is there a cell in this matrix that none of them fills?"

| Figure type / requirement | Hand-authored SVG | AI image | Chart | **Native DrawingML** |
|---|---|---|---|---|
| Flat / structural (flow, columns, matrix) | ✅ already good | — | — | not needed |
| Data visualization | ok | — | ✅ | — |
| Complex **produced** figure (3D isometric, glossy gradient, skeuomorphic) | ⚠️ doable but **very expensive to hand-author, low quality ceiling** | ✅ looks good **but raster** | — | ✅ |
| Stays **editable + precisely brand-recolorable + crisp vector** | ✅ | ❌ raster: not editable, can't recolor to brand exactly, text is baked in | partial | ✅ |

*(Formulas — `latex_render` — and the icon library are separate concerns: neither is a complex produced figure, so they're out of scope for this comparison.)*

**The narrow conclusion (and I want it to be self-limiting):** you're right that for *most* structural figures hand-authored SVG is already good enough. DrawingML earns its keep in exactly one cell — a *produced-quality* complex figure that must also stay *editable and brand-recolorable*. AI images give the look but are raster and dead to editing/recoloring; SVG stays editable but the production cost for that class of figure is high and the quality ceiling low. DrawingML is the only mode that is **both** — *conditioned on the user already owning the source figure.*

**Where it should NOT be used (drawing the boundary myself):** flat structural diagrams, data charts, anything SVG already does cleanly. This is deliberately not a general-purpose path.

The decisive contrast to *show*, not assert, is **DrawingML vs AI image** on the same complex figure: the AI version looks right but can't be recolored to the deck's brand or edited; the DrawingML version recolors to brand in one step and stays editable. That's the demo in §4.

### Applicability — the dividing line we found in practice

Not theory — this is what running the *same content with and without native* across three scenarios (government / tech / project), then iterating on the failures, actually taught us:

**Reach for native when all three hold:**
- the figure is a **complex *produced* structure** — multi-layer isometric, gradient / 3D-shaded, skeuomorphic — the class that is expensive to hand-author in SVG and whose hand-SVG quality ceiling is low;
- a component's **structure genuinely matches** the content (a 3-tier platform ↔ a 3-platform component) **and** the content fits the slots' length / role (short labels for short-label slots);
- you **own a license-clean source** for it.

**Stay with hand-authored SVG when:**
- the figure is **simple / flat** — a basic pyramid, funnel, flow, or card row — where hand-SVG comes out clean and complete (in our tests it matched or beat native there);
- the page is **content-dense** (long descriptions, mixed data) that won't condense into a component's short slots without losing substance;
- nothing in the library matches, or there is no license-clean source.

**Net:** native is a **narrow, opt-in enhancement for complex produced figures you already own** — not a general-purpose diagram path. The text-fusion friction is real but fixable (the three fixes in §4); this applicability line is the durable takeaway, and it's *earned from practice*, not asserted.

---

## 2. Licensing — dissolved by design, not negotiated

Point 2 of your #156 review is the precondition, and I think the design removes the exposure rather than arguing around it:

- **The repo redistributes nothing third-party.** It ships **only the mechanism** (extract a slide's editable DrawingML → component; inject + recolor / text-fill / font-unify) **plus one CC0 or synthetic demo component.** No lifted library ships, by default or otherwise.
- **Each user grows their *own* local library** from sources they have the rights to — their own decks, their org's, or properly-licensed / CC0 open material. (To be precise: "open" ≠ unrestricted — CC-BY-NC, attribution terms, etc. exist; the **import-side responsibility sits with the user**, and the docs frame the tool as *bring-your-own-licensed*. The tool never bundles or scrapes arbitrary decks.)
- **We never ship non-commercial material**, and the default workflow never *encourages* lifting — it requires the user to point at a source they own.
- **What's extracted is a single-page component, not a whole deck** — a building block the user recolors into something of their own, not a verbatim asset reused as-is.

Net: what we provide is a **generative reuse mechanism** — the *feasibility* for each user to reuse from their **own** successful material — not a redistributed asset pile. That's the difference from "lifting a library into a widely-used repo."

---

## 3. Positioning — opt-in / experimental, not core

- Behind an **explicit opt-in path** — concretely, a standalone `workflows/native-diagram.md` + its own CLI, **never invoked by any SKILL.md core step** — documented as experimental; **no core-pipeline wiring** until the "where SVG falls short" cases are agreed with you.
- **On recurring value + the slot-mapping friction (your #3):** you're right that "pull a component, then hand-map slots from `meta.json`" sits close to dropping in a diagram and editing it. The value over manual editing is the **one-shot automation**: theme-flatten so a single base-hex remap re-derives every gradient/shade/tint, plus text-fill and font unification, applied across the whole component in *one* step — versus hand-recoloring every shape. The demo shows this one-shot brand recolor. Honest boundary: for a simple figure this isn't worth it; for a *produced* figure you own, one-shot recolor still beats redrawing or manually recoloring it. Hence: narrow, opt-in. (This niche is the *materials* layer of #163; the *memory* layer proposed there is exactly what would smooth the remaining hand-mapping over repeated use — the two tracks are one idea split into two threads at your request.)
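The one-shot recolor described above can be illustrated with a minimal sketch. It assumes the flattened component records each fill as one base color plus DrawingML-style luminance modifiers (`lumMod` / `lumOff`, applied to lightness in HLS space). The function names and the `VARIANTS` layout are hypothetical and chosen for illustration; this is not the PR's actual code, only a demonstration that swapping a single base hex can re-derive every dependent shade.

```python
import colorsys


def hex_to_rgb(hex_color: str) -> tuple:
    """Convert '#RRGGBB' to an (r, g, b) tuple of floats in 0..1."""
    h = hex_color.lstrip("#")
    return tuple(int(h[i:i + 2], 16) / 255 for i in (0, 2, 4))


def rgb_to_hex(rgb) -> str:
    """Convert floats in 0..1 back to '#RRGGBB', clamping out-of-range values."""
    return "#" + "".join(
        f"{round(max(0.0, min(1.0, c)) * 255):02X}" for c in rgb
    )


def derive(base_hex: str, lum_mod: float = 1.0, lum_off: float = 0.0) -> str:
    """Apply a lumMod/lumOff-style change: L' = L * lum_mod + lum_off."""
    h, l, s = colorsys.rgb_to_hls(*hex_to_rgb(base_hex))
    l = max(0.0, min(1.0, l * lum_mod + lum_off))
    return rgb_to_hex(colorsys.hls_to_rgb(h, l, s))


def recolor(variants: dict, new_base: str) -> dict:
    """Re-derive every named variant from one new base color."""
    return {name: derive(new_base, m, o) for name, (m, o) in variants.items()}


# Hypothetical variants of one isometric block: (lum_mod, lum_off) per face.
VARIANTS = {"face": (1.0, 0.0), "shadow": (0.6, 0.0), "highlight": (0.4, 0.6)}
palette = recolor(VARIANTS, "#1F3A5F")  # one base hex in, full palette out
```

Calling `recolor(VARIANTS, "<another brand hex>")` would regenerate all three faces at once, which is the contrast with hand-recoloring each shape.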

---

## 4. The minimal, verifiable PR (the evidence — same artifact as §1)

### Evidence — a fair test already run (3 scenarios)

To avoid a cherry-picked demo, I ran the same content **twice** per scenario: once where the model only hand-authors SVG, once where the native library is *available and the model decides for itself* whether any page's content warrants a component. Three auto-branded scenarios (no lifted/branded assets in the deck — each got its own palette), three pages each:

| Scenario (auto-brand) | Where the model chose native | Where it declined → hand-SVG |
|---|---|---|
| Government report (navy + gold) | **only** the platform-architecture page → a 4-layer isometric stack, recolored to navy | cover, KPI-comparison page |
| Tech launch (navy + cyan) | **only** the system-architecture page → a 3-layer platform with card-rows, recolored to cyan | cover, scenario-data page |
| Project report (sea-blue + amber) | **only** the platform-architecture page → a 3-layer platform, recolored to sea-blue | cover, Q2/Q3 page |

<img width="1320" height="520" alt="Image" src="https://github.com/user-attachments/assets/cf1a2ce3-a522-43cb-b3c6-9e888ea36771" />

- **It isn't rigged.** Given a free choice, the model reached for native on **exactly one page per deck — the architecture figure — and declined on covers and data pages**, judging no genuine fit. That self-limiting behavior is the §1 boundary *observed*, not asserted.
- **But the first native pages broke on text.** On close inspection the native architecture pages leaked the component's *original* source-deck text and overflowed — the produced 3D shape was right, the text fusion was wrong. Rather than hide it, I root-caused it to three concrete, **fixable** causes:
  1. **`data-text` fails silently on the wrong JSON form.** The resolver only accepts an object `{"<id>":"text"}`; given an array it *silently keeps the original text*, and the model can't tell (it sees the attribute it wrote, not the render). → fix: accept both forms **and warn when a `data-text` matches zero slots**, so a silent no-op becomes impossible.
  2. **Length budgets were char-count, not CJK visual width.** A slot sized for a 4-char Latin label overflows with 4 Chinese characters (~2× width). → fix: budget in CJK-visual-width.
  3. **Library metadata was thin/wrong** (one component tagged `cycle` was actually a radiate-and-columns layout). → fix: per-slot `slot_spec` (role / length-budget / style) + corrected structure, so the model selects and maps reliably.
- **With those fixed, the complex page came out clean** (before/after attached): a 3-tier cloud-edge-device platform, the deck's *own* content on every slot within budget, **recolored to brand in one line**, carrying isometric gradient-shaded depth — a *produced* quality hand-authored SVG can't reach cheaply. The very same page *before* the fix was 100% leaked original text.

<img width="1320" height="520" alt="Image" src="https://github.com/user-attachments/assets/f2e85d63-9573-4361-af5c-ef6dd8ee6a70" />

- **The honest dividing line:** for **simple** figures (a 5-tier pyramid, a 4-stage funnel) hand-SVG comes out clean and complete — native isn't worth its friction there. Native earns its keep on **complex produced structures** (multi-layer isometric, gradient-heavy) where the depth is real and hand-SVG's ceiling is low. The text-fusion friction is **not fundamental — it's the three fixable issues above.**
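The first two text-fusion fixes listed above can be sketched roughly as follows. This is an illustration under assumptions, not the resolver in the PR: the function names, the `max_width` key inside `slot_spec`, and the choice to map an array onto slots *positionally* are all hypothetical. The width measure uses the standard library's `unicodedata.east_asian_width`, counting wide and fullwidth characters as two columns.

```python
import unicodedata
import warnings


def visual_width(text: str) -> int:
    """Display columns: wide/fullwidth (CJK) characters count as 2, others as 1."""
    return sum(
        2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1 for ch in text
    )


def normalize_data_text(data_text, slot_ids: list) -> dict:
    """Accept {"id": "text"} or a list; return an id -> text mapping.

    Assumption: a list is mapped onto the slots in their declared order.
    """
    if isinstance(data_text, dict):
        return {str(k): str(v) for k, v in data_text.items()}
    if isinstance(data_text, (list, tuple)):
        return {sid: str(t) for sid, t in zip(slot_ids, data_text)}
    raise TypeError(f"unsupported data-text form: {type(data_text).__name__}")


def fill_slots(slot_spec: dict, data_text) -> dict:
    """Return the texts that matched a slot, warning on no-ops and overflows."""
    mapping = normalize_data_text(data_text, list(slot_spec))
    matched = {sid: txt for sid, txt in mapping.items() if sid in slot_spec}
    if not matched:
        # The previously silent failure: nothing matched, so old text would remain.
        warnings.warn("data-text matched zero slots; original text would be kept")
    for sid, txt in matched.items():
        budget = slot_spec[sid]["max_width"]  # hypothetical budget, in columns
        width = visual_width(txt)
        if width > budget:
            warnings.warn(f"slot {sid!r}: width {width} exceeds budget {budget}")
    return matched
```

With a budget of 4 columns, a 4-letter Latin label fits while 4 Chinese characters measure 8 columns and trigger the overflow warning, which is the mismatch the char-count budget missed.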

### The PR itself

A **single self-contained PR**, off `main`, not the five stacked ones — **PR: #168**

- **Mechanism + the three robustness fixes** — the machinery you reviewed as sound in #156 (theme-flatten, byte-exact splice, foreign-rel stripping): extract → inject → recolor — **plus** `data-text` array-tolerance + zero-match warning, CJK-width budgets, and the `slot_spec` metadata that makes matching reliable.
- **One synthetic / CC0 demo component that ships in the repo**: `demo_synthetic_platform`, an original 3-layer capability platform (application / capability / foundation, 5 cards per tier) authored from plain DrawingML, with **no vendor or client source**. The evidence above used *my own licensed* components, which stay **local / bring-your-own** (the repo redistributes nothing); the shipped demo is synthetic so that you can run it license-clean.
- **One reproduce command** that renders the with/without comparison, so you can test the real effect rather than read a claim:

  ```bash
  py -3.11 skills/ppt-master/scripts/demo_native_diagram.py -o native_diagram_demo_out
  ```

  It builds a two-slide before/after PPTX through the real `finalize_svg` + `svg_to_pptx` pipeline: **slide 1** places the component via a `data-native-diagram` placeholder — recolored to a sample brand and re-texted onto a new scenario in one step; **slide 2** is the same content hand-drawn as flat SVG (the "without" baseline). Open `native_diagram_demo_out/exports/*.pptx`. (Requires `py -3.11` with `python-pptx` + `lxml`.)
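To check the exported deck without opening PowerPoint, a small standard-library sketch like the one below could count native shapes versus pictures per slide. It relies only on the usual OOXML package layout (`ppt/slides/slideN.xml`) and the PresentationML namespace; `shape_census` is a hypothetical helper name and is not part of the demo script.

```python
import re
import zipfile
import xml.etree.ElementTree as ET

# PresentationML namespace used by <p:sp> (shape) and <p:pic> (picture) elements.
P_NS = "http://schemas.openxmlformats.org/presentationml/2006/main"


def shape_census(pptx_path: str) -> dict:
    """Return {slide part name: {"shapes": n, "pictures": m}} in slide order."""
    census = {}
    with zipfile.ZipFile(pptx_path) as zf:
        slides = sorted(
            (n for n in zf.namelist()
             if re.fullmatch(r"ppt/slides/slide\d+\.xml", n)),
            # Sort numerically so slide10 comes after slide2.
            key=lambda n: int(re.search(r"\d+", n.rsplit("/", 1)[1]).group()),
        )
        for name in slides:
            root = ET.fromstring(zf.read(name))
            census[name] = {
                # iter() also finds shapes nested inside group shapes.
                "shapes": len(list(root.iter(f"{{{P_NS}}}sp"))),
                "pictures": len(list(root.iter(f"{{{P_NS}}}pic"))),
            }
    return census
```

Pointing `shape_census` at a file under `native_diagram_demo_out/exports/` should show whether a slide is made of editable `p:sp` shapes or consists mostly of `p:pic` rasters; it says nothing about visual quality.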




<img width="1280" height="720" alt="Image" src="https://github.com/user-attachments/assets/4daa5887-be06-466b-9495-5c1daf3271db" />

<img width="1280" height="720" alt="Image" src="https://github.com/user-attachments/assets/b5d0ec23-7537-43b1-9992-20cdf6bda131" />

---

## What happens to the five stacked PRs

Treating them as one direction, as you asked:

- **#156 core + #159 engine + #158 text-font (recolor / text-fill / font)** → collapse into the single minimal opt-in PR above. This is the only code I'd propose landing if the direction holds — the smallest end-to-end that proves the value, as one PR rather than five.
- **#160 workflow + #161 catalog (batch scan / gallery)** → stay held; they're the "ecosystem" parts that only make sense *after* the direction and the opt-in boundary are agreed, and they'd return one focused PR at a time, not as a batch.
- **The extracted component library / assets** → **never ships in the repo.** Bring-your-own-licensed, local-only — so there is nothing on our side to license-clean.

---

## What I'm asking

Let's align on this *shape* — opt-in, bring-your-own-licensed, mechanism-only, one narrow niche, not core. To make the decision concrete:

- **If the demo convinces you SVG genuinely falls short for that niche** → I open the single minimal opt-in PR (#156 + #159 + #158 collapsed), behind the standalone workflow, nothing touching core.
- **If it doesn't** → that's a fine answer, not a negotiation: I drop the core ambition entirely, the path stays a personal external tool, and we close the five stacked PRs.

Either way the demo PR is there for you to verify necessity directly, and #163 stays the separate self-evolution thread. Your call on whether this lives as its own issue or a comment on #156.

Thanks again for the depth of the #156 review — the OOXML feedback especially. 🙏
