[GH-ISSUE #863] Replace regex-based __VINEXT_CLASS stub patching in generateBundle #187

Open
opened 2026-05-06 12:37:59 +02:00 by BreizhHardware · 0 comments

Originally created by @NathanDrake2406 on GitHub (Apr 20, 2026).
Original GitHub issue: https://github.com/cloudflare/vinext/issues/863

Raised by @james-elicx on #842: https://github.com/cloudflare/vinext/pull/842#discussion_r3109733377

To be honest I'm not the biggest fan of this approach. It feels a bit brittle. It would be worth us finding another way to deal with this going forward.

What's in place today

packages/vinext/src/index.ts generateBundle patches the generated RSC entry by:

  1. Building a regex that matches the literal stub body: /function __VINEXT_CLASS\(routeIdx\)\s*\{\s*return null;?\s*\}/.
  2. Scanning every chunk for the stub name, asserting exactly one chunk carries the matching stub body.
  3. Replacing target.code via string replace with a switch-statement dispatch table.
  4. Nulling target.map because the patched body is longer than the stub.

#843 extends this same pattern by adding a second sibling stub __VINEXT_CLASS_REASONS behind VINEXT_DEBUG_CLASSIFICATION, gated by an independent env check. Each new build-time dispatch table doubles the regex surface.

Why it's brittle

  • Stub-body format is coupled to whatever the template emits verbatim. Any codegen shape change (trailing semicolon, whitespace, formatter behaviour, identifier rename, scope-hoisting rename by a future Rolldown version) silently breaks the regex and the plugin throws its drift error. The drift detection catches the break loudly, but each new dispatch function multiplies the number of places this coupling lives.
  • String replace on target.code bypasses Rolldown's module graph, its sourcemap machinery, and any downstream plugins that expect immutable chunk output after the final phase.
  • We already null the sourcemap unconditionally because we can't regenerate mappings for the injected region.
  • The approach doesn't compose: N dispatch tables means N regexes, N drift checks, N replace calls.

Lifecycle constraint (read this before picking an option)

The classification pipeline has two stages that land at different Rolldown lifecycle points:

  • Layer 1 (segment config) reads each layout file's source. Available from load onward.
  • Layer 2 (module graph BFS) requires this.getModuleInfo(moduleId) against the final module graph. Only complete in generateBundle / renderChunk.

Any option that tries to return the full Layer 1 + Layer 2 dispatch table from a virtual module's load hook will fail silently: the graph is not populated yet, so every layout looks "no evidence" and Layer 2 degrades to the runtime probe. This constraint rules out the naive "pure virtual module from load" design.

Options to explore

Listed least to most invasive. Option 1b is the current preferred target.

1a. Virtual-module injection, pure load (infeasible). Replace the stub in the source template with an import from virtual:vinext-layout-classification and have the load hook return structured dispatch-table code. This sounds right but breaks on the lifecycle constraint above. Noted so future readers do not re-discover it.

1b. Virtual-module injection, placeholder + renderChunk fill (recommended). Declare virtual:vinext-layout-classification with resolveId + load returning a typed stub such as export function getBuildTimeClassifications() { return null; } export function getBuildTimeReasons() { return null; }. In renderChunk, locate the chunk by its stable facadeModuleId (we assigned it, no regex scan across chunks) and replace the body with MagicString so sourcemaps stay correct. The RSC entry just imports the named functions. Key wins:

  • The drift surface shrinks from "the entire RSC entry chunk" to "one small module with a known schema that vinext owns end-to-end".
  • #843's __VINEXT_CLASS_REASONS becomes a sibling export from the same module, gated by the same build-time flag. One schema, multiple typed exports.
  • Tests split cleanly: unit-test codegen (given manifest + layer2, what JS does it produce?) and unit-test runtime contract (given the imported functions, does the probe loop behave correctly?). The integration test's vm.runInThisContext regex-extraction goes away.

2. renderChunk on the RSC entry with MagicString. Keep the current shape (stub lives in the RSC entry, patched by the plugin) but move from generateBundle string replace to renderChunk + MagicString. Preserves incremental sourcemaps and works with the bundler's abstractions. Still couples to the RSC entry's emitted text; a partial improvement, not a full fix.

3. Emit the dispatch table as a separate chunk via emitFile. Have generateBundle call this.emitFile({ type: 'chunk', id: 'virtual:vinext-layout-classification', ... }) and rely on the RSC entry's existing import to resolve to the emitted chunk. Removes the stub entirely but adds a chunk boundary, a runtime module import, and lifecycle ordering concerns between emitFile and the importer's finalization.

4. AST-based patch. Parse the chunk with @oxc/parser (already an indirect dependency through Vite+), locate the function declaration by name, replace its body node. Heavier but immune to whitespace/formatter drift. Useful only if 1b proves infeasible for some reason not yet identified.

Follow-ups affected

  • #843 adds the __VINEXT_CLASS_REASONS sibling. Option 1b lets it become a second named export of the same virtual module rather than a parallel stub + regex.
  • The layout-skip mechanism being reworked after #840/#841 (see #842 context section) will likely want its own build-time dispatch surface. Landing 1b before that lands avoids shipping a third parallel regex.

Non-goals

  • This is a refactor of the patch mechanism, not a change to the classification pipeline itself. Layer 1 (segment-config) and Layer 2 (module-graph) analysis stays where it is today.
  • The runtime probe loop in app-page-execution.ts does not need to change.
Originally created by @NathanDrake2406 on GitHub (Apr 20, 2026). Original GitHub issue: https://github.com/cloudflare/vinext/issues/863 Raised by @james-elicx on #842: https://github.com/cloudflare/vinext/pull/842#discussion_r3109733377 > To be honest I'm not the biggest fan of this approach. It feels a bit brittle. It would be worth us finding another way to deal with this going forward. ## What's in place today `packages/vinext/src/index.ts` `generateBundle` patches the generated RSC entry by: 1. Building a regex that matches the literal stub body: `/function __VINEXT_CLASS\(routeIdx\)\s*\{\s*return null;?\s*\}/`. 2. Scanning every chunk for the stub name, asserting exactly one chunk carries the matching stub body. 3. Replacing `target.code` via string replace with a switch-statement dispatch table. 4. Nulling `target.map` because the patched body is longer than the stub. #843 extends this same pattern by adding a second sibling stub `__VINEXT_CLASS_REASONS` behind `VINEXT_DEBUG_CLASSIFICATION`, gated by an independent env check. Each new build-time dispatch table doubles the regex surface. ## Why it's brittle - Stub-body format is coupled to whatever the template emits verbatim. Any codegen shape change (trailing semicolon, whitespace, formatter behaviour, identifier rename, scope-hoisting rename by a future Rolldown version) silently breaks the regex and the plugin throws its drift error. The drift detection catches the break loudly, but each new dispatch function multiplies the number of places this coupling lives. - String replace on `target.code` bypasses Rolldown's module graph, its sourcemap machinery, and any downstream plugins that expect immutable chunk output after the final phase. - We already null the sourcemap unconditionally because we can't regenerate mappings for the injected region. - The approach doesn't compose: N dispatch tables means N regexes, N drift checks, N replace calls. ## Lifecycle constraint (read this before picking an option) The classification pipeline has two stages that land at different Rolldown lifecycle points: - **Layer 1 (segment config)** reads each layout file's source. Available from `load` onward. - **Layer 2 (module graph BFS)** requires `this.getModuleInfo(moduleId)` against the final module graph. Only complete in `generateBundle` / `renderChunk`. Any option that tries to return the full Layer 1 + Layer 2 dispatch table from a virtual module's `load` hook will fail silently: the graph is not populated yet, so every layout looks "no evidence" and Layer 2 degrades to the runtime probe. This constraint rules out the naive "pure virtual module from `load`" design. ## Options to explore Listed least to most invasive. Option 1b is the current preferred target. **1a. Virtual-module injection, pure `load` (infeasible).** Replace the stub in the source template with an import from `virtual:vinext-layout-classification` and have the `load` hook return structured dispatch-table code. This sounds right but breaks on the lifecycle constraint above. Noted so future readers do not re-discover it. **1b. Virtual-module injection, placeholder + `renderChunk` fill (recommended).** Declare `virtual:vinext-layout-classification` with `resolveId` + `load` returning a typed stub such as `export function getBuildTimeClassifications() { return null; } export function getBuildTimeReasons() { return null; }`. In `renderChunk`, locate the chunk by its stable `facadeModuleId` (we assigned it, no regex scan across chunks) and replace the body with `MagicString` so sourcemaps stay correct. The RSC entry just imports the named functions. Key wins: - The drift surface shrinks from "the entire RSC entry chunk" to "one small module with a known schema that vinext owns end-to-end". - #843's `__VINEXT_CLASS_REASONS` becomes a sibling export from the same module, gated by the same build-time flag. One schema, multiple typed exports. - Tests split cleanly: unit-test codegen (given manifest + layer2, what JS does it produce?) and unit-test runtime contract (given the imported functions, does the probe loop behave correctly?). The integration test's `vm.runInThisContext` regex-extraction goes away. **2. `renderChunk` on the RSC entry with MagicString.** Keep the current shape (stub lives in the RSC entry, patched by the plugin) but move from `generateBundle` string replace to `renderChunk` + MagicString. Preserves incremental sourcemaps and works with the bundler's abstractions. Still couples to the RSC entry's emitted text; a partial improvement, not a full fix. **3. Emit the dispatch table as a separate chunk via `emitFile`.** Have `generateBundle` call `this.emitFile({ type: 'chunk', id: 'virtual:vinext-layout-classification', ... })` and rely on the RSC entry's existing import to resolve to the emitted chunk. Removes the stub entirely but adds a chunk boundary, a runtime module import, and lifecycle ordering concerns between `emitFile` and the importer's finalization. **4. AST-based patch.** Parse the chunk with `@oxc/parser` (already an indirect dependency through Vite+), locate the function declaration by name, replace its body node. Heavier but immune to whitespace/formatter drift. Useful only if 1b proves infeasible for some reason not yet identified. ## Follow-ups affected - #843 adds the `__VINEXT_CLASS_REASONS` sibling. Option 1b lets it become a second named export of the same virtual module rather than a parallel stub + regex. - The layout-skip mechanism being reworked after #840/#841 (see #842 context section) will likely want its own build-time dispatch surface. Landing 1b before that lands avoids shipping a third parallel regex. ## Non-goals - This is a refactor of the patch mechanism, not a change to the classification pipeline itself. Layer 1 (segment-config) and Layer 2 (module-graph) analysis stays where it is today. - The runtime probe loop in `app-page-execution.ts` does not need to change.
Sign in to join this conversation.
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
starred/vinext#187
No description provided.