[PR #387] [MERGED] perf: cache compiled regex patterns in matchConfigPattern #532

Closed
opened 2026-05-06 13:08:36 +02:00 by BreizhHardware · 0 comments

📋 Pull Request Information

Original PR: https://github.com/cloudflare/vinext/pull/387
Author: @james-elicx
Created: 3/9/2026
Status: Merged
Merged: 3/9/2026
Merged by: @james-elicx

Base: main ← Head: perf/cache-compiled-config-patterns


📝 Commits (4)

  • f376632 perf: cache compiled regex patterns in matchConfigPattern
  • 4651974 fmt
  • ff5971a perf: cache compiled regexes in matchHeaders and checkSingleCondition
  • d26652b fmt

📊 Changes

2 files changed (+292 additions, -41 deletions)

📝 packages/vinext/src/config/config-matchers.ts (+113 -41)
📝 tests/shims.test.ts (+179 -0)

📄 Description

Problem

Profiling identified per-request regex recompilation across three functions in `config-matchers.ts`:

1. `matchConfigPattern` — 2.4 s self-time, 53% of CPU (dominant bottleneck)

_handleRequest
  └─ matchRedirect              (iterates all redirect rules)
       └─ matchConfigPattern    (per rule)
            └─ safeRegExp       (tokeniser walk + ReDoS scan)
                 └─ isSafeRegex ← 2.4 s SELF TIME

Every rule whose `source` contains `(`, `\`, or certain `:param` suffix patterns enters the regex branch. Each such rule then costs three passes on every request: a full tokeniser walk, an `isSafeRegex` scan, and a `new RegExp()` call. Locale-prefixed rules like `/:locale(en|es|fr|...)?/security` all contain `(`, so the cost is 88 rules × 3 passes per request.

2. `matchHeaders` — same pattern

`escapeHeaderSource(rule.source)` + `safeRegExp()` called for every header rule on every request. Both are pure functions of `rule.source`.

3. `checkSingleCondition` — same pattern

`safeRegExp(condition.value)` called on every request for every `has`/`missing` condition (header, cookie, query, host branches). `condition.value` comes from `next.config.js` and never changes.

All three functions receive data that is static — sourced from `next.config.js`, resolved once at Vite plugin init, and frozen into the bundle.

Fix

Three module-level `Map` caches, one per call site:

| Cache | Key | Value |
|---|---|---|
| `_compiledPatternCache` | `pattern` string | `{ re, paramNames }` or `null` |
| `_compiledHeaderSourceCache` | `rule.source` string | `RegExp` or `null` |
| `_compiledConditionCache` | `condition.value` string | `RegExp` or `null` |

A private `_cachedConditionRegex(value)` helper centralises the lookup for the four `checkSingleCondition` branches. `null` is stored for patterns rejected by `safeRegExp` so `isSafeRegex` also runs at most once per pattern.
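
As a rough illustration of the null-sentinel caching described above, here is a minimal sketch. It is not the code from this PR. `safeRegExpSketch`, `conditionCache`, `cachedConditionRegex`, and `compileCount` are hypothetical names, and the rejection rule is a toy stand-in for the real `isSafeRegex` scan, whose logic is not reproduced here.

```typescript
// Counts how often the (expensive) compile path runs; illustration only.
let compileCount = 0;

// Hypothetical stand-in for safeRegExp: returns null when a pattern is rejected.
function safeRegExpSketch(source: string): RegExp | null {
  compileCount++;
  // Toy rejection rule (an assumption, not the real ReDoS scan):
  // refuse a quantified group that is itself quantified, e.g. (a+)+
  if (/\([^)]*[+*]\)[+*]/.test(source)) return null;
  try {
    return new RegExp(source);
  } catch {
    // Syntactically invalid patterns are treated as rejected too.
    return null;
  }
}

// Module-level cache: key is the raw pattern string, value is the compiled
// regex, or null for a pattern that was rejected.
const conditionCache = new Map<string, RegExp | null>();

function cachedConditionRegex(value: string): RegExp | null {
  // Map.get returns undefined on a miss, which stays distinct from a cached null.
  const hit = conditionCache.get(value);
  if (hit !== undefined) return hit;
  const compiled = safeRegExpSketch(value);
  conditionCache.set(value, compiled);
  return compiled;
}
```

Because a miss is detected with `undefined` rather than a falsy check, a rejected pattern is looked up from the cache on later calls instead of being rescanned.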

All changes are contained in `packages/vinext/src/config/config-matchers.ts`. No changes to generated entries or the middleware code-gen path.

Tests

Six new `describe` blocks in `tests/shims.test.ts` covering:

  • `matchConfigPattern` cache hit path (locale-group pattern called multiple times)
  • `matchConfigPattern` cached-null path (ReDoS-rejected pattern)
  • `matchHeaders` cache hit path (plain `:param` source and regex-bearing source)
  • `checkHasConditions` cache hit path (same value pattern across repeated calls)
  • `checkHasConditions` cache shared across all four condition types (header/cookie/query/host)

All 48 tests in the affected suites pass.

Follow-up

Issue #389 tracks moving compilation fully to build time (emitting regex literals into generated virtual modules), which would eliminate even the first-request compile cost per isolate. The module-level cache is sufficient for warm isolates — Workers reuses isolates aggressively for popular routes.
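
To make the build-time idea more concrete, the sketch below shows one way regex literals could be written into generated module text. This is only an assumption about what such a step might look like, not the design tracked in #389. `emitRegexModule` and `compiledPatterns` are hypothetical names, and the sketch skips the safety scan that a real implementation would presumably still run before emitting anything.

```typescript
// Hypothetical build-time helper: turns pattern sources into the text of a
// generated module that holds regex literals instead of strings.
function emitRegexModule(sources: string[]): string {
  const entries = sources.map((source) => {
    // RegExp.prototype.toString() yields `/source/flags` with `/` escaped,
    // so the result can be pasted into generated code as a literal.
    const literal = new RegExp(source).toString();
    return `  [${JSON.stringify(source)}, ${literal}],`;
  });
  return [
    "export const compiledPatterns = new Map([",
    ...entries,
    "]);",
  ].join("\n");
}
```

With literals in the generated module, the engine parses each pattern when the module loads, so no request would pay for a tokeniser walk or safety scan.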


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.
