15-EFT.WP.Methods.Falsification v1.0 | Chapter 13: Change Management & Regression Defense

Chapter 13: Change Management & Regression Defense

I. Scope & Objectives

This chapter specifies semantic versioning, channel-based release, and rollback playbooks, and establishes a regression-defense spine composed of the non-regression gate, double-run comparison, and drift gating. The scope covers any change to data, model, code, configuration, or EnvLock, and any change to the dependency graph signature Graph.sig.
Objective: along the canary → stable → LTS channel path, ensure that every GateDecision ∈ {pass, hold, block} is reached with minimal blast radius and auditable traceability, and that it satisfies falsification repeatability and the compliance templates (Chapter 10).

II. Terms & Symbols

Versions & channels
ver = MAJOR.MINOR.PATCH, channel ∈ {canary, stable, LTS}, baseline, candidate.
Change classes: chg.code, chg.data, chg.config, chg.env; impact blast_frac ∈ (0,1].
Regression & consistency
Metrics: score_base, score_cand; delta_baseline = ( score_cand - score_base ); non-regression margin tau_nonreg.
Double-run equivalence: eq_rate = ( 1/N ) * Σ_i 1[ y_cand_i == y_base_i ]; distributional drift: D_KS, MMD.
Online consistency: delta_offon, R_infer = 1 - delta_offon; SLO: TS.latency, TS.error.
Risk & budgets
alpha/beta, power (Chapter 7), alpha-spending (Chapter 7), risk_budget.change, rollback_window, rollback_trigger.
Signatures & locking
EnvLock, Graph.sig, ParamCard.sig, InferPipelineCard.sig, DiffCard, CHANGELOG.

III. Postulates & Minimal Equations

P51-71 (Immutable baseline postulate)
Once a baseline is released and its EnvLock / Graph.sig / ParamCard.sig are registered, all regression decisions reference that frozen baseline only.
P51-72 (Channel conservation postulate)
Any candidate that fails the non-regression, double-run equivalence, or drift gates must not be promoted from canary to a more stable channel.
S52-71 (Non-regression gate)
With delta_baseline = ( score_cand - score_base ):
Pass if: delta_baseline ≥ - tau_nonreg;
Block if: P( delta_baseline < - tau_nonreg | D ) ≥ alpha.
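The gate S52-71 can be sketched in Python as follows. This is a minimal illustration, not the reference implementation: the posterior P( delta_baseline < - tau_nonreg | D ) is approximated by a normal distribution centered on the observed delta with a caller-supplied standard error se_delta (a hypothetical parameter), the pass rule is checked before the block rule (the source does not state which wins when both hold), and "hold" is assumed to be the residual outcome.

```python
from statistics import NormalDist


def nonregression_gate(score_base, score_cand, se_delta, tau_nonreg, alpha):
    """Return 'pass', 'hold', or 'block' for the non-regression gate.

    se_delta (> 0) is an assumed standard error of delta_baseline.
    """
    delta_baseline = score_cand - score_base
    # Pass rule: the observed delta stays within the non-regression margin.
    if delta_baseline >= -tau_nonreg:
        return "pass"
    # Assumption: delta | D ~ Normal(delta_baseline, se_delta^2).
    p_regress = NormalDist(mu=delta_baseline, sigma=se_delta).cdf(-tau_nonreg)
    # Block rule: probability of a regression beyond the margin reaches alpha.
    if p_regress >= alpha:
        return "block"
    # Assumption: anything that neither passes nor blocks is held for review.
    return "hold"


if __name__ == "__main__":
    print(nonregression_gate(0.90, 0.91, 0.01, 0.02, 0.9))   # within margin
    print(nonregression_gate(0.90, 0.875, 0.01, 0.02, 0.9))  # weak evidence
    print(nonregression_gate(0.90, 0.80, 0.01, 0.02, 0.9))   # strong evidence
```

Under a symmetric approximation like this one, a delta below the margin always has a regression probability above 0.5, so the "hold" outcome only appears when alpha is set above 0.5.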
S52-72 (Double-run equivalence gate)
Equivalence rate
eq_rate = ( 1/N ) * Σ_i 1[ y_cand_i == y_base_i ],
or distance
dist = ( 1/N ) * Σ_i L( y_cand_i , y_base_i ).
Gate: ( eq_rate ≥ tau_eq or dist ≤ tau_dist ) and power ≥ power_min.
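A short Python sketch of the two double-run statistics and the gate condition follows. The function names and the default absolute-difference loss are illustrative assumptions; the source only fixes the formulas for eq_rate and dist.

```python
def dual_run_metrics(y_cand, y_base, loss=lambda c, b: abs(c - b)):
    """Compute eq_rate and dist over paired candidate/baseline outputs."""
    if len(y_cand) != len(y_base) or len(y_cand) == 0:
        raise ValueError("outputs must be paired and non-empty")
    n = len(y_cand)
    # eq_rate: fraction of pairs with exactly equal outputs.
    eq_rate = sum(c == b for c, b in zip(y_cand, y_base)) / n
    # dist: mean per-pair loss L(y_cand_i, y_base_i).
    dist = sum(loss(c, b) for c, b in zip(y_cand, y_base)) / n
    return eq_rate, dist


def equivalence_gate(eq_rate, dist, power, tau_eq, tau_dist, power_min):
    """Gate: (eq_rate >= tau_eq or dist <= tau_dist) and power >= power_min."""
    return (eq_rate >= tau_eq or dist <= tau_dist) and power >= power_min


if __name__ == "__main__":
    eq_rate, dist = dual_run_metrics([1, 2, 3, 4], [1, 2, 3, 5])
    print(eq_rate, dist)
    print(equivalence_gate(eq_rate, dist, power=0.9,
                           tau_eq=0.99, tau_dist=0.5, power_min=0.8))
```

Exact equality suits discrete outputs; for floating-point outputs the distance form with a tolerance tau_dist is usually the more meaningful check.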
S52-73 (Drift gating)
For a feature-distribution drift statistic D (D_KS or MMD) with rejection region C_alpha,
require P( D ∈ C_alpha | H0: no_drift ) ≤ alpha_drift. If H0 is rejected and impact(score) is negative, hold or block the candidate.
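As an illustration of the drift gate with D_KS, the sketch below computes the two-sample Kolmogorov-Smirnov statistic for one scalar feature and an approximate p-value. The p-value uses the standard asymptotic Kolmogorov series with a small-sample correction, which is an approximation and not an exact test; the function names and the single-feature restriction are assumptions of this sketch.

```python
import math
from bisect import bisect_right


def ks_statistic(x_base, x_cand):
    """Two-sample KS statistic: largest gap between the empirical CDFs."""
    a, b = sorted(x_base), sorted(x_cand)
    d = 0.0
    for v in a + b:
        gap = abs(bisect_right(a, v) / len(a) - bisect_right(b, v) / len(b))
        d = max(d, gap)
    return d


def ks_p_value(d, n, m, terms=100):
    """Asymptotic p-value for the two-sample KS statistic (approximation)."""
    ne = n * m / (n + m)
    lam = (math.sqrt(ne) + 0.12 + 0.11 / math.sqrt(ne)) * d
    if lam < 0.3:
        return 1.0  # the series converges slowly here; the true value is ~1
    s = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
            for k in range(1, terms + 1))
    return max(0.0, min(1.0, 2.0 * s))


def drift_guard(x_base, x_cand, alpha_drift):
    """Reject H0: no_drift when the approximate p-value falls below alpha_drift."""
    d = ks_statistic(x_base, x_cand)
    p = ks_p_value(d, len(x_base), len(x_cand))
    return {"reject": p < alpha_drift, "p_value": p, "D_KS": d}


def drift_gate_ok(result, impact_score):
    """Hold or block only when drift is detected AND the score impact is negative."""
    return not (result["reject"] and impact_score < 0)


if __name__ == "__main__":
    base = [float(v) for v in range(100)]
    shifted = [v + 100.0 for v in base]
    print(drift_guard(base, base, 0.01))
    print(drift_guard(base, shifted, 0.01))
```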
S52-74 (Blast radius & progressive exposure)
With a monotonically non-decreasing exposure curve blast_frac(t) such that blast_frac(0) = f0 and lim_{t→∞} blast_frac(t) = 1, enforce the risk constraint E[ loss(t) ] ≤ risk_budget.change.
Example: blast_frac(t) = min( 1 , f0 * r^t ) with growth factor r > 1.
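The example curve can be written directly in Python. This sketch assumes discrete rollout steps t = 0, 1, 2, ... and a growth factor r > 1; the helper names are hypothetical, and the risk constraint E[ loss(t) ] ≤ risk_budget.change is not modeled here.

```python
import math


def make_blast_frac(f0, r):
    """Build blast_frac(t) = min(1, f0 * r**t) for 0 < f0 <= 1 and r > 1."""
    if not 0.0 < f0 <= 1.0:
        raise ValueError("f0 must lie in (0, 1]")
    if r <= 1.0:
        raise ValueError("r must exceed 1 so that exposure grows")

    def blast_frac(t):
        return min(1.0, f0 * r ** t)

    return blast_frac


def steps_to_full_exposure(f0, r):
    """Smallest integer t with f0 * r**t >= 1 (up to floating-point rounding)."""
    return math.ceil(math.log(1.0 / f0) / math.log(r))


if __name__ == "__main__":
    blast_frac = make_blast_frac(f0=0.01, r=2.0)
    for t in range(steps_to_full_exposure(0.01, 2.0) + 1):
        print(t, blast_frac(t))
```

Starting at 1% exposure and doubling each step, the candidate reaches full exposure at step 7, which gives a concrete handle for sizing rollback_window.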
S52-75 (Rollback triggers)
Trigger set
T = { TS.error > tau_error ,
TS.latency > tau_latency ,
delta_offon > tau_offon_max ,
eq_rate < tau_eq_min } .

If any trigger in T holds and alpha_spent is within limits, roll back and append the event to the AuditTrail.
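A minimal Python sketch of the trigger set T and the rollback rule is shown below. The dictionary keys, the alpha_limit parameter, and the "hold" outcome for the case where a trigger fires but alpha_spent exceeds its limit are assumptions of this sketch; the source does not define that last case.

```python
def fired_triggers(obs, tau):
    """Return the names of all rollback triggers in T that currently hold."""
    checks = {
        "TS.error > tau_error": obs["ts_error"] > tau["tau_error"],
        "TS.latency > tau_latency": obs["ts_latency"] > tau["tau_latency"],
        "delta_offon > tau_offon_max": obs["delta_offon"] > tau["tau_offon_max"],
        "eq_rate < tau_eq_min": obs["eq_rate"] < tau["tau_eq_min"],
    }
    return [name for name, hit in checks.items() if hit]


def rollback_decision(obs, tau, alpha_spent, alpha_limit):
    """Roll back when any trigger fires and alpha_spent is within limits."""
    fired = fired_triggers(obs, tau)
    if not fired:
        return {"decision": "continue", "reason": "no trigger fired"}
    if alpha_spent <= alpha_limit:
        return {"decision": "rollback", "reason": "; ".join(fired)}
    # Assumption: the source leaves this case open; the sketch holds for review.
    return {"decision": "hold", "reason": "alpha budget exceeded; " + "; ".join(fired)}


if __name__ == "__main__":
    tau = {"tau_error": 0.01, "tau_latency": 200.0,
           "tau_offon_max": 0.05, "tau_eq_min": 0.99}
    obs = {"ts_error": 0.02, "ts_latency": 120.0,
           "delta_offon": 0.01, "eq_rate": 0.999}
    print(rollback_decision(obs, tau, alpha_spent=0.01, alpha_limit=0.05))
```

The returned reason string is the kind of record that would be appended to the AuditTrail.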

IV. Data & Manifest Conventions

ChangeProposal (required)
ver & channel; CHANGELOG entry; DiffCard (Graph.sig / ParamCard.sig / InferPipelineCard.sig deltas); schema_ver and compat_api matrix.
Non-regression parameters: tau_nonreg, primary score definition & unit, loss L(•,•), target power_min, alpha/beta.
Double-run design: shadow_pct, sample routing, and time-base alignment strategy ts = alpha + beta * tau_mono.
Drift gate: statistic choice, alpha_drift, and impact(score) estimation.
Rollout & rollback: blast_frac(t), rollback_trigger, rollback_window.
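For illustration, the required ChangeProposal content could be checked with a small validator such as the one below. All field names are hypothetical; the source lists what a proposal must contain but does not fix a JSON schema.

```python
import re

# Hypothetical top-level keys mirroring the required items listed above.
REQUIRED_FIELDS = {
    "ver", "channel", "changelog", "diff_card",
    "nonregression", "double_run", "drift_gate", "rollout",
}
CHANNELS = {"canary", "stable", "LTS"}
VERSION_RE = re.compile(r"^\d+\.\d+\.\d+$")  # MAJOR.MINOR.PATCH


def validate_change_proposal(proposal):
    """Return a list of problems; an empty list means the proposal looks complete."""
    missing = sorted(REQUIRED_FIELDS - set(proposal))
    problems = ["missing field: " + name for name in missing]
    if "ver" in proposal and not VERSION_RE.match(str(proposal["ver"])):
        problems.append("ver must be MAJOR.MINOR.PATCH")
    if "channel" in proposal and proposal["channel"] not in CHANNELS:
        problems.append("channel must be canary, stable, or LTS")
    return problems


if __name__ == "__main__":
    proposal = {
        "ver": "1.5.0",
        "channel": "canary",
        "changelog": "Retrained ranking model on refreshed data.",
        "diff_card": {"Graph.sig": "changed", "ParamCard.sig": "changed"},
        "nonregression": {"tau_nonreg": 0.02, "power_min": 0.8},
        "double_run": {"shadow_pct": 0.05},
        "drift_gate": {"stat": "D_KS", "alpha_drift": 0.01},
        "rollout": {"f0": 0.01, "r": 2.0, "rollback_window": 24},
    }
    print(validate_change_proposal(proposal))
```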
Traceability & compliance
All eval sets registered via hash(•) and fingerprint; ingest CoverageReport, RegressionReport, GateLogs. External publication requires check_dim(expr) pass.
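One way to realize the hash(•) registration of an evaluation set is a content fingerprint over canonically serialized records. The sketch below uses SHA-256 from the Python standard library; the choice of hash function and the JSON canonicalization are assumptions, since the source does not specify either.

```python
import hashlib
import json


def fingerprint_records(records):
    """SHA-256 fingerprint of an ordered list of JSON-serializable records.

    Keys are sorted so that field order inside a record does not change the
    fingerprint; the order of the records themselves still matters.
    """
    h = hashlib.sha256()
    for rec in records:
        line = json.dumps(rec, sort_keys=True, separators=(",", ":"),
                          ensure_ascii=False)
        h.update(line.encode("utf-8"))
        h.update(b"\n")  # record separator
    return h.hexdigest()


if __name__ == "__main__":
    eval_set = [{"id": 1, "x": [0.1, 0.2], "y": 1},
                {"id": 2, "x": [0.3, 0.4], "y": 0}]
    print(fingerprint_records(eval_set))
```

Registering this digest next to the baseline makes it possible to detect any later edit to the evaluation set before a regression decision is formed.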

V. Algorithms & Implementation Bindings

New prototypes (building on I50-10 regress_guard)
- I50-60 diff_signatures(base:any, cand:any) -> DiffCard
- I50-61 plan_rollout(f0:float, r:float, policy:dict) -> {blast_frac:callable}
- I50-62 dual_run_compare(stream:any, router:dict, loss:str) -> {eq_rate:float, dist:float}
- I50-63 drift_guard(X_base:any, X_cand:any, stat:str, alpha:float) -> {reject:bool, p_value:float}
- I50-64 rollback_controller(triggers:dict, window:int) -> {decision:str, reason:str}
- I50-65 nonregression_matrix(metrics:list, tau:dict) -> RegressionReport
Pseudocode (abridged regress_guard)
1. DiffCard ← diff_signatures(base, cand)
2. (eq_rate, dist) ← dual_run_compare(stream, router, loss)
3. nr_ok ← ( delta_baseline ≥ - tau_nonreg ) ∧ ( power ≥ power_min )
4. eq_ok ← ( eq_rate ≥ tau_eq ) ∨ ( dist ≤ tau_dist )
5. drift ← drift_guard(X_base, X_cand, stat, alpha_drift)
6. if nr_ok ∧ eq_ok ∧ ¬drift.reject → pass; else → hold/block
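The final decision step of the pseudocode can be sketched in Python as a pure function over already-computed statistics. The dataclass names are hypothetical, and the sketch returns the combined label "hold/block" because the pseudocode does not say how to choose between the two; it also reports which checks failed.

```python
from dataclasses import dataclass


@dataclass
class GateInputs:
    delta_baseline: float
    power: float
    eq_rate: float
    dist: float
    drift_reject: bool


@dataclass
class GateThresholds:
    tau_nonreg: float
    power_min: float
    tau_eq: float
    tau_dist: float


def regress_guard(x, th):
    """Combine the three gates; return (decision, list of failed checks)."""
    failed = []
    # Non-regression gate, including the power requirement.
    if not (x.delta_baseline >= -th.tau_nonreg and x.power >= th.power_min):
        failed.append("non_regression")
    # Double-run gate: either the equivalence rate or the distance may satisfy it.
    if not (x.eq_rate >= th.tau_eq or x.dist <= th.tau_dist):
        failed.append("double_run")
    # Drift gate: any rejection of H0: no_drift fails this abridged check.
    if x.drift_reject:
        failed.append("drift")
    return ("pass" if not failed else "hold/block"), failed


if __name__ == "__main__":
    th = GateThresholds(tau_nonreg=0.02, power_min=0.8, tau_eq=0.99, tau_dist=0.05)
    x = GateInputs(delta_baseline=-0.01, power=0.85, eq_rate=0.97,
                   dist=0.03, drift_reject=False)
    print(regress_guard(x, th))
```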
Exceptions
E_SCHEMA_MISMATCH, E_ENV_MISMATCH, E_NONDETERMINISM, E_POWER_INSUFFICIENT, E_RESOURCE_EXCEEDED.

VI. Metrology Flows & Run Diagram

Mx-55 Baseline lock & diff analysis
- Lock EnvLock and signatures.
- Produce DiffCard and impact classification.
- Preconfigure rollout strategy and rollback triggers.
Mx-56 Double-run & non-regression evaluation
- Route shadow_pct of real-time or replay traffic.
- Estimate delta_baseline, eq_rate / dist, and power.
- Produce RegressionReport and GateDecision_pre.
Mx-57 Canary exposure & drift gating
- Increase exposure via blast_frac(t).
- Consume alpha via sequential tests; monitor TS.* and delta_offon.
- Trigger rollback_controller or promote the channel.
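To make the alpha consumption in Mx-57 concrete, the sketch below splits a total alpha budget evenly across a pre-planned number of looks (a Bonferroni-style split, which is conservative but keeps the overall type-I error within the budget). The even split is a stand-in chosen for simplicity; the spending functions of Chapter 7 may differ, and the function name is hypothetical.

```python
def sequential_looks(p_values, alpha_total):
    """Test at each planned look with alpha_total / K; stop at the first rejection.

    p_values holds one p-value per planned look, in time order.
    """
    k = len(p_values)
    if k == 0:
        raise ValueError("at least one look is required")
    alpha_look = alpha_total / k  # even split across the K planned looks
    alpha_spent = 0.0
    for look, p in enumerate(p_values, start=1):
        alpha_spent += alpha_look
        if p < alpha_look:
            return {"stopped_at": look, "alpha_spent": alpha_spent, "reject": True}
    return {"stopped_at": k, "alpha_spent": alpha_spent, "reject": False}


if __name__ == "__main__":
    # Three planned looks during canary exposure; the second one rejects.
    print(sequential_looks([0.2, 0.004, 0.5], alpha_total=0.03))
```

The alpha_spent value returned here is the quantity that a rollback rule would compare against its limit.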
Mx-58 Stabilization & LTS archival
- Freeze baseline updates.
- Publish CHANGELOG and AuditTrail.
- Produce a cross-domain report (Chapter 11) to request LTS entry.

VII. Verification & Test Matrix

Required
- Non-regression primary: delta_baseline ≥ - tau_nonreg, power ≥ power_min.
- Double-run parity: eq_rate ≥ tau_eq or dist ≤ tau_dist.
- Drift gate: p_value ≥ alpha_drift or impact(score) acceptable.
- Coverage & mutation: cov_spec ≥ tau_cov, kill_rate ≥ tau_kill (Chapter 5).
- Online consistency: R_infer ≥ tau_R, delta_offon ≤ tau_offon_max (Chapter 9).
- SLO: TS.error ≤ tau_error, TS.latency ≤ tau_latency.
Multiplicity
Gate multiple metrics with FDR ≤ q_star or a gatekeeping program (Chapter 7); use FWER control for critical assertions.
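As one concrete way to gate multiple metrics at FDR ≤ q_star, the sketch below applies the Benjamini-Hochberg step-up procedure to per-metric p-values. It assumes each p-value comes from a test whose null hypothesis is "no regression on this metric", so a rejection flags that metric; the choice of this procedure is an illustration, not necessarily the one prescribed in Chapter 7.

```python
def benjamini_hochberg(p_values, q_star):
    """Return a list of booleans: True where the null hypothesis is rejected."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k with p_(k) <= k * q_star / m.
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank * q_star / m:
            k_max = rank
    # Reject every hypothesis ranked at or below k_max.
    rejected = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= k_max:
            rejected[i] = True
    return rejected


if __name__ == "__main__":
    metrics = ["accuracy", "recall", "latency_p99", "calibration", "coverage"]
    p_values = [0.001, 0.008, 0.039, 0.041, 0.6]
    flagged = benjamini_hochberg(p_values, q_star=0.05)
    print([name for name, hit in zip(metrics, flagged) if hit])
```

Benjamini-Hochberg controls the FDR under independence or positive dependence of the tests; for critical assertions a FWER procedure remains the stricter choice.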

VIII. Cross-References & Dependencies

Statistical tests & power (Chapter 7); confidence & risk budgets (Chapter 8); online gating & rollback (Chapter 9); compliance & audit trail (Chapter 10); cross-domain equivalence & device deltas (Chapter 11); release & continuous falsification (Chapter 12).

IX. Risks, Limitations & Open Questions

Risks
A stale baseline can create the illusion of non-regression while absolute quality has already degraded; canary sampling bias can inflate or deflate eq_rate; unmodeled dependence among multiple tests can let regressions pass undetected.
Limitations
Under strong nonstationarity, shadow and production traffic are not identically distributed, so i.i.d. assumptions hold only approximately; oracle noise reduces the power of the non-regression test.
Open questions
Joint optimization of adaptive blast_frac(t) and alpha-spending; minimal explanation-set search for regression localization; robust eq_rate estimation under cross-device quantization noise.

X. Deliverables & Versioning

Deliverables
ChangeProposal.json, DiffCard, RegressionReport, RolloutPlan (with blast_frac(t)), RollbackPlaybook, CoverageReport, GateLogs, CHANGELOG, AuditTrail updates.
Versioning policy
- MAJOR: provide compatibility matrix and migration playbook.
- MINOR: must include full Mx-55 → Mx-57 evidence.
- PATCH: must pass at least the non-regression and SLO gates.
- Entry into LTS requires a stability window and cross-domain equivalence (Chapter 11).
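The versioning policy can be read as a lookup from the kind of version bump to the evidence a release must carry. The Python sketch below classifies a bump between two MAJOR.MINOR.PATCH strings and returns the required items; the function and table names are hypothetical, and the table only restates the policy above without adding cumulative requirements.

```python
REQUIRED_EVIDENCE = {
    "MAJOR": ["compatibility matrix", "migration playbook"],
    "MINOR": ["Mx-55 evidence", "Mx-56 evidence", "Mx-57 evidence"],
    "PATCH": ["non-regression gate", "SLO gate"],
}


def parse_version(ver):
    """Parse 'MAJOR.MINOR.PATCH' into a tuple of three integers."""
    parts = ver.split(".")
    if len(parts) != 3:
        raise ValueError("expected MAJOR.MINOR.PATCH, got " + repr(ver))
    return tuple(int(p) for p in parts)


def bump_class(base_ver, cand_ver):
    """Classify the change from base_ver to cand_ver as MAJOR, MINOR, or PATCH."""
    base, cand = parse_version(base_ver), parse_version(cand_ver)
    if cand <= base:
        raise ValueError("candidate version must be greater than baseline")
    if cand[0] != base[0]:
        return "MAJOR"
    if cand[1] != base[1]:
        return "MINOR"
    return "PATCH"


def required_evidence(base_ver, cand_ver):
    """Evidence that must accompany the release, per the versioning policy."""
    return REQUIRED_EVIDENCE[bump_class(base_ver, cand_ver)]


if __name__ == "__main__":
    print(bump_class("1.4.2", "1.5.0"), required_evidence("1.4.2", "1.5.0"))
```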

Copyright & License: Unless otherwise stated, the copyright of “Energy Filament Theory” (including text, charts, illustrations, symbols, and formulas) is held by the author (屠广林).
License (CC BY 4.0): With attribution to the author and source, you may copy, repost, excerpt, adapt, and redistribute.
Attribution (recommended): Author: 屠广林｜Work: “Energy Filament Theory”｜Source: energyfilament.org｜License: CC BY 4.0
Call for verification: Independent and self-funded—no employer and no sponsorship. Next, we will prioritize venues that welcome public discussion, public reproduction, and public critique, with no country limits. Media and peers worldwide are invited to organize verification during this window and contact us.
Version info: First published: 2025-11-11 ｜ Current version: v6.0+5.05