Chapter 13: Change Management & Regression Defense
I. Scope & Objectives
- Specify semantic versioning, channel release, and rollback playbooks; establish a regression-defense spine composed of the non-regression gate, double-run comparison, and drift gating. The scope includes any change to data / model / code / configuration / EnvLock, and to the dependency graph Graph.sig.
- Objective: along the canary → stable → LTS channel path, ensure — with minimal blast radius and auditable traceability — that GateDecision ∈ {pass, hold, block} is formed in a way that satisfies falsification repeatability and the compliance templates (Chapter 10).
II. Terms & Symbols
- Versions & channels
ver = MAJOR.MINOR.PATCH, channel ∈ {canary, stable, LTS}, baseline, candidate.
Change classes: chg.code, chg.data, chg.config, chg.env; impact blast_frac ∈ (0,1].
- Regression & consistency
Metrics: score_base, score_cand,
delta_baseline = ( score_cand - score_base ), non-regression margin tau_nonreg.
Double-run equivalence:
eq_rate = ( 1/N ) * Σ 1[ y_cand == y_base ]; distributional drift: D_KS, MMD.
Online consistency: delta_offon, R_infer = 1 - delta_offon; SLO: TS.latency, TS.error.
- Risk & budgets
alpha/beta, power (Chapter 7), alpha-spending (Chapter 7), risk_budget.change, rollback_window, rollback_trigger.
- Signatures & locking
EnvLock, Graph.sig, ParamCard.sig, InferPipelineCard.sig, DiffCard, CHANGELOG.
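The regression and consistency quantities defined above can be computed directly from paired outputs. The following is a minimal sketch, not a reference implementation: the function names are hypothetical, and the two-sample KS statistic is assumed to be the usual maximum gap between empirical CDFs, since this chapter names D_KS without defining it.

```python
from bisect import bisect_right


def delta_baseline(score_cand, score_base):
    """delta_baseline = score_cand - score_base."""
    return score_cand - score_base


def eq_rate(y_cand, y_base):
    """eq_rate = (1/N) * sum of 1[y_cand == y_base] over paired outputs."""
    if len(y_cand) != len(y_base) or not y_cand:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(c == b for c, b in zip(y_cand, y_base)) / len(y_cand)


def ks_statistic(x_base, x_cand):
    """Assumed definition of D_KS: max |F_base(v) - F_cand(v)| over pooled values."""
    a, b = sorted(x_base), sorted(x_cand)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    return max(
        abs(bisect_right(a, v) / len(a) - bisect_right(b, v) / len(b))
        for v in a + b
    )


def r_infer(delta_offon):
    """R_infer = 1 - delta_offon."""
    return 1 - delta_offon
```

For example, eq_rate([1, 0, 1, 1], [1, 1, 1, 1]) evaluates to 0.75, because three of the four paired outputs agree.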
III. Postulates & Minimal Equations
- P51-71 (Immutable baseline postulate)
Once a baseline is released and its EnvLock / Graph.sig / ParamCard.sig are registered, all regression decisions reference that frozen baseline only.
- P51-72 (Channel conservation postulate)
Any candidate that fails the non-regression, double-run equivalence, or drift gates must not be promoted from canary to a more stable channel.
- S52-71 (Non-regression gate)
With delta_baseline = ( score_cand - score_base ):
Pass if: delta_baseline ≥ - tau_nonreg;
Block if: P( delta_baseline < - tau_nonreg | D ) ≥ alpha.
- S52-72 (Double-run equivalence gate)
Equivalence rate
eq_rate = ( 1/N ) * Σ 1[ y_cand == y_base ],
or distance
dist = ( 1/N ) * Σ L( y_cand_i , y_base_i ).
Gate: eq_rate ≥ tau_eq or dist ≤ tau_dist, and power ≥ power_min.
- S52-73 (Drift gating)
For feature-distribution drift statistic D_KS or MMD, with rejection region C_alpha:
require P( D ∈ C_alpha | H0: no_drift ) ≤ alpha_drift. If H0 is rejected and impact(score) is negative, hold/block.
- S52-74 (Blast radius & progressive exposure)
With a monotone increase curve blast_frac(t) such that blast_frac(0) = f0 and lim_{t→∞} blast_frac(t) = 1, enforce the risk constraint E[ loss(t) ] ≤ risk_budget.change.
Example: blast_frac(t) = min( 1 , f0 * r^t ), with r > 1 so that exposure actually grows toward 1.
- S52-75 (Rollback triggers)
Trigger set T = { TS.error > tau_error , TS.latency > tau_latency , delta_offon > tau_offon_max , eq_rate < tau_eq_min }.
If any trigger in T holds and alpha_spent is within limits, roll back and append the event to AuditTrail.
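The exposure curve of S52-74 and the trigger set of S52-75 can be sketched as follows. This is an illustrative sketch under stated assumptions: the function names and the dictionary layout are hypothetical, and the alpha_spent condition is left out because the text does not say what should happen once the budget is exhausted.

```python
def blast_frac(t, f0, r):
    """Example curve from S52-74: blast_frac(t) = min(1, f0 * r**t)."""
    if not 0 < f0 <= 1:
        raise ValueError("f0 must lie in (0, 1]")
    if r <= 1:
        raise ValueError("r must exceed 1 for exposure to grow toward 1")
    return min(1.0, f0 * r ** t)


def fired_triggers(obs, tau):
    """Return the names of the rollback triggers in T that currently hold.

    obs and tau are plain dicts (an assumed layout) keyed by the symbols
    used in S52-75.
    """
    checks = [
        ("TS.error", obs["TS.error"] > tau["tau_error"]),
        ("TS.latency", obs["TS.latency"] > tau["tau_latency"]),
        ("delta_offon", obs["delta_offon"] > tau["tau_offon_max"]),
        ("eq_rate", obs["eq_rate"] < tau["tau_eq_min"]),
    ]
    return [name for name, hit in checks if hit]
```

A caller would roll back whenever fired_triggers returns a non-empty list and record the returned names as the reason in the AuditTrail entry.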
IV. Data & Manifest Conventions
- ChangeProposal (required)
ver & channel; CHANGELOG entry; DiffCard (Graph.sig / ParamCard.sig / InferPipelineCard.sig deltas); schema_ver and compat_api matrix.
Non-regression parameters: tau_nonreg, primary score definition & unit, loss L(•,•), target power_min, alpha/beta.
Double-run design: shadow_pct, sample routing, and time-base alignment strategy ts = alpha + beta * tau_mono.
Drift gate: statistic choice, alpha_drift, and impact(score) estimation.
Rollout & rollback: blast_frac(t), rollback_trigger, rollback_window.
- Traceability & compliance
All eval sets registered via hash(•) and fingerprint; ingest CoverageReport, RegressionReport, GateLogs. External publication requires check_dim(expr) pass.
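A ChangeProposal can be checked for completeness before any gate runs. The sketch below assumes the proposal is loaded as a flat dictionary. The key names reuse the symbols listed above, but the flat layout, the subset of keys checked, and the validate_proposal helper are illustrative assumptions rather than a schema defined by this chapter.

```python
import re

# Assumed subset of required keys, taken from the ChangeProposal items above.
REQUIRED_KEYS = (
    "ver", "channel", "CHANGELOG", "DiffCard", "schema_ver", "compat_api",
    "tau_nonreg", "power_min", "shadow_pct", "alpha_drift",
    "rollback_trigger", "rollback_window",
)
CHANNELS = {"canary", "stable", "LTS"}
SEMVER = re.compile(r"^\d+\.\d+\.\d+$")  # MAJOR.MINOR.PATCH


def validate_proposal(proposal):
    """Return a list of human-readable problems; an empty list means complete."""
    problems = [f"missing: {key}" for key in REQUIRED_KEYS if key not in proposal]
    if "ver" in proposal and not SEMVER.match(str(proposal["ver"])):
        problems.append("ver is not MAJOR.MINOR.PATCH")
    if "channel" in proposal and proposal["channel"] not in CHANNELS:
        problems.append("channel is not one of canary, stable, LTS")
    return problems
```

Returning a list of problems rather than raising on the first one lets a reviewer see every gap in a proposal at once.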
V. Algorithms & Implementation Bindings
- New prototypes (building on I50-10 regress_guard)
- I50-60 diff_signatures(base:any, cand:any) -> DiffCard
- I50-61 plan_rollout(f0:float, r:float, policy:dict) -> {blast_frac:callable}
- I50-62 dual_run_compare(stream:any, router:dict, loss:str) -> {eq_rate:float, dist:float}
- I50-63 drift_guard(X_base:any, X_cand:any, stat:str, alpha:float) -> {reject:bool, p_value:float}
- I50-64 rollback_controller(triggers:dict, window:int) -> {decision:str, reason:str}
- I50-65 nonregression_matrix(metrics:list, tau:dict) -> RegressionReport
- Pseudocode (abridged regress_guard)
- 1. DiffCard ← diff_signatures(base, cand)
- 2. (eq_rate, dist) ← dual_run_compare(stream, router, loss)
- 3. nr_ok ← ( delta_baseline ≥ - tau_nonreg ) ∧ power_ok
- 4. drift ← drift_guard(X_base, X_cand, stat, alpha_drift)
- 5. if nr_ok ∧ eq/dist_ok ∧ ¬drift.reject → pass
- else → hold/block
- Exceptions
E_SCHEMA_MISMATCH, E_ENV_MISMATCH, E_NONDETERMINISM, E_POWER_INSUFFICIENT, E_RESOURCE_EXCEEDED.
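The abridged regress_guard pseudocode above can be rendered as a small function. This is a sketch only: it assumes every input (delta_baseline, power, eq_rate, dist, and the drift rejection flag) has already been estimated by the I50-6x routines, whose implementations are not shown in this chapter. The GateResult type is hypothetical, and the choice between hold and block is left open because the pseudocode does not separate them.

```python
from dataclasses import dataclass, field


@dataclass
class GateResult:
    decision: str                      # "pass" or "hold/block"
    reasons: list = field(default_factory=list)


def regress_guard(delta_baseline, tau_nonreg, power, power_min,
                  eq_rate, tau_eq, dist, tau_dist, drift_reject):
    """Combine the non-regression, double-run, and drift gates (steps 3 to 5)."""
    reasons = []
    # Step 3: nr_ok = (delta_baseline >= -tau_nonreg) and power_ok
    if delta_baseline < -tau_nonreg:
        reasons.append("non-regression margin violated")
    if power < power_min:
        reasons.append("power below power_min")
    # Double-run gate: eq_rate >= tau_eq or dist <= tau_dist
    if not (eq_rate >= tau_eq or dist <= tau_dist):
        reasons.append("double-run equivalence failed")
    # Step 4: a drift rejection blocks a pass, as in the pseudocode
    if drift_reject:
        reasons.append("drift detected")
    return GateResult("pass" if not reasons else "hold/block", reasons)
```

Collecting every failed condition in reasons, instead of stopping at the first, gives GateLogs a complete record of why a candidate was held.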
VI. Metrology Flows & Run Diagram
- Mx-55 Baseline lock & diff analysis
- Lock EnvLock and signatures.
- Produce DiffCard and impact classification.
- Preconfigure rollout strategy and rollback triggers.
- Mx-56 Double-run & non-regression evaluation
- Route shadow_pct of real-time or replay traffic.
- Estimate delta_baseline, eq_rate / dist, and power.
- Produce RegressionReport and GateDecision_pre.
- Mx-57 Canary exposure & drift gating
- Increase exposure via blast_frac(t).
- Consume alpha via sequential tests; monitor TS.* and delta_offon.
- Trigger rollback_controller or promote the channel.
- Mx-58 Stabilization & LTS archival
Freeze baseline updates, publish CHANGELOG and AuditTrail; produce a cross-domain report (Chapter 11) to request LTS entry.
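Step Mx-57 consumes alpha across sequential looks at the canary. One simple way to keep that spending auditable is a ledger that refuses any look that would exceed the total budget. The sketch below is an assumption-laden illustration: the AlphaLedger class is hypothetical, and it does not implement the spending function of Chapter 7, which is not reproduced here.

```python
class AlphaLedger:
    """Track cumulative alpha spent across sequential looks (illustrative)."""

    def __init__(self, alpha_total):
        if not 0 < alpha_total < 1:
            raise ValueError("alpha_total must lie in (0, 1)")
        self.alpha_total = alpha_total
        self.alpha_spent = 0.0

    def spend(self, alpha_look):
        """Reserve alpha_look for one test; return False if the budget forbids it."""
        if alpha_look <= 0:
            raise ValueError("alpha_look must be positive")
        # Small tolerance so floating-point rounding does not reject an exact fit.
        if self.alpha_spent + alpha_look > self.alpha_total + 1e-12:
            return False
        self.alpha_spent += alpha_look
        return True

    @property
    def remaining(self):
        return max(0.0, self.alpha_total - self.alpha_spent)
```

With a total budget of 0.05, two looks at 0.02 each succeed, a third look at 0.02 is refused, and a final look at 0.01 exactly exhausts the budget.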
VII. Verification & Test Matrix
- Required
- Non-regression primary: delta_baseline ≥ - tau_nonreg, power ≥ power_min.
- Double-run parity: eq_rate ≥ tau_eq or dist ≤ tau_dist.
- Drift gate: p_value ≥ alpha_drift or impact(score) acceptable.
- Coverage & mutation: cov_spec ≥ tau_cov, kill_rate ≥ tau_kill (Chapter 5).
- Online consistency: R_infer ≥ tau_R, delta_offon ≤ tau_offon_max (Chapter 9).
- SLO: TS.error ≤ tau_error, TS.latency ≤ tau_latency.
- Multiplicity
Gate multiple metrics with FDR ≤ q_star or a gatekeeping program (Chapter 7); use FWER control for critical assertions.
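One common way to enforce FDR ≤ q_star across several metrics is the Benjamini-Hochberg step-up procedure. The sketch below is offered as one possible choice, not as the procedure prescribed in Chapter 7. It assumes each p-value tests the null hypothesis that the corresponding metric did not regress, so a rejection flags a regression, and it assumes the tests are independent or positively dependent.

```python
def benjamini_hochberg(p_values, q_star):
    """Return a list of booleans: True where the null is rejected at FDR q_star."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Step-up rule: find the largest rank k with p_(k) <= (k / m) * q_star.
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank / m * q_star:
            k_max = rank
    # Reject every hypothesis ranked at or below k_max.
    rejected = [False] * m
    for i in order[:k_max]:
        rejected[i] = True
    return rejected
```

For p-values [0.001, 0.04, 0.03, 0.5] and q_star = 0.05, only the first metric is flagged: 0.001 is below its threshold of 0.0125, while 0.03, 0.04, and 0.5 each exceed their thresholds of 0.025, 0.0375, and 0.05.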
VIII. Cross-References & Dependencies
Statistical tests & power (Chapter 7); confidence & risk budgets (Chapter 8); online gating & rollback (Chapter 9); compliance & audit trail (Chapter 10); cross-domain equivalence & device deltas (Chapter 11); release & continuous falsification (Chapter 12).
IX. Risks, Limitations & Open Questions
- Risks
A stale baseline can report "no regression" while the system has in fact degraded; canary sampling bias inflates or deflates eq_rate; unmodeled dependence in multiple testing introduces hidden regressions.
- Limitations
Under strong nonstationarity, shadow and production paths are not fully i.i.d.; oracle noise undermines non-regression power.
- Open questions
Joint optimization of adaptive blast_frac(t) and alpha-spending; minimal explanation-set search for regression localization; robust eq_rate estimation under cross-device quantization noise.
X. Deliverables & Versioning
- Deliverables
ChangeProposal.json, DiffCard, RegressionReport, RolloutPlan (with blast_frac(t)), RollbackPlaybook, CoverageReport, GateLogs, CHANGELOG, AuditTrail updates.
- Versioning policy
- MAJOR: provide compatibility matrix and migration playbook.
- MINOR: must include full Mx-55 → Mx-57 evidence.
- PATCH: must pass at least the non-regression and SLO gates.
- Entry into LTS requires a stability window and cross-domain equivalence (Chapter 11).
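The versioning policy above maps each kind of version bump to a minimum evidence set. A minimal sketch of that mapping follows. It assumes plain MAJOR.MINOR.PATCH strings with no pre-release suffixes, and the helper names and evidence labels are illustrative rather than defined by this chapter.

```python
def parse_ver(ver):
    """Parse 'MAJOR.MINOR.PATCH' into a tuple of three ints."""
    major, minor, patch = (int(part) for part in ver.split("."))
    return major, minor, patch


def bump_class(base_ver, cand_ver):
    """Classify the change from baseline to candidate as MAJOR, MINOR, or PATCH."""
    base, cand = parse_ver(base_ver), parse_ver(cand_ver)
    if cand <= base:
        raise ValueError("candidate version must be newer than the baseline")
    if cand[0] != base[0]:
        return "MAJOR"
    if cand[1] != base[1]:
        return "MINOR"
    return "PATCH"


# Evidence labels paraphrase the policy bullets above (assumed wording).
REQUIRED_EVIDENCE = {
    "MAJOR": ["compatibility matrix", "migration playbook"],
    "MINOR": ["Mx-55", "Mx-56", "Mx-57"],
    "PATCH": ["non-regression gate", "SLO gate"],
}
```

For example, REQUIRED_EVIDENCE[bump_class("1.2.3", "1.3.0")] yields the Mx-55 through Mx-57 evidence list required for a MINOR release.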
Copyright & License (CC BY 4.0)
Copyright: Unless otherwise noted, the copyright of “Energy Filament Theory” (text, charts, illustrations, symbols, and formulas) belongs to the author “Guanglin Tu”.
License: This work is licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0). You may copy, redistribute, excerpt, adapt, and share for commercial or non‑commercial purposes with proper attribution.
Suggested attribution: Author: “Guanglin Tu”; Work: “Energy Filament Theory”; Source: energyfilament.org; License: CC BY 4.0.
First published: 2025-11-11 | Current version: v5.1
License link: https://creativecommons.org/licenses/by/4.0/