15-EFT.WP.Methods.Falsification v1.0 | Chapter 12: Acceptance & Continuous Falsification Release

Chapter 12: Acceptance & Continuous Falsification Release

I. Scope & Objectives

This chapter defines an end-to-end acceptance workflow that spans offline → online evaluation. It covers score synthesis and confidence-interval conventions, release gating and the rolling falsification loop, and compliance templates for the announcement bundle and third-party review. The scope includes full-chain changes to algorithms & parameters, data & conventions, inference pipelines, and the execution environment (EnvLock).
Objective: under alpha_release, beta_release, and q_star control, make the formation of GateDecision ∈ {pass, hold, block} auditable, traceable, and forensically verifiable, while sustaining a canary → stable → LTS continuous falsification cycle.

II. Terms & Symbols

Release & evidence
ReleasePlan, Evidence.bundle, AuditTrail, EnvLock, anchor, canon_json(•).
Scores & confidence
score_i, var_i, w_i, score_agg, se_agg, CI_agg = [L,U],
risk_release = P( score_true < tau_accept | D ).
Gates & budgets
tau_accept, tau_nonreg, tau_R, tau_offon, q_star, alpha_release, beta_release, power_min, tau_cov, tau_kill.
Non-regression & consistency
delta_baseline = score_cand - score_base (non-inferiority, tested against the margin tau_nonreg), delta_offon (offline-to-online discrepancy), R_infer = 1 - delta_offon.
Coverage & mutation
cov_spec = ( |C_hit| / |C_total| ), kill_rate = ( |mut_killed| / |mut_all| ) (see Chapter 5).
Online metrics
TS.latency, TS.thrpt, TS.error, GateDecision (see Chapter 9).
Multiplicity & sequentiality
FDR, FWER, TOST, alpha-spending (see Chapter 7).
Sites & devices
site_id, device_id, cross-domain/device delta delta_dev (see Chapter 11).

III. Postulates & Minimal Equations

P51-31 (Acceptance is falsifiable)
For any candidate version under a locked EnvLock, there exists an observable metric vector score_vec and a gate vector tau_vec such that if any component violates the assertion score_k ≥ tau_k, the version is falsified and rejected.
S52-61 (Multi-source score synthesis & interval)
For sources i = 1..m with positive weights w_i and mutually independent score estimates (so that no covariance terms enter se_agg):
score_agg = ( Σ w_i * score_i ) / ( Σ w_i ) ;
se_agg = sqrt( Σ ( w_i^2 * var_i ) ) / ( Σ w_i ) ;
CI_agg = [ score_agg - z_{1 - alpha/2} * se_agg , score_agg + z_{1 - alpha/2} * se_agg ] .

Gate: L ≥ tau_accept.
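
The synthesis in S52-61 can be sketched as follows, assuming independent sources and a normal approximation for the interval. The function name release_score_sketch, its argument layout, and all numeric inputs (including tau_accept = 0.85) are illustrative assumptions; this is not the I50-51 signature.

```python
import math
from statistics import NormalDist

def release_score_sketch(scores, variances, weights, alpha=0.05):
    """Weighted mean, its standard error, and a two-sided normal CI."""
    if not scores or not (len(scores) == len(variances) == len(weights)):
        raise ValueError("inputs must be non-empty and of equal length")
    if any(w <= 0 for w in weights):
        raise ValueError("weights must be positive")
    w_sum = sum(weights)
    score_agg = sum(w * s for w, s in zip(weights, scores)) / w_sum
    # Assumption: sources are independent, so no covariance terms appear.
    se_agg = math.sqrt(sum(w * w * v for w, v in zip(weights, variances))) / w_sum
    z = NormalDist().inv_cdf(1 - alpha / 2)
    ci_agg = (score_agg - z * se_agg, score_agg + z * se_agg)
    return score_agg, se_agg, ci_agg

# Two hypothetical sources with equal weight.
score_agg, se_agg, (L, U) = release_score_sketch(
    scores=[0.90, 0.94], variances=[0.0004, 0.0004], weights=[1.0, 1.0])
tau_accept = 0.85          # illustrative gate value
gate_ok = L >= tau_accept  # the gate compares the lower bound, not the mean
```

Gating on the lower bound L rather than on score_agg means a noisy but high point estimate cannot pass on its own.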

S52-62 (Non-regression gate)
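The non-inferiority claim is delta_baseline ≥ - tau_nonreg. If the test cannot rule out delta_baseline < - tau_nonreg at the pre-registered level (for example, the lower confidence bound on delta_baseline falls below - tau_nonreg), trigger block.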
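
One common way to operationalize S52-62 is a one-sided lower confidence bound on delta_baseline. The sketch below takes that reading as an assumption, together with independent baseline and candidate evaluations and a normal approximation. The name noninferiority_sketch and the inputs are hypothetical and do not reproduce the I50-52 signature.

```python
import math
from statistics import NormalDist

def noninferiority_sketch(score_base, se_base, score_cand, se_cand,
                          tau_nonreg, alpha=0.05):
    """Return delta_baseline, its one-sided lower bound, and a decision."""
    delta = score_cand - score_base
    # Assumption: the two evaluations are independent.
    se_delta = math.sqrt(se_base ** 2 + se_cand ** 2)
    z = NormalDist().inv_cdf(1 - alpha)  # one-sided quantile
    lower = delta - z * se_delta
    # Pass only if the whole interval lies above the margin.
    decision = "pass" if lower >= -tau_nonreg else "block"
    return {"delta_baseline": delta, "lower": lower, "decision": decision}

loose = noninferiority_sketch(0.900, 0.01, 0.895, 0.01, tau_nonreg=0.05)
tight = noninferiority_sketch(0.900, 0.01, 0.895, 0.01, tau_nonreg=0.02)
```

With the same data, the wider margin passes and the narrower margin blocks, which is why tau_nonreg has to be fixed in the ReleasePlan before the data are seen.
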
S52-63 (Composite gate under FDR control)
For an assertion family A = {A_k}, control FDR at q_star. Let R be the resulting rejection set; release only if FDR ≤ q_star holds for R and every assertion k in the critical subset passes.
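
The text does not name the FDR procedure. As an illustration only, the sketch below uses the Benjamini-Hochberg step-up rule, which controls FDR at q_star under independence or positive dependence of the p-values. It further assumes that each p-value tests the null "assertion A_k does not hold", so that membership in R means the assertion is accepted. The name fdr_gate_sketch and the assertion names are hypothetical.

```python
def fdr_gate_sketch(p_values, q_star, critical):
    """p_values: dict name -> p-value; critical: set of names that must be in R."""
    ranked = sorted(p_values.items(), key=lambda kv: kv[1])
    m = len(ranked)
    cutoff = 0
    # Step-up rule: find the largest rank k with p_(k) <= q_star * k / m.
    for rank, (_, p) in enumerate(ranked, start=1):
        if p <= q_star * rank / m:
            cutoff = rank
    rejected = {name for name, _ in ranked[:cutoff]}
    # Release requires every critical assertion to be in the rejection set.
    release = set(critical) <= rejected
    return rejected, release

p_vals = {"accuracy": 0.001, "latency": 0.01, "robustness": 0.03, "fairness": 0.5}
R, ok = fdr_gate_sketch(p_vals, q_star=0.05, critical={"accuracy", "latency"})
R2, ok2 = fdr_gate_sketch(p_vals, q_star=0.05, critical={"accuracy", "fairness"})
```

Under strong or negative dependence between assertions, a more conservative procedure may be needed, as noted in Section IX.
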
S52-64 (Sequential gate & risk budget)
In the online phase, use alpha-spending:
alpha_spent(t) = Σ_{k=1..t} g(k) , with Σ_k g(k) ≤ alpha_release taken over all planned looks ;

release-risk constraint: risk_release ≤ beta_release.
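
A minimal sketch of the alpha-spending rule in S52-64, assuming a uniform spending function g(k) = alpha_release / K over K planned looks. The chapter leaves g unspecified, so this choice, the name alpha_spend_sketch, and the inputs are assumptions. Testing look k at level g(k) keeps the total false-alarm probability at or below Σ g(k) by the union bound, which is conservative.

```python
def alpha_spend_sketch(alpha_release, planned_looks, look, p_value):
    """Uniform spending: every look k receives g(k) = alpha_release / planned_looks."""
    if not 1 <= look <= planned_looks:
        raise ValueError("look must be in 1..planned_looks")
    g = alpha_release / planned_looks
    # Cumulative spend after this look, capped at the total budget.
    alpha_spent = min(alpha_release, g * look)
    return {
        "alpha_spent": alpha_spent,
        "reject_now": p_value <= g,       # this look is tested at level g(k)
        "proceed": look < planned_looks,  # further looks remain in the plan
    }

second = alpha_spend_sketch(0.05, planned_looks=5, look=2, p_value=0.004)
last = alpha_spend_sketch(0.05, planned_looks=5, look=5, p_value=0.2)
```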

S52-65 (Dual gates for consistency & regression)
Consistency gate: R_infer ≥ tau_R.
Regression gate: delta_offon ≤ tau_offon.
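Because R_infer = 1 - delta_offon, the two gates together are equivalent to delta_offon ≤ min( tau_offon , 1 - tau_R ); they coincide when tau_R = 1 - tau_offon.
If either is violated, only hold is allowed and additional testing must be scheduled.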

IV. Data & Manifest Conventions

Pre-registration (into ReleasePlan)
H0/H1, metric definitions & units, primary gate tau_accept, non-regression margin tau_nonreg, multiplicity strategy (q_star), sample-size & power target power_min, alpha_release / beta_release, canary fraction & rollback strategy, EnvLock, rng.seed / rng_family.
Evidence & lineage
- Evidence.bundle = { SpecCard, DataCard, SiteCard/DeviceCard, CoverageReport, CEReport, AttackReport, MetaAnalysisReport, GateLogs, Signatures }.
- Key hashes: Graph.sig, ParamCard.sig, InferPipelineCard.sig, golden_set_hash, adv_set_hash(epsilon).
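- Time-base alignment ts = alpha + beta * tau_mono must be recorded in the AuditTrail (here alpha and beta are the offset and rate of the clock mapping, not the alpha_release / beta_release budgets).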
Privacy & compliance
All samples are traced via fingerprint and hash(•); sensitive fields are de-identified, and the applied policy_id is recorded.
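
To illustrate how a pre-registered ReleasePlan could be frozen and fingerprinted, the sketch below serializes a plan deterministically and hashes it. The exact definition of canon_json(•) is not given in this chapter, so the canonical form used here (sorted keys, compact separators, UTF-8) and the choice of SHA-256 are assumptions. The helper anchor_hash and all field values are hypothetical.

```python
import hashlib
import json

def canon_json(obj):
    """Deterministic JSON bytes: sorted keys, compact separators, UTF-8."""
    text = json.dumps(obj, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False)
    return text.encode("utf-8")

def anchor_hash(obj):
    """Hex digest of the canonical form (SHA-256 is an assumed choice)."""
    return hashlib.sha256(canon_json(obj)).hexdigest()

# Illustrative pre-registration record; values are made up.
release_plan = {
    "tau_accept": 0.85, "tau_nonreg": 0.02, "q_star": 0.05,
    "alpha_release": 0.05, "beta_release": 0.01, "power_min": 0.8,
    "canary_fraction": 0.05,
    "rng": {"seed": 12345, "family": "pcg64"},
}
plan_hash = anchor_hash(release_plan)
```

Because keys are sorted before hashing, two plans with the same content but different key order produce the same digest, so any later change to a gate value is detectable from the AuditTrail.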

V. Algorithms & Implementation Bindings

New prototypes (extending Chapter 6 & Appendix B)
- I50-51 compute_release_score(reports:list, weights:dict, alpha:float) -> {score_agg:float, CI_agg:tuple}
- I50-52 noninferiority_guard(base:any, cand:any, tau_nonreg:float) -> {delta_baseline:float, decision:str}
- I50-53 fdr_gate(assertions:list, q_star:float) -> {FDR:float, R:set}
- I50-54 alpha_spend_scheduler(rule:dict) -> {alpha_spent:float, proceed:bool}
- I50-55 release_risk_posterior(scores:any, prior:any, tau_accept:float) -> {risk_release:float}
- I50-56 build_evidence_bundle(plan:dict, artifacts:list) -> Evidence.bundle
- I50-57 publish_announcement(evidence:Evidence.bundle, channel:str) -> Ack
- I50-58 schedule_continuous_fals(stream:any, probes:list) -> JobId
Contracts & exceptions
E_POWER_INSUFFICIENT (fails power_min), E_MULTITEST_UNCONTROLLED (FDR/FWER not controlled), E_ENV_MISMATCH (inconsistent EnvLock), E_NONDETERMINISM, E_RESOURCE_EXCEEDED, E_ORACLE_AMBIGUOUS.
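
As one possible reading of I50-55, the sketch below computes risk_release = P( score_true < tau_accept | D ) under a normal prior and a normal likelihood (conjugate update). The modeling choice, the name release_risk_sketch, its arguments, and the numbers are assumptions; the chapter does not fix the posterior model.

```python
import math
from statistics import NormalDist

def release_risk_sketch(score_agg, se_agg, prior_mean, prior_sd, tau_accept):
    """Posterior probability that the true score lies below tau_accept."""
    # Precision-weighted normal-normal update.
    precision = 1.0 / prior_sd ** 2 + 1.0 / se_agg ** 2
    post_mean = (prior_mean / prior_sd ** 2 + score_agg / se_agg ** 2) / precision
    post_sd = math.sqrt(1.0 / precision)
    risk = NormalDist(post_mean, post_sd).cdf(tau_accept)
    return {"risk_release": risk, "post_mean": post_mean, "post_sd": post_sd}

out = release_risk_sketch(score_agg=0.92, se_agg=0.0141,
                          prior_mean=0.90, prior_sd=0.05, tau_accept=0.85)
beta_release = 0.01  # illustrative budget
risk_ok = out["risk_release"] <= beta_release
```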

VI. Metrology Flows & Run Diagram (Mx-51 → Mx-54)

Mx-51 Pre-registration & freeze
- Register the ReleasePlan with gates, sample size, alpha_release / beta_release.
- Lock EnvLock and artifact signatures.
- Generate SpecCard / DataCard; publish the pre-registration summary.
Mx-52 Offline falsification evaluation
- Run coverage & mutation; produce cov_spec, kill_rate, and CEReport / AttackReport (see Chapters 5 & 6).
- Use I50-51 to compute score_agg and CI_agg.
- Use I50-52 to verify non-inferiority delta_baseline.
- Use I50-53 to control FDR and derive the accepted assertion set.
- Emit GateDecision_pre ∈ {pass, hold, block} for offline.
Mx-53 Canary release & sequential gating
- On the canary channel, run online evaluation with alpha-spending; continuously compute risk_release and delta_offon.
- If alpha_spent ≤ alpha_release, risk_release ≤ beta_release, and R_infer ≥ tau_R all hold, promote to stable; otherwise hold or roll back.
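
The Mx-53 promotion rule can be sketched as a single predicate. The function name canary_decision_sketch and the returned labels are hypothetical. The choice between hold and rollback is not specified in this step, so the sketch reports the failed conditions and leaves that choice to the caller.

```python
def canary_decision_sketch(alpha_spent, alpha_release, risk_release,
                           beta_release, delta_offon, tau_R):
    """Return ('promote', []) or ('hold_or_rollback', [failed condition names])."""
    r_infer = 1.0 - delta_offon  # R_infer as defined in Section II
    conditions = {
        "alpha_budget": alpha_spent <= alpha_release,
        "release_risk": risk_release <= beta_release,
        "consistency": r_infer >= tau_R,
    }
    failed = [name for name, ok in conditions.items() if not ok]
    return ("promote" if not failed else "hold_or_rollback", failed)

good = canary_decision_sketch(0.02, 0.05, 0.001, 0.01, delta_offon=0.03, tau_R=0.95)
bad = canary_decision_sketch(0.02, 0.05, 0.030, 0.01, delta_offon=0.10, tau_R=0.95)
```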
Mx-54 Announcement & archiving
- Use I50-56 to assemble and sign the Evidence.bundle.
- Use I50-57 to publish the announcement (gates, intervals, and lineage).
- Append all artifacts and GateLogs to the AuditTrail.
- Launch I50-58 as the continuous falsification job.

VII. Verification & Test Matrix

Minimum required
- Primary non-regression: test delta_baseline ≥ - tau_nonreg with power ≥ power_min.
- Coverage & mutation: cov_spec ≥ tau_cov, kill_rate ≥ tau_kill.
- Multiplicity: on critical assertions, FDR ≤ q_star.
- Confidence gate: lower bound L ≥ tau_accept (CI_agg).
- Online consistency: R_infer ≥ tau_R, delta_offon ≤ tau_offon.
- Realtime SLO: TS.latency / TS.error within bounds (see Chapter 9).
Sampling & efficiency
Sample sizes follow Chapter 7 power calculations; the minimum canary observation window is determined jointly by alpha-spending and beta_release.

VIII. Cross-References & Dependencies

Statistical testing & error control: Chapter 7.
Uncertainty propagation & risk: Chapter 8.
Online gating & rollback: Chapter 9.
Compliance templates & audit trail fields: Chapter 10.
Cross-domain consistency & delta_dev: Chapter 11.
Regression defense & channel policy: Chapter 13.

IX. Risks, Limitations & Open Questions

Risks
Offline-to-online distribution shift can cause risk_release to be underestimated; heterogeneous canary cohorts can overspend alpha; unmodeled dependence across many assertions can bias FDR control.
Limitations
When the oracle is ambiguous or label variance is high, both CI_agg and non-inferiority effectiveness degrade; strongly correlated assertions may require more conservative FDR control.
Open questions
Joint optimization of adaptive tau_accept(t) with dynamic q_star; online Bayesian risk budgeting under changing cross-device traffic shares.

X. Deliverables & Versioning

Deliverables
ReleasePlan.json, CoverageReport, CEReport / AttackReport,
OfflineGateSummary (with score_agg / CI_agg / FDR),
CanaryGateSummary (with alpha_spent / risk_release / delta_offon),
Evidence.bundle with signatures, AuditTrail updates, Announcement.md.
Versioning policy
- Pass Mx-52 but fail Mx-53: mark hold and enter a patch remediation loop.
- Full pass Mx-51 → Mx-54: release to stable and start the continuous falsification job.
- Promotion to LTS requires long-term stability evidence and cross-domain equivalence (see Chapter 11).

Copyright & License (CC BY 4.0)

Copyright: Unless otherwise noted, the copyright of “Energy Filament Theory” (text, charts, illustrations, symbols, and formulas) belongs to the author “Guanglin Tu”.
License: This work is licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0). You may copy, redistribute, excerpt, adapt, and share for commercial or non‑commercial purposes with proper attribution.
Suggested attribution: Author: “Guanglin Tu”; Work: “Energy Filament Theory”; Source: energyfilament.org; License: CC BY 4.0.

First published: 2025-11-11 | Current version: v5.1
License link: https://creativecommons.org/licenses/by/4.0/