EFT.WP.Propagation.TensionPotential v1.0
Chapter 11 — Validation Experiments & Benchmark Datasets
- I. One-Sentence Aim
Validate the correctness of modeling for Phi_T(x,t), n_eff(x,t,f), and arrival time T_arr using reproducible experiments and benchmark datasets. Quantify two-gauge consistency, lower-bound feasibility, separability of the path term, inter-layer interface matching, and anisotropic responses, and define falsification criteria and audit procedures.
- II. Scope & Non-Goals
- Covered: validation dimensions and metrics, general experimental design, six experiment groups (A…F), benchmark datasets (D1…D5), statistical methods, reproducibility and audit standards, falsification lines and reporting requirements.
- Non-goals: no re-derivation of Chapter 3 equations; no replacement of Chapter 7 metrology and uncertainty flows; no device-level mechanical/electrical details.
- III. Minimal Terms & Symbols
- Observations & model: T_arr_obs(f, gamma), T_arr_mod(f, gamma), Residual = T_arr_obs − T_arr_mod.
- Two-gauge consistency: eta_T = | T_arr^{const} − T_arr^{gen} |.
- Lower bound: LB = L_path / c_ref (defined in the constant-factored gauge; in the general gauge the equivalent bound is embedded in the integrand).
- Band differencing: ΔT_arr(f1,f2) to isolate the path term.
- Interfaces & matching: Sigma, C_sigma, J_sigma, R_sigma, T_trans (strictly distinct from T_fil).
- Uncertainty: u_stat, u_sys, combined u_c, guardband GB = k_guard · u_c.
- IV. Validation Dimensions & Metric Definitions
- M1 Arrival-time residuals: Residual(f, gamma) = T_arr_obs − T_arr_mod; report mean(Residual), std(Residual), and the distribution of normalized z = Residual / u_c.
- M2 Two-gauge consistency: eta_T; require eta_T ≤ threshold; if exceeded, revisit c_ref calibration and n_eff decomposition (see Chapter 7).
- M3 Lower-bound check: verify T_arr_obs ≥ LB; edge samples may be included within a tolerance band k · u_c but must be listed separately.
- M4 Band-differencing identifiability: correlation coefficient and slope between measured ΔT_arr(f1,f2) and model differentials; should be linear over the target band or match the polynomial order set in Chapter 5.
- M5 Anisotropy significance: regression of ΔT_arr against dot( grad_Phi_T , t_hat ) across differently oriented path sets; report significance and effect size.
- M6 Interface consistency: difference in T_arr before/after segmented integration, trigger rate and magnitude of interface corrections, and energy consistency R_sigma + T_trans + A_sigma = 1.
- M7 Convergence & stability: error slope under refinement ratio r, with | T_arr^{(fine)} − T_arr^{(coarse)} | ≤ eps_T.
- M8 Reproducibility & auditability: hash consistency, complete logs, replayable RNG seeds, and coordinate/unit contracts.
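A minimal sketch of how M1…M3 could be evaluated, assuming per-sample NumPy arrays and illustrative names (t_arr_obs, t_arr_mod_const, t_arr_mod_gen, u_c); the default k, k_guard, and eta_threshold values are placeholders, not prescribed thresholds.

```python
# Sketch of M1-M3: residual statistics, two-gauge consistency, lower-bound margins.
import numpy as np

def metrics_m1_to_m3(t_arr_obs, t_arr_mod_const, t_arr_mod_gen,
                     l_path, c_ref, u_c, k=2.0, k_guard=2.0, eta_threshold=1e-9):
    residual = t_arr_obs - t_arr_mod_const           # M1: Residual = T_arr_obs - T_arr_mod
    z = residual / u_c                               # normalized residuals
    eta_t = np.abs(t_arr_mod_const - t_arr_mod_gen)  # M2: eta_T per sample
    lb = l_path / c_ref                              # M3: LB = L_path / c_ref
    gb = k_guard * u_c                               # guardband GB = k_guard * u_c
    return {
        "mean_residual": float(np.mean(residual)),
        "std_residual": float(np.std(residual, ddof=1)),
        "z": z,
        "eta_T_pass": bool(np.all(eta_t <= eta_threshold)),
        "lb_violations": np.flatnonzero(t_arr_obs - lb < -k * u_c),  # listed separately per M3
        "GB": gb,
    }
```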
- V. General Experimental Design
- Gauge preference: use the constant-factored gauge by default; if c_ref exhibits measurable weak dependence on band or state, use the general gauge and provide c_ref(x,t,f) estimates and their uncertainties as inputs.
- Multi-band × multi-path: use same-path multi-band data to separate n_common from n_path; use multi-path data to distinguish geometric from medium differences and to expose anisotropy.
- Explicit interfaces: perform segmented integration at interfaces; record { ell_i } and correction terms; do not interpolate across interfaces.
- Dual uncertainty methods: compute u_c via both GUM and MC; MC must use seedable randomness.
- Minimal logging set: hash(Phi_T), hash(grad_Phi_T), hash(n_eff), hash(gamma), SolverCfg, mode, eps_T, eta_T, GB, u_c, and coordinate/unit contracts.
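One way the minimal logging set and the seedable MC side of the dual uncertainty methods could be realized is sketched below; the SHA-256 choice, the independent-Gaussian input draw, and all field/key names are assumptions of this sketch (the coordinate/unit contracts would be appended to the same record).

```python
# Sketch: field hashes, a replayable Monte Carlo u_c, and a JSON log record.
import hashlib
import json
import numpy as np

def field_hash(arr):
    """Hash a field sample (Phi_T, grad_Phi_T, n_eff, gamma) for the audit log."""
    return hashlib.sha256(np.ascontiguousarray(arr).tobytes()).hexdigest()

def mc_u_c(model, means, u_inputs, n_draws=10_000, seed=12345):
    """Monte Carlo u_c with a replayable seed (independent Gaussian inputs assumed)."""
    rng = np.random.default_rng(seed)
    draws = [model({k: rng.normal(m, u_inputs[k]) for k, m in means.items()})
             for _ in range(n_draws)]
    return float(np.std(draws, ddof=1)), seed

def log_record(phi_t, grad_phi_t, n_eff, gamma, solver_cfg, mode, eps_t, eta_t, gb, u_c):
    """Assemble the minimal logging set as a JSON string."""
    return json.dumps({
        "hash(Phi_T)": field_hash(phi_t), "hash(grad_Phi_T)": field_hash(grad_phi_t),
        "hash(n_eff)": field_hash(n_eff), "hash(gamma)": field_hash(gamma),
        "SolverCfg": solver_cfg, "mode": mode, "eps_T": eps_t,
        "eta_T": eta_t, "GB": gb, "u_c": u_c}, indent=2)
```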
- VI. Experiment Group A | Lower Bound & Two-Gauge Consistency
- Goal: verify T_arr_obs ≥ LB and that eta_T stays within its threshold.
- Steps:
- A1: calibrate c_ref using benchmark path gamma_ref.
- A2: measure multiple paths in uniform/slowly varying media.
- A3: compute T_arr_mod with both gauges and compare eta_T.
- Pass criteria: T_arr_obs − LB ≥ −k · u_c and eta_T ≤ threshold; failing samples enter the falsification list.
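A sketch of the Group A pass check under the criteria above, assuming per-path arrays and illustrative defaults for k and eta_threshold.

```python
# Sketch: combine the lower-bound tolerance and eta_T checks; failing indices
# feed the falsification list.
import numpy as np

def group_a_pass(t_arr_obs, l_path, c_ref, eta_t, u_c, k=2.0, eta_threshold=1e-9):
    lb = l_path / c_ref
    lb_ok = (t_arr_obs - lb) >= -k * u_c          # lower bound within tolerance band
    eta_ok = eta_t <= eta_threshold               # two-gauge consistency
    failing = np.flatnonzero(~(lb_ok & eta_ok))   # samples for the falsification list
    return failing.size == 0, failing
```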
- VII. Experiment Group B | Band Differencing to Isolate the Path Term
- Goal: on the same path, use ΔT_arr(f1,f2) to cancel n_common and test the band structure of n_path.
- Steps:
- B1: choose frequencies f1,f2,… within the bandwidth assumptions of Chapter 5.
- B2: measure T_arr_obs(f_m) and compute differentials.
- B3: fit polynomial coefficients of n_path and compare with model ΔT_arr_mod.
- Pass criteria: correlation and slope within thresholds; out-of-band leakage residuals enter u_sys; overall differentials within GB.
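The B2/B3 comparison could be scripted along the following lines, assuming both bands are measured on the same path and aligned sample-by-sample; the correlation and slope acceptance windows are placeholders to be set from the Chapter 5 band assumptions.

```python
# Sketch: measured vs. model band differentials, with correlation and slope checks.
import numpy as np

def band_diff_check(t_obs_f1, t_obs_f2, t_mod_f1, t_mod_f2,
                    corr_min=0.99, slope_band=(0.95, 1.05)):
    d_obs = t_obs_f1 - t_obs_f2                   # measured Delta T_arr(f1, f2)
    d_mod = t_mod_f1 - t_mod_f2                   # model differential
    slope, intercept = np.polyfit(d_mod, d_obs, 1)
    corr = np.corrcoef(d_mod, d_obs)[0, 1]
    ok = corr >= corr_min and slope_band[0] <= slope <= slope_band[1]
    return ok, {"corr": float(corr), "slope": float(slope), "intercept": float(intercept)}
```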
- VIII. Experiment Group C | Anisotropy Detection
- Goal: determine whether the term b1 · dot( grad_Phi_T , t_hat ) (see Chapter 5) is significant.
- Steps:
- C1: design a sector of paths { gamma_a } with varied incidence directions.
- C2: regress ΔT_arr against dot( grad_Phi_T , t_hat ).
- C3: compare BIC/AIC between isotropic and directional models.
- Pass criteria: statistical significance met with controlled overfitting metrics; otherwise revert to the isotropic approximation.
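Ordinary least squares with Gaussian-residual AIC/BIC is one possible realization of C2/C3; the sketch below assumes the projection dot( grad_Phi_T , t_hat ) has already been evaluated per path, and the lower-is-better comparison convention.

```python
# Sketch: isotropic (offset-only) vs. directional (offset + b1 * projection) models.
import numpy as np

def aic_bic(rss, n, k):
    aic = n * np.log(rss / n) + 2 * k
    bic = n * np.log(rss / n) + k * np.log(n)
    return aic, bic

def anisotropy_test(delta_t_arr, proj):
    n = delta_t_arr.size
    rss_iso = np.sum((delta_t_arr - delta_t_arr.mean()) ** 2)   # isotropic model
    X = np.column_stack([np.ones(n), proj])                     # directional model
    coef, *_ = np.linalg.lstsq(X, delta_t_arr, rcond=None)
    rss_dir = np.sum((delta_t_arr - X @ coef) ** 2)
    return {"b1": float(coef[1]),
            "iso_aic_bic": aic_bic(rss_iso, n, 1),
            "dir_aic_bic": aic_bic(rss_dir, n, 2)}
```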
- IX. Experiment Group D | Inter-Layer Interfaces & Segmented Integration
- Goal: validate matching conditions and correction terms on Sigma.
- Steps:
- D1: identify interfaces; classify as continuous / potential-jump / flux-jump / anisotropic.
- D2: compare segmented integration vs. zero-thickness interface corrections.
- D3: test energy consistency R_sigma + T_trans + A_sigma = 1.
- Pass criteria: differences between segmented and corrected results below threshold; energy consistency holds; n_eff ≥ 1 on both sides.
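A sketch of the Group D checks, assuming the segmented-integration and zero-thickness-corrected arrival times have already been computed upstream; eps_T and the energy tolerance are placeholders.

```python
# Sketch: D2/D3 comparisons plus the n_eff >= 1 side condition.
def group_d_check(t_segmented, t_zero_thickness, r_sigma, t_trans, a_sigma,
                  n_eff_left, n_eff_right, eps_t=1e-12, energy_tol=1e-6):
    checks = {
        "segment_vs_correction": abs(t_segmented - t_zero_thickness) <= eps_t,
        "energy_consistency": abs(r_sigma + t_trans + a_sigma - 1.0) <= energy_tol,
        "n_eff_ge_1": n_eff_left >= 1.0 and n_eff_right >= 1.0,
    }
    return all(checks.values()), checks
```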
- X. Experiment Group E | Reference-Speed Stability & Drift
- Goal: test the stability of c_ref calibration and its environmental drift.
- Steps:
- E1: repeat c_ref calibration across environmental blocks (temperature/humidity, timebase sources).
- E2: cross-apply calibrated c_ref to independent path sets.
- E3: monitor drift over time.
- Pass criteria: drift within specified band; cross-application maintains eta_T within threshold.
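The E3 drift monitor might reduce to a simple spread check over the per-block calibrations; the relative drift band below is illustrative only.

```python
# Sketch: relative spread of c_ref calibrations across environmental blocks.
import numpy as np

def c_ref_drift(c_ref_by_block, rel_band=1e-6):
    c = np.asarray(c_ref_by_block, dtype=float)
    rel_drift = (c.max() - c.min()) / c.mean()
    return rel_drift <= rel_band, float(rel_drift)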
- XI. Experiment Group F | Noise Injection & Robustness (including TBN)
- Goal: assess the impact of TBN(x,t) on estimates of n_eff and T_arr, and robustness to noise.
- Steps:
- F1: inject calibrated broadband noise in simulation, controlling SNR.
- F2: re-estimate n_common and n_path and compare parameter drift.
- F3: report the increment of u_sys and the triggered clamping ratio.
- Pass criteria: key metrics M1…M4 remain within GB; clamping trigger rate controlled and logged.
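Calibrated, seedable noise injection at a target SNR (step F1) could look as follows; the mean-square power definition and Gaussian broadband noise are assumptions of this sketch.

```python
# Sketch: add Gaussian noise at a prescribed SNR with a replayable seed.
import numpy as np

def inject_noise(signal, snr_db, seed=2025):
    rng = np.random.default_rng(seed)
    p_signal = np.mean(signal ** 2)
    p_noise = p_signal / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(p_noise), size=signal.shape)
    return signal + noise, seed
```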
- XII. Benchmark Datasets (D1…D5)
- D1 Uniform-medium set: n_eff ≡ 1; end-to-end dimensional and lower-bound checks.
- D2 Linear-potential-gradient set: Phi_T = Phi_0 + a · x; has analytic approximations; tests step-size and error control.
- D3 Two-layer-interface set: two constant n_eff layers with optional corrections; validates segmentation and matching.
- D4 Anisotropic-channel set: specified grad_Phi_T and angular distribution of t_hat; detects significance of b1.
- D5 Band-dispersion set: multiple frequency points across the band, with n_path generated per the Chapter 5 polynomial order; evaluates differencing and out-of-band suppression.
- Minimal metadata: coords_spec, units_spec, f_grid, gamma[k], Δell[k], optional t_hat[k], Sigma with interface labels, c_ref or its calibration pairs, hash(Phi_T), hash(n_eff), licensing and citation conventions.
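The minimal metadata set above could be packaged as a single container; the field types and defaults in this sketch are illustrative, not part of the dataset specification.

```python
# Sketch: container for the D1...D5 minimal metadata.
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class BenchmarkMetadata:
    coords_spec: str
    units_spec: str
    f_grid: list                                  # frequency points across the band
    gamma: list                                   # path samples gamma[k]
    delta_ell: list                               # segment lengths Delta ell[k]
    t_hat: Optional[list] = None                  # optional tangent directions t_hat[k]
    sigma: list = field(default_factory=list)     # interfaces with labels
    c_ref: Optional[float] = None                 # or calibration pairs
    hash_phi_t: str = ""
    hash_n_eff: str = ""
    license: str = "CC BY 4.0"
    citation: str = ""
```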
- XIII. Statistical Methods & Falsification Criteria
- Hypotheses:
- H0: two gauges are consistent; E[eta_T] does not exceed the threshold.
- H0: T_arr_obs ≥ LB.
- H0: ΔT_arr matches model differentials in-band (linear region or prescribed polynomial order).
- H0: interface energy consistency and n_eff ≥ 1 both hold.
- Pass / falsification rules:
- If any H0 is rejected and metrology/implementation errors are excluded, record a falsification sample.
- Three independent repeat falsifications on the same dimension trigger a review at the model/axiom level (Chapter 2 P20-*).
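As one concrete instance, the two-gauge H0 could be tested one-sided on E[eta_T]; the Gaussian approximation and the alpha level in this sketch are assumptions, and a rejected H0 becomes a falsification sample only after metrology/implementation errors are excluded.

```python
# Sketch: one-sided test of H0: E[eta_T] <= threshold.
import numpy as np
from math import erf, sqrt

def test_eta_t(eta_t_samples, threshold, alpha=0.05):
    eta = np.asarray(eta_t_samples, dtype=float)
    n = eta.size
    z = (eta.mean() - threshold) / (eta.std(ddof=1) / sqrt(n))
    p_value = 1.0 - 0.5 * (1.0 + erf(z / sqrt(2.0)))   # P(mean exceeds threshold)
    return p_value >= alpha, float(p_value)             # False -> candidate falsification
```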
- XIV. Traceability & Reproducibility
- Artifact packaging: bundle minimal sets of data, code, parameters, RNG seeds, contracts, and logs; produce a hash manifest.
- Replay entry: provide a runlist and SolverCfg snapshot for one-click replay.
- Audit views: metric time series, falsification sample lists, interface trigger statistics, and energy-consistency margins.
- XV. Audit & Reporting
- Report structure:
- Overview: datasets, paths, bands, modes, thresholds.
- Metrics: M1…M8.
- Falsification lines: failing cases and categorized causes.
- Reproducibility: hashes, seeds, contracts, and log references.
- Release convention: report results as mean ± k · u_c; when using differentials, also report correlation coefficients and out-of-band residuals.
- XVI. Cross-References
- EFT.WP.Propagation.TensionPotential v1.0 Chapters 3, 5, 6, 7, 8, 9
- EFT.WP.Core.Metrology v1.0 M05-*, M10-*
- EFT.WP.Core.Errors v1.0 M20-*
- EFT.WP.Core.Equations v1.1 S06-*
- XVII. Deliverables
- Validation checklist and scripting conventions: steps and I/O definitions for experiment groups A…F.
- Benchmark dataset specifications: metadata fields and packaging conventions for D1…D5.
- Audit templates: metric dashboards, falsification lists, replay instructions, and hash manifests.
Copyright & License (CC BY 4.0)
Copyright: Unless otherwise noted, the copyright of “Energy Filament Theory” (text, charts, illustrations, symbols, and formulas) belongs to the author “Guanglin Tu”.
License: This work is licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0). You may copy, redistribute, excerpt, adapt, and share for commercial or non‑commercial purposes with proper attribution.
Suggested attribution: Author: “Guanglin Tu”; Work: “Energy Filament Theory”; Source: energyfilament.org; License: CC BY 4.0.
First published: 2025-11-11 | Current version: v5.1
License link: https://creativecommons.org/licenses/by/4.0/