Chapter 1 Positioning, Scope, and Reproducibility Levels
I. Scope and Objectives
- This chapter defines the methodological boundary of the volume: treating data and algorithmic artifacts as first-class objects, it sets the levels, metrics, and pass gates for reproducible, replicable, and portable results, thereby establishing the minimal compliance required across environments and sites.
- Objectives
- Establish a tiered reproducibility system and a metric family, with core measures delta_rep and R_coef.
- Provide a unified gauge and calibration flow for the time-base mapping ts = alpha + beta * tau_mono.
- Specify the minimal manifest and audit-trail fields so that traceability and compliance verification are actionable.
- Bind implementation prototypes I30-* and metrology flows Mx-3* to enable automated acceptance.
- Pass criteria (summary)
- Numerical domain: delta_rep <= tau_rep and R_coef >= 1 - tau_rep.
- Spectral domain: | var(x) - ( ∫ S_xx(f) df ) | <= tau_psd.
- Gauge consistency: when arrival time is involved, publish both gauges in parallel, T_arr = ( 1 / c_ref ) * ( ∫ n_eff d ell ) and T_arr = ( ∫ ( n_eff / c_ref ) d ell ), report the difference delta_form, and submit the path gamma(ell) with measure d ell.
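The gauge-consistency criterion above can be illustrated with a minimal sketch. It assumes a path sampled at discrete points along ell, a constant c_ref, and trapezoid-rule integration; the function names, the sample path, and the n_eff profile are hypothetical and chosen only for illustration.

```python
import math

def trapezoid(y, x):
    """Composite trapezoid rule over samples y(x)."""
    return math.fsum(
        0.5 * (y[i] + y[i + 1]) * (x[i + 1] - x[i]) for i in range(len(x) - 1)
    )

def arrival_time_dual_gauge(ell, n_eff, c_ref, eps_floor=1e-12):
    """Evaluate T_arr in both gauges and report their relative difference."""
    # Form 1: constant factored out, T_arr = (1 / c_ref) * integral(n_eff d ell)
    t_form1 = trapezoid(n_eff, ell) / c_ref
    # Form 2: constant kept inside, T_arr = integral((n_eff / c_ref) d ell)
    t_form2 = trapezoid([n / c_ref for n in n_eff], ell)
    delta_form = abs(t_form1 - t_form2) / max(abs(t_form2), eps_floor)
    return t_form1, t_form2, delta_form

# Hypothetical 1000 m path with a slowly varying effective index.
ell = [0.5 * i for i in range(2001)]
n_eff = [1.45 + 0.01 * math.sin(2 * math.pi * s / 1000.0) for s in ell]
c_ref = 299_792_458.0  # m/s

t1, t2, delta_form = arrival_time_dual_gauge(ell, n_eff, c_ref)
print(f"T_arr(form1)={t1:.6e} s  T_arr(form2)={t2:.6e} s  delta_form={delta_form:.2e}")
```

With a constant c_ref the two forms agree analytically, so delta_form in this sketch reflects only floating-point rounding; a larger value would point to a gauge or implementation mismatch.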
II. Terminology and Symbols
- Reproducibility tiers
- Reproducible: same site, same EnvLock, same seed; outputs satisfy the delta_rep gate.
- Replicable: different sites or minor environmental deltas; outputs satisfy statistical- and spectral-consistency gates.
- Portable: across hardware/OS/accelerator stacks, interface and metric gates are unchanged and still pass.
- Metrics and gates
- delta_rep = ( norm( y_new - y_ref ) / max( norm( y_ref ), eps_floor ) )
- R_coef = 1 - delta_rep
- gate.rep: the reproducibility gate set, including numerical gate tau_rep, spectral gate tau_psd, and time-base gate tau_tb.
- eps_floor: a positive floor to avoid degeneracy in denominators; default eps_floor = 1e-12 (configurable).
- Time base and randomness
- tau_mono: internal monotonic time; ts: external publication time; mapping: ts = alpha + beta * tau_mono.
- seed: random seed; rng_family: RNG family; rng_device: device-side RNG identifier.
- Environment and signatures
- EnvLock: environment lock; hash(•): artifact hash; fingerprint: aggregated version and dependency hash.
- Collision rules and cross-volume gauges
- T_fil denotes tension; T_trans denotes transmission coefficient. Do not mix n and n_eff.
- Path and measure must be written uniformly as gamma(ell) with d ell.
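The delta_rep and R_coef definitions in this section can be sketched as follows. The text does not fix which norm is meant, so the Euclidean (L2) norm is an assumption here, and the helper names are hypothetical.

```python
import math

def l2_norm(v):
    """Euclidean norm (assumed; the norm is not specified in the text)."""
    return math.sqrt(math.fsum(x * x for x in v))

def delta_rep(y_new, y_ref, eps_floor=1e-12):
    """Relative difference between a candidate and a reference result."""
    if len(y_new) != len(y_ref):
        raise ValueError("y_new and y_ref must have the same length")
    diff = [a - b for a, b in zip(y_new, y_ref)]
    return l2_norm(diff) / max(l2_norm(y_ref), eps_floor)

def r_coef(d_rep):
    """R_coef = 1 - delta_rep."""
    return 1.0 - d_rep

def passes_numerical_gate(y_new, y_ref, tau_rep, eps_floor=1e-12):
    """Numerical gate: delta_rep <= tau_rep and R_coef >= 1 - tau_rep."""
    d = delta_rep(y_new, y_ref, eps_floor)
    return d <= tau_rep and r_coef(d) >= 1.0 - tau_rep

y_ref = [3.0, 4.0]
y_new = [3.0, 4.05]
d = delta_rep(y_new, y_ref)
print(d, r_coef(d), passes_numerical_gate(y_new, y_ref, tau_rep=2e-2))
```

The eps_floor guard only matters when the reference norm is close to zero; in that regime delta_rep behaves like an absolute error scaled by 1 / eps_floor.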
III. Postulates and Minimal Equations
- P31-1 Deterministic replay postulate
With fixed EnvLock and seed, the same pipeline on the same inputs yields identically distributed outputs; if no stochastic branch exists, byte-identical artifacts are produced (hash(outputs) equal).
- P31-2 Time-base alignment postulate
For any observation sequence there exists an affine mapping ts = alpha + beta * tau_mono that renders cross-device timelines comparable; alpha, beta are obtained by calibration and drift slowly over stable windows.
- P31-3 Artifact immutability postulate
Any ingested object is content-addressed: oid = hash(bytes(obj)). Any modification yields a new oid and fingerprint.
- S32-1 Result-difference minimal equation
delta_rep = ( norm( y_new - y_ref ) / max( norm( y_ref ), eps_floor ) ), R_coef = 1 - delta_rep.
- S32-2 Spectral consistency and energy conservation
| var( x ) - ( ∫ S_xx(f) df ) | <= tau_psd, with integration domain, window, and ENBW declared explicitly.
- S32-3 Parallel dual-gauge discrepancy
When T_arr is involved, publish delta_form = | T_arr(form1) - T_arr(form2) | / max( |T_arr(form2)|, eps_floor ), together with c_ref and the medium manifest.
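The spectral-consistency check in S32-2 can be illustrated with a small sketch. It assumes a rectangular window, a mean-removed signal, a two-sided periodogram as the S_xx estimate, and integration over all DFT bins with df = fs / N; these choices and all function names are illustrative assumptions, not the volume's prescribed estimator.

```python
import cmath
import math

def dft(x):
    """Naive O(N^2) discrete Fourier transform; adequate for short test signals."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

def spectral_consistency(x, fs, tau_psd):
    """Compare var(x) with the integral of a two-sided periodogram."""
    n = len(x)
    mean = math.fsum(x) / n
    centered = [v - mean for v in x]
    variance = math.fsum(v * v for v in centered) / n  # population variance
    # Periodogram with a rectangular window: S_xx[k] = |X[k]|^2 / (N * fs)
    s_xx = [abs(c) ** 2 / (n * fs) for c in dft(centered)]
    df = fs / n  # bin width; equals the ENBW of a rectangular window
    power = math.fsum(s_xx) * df
    return variance, power, abs(variance - power) <= tau_psd

fs = 64.0
x = [
    0.3 + math.sin(2 * math.pi * 5 * t / 64) + 0.5 * math.cos(2 * math.pi * 12 * t / 64)
    for t in range(64)
]
variance, power, passed = spectral_consistency(x, fs, tau_psd=1e-3)
print(variance, power, passed)
```

For this estimator the equality follows from Parseval's theorem, so the residual is at rounding level. With a tapered window the PSD would need the U_w normalization declared in the manifest before the same comparison holds.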
IV. Data and Manifest Gauges
- Minimal ingestion fields (example; Core.DataSpec naming)
- Basics: project_id, dataset_id, schema.version, submit_ts
- Source: source.uri, source.oid = hash(•), fingerprint
- Environment: EnvLock.os, EnvLock.kernel, EnvLock.driver, EnvLock.accel, EnvLock.libs[]
- Randomness: seed, rng_family, rng_device
- Time base: alpha, beta, tau_mono_origin, ts_origin
- Window & measure: window = [t0, t1], fs, window_fn, U_w, ENBW
- Path gauge: if path integrals are used, provide gamma(ell) parameterization & support, measure d ell
- Arrival time: when using dual gauges, include c_ref, the n_eff gauge, and delta_form
- Metrics: delta_rep, R_coef, tau_rep, tau_psd, pass (bool)
- Units and dimensions
Execute check_dim(expr) for all quantities. Dimensionless: delta_rep, R_coef, and beta (when ts and tau_mono share units). The offset alpha carries the units of ts.
- Traceability and compliance
The manifest must support single-point restoration: {EnvLock, PipelineCard, ParamCard, inputs} → outputs. Any missing item must be explicitly marked via missing.* fields.
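The traceability rule above can be sketched as a small completeness check. The flat "missing.<name>" key layout, the use of SHA-256 for hash(•), and the function names are assumptions made for illustration; Core.DataSpec may define these differently.

```python
import hashlib

# Items required for single-point restoration, as listed in the text.
REQUIRED_ITEMS = ("EnvLock", "PipelineCard", "ParamCard", "inputs")

def content_oid(data: bytes) -> str:
    """Content address of an object; SHA-256 is an assumed choice of hash."""
    return hashlib.sha256(data).hexdigest()

def mark_missing(manifest: dict) -> dict:
    """Return a copy with an explicit missing.<name> flag for each absent item."""
    marked = dict(manifest)
    for name in REQUIRED_ITEMS:
        if marked.get(name) is None:
            marked[f"missing.{name}"] = True
    return marked

def is_restorable(manifest: dict) -> bool:
    """True only when every required item is present."""
    return all(manifest.get(name) is not None for name in REQUIRED_ITEMS)

payload = b"example input bytes"
manifest = mark_missing({
    "EnvLock": {"os": "linux"},
    "inputs": [{"uri": "file://example.bin", "oid": content_oid(payload)}],
})
print(sorted(k for k in manifest if k.startswith("missing.")), is_restorable(manifest))
```

Flagging absences explicitly, rather than leaving keys out silently, lets an auditor distinguish "not recorded" from "not applicable".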
V. Algorithms and Implementation Bindings
- Prototypes (aligned with Appendix B)
- I30-1 freeze_environment(config:dict) -> EnvLock
- I30-2 emit_pipeline_card(state:any) -> dict
- I30-3 run_benchmark_suite(card:dict) -> BenchReport
- I30-4 verify_reproduction(golden:any, candidate:any, metrics:dict) -> RepReport
- I30-6 align_timebase(trace:any, reference:any) -> {alpha:float, beta:float, fit:dict}
- Idempotency and exceptions
- All I30-* calls are idempotent: repeated calls return the same artifact identifier or are explicit no-ops.
- Exceptions: E_ENV_DRIFT, E_DATA_MISMATCH, E_TIMEBASE_SKEW, E_NONDETERMINISM, E_SEED_INVALID, E_SCHEMA_MISMATCH.
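One way to obtain the idempotency described above is to derive the artifact identifier from a canonical serialization of the input. The sketch below is a hypothetical stand-in for I30-1, not its actual implementation; the lock_id field, the canonical-JSON step, and the SHA-256 hash are assumptions.

```python
import hashlib
import json

def freeze_environment(config: dict) -> dict:
    """Hypothetical I30-1 stand-in: the same config always yields the same lock_id."""
    # Canonical form: sorted keys and fixed separators, so key order is irrelevant.
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    lock_id = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return {"lock_id": lock_id, "config": json.loads(canonical)}

lock_a = freeze_environment({"os": "linux", "libs": ["numpy==2.0.0"]})
lock_b = freeze_environment({"libs": ["numpy==2.0.0"], "os": "linux"})
lock_c = freeze_environment({"os": "linux", "libs": ["numpy==2.1.0"]})
print(lock_a["lock_id"] == lock_b["lock_id"], lock_a["lock_id"] == lock_c["lock_id"])
```

Because the identifier depends only on content, a repeated call is effectively a no-op for downstream storage, while any change in the environment description produces a new identifier.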
VI. Metrology Workflow and Run Graph
- Mx-31 Repro bootstrap
- Run I30-1 to freeze the environment and produce EnvLock.
- Emit PipelineCard and ParamCard (I30-2).
- Mx-32 Time-base alignment
- Collect alignment segments; compute alpha, beta = I30-6(...).
- Write the results to the manifest and set the gate tau_tb.
- Mx-33 Benchmark runs
- Replay the benchmark suite under EnvLock (I30-3).
- Produce candidate artifacts and intermediate TS.* metrics.
- Mx-34 Reproduction verification and release
- Compute delta_rep, R_coef, and spectral consistency.
- Evaluate gate.rep, generate RepReport = I30-4(...).
- If qualified, ingest and publish; otherwise execute the rollback playbook.
- Key observations and alerts
- Time-base drift alert: |beta - 1| > tau_tb or |alpha| > tau_tb_shift.
- Nondeterminism alert: with the same seed, variance of repeated delta_rep exceeds the threshold.
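The Mx-32 alignment step and the time-base drift alert can be sketched together. Ordinary least squares is assumed as the fitting method, the function below is a hypothetical stand-in for I30-6 that takes paired samples directly, and the tau_tb_shift value in the example is made up because the text gives no default.

```python
import math

def align_timebase(tau_mono, ts):
    """Least-squares fit of ts = alpha + beta * tau_mono (assumed fitting method)."""
    n = len(tau_mono)
    if n < 2 or n != len(ts):
        raise ValueError("need at least two paired samples")
    mean_tau = math.fsum(tau_mono) / n
    mean_ts = math.fsum(ts) / n
    sxx = math.fsum((t - mean_tau) ** 2 for t in tau_mono)
    if sxx == 0.0:
        raise ValueError("tau_mono samples must not all be identical")
    sxy = math.fsum((t - mean_tau) * (s - mean_ts) for t, s in zip(tau_mono, ts))
    beta = sxy / sxx
    alpha = mean_ts - beta * mean_tau
    residuals = [s - (alpha + beta * t) for t, s in zip(tau_mono, ts)]
    rms = math.sqrt(math.fsum(r * r for r in residuals) / n)
    return {"alpha": alpha, "beta": beta, "fit": {"n": n, "rms_residual": rms}}

def timebase_drift_alert(alpha, beta, tau_tb, tau_tb_shift):
    """Alert when |beta - 1| > tau_tb or |alpha| > tau_tb_shift."""
    return abs(beta - 1.0) > tau_tb or abs(alpha) > tau_tb_shift

tau_mono = [float(i) for i in range(10)]
ts = [0.25 + 1.0002 * t for t in tau_mono]
result = align_timebase(tau_mono, ts)
alert = timebase_drift_alert(result["alpha"], result["beta"], tau_tb=5e-4, tau_tb_shift=1.0)
print(result, alert)
```

Recording the RMS residual alongside alpha and beta gives the manifest a simple fit-quality indicator for later audits.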
VII. Verification and Test Matrix
- Minimum cases
- Deterministic case with fixed seed: expect delta_rep = 0.
- Statistical case with random noise: expect E[delta_rep] <= tau_rep with confidence >= 1 - p_alpha.
- Spectral-consistency case: expect the tau_psd gate to pass.
- Edge and extreme scenarios
- Changes in GPU atomic ordering or parallel-reduction order.
- Sampling-rate drift and missing packets.
- Data-type switches (float32 ↔ float64).
- SLOs and gates (starting points; project-level overrides allowed)
- tau_rep <= 2e-2, tau_psd <= 1e-3, tau_tb <= 5e-4.
- Statistical power: sample size sufficient to detect delta_rep >= tau_rep with power >= 0.8.
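The statistical case above can be checked with a one-sided upper confidence bound on the mean of repeated delta_rep measurements. The sketch assumes a normal approximation and a default p_alpha of 0.05, neither of which is specified in the text; for small sample counts a t-based bound would be more conservative. The sample values are invented for illustration.

```python
import math
from statistics import NormalDist, mean, stdev

def statistical_gate(delta_samples, tau_rep, p_alpha=0.05):
    """Pass when the (1 - p_alpha) upper bound on E[delta_rep] is <= tau_rep."""
    n = len(delta_samples)
    if n < 2:
        raise ValueError("need at least two repeated runs")
    m = mean(delta_samples)
    s = stdev(delta_samples)  # sample standard deviation
    z = NormalDist().inv_cdf(1.0 - p_alpha)  # one-sided normal quantile
    upper = m + z * s / math.sqrt(n)
    return {"mean": m, "upper_bound": upper, "pass": upper <= tau_rep}

# Hypothetical delta_rep values from eight repeated runs.
samples = [0.010, 0.012, 0.009, 0.011, 0.010, 0.013, 0.008, 0.011]
report = statistical_gate(samples, tau_rep=2e-2)
print(report)
```

Gating on the upper bound rather than on the sample mean keeps a noisy but lucky batch of runs from passing by chance.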
VIII. Cross-References and Dependencies
- Core.Threads: concurrency semantics, TS.*, hb, bp.
- Core.Sea: tau_mono, ts, and time-base alignment.
- Core.Metrology: S_xx(f), U_w, ENBW gauges.
- Core.DataSpec: manifests and schema evolution.
- Core.Errors: gates, statistical power, and alert levels.
- Core.Equations, Core.DrawingKinetics: alignment requirements when T_arr dual gauges and path measures are involved.
IX. Risks, Limits, and Open Questions
- Sources of nondeterminism (thread scheduling, hardware instructions, library code paths) cannot be fully eliminated; statistical gates are the safety net.
- Cross-site EnvLock cannot be strictly equivalent; publish difference lists and impact assessments.
- Long-term drift (drivers, microcode, compilers) can break binary compatibility; maintain LTS artifacts and rebuild playbooks.
- Open problems: cross-accelerator numerical equivalence, and budget allocation for mixed-precision impacts on delta_rep.
X. Deliverables and Versioning
- Output checklist
- EnvLock, PipelineCard, ParamCard
- BenchReport, RepReport, AuditTrail (with hash(•) and fingerprint)
- Release bundle and long-term snapshot (including alpha, beta, tau_*)
- Version policy
- Semantic versioning; channels canary / stable / LTS.
- Change classes: ADD/MOD/FIX/PERF/SEC/DOC.
- Dual-run strategy: run old and new versions in parallel, compute delta_rep and R_coef; auto-rollback if not passed.
- Publication requirements
All artifacts must be single-point restorable. When arrival time is involved, publish both gauges in parallel with delta_form.
Copyright & License (CC BY 4.0)
Copyright: Unless otherwise noted, the copyright of “Energy Filament Theory” (text, charts, illustrations, symbols, and formulas) belongs to the author “Guanglin Tu”.
License: This work is licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0). You may copy, redistribute, excerpt, adapt, and share for commercial or non‑commercial purposes with proper attribution.
Suggested attribution: Author: “Guanglin Tu”; Work: “Energy Filament Theory”; Source: energyfilament.org; License: CC BY 4.0.
First published: 2025-11-11 | Current version: v5.1
License link: https://creativecommons.org/licenses/by/4.0/