47-PTN Template v1.0 | Chapter 10 — Results Presentation & Comparative Scoring

Chapter 10 — Results Presentation & Comparative Scoring

I. Metrics Definition

Primary metrics
- Arrival-time residual: ΔT_arr = T_arr(obs) − T_arr(ref) (unit s); report mean(ΔT_arr), std(ΔT_arr), and interval x̂ ± U (coverage factor k).
- Phase consistency: r_phi = corr( Phi_ref , Phi_obs ); interval via Fisher-z and back-transform; report r_phi and p.
- Paraxial conservation error: ε_flux (dimensionless; → 0 at O(θ^2)).
- Dimensional-closure rate: p_dim ∈ [0,1] (pass fraction).
- Robust residual metric: Q_res ∈ [0,1] (lower is better; robust quantile gap or Huber surrogate).
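The Fisher-z interval for r_phi can be sketched as below. This is a minimal illustration, not the template's reference implementation: the function names `pearson_r` and `fisher_z_interval` are hypothetical, the sample size `n` must come from the actual data, and the standard error 1/sqrt(n − 3) assumes approximately bivariate-normal samples.

```python
import math
from statistics import NormalDist

def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def fisher_z_interval(r, n, level=0.95):
    """Confidence interval for a correlation via Fisher-z and back-transform.

    Assumption: |r| < 1 and n > 3; the standard error 1/sqrt(n - 3)
    is the usual large-sample approximation.
    """
    if n <= 3:
        raise ValueError("need n > 3 samples")
    z = math.atanh(r)                             # forward transform
    se = 1.0 / math.sqrt(n - 3)
    zc = NormalDist().inv_cdf(0.5 + level / 2.0)  # two-sided critical value
    return math.tanh(z - zc * se), math.tanh(z + zc * se)  # back-transform

# Illustrative call with a hypothetical sample size of 100.
lo, hi = fisher_z_interval(0.72, 100)
print(f"r_phi = 0.72, 95% interval = [{lo:.2f}, {hi:.2f}]")
```

The back-transformed interval is asymmetric around r_phi, which is expected: the interval is symmetric only on the z scale.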
Secondary metrics
- Mass-conservation deviation: ΔM = |∫ ρ dV|_{t2} − |∫ ρ dV|_{t1} (unit follows ρ).
- Coherence-window adequacy: κ_coh = f(T_coh, L_coh, B_coh ; |∇ n_eff|, SNR, σ_y) (dimensionless).
Normalization & score mapping
- Z-normalization: z_m = ( m − m_baseline ) / σ_baseline.
- Sigmoid score: q_m = 1 / ( 1 + exp( a z_m + b ) ) (default a=1, b=0; invert sign if “larger is better”).
- Aggregate score: Q = ( ∑_i w_i q_{m_i} ) / ( ∑_i w_i ), weights w fixed in the evaluation sheet.
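The three mapping steps above can be sketched as follows. This is an illustration under stated assumptions: the function names are hypothetical, "invert sign" is read here as negating z_m for larger-is-better metrics, and the exponent clamp is an added numerical safeguard that the text does not mention.

```python
import math

def z_normalize(m, m_baseline, sigma_baseline):
    """z_m = (m - m_baseline) / sigma_baseline."""
    if sigma_baseline <= 0:
        raise ValueError("sigma_baseline must be positive")
    return (m - m_baseline) / sigma_baseline

def sigmoid_score(z, a=1.0, b=0.0, larger_is_better=False):
    """q_m = 1 / (1 + exp(a*z + b)); with a > 0 a lower z gives a higher score.

    Assumption: for larger-is-better metrics the sign of z is flipped.
    """
    if larger_is_better:
        z = -z
    t = max(-700.0, min(700.0, a * z + b))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(t))

def aggregate(scores, weights):
    """Q = sum_i(w_i * q_i) / sum_i(w_i) over the metrics present in scores."""
    total_w = sum(weights[name] for name in scores)
    return sum(weights[name] * q for name, q in scores.items()) / total_w

# Illustrative values only: a metric one baseline-sigma below the baseline.
z = z_normalize(m=1.0, m_baseline=2.0, sigma_baseline=1.0)
print(sigmoid_score(z))  # above 0.5 for a lower-is-better metric
```

A metric equal to its baseline maps to q_m = 0.5 under the default a=1, b=0, so Q = 0.5 corresponds to baseline-level performance on every metric.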
Explicit path/measure
Unified arrival-time display:
T_arr = ( ∫ ( n_eff / c_ref ) d ell ) = ( 1 / c_ref ) * ( ∫ n_eff d ell ); show gamma(ell) and d ell in text; record delta_form in exports.
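As a numerical illustration of the path integral above, the sketch below applies the trapezoid rule to a sampled path. The function name `arrival_time`, the sampling grid, and the value used for `c_ref` in the example call are assumptions for illustration; the template does not fix a quadrature rule.

```python
def arrival_time(ell, n_eff, c_ref):
    """T_arr = (1 / c_ref) * integral of n_eff d ell, by the trapezoid rule.

    ell   : strictly increasing path-length samples along gamma(ell)
    n_eff : effective index sampled at the same points (dimensionless)
    c_ref : reference speed, in units of ell per second
    """
    if len(ell) != len(n_eff) or len(ell) < 2:
        raise ValueError("need at least two matching samples")
    integral = 0.0
    for i in range(1, len(ell)):
        d_ell = ell[i] - ell[i - 1]
        if d_ell <= 0:
            raise ValueError("ell must be strictly increasing")
        integral += 0.5 * (n_eff[i] + n_eff[i - 1]) * d_ell
    return integral / c_ref

# Hypothetical 1 km path with a linearly varying index; c_ref in m/s is
# taken as the vacuum speed of light purely for the example.
ell = [0.0, 250.0, 500.0, 750.0, 1000.0]
n_eff = [1.0000, 1.0001, 1.0002, 1.0003, 1.0004]
print(arrival_time(ell, n_eff, c_ref=299_792_458.0))  # seconds
```

The step Δell used for the samples is the quantity the figure legends are asked to state alongside delta_form.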
Units & dimensions
Wrap any division/integral/composite operator in parentheses; all variables/symbols in backticks; attach check_dim report for closure.

II. Benchmarks & Comparators

Benchmark definition
- Baseline model: baseline_id, baseline.version, train/calibration dates, fixed seeds and config.
- Data splits: train/val/test or k-fold; stratify by device/region/batch to keep balance.
Comparator setup
- Paired design: per record_id, compare method_A vs baseline with paired difference Δm = m_A − m_base.
- Statistical tests: two-sided paired tests or permutation for core metrics; multiple testing via Benjamini–Hochberg with FDR ≤ 0.1.
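The paired comparison and the multiple-testing step can be sketched as below. This is one possible reading, not the preregistered procedure: it uses a sign-flip permutation test on the paired differences Δm and the standard Benjamini–Hochberg step-up rule; the function names, the seed, and the add-one p-value correction are assumptions.

```python
import random

def paired_permutation_p(diffs, n_perm=10000, seed=0):
    """Two-sided sign-flip permutation p-value for the mean paired difference."""
    rng = random.Random(seed)  # fixed seed for reproducibility
    n = len(diffs)
    observed = abs(sum(diffs) / n)
    hits = 0
    for _ in range(n_perm):
        # Under the null, each paired difference is equally likely to flip sign.
        s = sum(d if rng.random() < 0.5 else -d for d in diffs)
        if abs(s / n) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)  # add-one correction keeps p > 0

def benjamini_hochberg(p_values, q=0.1):
    """Return a list of booleans: True where the hypothesis is rejected at FDR q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= q * rank / m:
            k_max = rank  # largest rank that passes the step-up bound
    rejected = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= k_max:
            rejected[i] = True
    return rejected

# Illustrative p-values for four metrics (made up for the example).
print(benjamini_hochberg([0.004, 0.03, 0.2, 0.9], q=0.1))
```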
Power & sample size
Target power 1−β = 0.9, α_core = 0.01; follow Chapter 4’s preregistered plan.
Scorecard fields (publishable)
method_id, dataset_id, metrics{ΔT_arr,r_phi,ε_flux,p_dim,Q_res,...}, score.Q, seeds, references[], version.
Decision thresholds (aligned with Chapter 4)
- Positive: all core gates pass (e.g., improvement on |ΔT_arr| in the correct direction, r_phi ≥ 0.6, p_dim = 1.0); aggregate Q exceeds baseline by +δQ_min.
- Negative: any core metric fails or citations/dimensions are non-compliant.
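The positive/negative call above can be expressed as a small gate function. The sketch below is illustrative: the function name `decide`, the dictionary keys, the baseline value, and the value passed for δQ_min are hypothetical, and "exceeds baseline by +δQ_min" is read as Q ≥ Q_baseline + δQ_min.

```python
def decide(metrics, baseline, q_method, q_baseline, delta_q_min,
           r_phi_min=0.6, compliant=True):
    """Return ("positive" | "negative", per-gate results).

    compliant: whether citations and dimension checks passed (assumed to be
    evaluated elsewhere, e.g. from the check_dim report).
    """
    gates = {
        # Improvement means a smaller |DeltaT_arr| than the baseline.
        "DeltaT_arr": abs(metrics["DeltaT_arr_s"]) < abs(baseline["DeltaT_arr_s"]),
        "r_phi": metrics["r_phi"] >= r_phi_min,
        "p_dim": metrics["p_dim"] == 1.0,
        "Q_margin": q_method >= q_baseline + delta_q_min,
        "compliance": bool(compliant),
    }
    call = "positive" if all(gates.values()) else "negative"
    return call, gates

# Example with made-up baseline values and a made-up margin of 0.05.
call, gates = decide(
    metrics={"DeltaT_arr_s": -2.3e-9, "r_phi": 0.72, "p_dim": 1.0},
    baseline={"DeltaT_arr_s": 5.0e-9},
    q_method=0.78, q_baseline=0.70, delta_q_min=0.05,
)
print(call, gates)
```

Returning the per-gate results alongside the call makes it straightforward to report which gate caused a negative decision.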

III. Visualization Standards

Dashboard
Cards: ΔT_arr distribution (hist/KDE), r_phi bar with CIs, ε_flux boxplot, Q_res trend, p_dim gauge.
Residual & agreement plots
- Residual-vs-fitted; Bland–Altman with mean bias and 95% limits.
- Phase scatter: Phi_obs vs Phi_ref with y=x reference and interval bands.
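The numbers behind a Bland–Altman plot can be computed as in the sketch below. It is a minimal illustration assuming paired observed and reference values, with limits of agreement taken as bias ± 1.96 standard deviations of the differences; the function name `bland_altman` and the sample values are hypothetical.

```python
from statistics import mean, stdev

def bland_altman(obs, ref):
    """Return (mean bias, lower limit, upper limit) for paired measurements.

    Limits are bias +/- 1.96 * sample standard deviation of the differences,
    the usual 95% limits of agreement under approximate normality.
    """
    if len(obs) != len(ref) or len(obs) < 2:
        raise ValueError("need at least two paired values")
    diffs = [o - r for o, r in zip(obs, ref)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Made-up paired values for illustration.
bias, lower, upper = bland_altman([1.1, 2.0, 2.9, 4.2], [1.0, 2.0, 3.0, 4.0])
print(f"bias={bias:.3f}, limits=[{lower:.3f}, {upper:.3f}]")
```

In the plot itself the x-axis is conventionally the pairwise mean (obs + ref) / 2 and the y-axis the difference, with horizontal lines at the three returned values.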
Path & geometry
- Path profile: n_eff(ell) vs ell; legend states gamma(ell) step Δell and delta_form.
- Paraxial conservation: cross-section flux heatmap with ε_flux contours.
Error bars & intervals
Means/medians must include ±U or quantile bands; state k or confidence level.
Figure format
Axes with explicit units (s, rad, 1); annotate versions and data time windows; color/linestyle legend fixed to method IDs.
Export
Provide both vector (PDF/SVG) and bitmap (PNG) versions; captions include see[] and version.

IV. Conclusions & Reporting

Conclusion structure
- One-line takeaway: direction and magnitude vs baseline; then core-metric summary with uncertainty.
- Evidence levels: statistical significance, engineering significance, and reproducibility; include FDR-adjusted statements.
Limits & boundaries
State applicability (coherence window, paraxial, small-angle, slowly varying medium); mark “restricted mode” outside domain.
Publication package
Include scorecard.json, results.md, audit.jsonl, check_dim_report.json, and figure bundle.
Citations & compliance
Use the canonical “volume + version + anchor (P/S/M/I)”; keep text and exports consistent; path expressions show gamma(ell), d ell, and record delta_form.

V. Weights & Thresholds (example, drop-in)

Metric	Direction	Weight w_i	Gate	Mapping note
ΔT_arr	lower better	0.35	|ΔT_arr| ≤ τ_T	τ_T = 3·u(T_arr)
r_phi	higher better	0.25	r_phi ≥ 0.6	Fisher-z interval mapping
ε_flux	lower better	0.15	≈0 @ O(θ^2)	Paraxial guard
p_dim	must be 1	0.15	= 1.0	Otherwise hard fail
Q_res	lower better	0.10	per calibration	Robust quantile band

Aggregate: Q = (0.35 q_ΔT + 0.25 q_r + 0.15 q_flux + 0.15 q_dim + 0.10 q_res).

VI. Machine-Readable Templates (ready to commit)

A. scorecard.json

{
  "version": "1.0.0",
  "dataset_id": "ptn-demo",
  "baseline": { "id": "base-001", "version": "1.2.3" },
  "method": { "id": "mA-010", "version": "2.0.0" },
  "metrics": {
    "DeltaT_arr_s": { "mean": -2.3e-9, "std": 4.8e-9, "U_k2": 1.5e-9 },
    "r_phi": { "value": 0.72, "ci95": [0.61, 0.80] },
    "epsilon_flux": { "median": 0.004, "p95": 0.011 },
    "p_dim": 1.0,
    "Q_res": 0.13
  },
  "score": { "Q": 0.78 },
  "tests": {
    "paired": { "DeltaT_arr": { "p_perm": 0.004, "B": 10000 } },
    "FDR": 0.08
  },
  "see": [
    "EFT.WP.Core.Equations v1.1:S20-1",
    "EFT.WP.Core.Metrology v1.0:check_dim",
    "Data.Benchmarks v1.0:PROTO"
  ],
  "version_lock": true
}
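Before committing a scorecard, a quick structural check can catch missing fields or malformed JSON. The sketch below is an assumption-laden illustration: the function name `check_scorecard` is hypothetical, the required top-level keys are taken from the template above, the metric names from the bench_score.yaml contract, and the [0, 1] range for score.Q follows from Q being a weighted average of sigmoid scores.

```python
import json

# Assumed required keys, read off the scorecard template.
REQUIRED_TOP = ("version", "dataset_id", "baseline", "method",
                "metrics", "score", "see")
CORE_METRICS = ("DeltaT_arr_s", "r_phi", "epsilon_flux", "p_dim", "Q_res")

def check_scorecard(text):
    """Parse scorecard JSON text and return a list of problems (empty if none)."""
    card = json.loads(text)  # raises json.JSONDecodeError on malformed input
    problems = []
    for key in REQUIRED_TOP:
        if key not in card:
            problems.append(f"missing field: {key}")
    metrics = card.get("metrics", {})
    for name in CORE_METRICS:
        if name not in metrics:
            problems.append(f"missing metric: {name}")
    q = card.get("score", {}).get("Q")
    if not isinstance(q, (int, float)) or not 0.0 <= q <= 1.0:
        problems.append("score.Q must be a number in [0, 1]")
    return problems

# Deliberately incomplete example: no "see" list and only one metric.
sample = '{"version": "1.0.0", "dataset_id": "ptn-demo", "baseline": {}, ' \
         '"method": {}, "metrics": {"p_dim": 1.0}, "score": {"Q": 0.78}}'
for problem in check_scorecard(sample):
    print(problem)
```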

B. results.md (outline)

# PTN Results — v1.0.0

## 1. Summary

- One-liner conclusion; core metrics with uncertainty.

## 2. Core Metrics

- ΔT_arr (s): mean ± U, histogram, Bland–Altman plot.

- r_phi: value + 95% CI; scatter vs identity.

- epsilon_flux: distribution; paraxial guard lines.

## 3. Secondary Metrics

- ΔM, κ_coh …

## 4. Visual Gallery

- Figures (PDF/PNG), legends, units.

## 5. Repro & Audit

- Seeds, configs, manifests; audit.jsonl hash.

C. bench_score.yaml (interface contract)

version: "1.0.0"
call: "I90-bench_score"
inputs:
  results: "PTN_EXPORT/results.parquet"
  baseline: "PTN_EXPORT/baseline.parquet"
metrics: ["DeltaT_arr_s","r_phi","epsilon_flux","p_dim","Q_res"]
weights: { DeltaT_arr_s: 0.35, r_phi: 0.25, epsilon_flux: 0.15, p_dim: 0.15, Q_res: 0.10 }
thresholds:
  tau_T_s: "3*u(T_arr)"
  r_phi_min: 0.6
  flux_ok: "≈0@O(theta^2)"
  p_dim: 1.0
mapping:
  type: "sigmoid"
  a: 1.0
  b: 0.0
exports:
  files: ["scorecard.json","results.md","figs/*.pdf","reports/check_dim_report.json"]

VII. Required Items on Results Page (aligned with Chapter 5)

- Data & methods: dataset_id, method/baseline IDs and versions, random seeds.
- Metrics & intervals: ΔT_arr, r_phi, ε_flux, p_dim, Q_res with units and uncertainties.
- Figure list: filenames, sizes, formats (PDF/PNG).
- Audit & compliance: audit.jsonl, check_dim_report.json, see[], references[], version.
- Decision & gates: positive/negative call, FDR, δQ_min, applicability and limitations.

Copyright & License: Unless otherwise stated, the copyright of “Energy Filament Theory” (including text, charts, illustrations, symbols, and formulas) is held by the author (屠广林).
License (CC BY 4.0): With attribution to the author and source, you may copy, repost, excerpt, adapt, and redistribute.
Attribution (recommended): Author: 屠广林｜Work: “Energy Filament Theory”｜Source: energyfilament.org｜License: CC BY 4.0
Version info: First published: 2025-11-11 ｜ Current version: v6.0+5.05