Reproducibility Checklist Template v1.0
Chapter 7 — Scripts & Commands (Runbook / Makefile / reproduce.sh)
I. Purpose & Scope
- Deliver a one-click reproduction entrypoint and a standard command set (Runbook, Makefile, and reproduce.sh) covering environment bring-up, data snapshot verification, training/inference, evaluation & comparison, UQ, and packaging. The scripts must guarantee idempotent replay, auditable logs, and verifiable artifacts.
- For path quantities (arrival/phase), explicitly show gamma(ell) and d ell in text; record delta_form ∈ {general, factored} on the data side; parenthesize all expressions; publication requires p_dim = 1.0 with check_dim_report.json.
II. Inputs & Dependencies
- Depends on: Ch. 3 (layout & artifacts), Ch. 4 (environment lock), Ch. 5 (data snapshot & lineage), Ch. 6 (weights/params & freshness).
- Citations use “volume + version + anchor (P/S/M/I)” with anchor coverage ≥ 90%; cite public v1.* releases only.
III. Runbook (outline)
- Stages: preflight → data_verify → train → infer → eval → compare → pack.
- Idempotency key: each stage accepts --idempotency_key; repeated runs with the same key must not change any artifact's sha256.
- Failure & rollback: any S1–S5 stop is written to audit.jsonl, and the affected figures/manifests are tagged [Restricted] when needed.
RUNBOOK.md (outline)
# RUNBOOK
- Preflight: env_lock / container_spec / seed_policy
- Data Verify: data_refs / split_manifest / lineage / checksums
- Train: train_config; outputs best.ckpt / last.ckpt
- Infer: binding_spec / inference_openapi / inference.proto
- Eval & UQ: bench_plan / scorecard / model_uq / uq_summary
- Compare & Pack: compare_spec / validate_report / report_manifest
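The idempotency rule in the outline (a repeated run with the same key must leave every artifact's sha256 unchanged) can be checked mechanically. The following is a minimal sketch, assuming artifacts are ordinary files on disk; `sha256_of`, `snapshot`, and `changed_artifacts` are hypothetical helper names and are not part of the toolchain described here.

```python
import hashlib


def sha256_of(path):
    """Return the hex SHA-256 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def snapshot(paths):
    """Map each artifact path to its current SHA-256 digest."""
    return {str(p): sha256_of(p) for p in paths}


def changed_artifacts(before, after):
    """List artifacts whose digest differs (or is missing) in the second snapshot."""
    return sorted(p for p in before if after.get(p) != before[p])
```

One way to use it: take a snapshot after the first run, replay the stage with the same key, take a second snapshot, and require `changed_artifacts(first, second)` to be empty.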
IV. Makefile (standard targets)
SHELL := /bin/bash
# Fix the key once at parse time so that every stage in a single `make` run
# shares it; a caller-supplied IDK (environment or command line) wins.
ifndef IDK
IDK := run$(shell date +%Y%m%d%H%M%S)
endif
.PHONY: all preflight data_verify train infer eval compare pack clean
all: preflight data_verify train infer eval compare pack
# Recipe lines below must start with a TAB character.
preflight:
	python tools/preflight.py --env env_lock.json --container container_spec.yaml \
	  --seed seed_policy.yaml --out reports/preflight_report.json
data_verify:
	python tools/verify_data.py --refs data/data_refs.yaml --splits data/split_manifest.json \
	  --lineage data/lineage_graph.json --checksums checksums.txt --out reports/data_verify.json
train:
	python tools/train.py --config model/train_config.yaml --idempotency_key $(IDK) \
	  --out weights/best.ckpt --log reports/train.log
infer:
	python tools/infer.py --binding inference/binding_spec.md --weights weights/best.ckpt \
	  --idempotency_key $(IDK) --out outputs/preds.json
eval:
	python tools/eval.py --bench eval/bench_plan.yaml --pred outputs/preds.json \
	  --out eval/scorecard.json --uq uq/uq_summary.json
compare:
	python tools/compare.py --spec eval/compare_spec.yaml --score eval/scorecard.json \
	  --validate reports/validate_report.json
pack:
	python tools/pack.py --root PTN_EXPORT --manifest report_manifest.yaml --sign SIGNATURE.asc
clean:
	rm -rf outputs/* tmp/*
V. reproduce.sh (one-click script)
#!/usr/bin/env bash
set -euo pipefail
STAGE="${1:-all}"
# Export the key so the per-stage child invocations under "all" reuse it
# instead of each generating a fresh timestamp.
export IDK="${IDK:-run$(date +%Y%m%d%H%M%S)}"
log(){ echo "[$(date -Iseconds)] $*"; }
case "$STAGE" in
  preflight)
    log "Preflight..."
    python tools/preflight.py --env env_lock.json --container container_spec.yaml \
      --seed seed_policy.yaml --out reports/preflight_report.json
    ;;
  data_verify)
    log "Verify data..."
    python tools/verify_data.py --refs data/data_refs.yaml --splits data/split_manifest.json \
      --lineage data/lineage_graph.json --checksums checksums.txt --out reports/data_verify.json
    ;;
  train)
    log "Train..."
    python tools/train.py --config model/train_config.yaml --idempotency_key "$IDK" \
      --out weights/best.ckpt --log reports/train.log
    ;;
  infer)
    log "Infer..."
    python tools/infer.py --binding inference/binding_spec.md --weights weights/best.ckpt \
      --idempotency_key "$IDK" --out outputs/preds.json
    ;;
  eval)
    log "Eval..."
    python tools/eval.py --bench eval/bench_plan.yaml --pred outputs/preds.json \
      --out eval/scorecard.json --uq uq/uq_summary.json
    ;;
  compare)
    log "Compare & validate..."
    python tools/compare.py --spec eval/compare_spec.yaml --score eval/scorecard.json \
      --validate reports/validate_report.json
    ;;
  pack)
    log "Pack..."
    python tools/pack.py --root PTN_EXPORT --manifest report_manifest.yaml --sign SIGNATURE.asc
    ;;
  all)
    # Run stages one by one: `set -e` does not abort on a failure inside an
    # `&&` chain, so a chained form would fall through to the audit line
    # and exit 0 even when a stage failed.
    for s in preflight data_verify train infer eval compare pack; do
      "$0" "$s"
    done
    ;;
  *)
    echo "Usage: $0 {preflight|data_verify|train|infer|eval|compare|pack|all}"
    exit 2
    ;;
esac
python tools/audit.py --event "$STAGE" --idk "$IDK" --out reports/audit.jsonl
VI. Path Alignment & Metrics
- Before evaluation & alerts, align time → path → phase; the arrays must satisfy len(gamma_ell) = len(d_ell) = len(n_eff) ≥ 2.
- Unified forms (two equivalent):
T_arr = ( 1 / c_ref ) * ( ∫ n_eff d ell )
T_arr = ( ∫ ( n_eff / c_ref ) d ell )
Phase: Phi = ( 2π / λ_ref ) * ( ∫ n_eff d ell ).
- infer/eval/compare stages must echo delta_form into artifact metadata; require p_dim = 1.0.
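The unified forms above can be evaluated on sampled paths as follows. This is a minimal sketch under stated assumptions: the integral is approximated by a plain sum of n_eff[i] * d_ell[i]; the mapping of delta_form "factored" to the first form (constant outside the integral) and "general" to the second is an assumption, since the text names the two values without defining them; `check_path_arrays`, `arrival_time`, and `phase` are hypothetical names.

```python
import math


def check_path_arrays(gamma_ell, d_ell, n_eff):
    """Enforce len(gamma_ell) = len(d_ell) = len(n_eff) >= 2."""
    if not (len(gamma_ell) == len(d_ell) == len(n_eff)):
        raise ValueError("gamma_ell, d_ell and n_eff must have equal length")
    if len(gamma_ell) < 2:
        raise ValueError("a path needs at least 2 samples")


def arrival_time(gamma_ell, d_ell, n_eff, c_ref, delta_form="general"):
    """Discrete T_arr; both forms agree up to floating-point rounding."""
    check_path_arrays(gamma_ell, d_ell, n_eff)
    if delta_form == "factored":
        # Assumed mapping: T_arr = ( 1 / c_ref ) * ( sum of n_eff * d_ell )
        return (1.0 / c_ref) * math.fsum(n * d for n, d in zip(n_eff, d_ell))
    if delta_form == "general":
        # Assumed mapping: T_arr = ( sum of ( n_eff / c_ref ) * d_ell )
        return math.fsum((n / c_ref) * d for n, d in zip(n_eff, d_ell))
    raise ValueError("delta_form must be 'general' or 'factored'")


def phase(gamma_ell, d_ell, n_eff, lambda_ref):
    """Discrete Phi = ( 2*pi / lambda_ref ) * ( sum of n_eff * d_ell )."""
    check_path_arrays(gamma_ell, d_ell, n_eff)
    return (2.0 * math.pi / lambda_ref) * math.fsum(
        n * d for n, d in zip(n_eff, d_ell)
    )
```

Here gamma_ell is used only for the length check; a real alignment step would also use it to locate each sample along the path.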
VII. Logging & Audit
- Each stage emits reports/*.log and appends to audit.jsonl: timestamp, idempotency_key, input/output sha256, versions & signatures.
- On failure, record reason and remediation; tag [Restricted] in figures/manifests when needed.
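An audit entry of the kind listed above could be appended as one JSON object per line. The sketch below is illustrative only: the field names (`inputs_sha256`, `outputs_sha256`, `failure`, and so on) and the functions `audit_record` and `append_audit` are assumptions, not the schema or interface of tools/audit.py.

```python
import json
from datetime import datetime, timezone


def audit_record(stage, idempotency_key, inputs, outputs,
                 versions=None, reason=None, remediation=None):
    """Build one audit entry; inputs/outputs map path -> sha256 hex digest."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "stage": stage,
        "idempotency_key": idempotency_key,
        "inputs_sha256": dict(inputs),
        "outputs_sha256": dict(outputs),
        "versions": dict(versions or {}),
    }
    if reason is not None:
        # On failure, keep the reason and the remediation together.
        record["failure"] = {"reason": reason, "remediation": remediation}
    return record


def append_audit(path, record):
    """Append the record to a JSON Lines file, one object per line."""
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, sort_keys=True) + "\n")
```

Opening the file in append mode keeps earlier entries intact, which matches the append-only use of audit.jsonl described in this section.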
VIII. Gate Mapping
- G1 Schema completeness: scripts & entry params match manifests.
- G2 Citation compliance: command help & RUNBOOK.md anchor coverage ≥ 90%.
- G3 Path conventions: gamma/measure/delta_form are required and the path step (Δell) is checked.
- G4 Dimensional closure: check_dim_report.json passed, p_dim = 1.0.
- G5 Freshness: clock_state="locked", τ_calib valid.
- G6 Coverage consistency: coverage.mode ∈ {k, alpha, quantile} unified across Data/Model/Error/Pipeline/this volume.
- G7 Covariance consistency: the covariance matrix Σ is positive definite (PD).
- G8 Uniqueness & acyclicity: artifact sha256 unique, lineage acyclic.
- Any gate failure triggers an S1–S5 stop and rollback with diagnostics.
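Gates G7 and G8 lend themselves to small mechanical checks. The sketch below is one possible reading, not the /validate implementation: positive definiteness is tested by attempting a Cholesky factorization on a nested-list matrix, acyclicity by Kahn's topological sort on a mapping from each artifact to the artifacts it was derived from. The function names and the input shapes are assumptions.

```python
import math
from collections import deque


def is_positive_definite(sigma):
    """True if sigma (list of rows) is symmetric and Cholesky succeeds."""
    n = len(sigma)
    if any(len(row) != n for row in sigma):
        return False
    for i in range(n):
        for j in range(i):
            if not math.isclose(sigma[i][j], sigma[j][i], abs_tol=1e-12):
                return False
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                pivot = sigma[i][i] - s
                if pivot <= 0.0:
                    return False  # non-positive pivot: not PD
                low[i][j] = math.sqrt(pivot)
            else:
                low[i][j] = (sigma[i][j] - s) / low[j][j]
    return True


def sha256_unique(digests):
    """True if no two artifacts share a digest."""
    return len(set(digests)) == len(digests)


def lineage_is_acyclic(parents_of):
    """Kahn's algorithm: every node is visited only if there is no cycle."""
    nodes = set(parents_of)
    for parents in parents_of.values():
        nodes.update(parents)
    indegree = {n: 0 for n in nodes}
    children = {n: [] for n in nodes}
    for child, parents in parents_of.items():
        for parent in parents:
            children[parent].append(child)
            indegree[child] += 1
    queue = deque(n for n in nodes if indegree[n] == 0)
    visited = 0
    while queue:
        node = queue.popleft()
        visited += 1
        for child in children[node]:
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)
    return visited == len(nodes)
```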
IX. Anti-Patterns & Fixes
- Anti: T_arr = ∫ n_eff / c_ref d ell (no parentheses) → Fix: use parenthesized unified form.
- Anti: scripts do not echo delta_form, or array lengths are unequal → Fix: complete both in the alignment step and enforce equal lengths.
- Anti: missing idempotency_key → Fix: require --idempotency_key and handle conflicts.
- Anti: comparing means only → Fix: compare U=k·u_c or quantile bands with convergence diagnostics.
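The last fix above (compare U = k·u_c rather than means alone) can be illustrated with a simple consistency test. This is a sketch under assumptions the text does not state: the two results are treated as uncorrelated, their combined standard uncertainties are added in quadrature, and the default k = 2 is illustrative; `expanded_uncertainty` and `consistent` are hypothetical names.

```python
import math


def expanded_uncertainty(u_c, k=2.0):
    """U = k * u_c for a combined standard uncertainty u_c."""
    return k * u_c


def consistent(mean_a, u_a, mean_b, u_b, k=2.0):
    """True when |mean_a - mean_b| <= k * sqrt(u_a**2 + u_b**2).

    Assumes the two results are uncorrelated.
    """
    return abs(mean_a - mean_b) <= expanded_uncertainty(math.hypot(u_a, u_b), k)
```

The same difference in means can pass or fail depending on the uncertainties: `consistent(1.0, 0.1, 1.1, 0.1)` holds, while `consistent(1.0, 0.01, 1.1, 0.01)` does not, which is why a means-only comparison is listed as an anti-pattern.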
X. Machine-Readable Artifacts
RUNBOOK.md, Makefile, reproduce.sh, stage logs (reports/*.log), alignment/compare reports (reports/*.json), audit.jsonl.
XI. Cross-References
- Ch. 3 (layout & artifacts), Ch. 4 (env lock), Ch. 5 (data snapshot), Ch. 6 (weights/params), Ch. 9 (metrics & gates), Ch. 10 (reproduction flow).
- Model Card Ch. 6/10/12; Dataset Card Ch. 10; Error Budget Card Ch. 8/9; Pipeline Card Ch. 7/12.
XII. Checklist
- RUNBOOK.md / Makefile / reproduce.sh archived and aligned with report_manifest.yaml.
- Each stage requires --idempotency_key; audit.jsonl records input/output sha256.
- Path alignment makes gamma/measure/delta_form explicit; len(path) ≥ 2, Δell compliant; p_dim = 1.0.
- Evaluation compares point estimates and intervals (k/alpha/quantile) with convergence diagnostics.
- /validate passed G1–G8; non-compliances tagged [Restricted] and handled; anchor coverage ≥ 90%.
Copyright & License (CC BY 4.0)
Copyright: Unless otherwise noted, the copyright of “Energy Filament Theory” (text, charts, illustrations, symbols, and formulas) belongs to the author “Guanglin Tu”.
License: This work is licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0). You may copy, redistribute, excerpt, adapt, and share for commercial or non‑commercial purposes with proper attribution.
Suggested attribution: Author: “Guanglin Tu”; Work: “Energy Filament Theory”; Source: energyfilament.org; License: CC BY 4.0.
First published: 2025-11-11 | Current version: v5.1
License link: https://creativecommons.org/licenses/by/4.0/