Appendix C — Manifest Templates & Examples (stats manifest)
One-line objective: Provide the minimal key set, field semantics, validation rules, and multi-scenario examples for the cross-statistics publication manifest manifest.stats, ensuring estimates, tests, drift monitors, experiments, and audits are reproducible, traceable, and alignable across systems.
I. Scope & Targets
- This appendix defines the persisted structure of statistical deliverables, applicable to offline evaluation, online experimentation, drift monitoring, causal estimation, and calibration transfer.
- Inputs: metrics, intervals, and diagnostics produced throughout this volume; cross-volume mandatory metadata (see Methods.Cleaning v1.0 and Methods.Imaging v1.0).
- Output: a single JSON document manifest.stats containing TraceID, versioning, window Delta_t, time-base mappings, contract evaluations, and signature.
II. Minimal Key Set (must exist)
- schema_version, book_ref, release_tag
- TraceID, repro_hash, signature
- timebase.tau_mono_range, timebase.ts_range, timebase.offset/skew/J
- arrival.two_forms.delta_form, arrival.two_forms.tol_Tarr
- window.Delta_t, dataset.N, weights.W_norm
- metrics.core[*] (name, est, ci or posterior quantiles, unit)
- contracts[*] (id, status, severity, evidence)
- actions[*] (strategy-card decisions and dispositions)
- provenance (data/code origins and environment summary)
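The minimal key set can be checked mechanically. The following is a minimal sketch, assuming the manifest has already been parsed into a Python dict; the dotted-path notation, the `REQUIRED_PATHS` constant, and the `missing_paths` helper are illustrative names introduced here, not part of the specification. Per-item fields inside arrays (for example `name` and `est` under `metrics.core[*]`) are not covered by this sketch.

```python
# Hypothetical helper: REQUIRED_PATHS and missing_paths are illustration-only names.
REQUIRED_PATHS = [
    "schema_version", "book_ref", "release_tag",
    "TraceID", "repro_hash", "signature",
    "timebase.tau_mono_range", "timebase.ts_range", "timebase.offset/skew/J",
    "arrival.two_forms.delta_form", "arrival.two_forms.tol_Tarr",
    "window.Delta_t", "dataset.N", "weights.W_norm",
    "metrics.core", "contracts", "actions", "provenance",
]


def missing_paths(manifest: dict, paths=REQUIRED_PATHS) -> list:
    """Return every required dotted path that is absent from the manifest."""
    missing = []
    for path in paths:
        node = manifest
        # Walk one level per dot; "offset/skew/J" is a single literal key.
        for part in path.split("."):
            if not isinstance(node, dict) or part not in node:
                missing.append(path)
                break
            node = node[part]
    return missing
```

An empty return value means every top-level and nested required key is present.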
III. Field & Type Specifications
- schema_version : string (e.g., "1.0.0")
- book_ref : string (fixed: "EFT.WP.Methods.CrossStats v1.0")
- release_tag : string (e.g., "stats-prod-2025-08-31T12:00Z")
- TraceID : string (cross-system trace ID)
- repro_hash : string (hash_sha256(code_uri ∥ params ∥ data_fingerprint); see Section X)
- signature : string (publisher’s signature)
- timebase : object
- timebase.tau_mono_range : [int,int] (internal monotone time-base range)
- timebase.ts_range : [string,string] (ISO8601 external publication range)
- timebase.offset/skew/J : {offset: double, skew: double, J: double}
- arrival.two_forms : {delta_form: double, tol_Tarr: double}
- window.Delta_t : string (e.g., "PT24H")
- dataset : {N: int, N_eff?: double, sampling?: string}
- weights : {W_norm: double, cap_w?: double, p_trim?: double}
- metrics.core[*] : {name: string, est: double, se?: double, ci?: [double,double], posterior?: {q05: double, q50: double, q95: double}, unit?: string, dim?: string, notes?: string}
- metrics.drift? : {W1?: double, KL?: double, psi?: double}
- metrics.ab? : {lift: double, se: double, ci: [double,double], mde?: double, alpha_spent?: double}
- metrics.causal? : {ATE: double, U?: double, SMD_max?: double, overlap_min?: double}
- contracts[*] : {id: string, status: string, severity: string, evidence: object}
- actions[*] : {policy_id: string, decision: string, reason: string, at: string}
- provenance : {data_uri: string, code_uri: string, env: {python?: string, pkg?: object}}
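The string-typed time fields above (`window.Delta_t` as an ISO 8601 duration, `timebase.ts_range` as ISO 8601 timestamps) can be cross-checked against each other. The sketch below is a plausibility check only: the specification does not state that `Delta_t` must equal the span of `ts_range`, and the one-second tolerance, the helper names, and the restriction to day/hour/minute/second durations are all assumptions of this example.

```python
import re
from datetime import datetime, timedelta

# Supports only the D/H/M/S designators (e.g. "PT24H", "P7D", "P92D");
# year, month, and week designators are deliberately out of scope here.
_DURATION = re.compile(
    r"P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?"
)


def parse_duration(text: str) -> timedelta:
    """Parse a restricted ISO 8601 duration into a timedelta."""
    match = _DURATION.fullmatch(text)
    parts = {k: int(v) for k, v in match.groupdict().items() if v} if match else {}
    if not parts:
        raise ValueError(f"unsupported duration: {text!r}")
    return timedelta(**parts)


def parse_ts(text: str) -> datetime:
    """Parse an ISO 8601 timestamp; a trailing 'Z' is mapped to +00:00."""
    return datetime.fromisoformat(text.replace("Z", "+00:00"))


def window_matches_range(delta_t: str, ts_range, tol=timedelta(seconds=1)) -> bool:
    """True if the ts_range span equals Delta_t within tol (assumed tolerance)."""
    start, end = (parse_ts(t) for t in ts_range)
    return abs((end - start) - parse_duration(delta_t)) <= tol
```

With these assumptions, "PT24H" matches a range from 2025-08-31T00:00:00Z to 2025-09-01T00:00:00Z, and "P92D" matches a range ending at 23:59:59 on the 92nd day because of the one-second tolerance.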
IV. Template: Minimal Publishable Manifest
{
"schema_version": "1.0.0",
"book_ref": "EFT.WP.Methods.CrossStats v1.0",
"release_tag": "stats-prod-2025-08-31T12:00Z",
"TraceID": "trc_01HXYZ...",
"repro_hash": "sha256:REPRO_HASH",
"signature": "SIG_BASE64",
"timebase": {
"tau_mono_range": [1725062400, 1725148800],
"ts_range": ["2025-08-31T00:00:00Z", "2025-09-01T00:00:00Z"],
"offset/skew/J": {"offset": 0.0012, "skew": 2.3e-6, "J": 0.0031}
},
"arrival": {
"two_forms": {
"delta_form": 2.1e-6,
"tol_Tarr": 5.0e-6
}
},
"window": {"Delta_t": "PT24H"},
"dataset": {"N": 125034},
"weights": {"W_norm": 0.9996},
"metrics": {
"core": [
{"name": "conversion_rate", "est": 0.0842, "ci": [0.0831, 0.0853], "unit": "1", "dim": "[]"},
{"name": "avg_order_value", "est": 56.73, "se": 0.42, "unit": "USD", "dim": "[M]"}
]
},
"contracts": [
{"id": "C30-000", "status": "pass", "severity": "info", "evidence": {"checks": 128}},
{"id": "C30-001", "status": "pass", "severity": "info", "evidence": {}},
{"id": "C30-004", "status": "pass", "severity": "info", "evidence": {"W_norm": 0.9996}},
{"id": "C30-342", "status": "pass", "severity": "info", "evidence": {"coverage_rate": 0.949}}
],
"actions": [
{"policy_id": "SC-SLO-01", "decision": "ship", "reason": "all guardrails pass", "at": "2025-08-31T12:01:05Z"}
],
"provenance": {
"data_uri": "s3://bucket/ds/2025-08-31/",
"code_uri": "git+https://repo/commit/abcdef",
"env": {"python": "3.11.5", "pkg": {"numpy": "2.0.1", "scipy": "1.14.0"}}
}
}
V. Example A: Online A/B Experiment (sequential alpha, guardrail SLOs)
{
"schema_version": "1.0.0",
"book_ref": "EFT.WP.Methods.CrossStats v1.0",
"release_tag": "ab-exp-42-int-07",
"TraceID": "trc_AB42_07",
"repro_hash": "sha256:HASH_AB42_07",
"signature": "SIG_BASE64",
"timebase": {
"tau_mono_range": [1725148800, 1725235200],
"ts_range": ["2025-09-01T00:00:00Z", "2025-09-02T00:00:00Z"],
"offset/skew/J": {"offset": 0.0007, "skew": 2.0e-6, "J": 0.0025}
},
"arrival": { "two_forms": {"delta_form": 1.8e-6, "tol_Tarr": 5.0e-6} },
"window": {"Delta_t": "PT24H"},
"dataset": {"N": 98023, "sampling": "online_randomized"},
"weights": {"W_norm": 1.0002},
"metrics": {
"core": [
{"name": "lift_cr_B_vs_A", "est": 0.0124, "se": 0.0039, "ci": [0.0048, 0.0200], "unit": "1", "dim": "[]"},
{"name": "guardrail_latency_ms_p99", "est": 245.0, "unit": "ms", "dim": "[T]"}
],
"ab": {
"lift": 0.0124,
"se": 0.0039,
"ci": [0.0048, 0.0200],
"mde": 0.01,
"alpha_spent": 0.043
}
},
"contracts": [
{"id": "C30-382", "status": "pass", "severity": "info", "evidence": {"latency_ms_p99": 245}},
{"id": "C30-383", "status": "pass", "severity": "info", "evidence": {"alpha_spent": 0.043, "alpha_budget": 0.05}},
{"id": "C30-381", "status": "pass", "severity": "info", "evidence": {"p_t": 0.501, "p_c": 0.499, "eps_exp": 0.01}}
],
"actions": [
{"policy_id": "SC-AB-01", "decision": "ship", "reason": "sequential boundary crossed; guardrails pass", "at": "2025-09-02T00:00:30Z"}
],
"provenance": {
"data_uri": "kafka://topic/exp42/day=2025-09-01",
"code_uri": "git+https://repo/commit/1122aabb",
"env": {"python": "3.11.5", "pkg": {"pandas": "2.2.2", "statsmodels": "0.14.2"}}
}
}
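The numbers in Example A can be sanity-checked with a few lines of arithmetic. This is a sketch under stated assumptions: the source does not say how the interval was computed, so the two-sided 95% normal (Wald) interval used here is only one construction that happens to reproduce the published bounds, and a sequential design may in practice use a different interval. `normal_ci` and `alpha_within_budget` are hypothetical helper names.

```python
from statistics import NormalDist


def normal_ci(est: float, se: float, level: float = 0.95):
    """Two-sided normal-approximation interval: est +/- z * se (assumed method)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return est - z * se, est + z * se


def alpha_within_budget(alpha_spent: float, alpha_budget: float) -> bool:
    """Guardrail from the C30-383 evidence: spent alpha must not exceed the budget."""
    return alpha_spent <= alpha_budget


# Values copied from Example A.
lo, hi = normal_ci(0.0124, 0.0039)
ci_rounded = (round(lo, 4), round(hi, 4))
budget_ok = alpha_within_budget(0.043, 0.05)
```

Under the Wald assumption the rounded interval agrees with the `ci` field [0.0048, 0.0200], and the spent alpha of 0.043 stays inside the 0.05 budget.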
VI. Example B: Distribution Drift Monitoring (alignment & recalibration trigger)
{
"schema_version": "1.0.0",
"book_ref": "EFT.WP.Methods.CrossStats v1.0",
"release_tag": "drift-week-2025W36",
"TraceID": "trc_DRIFT_W36",
"repro_hash": "sha256:HASH_DRIFT_W36",
"signature": "SIG_BASE64",
"timebase": {
"tau_mono_range": [1725148800, 1725753600],
"ts_range": ["2025-09-01T00:00:00Z", "2025-09-08T00:00:00Z"],
"offset/skew/J": {"offset": 0.0011, "skew": 2.6e-6, "J": 0.0030}
},
"arrival": { "two_forms": {"delta_form": 2.5e-6, "tol_Tarr": 5.0e-6} },
"window": {"Delta_t": "P7D"},
"dataset": {"N": 705211},
"weights": {"W_norm": 1.0000},
"metrics": {
"core": [
{"name": "score_calibration_ece", "est": 0.024, "unit": "1", "dim": "[]"}
],
"drift": {"W1": 0.095, "KL": 0.021, "psi": 0.14}
},
"contracts": [
{"id": "C30-370", "status": "fail", "severity": "warn", "evidence": {"W1": 0.095, "W1_max": 0.08}},
{"id": "C30-373", "status": "fail", "severity": "error", "evidence": {"r_win": 3}}
],
"actions": [
{"policy_id": "SC-DRIFT-01", "decision": "align_then_recalibrate", "reason": "persistent W1 breach; psi elevated", "at": "2025-09-08T00:01:00Z"},
{"policy_id": "SC-CAL-01", "decision": "canary_10pct", "reason": "ECE_after improves >= delta_min", "at": "2025-09-09T12:00:00Z"}
],
"provenance": {
"data_uri": "s3://bucket/weekly_snap/2025W36",
"code_uri": "git+https://repo/commit/55cc66dd",
"env": {"python": "3.11.5", "pkg": {"scikit-learn": "1.5.1"}}
}
}
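The first contract entry in Example B can be read as an upper-bound check on a drift metric. The sketch below shows one way such an entry might be produced; it is an assumption-laden illustration, since the actual semantics of C30-370 are defined in Appendix B and not in this appendix. The helper name `upper_bound_contract`, the non-strict comparison, and the rule that a passing check reports severity "info" are all choices made for this example.

```python
def upper_bound_contract(contract_id: str, metric: str, value: float,
                         limit: float, fail_severity: str = "warn") -> dict:
    """Build a contracts[*] item for a 'metric must not exceed limit' check.

    Assumption: value <= limit counts as a pass; the real rule is in Appendix B.
    """
    passed = value <= limit
    return {
        "id": contract_id,
        "status": "pass" if passed else "fail",
        "severity": "info" if passed else fail_severity,
        # Evidence carries the triggering metric and its threshold.
        "evidence": {metric: value, f"{metric}_max": limit},
    }


# Values copied from Example B: W1 = 0.095 against W1_max = 0.08.
w1_item = upper_bound_contract("C30-370", "W1", 0.095, 0.08)
```

With the Example B values this yields a failing item at severity "warn", matching the published entry.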
VII. Example C: Causal Estimation (doubly-robust with overlap checks)
{
"schema_version": "1.0.0",
"book_ref": "EFT.WP.Methods.CrossStats v1.0",
"release_tag": "causal-ATE-geoQ3",
"TraceID": "trc_CAUSAL_GEO_Q3",
"repro_hash": "sha256:HASH_CAUSAL_GEO_Q3",
"signature": "SIG_BASE64",
"timebase": {
"tau_mono_range": [1719782400, 1727568000],
"ts_range": ["2024-07-01T00:00:00Z", "2024-09-30T23:59:59Z"],
"offset/skew/J": {"offset": 0.0009, "skew": 1.8e-6, "J": 0.0021}
},
"arrival": { "two_forms": {"delta_form": 1.2e-6, "tol_Tarr": 5.0e-6} },
"window": {"Delta_t": "P92D"},
"dataset": {"N": 40211, "N_eff": 27894.5, "sampling": "observational"},
"weights": {"W_norm": 1.0007, "cap_w": 20.0, "p_trim": 0.7},
"metrics": {
"core": [
{"name": "ATE", "est": 1.84, "ci": [0.95, 2.73], "unit": "USD", "dim": "[M]"}
],
"causal": {"ATE": 1.84, "U": 0.62, "SMD_max": 0.06, "overlap_min": 0.04}
},
"contracts": [
{"id": "C30-400", "status": "pass", "severity": "info", "evidence": {"overlap_min": 0.04, "eps_ol": 0.02}},
{"id": "C30-401", "status": "pass", "severity": "info", "evidence": {"SMD_max": 0.06, "smd_max": 0.10}},
{"id": "C30-402", "status": "pass", "severity": "info", "evidence": {"ATE_IPW": 1.79, "ATE_OR": 1.88}},
{"id": "C30-350", "status": "pass", "severity": "info", "evidence": {"B": 2000}}
],
"actions": [
{"policy_id": "SC-COVER-01", "decision": "publish_readonly", "reason": "coverage met; trimmed weights applied", "at": "2024-10-01T10:00:00Z"}
],
"provenance": {
"data_uri": "warehouse://table/geo_q3",
"code_uri": "git+https://repo/commit/aa77bb88",
"env": {"python": "3.10.14", "pkg": {"econml": "0.15.0", "pymc": "5.13.1"}}
}
}
VIII. Validation & Assertions (automated)
- Structure checks
required_keys ⊆ manifest.stats.keys(); required fields in arrays must not be omitted.
- Dimensional consistency
For metrics.core[*], verify unit(x) and dim(x) consistency: check_dim( est - ref_unit_transform(est) ) = true.
- Time-base & arrival-time
non_decreasing(tau_mono_range); delta_form ≤ tol_Tarr.
- Weights & samples
|W_norm - 1| ≤ tol_w; if N_eff is provided, validate N_eff = ( ( ∑ w )^2 ) / ( ∑ w^2 ).
- Contracts & strategies
contracts[*].id belongs to the C30-* namespace; status ∈ {pass, fail}; severity ∈ {info, warn, error, fatal}; actions[*].policy_id belongs to the strategy-card namespace (e.g., SC-DRIFT-01).
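Several of the assertions above translate directly into code. The following is a minimal sketch, assuming the manifest is a parsed dict with all required keys present; the default `tol_w = 1e-3` is an assumed value (the source does not fix it), namespace membership is approximated by a prefix test, and the dimensional check is omitted because `check_dim` and `ref_unit_transform` are not defined in this appendix. `kish_n_eff` and `validate` are hypothetical names.

```python
ALLOWED_STATUS = {"pass", "fail"}
ALLOWED_SEVERITY = {"info", "warn", "error", "fatal"}


def kish_n_eff(weights) -> float:
    """Effective sample size: (sum of w)^2 / (sum of w^2)."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)


def validate(manifest: dict, tol_w: float = 1e-3) -> list:
    """Return a list of human-readable violations (empty means all checks pass)."""
    errors = []
    start, end = manifest["timebase"]["tau_mono_range"]
    if start > end:
        errors.append("tau_mono_range is decreasing")
    arrival = manifest["arrival"]["two_forms"]
    if arrival["delta_form"] > arrival["tol_Tarr"]:
        errors.append("delta_form exceeds tol_Tarr")
    if abs(manifest["weights"]["W_norm"] - 1.0) > tol_w:
        errors.append("W_norm deviates from 1 by more than tol_w")
    for item in manifest["contracts"]:
        if not item["id"].startswith("C30-"):  # prefix test stands in for the namespace
            errors.append(f"contract id outside C30-*: {item['id']}")
        if item["status"] not in ALLOWED_STATUS:
            errors.append(f"invalid status: {item['status']}")
        if item["severity"] not in ALLOWED_SEVERITY:
            errors.append(f"invalid severity: {item['severity']}")
    for action in manifest["actions"]:
        if not action["policy_id"].startswith("SC-"):
            errors.append(f"policy_id outside SC-*: {action['policy_id']}")
    return errors
```

When the raw weights are available, `kish_n_eff(weights)` can be compared against `dataset.N_eff`; equal weights give N_eff equal to the sample count, and increasingly uneven weights drive it down.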
IX. Mapping to the Contract Library
- Each contracts[*] item maps to a C30-* entry from Appendix B; evidence carries the triggering metrics (e.g., coverage_rate, alpha_spent, W1/KL/psi, SMD_max).
- Each actions[*] item maps to an Appendix B strategy card SC-*, recording decision, rationale, and timestamp.
- Failure items must align with the release gates in Methods.Cleaning v1.0, Chapter 10: assert_contract(ds, tests) -> report.
X. Traceability & Signature Specification
- repro_hash = hash_sha256( code_uri ∥ params ∥ data_fingerprint )
- signature = sign( private_key, repro_hash ); verify with verify(public_key, signature, repro_hash) = true
- Record provenance.env so containers and package versions are reconstructible.
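The hash definition above can be sketched with the standard library. The source does not define how the three parts are serialized before concatenation, so this example makes its own choices and labels them as assumptions: `params` is serialized as canonical JSON with sorted keys, each part is UTF-8 encoded and length-prefixed to keep the concatenation unambiguous, and the result carries the "sha256:" prefix seen in the example manifests. Signing is left out because it needs an asymmetric-key library and a key-management scheme that this appendix does not specify.

```python
import hashlib
import json


def repro_hash(code_uri: str, params: dict, data_fingerprint: str) -> str:
    """SHA-256 over code_uri, params, and data_fingerprint (assumed serialization)."""
    canonical_params = json.dumps(params, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256()
    for part in (code_uri, canonical_params, data_fingerprint):
        encoded = part.encode("utf-8")
        # Length prefix (an assumption of this sketch) so that ("ab", "c")
        # and ("a", "bc") cannot collide after concatenation.
        digest.update(len(encoded).to_bytes(8, "big"))
        digest.update(encoded)
    return "sha256:" + digest.hexdigest()
```

Because the parameters are serialized with sorted keys, two runs with the same settings produce the same hash regardless of dict insertion order.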
XI. Versioning & Compatibility
- schema_version follows MAJOR.MINOR.PATCH.
MAJOR changes may break readers; MINOR adds optional fields; PATCH refines descriptive notes.
- Backward-compatibility policy
Readers should backfill defaults for newly added optional fields (e.g., optional subkeys under metrics.*) across MINOR increments.
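A reader following this policy might look like the sketch below. It is illustrative only: the rule "same MAJOR means readable" is inferred from the statement that MAJOR changes may break readers, and the helper names and any default values passed in are hypothetical, since the source does not list concrete defaults.

```python
import copy


def parse_version(text: str):
    """Split 'MAJOR.MINOR.PATCH' into a tuple of three ints."""
    major, minor, patch = (int(part) for part in text.split("."))
    return major, minor, patch


def can_read(reader_version: str, manifest_version: str) -> bool:
    """Assumed rule: a reader handles any manifest with the same MAJOR version."""
    return parse_version(reader_version)[0] == parse_version(manifest_version)[0]


def _merge_defaults(target: dict, defaults: dict) -> None:
    """Fill keys missing from target, recursing into nested dicts."""
    for key, value in defaults.items():
        if key not in target:
            target[key] = copy.deepcopy(value)
        elif isinstance(value, dict) and isinstance(target[key], dict):
            _merge_defaults(target[key], value)


def backfill(manifest: dict, defaults: dict) -> dict:
    """Return a copy of manifest with missing optional fields set to defaults."""
    result = copy.deepcopy(manifest)
    _merge_defaults(result, defaults)
    return result
```

For example, a reader built for a later MINOR release could call `backfill(manifest, {"metrics": {"drift": None}})` on an older manifest; existing values are never overwritten, and the input dict is left unchanged.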
Summary
- This template consolidates core metrics, dual T_arr formulations, time-base semantics, and the contract–strategy loop into a unified persisted manifest.stats.
- With the minimal key set and worked examples (A/B, drift, causal), it enables cross-system reuse of statistical results, compliance auditing, and rollback-ready publication.
Copyright & License (CC BY 4.0)
Copyright: Unless otherwise noted, the copyright of “Energy Filament Theory” (text, charts, illustrations, symbols, and formulas) belongs to the author “Guanglin Tu”.
License: This work is licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0). You may copy, redistribute, excerpt, adapt, and share for commercial or non‑commercial purposes with proper attribution.
Suggested attribution: Author: “Guanglin Tu”; Work: “Energy Filament Theory”; Source: energyfilament.org; License: CC BY 4.0.
First published: 2025-11-11 | Current version: v5.1
License link: https://creativecommons.org/licenses/by/4.0/