Appendix C Manifest Templates and Examples
One-Sentence Goal
Provide the minimal key set, field conventions, and ready-to-use templates/samples for publication-grade manifests that enable traceable, auditable, and revertible cleansing releases.
I. Scope & Targets
- Applicable objects
Batch dataset releases, event-stream window releases, and online snapshot releases.
- Output forms
manifest.json (machine-readable) and manifest.sig (signature block); an optional human-readable manifest.yaml where needed.
- Baseline constraints
- All internal computation runs on tau_mono; external publication uses ts.
- Arrival time must be recorded in both forms of T_arr, together with delta_form.
- Dimensional conservation must pass check_dim(expr).
II. Structure & Keying Conventions
Recommended top-level keys (stable names)
- meta: version, generator, and timestamp
- lineage: sources, jobs, commits, and TraceID
- schema: schema reference and hash
- units_dim: unit system and dimensional-check summary
- timing: offset/skew/J and time-base mapping
- path_arrival: gamma(ell), both forms, and delta_form
- missing_impute: mask m ∈ {0,1} coverage and imputation records
- quality: q_score, coverage, and P99 metrics
- outlier_drift: outlier rate and drift metrics
- integrity: uniqueness, referential checks, and dedup results
- env_correction: RefCond and corr_env(x; RefCond)
- contracts: assert_contract test list and results
- release: freeze information, signature, and rollback anchors
- artifacts: exported payloads and checksums
- audit: execution log and chained hashes
III. Minimal Required Keys
- meta.version = "1.0"
- meta.generated_at = ts
- lineage.TraceID
- schema.ref and schema.hash
- units_dim.system and units_dim.pass
- timing.offset, timing.skew, timing.J
- path_arrival.c_ref, path_arrival.T_arr_form1, path_arrival.T_arr_form2, path_arrival.delta_form, path_arrival.tol_Tarr
- missing_impute.missing_ratio, missing_impute.methods[]
- integrity.unique_pass, integrity.fk_pass
- contracts.tests[] and contracts.pass
- release.tag, release.freeze_at, release.hash_sha256, release.signature
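The minimal key set above lends itself to a mechanical presence check before a manifest is accepted. The following is a minimal sketch, not a normative validator: the function names has_key and missing_required_keys are hypothetical, and the rule that a required key may carry a unit suffix (for example offset_ms for timing.offset) is an assumption made here for illustration.

```python
from typing import Any

# Dotted paths taken from Section III. The "[]" markers are dropped because
# this sketch only checks presence, not element types.
REQUIRED_KEYS = [
    "meta.version", "meta.generated_at",
    "lineage.TraceID",
    "schema.ref", "schema.hash",
    "units_dim.system", "units_dim.pass",
    "timing.offset", "timing.skew", "timing.J",
    "path_arrival.c_ref", "path_arrival.T_arr_form1",
    "path_arrival.T_arr_form2", "path_arrival.delta_form",
    "path_arrival.tol_Tarr",
    "missing_impute.missing_ratio", "missing_impute.methods",
    "integrity.unique_pass", "integrity.fk_pass",
    "contracts.tests", "contracts.pass",
    "release.tag", "release.freeze_at",
    "release.hash_sha256", "release.signature",
]


def has_key(section: Any, name: str) -> bool:
    """True if `name` is present exactly or with a suffix such as `_ms`.

    Accepting suffixed names is an assumption of this sketch.
    """
    if not isinstance(section, dict):
        return False
    return any(k == name or k.startswith(name + "_") for k in section)


def missing_required_keys(manifest: dict) -> list[str]:
    """Return the required dotted paths that the manifest does not provide."""
    missing = []
    for path in REQUIRED_KEYS:
        section_name, key = path.split(".", 1)
        if not has_key(manifest.get(section_name), key):
            missing.append(path)
    return missing


# Usage: an intentionally incomplete manifest fragment.
fragment = {
    "meta": {"version": "1.0", "generated_at": "2025-08-30T02:10:45Z"},
    "timing": {"offset_ms": 1.8, "skew_ppm": 27.0},
}
for path in missing_required_keys(fragment):
    print("missing:", path)
```

A stricter deployment would likely pin exact key names in a schema; the suffix tolerance is only a convenience for this sketch.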
IV. Field Conventions (Selected)
- Arrival time, two forms
- T_arr_form1 = ( 1 / c_ref ) * ( ∫_{gamma(ell)} n_eff d ell )
- T_arr_form2 = ( ∫_{gamma(ell)} ( n_eff / c_ref ) d ell )
- delta_form = | T_arr_form1 - T_arr_form2 |; must assert delta_form ≤ tol_Tarr.
- Time-base indicators
offset (mean phase offset), skew (frequency offset, ppm), J (jitter at P95 or P99).
- Dimensions & units
unit(t_arr)="s", dim(t_arr)="[T]"; all fields must pass check_dim before publication.
- Path & length
non_decreasing(ell); L_gamma = ( ∫_gamma 1 d ell ) recorded under path_arrival.
- Missingness & imputation
Mask coverage for m and all imputation methods with RefCond must be persisted.
- Signature & traceability
release.hash_sha256(blob) and release.signature; audit chain linked via audit.prev_hash.
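The arrival-time and path-length conventions above can be sketched numerically. The example below is illustrative only: it assumes the path is given as discrete samples of ell and n_eff, uses the trapezoidal rule as the integrator, and the names trapezoid and arrival_two_forms are hypothetical. With a constant c_ref the two forms agree analytically, so delta_form here mainly reflects numerical round-off.

```python
def trapezoid(ys, xs):
    """Composite trapezoidal rule for samples ys taken at positions xs."""
    return sum(
        (xs[i + 1] - xs[i]) * (ys[i] + ys[i + 1]) / 2.0
        for i in range(len(xs) - 1)
    )


def arrival_two_forms(ell, n_eff, c_ref, tol_Tarr):
    """Compute both arrival-time forms, their gap, and the path length."""
    if len(ell) != len(n_eff) or len(ell) < 2:
        raise ValueError("ell and n_eff need equal length and >= 2 samples")
    # non_decreasing(ell) from the Path & length convention
    if any(b < a for a, b in zip(ell, ell[1:])):
        raise ValueError("ell must be non-decreasing")

    # Form 1: constant factor pulled out of the integral
    form1 = (1.0 / c_ref) * trapezoid(n_eff, ell)
    # Form 2: factor kept inside the integrand
    form2 = trapezoid([n / c_ref for n in n_eff], ell)

    delta_form = abs(form1 - form2)
    if delta_form > tol_Tarr:
        raise ValueError(f"delta_form {delta_form:.3e} exceeds tol_Tarr")

    return {
        "L_gamma": trapezoid([1.0] * len(ell), ell),
        "T_arr_form1": form1,
        "T_arr_form2": form2,
        "delta_form": delta_form,
    }


# Hypothetical five-sample path; the values are illustrative, not measured.
ell = [0.0, 310.925, 621.85, 932.775, 1243.7]
n_eff = [1.0003, 1.0003, 1.0002, 1.0003, 1.0003]
result = arrival_two_forms(ell, n_eff, c_ref=2.99792458e8, tol_Tarr=5.0e-6)
print(result)
```

The returned dictionary maps directly onto the path_arrival keys, which keeps the computation and the manifest fields aligned.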
V. Template A (Batch Release manifest.json)
VI. Template B (Event-Window Release manifest.window.json)
VII. Template C (Online Snapshot Release manifest.api.json)
VIII. Filled Sample (Batch Release)
{
"meta": {
"version": "1.0",
"title": "D_clean daily snapshot",
"generated_at": "2025-08-30T02:10:45Z",
"generator": "EFT.cleaning.freeze_release/1.3.2"
},
"lineage": {
"source_uris": ["s3://lab/raw/2025-08-29/"],
"job_id": "7d8c1f1e-1c3a-4bd7-91d2-7c2b3d1e0a77",
"commit": "a9b3c4d",
"TraceID": "tr-01HZY2P6Z9"
},
"schema": {
"ref": "EFT.WP.Core.DataSpec v1.0:SRef",
"hash": "3f1c0f7b1b7a2c...e9d"
},
"units_dim": {
"system": "SI",
"checks": [
{"expr": "t_arr", "dim": "[T]", "pass": true},
{"expr": "n_eff", "dim": "[]", "pass": true}
],
"pass": true
},
"timing": {
"timebase_in": "tau_mono",
"timebase_out": "ts",
"offset_ms": 1.8,
"skew_ppm": 27.0,
"J_ms_p99": 2.6
},
"path_arrival": {
"gamma_param": "ell",
"L_gamma": 1243.7,
"c_ref": 2.99792458e8,
"T_arr_form1_s": 4.150002e-06,
"T_arr_form2_s": 4.150006e-06,
"delta_form_s": 4.0e-12,
"tol_Tarr_s": 5.0e-06,
"p99_delta_form_s": 6.0e-12
},
"missing_impute": {
"missing_ratio": 0.032,
"mask_field": "m",
"methods": [
{"field": "Xi", "method": "linear", "RefCond": "T=293K,P=1atm"}
]
},
"quality": {
"q_score_mean": 0.982,
"q_score_p99": 0.998,
"coverage": {"records": 18423321, "fields": 57}
},
"outlier_drift": {
"outlier_rate": 0.008,
"drift_metric": "PSI",
"drift_value": 0.06,
"window": "7d"
},
"integrity": {
"unique_keys": ["pk"],
"unique_pass": true,
"fk_checks": [
{"child": "pid", "parent": "pid_ref", "pass": true, "orphan": 0}
],
"fk_pass": true,
"dedup": {"conflicts_resolved": 271, "residual_conflicts": 0}
},
"env_correction": {
"RefCond": "T=293K,P=1atm",
"fields": ["T_arr"],
"corr": "corr_env(T_arr; RefCond)",
"uncertainty_U": 1.4e-07
},
"contracts": {
"tests": [
"UNIQUE(pk)",
"DIM(\"t_arr\",\"[T]\")",
"ARRIVAL_FORMS(c_ref=2.99792458e8, tol=5e-6, tolP99=1e-5)",
"MANIFEST_SIGNED()"
],
"pass": true,
"failed": []
},
"release": {
"tag": "clean-2025-08-30",
"freeze_at": "2025-08-30T02:12:11Z",
"hash_sha256": "b7c1f6c4...aa12",
"signature": "MEYCIQDZ...AB",
"public_key_id": "kid-ops-2025Q3",
"prev_hash": "8aa9d7...01fe"
},
"artifacts": [
{"name": "D_clean.parquet", "uri": "s3://lab/clean/2025-08-30/D_clean.parquet", "sha256": "1d77...9e"}
],
"audit": {
"operator": "batch-runner",
"events": [
{"ts": "2025-08-30T02:10:50Z", "action": "assert_contract", "result": "pass"},
{"ts": "2025-08-30T02:12:10Z", "action": "sign", "result": "pass"}
],
"prev_hash": "a1f2...ccd"
}
}
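A filled manifest like the sample above can be cross-checked for internal consistency before it is signed. The sketch below is one possible set of checks under stated assumptions: the function name sample_inconsistencies is hypothetical, the relative tolerance used to compare the recorded gap with the recomputed one is an arbitrary choice, and the excerpt copies only the fields it needs from the sample.

```python
import math


def sample_inconsistencies(manifest: dict) -> list[str]:
    """Return human-readable problems found in a filled manifest."""
    problems = []

    # The recorded two-form gap should match the recorded forms.
    pa = manifest["path_arrival"]
    gap = abs(pa["T_arr_form1_s"] - pa["T_arr_form2_s"])
    if not math.isclose(gap, pa["delta_form_s"], rel_tol=1e-6):
        problems.append("delta_form_s does not match |form1 - form2|")
    if pa["delta_form_s"] > pa["tol_Tarr_s"]:
        problems.append("delta_form_s exceeds tol_Tarr_s")

    # The summary flag should agree with the individual dimension checks.
    ud = manifest["units_dim"]
    if ud["pass"] != all(check["pass"] for check in ud["checks"]):
        problems.append("units_dim.pass disagrees with units_dim.checks")

    # A passing contract block should list no failed tests.
    ct = manifest["contracts"]
    if ct["pass"] != (len(ct["failed"]) == 0):
        problems.append("contracts.pass disagrees with contracts.failed")

    return problems


# Excerpt of the filled sample (only the fields used above).
excerpt = {
    "path_arrival": {
        "T_arr_form1_s": 4.150002e-06,
        "T_arr_form2_s": 4.150006e-06,
        "delta_form_s": 4.0e-12,
        "tol_Tarr_s": 5.0e-06,
    },
    "units_dim": {
        "checks": [
            {"expr": "t_arr", "dim": "[T]", "pass": True},
            {"expr": "n_eff", "dim": "[]", "pass": True},
        ],
        "pass": True,
    },
    "contracts": {"pass": True, "failed": []},
}
print(sample_inconsistencies(excerpt))
```

For the excerpt taken from the sample, the list of problems should come back empty.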
IX. Generation & Verification Flow (Interfaces Coupling I10-*)
- Prepare inputs
Run standardize_names and repair_units so schema and dimensions are ready.
- Compute & annotate
- align_timebase yields offset/skew/J.
- enforce_arrival_time_convention computes both forms and delta_form.
- handle_missing persists m and RefCond.
- detect_outlier and dedup results feed statistics.
- Execute contracts
Call assert_contract(ds, tests), aggregate pass/fail with sev, and write into contracts.
- Freeze & sign
freeze_release(ds, tag) produces artifacts; compute hash_sha256(blob); sign via KMS to signature; chain prev_hash.
- Publish & audit
Persist manifest and artifacts to object storage; append audit.events and chain hash.
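The freeze-and-sign step in the flow above can be illustrated with a small hashing sketch. Several things here are assumptions rather than the specified behavior: the digest is taken over a canonical JSON serialization of the manifest with release.hash_sha256 and release.signature excluded, the names canonical_bytes, seal_release, and verify_hash are hypothetical, and the sign callable is only a stand-in for whatever KMS signing call a deployment uses.

```python
import hashlib
import json


def canonical_bytes(obj) -> bytes:
    """Serialize with sorted keys and fixed separators so the hash is stable."""
    text = json.dumps(obj, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False)
    return text.encode("utf-8")


def hash_sha256(blob: bytes) -> str:
    return hashlib.sha256(blob).hexdigest()


def seal_release(manifest: dict, prev_hash: str, sign) -> dict:
    """Return a copy with release.prev_hash, hash_sha256, and signature set.

    `sign` is a caller-supplied callable (digest -> signature string);
    it stands in for a KMS call and is not a real signing API.
    """
    sealed = json.loads(json.dumps(manifest))  # deep copy of JSON data
    release = sealed.setdefault("release", {})
    release.pop("hash_sha256", None)
    release.pop("signature", None)
    release["prev_hash"] = prev_hash           # chain to the previous release
    digest = hash_sha256(canonical_bytes(sealed))
    release["hash_sha256"] = digest
    release["signature"] = sign(digest)
    return sealed


def verify_hash(sealed: dict) -> bool:
    """Recompute the digest over everything except hash and signature."""
    body = json.loads(json.dumps(sealed))
    release = body["release"]
    claimed = release.pop("hash_sha256")
    release.pop("signature", None)
    return hash_sha256(canonical_bytes(body)) == claimed


# Usage with a placeholder signer (NOT a cryptographic signature).
draft = {"meta": {"version": "1.0"}, "release": {"tag": "clean-2025-08-30"}}
sealed = seal_release(draft, prev_hash="8aa9d7...01fe",
                      sign=lambda digest: "placeholder:" + digest[:8])
print(verify_hash(sealed))
```

Excluding the hash and signature fields from the hashed body avoids a self-referential digest; any later change to the sealed manifest makes verify_hash return False.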
X. Validation Focus & Rollback Anchors
- Dimensional conservation
Assert check_dim( t_arr - ( 1 / c_ref ) * ( ∫ n_eff d ell ) ) = 0.
- Two-form gap
Assert delta_form ≤ tol_Tarr and Q_0.99(delta_form) ≤ tolP99_Tarr.
- Monotonicity & referential integrity
Assert non_decreasing(ts) and all foreign_key checks pass; residual_conflicts = 0.
- Rollback
Use release.prev_hash to locate the most recent healthy version; execute rollback or replay.
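The rollback anchor above amounts to walking the release.prev_hash chain until a healthy manifest is found. The sketch below assumes a registry that maps release hashes to manifests and defines "healthy" as contracts.pass being true with residual_conflicts equal to 0; both the registry shape and that health rule are assumptions of this example, and the names is_healthy and find_rollback_target are hypothetical.

```python
from typing import Optional


def is_healthy(manifest: dict) -> bool:
    """Assumed health rule: contracts passed and no residual dedup conflicts."""
    contracts_ok = manifest.get("contracts", {}).get("pass") is True
    residual = (manifest.get("integrity", {})
                        .get("dedup", {})
                        .get("residual_conflicts", 0))
    return contracts_ok and residual == 0


def find_rollback_target(registry: dict, start_hash: str) -> Optional[dict]:
    """Follow prev_hash links from start_hash to the first healthy manifest.

    Returns None if the chain ends, leaves the registry, or loops.
    """
    seen = set()
    current = start_hash
    while current in registry and current not in seen:
        seen.add(current)
        manifest = registry[current]
        if is_healthy(manifest):
            return manifest
        current = manifest.get("release", {}).get("prev_hash")
    return None


# Hypothetical three-release chain: h3 (failing) -> h2 (failing) -> h1 (healthy).
registry = {
    "h1": {"release": {"tag": "clean-2025-08-28", "prev_hash": None},
           "contracts": {"pass": True},
           "integrity": {"dedup": {"residual_conflicts": 0}}},
    "h2": {"release": {"tag": "clean-2025-08-29", "prev_hash": "h1"},
           "contracts": {"pass": False}},
    "h3": {"release": {"tag": "clean-2025-08-30", "prev_hash": "h2"},
           "contracts": {"pass": False}},
}
target = find_rollback_target(registry, registry["h3"]["release"]["prev_hash"])
print(target["release"]["tag"] if target else "no healthy release found")
```

The seen set guards against a corrupted chain that loops back on itself, so the search always terminates.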
Summary
This appendix standardizes the structure, minimal keys, and templates for three release scenarios, accompanied by a filled sample. Persisting manifests per this template yields cross-volume consistency for two-form arrival, time base, dimensions, referential integrity, and signature traceability, which in turn supports assert_contract, freeze_release, and audit-driven rollback across the end-to-end lifecycle.
Copyright & License (CC BY 4.0)
Copyright: Unless otherwise noted, the copyright of “Energy Filament Theory” (text, charts, illustrations, symbols, and formulas) belongs to the author “Guanglin Tu”.
License: This work is licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0). You may copy, redistribute, excerpt, adapt, and share for commercial or non‑commercial purposes with proper attribution.
Suggested attribution: Author: “Guanglin Tu”; Work: “Energy Filament Theory”; Source: energyfilament.org; License: CC BY 4.0.
First published: 2025-11-11 | Current version: v5.1
License link: https://creativecommons.org/licenses/by/4.0/