Home / Docs-Technical WhitePaper / 13-EFT.WP.Methods.SimStack v1.0
Chapter 6: Data Model, Manifests & Persistence
I. Scope & Objectives
- Standardize the data objects, manifest fields, and persistence conventions for simulation and observation so that artifacts are reproducible, comparable, and traceable across layers (continuous kernel, discrete weaving, coupling & advancement) and across devices.
- Objectives: fix a minimal field set with units and measures, clarify windowing and publication rules for spectral quantities S_xx(f), and define quality gates (eps_norm, eps_mass, delta_form, eps_time_map) together with version semantics.
II. Terms & Symbols
- Time bases & paths
- tau_mono (internal monotonic clock), ts (published time base), linear mapping ts = alpha * tau_mono + beta + epsilon(t).
- gamma(ell), d ell, path length L_gamma = ( ∫_gamma 1 d ell ).
- Two T_arr formulations & discrepancy
T_arr = ( ∫ ( n_eff / c_ref ) d ell ); T_arr = ( 1 / c_ref ) * ( ∫ n_eff d ell );
delta_form = | ( ∫ ( n_eff / c_ref ) d ell ) - ( 1 / c_ref ) * ( ∫ n_eff d ell ) |.
- Conservation measures
eps_norm (normalization error), eps_mass (conservation residual).
- Spectra & windows
S_xx(f), U_w (window normalization constant), ENBW (equivalent noise bandwidth).
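The two T_arr formulations and delta_form defined above can be illustrated on a discretized path. The following is a minimal sketch, not the reference implementation: the trapezoid quadrature, the function names, and the sample path (1 km, n_eff rising linearly from 1.0 to 1.5) are assumptions chosen for illustration only.

```python
from typing import Sequence, Tuple


def trapezoid(values: Sequence[float], ell: Sequence[float]) -> float:
    """Composite trapezoid rule for samples values[i] at path nodes ell[i]."""
    return sum(
        0.5 * (values[i] + values[i + 1]) * (ell[i + 1] - ell[i])
        for i in range(len(ell) - 1)
    )


def arrival_times(
    n_eff: Sequence[float], ell: Sequence[float], c_ref: float
) -> Tuple[float, float, float]:
    """Return (first form, second form, delta_form) for one sampled path."""
    # First form: integrate n_eff / c_ref along the path.
    t_first = trapezoid([n / c_ref for n in n_eff], ell)
    # Second form: integrate n_eff, then divide once by c_ref.
    t_second = trapezoid(n_eff, ell) / c_ref
    return t_first, t_second, abs(t_first - t_second)


# Hypothetical path: 101 nodes spaced 10 m apart, n_eff linear from 1.0 to 1.5.
ell = [10.0 * i for i in range(101)]
n_eff = [1.0 + 0.5 * i / 100 for i in range(101)]
c_ref = 299_792_458.0  # m/s, assumed reference speed

t_first, t_second, delta_form = arrival_times(n_eff, ell, c_ref)
print(t_first, t_second, delta_form)
```

With a constant c_ref the two forms agree analytically, so in this sketch delta_form only reflects floating-point rounding in the two evaluation orders.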
III. Postulates & Minimal Equations (P61-* / S62-*)
- P61-8 (Explicit measures and units)
Any integrable quantity must declare domain and measure in the manifest, e.g., dV/dS/d ell. All fields are published in SI units; if non-SI is used, provide a conversion factor.
- P61-9 (Time-base–coherent publication)
External timestamps are published in ts while persisting the mapping parameters alpha/beta and their uncertainties to support replay back to tau_mono.
- P61-10 (Two-form parallelism with discrepancy reporting)
Any artifact involving arrival time must publish both T_arr forms and delta_form.
- S62-30 (Windowing and ENBW)
For a discrete window w[n] at sampling rate f_s,
ENBW = f_s * ( Σ w[n]^2 ) / ( Σ w[n] )^2.
Spectral publications must include window.type, U_w, and ENBW.
- S62-31 (Dimensional conservation checks)
- check_dim( ( n_eff / c_ref ) * d ell ) = [T]; check_dim( ∫_V rho dV ) = [M].
- Artifacts failing checks must not be ingested.
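The ENBW formula of S62-30 can be checked numerically. The sketch below evaluates it for a periodic Hann window; the function names are hypothetical, and the definition U_w = (1/N) * Σ w[n]^2 is an assumption, since this chapter names U_w only as the window normalization constant.

```python
import math
from typing import List, Sequence


def hann_periodic(n_samples: int) -> List[float]:
    """Periodic Hann window w[n] = 0.5 - 0.5 * cos(2*pi*n/N), n = 0..N-1."""
    return [
        0.5 - 0.5 * math.cos(2.0 * math.pi * n / n_samples)
        for n in range(n_samples)
    ]


def enbw_hz(window: Sequence[float], f_s: float) -> float:
    """ENBW = f_s * sum(w^2) / (sum(w))^2, in Hz, as stated in S62-30."""
    s1 = sum(window)
    s2 = sum(w * w for w in window)
    return f_s * s2 / (s1 * s1)


def window_power(window: Sequence[float]) -> float:
    """Assumed normalization constant: U_w = (1/N) * sum(w^2)."""
    return sum(w * w for w in window) / len(window)


f_s = 1000.0  # Hz, hypothetical sampling rate
w = hann_periodic(256)

# Fields a spectral publication would carry under S62-30.
window_fields = {
    "window.type": "hann",
    "U_w": window_power(w),
    "ENBW": enbw_hz(w, f_s),
}
print(window_fields)
```

For a periodic Hann window the sums have closed forms (Σ w = N/2, Σ w^2 = 3N/8), so ENBW = 1.5 * f_s / N, which gives a quick sanity check on published values.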
IV. Data Objects & Units (Object Layer)
- Scalars & fields
- scalar: e.g., c_ref.
- field: e.g., n_eff(x,t), rho(x,t), with explicit grid and interpolation conventions.
- Paths & events
- path: parameterization of gamma(ell), node set and quadrature weights.
- event: causally ordered discrete events under hb, carrying an idempotency_key.
- Spectra & statistics
- spectrum: S_xx(f) with window attributes.
- stats: mean/std/quantile, each tied to a declared window in the manifest.
V. Minimal Manifest Field Set & Schema Versioning (Publication Layer)
- Top-level identity
artifact.id, artifact.hash, schema.semver, producer, pipeline.rev, run.seed.
- Time base & windowing
time.ts_unit = s, time.alpha, time.beta, time.r_rms, window.size, window.step, window.type, ENBW.
- Domain & measures
domain.type ∈ [path, volume, area, time, freq], measure ∈ [d ell, dV, dS, dt, df], support and boundary-condition description.
- Continuous-kernel fields
n_eff.source/context, c_ref.source, rho.units, S_xx.units, U_w.
- Paths & arrival time
gamma.param (node coordinates and parameterization), T_arr.general, T_arr.factorized, delta_form.
- Threads & metrics
TS.latency.p50/p90/p99, TS.throughput.rps, TS.queue.backlog, TS.hb.violations.
- Quality gates
eps_norm, eps_mass, eps_time_map, each with threshold and status ∈ [pass, fail].
- Compliance & provenance
audit.trail (event and compensation summary), config.hash, env.lock, deps.lock, provenance.inputs[].
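A manifest carrying part of this field set might look like the following sketch. The nested-dictionary layout, the helper names, and every value are assumptions for illustration; the chapter fixes the field names but not the container layout, and only the identity and time-base fields are checked here.

```python
from typing import Any, Dict, Iterable, List

# Subset of the minimal field set: top-level identity and time base only.
REQUIRED_FIELDS = (
    "artifact.id", "artifact.hash", "schema.semver", "producer",
    "pipeline.rev", "run.seed",
    "time.ts_unit", "time.alpha", "time.beta", "time.r_rms",
)


def lookup(manifest: Dict[str, Any], dotted: str) -> Any:
    """Resolve a dotted field name such as 'time.alpha' in a nested dict."""
    node: Any = manifest
    for part in dotted.split("."):
        if not isinstance(node, dict) or part not in node:
            raise KeyError(dotted)
        node = node[part]
    return node


def missing_fields(manifest: Dict[str, Any], required: Iterable[str]) -> List[str]:
    """Return the required field names that the manifest does not provide."""
    missing = []
    for name in required:
        try:
            lookup(manifest, name)
        except KeyError:
            missing.append(name)
    return missing


# Hypothetical manifest; all values are placeholders.
example_manifest = {
    "artifact": {"id": "example-0001", "hash": "sha256:placeholder"},
    "schema": {"semver": "1.0.0"},
    "producer": "example-producer",
    "pipeline": {"rev": "r1"},
    "run": {"seed": 12345},
    "time": {"ts_unit": "s", "alpha": 1.0, "beta": 0.0, "r_rms": 0.0},
}

missing = missing_fields(example_manifest, REQUIRED_FIELDS)
print("missing fields:", missing)
```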
VI. Quality & Missingness Annotations, Near-Independence Assumption (Quality Layer)
- Missingness & masks
- Mark numeric missing values as NaN, non-numeric missing values as MISSING, clipped values as CLIPPED, and interpolated or extrapolated values as INTERP or EXTRAP.
- quality.mask.cases enumerates counts and proportions of each.
- Near-independence assumption
When averaging windows or subsampling intervals exceed the correlation length, mark near_iid = true and provide evidence via autocorrelation estimates rho_lag[k].
- Outliers & thresholds
Run outlier detection (e.g., zscore or quantile gates) pre-publication; record outlier.fraction and the handling policy (retain/clip/impute).
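The mask summary and the autocorrelation evidence described in this section can be sketched as follows. The "OK" flag for unmasked samples, the function names, and the 2/sqrt(N) bound used to decide near_iid are assumptions; the bound is a common rule of thumb for white noise, not a threshold prescribed by this chapter.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence


def mask_cases(flags: Sequence[str]) -> Dict[str, Dict[str, float]]:
    """Count and proportion of each quality flag (content of quality.mask.cases)."""
    total = len(flags)
    if total == 0:
        return {}
    counts = Counter(flags)
    return {
        flag: {"count": count, "proportion": count / total}
        for flag, count in counts.items()
    }


def autocorr(x: Sequence[float], max_lag: int) -> List[float]:
    """Sample autocorrelation rho_lag[k] for k = 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    var = sum(d * d for d in dev)
    if var == 0.0:
        raise ValueError("constant series has no defined autocorrelation")
    return [
        sum(dev[i] * dev[i + k] for i in range(n - k)) / var
        for k in range(max_lag + 1)
    ]


def is_near_iid(rho_lag: Sequence[float], n_samples: int) -> bool:
    """Assumed rule: every lag k >= 1 must lie within +/- 2/sqrt(N)."""
    bound = 2.0 / math.sqrt(n_samples)
    return all(abs(r) <= bound for r in rho_lag[1:])


# Hypothetical flags and a strongly anti-correlated series for illustration.
cases = mask_cases(["OK", "OK", "NaN", "CLIPPED"])
series = [1.0, -1.0] * 50
rho_lag = autocorr(series, max_lag=3)
near_iid = is_near_iid(rho_lag, len(series))
print(cases, rho_lag, near_iid)
```

The alternating series is deliberately far from independent, so the sketch reports near_iid as false; a subsampled or window-averaged series would be tested the same way.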
VII. Algorithms & Implementation Bindings (I60-*)
- I60-11 validate_manifest(manifest:dict, strict:bool) -> QualityReport
Validate units, measures, time-base mapping, two T_arr forms, and quality gates. Return status=pass with a summary on success.
- I60-12 write_artifact(obj:any, manifest:dict, fmt:str) -> Uri
Persist object and manifest in fmt ∈ [ndjson, parquet, arrow]; return a content-addressed Uri.
- I60-13 read_artifact(uri:Uri) -> (obj:any, manifest:dict)
Read back object and manifest, ensuring round-trip consistency (hash and statistical summaries match).
- I60-14 index_partition(manifest:dict, policy:dict) -> Partitions
Generate partition keys and indices per policies like time/day, scenario, domain.type.
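The content-addressed round trip required of I60-12 and I60-13 can be illustrated with a simplified sketch. It is not the I60 binding itself: it stores one canonical JSON file instead of ndjson, parquet, or arrow, takes a directory argument instead of fmt, and the function names, the "sha256:" Uri scheme, and the sample data are all assumptions.

```python
import hashlib
import json
import tempfile
from pathlib import Path
from typing import Any, Dict, Tuple


def write_artifact_json(obj: Any, manifest: Dict[str, Any], root: Path) -> str:
    """Persist obj and manifest as canonical JSON; return a content-addressed Uri."""
    payload = json.dumps(
        {"manifest": manifest, "obj": obj},
        sort_keys=True,
        separators=(",", ":"),
    ).encode("utf-8")
    digest = hashlib.sha256(payload).hexdigest()
    (root / f"{digest}.json").write_bytes(payload)
    return f"sha256:{digest}"


def read_artifact_json(uri: str, root: Path) -> Tuple[Any, Dict[str, Any]]:
    """Read back obj and manifest, rejecting content whose hash does not match."""
    digest = uri.split(":", 1)[1]
    payload = (root / f"{digest}.json").read_bytes()
    if hashlib.sha256(payload).hexdigest() != digest:
        raise ValueError("content hash mismatch")
    record = json.loads(payload)
    return record["obj"], record["manifest"]


# Hypothetical object and manifest for the round trip.
obj = {"T_arr": [4.1e-6, 4.2e-6]}
manifest = {"schema": {"semver": "1.0.0"}, "run": {"seed": 7}}

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    uri = write_artifact_json(obj, manifest, root)
    obj_back, manifest_back = read_artifact_json(uri, root)

print(uri, obj_back == obj, manifest_back == manifest)
```

Serializing with sorted keys and fixed separators makes the digest independent of dictionary insertion order, which is what lets identical content map to the same Uri.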
VIII. Metrology Flows & Run Diagrams (Aligned with Mx-6*)
- Mx-62 conservation-check
Process:
- Compute eps_norm and eps_mass;
- Run check_dim and unit audits;
- Decide quality gates and write back QualityReport;
- On failure, block persistence and roll back.
- Mx-64 publish-and-ingest
Process:
- Produce manifest and config.hash;
- Validate via I60-11;
- Persist via I60-12 and index with I60-14;
- Emit audit.trail and dashboard snapshot (TS.*).
IX. Publication Guidance (Naming, Units, Versioning)
- Naming & encoding
Keys use lower_snake_case; separate magnitude and unit, e.g., value and unit.
- Units & dimensions
Prefer SI: m, s, kg, A, K, mol, cd; frequency Hz; spectral quantities as unit^2/Hz.
- Versioning & compatibility
schema.semver = MAJOR.MINOR.PATCH; breaking changes only at MAJOR increments; manifests must include a compatibility section compat.notes.
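A reader-side compatibility check based on schema.semver could look like the sketch below. The policy it encodes (same MAJOR, and reader MINOR not older than the artifact's MINOR) is an assumption that goes slightly beyond the rule stated above, which only says that breaking changes occur at MAJOR increments; the function names are hypothetical.

```python
from typing import Tuple


def parse_semver(text: str) -> Tuple[int, int, int]:
    """Split 'MAJOR.MINOR.PATCH' into three integers."""
    major, minor, patch = (int(part) for part in text.split("."))
    return major, minor, patch


def can_read(reader_schema: str, artifact_schema: str) -> bool:
    """Assumed policy: same MAJOR, and the reader's MINOR is not older."""
    reader = parse_semver(reader_schema)
    artifact = parse_semver(artifact_schema)
    # PATCH is ignored: patch releases are taken not to change the schema.
    return reader[0] == artifact[0] and reader[1] >= artifact[1]


print(can_read("1.2.0", "1.0.3"))  # newer reader, older artifact, same MAJOR
print(can_read("1.0.0", "1.2.0"))  # artifact may carry fields the reader lacks
print(can_read("2.0.0", "1.4.1"))  # MAJOR differs
```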
X. Verification & Test Matrix
- Minimum required
- Units & dimensions: craft passing/failing check_dim examples and verify blocking policy.
- Two forms: delta_form ≈ 0 in constant media; stable discrepancy across segmented paths.
- Round-trip: after write_artifact, read_artifact reproduces identical statistical summaries.
- Boundary & extreme cases
High missingness and mask ratios; high-jitter scenarios stressing the eps_time_map gate; very narrow windows producing high ENBW.
- Regression & thresholds
With a fixed baseline scenario, compare Δeps_mass, Δdelta_form, ΔTS.latency.p99, ΔENBW, and outlier-rate shifts.
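For the units-and-dimensions row of the test matrix, passing and failing check_dim cases can be crafted with a small exponent-vector model. This is a sketch only: dimensions are tracked as (M, L, T) exponents, n_eff is assumed dimensionless and c_ref is assumed to be a speed, and dims_match is a hypothetical stand-in for check_dim(expr).

```python
from typing import Tuple

Dim = Tuple[int, int, int]  # exponents of (M, L, T)

DIMENSIONLESS: Dim = (0, 0, 0)
MASS: Dim = (1, 0, 0)
LENGTH: Dim = (0, 1, 0)
TIME: Dim = (0, 0, 1)


def mul(a: Dim, b: Dim) -> Dim:
    """Dimension of a product: exponents add."""
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2])


def div(a: Dim, b: Dim) -> Dim:
    """Dimension of a quotient: exponents subtract."""
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def power(a: Dim, k: int) -> Dim:
    """Dimension of a quantity raised to the integer power k."""
    return (a[0] * k, a[1] * k, a[2] * k)


def dims_match(actual: Dim, expected: Dim) -> bool:
    """Hypothetical stand-in for check_dim: compare exponent vectors."""
    return actual == expected


n_eff = DIMENSIONLESS            # assumption: effective index has no dimension
c_ref = div(LENGTH, TIME)        # assumption: reference speed, [L T^-1]
d_ell = LENGTH                   # path measure
rho = div(MASS, power(LENGTH, 3))
d_vol = power(LENGTH, 3)         # volume measure dV

arrival_ok = dims_match(mul(div(n_eff, c_ref), d_ell), TIME)  # expected to pass
mass_ok = dims_match(mul(rho, d_vol), MASS)                   # expected to pass
missing_c_ref = dims_match(mul(n_eff, d_ell), TIME)           # crafted failure
print(arrival_ok, mass_ok, missing_c_ref)
```

The crafted failure omits the division by c_ref, which leaves a length rather than a time; under S62-31 an artifact in that state would be blocked from ingestion.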
XI. Cross-References & Dependencies
- With the Continuous Kernel (Chapter 2)
Publishing n_eff(x,t), rho(x,t), and S_xx(f) must carry explicit measures and window conventions, consistent with T_arr computation.
- With the Thread Network (Chapter 3)
Persist TS.* metrics and event hb records consistently; include idempotency and compensation summaries in audit.trail.
- With Coupled Advancement (Chapter 4)
Map StepReport fields into the manifest, including dt_used/dt_next, eps_norm/eps_mass, and both T_arr forms.
- With Time Calibration (Chapter 5)
time.alpha/beta/r_rms and eps_time_map are mandatory; anchors and two-way estimates are part of the manifest.
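The time-mapping fields can be illustrated with an ordinary least-squares fit of ts = alpha * tau_mono + beta. This is a sketch under assumptions: r_rms is taken to be the root-mean-square of the fit residuals, the anchor values are invented, and the estimator actually prescribed by Chapter 5 may differ.

```python
import math
from typing import Sequence, Tuple


def fit_time_map(
    tau_mono: Sequence[float], ts: Sequence[float]
) -> Tuple[float, float, float]:
    """Least-squares fit of ts = alpha * tau_mono + beta; returns (alpha, beta, r_rms)."""
    n = len(tau_mono)
    mean_tau = sum(tau_mono) / n
    mean_ts = sum(ts) / n
    sxx = sum((t - mean_tau) ** 2 for t in tau_mono)
    sxy = sum((t - mean_tau) * (s - mean_ts) for t, s in zip(tau_mono, ts))
    alpha = sxy / sxx
    beta = mean_ts - alpha * mean_tau
    residuals = [s - (alpha * t + beta) for t, s in zip(tau_mono, ts)]
    # Assumption: r_rms is the RMS of the residual term epsilon(t).
    r_rms = math.sqrt(sum(r * r for r in residuals) / n)
    return alpha, beta, r_rms


def replay_tau(ts_value: float, alpha: float, beta: float) -> float:
    """Invert the mapping to recover tau_mono from a published timestamp."""
    return (ts_value - beta) / alpha


# Hypothetical anchors: slight rate offset, 5 s offset, 1 microsecond jitter.
tau_mono = [float(i) for i in range(10)]
ts = [1.000001 * t + 5.0 + 1e-6 * (-1) ** i for i, t in enumerate(tau_mono)]

alpha, beta, r_rms = fit_time_map(tau_mono, ts)
print(alpha, beta, r_rms, replay_tau(ts[3], alpha, beta))
```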
XII. Risks, Limitations & Open Questions
- Risks
Mixed units or implicit measures break cross-volume comparability; omitting two-form publication makes arrival times untraceable; poor window normalization biases spectra.
- Limitations
File-format internals are not prescribed, but required fields and quality gates are mandatory.
- Open questions
Online estimation of adaptive windows and ENBW; unified uncertainty-propagation models across heterogeneous domains.
XIII. Deliverables & Versioning
- Deliverables
- manifest.template (minimal field set), quality.policy, ingest.script, dashboard.config (TS.*).
- Sample artifacts: path arrival-time manifest, spectral-quantity manifest, queue & SLO metrics manifest.
- Versioning
From v1.0, freeze key field names and gate semantics; add fields in a backward-compatible manner with migration guidance.
XIV. New Terms & Symbols (to memorize)
- Manifests & quality: artifact.id, schema.semver, audit.trail, eps_time_map, QualityReport.
- Windows & spectra: window.size, window.step, window.type, U_w, ENBW, S_xx(f).
- Paths & arrival time: gamma(ell), d ell, T_arr.general, T_arr.factorized, delta_form.
- Time mapping: alpha, beta, r_rms; events & idempotency: hb, idempotency_key.
- Conservation & gates: eps_norm, eps_mass, check_dim(expr), TS.*.
Copyright & License (CC BY 4.0)
Copyright: Unless otherwise noted, the copyright of “Energy Filament Theory” (text, charts, illustrations, symbols, and formulas) belongs to the author “Guanglin Tu”.
License: This work is licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0). You may copy, redistribute, excerpt, adapt, and share for commercial or non‑commercial purposes with proper attribution.
Suggested attribution: Author: “Guanglin Tu”; Work: “Energy Filament Theory”; Source: energyfilament.org; License: CC BY 4.0.
First published: 2025-11-11 | Current version: v5.1
License link: https://creativecommons.org/licenses/by/4.0/