06-EFT.WP.Core.DataSpec v1.0

Chapter 3 — Metadata and the Trace Chain


I. Objectives and Scope


II. Core Definitions and Symbols


III. Minimal Required Manifest Set


IV. Trace Model and Evidence-Chain Structure

  1. Node types
    • source: artifacts from raw acquisition or external provision;
    • method: deterministic or stochastic processing step, recording version and params;
    • artifact: a processing output bound to checksum and schema_ref.
  2. Normalization requirements
    • Every method node records code_rev and params;
    • A given artifact’s checksum uniquely determines its content; signature binds keyref;
    • The Trace must be a DAG; compute TraceID via hash_sha256.
  3. Minimal closure of the evidence chain
    At every node, EvidenceChain must include the parent fingerprints, the current checksum, the signature, and the TraceID, and it must be replayable back to any ancestor.
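
The DAG requirement and the TraceID computation can be sketched as follows. This is a minimal illustration, not the specification's reference implementation: the node layout (a dict with "type" and "parents" keys), the JSON canonicalization, and the function names check_dag, compute_trace_id, and ancestors are all assumptions introduced here. Only the use of SHA-256 and the DAG constraint come from the text above.

```python
import hashlib
import json
from graphlib import CycleError, TopologicalSorter


def check_dag(nodes):
    """Return node ids in parents-first order; raise ValueError if not a DAG."""
    graph = {}
    for node_id, node in nodes.items():
        parents = node.get("parents", [])
        missing = [p for p in parents if p not in nodes]
        if missing:
            raise ValueError(f"{node_id}: unknown parents {missing}")
        graph[node_id] = set(parents)
    try:
        return list(TopologicalSorter(graph).static_order())
    except CycleError as exc:
        raise ValueError("Trace contains a cycle") from exc


def compute_trace_id(nodes):
    """Hash a key-sorted JSON form of the trace (assumed canonicalization)."""
    check_dag(nodes)
    canonical = json.dumps(nodes, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def ancestors(nodes, node_id):
    """Collect every ancestor a replay from node_id would have to reach."""
    seen = set()
    stack = list(nodes[node_id].get("parents", []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(nodes[current].get("parents", []))
    return seen


# Hypothetical three-node trace: source -> method -> artifact.
trace = {
    "raw": {"type": "source", "parents": []},
    "clean": {"type": "method", "parents": ["raw"],
              "code_rev": "rev-placeholder", "params": "dedupe=true"},
    "out": {"type": "artifact", "parents": ["clean"],
            "checksum": "0" * 64, "schema_ref": "schema-placeholder"},
}
print(compute_trace_id(trace))
print(ancestors(trace, "out"))
```

Sorting keys before hashing makes the TraceID independent of the order in which nodes were inserted, which is one plausible way to keep the identifier stable across producers.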

V. Fingerprints, Signatures, and Reproducibility

  1. Fingerprint workflow (Mx-1)
    • Produce canon(ds) by ordering records with order(pk) and normalizing each field to its field specification;
    • Compute checksum = hash_sha256(canon(ds));
    • Generate signature = Sign(checksum, keyref);
    • Write checksum, signature, and keyref to the manifest, appending parents.
  2. The reproducibility triple
    ReproTriple := <checksum, schema_version, code_rev>; strong reproducibility is claimed only when all three are present.
  3. Verification steps
    • Verify signature against keyref;
    • Recompute hash_sha256(canon(ds)) and compare with checksum;
    • Check schema_version compatibility with the local schema;
    • Replay method steps in the Trace, expecting identical checksum.
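
The fingerprint workflow and the first two verification steps can be sketched as below. Several pieces are assumptions made for illustration: the dataset is modeled as a list of dicts, canon is a key-sorted JSON serialization, keystore is a hypothetical in-memory mapping from keyref to key bytes, and Sign is replaced by an HMAC-SHA256 stand-in so the example runs on the standard library alone. A real deployment would presumably use an asymmetric signature scheme. Schema compatibility checks and method replay are not covered.

```python
import hashlib
import hmac
import json


def canon(rows, pk):
    """Order rows by primary key and serialize deterministically (assumed form)."""
    ordered = sorted(rows, key=lambda r: tuple(r[k] for k in pk))
    text = json.dumps(ordered, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False)
    return text.encode("utf-8")


def _sign(checksum, key):
    # Stand-in for Sign(checksum, keyref): HMAC-SHA256, not a true signature.
    return hmac.new(key, checksum.encode("ascii"), hashlib.sha256).hexdigest()


def fingerprint(rows, pk, keyref, keystore, parents=()):
    """Return the manifest fields written by the fingerprint workflow."""
    checksum = hashlib.sha256(canon(rows, pk)).hexdigest()
    return {
        "checksum": checksum,
        "signature": _sign(checksum, keystore[keyref]),
        "keyref": keyref,
        "parents": list(parents),
    }


def verify(rows, pk, manifest, keystore):
    """Check the signature against keyref, then recompute the checksum."""
    expected = _sign(manifest["checksum"], keystore[manifest["keyref"]])
    if not hmac.compare_digest(expected, manifest["signature"]):
        return False
    recomputed = hashlib.sha256(canon(rows, pk)).hexdigest()
    return hmac.compare_digest(recomputed, manifest["checksum"])


def has_repro_triple(manifest):
    """True only when checksum, schema_version, and code_rev are all present."""
    return all(manifest.get(k) for k in ("checksum", "schema_version", "code_rev"))


rows = [{"id": 2, "v": "b"}, {"id": 1, "v": "a"}]
keystore = {"key-1": b"demo-secret"}  # hypothetical key material
manifest = fingerprint(rows, ["id"], "key-1", keystore)
print(verify(rows, ["id"], manifest, keystore))
```

Because canon orders by pk before hashing, the same records supplied in a different order yield the same checksum, while any added, removed, or altered record changes it.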

VI. Metadata Namespaces and Field Dictionary

  1. MD.core.*
    • MD.core.dataset_id : string
    • MD.core.schema_ref : string
    • MD.core.schema_version : string
    • MD.core.pk : array<string>
    • MD.core.idx : array<array<string>>
  2. MD.env.*
    • MD.env.os : string
    • MD.env.cpu : string
    • MD.env.gpu : string
    • MD.env.libs : array<string>
    • MD.env.locale : string
  3. MD.trace.*
    • MD.trace.parents : array<string>
    • MD.trace.TraceID : string
    • MD.trace.code_rev : string
    • MD.trace.params : string
  4. MD.sec.*
    • MD.sec.checksum_sha256 : string
    • MD.sec.signature : string
    • MD.sec.keyref : string
  5. MD.quality.*
    • MD.quality.q_score : float
    • MD.quality.drift : float
    • MD.quality.completeness : float
  6. MD.arrival.*
    • MD.arrival.pid : string
    • MD.arrival.CRS : string
    • MD.arrival.orientation : {"forward"|"reverse"}
    • MD.arrival.L_gamma : float
    • MD.arrival.formulation : {"factored"|"general"}
    • MD.arrival.delta_form : float
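
The field dictionary above can be turned into a simple type check. The sketch below assumes the manifest is a flat dict keyed by the full MD.* names; the helper names and the validate_fields function are hypothetical. It checks only the types of keys that are present and does not decide which keys are mandatory.

```python
def is_str(v):
    return isinstance(v, str)


def is_float(v):
    # Accept ints as floats, but reject bool (a subclass of int).
    return isinstance(v, (int, float)) and not isinstance(v, bool)


def is_str_list(v):
    return isinstance(v, list) and all(is_str(x) for x in v)


def is_str_list_list(v):
    return isinstance(v, list) and all(is_str_list(x) for x in v)


def one_of(*allowed):
    return lambda v: v in allowed


# One checker per field, following the types listed in the dictionary.
FIELD_CHECKS = {
    "MD.core.dataset_id": is_str,
    "MD.core.schema_ref": is_str,
    "MD.core.schema_version": is_str,
    "MD.core.pk": is_str_list,
    "MD.core.idx": is_str_list_list,
    "MD.env.os": is_str,
    "MD.env.cpu": is_str,
    "MD.env.gpu": is_str,
    "MD.env.libs": is_str_list,
    "MD.env.locale": is_str,
    "MD.trace.parents": is_str_list,
    "MD.trace.TraceID": is_str,
    "MD.trace.code_rev": is_str,
    "MD.trace.params": is_str,
    "MD.sec.checksum_sha256": is_str,
    "MD.sec.signature": is_str,
    "MD.sec.keyref": is_str,
    "MD.quality.q_score": is_float,
    "MD.quality.drift": is_float,
    "MD.quality.completeness": is_float,
    "MD.arrival.pid": is_str,
    "MD.arrival.CRS": is_str,
    "MD.arrival.orientation": one_of("forward", "reverse"),
    "MD.arrival.L_gamma": is_float,
    "MD.arrival.formulation": one_of("factored", "general"),
    "MD.arrival.delta_form": is_float,
}


def validate_fields(manifest):
    """Return the sorted keys whose values do not match the dictionary type."""
    return sorted(
        key for key, value in manifest.items()
        if key in FIELD_CHECKS and not FIELD_CHECKS[key](value)
    )


print(validate_fields({"MD.core.pk": "id", "MD.arrival.orientation": "forward"}))
```

Keys outside the dictionary are ignored rather than rejected, since a manifest may carry additional entries such as thresholds.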

VII. Metadata for Arrival-Time Two-Form Consistency

  1. Formulation declaration
    • formulation="factored" means T_arr = ( 1 / c_ref ) * ( ∫_gamma n_eff d ell );
    • formulation="general" means T_arr = ( ∫_gamma ( n_eff / c_ref ) d ell ).
  2. Discrepancy recording
    • delta_form = | ( 1 / c_ref ) * ( ∫_gamma n_eff d ell ) - ( ∫_gamma ( n_eff / c_ref ) d ell ) |;
    • The manifest must include delta_form and threshold tol_Tarr, and the contract must assert delta_form ≤ tol_Tarr.
  3. Path consistency
    Persist pid; ensure non-decreasing ell; declare CRS; record L_gamma = ( ∫_gamma 1 d ell ).
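
As a numerical illustration of the two formulations and the discrepancy record, the sketch below samples n_eff along the path and integrates with the trapezoidal rule. The discretization, the function names, and the output keys other than delta_form, tol_Tarr, and L_gamma are assumptions introduced here; c_ref is treated as a constant along the path, as the factored form implies.

```python
def trapezoid(y, x):
    """Trapezoidal approximation of the integral of y over x."""
    return sum(0.5 * (y[i] + y[i + 1]) * (x[i + 1] - x[i])
               for i in range(len(x) - 1))


def arrival_metadata(ell, n_eff, c_ref, tol_Tarr):
    """Evaluate both T_arr forms on a sampled path and record delta_form."""
    if len(ell) != len(n_eff) or len(ell) < 2:
        raise ValueError("ell and n_eff need the same length, at least 2")
    if any(b < a for a, b in zip(ell, ell[1:])):
        raise ValueError("ell must be non-decreasing")
    # factored: (1 / c_ref) * integral of n_eff d ell
    t_factored = (1.0 / c_ref) * trapezoid(n_eff, ell)
    # general: integral of (n_eff / c_ref) d ell
    t_general = trapezoid([n / c_ref for n in n_eff], ell)
    delta_form = abs(t_factored - t_general)
    return {
        "T_arr_factored": t_factored,
        "T_arr_general": t_general,
        "delta_form": delta_form,
        "tol_Tarr": tol_Tarr,
        "L_gamma": trapezoid([1.0] * len(ell), ell),  # integral of 1 d ell
        "contract_ok": delta_form <= tol_Tarr,
    }


# Hypothetical sampled path with arbitrary units and an illustrative tolerance.
meta = arrival_metadata(
    ell=[0.0, 1.0, 2.0, 4.0],
    n_eff=[1.0, 1.0, 1.0, 1.0],
    c_ref=2.0,
    tol_Tarr=1e-12,
)
print(meta)
```

With a constant c_ref the two forms agree analytically, so in this setting delta_form measures only floating-point and discretization differences between the two evaluation orders.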

VIII. Auditable Manifest Template (Text)


IX. Contract Mapping and Validation Interfaces

  1. assert_contract typical assertions
    • unique(dataset_id);
    • non_decreasing(ts) and non_decreasing(ell);
    • check_dim( y - f(x; theta) );
    • range(q_score, 0, 1);
    • delta_form ≤ tol_Tarr;
    • exists(MD.sec.checksum_sha256) and verify(signature, keyref).
  2. Interface crosswalk
    • attach_provenance(ds, trace) → writes MD.trace.* and TraceID;
    • compute_checksum(ds,"sha256") → produces checksum;
    • sign_data(ds,keyref) → produces signature;
    • export_manifest(ds) → emits the key set in this chapter’s template.
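
A few of the typical assertions can be sketched as plain checks. The function check_contract below is hypothetical and is not the assert_contract interface itself, whose signature this chapter does not give. It assumes a flat manifest dict, assumes the threshold is stored under the key "tol_Tarr", and leaves out check_dim and signature verification.

```python
def non_decreasing(values):
    """True when each element is >= the one before it."""
    return all(a <= b for a, b in zip(values, values[1:]))


def check_contract(manifest, ts, ell, seen_dataset_ids=()):
    """Return the names of the assertions that fail (empty list = pass)."""
    failures = []

    # unique(dataset_id): the id must not collide with one already registered.
    if manifest.get("MD.core.dataset_id") in seen_dataset_ids:
        failures.append("unique(dataset_id)")

    if not non_decreasing(ts):
        failures.append("non_decreasing(ts)")
    if not non_decreasing(ell):
        failures.append("non_decreasing(ell)")

    q_score = manifest.get("MD.quality.q_score")
    if q_score is None or not 0 <= q_score <= 1:
        failures.append("range(q_score, 0, 1)")

    delta_form = manifest.get("MD.arrival.delta_form")
    tol = manifest.get("tol_Tarr")  # assumed key name for the threshold
    if delta_form is None or tol is None or delta_form > tol:
        failures.append("delta_form <= tol_Tarr")

    if not manifest.get("MD.sec.checksum_sha256"):
        failures.append("exists(MD.sec.checksum_sha256)")

    return failures


manifest = {
    "MD.core.dataset_id": "ds-example",
    "MD.quality.q_score": 0.8,
    "MD.arrival.delta_form": 1e-12,
    "tol_Tarr": 1e-9,
    "MD.sec.checksum_sha256": "0" * 64,  # placeholder digest
}
print(check_contract(manifest, ts=[1, 2, 2, 3], ell=[0.0, 0.5, 1.0]))
```

Returning the list of failed assertion names, rather than raising on the first failure, lets a single validation pass report every violated clause.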

X. Incorporating Drift and Quality Metadata

  1. Quality dimensions
    completeness = N_observed / N_expected, validity = N_valid / N_observed, consistency ∈ [0,1], timeliness = now - created_ts.
  2. Drift recording
    • drift = monitor_drift(ds_ref, ds_new, fields, method="KL")["score"];
    • Record under MD.quality.* in the manifest, including ref_window and threshold.
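
The ratio-based quality dimensions and a KL-style drift score can be sketched as follows. This does not reproduce monitor_drift, whose internals are not specified here. The function kl_score is a hypothetical stand-in that treats one field as categorical, applies additive smoothing, and computes KL(new || ref); the direction of the divergence and the smoothing constant are assumptions.

```python
import math
from collections import Counter


def completeness(n_observed, n_expected):
    return n_observed / n_expected


def validity(n_valid, n_observed):
    return n_valid / n_observed


def kl_score(ref_values, new_values, eps=1e-9):
    """KL(new || ref) over categorical values, with additive smoothing."""
    categories = list(set(ref_values) | set(new_values))
    ref_counts, new_counts = Counter(ref_values), Counter(new_values)

    def probs(counts, total):
        # Smoothing keeps every probability strictly positive.
        denom = total + eps * len(categories)
        return [(counts[c] + eps) / denom for c in categories]

    p = probs(new_counts, len(new_values))
    q = probs(ref_counts, len(ref_values))
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


# Hypothetical reference and new windows for one categorical field.
ds_ref = ["a", "a", "b"]
ds_new = ["a", "b", "b"]
quality = {
    "MD.quality.completeness": completeness(90, 100),
    "MD.quality.drift": kl_score(ds_ref, ds_new),
}
print(quality)
```

A score of zero means the two windows have the same category frequencies; larger values indicate that the new window has moved further from the reference.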

XI. Arrival-Time Use Case: End-to-End Traceability

Steps

XII. Governance, Postulates, and Compliance Essentials


XIII. Implementation Checklist for Cross-Volume Binding


XIV. Publication and Freeze

Freeze workflow

Copyright & License (CC BY 4.0)

Copyright: Unless otherwise noted, the copyright of “Energy Filament Theory” (text, charts, illustrations, symbols, and formulas) belongs to the author “Guanglin Tu”.
License: This work is licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0). You may copy, redistribute, excerpt, adapt, and share for commercial or non‑commercial purposes with proper attribution.
Suggested attribution: Author: “Guanglin Tu”; Work: “Energy Filament Theory”; Source: energyfilament.org; License: CC BY 4.0.

First published: 2025-11-11 | Current version: v5.1
License link: https://creativecommons.org/licenses/by/4.0/