Home / Docs-Technical WhitePaper / 52-Dataset Card Template v1.0
Chapter 1 — Scope of Use & Dependencies
I. Scope of Use
- Dataset types: Batch, Streaming, Near-RT, Hybrid.
- Object coverage: raw observation/experiment/simulation data and derivatives (features, labels, metadata, splits & evaluation pairs).
- Path-quantity constraint: for arrival/phase (path quantities), the text must explicitly show gamma(ell) and the measure d ell, and the data side must record delta_form ∈ {general, factored}; publication requires p_dim = 1.0.
- Out of scope: datasets lacking units/dimensions, lineage, or access proof; datasets that fail gates (G1–G8) without a [Restricted] note.
II. Dataset Scenarios Matrix
Scenario | Typical Source | Core Fields | Key KPIs | Gates | Notes |
|---|---|---|---|---|---|
Batch | Partitioned files/objs | record_id, ts, schema.* | Completeness, dedup rate, p_dim | G1/G4/G8 | Versioned schema & splits |
Streaming | Events/message queues | event_id, window, flags | Latency, loss rate, σ_y(τ) | G3/G5/G6 | Exactly-Once/At-Least-Once note |
Near-RT | Shared-mem/micro-batch | ts_align, slice | Jitter, Latency_P95 | G5/G6 | Sync chain & watermark policy |
Hybrid | Online + Offline | contract.snapshot | Alignment consistency, Q_res | G1–G8 | Online backflow & offline checks |
Unified path forms:
T_arr = ( 1 / c_ref ) * ( ∫ n_eff d ell ) or T_arr = ( ∫ ( n_eff / c_ref ) d ell );
Phase: Phi = ( 2π / λ_ref ) * ( ∫ n_eff d ell ).
III. Dependencies & Alignment
- Core (required): EFT.WP.Core.Terms v1.0, EFT.WP.Core.Equations v1.1, EFT.WP.Core.Metrology v1.0, EFT.WP.Core.DataSpec v1.0.
- Companion templates: Parameter Registration Card Template v1.0 (Ch.9), Error Budget Card Template v1.0 (Ch.5/6/8/9), Pipeline Card Template v1.0 (Ch.12/9/11), Experimental Protocol Card Template v1.0 (Ch.5).
- Contract alignment: field/unit/dimension aligned with TARR; uncertainty & covariance aligned with Error Budget (cov_group/Σ/coverage).
- Sync & freshness: satisfy Metrology.TimeBase/Sync v1.0 (clock_state="locked", |ts_start − calib.timestamp| ≤ τ_calib).
IV. Citations & Versioning
- Fixed syntax: See "<Volume> vX.Y" <Chapter> <Anchor>, prioritize P/S/M/I; anchor coverage ≥ 90%, no external links/aliases.
- Version policy: public releases cite v1.* only; if v0.* is referenced, mark “draft / non-committal”.
- Dependency closure: cross-volume see[]/references[]/version and manifest synchronized; baseline versions for splits/comparisons must be declared in manifests/.
V. Compliance Prerequisites
- Dimensional closure (G4): pass I70-dim_check and attach check_dim_report.json; require p_dim = 1.0.
- Schema completeness (G1): schema.json/contract.yaml match data; primary key/time/path blocks present.
- Path conventions (G3): gamma(ell)/d ell/delta_form present; len(gamma_ell)=len(d_ell)=len(n_eff)≥2; Δell ≤ ( c_ref / f_s ) / max(n_eff).
- Freshness (G5): clock_state="locked", τ_calib valid; expired samples tagged & isolated.
- Coverage (G6): coverage ∈ {k, alpha, quantile} aligned with publication.
- Covariance consistency (G7): cov_group/Σ aligned with Error Budget; Σ PD.
- Uniqueness (G8): unique record_id/checksum; lineage acyclic.
VI. Outputs & Interfaces
- Required: schema.json, contract.yaml, split.yaml, validate_report.json, check_dim_report.json, audit.jsonl, report_manifest.yaml, SIGNATURE.asc.
- Suggested interfaces:
- I81-dataset_ingest(package) -> validate_report.json
- I82-dataset_validate(gates[]) -> report
- I83-dataset_manifest_export(fmt) -> report_manifest.yaml
- I84-dataset_split(plan.yaml) -> split_manifest.json
- I89-dataset_publish(channel) -> release_id
VII. Checklist
- Scenarios & scope defined; see[]/references[] compliant, anchor coverage ≥ 90%.
- For path quantities: explicit gamma(ell)/d ell; delta_form recorded; len(path) ≥ 2, Δell compliant.
- I70-dim_check passed (p_dim = 1.0); clock_state="locked", τ_calib valid.
- schema.json/contract.yaml/split.yaml consistent; cov_group/coverage aligned with Error Budget.
- /validate passes G1–G8; manifest, checksums & signature complete; [Restricted] tagging applied where required.
Copyright & License (CC BY 4.0)
Copyright: Unless otherwise noted, the copyright of “Energy Filament Theory” (text, charts, illustrations, symbols, and formulas) belongs to the author “Guanglin Tu”.
License: This work is licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0). You may copy, redistribute, excerpt, adapt, and share for commercial or non‑commercial purposes with proper attribution.
Suggested attribution: Author: “Guanglin Tu”; Work: “Energy Filament Theory”; Source: energyfilament.org; License: CC BY 4.0.
First published: 2025-11-11|Current version:v5.1
License link:https://creativecommons.org/licenses/by/4.0/