Chapter 2 — Terms, Symbols & Units (Dataset Minimal Set)
I. Terms
- sample: the minimal record unit in the dataset.
- feature: a field usable for modeling/statistics.
- label: target/supervision field.
- metadata: auxiliary information describing the data and its generation process.
- split: a named subset of the dataset, one of train/val/test/holdout/slice_k.
- window/partition: the aggregation unit over time, space, or entity.
- contract/schema: declaration of fields, types, units, dimensions, and constraints.
- coverage: the interval convention for reported uncertainty, given as a coverage factor k, a significance level alpha, or a quantile range [p_lo, p_hi].
- lineage: provenance chain & dependency DAG.
II. Symbols (minimal)
- Scale: N (samples), M (features), |split| (split size).
- Time & sampling: f_s (sampling rate), T_win (window), ts_start/ts_end (ISO-8601).
- Path quantities: gamma(ell) (path), d ell (measure), n_eff(ell) (effective refractive index), c_ref (reference propagation limit), λ_ref (reference wavelength), T_arr (arrival time), Phi (phase).
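- Quality & uncertainty: Q_res, u(x) (standard uncertainty), u_c (combined standard uncertainty), U = k·u_c (expanded uncertainty with coverage factor k), Σ (covariance), p_dim.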
- Sync health: δt_abs (absolute time offset), Δτ_ch (inter-channel skew), σ_y(τ) (Allan deviation).
- Resource/perf (if applicable): Latency_P95, Throughput, ρ (utilization).
III. Units & Dimensions
- SI/international symbols: m, s, rad, 1, m/s, 1/m, Pa, N, J, Hz.
- Field tables must provide unit and/or declare dim in the contract.
- Dimensional closure: before publication, pass I70-dim_check, require p_dim = 1.0, and attach check_dim_report.json.
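A dimensional closure check can be sketched with plain exponent arithmetic. The block below is a minimal illustration, not the I70-dim_check tool itself: the helper names are hypothetical, dimensions are modeled as exponent maps over L and T only, and p_dim is assumed here to be the fraction of checked fields whose derived dimension matches the declared one. It checks the arrival-time and phase forms given in Section IV.

```python
def dim_mul(a, b):
    """Multiply two dimensions by adding exponents; zero exponents are dropped."""
    out = dict(a)
    for base, exp in b.items():
        out[base] = out.get(base, 0) + exp
    return {base: exp for base, exp in out.items() if exp != 0}


def dim_div(a, b):
    """Divide dimension a by dimension b (negate b's exponents, then multiply)."""
    return dim_mul(a, {base: -exp for base, exp in b.items()})


# Declared dimensions, following the field table (dim column).
DIMLESS = {}                   # "1"
LENGTH = {"L": 1}              # m
TIME = {"T": 1}                # s
SPEED = {"L": 1, "T": -1}      # m/s

# T_arr = ( integral of ( n_eff / c_ref ) d ell )
t_arr_dim = dim_mul(dim_div(DIMLESS, SPEED), LENGTH)

# Phi = ( 2*pi / lambda_ref ) * ( integral of n_eff d ell )
phi_dim = dim_mul(dim_div(DIMLESS, LENGTH), dim_mul(DIMLESS, LENGTH))

checks = {
    "obs.T_arr": t_arr_dim == TIME,
    "obs.Phi": phi_dim == DIMLESS,
}

# Assumption: p_dim is the pass fraction over all checked fields.
p_dim = sum(checks.values()) / len(checks)
print(checks, p_dim)
```

A real check would also cover the remaining base dimensions (mass, for Pa, N, and J) and read the declared dim values from the contract rather than hard-coding them.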
IV. Normative Path Forms
- Arrival time (two equivalent forms):
T_arr = ( 1 / c_ref ) * ( ∫ n_eff d ell )
T_arr = ( ∫ ( n_eff / c_ref ) d ell )
- Phase accumulation:
Phi = ( 2π / λ_ref ) * ( ∫ n_eff d ell )
In text, explicitly show gamma(ell) and d ell; record delta_form ∈ {general, factored} on the data side; arrays satisfy len(gamma_ell)=len(d_ell)=len(n_eff)≥2.
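The two arrival-time forms and the phase form can be evaluated numerically as a sanity check. The sketch below rests on several assumptions that this chapter does not fix: the integral is approximated by a sum in which d_ell[i] weights n_eff[i], "factored" is taken to mean the form with ( 1 / c_ref ) pulled out of the integral, and all function names and sample values are illustrative.

```python
import math


def check_path(gamma_ell, d_ell, n_eff):
    """Enforce len(gamma_ell) = len(d_ell) = len(n_eff) >= 2."""
    n = len(gamma_ell)
    if not (n == len(d_ell) == len(n_eff)) or n < 2:
        raise ValueError("path arrays must share one length >= 2")


def optical_path(d_ell, n_eff):
    """Sum approximation of ( integral of n_eff d ell )."""
    return sum(n * dl for n, dl in zip(n_eff, d_ell))


def arrival_time(d_ell, n_eff, c_ref, delta_form="general"):
    """Return T_arr in seconds using the form named by delta_form."""
    if delta_form == "factored":
        # T_arr = ( 1 / c_ref ) * ( integral of n_eff d ell )
        return (1.0 / c_ref) * optical_path(d_ell, n_eff)
    if delta_form == "general":
        # T_arr = ( integral of ( n_eff / c_ref ) d ell )
        return sum((n / c_ref) * dl for n, dl in zip(n_eff, d_ell))
    raise ValueError("delta_form must be 'general' or 'factored'")


def phase(d_ell, n_eff, lambda_ref):
    """Phi = ( 2*pi / lambda_ref ) * ( integral of n_eff d ell ), in rad."""
    return (2.0 * math.pi / lambda_ref) * optical_path(d_ell, n_eff)


# Illustrative values only.
gamma_ell = [0.0, 1.0, 2.0]   # m
d_ell = [1.0, 1.0, 1.0]       # m
n_eff = [1.0, 1.5, 1.5]       # dimensionless
c_ref = 3.0e8                 # m/s, inside the (2.9e8, 3.1e8) domain
lambda_ref = 1.0e-6           # m

check_path(gamma_ell, d_ell, n_eff)
t_general = arrival_time(d_ell, n_eff, c_ref, "general")
t_factored = arrival_time(d_ell, n_eff, c_ref, "factored")
phi = phase(d_ell, n_eff, lambda_ref)
print(t_general, t_factored, phi)
```

With a constant c_ref the two forms agree up to floating-point rounding, which is what makes them interchangeable on the data side.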
V. Mandatory Conventions
- Wrap inline symbols with backticks (e.g., `T_arr`, `Phi`, `n_eff`, `c_ref`).
- Parentheses required for any expression with division/integrals/composites; use ln/exp/conv in functional form.
- Conflicting names: T_fil (tension) ≠ T_trans (transmittance); n (number density) ≠ n_eff (effective refractive index).
- Forbidden bare symbols: c, T, n.
- Coverage consistency: choose one of k/alpha/quantile and keep consistent between data and publication.
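The forbidden-symbol rule lends itself to a simple lint pass. The following is a minimal sketch, assuming it is applied to formula lines only (prose would need separate handling); the function name is hypothetical. It relies on the fact that the underscore counts as a word character in regular expressions, so qualified names such as c_ref, T_arr, and n_eff are not flagged.

```python
import re

# Bare c, T, or n as a standalone token. Because "_" is a word character,
# \b does not match between "c" and "_ref", so c_ref is left alone.
BARE_SYMBOL = re.compile(r"\b(?:c|T|n)\b")


def find_bare_symbols(expression):
    """Return the forbidden bare symbols found in a formula string, in order."""
    return BARE_SYMBOL.findall(expression)


ok = find_bare_symbols("T_arr = ( 1 / c_ref ) * ( ∫ n_eff d ell )")
bad = find_bare_symbols("T = n / c")
print(ok, bad)
```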
VI. Field Table (minimal template)
| field | type | unit | dim | domain/shape | nullable | description | see |
|---|---|---|---|---|---|---|---|
| record_id | string | 1 | 1 | ULID/UUIDv4 | no | primary key | — |
| acq.ts_start/ts_end | string | 1 | 1 | ISO-8601 | no | acquisition time | — |
| path.gamma_ell | array | m | L | N≥2 | no | path parameter | Core.DataSpec:TARR |
| path.d_ell | array | m | L | N≥2 | no | path measure | ibid. |
| medium.n_eff_profile | array | 1 | 1 | N≥2 | no | effective index | S20-1 |
| ref.c_ref | number | m/s | L·T^-1 | (2.9e8,3.1e8) | no | reference limit | Terms P10-* |
| ref.lambda_ref | number | m | L | >0 | opt. | reference wavelength | S21-2 |
| obs.T_arr | number | s | T | — | opt. | arrival time | S20-1 |
| obs.Phi | number | rad | 1 | — | opt. | phase | S21-2 |
| quality.flags | array | 1 | 1 | — | yes | quality flags | — |
| quality.score_Q | number | 1 | 1 | [0,1] | no | robust quality | — |
| see/references/version | array/string | 1 | 1 | — | no | citations & version | — |
VII. Machine-Readable Contracts (excerpts)
A. schema.json
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "title": "Dataset v1.0.0 (minimal)",
  "type": "object",
  "required": ["record_id", "acq", "path", "medium", "ref", "see", "version"],
  "properties": {
    "record_id": { "type": "string" },
    "acq": {
      "type": "object",
      "required": ["ts_start", "ts_end"],
      "properties": {
        "ts_start": { "type": "string", "format": "date-time" },
        "ts_end": { "type": "string", "format": "date-time" }
      }
    },
    "path": {
      "type": "object",
      "required": ["gamma_ell", "d_ell"],
      "properties": {
        "gamma_ell": { "type": "array", "items": { "type": "number" }, "minItems": 2 },
        "d_ell": { "type": "array", "items": { "type": "number" }, "minItems": 2 }
      }
    },
    "medium": {
      "type": "object",
      "required": ["n_eff_profile"],
      "properties": {
        "n_eff_profile": { "type": "array", "items": { "type": "number" }, "minItems": 2 }
      }
    },
    "ref": {
      "type": "object",
      "properties": {
        "c_ref": { "type": "number" },
        "lambda_ref": { "type": "number" }
      }
    },
    "see": { "type": "array", "items": { "type": "string" }, "minItems": 1 },
    "version": { "type": "string" }
  }
}
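To illustrate how the nested required lists in this excerpt are meant to be read, the sketch below walks a record and reports missing required keys as dotted paths. It is a hand-rolled, standard-library illustration with hypothetical names and sample values, not a JSON Schema validator: it ignores type, format, and minItems, which a full validator would also enforce. The schema is trimmed to the parts the walk uses.

```python
def missing_required(record, schema, prefix=""):
    """Return dotted paths of required keys that are absent from record."""
    missing = [prefix + key for key in schema.get("required", []) if key not in record]
    for key, sub in schema.get("properties", {}).items():
        # Recurse only into object-typed properties that are present.
        if sub.get("type") == "object" and isinstance(record.get(key), dict):
            missing.extend(missing_required(record[key], sub, prefix + key + "."))
    return missing


# Trimmed copy of the excerpt: only "required" and object nesting are kept.
SCHEMA = {
    "required": ["record_id", "acq", "path", "medium", "ref", "see", "version"],
    "properties": {
        "acq": {"type": "object", "required": ["ts_start", "ts_end"]},
        "path": {"type": "object", "required": ["gamma_ell", "d_ell"]},
        "medium": {"type": "object", "required": ["n_eff_profile"]},
        "ref": {"type": "object"},
    },
}

# Illustrative record with two deliberate gaps: acq.ts_end and version.
record = {
    "record_id": "123e4567-e89b-42d3-a456-426614174000",
    "acq": {"ts_start": "2025-01-01T00:00:00Z"},
    "path": {"gamma_ell": [0.0, 1.0], "d_ell": [1.0, 1.0]},
    "medium": {"n_eff_profile": [1.0, 1.5]},
    "ref": {"c_ref": 3.0e8},
    "see": ["EFT.WP.Core.Equations v1.1 Ch.2 S20-1"],
}

gaps = missing_required(record, SCHEMA)
print(sorted(gaps))
```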
B. contract.yaml (path & coverage)
version: "1.0.0"
path:
  required: true
  gamma: "gamma(ell)"
  measure: "d ell"
  delta_form: "general" # or "factored"
coverage:
  mode: "k" # k | alpha | quantile
  k: 2
units:
  T_arr: "s"
  Phi: "rad"
  c_ref: "m/s"
  lambda_ref: "m"
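As an illustration of how the coverage block might be consumed, the sketch below turns mode "k" into an expanded-uncertainty interval using U = k·u_c from Section II, and compares the mode declared by the data against the one used in a publication. The function names, the dictionary layout (mirroring the YAML keys), and the numeric values are assumptions made for this example; only mode "k" is handled.

```python
ALLOWED_MODES = {"k", "alpha", "quantile"}


def expanded_interval(x, u_c, coverage):
    """Return (x - U, x + U) with U = k * u_c for coverage mode 'k'."""
    mode = coverage["mode"]
    if mode not in ALLOWED_MODES:
        raise ValueError("unknown coverage mode: " + repr(mode))
    if mode != "k":
        raise NotImplementedError("only mode 'k' is sketched here")
    expanded_u = coverage["k"] * u_c
    return x - expanded_u, x + expanded_u


def modes_consistent(data_coverage, publication_coverage):
    """True when data and publication declare the same coverage mode."""
    return data_coverage["mode"] == publication_coverage["mode"]


contract_coverage = {"mode": "k", "k": 2}   # mirrors the coverage block above
t_arr = 1.0e-8                              # s, illustrative value
u_c = 2.0e-10                               # s, illustrative combined uncertainty

lower, upper = expanded_interval(t_arr, u_c, contract_coverage)
print(lower, upper)
```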
VIII. Normative Examples
- Path consistency check:
len(gamma_ell)=len(d_ell)=len(n_eff)≥2; Δell ≤ ( c_ref / f_s ) / max(n_eff).
- Phase interval reporting: use Fisher–z in z-space, back-transform to r_phi, report [LB, UB].
- Dimensional validation:
T_arr = ( ∫ ( n_eff / c_ref ) d ell ) ⇒ ( [1] / [m·s^-1] ) · [m] = [s].
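The step bound and the phase interval report from the examples above can be sketched as follows. Several points are assumptions rather than statements of this chapter: Δell is taken to be each element of d_ell, r_phi is treated as a correlation-type coefficient in (-1, 1), and the standard error 1 / sqrt(n - 3) is the textbook Fisher z value for a Pearson correlation estimated from n samples. Function names and sample values are illustrative.

```python
import math
from statistics import NormalDist


def step_bound_ok(d_ell, n_eff, c_ref, f_s):
    """Check max(d_ell) <= ( c_ref / f_s ) / max(n_eff)."""
    bound = (c_ref / f_s) / max(n_eff)
    return max(d_ell) <= bound


def fisher_z_interval(r_phi, n, alpha=0.05):
    """Return (LB, UB) for r_phi: build the interval in z-space, then back-transform."""
    if not -1.0 < r_phi < 1.0:
        raise ValueError("r_phi must lie strictly between -1 and 1")
    if n <= 3:
        raise ValueError("n must exceed 3")
    z = math.atanh(r_phi)                       # Fisher z transform
    se = 1.0 / math.sqrt(n - 3)                 # assumed standard error
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


# Illustrative values: c_ref / f_s = 3 m, max(n_eff) = 1.5, so the bound is 2 m.
passes = step_bound_ok([1.0, 1.0, 1.0], [1.0, 1.5, 1.5], c_ref=3.0e8, f_s=1.0e8)
fails = step_bound_ok([1.0, 2.5, 1.0], [1.0, 1.5, 1.5], c_ref=3.0e8, f_s=1.0e8)

lb, ub = fisher_z_interval(0.8, n=50)
print(passes, fails, lb, ub)
```

Because the back-transform is nonlinear, the reported [LB, UB] is generally asymmetric around r_phi, which is the reason for building the interval in z-space first.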
IX. Anti-Patterns & Fixes
- Anti: T_arr = ∫ n_eff / c_ref d ell (missing parentheses) → Fix: T_arr = ( ∫ ( n_eff / c_ref ) d ell ).
- Anti: declaring only gamma(ell) without d ell/delta_form → Fix: add both and align with n_eff.
- Anti: unit % as text → Fix: use unit 1 and note “percent” in comments.
- Anti: references without version/anchor → Fix: See "EFT.WP.Core.Equations v1.1" Ch.2 S20-1.
X. Checklist
- Field table consistent with schema.json/contract.yaml; all numeric fields explicitly show units & dimensions.
- For path quantities, explicit gamma(ell)/d ell recorded with delta_form; len(path) ≥ 2, Δell compliant.
- Unified forms used for arrival/phase with parentheses.
- Coverage mode consistent between data & publication (k/alpha/quantile); p_dim = 1.0 and check_dim_report.json attached.
- see[]/references[]/version compliant with anchor coverage ≥ 90%; no external links/aliases.
Copyright & License (CC BY 4.0)
Copyright: Unless otherwise noted, the copyright of “Energy Filament Theory” (text, charts, illustrations, symbols, and formulas) belongs to the author “Guanglin Tu”.
License: This work is licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0). You may copy, redistribute, excerpt, adapt, and share for commercial or non‑commercial purposes with proper attribution.
Suggested attribution: Author: “Guanglin Tu”; Work: “Energy Filament Theory”; Source: energyfilament.org; License: CC BY 4.0.
First published: 2025-11-11 | Current version: v5.1
License link: https://creativecommons.org/licenses/by/4.0/