EFT.WP.Data.DatasetCards v1.0

Chapter 3 Field Inventory


I. Chapter Purpose & Scope

This chapter provides the layered field taxonomy (Required / Conditionally Required / Optional), naming and type constraints, minimal regular expressions, and examples. All keys use snake_case; array types carry a [] suffix (e.g., string[]) and denote plural entities.

II. Field Layers & Naming Conventions

  1. Layers:
    • Required: Must exist at release and pass type/regex/dependency checks.
    • Conditionally Required: Must exist when the trigger condition applies (stated in the “Condition” column).
    • Optional: Validate if present; absence is not an error.
  2. Naming: Keys use snake_case. Reserved names must not be redefined (e.g., dataset_id, version, license, access, splits).
  3. Citations & Anchors: Dependency-carrying fields use see[] with "Volume vX.Y:Anchor"; export artifacts include references[] and version.
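
The naming rule in item 2 can be checked mechanically. The following is a minimal sketch, not a normative validator: it assumes snake_case means lowercase words joined by single underscores (the chapter does not give an exact pattern), and the names `SNAKE_CASE`, `RESERVED`, and `check_custom_keys` are hypothetical. Only the five reserved names listed above are included; the full list may be longer.

```python
import re

# Assumed snake_case pattern: lowercase words joined by single underscores.
SNAKE_CASE = re.compile(r"[a-z][a-z0-9]*(?:_[a-z0-9]+)*")

# Reserved names given as examples in this chapter (possibly incomplete).
RESERVED = frozenset({"dataset_id", "version", "license", "access", "splits"})


def check_custom_keys(keys):
    """Return a list of problems for keys that an extension wants to define."""
    problems = []
    for key in keys:
        if not SNAKE_CASE.fullmatch(key):
            problems.append(f"{key}: not snake_case")
        elif key in RESERVED:
            problems.append(f"{key}: reserved name, must not be redefined")
    return problems


problems = check_custom_keys(["station_count", "SensorType", "license"])
print(problems)
```

Here `station_count` passes, while `SensorType` and `license` are each reported with a reason.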

III. Required Fields

| Key | Type | Constraint/Regex | Description | Dependencies |
|---|---|---|---|---|
| dataset_id | string | `^[a-z0-9_\-.]+$` | Unique dataset identifier | Organization & release per Core.DataSpec v1.0. |
| title | string | length ≥ 3 | Human-readable name | |
| version | string | `^v\d+\.\d+(\.\d+)?$` | Semantic version | Export must carry version. |
| summary | string | | 100–300 word abstract | |
| modality | string[] | enum | e.g., radio/optical/image/time_series/text/tabular | |
| sources | string[] | URL/identifier | Upstream sources or card refs | Release & file org in Core.DataSpec v1.0. |
| license | string | enum | License policy | Public posture per Core.DataSpec v1.0. |
| access | string | `open`, `restricted`, or `closed` | | |
| provenance | object | schema | Collection/source record | Methods-series references. |
| splits | object | required: train/validation/test | Split definitions & ratios | Export must include hashes. |
| checksums | object | sha256 | Package- and shard-level integrity | Export policy. |
| metrology | object | schema | Units & dimensional baseline | Dimensional check check_dim. |
| quality | object | schema | Quality gates & coverage metrics | Align to quality/baseline chapter. |
| export_manifest | object | schema | Export manifest with version, references[], artifacts | Machine-readable citations. |
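
The presence and type checks implied by the table above could be sketched as follows. This is an illustration under assumptions, not the reference validator: the card is assumed to be already parsed into a Python dict, `REQUIRED` and `check_required` are hypothetical names, and only the constraints that the table states explicitly (types, title length, the three split names) are enforced.

```python
# Required keys and their declared types, copied from the table above.
REQUIRED = {
    "dataset_id": "string", "title": "string", "version": "string",
    "summary": "string", "modality": "string[]", "sources": "string[]",
    "license": "string", "access": "string", "provenance": "object",
    "splits": "object", "checksums": "object", "metrology": "object",
    "quality": "object", "export_manifest": "object",
}


def _matches(value, type_name):
    """Map the table's type names onto Python types."""
    if type_name == "string":
        return isinstance(value, str)
    if type_name == "string[]":
        return isinstance(value, list) and all(isinstance(v, str) for v in value)
    return isinstance(value, dict)  # "object"


def check_required(card):
    """Return a list of error strings; an empty list means the checks passed."""
    errors = []
    for key, type_name in REQUIRED.items():
        if key not in card:
            errors.append(f"{key}: missing")
        elif not _matches(card[key], type_name):
            errors.append(f"{key}: expected {type_name}")
    title = card.get("title")
    if isinstance(title, str) and len(title) < 3:
        errors.append("title: length must be >= 3")
    splits = card.get("splits")
    if isinstance(splits, dict):
        for name in ("train", "validation", "test"):
            if name not in splits:
                errors.append(f"splits.{name}: missing")
    return errors


# A deliberately incomplete card to show the kinds of errors reported.
partial = {
    "dataset_id": "foo",
    "title": "ab",
    "modality": ["image", 3],
    "splits": {"train": 0.8, "validation": 0.1},
}
errors = check_required(partial)
for line in errors:
    print(line)
```

Returning a list of messages rather than raising on the first failure lets a release gate report every missing field in one pass.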


IV. Conditionally Required Fields

| Key | Type | Condition | Constraint/Regex | Description | Dependencies |
|---|---|---|---|---|---|
| sensor_profile | object | Physical sensing/instrumentation | schema | Sensor/station/channel config | Metrology & instrumentation series. |
| labels | object | Supervised/annotated data | schema | Label ontology, class_map, multilingual mapping | Terms harmonized with Core.Terms. |
| privacy | object | PII/sensitive data | enum/policy | Anonymization & compliance | DataSpec compliance posture. |
| uncertainty | object | Measured/inferred quantities | schema | Error budget & propagation | Errors/Metrology workflows. |
| path_dependence | object | Path-dependent quantities (e.g., T_arr) | schema | delta_form, path="gamma(ell)", measure="d ell" | Arrival-time dual forms & dimensional consistency. |
| ethics | object | Ethical/risk posture | schema | Risk disclosure & usage limits | Release strategy. |
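
The conditional layer differs from the required layer only in that each key is tied to a trigger. A minimal sketch of that logic follows, assuming the caller declares which conditions apply; the chapter does not say how triggers are detected, so the flag names (`has_physical_sensing`, `contains_pii`, and so on) and the function `check_conditional` are hypothetical.

```python
# Hypothetical trigger flags mapped to the key each one makes mandatory.
CONDITIONAL = {
    "has_physical_sensing": "sensor_profile",
    "is_annotated": "labels",
    "contains_pii": "privacy",
    "has_measured_quantities": "uncertainty",
    "has_path_dependent_quantities": "path_dependence",
    "has_ethical_risk": "ethics",
}


def check_conditional(card, triggers):
    """Check conditionally required keys.

    card: parsed dataset card (dict).
    triggers: set of flag names that apply to this dataset.
    """
    errors = []
    for flag, key in CONDITIONAL.items():
        if key in card:
            # Validate whenever present, whether or not the trigger applies.
            if not isinstance(card[key], dict):
                errors.append(f"{key}: expected object")
        elif flag in triggers:
            errors.append(f"{key}: required because {flag} applies")
    return errors


errors = check_conditional({"labels": {}}, {"is_annotated", "contains_pii"})
print(errors)
```

In this example the annotated dataset supplies `labels`, so the only complaint is the missing `privacy` object.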


V. Optional Fields

| Key | Type | Constraint | Description |
|---|---|---|---|
| lineage | string[] | upstream dataset_id@version list | Provenance lineage |
| related_artifacts | string[] | files/scripts/baselines | Related assets |
| notes | string | | Non-normative notes |
| mirrors | string[] | URL | Distribution mirrors |
| shards | object | schema | Sharding strategy & sizes |


VI. Metrology & Path-Related Fragments (metrology and path_dependence)

    metrology:
      units: "SI"
      c_ref: 299792458 # m/s
      check_dim: true
    path_dependence:
      applies_to: ["T_arr"]
      delta_form: "const-factor" # or "general"
      path: "gamma(ell)"
      measure: "d ell"
      see:
        - "EFT.WP.Core.Equations v1.1:S20-1"
        - "EFT.WP.Core.Metrology v1.0:check_dim"

(The fragment registers the path and measure explicitly and carries a machine-readable see[] list, in line with the export strategy.)
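
A check of the path_dependence fragment might look like the sketch below. It rests on assumptions the chapter does not confirm: that "const-factor" and "general" are the only delta_form values, that path and measure must be non-empty strings, and that see[] should not be empty. The name `check_path_dependence` is hypothetical.

```python
# Assumed to be the complete set; the fragment only shows these two values.
ALLOWED_DELTA_FORMS = {"const-factor", "general"}


def check_path_dependence(fragment):
    """Return a list of problems found in a parsed path_dependence object."""
    errors = []
    if fragment.get("delta_form") not in ALLOWED_DELTA_FORMS:
        errors.append("delta_form: expected 'const-factor' or 'general'")
    for key in ("path", "measure"):
        value = fragment.get(key)
        if not isinstance(value, str) or not value.strip():
            errors.append(f"{key}: must be a non-empty string")
    if not fragment.get("see"):
        errors.append("see: at least one reference expected")
    return errors


# The fragment from this section, as it would look after YAML parsing.
fragment = {
    "applies_to": ["T_arr"],
    "delta_form": "const-factor",
    "path": "gamma(ell)",
    "measure": "d ell",
    "see": [
        "EFT.WP.Core.Equations v1.1:S20-1",
        "EFT.WP.Core.Metrology v1.0:check_dim",
    ],
}
print(check_path_dependence(fragment))  # prints []
```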


VII. Export Manifest & References (export_manifest fragment)

    export_manifest:
      version: "v1.0"
      artifacts:
        - path: "datasets/foo/train-000.tgz"
          sha256: "…"
      references:
        - "EFT.WP.Core.DataSpec v1.0:EXPORT"
        - "EFT.WP.Core.Equations v1.1:S20-1"
        - "EFT.WP.Core.Metrology v1.0:check_dim"

(Exports must include version and references[]; each reference carries volume, version, and anchor.)
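
The manifest rule can be illustrated with a small check plus a digest helper. This is a sketch under assumptions: digests are taken to be lowercase hexadecimal SHA-256 strings (the fragment elides the actual value, so the encoding is not confirmed), and `sha256_of` and `check_export_manifest` are hypothetical names. The digest itself comes from Python's standard `hashlib`.

```python
import hashlib
import re

# Assumed digest encoding: 64 lowercase hex characters.
SHA256_HEX = re.compile(r"[0-9a-f]{64}")


def sha256_of(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def check_export_manifest(manifest):
    """Return a list of problems found in a parsed export_manifest object."""
    errors = []
    for key in ("version", "references"):
        if key not in manifest:
            errors.append(f"{key}: missing")
    for i, artifact in enumerate(manifest.get("artifacts", [])):
        digest = str(artifact.get("sha256", ""))
        if not SHA256_HEX.fullmatch(digest):
            errors.append(f"artifacts[{i}].sha256: not a 64-character hex digest")
    return errors


# Illustrative manifest; the payload bytes stand in for a real shard.
manifest = {
    "version": "v1.0",
    "artifacts": [
        {"path": "datasets/foo/train-000.tgz", "sha256": sha256_of(b"example payload")},
    ],
    "references": ["EFT.WP.Core.DataSpec v1.0:EXPORT"],
}
print(check_export_manifest(manifest))  # prints []
```

For real shards the digest would be computed over the file contents, typically by reading the file in chunks and feeding each chunk to `hashlib.sha256().update`.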


VIII. Key Patterns & Minimal Regex
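
Section III states two regular expressions (for dataset_id and version) and one closed value set (for access). A minimal sketch of how they might be applied is given here, written as Python raw strings; the names `DATASET_ID`, `VERSION`, `ACCESS_VALUES`, and `check_patterns` are hypothetical, and the license and modality enums are left out because this chapter does not list their values.

```python
import re

# Patterns from Section III. fullmatch() is used below, so the ^ and $
# anchors are kept only for fidelity with the table.
DATASET_ID = re.compile(r"^[a-z0-9_\-.]+$")
VERSION = re.compile(r"^v\d+\.\d+(\.\d+)?$")
ACCESS_VALUES = {"open", "restricted", "closed"}


def check_patterns(card):
    """Return a list of pattern violations for the three constrained keys."""
    errors = []
    if not DATASET_ID.fullmatch(str(card.get("dataset_id", ""))):
        errors.append("dataset_id: does not match pattern")
    if not VERSION.fullmatch(str(card.get("version", ""))):
        errors.append("version: does not match pattern")
    if card.get("access") not in ACCESS_VALUES:
        errors.append("access: must be open, restricted, or closed")
    return errors


good = {"dataset_id": "foo-bar_1.2", "version": "v1.0.3", "access": "open"}
bad = {"dataset_id": "Foo", "version": "1.0", "access": "public"}
print(check_patterns(good))  # prints []
print(check_patterns(bad))
```

Note that the version pattern requires the leading "v" and at least a major and minor component, so "v1" and "1.0" are both rejected.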


IX. see[] & Cross-Volume Mapping (example)

    see:
      - "EFT.WP.Core.Terms v1.0:P10-*"
      - "EFT.WP.Core.DataSpec v1.0:EXPORT"
      - "EFT.WP.Core.Equations v1.1:S20-1"

(Use fixed format and anchor classes P/S/M/I for clause-level citations.)
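
Because the citation format is fixed, each entry can be split into its three parts with a single expression. The sketch below assumes the shape "Volume vX.Y:Anchor" with a dotted volume name, one space, a two-part version, and an anchor without whitespace, which fits every example in this chapter; `Reference`, `SEE_ENTRY`, and `parse_see` are hypothetical names, and anchor classes are not validated.

```python
import re
from typing import NamedTuple


class Reference(NamedTuple):
    volume: str
    version: str
    anchor: str


# Assumed shape: "<dotted volume name> v<major>.<minor>:<anchor>"
SEE_ENTRY = re.compile(
    r"(?P<volume>[A-Za-z0-9.]+) (?P<version>v\d+\.\d+):(?P<anchor>\S+)"
)


def parse_see(entry):
    """Split one see[] or references[] entry into volume, version, anchor."""
    match = SEE_ENTRY.fullmatch(entry)
    if match is None:
        raise ValueError(f"malformed reference: {entry!r}")
    return Reference(match.group("volume"), match.group("version"), match.group("anchor"))


for entry in [
    "EFT.WP.Core.Terms v1.0:P10-*",
    "EFT.WP.Core.DataSpec v1.0:EXPORT",
    "EFT.WP.Core.Equations v1.1:S20-1",
]:
    print(parse_see(entry))
```

Parsing into named parts makes it straightforward to group citations by volume or to flag references to a volume version that is no longer current.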


X. Chapter Compliance Checklist


Copyright & License (CC BY 4.0)

Copyright: Unless otherwise noted, the copyright of “Energy Filament Theory” (text, charts, illustrations, symbols, and formulas) belongs to the author “Guanglin Tu”.
License: This work is licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0). You may copy, redistribute, excerpt, adapt, and share for commercial or non‑commercial purposes with proper attribution.
Suggested attribution: Author: “Guanglin Tu”; Work: “Energy Filament Theory”; Source: energyfilament.org; License: CC BY 4.0.

First published: 2025-11-11 | Current version: v5.1
License link: https://creativecommons.org/licenses/by/4.0/