46-EFT.WP.Data.Benchmarks v1.0 | Chapter 15 Machine-readable Schema & Lint

Chapter 15 Machine-readable Schema & Lint

I. Chapter Purpose & Scope

Provide the normative JSON Schema and Lint ruleset for benchmark suites, covering structure/type/regex/dependencies/cross-volume citation anchors/dimensional checks/frozen splits & leakage guardrails/scoring–normalization–significance minima/compliance minima; used for portal/CI auto-validation and pre-release blocking. Keys use snake_case; cross-volume citations use “Volume vX.Y:Anchor”; math uses backticks with parentheses; no Chinese text.

II. Normative Artifacts (Release-Critical)

artifacts:
  - path: "schema/benchmark.schema.json"
  - path: "schema/lint_rules.yaml"
  - path: "schema/examples/minimal.yaml"
  - path: "schema/examples/full.yaml"

These artifacts must be listed in export_manifest.artifacts[] with their sha256 digests; citation anchors follow this volume’s conventions.
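The registration requirement above can be checked mechanically. The following is a minimal sketch, not the normative validator: the function name `missing_artifacts` is hypothetical, and it assumes digests are written as 64 lowercase hexadecimal characters, which the chapter does not state explicitly.

```python
import re

# Release-critical artifact paths, copied from the list above.
REQUIRED_ARTIFACTS = (
    "schema/benchmark.schema.json",
    "schema/lint_rules.yaml",
    "schema/examples/minimal.yaml",
    "schema/examples/full.yaml",
)

# Assumption: sha256 digests are 64 lowercase hex characters.
SHA256_HEX = re.compile(r"[0-9a-f]{64}")


def missing_artifacts(export_manifest: dict) -> list[str]:
    """Return required paths that are absent or lack a well-formed sha256."""
    registered = {
        entry.get("path"): entry.get("sha256", "")
        for entry in export_manifest.get("artifacts", [])
    }
    return [
        path
        for path in REQUIRED_ARTIFACTS
        if not SHA256_HEX.fullmatch(str(registered.get(path, "")))
    ]


manifest = {
    "artifacts": [
        {"path": "schema/benchmark.schema.json", "sha256": "a" * 64},
        {"path": "schema/lint_rules.yaml", "sha256": "..."},  # placeholder, rejected
    ]
}
print(missing_artifacts(manifest))
```

Here only the first entry counts as registered; the placeholder digest and the two unlisted example files are reported as missing.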

III. Normative JSON Schema (Core Excerpt)

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "https://eift.org/schema/benchmark.schema.json",
  "title": "EFT Data Benchmark Suite",
  "type": "object",
  "required": [ "suite", "tasks", "metrology", "export_manifest" ],
  "properties": {
    "suite": {
      "type": "object",
      "required": [ "id", "title", "version", "modalities" ],
      "properties": {
        "id": { "type": "string", "pattern": "^[a-z0-9_.\\-]+$" },
        "title": { "type": "string", "minLength": 3 },
        "version": { "type": "string", "pattern": "^v\\d+\\.\\d+(\\.\\d+)?$" },
        "modalities": { "type": "array", "items": { "type": "string" } },
        "risks": { "type": "array", "items": { "type": "string" } },
        "coverage_matrix": { "type": "object" }
      }
    },
    "tasks": {
      "type": "array",
      "items": {
        "type": "object",
        "required": [ "id", "io_mode", "dataset_ref", "splits", "protocol", "metrics", "leakage_guard" ],
        "properties": {
          "id": { "type": "string", "pattern": "^[a-z0-9_.\\-]+$" },
          "io_mode": { "type": "string", "enum": [ "offline", "online", "stream", "interactive" ] },
          "evaluatee": { "type": "string", "enum": [ "model", "system", "pipeline" ] },
          "dataset_ref": { "type": "string", "pattern": "^datasets/[a-z0-9_\\-]+@v\\d+\\.\\d+$" },
          "splits": {
            "type": "object",
            "required": [ "train", "val", "test" ],
            "properties": {
              "train": {
                "type": "object",
                "required": [ "frozen", "index" ],
                "properties": {
                  "frozen": { "type": "boolean", "const": true },
                  "index": { "type": "string" },
                  "sha256": { "type": "string" }
                }
              },
              "val": {
                "type": "object",
                "required": [ "frozen", "index" ],
                "properties": {
                  "frozen": { "type": "boolean", "const": true },
                  "index": { "type": "string" },
                  "sha256": { "type": "string" }
                }
              },
              "test": {
                "type": "object",
                "required": [ "frozen", "index" ],
                "properties": {
                  "frozen": { "type": "boolean", "const": true },
                  "index": { "type": "string" },
                  "sha256": { "type": "string" }
                }
              },
              "ratio": {
                "type": "object",
                "properties": { "train": { "type": "number" }, "val": { "type": "number" }, "test": { "type": "number" } }
              },
              "freeze_indices": { "type": "boolean", "const": true }
            }
          },
          "leakage_guard": { "type": "array", "items": { "type": "string" } },
          "protocol": { "type": "object" },
          "metrics": { "type": "array", "items": { "type": "object" } },
          "aggregation": { "type": "object" },
          "significance": { "type": "object" }
        }
      }
    },
    "metrology": {
      "type": "object",
      "required": [ "units", "check_dim" ],
      "properties": {
        "units": { "type": "string", "const": "SI" },
        "check_dim": { "type": "boolean", "const": true }
      }
    },
    "export_manifest": {
      "type": "object",
      "required": [ "version", "artifacts", "references" ],
      "properties": {
        "version": { "type": "string" },
        "artifacts": { "type": "array", "items": { "type": "object" } },
        "references": {
          "type": "array",
          "minItems": 1,
          "items": { "type": "string", "pattern": "^[^:]+ v\\d+\\.\\d+:[A-Za-z].+$" }
        }
      }
    }
  },
  "additionalProperties": false
}

The references[] regex enforces “Volume vX.Y:Anchor”; metrology.units="SI" and check_dim=true are mandatory.
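To illustrate how a few of the schema constraints behave, here is a hedged sketch using only the Python standard library. It copies the `suite.id`, `suite.version`, and `dataset_ref` patterns from the excerpt and checks the two metrology constants. The function name `check_core_fields` is hypothetical, and the sketch covers only this subset; a full JSON Schema validator would also enforce required keys, enums, and `additionalProperties`. JSON Schema patterns follow ECMA-262 syntax, which coincides with Python's `re` for these simple expressions except that Python's `$` also matches before a trailing newline.

```python
import re

# Patterns copied from the schema excerpt above (JSON string escaping removed).
ID_PATTERN = re.compile(r"^[a-z0-9_.\-]+$")
VERSION_PATTERN = re.compile(r"^v\d+\.\d+(\.\d+)?$")
DATASET_REF_PATTERN = re.compile(r"^datasets/[a-z0-9_\-]+@v\d+\.\d+$")


def check_core_fields(spec: dict) -> list[str]:
    """Return human-readable problems for a subset of the schema constraints."""
    problems = []
    suite = spec.get("suite", {})
    if not ID_PATTERN.search(str(suite.get("id", ""))):
        problems.append("suite.id does not match the id pattern")
    if not VERSION_PATTERN.search(str(suite.get("version", ""))):
        problems.append("suite.version is not of the form vX.Y or vX.Y.Z")
    for i, task in enumerate(spec.get("tasks", [])):
        if not DATASET_REF_PATTERN.search(str(task.get("dataset_ref", ""))):
            problems.append(f"tasks[{i}].dataset_ref is malformed")
    metrology = spec.get("metrology", {})
    if metrology.get("units") != "SI" or metrology.get("check_dim") is not True:
        problems.append("metrology must set units='SI' and check_dim=true")
    return problems


good = {
    "suite": {"id": "eift.bench.core", "version": "v1.0"},
    "tasks": [{"dataset_ref": "datasets/core_cls@v1.0"}],
    "metrology": {"units": "SI", "check_dim": True},
}
print(check_core_fields(good))
```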

IV. Lint Rules (Normative)

version: "v1.0"
rules:
  - id: STRUCT.REQUIRED
    when: "$"
    assert: "has_keys(suite,tasks,metrology,export_manifest)"
    level: error
  - id: SUITE.VERSION.SEMVER
    when: "$.suite.version"
    assert: "matches('^v\\d+\\.\\d+(\\.\\d+)?$')"
    level: error
  - id: SUITE.ID_FORMAT
    when: "$.suite.id"
    assert: "matches('^[a-z0-9_.\\-]+$')"
    level: error
  - id: TASK.REQUIRED_KEYS
    when: "$.tasks[*]"
    assert: "has_keys(id, io_mode, dataset_ref, splits, protocol, metrics, leakage_guard)"
    level: error
  - id: DATASET.REF_FORMAT
    when: "$.tasks[*].dataset_ref"
    assert: "matches('^datasets/[a-z0-9_\\-]+@v\\d+\\.\\d+$')"
    level: error
  - id: SPLITS.FROZEN_REQUIRED
    when: "$.tasks[*].splits"
    assert: "splits.train.frozen and splits.val.frozen and splits.test.frozen and splits.freeze_indices == true"
    level: error
  - id: SPLITS.RATIO_SUM
    when: "$.tasks[*].splits.ratio"
    assert: "abs(value.train + value.val + value.test - 1) <= 1e-6"
    level: error
  - id: LEAKAGE.GUARD_ALLOWED
    when: "$.tasks[*].leakage_guard"
    assert: "contains_any(['per-object','per-timewindow','per-scene'])"
    level: error
  - id: METRICS.FAMILY_UNIT
    when: "$.tasks[*].metrics[*]"
    assert: "has_keys(name, family, unit, higher_is_better)"
    level: error
  - id: METRICS.UNIT_SI_OR_DIMLESS
    when: "$.tasks[*].metrics[*].unit"
    assert: "all_units_in_SI(value) or value in ['—','%']"
    level: error
  - id: PROTOCOL.MODE_ALLOWED
    when: "$.tasks[*].protocol.mode"
    assert: "value in ['offline','online','stream','interactive']"
    level: error
  - id: SIG.PARAMS
    when: "$.tasks[*].significance"
    assert: "has_keys(method, alpha)"
    level: error
  - id: SCORE.AGG_LEVELS
    when: "$.tasks[*].aggregation.levels"
    assert: "contains_all(['task'])"
    level: error
  - id: SCORE.NORM_SCHEME
    when: "$.scoring.normalization.scheme"
    assert: "value in ['zscore','minmax','fixed-anchor']"
    level: warn
  - id: ROBUST.THRESHOLDS_MIN
    when: "$.robustness.thresholds"
    assert: "has_keys(drop_rel_max, acc_robust_min)"
    level: warn
  - id: FAIR.THRESHOLDS_MIN
    when: "$.fairness_ethics.thresholds"
    assert: "has_keys(fairness_warn, fairness_block)"
    level: warn
  - id: METROLOGY.SI_AND_CHECKDIM
    when: "$.metrology"
    assert: "units == 'SI' and check_dim == true"
    level: error
  - id: REFERENCES.FORMAT
    when: "$.export_manifest.references[*]"
    assert: "matches('^[^:]+ v\\d+\\.\\d+:[A-Za-z].+$')"
    level: error

Blocking: STRUCT.REQUIRED, SUITE.VERSION.SEMVER, SUITE.ID_FORMAT, TASK.REQUIRED_KEYS, DATASET.REF_FORMAT, SPLITS.FROZEN_REQUIRED, SPLITS.RATIO_SUM, LEAKAGE.GUARD_ALLOWED, METRICS.FAMILY_UNIT, METRICS.UNIT_SI_OR_DIMLESS, PROTOCOL.MODE_ALLOWED, METROLOGY.SI_AND_CHECKDIM, REFERENCES.FORMAT.
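As an illustration of how two of the rules above might be evaluated, the sketch below hand-codes SPLITS.RATIO_SUM and LEAKAGE.GUARD_ALLOWED for a single task. It is not the rule engine itself: the function name `lint_task`, the `message` wording, and the `fix` wording are assumptions, and `contains_any` is read here as "the list contains at least one of the allowed values".

```python
ALLOWED_GUARDS = {"per-object", "per-timewindow", "per-scene"}


def lint_task(task: dict, index: int) -> list[dict]:
    """Apply SPLITS.RATIO_SUM and LEAKAGE.GUARD_ALLOWED to one task."""
    findings = []

    # SPLITS.RATIO_SUM only fires when a ratio object is present.
    ratio = task.get("splits", {}).get("ratio")
    if ratio is not None:
        total = ratio.get("train", 0) + ratio.get("val", 0) + ratio.get("test", 0)
        if abs(total - 1) > 1e-6:
            findings.append({
                "rule": "SPLITS.RATIO_SUM",
                "path": f"$.tasks[{index}].splits.ratio",
                "level": "error",
                "message": f"ratios sum to {total}, expected 1 within 1e-6",
                "fix": "Adjust train/val/test ratios so they sum to 1",  # hypothetical text
            })

    # LEAKAGE.GUARD_ALLOWED: at least one recognized guard must be declared.
    guards = task.get("leakage_guard", [])
    if not ALLOWED_GUARDS.intersection(guards):
        findings.append({
            "rule": "LEAKAGE.GUARD_ALLOWED",
            "path": f"$.tasks[{index}].leakage_guard",
            "level": "error",
            "message": "no recognized leakage guard declared",
            "fix": "Declare one of per-object, per-timewindow, per-scene",  # hypothetical text
        })
    return findings


task = {
    "splits": {"ratio": {"train": 0.7, "val": 0.1, "test": 0.1}},
    "leakage_guard": [],
}
for finding in lint_task(task, 0):
    print(finding["rule"], finding["path"])
```

The tolerance comparison matters because ratios such as 0.8, 0.1, and 0.1 do not sum to exactly 1 in binary floating point.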

V. Failure Examples & Diagnostics (Excerpt)

fail_examples:
  - case: "bad reference"
    input: {export_manifest: {references: ["Core.DataSpec:EXPORT"]}}
    expect: {rule: "REFERENCES.FORMAT", level: "error",
             fix: "Use 'EFT.WP.Core.DataSpec v1.0:EXPORT'"}
  - case: "splits not frozen"
    input: {tasks: [{id: "cls", io_mode: "offline", dataset_ref: "datasets/core@v1.0",
                     splits: {train: {frozen: false, index: "..."}, val: {frozen: true, index: "..."}, test: {frozen: true, index: "..."}, freeze_indices: false},
                     protocol: {mode: "offline"}, metrics: [], leakage_guard: ["per-object"]}]}
    expect: {rule: "SPLITS.FROZEN_REQUIRED", level: "error",
             fix: "Set all splits to frozen and freeze_indices=true"}
  - case: "metric without unit"
    input: {tasks: [{id: "cls", io_mode: "offline", dataset_ref: "datasets/core@v1.0",
                     splits: {train: {frozen: true, index: "..."}, val: {frozen: true, index: "..."}, test: {frozen: true, index: "..."}, freeze_indices: true},
                     protocol: {mode: "offline"}, metrics: [{name: "F1_macro"}], leakage_guard: ["per-object"]}]}
    expect: {rule: "METRICS.FAMILY_UNIT", level: "error",
             fix: "Provide family/unit/higher_is_better for each metric"}

Lint outputs must include rule/path/message/fix to enable one-click remediation.
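One possible shape for such an output record is sketched below, applied to the "splits not frozen" case. The `LintFinding` class, the `check_splits_frozen` function, and the `message` wording are assumptions made for illustration; only the rule id, level, and fix text come from the example above.

```python
from dataclasses import asdict, dataclass


@dataclass
class LintFinding:
    rule: str     # rule id, e.g. "SPLITS.FROZEN_REQUIRED"
    path: str     # JSONPath-style location of the offending node
    level: str    # "error" or "warn"
    message: str  # what was observed
    fix: str      # suggested remediation


def check_splits_frozen(spec: dict) -> list[LintFinding]:
    """Report every task whose splits are not all frozen."""
    findings = []
    for i, task in enumerate(spec.get("tasks", [])):
        splits = task.get("splits", {})
        unfrozen = [
            name for name in ("train", "val", "test")
            if splits.get(name, {}).get("frozen") is not True
        ]
        if unfrozen or splits.get("freeze_indices") is not True:
            findings.append(LintFinding(
                rule="SPLITS.FROZEN_REQUIRED",
                path=f"$.tasks[{i}].splits",
                level="error",
                message=f"unfrozen splits: {unfrozen}; "
                        f"freeze_indices={splits.get('freeze_indices')}",
                fix="Set all splits to frozen and freeze_indices=true",
            ))
    return findings


spec = {"tasks": [{"id": "cls", "splits": {
    "train": {"frozen": False, "index": "..."},
    "val": {"frozen": True, "index": "..."},
    "test": {"frozen": True, "index": "..."},
    "freeze_indices": False,
}}]}
for finding in check_splits_frozen(spec):
    print(asdict(finding))
```

Carrying `path` and `fix` in every record is what lets a portal jump to the offending node and offer the remediation directly.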

VI. Minimal Working Example (Validates)

suite:
  id: "eift.bench.core"
  title: "EIFT Core Benchmarks"
  version: "v1.0"
  modalities: ["text"]
tasks:
  - id: "cls.binary"
    io_mode: "offline"
    dataset_ref: "datasets/core_cls@v1.0"
    splits:
      train: {frozen: true, index: "splits/train.index", sha256: "..."}
      val: {frozen: true, index: "splits/val.index", sha256: "..."}
      test: {frozen: true, index: "splits/test.index", sha256: "..."}
      freeze_indices: true
      ratio: {train: 0.8, val: 0.1, test: 0.1}
    leakage_guard: ["per-object"]
    protocol: {mode: "offline", seed: 1701, repeats: 5}
    metrics:
      - {name: "F1_macro", family: "classification", unit: "—", higher_is_better: true, agg: "macro"}
    aggregation: {levels: ["task"], weights: {scheme: "uniform"}}
    significance: {method: "bootstrap", alpha: 0.05}
metrology: {units: "SI", check_dim: true}
export_manifest:
  version: "v1.0"
  artifacts: [{path: "benchmark.yaml", sha256: "..."}]
  references:
    - "EFT.WP.Core.DataSpec v1.0:EXPORT"
    - "EFT.WP.Core.Metrology v1.0:check_dim"

VII. Coupling with Export Manifest (Normative)

export_manifest:
  artifacts:
    - {path: "schema/benchmark.schema.json", sha256: "..."}
    - {path: "schema/lint_rules.yaml", sha256: "..."}
    - {path: "schema/examples/minimal.yaml", sha256: "..."}
  references:
    - "EFT.WP.Core.DataSpec v1.0:EXPORT"
    - "EFT.WP.Core.Metrology v1.0:check_dim"
    - "EFT.WP.Data.ModelCards v1.0:Ch.11"

Schema & Lint are blocking and must be listed and verifiable; references carry “Volume vX.Y:Anchor”.
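"Verifiable" can be read as: recomputing the digest of each listed artifact reproduces the recorded sha256. A minimal sketch under that reading follows. The function names are hypothetical, and `read_bytes` is a caller-supplied loader (for files on disk, something like `lambda p: pathlib.Path(p).read_bytes()`), so the example can run against in-memory content.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def verify_artifacts(export_manifest: dict, read_bytes) -> list[str]:
    """Return paths whose content does not hash to the recorded sha256."""
    mismatched = []
    for entry in export_manifest.get("artifacts", []):
        if sha256_hex(read_bytes(entry["path"])) != entry.get("sha256"):
            mismatched.append(entry["path"])
    return mismatched


# In-memory stand-ins for the real files (contents are illustrative only).
files = {
    "schema/benchmark.schema.json": b'{"type": "object"}',
    "schema/lint_rules.yaml": b'version: "v1.0"\n',
}
manifest = {"artifacts": [
    {"path": "schema/benchmark.schema.json",
     "sha256": sha256_hex(files["schema/benchmark.schema.json"])},
    {"path": "schema/lint_rules.yaml", "sha256": "..."},  # stale placeholder
]}
print(verify_artifacts(manifest, files.__getitem__))
```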

VIII. Validation Interfaces (Ixx-?; Unified Return)

def validate_benchmark(spec: dict) -> dict: ...
def lint_benchmark(spec: dict, rules: dict) -> dict: ...
def check_units(spec: dict) -> dict: ...  # uses Core.Metrology v1.0:check_dim
def verify_references(spec: dict) -> dict: ...  # regex + anchor reachability

Each interface returns {"ok": bool, "errors":[...], "warnings":[...], "metrics":{...}} for portal/CI use.
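The unified structure makes it straightforward to combine the outputs of several interfaces into one verdict. The sketch below shows one way to do that. The helper names `unified_result` and `merge_results` are hypothetical, and it assumes that `ok` is true exactly when there are no errors, so warnings alone do not block.

```python
def unified_result(errors=None, warnings=None, metrics=None) -> dict:
    """Build the unified return structure; ok is derived from errors."""
    errors = list(errors or [])
    return {
        "ok": not errors,  # assumption: warnings alone do not block
        "errors": errors,
        "warnings": list(warnings or []),
        "metrics": dict(metrics or {}),
    }


def merge_results(*results: dict) -> dict:
    """Combine several unified results into a single one."""
    errors, warnings, metrics = [], [], {}
    for result in results:
        errors.extend(result.get("errors", []))
        warnings.extend(result.get("warnings", []))
        metrics.update(result.get("metrics", {}))
    return unified_result(errors, warnings, metrics)


schema_result = unified_result(metrics={"schema_checks": 12})
lint_result = unified_result(
    errors=[{"rule": "SPLITS.FROZEN_REQUIRED", "path": "$.tasks[0].splits"}],
    warnings=[{"rule": "SCORE.NORM_SCHEME", "path": "$.scoring.normalization.scheme"}],
)
combined = merge_results(schema_result, lint_result)
print(combined["ok"], len(combined["errors"]), len(combined["warnings"]))
```

A CI job could then gate the release on `combined["ok"]` alone while still surfacing the warnings.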

IX. Chapter Compliance Checklist

benchmark.schema.json and lint_rules.yaml produced and registered in export_manifest with sha256.
Schema enforces metrology.units="SI" and check_dim=true, plus the anchor regex in export_manifest.references[]; Lint blocks unfrozen splits, missing leakage guardrails, metrics without units, and invalid protocol/references.
Each task includes dataset_ref/splits/protocol/metrics/leakage_guard, with ratios summing to 1±1e-6.
Scoring/normalization/significance/fairness & robustness minima enabled; cross-volume citations resolvable.
Minimal example passes both Schema and Lint validation; validation interfaces integrated and returning the unified structure.

Copyright & License: Unless otherwise stated, the copyright of “Energy Filament Theory” (including text, charts, illustrations, symbols, and formulas) is held by the author (屠广林).
License (CC BY 4.0): With attribution to the author and source, you may copy, repost, excerpt, adapt, and redistribute.
Attribution (recommended): Author: 屠广林｜Work: “Energy Filament Theory”｜Source: energyfilament.org｜License: CC BY 4.0
Version info: First published: 2025-11-11 ｜ Current version: v6.0+5.05