EFT.WP.Data.Benchmarks v1.0

Chapter 16 Implementation Binding & Evaluation API


I. Chapter Purpose & Scope

Provide evaluation APIs and normative implementation bindings: interface prototypes, request/response envelopes, error codes, auth & idempotency, rate limits, and version negotiation; cover suite loading, task execution, scoring & normalization, significance & uncertainty computation, and leaderboard publish/revoke; align with data contracts, metrology posture, cross-volume anchors, and the export manifest.

II. Service Surface (Normative)

services:
  benchmarks.v1:
    - POST /api/v1/benchmarks/load_suite
    - POST /api/v1/benchmarks/list_tasks
    - POST /api/v1/benchmarks/get_task
    - POST /api/v1/benchmarks/evaluate
    - POST /api/v1/benchmarks/score
    - POST /api/v1/benchmarks/significance
    - POST /api/v1/benchmarks/uncertainty
    - POST /api/v1/benchmarks/robustness
    - POST /api/v1/benchmarks/fairness_ethics
    - POST /api/v1/benchmarks/runtime/metrics
    - POST /api/v1/benchmarks/runtime/lineage
    - POST /api/v1/benchmarks/runtime/replay
    - POST /api/v1/benchmarks/submit
    - POST /api/v1/benchmarks/publish
    - POST /api/v1/benchmarks/revoke
    - POST /api/v1/benchmarks/hash_artifact
    - POST /api/v1/benchmarks/sign_artifact


III. Common Request/Response & Auth

request_envelope:
  headers:
    Authorization: "Bearer <oidc-token> | HMAC <key>:<sig>"
    x-eift-idempotency: "<uuid>"
    content-type: "application/json"
  body:
    suite?: { ... }
    task_id?: "<suite.task>"
    spec?: { ... }
    payload?: {artifacts: [{path, bytes_b64?, sha256?}]}
    options?: {dry_run?: true, strict?: true}
    filters?: {run_id?: "<id>", since?: "<ISO8601>", until?: "<ISO8601>"}
response_envelope:
  status: "ok" | "warn" | "error"
  errors: [{code, message, path?, see?}]
  warnings: [{code, message, path?, see?}]
  metrics: { ... }
  data?: { ... }
  version: "benchmarks.v1"
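
The envelopes above can be exercised client-side with a small helper. The following is a minimal sketch, not a reference client: the function names `build_request` and `check_response` are hypothetical, only the Bearer form of the Authorization header is shown, and rejecting unknown body fields is an assumption rather than something this chapter mandates.

```python
from __future__ import annotations

import json
import uuid

# Body fields listed in request_envelope above.
ALLOWED_BODY_FIELDS = {"suite", "task_id", "spec", "payload", "options", "filters"}


def build_request(token: str, body: dict, idempotency_key: str | None = None) -> dict:
    """Assemble headers and a JSON body shaped like request_envelope."""
    unknown = set(body) - ALLOWED_BODY_FIELDS
    if unknown:
        # Assumption: a strict client refuses fields the envelope does not list.
        raise ValueError(f"unknown envelope fields: {sorted(unknown)}")
    headers = {
        "Authorization": f"Bearer {token}",
        # A fresh UUID is generated when the caller does not supply a key.
        "x-eift-idempotency": idempotency_key or str(uuid.uuid4()),
        "content-type": "application/json",
    }
    return {"headers": headers, "body": json.dumps(body)}


def check_response(envelope: dict) -> dict:
    """Return the data field, raising when status is 'error' or unrecognized."""
    status = envelope.get("status")
    if status not in ("ok", "warn", "error"):
        raise ValueError(f"unexpected status: {status!r}")
    if status == "error":
        codes = [issue.get("code") for issue in envelope.get("errors", [])]
        raise RuntimeError(f"request failed with codes {codes}")
    return envelope.get("data", {})
```

Reusing the same idempotency key on a retry lets the server recognize the repeat; passing `idempotency_key` explicitly supports that.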

security:
  auth: "OIDC bearer | HMAC"
  tls: "TLS1.2+"
  scope: ["load","evaluate","metrics","lineage","submit","publish","admin"]
rate_limits:
  per_key_per_minute: 120
  burst: 60
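
One way to read the rate limits above is as a token bucket per key: capacity equal to `burst`, refilled at `per_key_per_minute / 60` tokens per second. The chapter does not state how the two numbers interact, so this interpretation, the `TokenBucket` class name, and the injectable `clock` parameter are all assumptions made for illustration.

```python
import time


class TokenBucket:
    """Per-key limiter sketch: capacity = burst, refill = per_minute / 60 per second."""

    def __init__(self, per_minute: int = 120, burst: int = 60, clock=time.monotonic):
        self.rate = per_minute / 60.0   # tokens added per second
        self.capacity = float(burst)    # maximum tokens held at once
        self.tokens = float(burst)      # start with a full bucket
        self.clock = clock
        self.last = clock()

    def allow(self) -> bool:
        """Consume one token if available; return False when the key is throttled."""
        now = self.clock()
        # Refill according to elapsed time, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

Under this reading a client may send 60 requests back to back and then sustain two requests per second.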


IV. Normative OpenAPI Excerpt

openapi: 3.0.3
info: {title: "EFT Benchmarks API", version: "v1"}
paths:
  /api/v1/benchmarks/load_suite:
    post:
      summary: Validate and load a benchmark suite
      requestBody: {required: true, content: {"application/json": {schema: {$ref: "#/components/schemas/SuiteEnvelope"}}}}
      responses:
        "200": {description: "Result", content: {"application/json": {schema: {$ref: "#/components/schemas/Result"}}}}
  /api/v1/benchmarks/evaluate:
    post:
      summary: Execute evaluation (offline/online/stream/interactive)
      requestBody: {required: true, content: {"application/json": {schema: {$ref: "#/components/schemas/EvalRequest"}}}}
      responses:
        "200": {description: "Run accepted", content: {"application/json": {schema: {$ref: "#/components/schemas/EvalResult"}}}}
components:
  schemas:
    SuiteEnvelope: {type: object, properties: {suite: {}, options: {type: object}}}
    EvalRequest:
      type: object
      properties:
        task_id: {type: string}
        spec: {type: object}
        options: {type: object, properties: {mode: {type: string, enum: ["sync","async"]}}}
    EvalResult:
      type: object
      properties:
        run_id: {type: string}
        state: {type: string, enum: ["queued","running","succeeded","failed"]}
        scores: {type: object}
        ci: {type: object}
        artifacts: {type: array, items: {type: object}}
    Result:
      type: object
      properties:
        status: {type: string, enum: [ok, warn, error]}
        errors: {type: array, items: {$ref: "#/components/schemas/Issue"}}
        warnings: {type: array, items: {$ref: "#/components/schemas/Issue"}}
        metrics: {type: object}
        data: {type: object}
    Issue:
      type: object
      properties:
        code: {type: string}
        message: {type: string}
        path: {type: string}
        see: {type: array, items: {type: string}}
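
The `Result` and `Issue` schemas above are small enough to check by hand. The sketch below is a hand-rolled shape check, not a full OpenAPI validator; `validate_result` and `validate_issue` are hypothetical names. Because the excerpt declares no `required` fields, every property is treated as optional here.

```python
from __future__ import annotations

STATUSES = ("ok", "warn", "error")


def validate_issue(issue: object) -> list[str]:
    """Return a list of problems for one Issue object (empty list means valid)."""
    if not isinstance(issue, dict):
        return ["issue must be an object"]
    problems = []
    for key in ("code", "message", "path"):
        if key in issue and not isinstance(issue[key], str):
            problems.append(f"{key} must be a string")
    see = issue.get("see")
    if see is not None and not (
        isinstance(see, list) and all(isinstance(s, str) for s in see)
    ):
        problems.append("see must be an array of strings")
    return problems


def validate_result(result: object) -> list[str]:
    """Check a response body against the Result schema shape."""
    if not isinstance(result, dict):
        return ["result must be an object"]
    problems = []
    if "status" in result and result["status"] not in STATUSES:
        problems.append("status must be one of ok, warn, error")
    for field in ("errors", "warnings"):
        items = result.get(field, [])
        if not isinstance(items, list):
            problems.append(f"{field} must be an array")
            continue
        for i, issue in enumerate(items):
            problems += [f"{field}[{i}]: {p}" for p in validate_issue(issue)]
    for field in ("metrics", "data"):
        if field in result and not isinstance(result[field], dict):
            problems.append(f"{field} must be an object")
    return problems
```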


V. Endpoint Semantics (Essentials)


VI. Error Codes (Normative)

errors:
  - {code: "ESCHEMA001", message: "suite schema violation", path: "$.suite"}
  - {code: "EREF001", message: "invalid reference format", path: "$.export_manifest.references[*]"}
  - {code: "EDIM001", message: "units must be SI and check_dim", path: "$.metrology"}
  - {code: "ESPLIT001", message: "splits must be frozen and frozen indices enabled", path: "$.tasks[*].splits"}
  - {code: "ELEAK000", message: "cross-split leakage detected", path: "$.tasks[*].leakage_guard"}
  - {code: "EPROTO001", message: "protocol mode invalid", path: "$.tasks[*].protocol.mode"}
  - {code: "EMETRIC001", message: "metric missing family/unit/higher_is_better", path: "$.tasks[*].metrics[*]"}
  - {code: "ESIG001", message: "significance params incomplete", path: "$.tasks[*].significance"}
  - {code: "EPUB001", message: "publish gate not met", path: "$.scoring.stability"}
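
An implementation might keep the codes above in a single catalog so that every emitted issue carries the normative message and default path. The sketch below assumes that approach; `ERROR_CATALOG`, `make_issue`, and `to_envelope` are hypothetical names, and the codes, messages, and paths are copied from the list above.

```python
from __future__ import annotations

# code -> (message, default JSONPath), taken from the normative list.
ERROR_CATALOG = {
    "ESCHEMA001": ("suite schema violation", "$.suite"),
    "EREF001": ("invalid reference format", "$.export_manifest.references[*]"),
    "EDIM001": ("units must be SI and check_dim", "$.metrology"),
    "ESPLIT001": ("splits must be frozen and frozen indices enabled", "$.tasks[*].splits"),
    "ELEAK000": ("cross-split leakage detected", "$.tasks[*].leakage_guard"),
    "EPROTO001": ("protocol mode invalid", "$.tasks[*].protocol.mode"),
    "EMETRIC001": ("metric missing family/unit/higher_is_better", "$.tasks[*].metrics[*]"),
    "ESIG001": ("significance params incomplete", "$.tasks[*].significance"),
    "EPUB001": ("publish gate not met", "$.scoring.stability"),
}


def make_issue(code: str, path: str | None = None) -> dict:
    """Build an Issue object; an unknown code raises KeyError."""
    message, default_path = ERROR_CATALOG[code]
    return {"code": code, "message": message, "path": path or default_path}


def to_envelope(errors: list[dict]) -> dict:
    """Wrap issues in a response envelope; status is 'error' when any are present."""
    return {
        "status": "error" if errors else "ok",
        "errors": errors,
        "warnings": [],
        "metrics": {},
        "version": "benchmarks.v1",
    }
```

For example, `make_issue("EPROTO001", "$.tasks[2].protocol.mode")` narrows the wildcard path to the offending task.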


VII. Idempotency, Versioning & Compatibility

idempotency:
  header: "x-eift-idempotency"
  window_hours: 24
versioning:
  api: "benchmarks.v1" # breaking change → bump MAJOR
  minor: "backward-compatible additions"
compatibility:
  request_backward: "minor+patch"
  response_fields: "additive only; no removals"
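
The idempotency rule above can be sketched server-side as a keyed cache that replays the stored response for any repeat of the same `x-eift-idempotency` value inside the 24-hour window. This is an illustrative in-memory sketch only: `IdempotencyStore`, `run_once`, and the injectable `clock` are hypothetical, and a real service would need shared, persistent storage and eviction of expired keys.

```python
import time


class IdempotencyStore:
    """Replay the first response for a key seen again within the window."""

    def __init__(self, window_hours: int = 24, clock=time.time):
        self.window_seconds = window_hours * 3600
        self.clock = clock
        self._seen = {}  # key -> (timestamp, response)

    def run_once(self, key: str, handler):
        """Call handler() unless key was handled within the window."""
        now = self.clock()
        entry = self._seen.get(key)
        if entry is not None and now - entry[0] < self.window_seconds:
            return entry[1]  # repeat request: return the cached response
        response = handler()
        self._seen[key] = (now, response)
        return response
```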


VIII. Security, Audit & Compliance


IX. Machine-Readable Implementation Snippets (Ixx-? Prototypes)

def load_suite(suite: dict) -> dict: ...

def list_tasks(suite_id: str) -> dict: ...

def get_task(suite_id: str, task_id: str) -> dict: ...

def evaluate(task_id: str, spec: dict, mode: str = "async") -> dict: ...

def score(results: dict, aggregation: dict, normalization: dict) -> dict: ...

def significance(a: dict, b: dict, method: str = "bootstrap", B: int = 10000) -> dict: ...

def uncertainty(model: str, components: list[dict], policy: dict) -> dict: ...

def robustness(spec: dict) -> dict: ...

def fairness_ethics(spec: dict) -> dict: ...

def runtime_metrics(run_id: str, since: str|None=None, until: str|None=None) -> dict: ...

def lineage(spec: dict|None=None, run_id: str|None=None) -> dict: ...

def replay(run_id: str, policy: str="strict") -> dict: ...

def hash_artifact(path: str|bytes) -> dict: ...

def sign_artifact(path: str|bytes, key_id: str) -> dict: ...

def submit(payload: dict) -> dict: ...

def publish(entry: dict) -> dict: ...

def revoke(tag: str, reason: str) -> dict: ...
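
As an illustration of how one of these prototypes might be filled in, here is a minimal sketch of `hash_artifact` using SHA-256, the digest named in the export manifest entries. Two points are assumptions: a `str` argument is read as a filesystem path while a `bytes` argument is read as raw content, and the returned keys `sha256` and `size` are hypothetical, since the chapter does not define the response shape.

```python
from __future__ import annotations

import hashlib
from pathlib import Path


def hash_artifact(path: str | bytes) -> dict:
    """Return the SHA-256 hex digest and byte size of an artifact."""
    digest = hashlib.sha256()
    size = 0
    if isinstance(path, bytes):
        # Assumption: bytes input is the artifact content itself.
        digest.update(path)
        size = len(path)
    else:
        # Assumption: str input is a file path; read in chunks to bound memory.
        with Path(path).open("rb") as handle:
            for chunk in iter(lambda: handle.read(1 << 20), b""):
                digest.update(chunk)
                size += len(chunk)
    return {"sha256": digest.hexdigest(), "size": size}
```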


X. Example Invocations (Ready-to-use)

curl -s -X POST https://api.eift.org/api/v1/benchmarks/load_suite \
  -H "Authorization: Bearer <token>" \
  -H "x-eift-idempotency: 7b7a0b1e-0a21-4f3f-9d0b-3b1e9b1f3c22" \
  -H "Content-Type: application/json" \
  -d @benchmark.json

curl -s -X POST https://api.eift.org/api/v1/benchmarks/evaluate \
  -H "Authorization: Bearer <token>" -H "Content-Type: application/json" \
  -d '{"task_id":"cls.binary","spec":{...},"options":{"mode":"async"}}'

curl -s -X POST https://api.eift.org/api/v1/benchmarks/score \
  -H "Authorization: Bearer <token>" -H "Content-Type: application/json" \
  -d @scores.json

curl -s -X POST https://api.eift.org/api/v1/benchmarks/significance \
  -H "Authorization: Bearer <token>" -H "Content-Type: application/json" \
  -d @pair.json


XI. Coupling with Export Manifest (Normative)

export_manifest:
  artifacts:
    - {path: "api/openapi.yaml", sha256: "..."}
    - {path: "api/clients/python.tar.gz", sha256: "..."}
    - {path: "runs/RUN-123/scores.json", sha256: "..."}
    - {path: "runs/RUN-123/ci.json", sha256: "..."}
    - {path: "runs/RUN-123/leaderboard.csv", sha256: "..."}
  references:
    - "EFT.WP.Core.DataSpec v1.0:EXPORT"
    - "EFT.WP.Core.Metrology v1.0:check_dim"
    - "EFT.WP.Data.Benchmarks v1.0:Ch.6"
    - "EFT.WP.Data.Benchmarks v1.0:Ch.8"
    - "EFT.WP.Data.Benchmarks v1.0:Ch.9"
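
The reference strings above share a visible shape: a document identifier, a space, `v<major>.<minor>`, a colon, and an anchor. A checker for the `EREF001` condition could be sketched as below. The regular expression is inferred from these five examples only and is an assumption, not the normative grammar; `parse_reference` is a hypothetical name.

```python
import re

# Inferred pattern: "<EFT.WP.Volume.Name> v<major>.<minor>:<anchor>"
REFERENCE_RE = re.compile(
    r"^(?P<doc>EFT\.WP(?:\.[A-Za-z]+)+) v(?P<version>\d+\.\d+):(?P<anchor>\S+)$"
)


def parse_reference(ref: str) -> dict:
    """Split a reference into doc, version, and anchor; raise on a bad format."""
    match = REFERENCE_RE.match(ref)
    if match is None:
        raise ValueError(f"EREF001: invalid reference format: {ref!r}")
    return match.groupdict()
```

For instance, `parse_reference("EFT.WP.Core.Metrology v1.0:check_dim")` yields the document `EFT.WP.Core.Metrology`, version `1.0`, and anchor `check_dim`.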


XII. Chapter Compliance Checklist


Copyright & License (CC BY 4.0)

Copyright: Unless otherwise noted, the copyright of “Energy Filament Theory” (text, charts, illustrations, symbols, and formulas) belongs to the author “Guanglin Tu”.
License: This work is licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0). You may copy, redistribute, excerpt, adapt, and share for commercial or non‑commercial purposes with proper attribution.
Suggested attribution: Author: “Guanglin Tu”; Work: “Energy Filament Theory”; Source: energyfilament.org; License: CC BY 4.0.

First published: 2025-11-11 | Current version: v5.1
License link: https://creativecommons.org/licenses/by/4.0/