Chapter 9: Inference Pipeline Card & Parameter Card
I. Scope & Objectives
- Define and standardize two core artifacts—InferPipelineCard (IPC) and ParamCard (PC)—as the unified, offline/online contracts for inference. The specification spans the object model, operator graph, data conventions, calibration and uncertainty, SLO references, and audit signatures.
- Provide the minimum required fields, constraints, and conflict-resolution rules. Produce a computable fingerprint and verifiable signature to guarantee cross-environment replayability and evidentiary compliance.
- Target outputs
  - Schema and field dictionary for IPC/PC.
  - Validation and composition algorithms.
  - Change impact analysis, rollback playbooks, and release strategy.
  - Metrology flow Mx-53 → Mx-58 and the audit artifact checklist.
II. Terms & Symbols
- InferPipelineCard (IPC): declares inference topology, data & feature interfaces, operators & precision policy, calibration & uncertainty, observability & SLO references, and the environment anchor.
- ParamCard (PC): declares runtime parameters and hyperparameters (e.g., batch_size, device, dtype_policy, quant_scheme, rng.seed, window), with parameter constraints and unit conventions attached.
- EnvLock: environment lock (compiler/driver/firmware/library versions including CUDA/cuDNN/MKL, etc.), which, together with the cards, determines replayability.
- anchor: logical anchor bundling model_id, artifacts, dataset_ref, preproc/postproc pointers.
- fingerprint: hash( canon( IPC || PC || EnvLock ) ) for version freezing and audit.
- signature: sign( fingerprint, privkey ) for verifiable release.
- Conflict-name enforcement: T_fil vs. T_trans may not be mixed; n vs. n_eff must be distinguished; formulas and symbols use English/Latin only.
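The fingerprint definition above can be illustrated with a minimal sketch. The source does not specify canon(•) or the hash function, so this example assumes canon is sorted-key, compact JSON and hash is SHA-256. Wrapping the three documents in one object with the hypothetical keys ipc, pc, and env_lock is also an assumption, made so that the concatenation IPC || PC || EnvLock is unambiguous. Signing is left out because it needs an asymmetric-key library that the chapter does not name, and the sketch does not strip documentary fields.

```python
import hashlib
import json


def canon(ipc: dict, pc: dict, env_lock: dict) -> bytes:
    """Serialize the three documents into one deterministic byte string.

    Assumption: canonical form = JSON with sorted keys and compact separators.
    """
    bundle = {"ipc": ipc, "pc": pc, "env_lock": env_lock}
    text = json.dumps(bundle, sort_keys=True, separators=(",", ":"), ensure_ascii=True)
    return text.encode("utf-8")


def fingerprint(ipc: dict, pc: dict, env_lock: dict) -> str:
    """Return the hex SHA-256 digest of the canonical bundle (hash choice is an assumption)."""
    return hashlib.sha256(canon(ipc, pc, env_lock)).hexdigest()


# Two cards that differ only in key order should share one fingerprint.
ipc_a = {"meta": {"name": "demo", "version": "1.0"}, "opts": {"warmup": 100}}
ipc_b = {"opts": {"warmup": 100}, "meta": {"version": "1.0", "name": "demo"}}
same = fingerprint(ipc_a, {}, {}) == fingerprint(ipc_b, {}, {})
```

Under these assumptions, any change to a value in IPC, PC, or EnvLock changes the digest, while key order and whitespace do not.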
III. Postulates & Minimal Equations
- P41-31 Card-replayability postulate
With EnvLock fixed and the input distribution unchanged, the same fingerprint for the (IPC + PC) pair yields output distribution y_hat ~ F( fingerprint, x ) that is equivalent to baseline.
- P41-32 Card-closure postulate
After canonicalization canon(•), ( IPC, PC, EnvLock ) deterministically maps to a unique runtime configuration: ( IPC, PC, EnvLock ) → Runtime is single-valued.
- S42-41 Fingerprint & signature
fingerprint = hash( canon( IPC || PC || EnvLock ) )
signature = sign( fingerprint, privkey ), with verification
verify( signature, fingerprint, pubkey ) = true.
- S42-42 Effective configuration synthesis
Merge precedence increases left-to-right:
rt.opts = merge( defaults, env, ipc.opts, pc.opts, override ) (rightmost wins).
Key-wise rule:
effective[k] = override[k] if k∈override else pc[k] if k∈pc else ipc[k] if k∈ipc else env[k] if k∈env else defaults[k].
- S42-43 Constraint consistency
Any parameter θ must pass three checks: check_dim(θ.unit), check_range(θ.min, θ.max), check_enum(θ.enum). On failure, raise E_SCHEMA_MISMATCH or E_PRECISION_LOSS.
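The merge precedence in S42-42 can be sketched as a shallow, key-wise merge in which later layers overwrite earlier ones. This is an illustration only: the source does not say how nested values are merged, so the sketch assumes flat dictionaries, and the sample option values are hypothetical.

```python
def merge(*layers: dict) -> dict:
    """Shallow key-wise merge; the rightmost layer that defines a key wins."""
    effective = {}
    for layer in layers:
        effective.update(layer)
    return effective


# Hypothetical layers, ordered from lowest to highest precedence.
defaults = {"batch_size": 1, "timeout_ms": 500, "warmup": 0}
env = {"num_threads": 8}
ipc_opts = {"batch_mode": "dynamic", "parallelism": 2, "warmup": 100}
pc_opts = {"batch_size": 16, "timeout_ms": 200, "num_threads": 4}
override = {"timeout_ms": 150}

rt_opts = merge(defaults, env, ipc_opts, pc_opts, override)
# timeout_ms comes from override, batch_size and num_threads from pc_opts,
# warmup from ipc_opts; no key here falls back to defaults.
```

Because dict.update overwrites existing keys, iterating the layers left to right reproduces the chained "override, else pc, else ipc, else env, else defaults" lookup.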
IV. Data & Manifest Conventions
- IPC — minimum required fields (names in English; units/dimensions explicit)
  - meta: { name, version, owner, created_at }
  - anchor: { model_id, artifacts, dataset_ref, preproc_ref, postproc_ref }
  - graph: { inputs, outputs, opset, constraints, determinism }
  - features: { mapping, window, alignment: ts = alpha + beta * tau_mono }
  - precision: { dtype_policy, quant_scheme, calibration_ref }
  - uncertainty: { method, outputs: [mean,var,quantile], coverage: 1-delta }
  - observability: { metrics: [TS.latency, TS.thrpt, TS.error], buckets, window }
  - slo_ref: { SLO.id, targets: { p99, avail, cost_u } }
  - env_req: { compiler, driver, runtime_libs, device_class }
  - security: { allowlist_ops, denied_ops }
  - opts: { batch_mode, parallelism, warmup }
- PC — minimum required fields
  - meta: { name, version, parent_fingerprint }
  - opts: { batch_size, device, num_threads, rng.seed, rng.family, timeout_ms }
  - io: { max_bytes_in, max_bytes_out, compress, serialize }
  - precision: { dtype_policy, quant_scheme, int8_calib, fp16_safety }
  - time: { window, stride, watermark, clock_align }
  - constraints: [{ key, unit, min, max, enum }]
  - overrides: { preproc: {…}, postproc: {…} }
- Field-convention inheritance
features.window aligns with Chapter 4; observability.metrics with Chapter 8; calibration_ref with Chapter 7; graph.inputs/outputs comply with Chapter 5 data-type norms.
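As an illustration of how the PC constraints entries ({ key, unit, min, max, enum }) might be applied to opts, the following sketch performs the range and enum checks. It is not the chapter's reference implementation: the function names are hypothetical, the unit check (check_dim) is omitted, and violations are returned as plain messages because the source does not state which failure maps to E_SCHEMA_MISMATCH and which to E_PRECISION_LOSS.

```python
def check_constraint(value, rule: dict) -> list:
    """Return violation messages for one value against one constraints entry."""
    problems = []
    key = rule["key"]
    if rule.get("min") is not None and value < rule["min"]:
        problems.append(f"{key}={value} is below min {rule['min']}")
    if rule.get("max") is not None and value > rule["max"]:
        problems.append(f"{key}={value} is above max {rule['max']}")
    if rule.get("enum") is not None and value not in rule["enum"]:
        problems.append(f"{key}={value!r} is not one of {rule['enum']}")
    return problems


def check_params(opts: dict, constraints: list) -> list:
    """Check every constrained key that is present in opts."""
    problems = []
    for rule in constraints:
        if rule["key"] in opts:
            problems.extend(check_constraint(opts[rule["key"]], rule))
    return problems


# Constraint values taken from the ParamCard excerpt in the appendix.
constraints = [
    {"key": "batch_size", "unit": "count", "min": 1, "max": 256},
    {"key": "timeout_ms", "unit": "ms", "min": 50, "max": 1000},
]
ok = check_params({"batch_size": 16, "timeout_ms": 200}, constraints)
bad = check_params({"batch_size": 512, "timeout_ms": 10}, constraints)
```

Here ok is empty, while bad holds one message for each of the two out-of-range values.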
V. Algorithms & Implementation Bindings
- Prototypes
  - I40-20 validate_pipeline_card(ipc:dict) -> ValidateReport
  - I40-21 validate_param_card(pc:dict, ipc:dict) -> ValidateReport
  - I40-22 compose_runtime(ipc:dict, pc:dict, env:dict, override:dict) -> Runtime
  - I40-23 fingerprint_and_sign(ipc:dict, pc:dict, env:dict, key:any) -> {fingerprint:str, signature:str}
  - I40-24 diff_cards(a:dict, b:dict) -> DiffReport
  - I40-25 impact_analysis(diff:DiffReport, slo:SLOSpec) -> ImpactReport
- Validation summary
  - Graph & ops: graph.opset entries must be in allowlist_ops and not in denied_ops. Nondeterministic ops must be declared in determinism.exempt with a fallback.
  - Dimensions: run check_dim(expr) and range checks over io/precision/time/opts.
  - Calibration: if quant_scheme ∈ {int8}, int8_calib must point to a valid calibration_ref, else E_CALIBRATION_FAIL.
  - Time: clock_align must specify {alpha,beta} or reference Chapter 6 alignment artifacts.
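Two of the validation rules above, the operator allowlist/denylist check and the INT8 calibration check, can be sketched as follows. The sketch assumes graph.opset is a list of operator names and that the caller supplies the set of operators known to be nondeterministic; both are assumptions, as are the function names. The error labels E_NONDETERMINISM and E_CALIBRATION_FAIL are the ones used in this document.

```python
def validate_ops(opset, allowlist_ops, denied_ops, nondeterministic_ops=(), exempt=()):
    """Return one message per operator-policy violation found in opset."""
    errors = []
    for op in opset:
        if op in denied_ops:
            errors.append(f"{op}: listed in denied_ops")
        elif op not in allowlist_ops:
            errors.append(f"{op}: not in allowlist_ops")
        if op in nondeterministic_ops and op not in exempt:
            errors.append(f"{op}: E_NONDETERMINISM (not declared in determinism.exempt)")
    return errors


def validate_calibration(precision: dict) -> list:
    """INT8 quantization requires a non-empty int8_calib pointer."""
    if precision.get("quant_scheme") == "int8" and not precision.get("int8_calib"):
        return ["E_CALIBRATION_FAIL: quant_scheme int8 requires int8_calib"]
    return []


# Allow and deny lists taken from the IPC excerpt in the appendix; "conv" is a made-up extra op.
op_errors = validate_ops(
    opset=["matmul", "random_like", "conv"],
    allowlist_ops=["matmul", "gelu", "softmax"],
    denied_ops=["random_like"],
    nondeterministic_ops={"random_like"},
)
calib_errors = validate_calibration({"quant_scheme": "int8", "int8_calib": None})
```

A real validator would also resolve int8_calib against an actual calibration_ref; this sketch only checks that the pointer is present.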
- Composition highlights
The resulting configuration must respect invariants: the cap on TS.error must be consistent with timeout_ms, batch_size, and parallelism via the queue consistency rule (see S42-34).
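For the I40-24 prototype listed in this section, one possible shape of a card diff is sketched below. The structure of DiffReport is not defined in this chapter, so the plain dictionary with added, removed, and changed lists of dotted paths is purely an assumption, as is the helper flatten.

```python
def flatten(card: dict, prefix: str = "") -> dict:
    """Map nested dict keys to dotted paths, e.g. {"opts": {"x": 1}} -> {"opts.x": 1}."""
    flat = {}
    for key, value in card.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def diff_cards(a: dict, b: dict) -> dict:
    """Compare two cards and report added, removed, and changed dotted paths."""
    fa, fb = flatten(a), flatten(b)
    return {
        "added": sorted(fb.keys() - fa.keys()),
        "removed": sorted(fa.keys() - fb.keys()),
        "changed": sorted(k for k in fa.keys() & fb.keys() if fa[k] != fb[k]),
    }


old_pc = {"opts": {"batch_size": 16, "timeout_ms": 200}}
new_pc = {"opts": {"batch_size": 32, "timeout_ms": 200, "device": "cuda:0"}}
report = diff_cards(old_pc, new_pc)
```

Lists are compared as whole values in this sketch, so a reordered list counts as a change.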
VI. Metrology Flows & Run Diagram (Mx-53 → Mx-58)
- Mx-53 Drafting & schema validation
Author IPC/PC per this chapter’s dictionary; run I40-20/21 to produce ValidateReport. Only cards that pass every "must" check proceed to the registry.
- Mx-54 Fingerprint & signature
Execute I40-23 to produce the fingerprint and signature. Record fingerprint = hash( canon( IPC || PC || EnvLock ) ) and the public-key reference.
- Mx-55 Offline replay & parity dry-run
Compose Runtime via I40-22 and replay on baseline data to generate ConsistencyReport and ScoreReport. Require R_infer >= τ_cons before canary.
- Mx-56 Canary rollout & observability
Deploy IPC/PC to the canary channel and bind Chapter 8 SLI computation. Trigger rollback if budget.used > τ_budget or delta_offon > τ_delta.
- Mx-57 Impact analysis & threshold convergence
Use I40-24/25 to produce ImpactReport; update ParamCard.constraints and SLO thresholds; mint a PC minor version.
- Mx-58 Archival & forensics
Archive IPC.yaml, PC.yaml, ValidateReport, DiffReport, ImpactReport, ConsistencyReport, ScoreReport, with signature and fingerprint attached.
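The two gates in this flow, the canary entry condition in Mx-55 and the rollback trigger in Mx-56, reduce to simple threshold comparisons. The sketch below spells them out; the function names and the numeric thresholds are hypothetical, since the chapter gives no values for τ_cons, τ_budget, or τ_delta.

```python
def may_enter_canary(r_infer: float, tau_cons: float) -> bool:
    """Mx-55 gate: offline replay must reach the consistency threshold."""
    return r_infer >= tau_cons


def should_rollback(budget_used: float, tau_budget: float,
                    delta_offon: float, tau_delta: float) -> bool:
    """Mx-56 trigger: roll back if either the budget or the parity limit is exceeded."""
    return budget_used > tau_budget or delta_offon > tau_delta


# Illustrative thresholds only.
TAU_CONS, TAU_BUDGET, TAU_DELTA = 0.98, 0.5, 0.01

promote = may_enter_canary(r_infer=0.991, tau_cons=TAU_CONS)
rollback = should_rollback(budget_used=0.2, tau_budget=TAU_BUDGET,
                           delta_offon=0.03, tau_delta=TAU_DELTA)
```

In this example the candidate passes the canary gate, but rollback is triggered because delta_offon exceeds τ_delta even though the budget is within limits.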
VII. Verification & Test Matrix
- Fingerprint & signature consistency: across nodes, the same IPC/PC/EnvLock produce identical fingerprint, and verify(signature, fingerprint, pubkey) = true.
- Parameter boundaries: for each constraints entry, inject min/max/enum violations and expect E_SCHEMA_MISMATCH or E_PRECISION_LOSS accordingly.
- Dimensional audits: run check_dim(expr) on time.window, io.max_bytes_in, and precision.dtype_policy—all must pass.
- Nondeterminism isolation: introduce a stochastic op; if not declared in determinism.exempt, trigger E_NONDETERMINISM.
- Parity check: compose_runtime yields delta_offon <= τ_delta for the same anchor in offline/online paths.
- Quantization & calibration: for INT8, missing calibration_ref must trigger E_CALIBRATION_FAIL; after loading, ECE must not exceed gate thresholds.
VIII. Cross-References & Dependencies
- Chapter 4: features.mapping, windows, and de-identification lineage.
- Chapter 5: graph, precision, operator stability, and determinism policies.
- Chapter 6: timeline alignment ts = alpha + beta * tau_mono and delta_offon.
- Chapter 7: uncertainty and calibration_ref.
- Chapter 8: observability, slo_ref, ScoreReport, and budget governance.
- Core volumes: Core.DataSpec, Core.Threads, Core.Metrology anchor field schemas and observability conventions.
IX. Risks, Limitations & Open Questions
- Schema drift: uncontrolled field growth and private extensions can destabilize canon(•); freeze the spec and define an extension-bit policy.
- Key management: signature key rotation/revocation must decouple from release channels; introduce key_id and validity windows.
- Cross-device equivalence: switching device_class shifts TS.latency_p99 and R_infer; PC must declare explicit equivalence test strategies.
- Multi-card composition: multi-stage pipeline compositions across IPC boundaries are not fully formalized; consider extending to a PipelineSetCard.
X. Deliverables & Versioning
- Deliverables
  - IPC.yaml — inference pipeline card (anchor/graph/features/precision/uncertainty/observability/slo_ref/env_req/opts).
  - PC.yaml — parameter card (opts/io/precision/time/constraints/overrides).
  - ValidateReport.json, ConsistencyReport.json, ScoreReport.json, DiffReport.json, ImpactReport.json.
  - fingerprint.txt and signature.asc; release fingerprint index and public-key reference.
- Versioning policy
  - Changing graph/opset/precision/env_req or features.mapping → major version bump and recompute fingerprint.
  - Changing opts/constraints/time/overrides without touching graph/precision → minor version bump.
  - Documentary fields (description, notes) are excluded from the fingerprint and listed under non_fingerprint_fields.
  - Release rule: canary → stable → LTS, with CHANGELOG and impact domain recorded in Appendix C.
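The versioning policy can be read as a classifier over the set of changed fields. The sketch below assumes changes are given as dotted paths such as graph.opset or opts.batch_size, which is a representation chosen for illustration. Paths that the policy does not name, such as features.window, are ignored by the sketch rather than guessed at.

```python
MAJOR_SECTIONS = {"graph", "precision", "env_req"}
MAJOR_PATHS = {"features.mapping"}
MINOR_SECTIONS = {"opts", "constraints", "time", "overrides"}


def classify_bump(changed_paths) -> str:
    """Return "major", "minor", or "none" for a collection of changed dotted paths."""
    bump = "none"
    for path in changed_paths:
        section = path.split(".")[0]
        is_major_path = any(path == p or path.startswith(p + ".") for p in MAJOR_PATHS)
        if section in MAJOR_SECTIONS or is_major_path:
            return "major"  # any major-class change decides the outcome
        if section in MINOR_SECTIONS:
            bump = "minor"
    return bump


minor_only = classify_bump(["opts.batch_size", "time.stride"])
mixed = classify_bump(["opts.batch_size", "graph.opset"])
docs_only = classify_bump(["meta.description"])
```

A mixed change set resolves to the highest class present, so touching graph.opset alongside opts.batch_size yields a major bump.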
Appendix: IPC & PC Reference Skeleton
- InferPipelineCard (excerpt)
  - meta: { name, version, owner }
  - anchor: { model_id, artifacts: [uri], dataset_ref, preproc_ref, postproc_ref }
  - graph: { inputs: [{name, dtype, shape}], outputs: [{name, dtype, shape}], opset, determinism: {exempt:[], fallback:""}, constraints: [] }
  - features: { mapping: [{field, expr}], window: {size, stride}, alignment: {alpha, beta} }
  - precision: { dtype_policy: "fp16", quant_scheme: "none", calibration_ref: null }
  - uncertainty: { method: "mc_dropout", outputs: ["mean","var","q05","q95"], coverage: 0.95 }
  - observability: { metrics: ["TS.latency","TS.thrpt","TS.error"], buckets: "kll", window: "5m" }
  - slo_ref: { id: "SLO.Infer.v1", targets: { p99: "120ms", avail: 0.999, cost_u: "¥0.0005" } }
  - env_req: { compiler: "gcc-11", driver: "nvidia-535", runtime_libs: ["cudnn-8.9"], device_class: "gpu" }
  - security: { allowlist_ops: ["matmul","gelu","softmax"], denied_ops: ["random_like"] }
  - opts: { batch_mode: "dynamic", parallelism: 2, warmup: 100 }
- ParamCard (excerpt)
  - meta: { name, version, parent_fingerprint }
  - opts: { batch_size: 16, device: "cuda:0", num_threads: 4, rng: {seed: 1234, family: "philox"}, timeout_ms: 200 }
  - io: { max_bytes_in: "2MB", max_bytes_out: "1MB", compress: "gzip", serialize: "msgpack" }
  - precision: { dtype_policy: "fp16", quant_scheme: "none", int8_calib: null, fp16_safety: "amp_02" }
  - time: { window: "T+0", stride: "1s", watermark: "2s", clock_align: {alpha: 0.0, beta: 1.0} }
  - constraints: [ { key: "batch_size", unit: "count", min: 1, max: 256 }, { key: "timeout_ms", unit: "ms", min: 50, max: 1000 } ]
  - overrides: { preproc: {normalize: "zscore"}, postproc: {threshold: 0.5} }
Copyright & License (CC BY 4.0)
Copyright: Unless otherwise noted, the copyright of “Energy Filament Theory” (text, charts, illustrations, symbols, and formulas) belongs to the author “Guanglin Tu”.
License: This work is licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0). You may copy, redistribute, excerpt, adapt, and share for commercial or non‑commercial purposes with proper attribution.
Suggested attribution: Author: “Guanglin Tu”; Work: “Energy Filament Theory”; Source: energyfilament.org; License: CC BY 4.0.
First published: 2025-11-11 | Current version: v5.1
License link: https://creativecommons.org/licenses/by/4.0/