EFT.WP.Data.ModelCards v1.0
Chapter 7 Architecture & Parameters
I. Chapter Purpose & Scope
Fix the normative definition of architecture and associated parameters, the counting posture, and reproducible implementation details, covering backbone/head/modular composition, parameter count M_param, activations/normalization/positional encoding, initialization & precision policy, regularization, and the linkage to structured compression; ensure consistency with Task I/O, the evaluation protocol, and the Metrology chapter.
II. Terminology & Dependencies
- Terminology source: Comprehensive Template v0.1; this chapter only adds fields directly tied to architecture/parameters.
- Dependent volumes: Core.DataSpec v1.0 (contracts/exports); Core.Metrology v1.0 (units & dimensional checks); Core.Equations v1.1 (if path/arrival-time appears). Inline symbols use backticks; any division/integral/composite operator must use parentheses; no Chinese in formulas/symbols/definitions.
III. Fields & Structure (Normative)
architecture:
  version: "v1.0"
  backbone: "<string>"                 # e.g., resnet50 | vit-b | conformer-xs | transformer-base
  topology:
    stages:                            # ordered module/stage list (order = topology)
      - {name: "stem",   type: "conv",   params: {out: 64, k: 7, s: 2, norm: "bn", act: "relu"}}
      - {name: "stage1", type: "resblk", repeat: 3, params: {out: 256, bottleneck: true}}
      - {name: "stage2", type: "resblk", repeat: 4, params: {out: 512}}
      - {name: "head",   type: "linear", params: {out_dim: 1000}}
  positional_encoding: {type: "sinusoidal|learned|none", dim: 768?}
  norm: {type: "bn|ln|rmsnorm", eps: 1e-5, affine: true}
  act: {type: "relu|gelu|silu|tanh"}
  dropout: {p: 0.1}
  attention: {type: "msa|lsa|flash", heads: 12?, window: 16?}
  mixed_precision: {train: "fp16|bf16|fp32", infer: "fp16|bf16|fp32", loss_scale: "dynamic|static|none"}
  init:
    scheme: "kaiming_uniform|xavier_normal|trunc_normal"
    seed: 1701
  params_report:
    M_param: 25.6                      # million (M)
    FLOPs: 4.1e9                       # per-sample inference
    T_inf: 3.8                         # ms/sample (batch=1; record device/driver elsewhere)
  constraints:
    grad_ckpt: true
    amp_safe_ops: ["conv", "gemm"]
  see:
    - "EFT.WP.Core.Metrology v1.0:check_dim"
(M_param/FLOPs/T_inf units and posture are validated by the Metrology chapter; any I/O-related shapes must match Chapter 6.)
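Before full Schema validation (Section XI), the fragment above can be loaded and probed for its required fields. This is a minimal sketch, assuming Python with PyYAML; the file name model_card.yaml and the iteration logic are illustrative, not part of the normative fields.

```python
# Minimal sketch: load a Model Card and check architecture's required fields.
# Assumptions: PyYAML is available; "model_card.yaml" is a hypothetical file name.
import yaml

REQUIRED = ("version", "backbone", "topology")

with open("model_card.yaml", "r", encoding="utf-8") as fh:
    card = yaml.safe_load(fh)

arch = card.get("architecture", {})
missing = [k for k in REQUIRED if k not in arch]
if missing:
    raise ValueError(f"architecture is missing required fields: {missing}")

# This chapter shows topology both as a bare list (Section IX) and nested under
# `stages` (the template above); handle either form when iterating.
topo = arch["topology"]
stages = topo.get("stages", []) if isinstance(topo, dict) else topo
for stage in stages:
    print(stage["name"], stage["type"], stage.get("repeat", 1))
```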
IV. Parameter Counting & Metrology Posture
- M_param: Count trainable parameters; optimizer state excluded by default. If including frozen parameters or sparsity masks, note it under params_report.notes.
- FLOPs: Use single-sample forward as default; if dependent on sequence length/resolution, provide a function form or nominal points (e.g., 224×224, T=16000).
- T_inf: Record device/batch/framework & kernel versions; unit is ms; report a reproducible statistic (e.g., median of 5 runs under a fixed environment). A counting/timing sketch follows this list.
- Dimensional consistency: Declare units in Schema and pass check_dim.
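The counting and timing posture above can be made concrete as follows. This is a sketch under the assumption that the model is a PyTorch module (the chapter itself is framework-neutral); report_params_and_latency and its arguments are illustrative names.

```python
# Sketch only; PyTorch is an assumption (this chapter does not mandate a framework).
import statistics
import time

import torch


def report_params_and_latency(model: torch.nn.Module, sample: torch.Tensor, runs: int = 5) -> dict:
    """Report M_param (millions, trainable only) and T_inf (ms, median of `runs` single-sample passes)."""
    # M_param: trainable parameters only; frozen weights and optimizer state are excluded by default.
    m_param = sum(p.numel() for p in model.parameters() if p.requires_grad) / 1e6

    model.eval()
    timings_ms = []
    with torch.no_grad():
        for _ in range(runs):
            t0 = time.perf_counter()
            model(sample)
            timings_ms.append((time.perf_counter() - t0) * 1e3)

    # T_inf: a reproducible statistic (median) under a fixed environment; device/batch/driver
    # versions still have to be recorded alongside these numbers, as required above.
    return {"M_param": round(m_param, 2), "T_inf": round(statistics.median(timings_ms), 2)}
```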
V. Module Catalog & Constraints (Common Types)
- Conv/Residual blocks: conv/resblk; params: out,k,s,p,bottleneck; standardize residual connections and downsampling triggers.
- Transformer encoder/decoder: self_attn/cross_attn/ffn; enforce heads * head_dim = model_dim; expose masking/causality under constraints (a dimension-check sketch follows this list).
- Normalization/Activation: Declare at norm/act top level; layer overrides must be explicit in module params.
- Positional encoding: sinusoidal|learned|rope etc.; align with input length/padding policy.
- Attention impl: msa (standard) | lsa (local) | flash (kernel-optimized); state windowing and memory/complexity improvements.
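The heads * head_dim = model_dim requirement for Transformer blocks can be enforced mechanically; a minimal, framework-free sketch (the function name is illustrative):

```python
# Sketch of the divisibility constraint for attention blocks; function name is illustrative.
def check_attention_dims(model_dim: int, heads: int) -> int:
    """Return head_dim if heads * head_dim == model_dim holds exactly, else raise."""
    head_dim, remainder = divmod(model_dim, heads)
    if remainder != 0:
        raise ValueError(f"heads ({heads}) must divide model_dim ({model_dim}) exactly")
    return head_dim


assert check_attention_dims(768, 12) == 64  # matches the vit-b block in Section IX
```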
VI. Initialization, Precision & Device Policy
- init.scheme: Specify weight/bias strategy; for distributed/sharded init, provide pseudorandom graph and seed matrix.
- mixed_precision: Training/inference precision and loss scaling; provide allow/deny lists for AMP-unsafe ops and the fallback rules (an init/precision sketch follows this list).
- Gradient checkpointing/recompute: Flag via constraints.grad_ckpt; document impact on FLOPs/T_inf.
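A minimal sketch of how init.scheme = "trunc_normal" with a fixed seed and mixed_precision.train = "bf16" could be realized, assuming PyTorch on a CUDA device; the std value and helper names are illustrative, not prescribed.

```python
# Sketch only; assumes PyTorch and a CUDA device. The std value and function names are illustrative.
import torch


def init_linear_weights(model: torch.nn.Module, seed: int = 1701, std: float = 0.02) -> None:
    """Seeded truncated-normal init for linear layers (init.scheme="trunc_normal", init.seed)."""
    torch.manual_seed(seed)  # fix the pseudorandom stream so the init is reproducible
    for module in model.modules():
        if isinstance(module, torch.nn.Linear):
            torch.nn.init.trunc_normal_(module.weight, std=std)
            if module.bias is not None:
                torch.nn.init.zeros_(module.bias)


def forward_bf16(model: torch.nn.Module, batch: torch.Tensor) -> torch.Tensor:
    """mixed_precision.train="bf16": bf16 autocast typically needs no loss scaling (loss_scale
    is mainly relevant for fp16); ops autocast does not treat as safe remain in fp32."""
    with torch.autocast(device_type="cuda", dtype=torch.bfloat16):
        return model(batch)
```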
VII. Regularization & Structured Techniques (linked to Chapter 5 Compression)
- Weight decay/dropout: Declare weight decay under optimization and dropout under architecture.dropout.
- Structured sparsity/channel pruning: If enabled, put details in compression and reflect topology/shape impacts in this chapter’s constraints.
- Distillation alignment: Ensure compression.distillation.teacher is shape-compatible with architecture.backbone.
VIII. Consistency with Task I/O & Evaluation Protocol
- Shape coherence: Spatial/temporal changes in architecture.topology must align with io_schema (Chapter 6).
- Evaluation reproducibility: List the checkpoint and weight hashes used for evaluation in export_manifest.artifacts[]; record seed/repeats in evaluation (a hashing sketch follows this list).
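One way to produce the checkpoint hashes listed in export_manifest.artifacts[] is sketched below; the path, field names, and role label are illustrative rather than prescribed by this chapter.

```python
# Sketch; checkpoint path, field names, and role label are illustrative.
import hashlib
import json
from pathlib import Path


def manifest_entry(ckpt_path: str) -> dict:
    """Hash a checkpoint so export_manifest.artifacts[] can pin the exact evaluated weights."""
    digest = hashlib.sha256(Path(ckpt_path).read_bytes()).hexdigest()  # chunked reads may suit very large files
    return {"path": ckpt_path, "sha256": digest, "role": "evaluation_checkpoint"}


print(json.dumps(manifest_entry("checkpoints/vit_b_final.pt"), indent=2))
```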
IX. Machine-Readable Fragment (Drop-in)
architecture:
  version: "v1.0"
  backbone: "vit-b"
  topology:
    - {name: "patchify", type: "conv", params: {k: 16, s: 16, out: 768}}
    - {name: "enc1", type: "transformer_block", repeat: 12,
       params: {dim: 768, heads: 12, mlp_ratio: 4.0, act: "gelu", norm: "ln"}}
    - {name: "head", type: "linear", params: {out_dim: 1000}}
  positional_encoding: {type: "sinusoidal", dim: 768}
  mixed_precision: {train: "bf16", infer: "bf16", loss_scale: "dynamic"}
  init: {scheme: "trunc_normal", seed: 1701}
  params_report: {M_param: 86.6, FLOPs: 1.8e10, T_inf: 6.2}
X. Linkage to Path-Dependent Quantities (if applicable)
If the architecture contains path-dependent operators/subnets (e.g., a learnable refractive-index mapping or delay-estimation heads), register in the Model Card:
- path_dependence.delta_form, path = "gamma(ell)", measure = "d ell".
- Two equivalent T_arr expressions:
  - `T_arr = ( 1 / c_ref ) * ( ∫ n_eff d ell )`
  - `T_arr = ( ∫ ( n_eff / c_ref ) d ell )`
  and pass check_dim.
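A small numerical check of the equivalence, valid when c_ref is constant along the path; the n_eff profile below is synthetic and purely illustrative.

```python
# Numerical sketch of the two T_arr forms above, assuming c_ref is path-independent.
import numpy as np


def integrate(values: np.ndarray, coords: np.ndarray) -> float:
    """Trapezoidal rule along the path coordinate ell."""
    return float(np.sum(0.5 * (values[1:] + values[:-1]) * np.diff(coords)))


c_ref = 299_792_458.0                      # m / s
ell = np.linspace(0.0, 1_000.0, 2_001)     # samples of the path coordinate along gamma(ell), in m
n_eff = 1.0 + 1e-4 * np.sin(ell / 150.0)   # made-up effective index along the path

t_arr_a = (1.0 / c_ref) * integrate(n_eff, ell)   # T_arr = ( 1 / c_ref ) * ( ∫ n_eff d ell )
t_arr_b = integrate(n_eff / c_ref, ell)           # T_arr = ( ∫ ( n_eff / c_ref ) d ell )

assert np.isclose(t_arr_a, t_arr_b)               # the two forms coincide for path-independent c_ref
print(f"T_arr ≈ {t_arr_a:.6e} s")                 # dimensions: m / (m/s) = s, consistent with check_dim
```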
XI. Machine-Readable Schema (Excerpt, Normative)
# I15-7 Architecture & Params (excerpt)
properties:
  architecture:
    type: object
    required: [version, backbone, topology]
    properties:
      version: {type: string}
      backbone: {type: string}
      topology: {type: array, items: {type: object, properties: {
        name: {type: string}, type: {type: string}, repeat: {type: integer},
        params: {type: object}}}}
      positional_encoding: {type: object}
      norm: {type: object}
      act: {type: object}
      dropout: {type: object}
      attention: {type: object}
      mixed_precision: {type: object}
      init: {type: object, properties: {scheme: {type: string}, seed: {type: integer}}}
      params_report: {type: object, properties: {M_param: {type: number}, FLOPs: {type: number}, T_inf: {type: number}}}
      constraints: {type: object}
(Declare units at the Schema layer and validate via Core.Metrology v1.0; citations use “Volume vX.Y:Anchor”.)
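The excerpt maps directly onto JSON Schema, so conformance can be checked programmatically. A minimal sketch, assuming the jsonschema and PyYAML packages (the tooling choice is an assumption, and only part of the excerpt is transcribed):

```python
# Sketch only; assumes the jsonschema and PyYAML packages. Subset of the I15-7 excerpt above.
import jsonschema
import yaml

SCHEMA = {
    "type": "object",
    "properties": {
        "architecture": {
            "type": "object",
            "required": ["version", "backbone", "topology"],
            "properties": {
                "version": {"type": "string"},
                "backbone": {"type": "string"},
                "topology": {"type": "array", "items": {"type": "object"}},
            },
        }
    },
}

fragment = yaml.safe_load("""
architecture:
  version: "v1.0"
  backbone: "vit-b"
  topology:
    - {name: "head", type: "linear", params: {out_dim: 1000}}
""")

jsonschema.validate(instance=fragment, schema=SCHEMA)  # raises ValidationError on mismatch
print("architecture fragment conforms to the I15-7 excerpt")
```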
XII. Chapter Compliance Checklist
- architecture declares backbone/topology/init/precision and parameter report, consistent with Chapter 6 I/O and evaluation scripts.
- M_param/FLOPs/T_inf units & measurement posture are explicit and pass check_dim; device/batch/environment locks ensure reproducibility.
- If path dependence exists, delta_form/path/measure registered and metrology checks passed; all formulas use backticks and parentheses with no Chinese.
- Interfaces to compression/explainability are clear; export artifacts and hashes are listed in export_manifest.
Copyright & License (CC BY 4.0)
Copyright: Unless otherwise noted, the copyright of “Energy Filament Theory” (text, charts, illustrations, symbols, and formulas) belongs to the author “Guanglin Tu”.
License: This work is licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0). You may copy, redistribute, excerpt, adapt, and share for commercial or non‑commercial purposes with proper attribution.
Suggested attribution: Author: “Guanglin Tu”; Work: “Energy Filament Theory”; Source: energyfilament.org; License: CC BY 4.0.
First published: 2025-11-11 | Current version: v5.1
License link: https://creativecommons.org/licenses/by/4.0/