EFT.WP.Data.ModelCards v1.0

Chapter 10 Objectives, Optimization & Hyperparameters


I. Chapter Purpose & Scope

Fix the declarations of optimization and hyperparams: objective functions, search space and values, randomness and stopping criteria, regularization and constraints, learning rate and schedulers, mixed precision and gradient clipping, early stopping and rollback. Ensure consistency with the normative definitions in Tasks & I/O, Training Data & Sampling Binding, Preprocessing & Feature Engineering, Evaluation Protocol & Metrics, and the Metrology chapter.

II. Fields & Structure (Normative)

optimization:
  objective:
    name: "<cross_entropy|mse|mae|nll|ctc|triplet|contrastive|custom>"
    reduction: "<mean|sum|none>"
    weights?: {class:"<inverse_freq|log_inv|custom>", pos_neg: 1.0}
    formula?: "L(θ) = ( E_{(x,y)∼D} [ ℓ( f_θ(x), y ) ] )"
  regularization:
    weight_decay: 0.05
    l1: 0.0
    label_smoothing: 0.0
    grad_clip: {type:"<norm|value>", value: 1.0}
    constraints?: ["orthogonal_init","spectral_norm"]
  optimizer:
    name: "<sgd|adam|adamw|lamb|adagrad|lion|custom>"
    lr: 3.0e-4
    betas?: [0.9, 0.999]
    momentum?: 0.9
    eps?: 1.0e-8
    weight_decay?: 0.05
    amsgrad?: false
  scheduler:
    name: "<cosine|step|multistep|linear|poly|plateau|onecycle|custom>"
    warmup: {steps: 500, mode: "<linear|cosine|none>"}
    params?: {step_size: 30, gamma: 0.1}
  early_stopping:
    monitor: "val/f1_macro"
    mode: "max"
    patience: 12
    min_delta: 0.0
    rollback: true
  precision:
    amp: {train:"<fp16|bf16|fp32>", infer:"<fp16|bf16|fp32>", loss_scale:"<dynamic|static|none>"}
  seeds:
    global: 1701
    per_phase?: {train:[1701,1702,1703], eval:[1701]}
  stopping_criteria:
    max_epochs: 200
    max_steps?: null
    wallclock_hours?: null
  budget:
    gpu_hours: 120
    trials: 32
    notes?: "<non-normative>"

hyperparams:
  batch_size: 256
  accum_steps: 1
  epochs: 200
  grad_accum?: true
  dropout: 0.1
  label_smoothing?: 0.0
  temperature?: null
  mixup_cutmix?: {mixup_alpha:0.0, cutmix_alpha:0.0}
  search_space?:
    lr: {type:"loguniform", low:1.0e-5, high:1.0e-3}
    weight_decay: {type:"loguniform", low:1.0e-5, high:1.0e-1}
    batch_size: {type:"choice", values:[128,256,512]}
  search_algo?: "<grid|random|bayes|evolution|pbt>"
  search_seed?: 1701
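The `search_space?` block above is machine-consumable: a seeded sampler can draw reproducible trial configurations from it. A minimal sketch in Python, assuming plain random search (the helper `sample_trial` is illustrative, not part of the spec):

```python
import math
import random

# Search space mirroring the hyperparams.search_space fragment above.
SEARCH_SPACE = {
    "lr":           {"type": "loguniform", "low": 1.0e-5, "high": 1.0e-3},
    "weight_decay": {"type": "loguniform", "low": 1.0e-5, "high": 1.0e-1},
    "batch_size":   {"type": "choice", "values": [128, 256, 512]},
}

def sample_trial(space, rng):
    """Draw one hyperparameter configuration from a declared search space."""
    trial = {}
    for name, spec in space.items():
        if spec["type"] == "loguniform":
            # Sample uniformly in log space, then exponentiate.
            lo, hi = math.log(spec["low"]), math.log(spec["high"])
            trial[name] = math.exp(rng.uniform(lo, hi))
        elif spec["type"] == "choice":
            trial[name] = rng.choice(spec["values"])
        else:
            raise ValueError(f"unsupported type: {spec['type']}")
    return trial

# search_seed (1701) makes the whole trial sequence reproducible,
# and budget.trials (32) bounds the number of draws.
rng = random.Random(1701)
trials = [sample_trial(SEARCH_SPACE, rng) for _ in range(32)]
```

Because the generator is seeded once for the entire sequence, re-running the search with the same `search_seed` reproduces the same 32 trials, which is what makes the trial log auditable.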


III. Objective Functions & Weighting Posture


IV. Optimizer, Learning Rate & Schedulers


V. Regularization & Gradient Constraints


VI. Randomness, Stopping & Budget


VII. Metrology & Units (physical/time/frequency/perf)

  1. Learning rate, latency, throughput, and power fields declare units and pass check_dim.
  2. If objectives or constraints involve path-dependent quantities (e.g., T_arr), register delta_form, and validate against one of the two equivalent expressions:
    • T_arr = ( 1 / c_ref ) * ( ∫ n_eff d ell )
    • T_arr = ( ∫ ( n_eff / c_ref ) d ell ).
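The two forms are algebraically identical when c_ref is constant along the path, so a validator can integrate both and compare. A numeric sketch using trapezoidal integration (the n_eff profile and path are illustrative values, not from the spec):

```python
import math

# Illustrative discretized path and effective-index profile.
c_ref = 2.99792458e8                                  # m/s
ell = [i * 10.0 for i in range(101)]                  # path coordinate, 0..1000 m
n_eff = [1.0 + 1e-4 * math.sin(x / 100.0) for x in ell]

def trapz(ys, xs):
    """Trapezoidal rule: approximate the integral of y along x."""
    return sum((ys[i] + ys[i + 1]) * (xs[i + 1] - xs[i]) / 2.0
               for i in range(len(xs) - 1))

# Form 1: T_arr = ( 1 / c_ref ) * ( ∫ n_eff d ell )
t1 = (1.0 / c_ref) * trapz(n_eff, ell)
# Form 2: T_arr = ( ∫ ( n_eff / c_ref ) d ell )
t2 = trapz([n / c_ref for n in n_eff], ell)
```

The two results agree up to floating-point rounding; a delta_form check would flag any larger discrepancy as a unit or path-binding error.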

VIII. Machine-Readable Fragment (Drop-in)

optimization:
  objective: {name:"cross_entropy", reduction:"mean", weights:{class:"inverse_freq"}}
  regularization: {weight_decay:0.05, label_smoothing:0.0, grad_clip:{type:"norm", value:1.0}}
  optimizer: {name:"adamw", lr:3.0e-4, betas:[0.9,0.999], eps:1.0e-8, weight_decay:0.05}
  scheduler:
    name: "cosine"
    warmup: {steps:500, mode:"linear"}
  early_stopping: {monitor:"val/f1_macro", mode:"max", patience:12, rollback:true}
  precision: {amp:{train:"bf16", infer:"bf16", loss_scale:"dynamic"}}
  seeds: {global:1701}
  stopping_criteria: {max_epochs:200}
  budget: {gpu_hours:120, trials:32}

hyperparams:
  batch_size: 256
  accum_steps: 1
  epochs: 200
  dropout: 0.1
  search_space:
    lr: {type:"loguniform", low:1.0e-5, high:1.0e-3}
    weight_decay: {type:"loguniform", low:1.0e-5, high:1.0e-1}
    batch_size: {type:"choice", values:[128,256,512]}
  search_algo: "bayes"
  search_seed: 1701
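The scheduler in the fragment (cosine decay with 500 linear warmup steps) determines a fully deterministic learning-rate curve. A minimal sketch, assuming decay to zero over a total step budget (the function name `lr_at_step` and `total_steps` value are illustrative):

```python
import math

def lr_at_step(step, base_lr=3.0e-4, warmup_steps=500, total_steps=100_000):
    """Linear warmup to base_lr, then cosine decay toward zero."""
    if step < warmup_steps:
        # Warmup phase: ramp linearly from 0 to base_lr.
        return base_lr * step / warmup_steps
    # Cosine phase: progress t runs from 0 (end of warmup) to 1 (last step).
    t = (step - warmup_steps) / (total_steps - warmup_steps)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * t))
```

Declaring the curve this way lets an auditor recompute the learning rate at any logged step and compare it against the training log, instead of trusting the log alone.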


IX. Consistency with Evaluation, Architecture & Features


X. Export Manifest & Audit Trail

export_manifest:
  artifacts:
    - {path:"opt/hparams.yaml", sha256:"..."}
    - {path:"opt/search_space.yaml", sha256:"..."}
    - {path:"opt/search_trials.csv", sha256:"..."}
  references:
    - "EFT.WP.Core.DataSpec v1.0:EXPORT"
    - "EFT.WP.Core.Metrology v1.0:check_dim"

The search space, trial logs, and final hyperparameters must be verifiable and consistent with the Model Card.
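The `sha256` entries in the manifest can be produced mechanically when exporting. A minimal sketch using the standard library (the helper `digest` is illustrative; the demo writes a temporary stand-in file rather than the real `opt/hparams.yaml`):

```python
import hashlib
import os
import tempfile

def digest(path, chunk=65536):
    """Stream a file through SHA-256 and return its hex digest."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while True:
            block = f.read(chunk)
            if not block:
                break
            h.update(block)
    return h.hexdigest()

# Demo: hash a temporary stand-in for an exported artifact and
# build the corresponding manifest entry.
with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "hparams.yaml")
    with open(path, "w") as f:
        f.write("optimizer: {name: adamw, lr: 3.0e-4}\n")
    entry = {"path": "opt/hparams.yaml", "sha256": digest(path)}
```

Recomputing the digest at audit time and comparing it to the manifest entry is what makes the exported hyperparameters verifiable.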

XI. Chapter Compliance Checklist


Copyright & License (CC BY 4.0)

Copyright: Unless otherwise noted, the copyright of “Energy Filament Theory” (text, charts, illustrations, symbols, and formulas) belongs to the author “Guanglin Tu”.
License: This work is licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0). You may copy, redistribute, excerpt, adapt, and share for commercial or non‑commercial purposes with proper attribution.
Suggested attribution: Author: “Guanglin Tu”; Work: “Energy Filament Theory”; Source: energyfilament.org; License: CC BY 4.0.

First published: 2025-11-11 | Current version: v5.1
License link: https://creativecommons.org/licenses/by/4.0/