EFT.WP.Data.ModelCards v1.0

Chapter 10 Objectives, Optimization & Hyperparameters


I. Chapter Purpose & Scope

Fix the declarations of optimization and hyperparams: objective functions, search space and values, randomness and stopping criteria, regularization and constraints, learning rate and schedulers, mixed precision and gradient clipping, early stopping and rollback. Ensure consistency with the normative definitions in Tasks & I/O, Training Data & Sampling Binding, Preprocessing & Feature Engineering, Evaluation Protocol & Metrics, and the Metrology chapter.

II. Fields & Structure (Normative)

optimization:
  objective:
    name: "<cross_entropy|mse|mae|nll|ctc|triplet|contrastive|custom>"
    reduction: "<mean|sum|none>"
    weights?: {class:"<inverse_freq|log_inv|custom>", pos_neg: 1.0}
    formula?: "L(θ) = ( E_{(x,y)∼D} [ ℓ( f_θ(x), y ) ] )"
  regularization:
    weight_decay: 0.05
    l1: 0.0
    label_smoothing: 0.0
    grad_clip: {type:"<norm|value>", value: 1.0}
    constraints?: ["orthogonal_init","spectral_norm"]
  optimizer:
    name: "<sgd|adam|adamw|lamb|adagrad|lion|custom>"
    lr: 3.0e-4
    betas?: [0.9, 0.999]
    momentum?: 0.9
    eps?: 1.0e-8
    weight_decay?: 0.05
    amsgrad?: false
  scheduler:
    name: "<cosine|step|multistep|linear|poly|plateau|onecycle|custom>"
    warmup: {steps: 500, mode: "<linear|cosine|none>"}
    params?: {step_size: 30, gamma: 0.1}
  early_stopping:
    monitor: "val/f1_macro"
    mode: "max"
    patience: 12
    min_delta: 0.0
    rollback: true
  precision:
    amp: {train:"<fp16|bf16|fp32>", infer:"<fp16|bf16|fp32>", loss_scale:"<dynamic|static|none>"}
  seeds:
    global: 1701
    per_phase?: {train:[1701,1702,1703], eval:[1701]}
  stopping_criteria:
    max_epochs: 200
    max_steps?: null
    wallclock_hours?: null
  budget:
    gpu_hours: 120
    trials: 32
    notes?: "<non-normative>"

hyperparams:
  batch_size: 256
  accum_steps: 1
  epochs: 200
  grad_accum?: true
  dropout: 0.1
  label_smoothing?: 0.0
  temperature?: null
  mixup_cutmix?: {mixup_alpha:0.0, cutmix_alpha:0.0}
  search_space?:
    lr: {type:"loguniform", low:1.0e-5, high:1.0e-3}
    weight_decay: {type:"loguniform", low:1.0e-5, high:1.0e-1}
    batch_size: {type:"choice", values:[128,256,512]}
  search_algo?: "<grid|random|bayes|evolution|pbt>"
  search_seed?: 1701
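The `search_space?` block above is machine-consumable: a seeded sampler can draw reproducible trial configurations from it. A minimal sketch in Python, assuming plain random search (the helper `sample_trial` is illustrative, not part of the spec):

```python
import math
import random

# Search space mirroring the hyperparams.search_space fragment above.
SEARCH_SPACE = {
    "lr":           {"type": "loguniform", "low": 1.0e-5, "high": 1.0e-3},
    "weight_decay": {"type": "loguniform", "low": 1.0e-5, "high": 1.0e-1},
    "batch_size":   {"type": "choice", "values": [128, 256, 512]},
}

def sample_trial(space, rng):
    """Draw one hyperparameter configuration from a declared search space."""
    trial = {}
    for name, spec in space.items():
        if spec["type"] == "loguniform":
            # Sample uniformly in log space, then exponentiate.
            lo, hi = math.log(spec["low"]), math.log(spec["high"])
            trial[name] = math.exp(rng.uniform(lo, hi))
        elif spec["type"] == "choice":
            trial[name] = rng.choice(spec["values"])
        else:
            raise ValueError(f"unsupported type: {spec['type']}")
    return trial

# search_seed (1701) makes the whole trial sequence reproducible,
# and budget.trials (32) bounds the number of draws.
rng = random.Random(1701)
trials = [sample_trial(SEARCH_SPACE, rng) for _ in range(32)]
```

Because the generator is seeded once for the entire sequence, re-running the search with the same `search_seed` reproduces the same 32 trials, which is what makes the trial log auditable.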


III. Objective Functions & Weighting Posture


IV. Optimizer, Learning Rate & Schedulers


V. Regularization & Gradient Constraints


VI. Randomness, Stopping & Budget


VII. Metrology & Units (physical/time/frequency/perf)

  1. Learning rate, latency, throughput, and power fields declare units and pass check_dim.
  2. If objectives or constraints involve path-dependent quantities (e.g., T_arr), register delta_form, and validate against one of the two equivalent expressions:
    • T_arr = ( 1 / c_ref ) * ( ∫ n_eff d ell )
    • T_arr = ( ∫ ( n_eff / c_ref ) d ell ).
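The two forms are algebraically identical when c_ref is constant along the path, so a validator can integrate both and compare. A numeric sketch using trapezoidal integration (the n_eff profile and path are illustrative values, not from the spec):

```python
import math

# Illustrative discretized path and effective-index profile.
c_ref = 2.99792458e8                                  # m/s
ell = [i * 10.0 for i in range(101)]                  # path coordinate, 0..1000 m
n_eff = [1.0 + 1e-4 * math.sin(x / 100.0) for x in ell]

def trapz(ys, xs):
    """Trapezoidal rule: approximate the integral of y along x."""
    return sum((ys[i] + ys[i + 1]) * (xs[i + 1] - xs[i]) / 2.0
               for i in range(len(xs) - 1))

# Form 1: T_arr = ( 1 / c_ref ) * ( ∫ n_eff d ell )
t1 = (1.0 / c_ref) * trapz(n_eff, ell)
# Form 2: T_arr = ( ∫ ( n_eff / c_ref ) d ell )
t2 = trapz([n / c_ref for n in n_eff], ell)
```

The two results agree up to floating-point rounding; a delta_form check would flag any larger discrepancy as a unit or path-binding error.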

VIII. Machine-Readable Fragment (Drop-in)

optimization:
  objective: {name:"cross_entropy", reduction:"mean", weights:{class:"inverse_freq"}}
  regularization: {weight_decay:0.05, label_smoothing:0.0, grad_clip:{type:"norm", value:1.0}}
  optimizer: {name:"adamw", lr:3.0e-4, betas:[0.9,0.999], eps:1.0e-8, weight_decay:0.05}
  scheduler:
    name: "cosine"
    warmup: {steps:500, mode:"linear"}
  early_stopping: {monitor:"val/f1_macro", mode:"max", patience:12, rollback:true}
  precision: {amp:{train:"bf16", infer:"bf16", loss_scale:"dynamic"}}
  seeds: {global:1701}
  stopping_criteria: {max_epochs:200}
  budget: {gpu_hours:120, trials:32}

hyperparams:
  batch_size: 256
  accum_steps: 1
  epochs: 200
  dropout: 0.1
  search_space:
    lr: {type:"loguniform", low:1.0e-5, high:1.0e-3}
    weight_decay: {type:"loguniform", low:1.0e-5, high:1.0e-1}
    batch_size: {type:"choice", values:[128,256,512]}
  search_algo: "bayes"
  search_seed: 1701
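The scheduler in the fragment (cosine decay with 500 linear warmup steps) determines a fully deterministic learning-rate curve. A minimal sketch, assuming decay to zero over a total step budget (the function name `lr_at_step` and `total_steps` value are illustrative):

```python
import math

def lr_at_step(step, base_lr=3.0e-4, warmup_steps=500, total_steps=100_000):
    """Linear warmup to base_lr, then cosine decay toward zero."""
    if step < warmup_steps:
        # Warmup phase: ramp linearly from 0 to base_lr.
        return base_lr * step / warmup_steps
    # Cosine phase: progress t runs from 0 (end of warmup) to 1 (last step).
    t = (step - warmup_steps) / (total_steps - warmup_steps)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * t))
```

Declaring the curve this way lets an auditor recompute the learning rate at any logged step and compare it against the training log, instead of trusting the log alone.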


IX. Consistency with Evaluation, Architecture & Features


X. Export Manifest & Audit Trail

export_manifest:
  artifacts:
    - {path:"opt/hparams.yaml", sha256:"..."}
    - {path:"opt/search_space.yaml", sha256:"..."}
    - {path:"opt/search_trials.csv", sha256:"..."}
  references:
    - "EFT.WP.Core.DataSpec v1.0:EXPORT"
    - "EFT.WP.Core.Metrology v1.0:check_dim"

The search space, trial logs, and final hyperparameters must be verifiable and consistent with the Model Card.
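The `sha256` entries in the manifest can be produced mechanically when exporting. A minimal sketch using the standard library (the helper `digest` is illustrative; the demo writes a temporary stand-in file rather than the real `opt/hparams.yaml`):

```python
import hashlib
import os
import tempfile

def digest(path, chunk=65536):
    """Stream a file through SHA-256 and return its hex digest."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while True:
            block = f.read(chunk)
            if not block:
                break
            h.update(block)
    return h.hexdigest()

# Demo: hash a temporary stand-in for an exported artifact and
# build the corresponding manifest entry.
with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "hparams.yaml")
    with open(path, "w") as f:
        f.write("optimizer: {name: adamw, lr: 3.0e-4}\n")
    entry = {"path": "opt/hparams.yaml", "sha256": digest(path)}
```

Recomputing the digest at audit time and comparing it to the manifest entry is what makes the exported hyperparameters verifiable.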

XI. Chapter Compliance Checklist


Copyright & License (CC BY 4.0)

Copyright: Unless otherwise noted, the copyright of “Energy Filament Theory” (text, charts, illustrations, symbols, and formulas) belongs to the author “Guanglin Tu”.
License: This work is licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0). You may copy, redistribute, excerpt, adapt, and share for commercial or non‑commercial purposes with proper attribution.
Suggested attribution: Author: “Guanglin Tu”; Work: “Energy Filament Theory”; Source: energyfilament.org; License: CC BY 4.0.

First published: 2025-11-11 | Current version: v5.1
License link: https://creativecommons.org/licenses/by/4.0/