Chapter 6 — Logging, Traceability, and Diagnostics
I. Objectives and Scope
- Objective: define a unified logging and traceability system from error occurrence to root-cause localization, ensuring events are replayable, evidence is verifiable, and diagnostics are quantifiable—aligned with I50 6 (log_event, traceback_summary, attach_traceability).
- Scope: full-chain evidencing and cross-volume binding for model residuals r, error budgets EB, and arrival time T_arr = ( ∫_gamma ( n_eff / c_ref ) d ell ).
II. Postulates (Recording and Traceability)
- P76-1 (Event atomicity): log each event at the minimal decidable unit; do not concatenate across stages. Cross-stage relations are expressed via trace_id and span_id.
- P76-2 (Reproducible evidence): every diagnostic conclusion must be statistically reproducible from the recorded data_fingerprint, code_fingerprint, unit_policy, RefCond, and theta.
- P76-3 (Minimal evidence set): any report involving T_arr must carry a gamma(ell) description, the discretization strategy for the measure d ell, h, p_hat, EB, U, and chi2 = r^T R r.
III. Recording Layer: Event Schema and Fields
- Event key
ts (UTC ISO8601), code (aligned with register_error_code), level ∈ {DEBUG,INFO,WARN,ERROR,CRITICAL}, domain, trace_id, span_id, parent_span_id|None.
- Semantics and metrics
message (short English sentence), metrics (key–value, e.g., chi2, RMSE, pass_rate), SLI (e.g., latency_ms, error_ratio), r_summary (mean, std, max|index).
- Metrology context
measurand, unit, value, U, k, EB, RefCond = { p_ref, Temp_ref, humidity_ref }, unit_policy.
- Model and data
model_id, theta (parameter digest), data_fingerprint (sha256 or blake3), code_fingerprint (commit or package version), seed.
- Path / arrival time
path_spec (discretized node spec for gamma(ell)), measure_dell (Δell statistics), h, p_hat, quadrature (e.g., trapezoid|simpson|adaptive).
- Example (flattened, plain text)
code=E.TIME.ARRIVAL.BUDGET, level=WARN, measurand="T_arr", value=3.214e-3, unit="s", U=2.7e-5, k=2, chi2=11.3, trace_id=4f1c....
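As an illustration of the event schema, the following minimal Python sketch builds one atomic event record and renders it in the flattened key=value form. The helper names make_event and flatten, the domain value "time", and the trace and span identifiers are hypothetical and are not defined by this volume; number formatting is a choice of the sketch and may differ from the example line.

```python
from datetime import datetime, timezone

LEVELS = ("DEBUG", "INFO", "WARN", "ERROR", "CRITICAL")

def make_event(code, level, domain, trace_id, span_id, parent_span_id=None, **fields):
    """Build one event record carrying the event-key fields plus optional extras."""
    if level not in LEVELS:
        raise ValueError(f"unknown level: {level!r}")
    event = {
        "ts": datetime.now(timezone.utc).isoformat(),  # UTC ISO8601 timestamp
        "code": code,
        "level": level,
        "domain": domain,
        "trace_id": trace_id,
        "span_id": span_id,
        "parent_span_id": parent_span_id,
    }
    event.update(fields)  # metrology, model, and path fields
    return event

def flatten(event, keys):
    """Render selected fields as 'key=value' pairs separated by commas."""
    bare = {"code", "level", "domain", "trace_id", "span_id"}  # printed unquoted
    parts = []
    for key in keys:
        value = event[key]
        if isinstance(value, str) and key not in bare:
            parts.append(f'{key}="{value}"')
        elif isinstance(value, float):
            parts.append(f"{key}={value:g}")
        else:
            parts.append(f"{key}={value}")
    return ", ".join(parts)

# Hypothetical identifiers; the numeric values are those of the example line.
event = make_event(
    "E.TIME.ARRIVAL.BUDGET", "WARN", "time", "trace-demo-0001", "span-01",
    measurand="T_arr", value=3.214e-3, unit="s", U=2.7e-5, k=2, chi2=11.3,
)
line = flatten(event, ["code", "level", "measurand", "value", "unit",
                       "U", "k", "chi2", "trace_id"])
print(line)
```

Keeping the record as a dictionary and flattening only for display lets the same event feed both structured storage and human-readable logs.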
IV. Trace Layer: Chains and Evidence
- Linkage
trace_id identifies the end-to-end flow; span_id identifies a step; cause_id may point to the triggering upstream event (e.g., calibration drift).
- Evidence objects
evidence = { artifact_uri, type, hash, created_at }, where artifact_uri may be artifact://report/..., artifact://plot/....
- Attachment convention
attach_traceability(report:dict, chain:list[str]) -> dict embeds traceability_chain = chain into the report; chain is a time-ordered list of artifact_uri created by upstream I40/I50 processes.
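The evidence object and the attachment convention can be sketched as follows. This is one possible reading, not the reference implementation: make_evidence is a hypothetical helper, sha256 is assumed as the hash, the artifact URIs are invented for the demonstration, and returning a copy instead of mutating the input report is a design choice of the sketch.

```python
import hashlib
from datetime import datetime, timezone

def make_evidence(artifact_uri, type_, payload: bytes) -> dict:
    """Build an evidence object {artifact_uri, type, hash, created_at}."""
    return {
        "artifact_uri": artifact_uri,
        "type": type_,
        "hash": hashlib.sha256(payload).hexdigest(),  # assumption: sha256
        "created_at": datetime.now(timezone.utc).isoformat(),
    }

def attach_traceability(report: dict, chain: list[str]) -> dict:
    """Return a copy of report with traceability_chain = chain embedded."""
    out = dict(report)
    out["traceability_chain"] = list(chain)  # keep the caller's time order
    return out

# Hypothetical artifacts produced by upstream processes, oldest first.
ev_plot = make_evidence("artifact://plot/demo-001", "plot", b"plot bytes")
ev_report = make_evidence("artifact://report/demo-001", "report", b"report bytes")

draft = {"measurand": "T_arr", "value": 3.214e-3, "unit": "s"}
final = attach_traceability(draft, [ev_plot["artifact_uri"], ev_report["artifact_uri"]])
```

Because the chain holds only URIs, a consumer that needs integrity checks would look up the matching evidence object and compare its hash against the stored artifact.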
V. Diagnostic Layer: From Event to Root Cause
- Statistical diagnostics
- Residual r def= y - f(x; theta); weighted statistic chi2 = r^T R r with weight matrix R; under Gaussian-noise assumptions, chi2 / dof ≈ 1 indicates a healthy fit.
- Pattern recognition: autocorrelation acf(r) that persists across lags suggests omitted model terms; heavy tails in r_bar = r / sigma suggest a noise-model mismatch (see StudentT(nu)).
- Structural diagnostics
- Dimensional closure: verify check_dim( y - f(x; theta) ); on failure, first attribute to unit policy or RefCond.
- Numerical symptoms: p_hat below method order, elevated E_round, or cancellation-shaped residuals point to discretization or stability issues (see Chapter 5).
- Causal summary
Generate traceback_summary(ex:any) -> str; when several factors contribute, also output a contribution ranking contrib_i aligned with EB.
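The statistical diagnostics of this section can be illustrated with a short pure-Python sketch. It assumes a diagonal weight matrix R = diag(1 / sigma^2), takes dof as the number of observations minus the number of fitted parameters, and uses the biased sample estimator for acf; the data values and the single fitted parameter are invented for the demonstration.

```python
def chi2_stat(r, R):
    """Weighted statistic chi2 = r^T R r for residuals r and weight matrix R."""
    n = len(r)
    return sum(r[i] * R[i][j] * r[j] for i in range(n) for j in range(n))

def acf(r, lag):
    """Sample autocorrelation of r at the given lag (biased estimator)."""
    n = len(r)
    mean = sum(r) / n
    var = sum((x - mean) ** 2 for x in r)
    cov = sum((r[t] - mean) * (r[t + lag] - mean) for t in range(n - lag))
    return cov / var

# Invented observations y and model predictions f(x; theta).
y = [1.1, 1.9, 3.1, 3.9]
f = [1.0, 2.0, 3.0, 4.0]
sigma = 0.1                                   # assumed per-point noise level
r = [yi - fi for yi, fi in zip(y, f)]         # residual r = y - f(x; theta)
R = [[(1.0 / sigma**2 if i == j else 0.0) for j in range(len(r))]
     for i in range(len(r))]

chi2 = chi2_stat(r, R)
dof = len(r) - 1                              # assumption: one fitted parameter
chi2_per_dof = chi2 / dof
r_bar = [ri / sigma for ri in r]              # standardized residuals
lag1 = acf(r, 1)
```

With these numbers chi2 is close to 4 and chi2 / dof is close to 1.33, while the strongly negative lag-1 autocorrelation shows a residual pattern that the summary statistic alone would not reveal.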
VI. SLI/SLO and Threshold Conventions
- Produce windowed indicators via sli_slo_compute(SLI:dict, window:str); common definitions:
pass_rate def= (#pass) / N; error_ratio def= (#ERROR or higher) / N_events; latency_p95.
- Quality thresholds
chi2 / dof < chi2_max, pass_rate ≥ target, drift_score(p,q,"KL") < drift_max.
- Diagnostic triggers
When |z| > 3.5 or |r_bar| > t0, invoke the outlier pipeline (Chapter 4); when drift_score exceeds threshold, trigger re-estimation of theta and RefCond.
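The indicator definitions above can be sketched for a single window as follows. The function names are hypothetical stand-ins for whatever sli_slo_compute does internally, the percentile uses the nearest-rank convention, and the "KL" drift score is taken to be the Kullback-Leibler divergence KL(p || q) between two discrete distributions on the same bins; all three choices are assumptions of this sketch.

```python
import math

LEVEL_RANK = {"DEBUG": 0, "INFO": 1, "WARN": 2, "ERROR": 3, "CRITICAL": 4}

def pass_rate(results):
    """pass_rate = (#pass) / N for a list of booleans."""
    return sum(1 for ok in results if ok) / len(results)

def error_ratio(levels):
    """error_ratio = (#ERROR or higher) / N_events."""
    bad = sum(1 for lv in levels if LEVEL_RANK[lv] >= LEVEL_RANK["ERROR"])
    return bad / len(levels)

def latency_p95(latencies_ms):
    """Nearest-rank 95th percentile of the latencies in the window."""
    ordered = sorted(latencies_ms)
    rank = (95 * len(ordered) + 99) // 100   # integer ceil(0.95 * n)
    return ordered[rank - 1]

def kl_drift(p, q):
    """KL(p || q); assumes q[i] > 0 wherever p[i] > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Invented window contents.
window_pass = pass_rate([True, True, False, True])
window_err = error_ratio(["INFO", "WARN", "ERROR", "CRITICAL"])
window_p95 = latency_p95(list(range(1, 101)))
drift = kl_drift([0.5, 0.5], [0.9, 0.1])
```

Each value would then be compared against its threshold (target, drift_max, and so on) to decide whether the window meets its SLO.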
VII. Workflow Mx-4 (Logging → Trace → Diagnostics)
- Collect & log: after each measurement or computation, call log_event and fill the minimal set in §III.
- Bind chain: assign the same trace_id to related events; use attach_traceability to embed the evidence chain into the draft report.
- Statistical diagnostics: compute r, chi2, r_bar and windowed SLIs; fold Chapter 5’s E_trunc_hat and E_round_hat into EB.
- Decision & routing: if §VI thresholds are met, tag level=INFO; otherwise escalate to WARN/ERROR and enter root-cause analysis.
- Root-cause tree: inspect in order—units → reference → numerics → model → data; produce traceback_summary and contribution ranking.
- Decision & closure: trigger retry, fallback, or parameter re-estimation per findings; finalize the report and persist the traceability_chain.
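The decision and root-cause steps of the workflow might be sketched as below. The text does not say how to choose between WARN and ERROR, so the rule used here (one violated threshold gives WARN, more than one gives ERROR) is purely an assumption, as are the function names, the dictionary keys, and the dof and threshold values.

```python
ROOT_CAUSE_ORDER = ("units", "reference", "numerics", "model", "data")

def violated_thresholds(metrics, thresholds):
    """Return the names of the Section VI quality thresholds that are not met."""
    checks = {
        "chi2_per_dof": metrics["chi2"] / metrics["dof"] < thresholds["chi2_max"],
        "pass_rate": metrics["pass_rate"] >= thresholds["target"],
        "drift_score": metrics["drift_score"] < thresholds["drift_max"],
    }
    return [name for name, ok in checks.items() if not ok]

def route_level(violations):
    """INFO when all thresholds hold; otherwise escalate (assumed rule)."""
    if not violations:
        return "INFO"
    return "WARN" if len(violations) == 1 else "ERROR"

def first_failing_stage(stage_ok):
    """Walk the root-cause tree in order and return the first failing stage."""
    for stage in ROOT_CAUSE_ORDER:
        if not stage_ok[stage]:
            return stage
    return None

# Invented metrics and thresholds.
metrics = {"chi2": 11.3, "dof": 6, "pass_rate": 0.97, "drift_score": 0.02}
thresholds = {"chi2_max": 1.5, "target": 0.95, "drift_max": 0.1}

violations = violated_thresholds(metrics, thresholds)
level = route_level(violations)
stage = first_failing_stage({"units": True, "reference": False,
                             "numerics": True, "model": False, "data": True})
```

Inspecting stages in a fixed order means a cheap cause such as a reference-condition mismatch is ruled out before a costlier model or data investigation begins.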
VIII. Arrival-Time T_arr Logging and Diagnostics Example
- Recording essentials
- measurand="T_arr", path_spec="gamma(ell): polyline with N=512", quadrature="simpson", h=1.25e-3 m, p_hat=3.9.
- value = ( ∑_k ( n_eff,k / c_ref ) * Δell_k ), unit="s", U = k * u_c(T_arr), k=2.
- Diagnostic priorities
- If chi2 / dof > 1.5 and r_bar shows segmentwise bias, first verify RefCond and that corr_env(•; RefCond) was applied;
- If p_hat < 3, refine h or enforce knotting at curvature changes;
- If E_round is elevated, enable compensated summation and segment-wise scaling of n_eff / c_ref.
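A minimal sketch of the discretized arrival time with compensated summation follows. It implements the plain segment sum value = ∑_k ( n_eff,k / c_ref ) * Δell_k with Kahan summation, not the simpson rule named in the recording essentials. The values of n_eff, c_ref, and u_c are invented for the demonstration, and math.fsum serves only as an independent reference.

```python
import math

def kahan_sum(values):
    """Compensated (Kahan) summation to limit accumulated round-off."""
    total = 0.0
    comp = 0.0                      # running compensation for lost low-order bits
    for v in values:
        y = v - comp
        t = total + y
        comp = (t - total) - y
        total = t
    return total

def arrival_time(n_eff, dell, c_ref):
    """T_arr = sum_k (n_eff_k / c_ref) * dell_k over the path segments."""
    if len(n_eff) != len(dell):
        raise ValueError("n_eff and dell must have the same length")
    return kahan_sum(n / c_ref * d for n, d in zip(n_eff, dell))

# Assumed inputs: 512 equal segments of h = 1.25e-3 m, constant n_eff.
N = 512
h = 1.25e-3                         # segment length in metres
n_eff = [1.0003] * N                # hypothetical effective index
c_ref = 299_792_458.0               # assumed reference speed in m/s
dell = [h] * N

T_arr = arrival_time(n_eff, dell, c_ref)
reference = math.fsum(n / c_ref * d for n, d in zip(n_eff, dell))

k = 2
u_c = 1.0e-12                       # hypothetical combined standard uncertainty in s
U = k * u_c                         # expanded uncertainty U = k * u_c(T_arr)
```

Comparing T_arr against the reference sum gives a quick estimate of whether round-off is a meaningful contributor to EB for a given path discretization.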
IX. Interface Mapping and Compliance
- log_event(code:str, level:str, context:dict) -> None
context must include at least: trace_id, measurand, unit, value, U, EB, RefCond, unit_policy, model_id, data_fingerprint, code_fingerprint.
- traceback_summary(ex:any) -> str
Output should include top_causes, evidence_refs, suggested_actions.
- attach_traceability(report:dict, chain:list[str]) -> dict
Merge traceability_chain = chain into report and return; each chain element must carry artifact_uri and hash.
X. Minimal Report Field Set (Directly Checkable)
- header: ts, trace_id, code, level, domain.
- metrology: measurand, unit, value, U, k, EB, RefCond, unit_policy.
- model: model_id, theta, chi2, r_summary, SLI.
- numerics: h, p_hat, quadrature|solver, E_trunc_hat, E_round_hat.
- path: gamma(ell) description, measure_dell.
- evidence: traceability_chain, data_fingerprint, code_fingerprint, seed.
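Because the field set is meant to be directly checkable, a completeness check can be sketched as follows. It assumes the report is a dictionary with one sub-dictionary per group, that the gamma(ell) description is stored under the key path_spec, and that quadrature|solver means either key satisfies the requirement; missing_fields is a hypothetical helper.

```python
MINIMAL_FIELDS = {
    "header": ["ts", "trace_id", "code", "level", "domain"],
    "metrology": ["measurand", "unit", "value", "U", "k", "EB",
                  "RefCond", "unit_policy"],
    "model": ["model_id", "theta", "chi2", "r_summary", "SLI"],
    "numerics": ["h", "p_hat", ("quadrature", "solver"),
                 "E_trunc_hat", "E_round_hat"],
    "path": ["path_spec", "measure_dell"],   # path_spec: assumed key name
    "evidence": ["traceability_chain", "data_fingerprint",
                 "code_fingerprint", "seed"],
}

def missing_fields(report: dict) -> list:
    """List 'group.field' entries absent from the report."""
    missing = []
    for group, fields in MINIMAL_FIELDS.items():
        section = report.get(group, {})
        for field in fields:
            # A tuple lists alternatives; any one of them is enough.
            alternatives = field if isinstance(field, tuple) else (field,)
            if not any(alt in section for alt in alternatives):
                missing.append(f"{group}.{'|'.join(alternatives)}")
    return missing

# Build a complete placeholder report, then remove two fields.
complete = {
    group: {(f[0] if isinstance(f, tuple) else f): 0 for f in fields}
    for group, fields in MINIMAL_FIELDS.items()
}
incomplete = {group: dict(section) for group, section in complete.items()}
del incomplete["numerics"]["quadrature"]
del incomplete["evidence"]["seed"]

gaps = missing_fields(incomplete)
```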
XI. Safety and Compliance (Minimum Requirements)
- Data minimization: hash personal or sensitive identifiers in context; keep only statistics.
- Integrity: every artifact_uri must carry a hash and a creation timestamp; sign digests for cross-system transfers.
- Retention: enforce log retention and access control via set_error_policy(domain, policy).
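The data-minimization and integrity requirements might be approached as in the sketch below. Keyed hashing with HMAC-SHA256 is used both to pseudonymize identifiers and as a stand-in for signing digests; a deployment that needs third-party verifiability would more likely use an asymmetric signature. The function names, the key handling, and the operator_id field are assumptions of this sketch.

```python
import hashlib
import hmac

def pseudonymize(context: dict, sensitive_keys, salt: bytes) -> dict:
    """Return a copy of context with sensitive identifiers replaced by keyed hashes."""
    out = dict(context)
    for key in sensitive_keys:
        if key in out:
            raw = str(out[key]).encode("utf-8")
            out[key] = hmac.new(salt, raw, hashlib.sha256).hexdigest()
    return out

def sign_digest(digest_hex: str, key: bytes) -> str:
    """Compute an HMAC-SHA256 tag over an artifact digest (stand-in for signing)."""
    return hmac.new(key, digest_hex.encode("ascii"), hashlib.sha256).hexdigest()

def verify_digest(digest_hex: str, tag: str, key: bytes) -> bool:
    """Check a tag in constant time."""
    return hmac.compare_digest(sign_digest(digest_hex, key), tag)

# Hypothetical secrets; real keys would come from a managed secret store.
salt = b"demo-salt"
key = b"demo-transfer-key"

safe = pseudonymize({"operator_id": "alice", "value": 3.214e-3},
                    ["operator_id"], salt)
digest = hashlib.sha256(b"report bytes").hexdigest()
tag = sign_digest(digest, key)
ok = verify_digest(digest, tag, key)
```

A keyed hash keeps identifiers linkable within one system, so events from the same source can still be grouped, while preventing trivial dictionary lookups by anyone without the salt.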
XII. Chapter Outputs and Linkage
- Outputs: event schema, evidence-chain specification, workflow Mx-4, thresholds and diagnostic conventions, interface mapping, and minimal report set.
- Next: Chapter 7 will use these traces to trigger retry, fallback, and graceful_degradation at runtime, closing the robust-recovery loop.
Copyright & License (CC BY 4.0)
Copyright: Unless otherwise noted, the copyright of “Energy Filament Theory” (text, charts, illustrations, symbols, and formulas) belongs to the author “Guanglin Tu”.
License: This work is licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0). You may copy, redistribute, excerpt, adapt, and share for commercial or non‑commercial purposes with proper attribution.
Suggested attribution: Author: “Guanglin Tu”; Work: “Energy Filament Theory”; Source: energyfilament.org; License: CC BY 4.0.
First published: 2025-11-11 | Current version: v5.1
License link: https://creativecommons.org/licenses/by/4.0/