04-EFT.WP.Core.Metrology v1.0
Chapter 6 — Data Processing and Reporting
I. Objectives and Scope
- Define a unified workflow that transforms raw records y_raw into publishable results y_pub, covering data processing, quality control, unit and environmental normalization, rounding and uncertainty presentation, conformity assessment, and traceability reporting.
- Align with Chapter 4 (measurement models), Chapter 5 (uncertainty evaluation), and implementation interfaces I40 2/3/5/6. All expressions must satisfy check_dim and the rule “combine uncertainty first, then round.”
II. Data Records and Minimal Metadata
- Minimal raw-record structure R_min:
id (unique), time_utc, measurand, value_raw, unit_raw, site, instrument, RefCond_name, scenario, operator.
- Validation fields:
hash (optional), sample_rate or Δt, window, filter_spec, notes, version_semver.
- Constraints:
dim( value_raw ) = dim( unit_raw ); measurand must belong to a registered register_measurement item; RefCond_name must resolve to { p_ref, Temp_ref, humidity_ref }.
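The constraints above could be checked at ingestion time roughly as follows. This is a minimal sketch, not the chapter's reference implementation: the registries REGISTERED_MEASURANDS, REFCONDS, and UNIT_DIM, the numeric reference values, and the function name validate_r_min are all hypothetical, and the dimension check is approximated by comparing the dimension of unit_raw with that of the measurand's registered unit.

```python
from __future__ import annotations

from typing import Any, Mapping

# Required R_min fields, as listed in this section.
R_MIN_FIELDS = (
    "id", "time_utc", "measurand", "value_raw", "unit_raw",
    "site", "instrument", "RefCond_name", "scenario", "operator",
)

# Hypothetical registries, for illustration only (values are placeholders).
REGISTERED_MEASURANDS = {"T_arr": "s"}            # measurand -> registered unit
UNIT_DIM = {"s": "[T]", "ms": "[T]", "m": "[L]"}  # unit -> dimension label
REFCONDS = {
    "StdAir": {"p_ref": 101325.0, "Temp_ref": 293.15, "humidity_ref": 0.5},
}
REFCOND_KEYS = {"p_ref", "Temp_ref", "humidity_ref"}


def validate_r_min(record: Mapping[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the record passes."""
    problems = [f"missing field: {name}" for name in R_MIN_FIELDS if name not in record]
    if problems:
        return problems

    measurand = record["measurand"]
    if measurand not in REGISTERED_MEASURANDS:
        problems.append(f"unregistered measurand: {measurand}")
    else:
        # Assumed reading of the dimension constraint: the raw unit must carry
        # the same dimension as the measurand's registered unit.
        expected = UNIT_DIM[REGISTERED_MEASURANDS[measurand]]
        actual = UNIT_DIM.get(record["unit_raw"])
        if actual != expected:
            problems.append(f"dimension mismatch: {actual} != {expected}")

    refcond = REFCONDS.get(record["RefCond_name"])
    if refcond is None or not REFCOND_KEYS <= refcond.keys():
        problems.append(f"unresolved RefCond_name: {record['RefCond_name']}")
    return problems
```

Returning a list of problems rather than raising on the first one keeps every defect visible in a single validation pass, which suits the "flag rather than silently drop" stance taken later in the chapter.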
III. Processing Operators and Composition (S96-1)
- Define the processing map P(•):
- y_std = convert( y_raw, U_in, U_target )
- y_env = corr_env( y_std; RefCond )
- y_filt = F( y_env; filter_spec )
- y_hat = A( y_filt; window ) (aggregation or estimation)
- y_pub = re_dim( T_map( y_hat ), L0, t0, T0 ) (apply nondimensionalization and back-conversion if required)
- Dimensional conservation: dim( y_pub ) = dim( measurand ), and check_dim( y_pub - model(inputs) ) = "[1]" identically.
- Typical F and A:
- F = median_t[•; Δt], F = lowpass(•; f_c), F = Hampel(•; k, n);
- A = avg_t[•; Δt], A = quantile_t[•; q, Δt], A = peak_hold[•; Δt].
- Interface mapping: convert (I40 2), corr_env (I40 3), nondim/re_dim (I40 7).
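The composition convert -> corr_env -> F -> A can be sketched as plain function chaining. The code below is an illustration under stated assumptions, not the I40 interfaces themselves: the unit-factor table TO_SECONDS, the multiplicative placeholder inside corr_env, the edge handling of the median filter, and the wrapper name process are all hypothetical, and the optional T_map / re_dim step is omitted.

```python
from statistics import fmean, median

# Hypothetical factors to seconds; a real convert (I40 2) would use a unit registry.
TO_SECONDS = {"s": 1.0, "ms": 1e-3, "us": 1e-6}


def convert(y_raw, unit_in, unit_target="s"):
    """y_std = convert(y_raw, U_in, U_target) for time-like units."""
    factor = TO_SECONDS[unit_in] / TO_SECONDS[unit_target]
    return [v * factor for v in y_raw]


def corr_env(y_std, scale=1.0):
    """Placeholder environmental correction: a single multiplicative factor."""
    return [v * scale for v in y_std]


def median_filter(y, n=3):
    """F = running median over n samples; windows are truncated at the edges."""
    half = n // 2
    return [median(y[max(0, i - half): i + half + 1]) for i in range(len(y))]


def process(y_raw, unit_in):
    """Chain the operators in the order stated in this section."""
    y_std = convert(y_raw, unit_in)   # units first
    y_env = corr_env(y_std)           # then environmental correction
    y_filt = median_filter(y_env)     # then filtering F
    return fmean(y_filt)              # then aggregation A = avg_t
```

For example, process([1000, 1000, 5000, 1000, 1000], "ms") suppresses the single spike in the filter stage and aggregates to approximately 1.0 s.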
IV. Missing Data and Outlier Handling
- Missing values: mark with NaN or a mask; any imputation must be recorded in notes and version_semver.
- Flag rather than silently drop:
MAD = median( |x - median(x)| ), threshold τ = 3.5; if |x - median(x)| > 1.4826 * τ * MAD, flag as outlier.
- Imputation policy (visualization or non-critical statistics only):
Time-series linear interpolation interp_t; if block-missing ratio > 5%, do not use for conformity decisions.
- Statistical-window declaration: any avg_t[•; Δt] or median_t[•; Δt] must explicitly state Δt.
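The MAD rule above can be written as a short flagging function. This is a sketch: the name flag_outliers_mad and the convention that NaN marks a missing value (excluded from the statistics and never flagged as an outlier) are assumptions made for illustration.

```python
import math
from statistics import median


def flag_outliers_mad(x, tau=3.5):
    """Return one boolean per sample: True where the sample is flagged.

    Implements |x - median(x)| > 1.4826 * tau * MAD. Samples are flagged,
    not removed, so the caller decides how to treat them downstream.
    """
    valid = [v for v in x if not math.isnan(v)]   # NaN = missing (assumed)
    med = median(valid)
    mad = median(abs(v - med) for v in valid)
    limit = 1.4826 * tau * mad
    return [(not math.isnan(v)) and abs(v - med) > limit for v in x]
```

Note that when MAD evaluates to 0 (more than half of the samples identical), the limit is 0 and every sample that differs from the median is flagged; a production rule may want an explicit floor for that case.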
V. Units, Reference Conditions, and Nondimensionalization
- Unit convention: always convert to U_target before environmental correction and statistics; aggregation across mixed units is forbidden.
- Environmental correction: y_env = corr_env( y_std; RefCond ) and log the chosen model and args.
- Nondimensionalization: bar_y = nondim( y_env, L0, t0, T0 ); if arrival-time conventions are involved, call enforce_arrival_time_convention(); re-dimensionalize via re_dim.
VI. Uncertainty Companions and On-the-Fly Combination
- Companion storage: for each y_pub, store { u(y_pub), k, U, nu_eff } and provenance { typeA, typeB, Cov }.
- Linear propagation: u_c(y_pub) = combine_uncertainty( J, u_inputs, Cov ); use MC for nonlinearity (see Chapter 5).
- Variance reduction for multi-point aggregation: if the window contains n i.i.d. samples with sample standard deviation s, u( avg_t ) = s / sqrt(n); for correlated samples, replace n with the effective sample size n_eff.
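A minimal sketch of the two calculations in this section follows. It is illustrative only: here the third argument of combine_uncertainty is taken to be a correlation-coefficient matrix (named corr), which is an assumption, since the chapter's Cov may instead denote a covariance matrix; u_of_mean is a hypothetical helper name, and n_eff is supplied by the caller rather than estimated.

```python
import math
from statistics import stdev


def combine_uncertainty(J, u_inputs, corr=None):
    """Linear propagation: u_c^2 = sum_ij J_i J_j u_i u_j r_ij.

    J        : sensitivity coefficients (one per input)
    u_inputs : standard uncertainties of the inputs
    corr     : correlation matrix r_ij (assumed meaning); None = uncorrelated
    """
    n = len(J)
    if len(u_inputs) != n:
        raise ValueError("J and u_inputs must have the same length")
    if corr is None:
        corr = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    variance = sum(
        J[i] * J[j] * u_inputs[i] * u_inputs[j] * corr[i][j]
        for i in range(n)
        for j in range(n)
    )
    return math.sqrt(variance)


def u_of_mean(samples, n_eff=None):
    """Standard uncertainty of a window mean: s / sqrt(n), or s / sqrt(n_eff)."""
    s = stdev(samples)  # sample standard deviation (n - 1 denominator)
    n = len(samples) if n_eff is None else n_eff
    return s / math.sqrt(n)
```

With two unit-sensitivity inputs of uncertainty 3 and 4, the uncorrelated result is 5, while full positive correlation gives 7, which shows why the correlation information has to be stored with the companions.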
VII. Rounding and Significant Digits (I40 6 Binding)
- Principle: compute U = k * u_c before rounding; ensure dim( U ) = dim( y_pub ).
- Significant digits for U:
If leading_digit( U ) ∈ {1,2}, keep two significant digits; otherwise keep one.
- Alignment rule: let dec = decimals( U ) be the rounding precision of U, then
- y_rep = round_by_unc( y_pub, U ) -> ( value_rounded, dec )
- Report value_rounded ± U with both rounded to dec.
- Example rule: U = 0.013 (leading 1) → U = 0.013 (two significant digits), and value aligned to the 0.001 place.
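The significant-digit and alignment rules can be sketched with Python's decimal module. Several choices here are assumptions rather than part of the I40 6 binding: the helper round_unc is hypothetical, rounding uses round-half-even (some practices always round U upward instead), and inputs are parsed from their shortest decimal string form.

```python
from decimal import ROUND_HALF_EVEN, Decimal


def round_unc(U):
    """Round U to two significant digits if its leading digit is 1 or 2, else one.

    Returns (U_rounded, dec), where dec is the number of decimal places kept.
    """
    d = Decimal(str(U))
    if d <= 0:
        raise ValueError("U must be positive")
    leading = d.as_tuple().digits[0]   # most significant digit of U
    exponent = d.adjusted()            # power of ten of that digit (0.013 -> -2)
    sig = 2 if leading in (1, 2) else 1
    dec = (sig - 1) - exponent
    quantum = Decimal(1).scaleb(-dec)  # e.g. dec = 3 -> 0.001
    return d.quantize(quantum, rounding=ROUND_HALF_EVEN), dec


def round_by_unc(y_pub, U):
    """Align the value to the rounding precision of U: -> (value_rounded, dec)."""
    _, dec = round_unc(U)
    quantum = Decimal(1).scaleb(-dec)
    value = Decimal(str(y_pub)).quantize(quantum, rounding=ROUND_HALF_EVEN)
    return value, dec
```

Under these assumptions, round_unc(0.013) keeps 0.013 with dec = 3, and round_by_unc(1.234567, 0.013) aligns the value to 1.235; an uncertainty of 0.0462 (leading digit 4) is reduced to one significant digit, 0.05.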
VIII. Conformity Decision and Guard Bands
- Upper-tolerance “shared-risk” policy:
Guard band g = u_c; decision:
- pass if result + g ≤ tol;
- fail if result - g > tol;
- inconclusive otherwise.
- Interface: guard_band(result, U, tol, rule="shared-risk"); for two-sided tolerances, apply to upper and lower limits separately and combine statuses.
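The three-way decision can be sketched as below. Two points are assumptions and not stated by the interface: the keyword k (used to recover g = u_c = U / k from the expanded uncertainty that the interface receives) and the way two-sided statuses are combined (any fail gives fail, otherwise any inconclusive gives inconclusive). The helper guard_band_two_sided is hypothetical.

```python
def guard_band(result, U, tol, rule="shared-risk", k=2.0):
    """Upper-tolerance decision with guard band g = u_c = U / k (k is assumed)."""
    if rule != "shared-risk":
        raise ValueError(f"unsupported rule: {rule}")
    g = U / k
    if result + g <= tol:
        return "pass"
    if result - g > tol:
        return "fail"
    return "inconclusive"


def guard_band_two_sided(result, U, tol_low, tol_high, k=2.0):
    """Apply the rule to each limit separately, then combine the statuses."""
    upper = guard_band(result, U, tol_high, k=k)
    # Mirror the lower limit into an upper-limit problem: -result <= -tol_low.
    lower = guard_band(-result, U, -tol_low, k=k)
    statuses = (upper, lower)
    if "fail" in statuses:
        return "fail"            # assumed combination rule
    if "inconclusive" in statuses:
        return "inconclusive"
    return "pass"
```

With U = 1.0 and k = 2 (g = 0.5) against tol = 10.0, a result of 9.0 passes, 9.8 is inconclusive, and 10.6 fails.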
IX. Minimal Reporting Schema and Fields
- Minimal report ReportV1:
- measurand, value, unit, u_c, k, U, nu_eff, CI_{1-α} (optional)
- RefCond_name, model, inputs (hash or digest), trace (traceability_chain)
- window, filter_spec, n_samples, missing_ratio, outlier_ratio
- decision_rule, tol, decision, guard_band
- scenario, instrument, calib_cert, version_semver, timestamp.
- Export interfaces: export_units, export_refcond, compare_reports (I40 8).
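One way to hold the ReportV1 fields is an immutable record with a JSON export. The sketch below is only a suggestion: the field types, the choice of a frozen dataclass, the CI attribute name standing in for CI_{1-α}, and the to_json method are assumptions; the field names follow the list above.

```python
from __future__ import annotations

import json
from dataclasses import asdict, dataclass
from typing import Optional, Tuple

ALLOWED_DECISIONS = {"pass", "fail", "inconclusive"}


@dataclass(frozen=True)
class ReportV1:
    # Result and uncertainty companions
    measurand: str
    value: float
    unit: str
    u_c: float
    k: float
    U: float
    nu_eff: float
    # Conditions, model, and traceability
    RefCond_name: str
    model: str
    inputs: str                      # hash or digest of the inputs
    trace: Tuple[str, ...]           # traceability_chain
    # Processing declaration
    window: str
    filter_spec: str
    n_samples: int
    missing_ratio: float
    outlier_ratio: float
    # Conformity
    decision_rule: str
    tol: float
    decision: str
    guard_band: float
    # Context
    scenario: str
    instrument: str
    calib_cert: str
    version_semver: str
    timestamp: str
    CI: Optional[Tuple[float, float]] = None  # CI_{1-α}, optional

    def __post_init__(self):
        if self.decision not in ALLOWED_DECISIONS:
            raise ValueError(f"invalid decision: {self.decision}")

    def to_json(self) -> str:
        """Serialize with sorted keys so that exports are stable across runs."""
        return json.dumps(asdict(self), sort_keys=True)
```

Freezing the record matches the audit requirement that persisted reports are not edited in place; a corrected report would be issued as a new instance with a new version_semver.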
X. Processing and Reporting Example for Arrival Time T_arr
- Model (general form reiterated): T_arr = ( ∫_gamma ( n_eff / c_ref ) d ell ); declare the path gamma(ell) and measure d ell.
- Normalization and aggregation:
- L_gamma = ( ∫_gamma 1 d ell ), n_eff_avg = ( 1 / L_gamma ) * ( ∫_gamma n_eff d ell );
- T_arr = ( n_eff_avg * L_gamma ) / c_ref, with U composed per Chapter 5.
- Report snippet:
- measurand = "T_arr", unit = "s", value = value_rounded, U = U_rounded;
- RefCond_name = "StdAir", model = "T_arr_general", trace = ["c_ref(certX)", "n_eff(sensorY)", "path(gamma)"];
- window = "avg_t; Δt=10 s", filter_spec = "Hampel(k=3, n=7)";
- decision_rule = "shared-risk", tol = tol_value, decision = pass|fail|inconclusive.
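The normalization step of this example can be checked numerically. The sketch below assumes the path is parametrized by arc length ell and sampled at discrete points, uses the trapezoidal rule for the path integral, and takes c_ref as the vacuum speed of light purely for illustration; the helper names and the sample n_eff values are hypothetical.

```python
C_REF = 299_792_458.0  # m/s; assumed value of c_ref for this illustration


def trapz(y, x):
    """Trapezoidal approximation of the integral of y over x."""
    return sum(0.5 * (y[i] + y[i + 1]) * (x[i + 1] - x[i]) for i in range(len(x) - 1))


def arrival_time(ell, n_eff, c_ref=C_REF):
    """Return (T_arr, n_eff_avg, L_gamma) for samples n_eff(ell) along the path.

    L_gamma   = integral of 1 d ell
    n_eff_avg = (1 / L_gamma) * integral of n_eff d ell
    T_arr     = n_eff_avg * L_gamma / c_ref
    """
    L_gamma = trapz([1.0] * len(ell), ell)
    n_eff_avg = trapz(n_eff, ell) / L_gamma
    T_arr = n_eff_avg * L_gamma / c_ref
    return T_arr, n_eff_avg, L_gamma
```

For a 100 m path with n_eff rising linearly from 1.0000 to 1.0004, the path average is 1.0002 and T_arr equals 100.02 m divided by c_ref, the same value obtained by integrating n_eff / c_ref directly.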
XI. Tables and Visualization Conventions (Cross-Volume Guidance)
- Column order: value, U, unit, u_c, k, nu_eff, CI_{1-α}, RefCond_name, window, filter_spec, decision.
- Visualization (see figures volume): plot value with U error bars, annotate window and n_samples; do not co-plot different units on the same axis.
XII. Audit, Versioning, and Reproducibility
- Audit essentials:
version_semver, hashes of model and filter_spec; immutable references to inputs and trace; compare_reports differentials across ["mean","U","pass_rate"].
- Reproducibility checklist:
Persist RefCond, U_target, window, filter_spec; record seed (if MC used); for nondim/re_dim, record { L0, t0, T0 }.
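The hashes of model and filter_spec mentioned above could be produced from a canonical serialization. This is a sketch under assumptions: SHA-256 over key-sorted compact JSON is one reasonable choice, not a scheme prescribed by this chapter, and the function name digest is hypothetical.

```python
import hashlib
import json


def digest(obj) -> str:
    """SHA-256 hex digest of a canonical JSON form (sorted keys, compact separators)."""
    canonical = json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Because keys are sorted before hashing, digest({"k": 3, "n": 7}) and digest({"n": 7, "k": 3}) agree, so a filter_spec stored with a different key order still audits as identical, while any change to a value yields a different digest.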
XIII. Data Processing and Reporting Workflow (Mx-4)
- Receive and validate R_min (units, dimensions, metadata, missing/outlier flags).
- Execute convert -> corr_env -> F -> A -> T_map -> re_dim to produce y_pub.
- Compute u_c and U, evaluate nu_eff and CI_{1-α}; complete rounding via round_by_unc.
- Apply guard_band and tolerances; generate the decision.
- Assemble ReportV1 and persist; export required artifacts and the traceability chain; run compare_reports for regression when needed.
XIV. Interface Anchors to Other Volumes
- With Core.Parameters: statistical window avg_t[•; Δt] and check_dim are consistent; parameter names and transforms T_map follow the conventions.
- With Core.Equations: when a report references gamma(ell), d ell, or n_eff(x,t), adhere to the corresponding Sxx-* path and measure definitions.
- With Core.Metrology Chapters 4–5: the register_measurement model must compose with this chapter’s Mx-4 pipeline; uncertainty budgets embed directly into the reporting chain.
Copyright & License (CC BY 4.0)
Copyright: Unless otherwise noted, the copyright of “Energy Filament Theory” (text, charts, illustrations, symbols, and formulas) belongs to the author “Guanglin Tu”.
License: This work is licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0). You may copy, redistribute, excerpt, adapt, and share for commercial or non‑commercial purposes with proper attribution.
Suggested attribution: Author: “Guanglin Tu”; Work: “Energy Filament Theory”; Source: energyfilament.org; License: CC BY 4.0.
First published: 2025-11-11 | Current version: v5.1
License link: https://creativecommons.org/licenses/by/4.0/