18-EFT.WP.Methods.CrossStats v1.0 | Chapter 4 — Estimation and Intervals (Frequentist/Bayesian Harmonization)

Chapter 4 — Estimation and Intervals (Frequentist/Bayesian Harmonization)

One-Line Objective

Harmonize frequentist and Bayesian conventions for point estimates, intervals, and uncertainty; clarify release rules under weights, units, and a unified timebase.

I. Scope and Objects

Scope
- Applies to point estimation, interval estimation, and posterior summaries for weighted samples, complex designs, and online streams.
- Covers uncertainty propagation for means/proportions/rates, regression (GLM), ratios, and functional metrics g( theta ).
Objects
- Inputs: data D = { (y_i, x_i, w_i, t_i) }, sampling info pi(i) or replicate weights, window Delta_t, arrival-time fields T_arr.
- Outputs: hat{theta}, SE(hat{theta}), CI_{1-alpha} or posterior intervals, U = k * u_c, and manifest.stats.estim.*.
- Constraints: consistent units and dimensions; sum(w_i)/N_hat ≈ 1; compute on tau_mono, publish on ts.

II. Terms and Variables

Basics
- theta (parameter vector), hat{theta} (estimator), SE (standard error), V (covariance), CI_{1-alpha} (interval).
- Weighted mean: hat{mu}_w = ( ∑ w_i y_i ) / ( ∑ w_i ).
- Ratio: R = ( ∑ w_i a_i ) / ( ∑ w_i b_i ).
GLM and robust variance
- Score equations: U( theta ) = ∑ x_i * ( y_i - mu_i( theta ) ) / v_i( theta ) = 0.
- Sandwich variance: V_hat = ( A^{-1} ) * B * ( A^{-1} )^T, with A = - ∂U/∂theta and B = ∑ u_i u_i^T, both evaluated at hat{theta}; u_i is observation i's contribution to the score U( theta ).
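The sandwich formula can be illustrated on the simplest special case. The sketch below assumes a one-parameter Gaussian model through the origin with identity link and v_i = 1, so that the score contribution is u_i = x_i * ( y_i - x_i * theta ). The function name and the toy data are hypothetical and are not part of the I30-* bindings.

```python
import math

def sandwich_se_through_origin(x, y):
    """Robust (sandwich) SE for the slope of y = theta * x, assuming v_i = 1."""
    if len(x) != len(y) or not x:
        raise ValueError("x and y must be non-empty and of equal length")
    a = sum(xi * xi for xi in x)  # A = -dU/dtheta = sum of x_i^2
    if a == 0:
        raise ValueError("x must contain at least one non-zero value")
    theta = sum(xi * yi for xi, yi in zip(x, y)) / a  # root of U(theta) = 0
    # B = sum of u_i^2 with u_i = x_i * (y_i - x_i * theta)
    b = sum((xi * (yi - xi * theta)) ** 2 for xi, yi in zip(x, y))
    v_hat = b / (a * a)  # A^{-1} B A^{-1} in the scalar case
    return theta, math.sqrt(v_hat)

# Toy data (illustrative only)
theta_hat, se_robust = sandwich_se_through_origin([1, 1, 1, 1], [0, 2, 0, 2])
```

In the vector case A and B become matrices and the same product A^{-1} B A^{-T} applies; a linear-algebra library would normally be used there.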
Bayesian elements
p(theta), L(theta; D), posterior p(theta | D) ∝ L * p(theta), posterior predictive p(y_new | D) = ( ∫ p(y_new | theta) p(theta | D) d theta ).
Metrology and units
unit(hat{theta}) = unit(theta), dim(hat{theta}) = dim(theta); run check_dim( y - f(x) ) prior to release.
Time and arrival time
Statistical window: window( t; Delta_t, tau_mono ); record both T_arr conventions and delta_form in parallel.

III. Axioms P304-*

P304-1 (Explicit weights): any weighted analysis must document the generation and versioning of w_i, with sum(w_i)/N_hat ≈ 1.
P304-2 (Dimensional conservation): unit(expr) and dim(expr) must be consistent and traceable; implicit unit conversion is prohibited.
P304-3 (Robust variance by default): under heteroskedasticity/correlation, prefer robust or replicate-weight variance.
P304-4 (Verifiable coverage): intervals must state the target coverage 1 - alpha and provide empirical coverage approximation or PPC results.
P304-5 (Unified timebase): compute on tau_mono, publish on ts with offset/skew/J.
P304-6 (Parallel T_arr conventions): when estimates depend on T_arr, record both conventions and delta_form, with asserted thresholds.
P304-7 (Numerical stability): optimization/sampling must report convergence criteria, mcse, or iterative residuals; results that fail their thresholds must not be released at the “strong” caliber.

IV. Minimal Equations S304-*

S304-1 (Weighted mean and variance)
- hat{mu}_w = ( ∑ w_i y_i ) / ( ∑ w_i ).
- Linearization variance (SRS approximation; use replication for complex designs):
  Var( hat{mu}_w ) ≈ ( ∑ w_i^2 ( y_i - hat{mu}_w )^2 ) / ( ( ∑ w_i )^2 ).
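A minimal sketch of S304-1 in plain Python follows. It implements the weighted mean and the linearization variance exactly as written above; the function name and the sample values are illustrative assumptions, and replicate-weight variance for complex designs is not covered.

```python
import math

def weighted_mean_se(y, w):
    """Weighted mean and linearization standard error, following S304-1."""
    if len(y) != len(w) or not y:
        raise ValueError("y and w must be non-empty and of equal length")
    sw = sum(w)
    if sw <= 0:
        raise ValueError("weights must sum to a positive value")
    mu = sum(wi * yi for wi, yi in zip(w, y)) / sw
    # Var ≈ sum( w_i^2 * (y_i - mu)^2 ) / ( sum w_i )^2
    var = sum((wi ** 2) * (yi - mu) ** 2 for wi, yi in zip(w, y)) / sw ** 2
    return mu, math.sqrt(var)

# Toy values (illustrative only)
mu, se = weighted_mean_se([1.0, 2.0, 3.0, 4.0], [1.0, 1.0, 2.0, 2.0])
```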
S304-2 (Proportion/rate intervals)
- Wilson proportion interval (y successes in n trials, p_hat = y / n, z = z_{1-alpha/2}):
  p_w = ( y + z^2 / 2 ) / ( n + z^2 ),
  half = ( z / ( n + z^2 ) ) * sqrt( n * p_hat * ( 1 - p_hat ) + z^2 / 4 ),
  CI = [ p_w - half , p_w + half ].
- Poisson rate (exposure E): lambda_hat = ( k / E ), normal-approx interval lambda_hat ± z * sqrt( k ) / E (use exact or Byar for small samples).
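The two interval conventions of S304-2 can be sketched as follows. The Wilson function uses the standard score-interval half-width z / ( n + z^2 ) * sqrt( n * p_hat * ( 1 - p_hat ) + z^2 / 4 ); the Poisson function uses the normal approximation only. Function names are hypothetical, and clamping the lower rate bound at zero is an added guard rather than part of the formula.

```python
import math
from statistics import NormalDist

def wilson_interval(y, n, alpha=0.05):
    """Wilson score interval for y successes in n trials."""
    if n <= 0 or not 0 <= y <= n:
        raise ValueError("require n > 0 and 0 <= y <= n")
    z = NormalDist().inv_cdf(1 - alpha / 2)
    p_hat = y / n
    center = (y + z * z / 2) / (n + z * z)
    half = z / (n + z * z) * math.sqrt(n * p_hat * (1 - p_hat) + z * z / 4)
    return center - half, center + half

def poisson_rate_interval(k, exposure, alpha=0.05):
    """Normal-approximation interval for a rate: k events over a given exposure."""
    if exposure <= 0 or k < 0:
        raise ValueError("require exposure > 0 and k >= 0")
    z = NormalDist().inv_cdf(1 - alpha / 2)
    lam = k / exposure
    half = z * math.sqrt(k) / exposure
    return lam, max(0.0, lam - half), lam + half  # lower bound clamped at 0

# Toy inputs (illustrative only)
w_lo, w_hi = wilson_interval(50, 100)
lam_hat, r_lo, r_hi = poisson_rate_interval(100, 50.0)
```

For small k the normal approximation is poor, which is why the text recommends exact or Byar intervals in that regime.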
S304-3 (Delta method)
- Scalar: Var( g( hat{theta} ) ) ≈ ( g'( hat{theta} ) )^2 Var( hat{theta} ), with g' evaluated at hat{theta};
- Vector: Var( g( hat{theta} ) ) ≈ G V G^T, with G = ∂g/∂theta |_{hat{theta}}.
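The vector form G V G^T can be sketched for a scalar-valued g with a numerically approximated gradient. The central-difference step and the function name are assumptions made for illustration; an implementation might instead use automatic or symbolic differentiation.

```python
import math

def delta_variance(g, theta, V, h=1e-6):
    """Delta-method variance of scalar g(theta) using a central-difference gradient."""
    p = len(theta)
    G = []
    for j in range(p):
        step = h * max(1.0, abs(theta[j]))
        up = list(theta)
        dn = list(theta)
        up[j] += step
        dn[j] -= step
        G.append((g(up) - g(dn)) / (2 * step))
    # G V G^T written out as a double sum
    var = sum(G[i] * V[i][j] * G[j] for i in range(p) for j in range(p))
    return g(list(theta)), var

# Toy example: g(theta) = log(theta_0), Var(theta_0) = 0.04 (illustrative only)
est_log, var_log = delta_variance(lambda t: math.log(t[0]), [2.0], [[0.04]])
```

Here g'(theta) = 1 / theta, so the expected variance is ( 1 / 2 )^2 * 0.04 = 0.01, which the numerical gradient reproduces up to discretization error.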
S304-4 (Ratio estimator via Delta)
With R = A / B,
Var( R ) ≈ ( 1 / B^2 ) Var( A ) + ( A^2 / B^4 ) Var( B ) - ( 2 A / B^3 ) Cov( A, B ).
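The closed-form ratio variance above translates directly into code. This is a minimal sketch; the function name and the input values are hypothetical, and A, B, Var(A), Var(B), Cov(A, B) are assumed to come from an upstream (for example replicate-weight) estimate.

```python
import math

def ratio_delta(a, b, var_a, var_b, cov_ab):
    """Ratio R = A / B with its delta-method standard error (S304-4)."""
    if b == 0:
        raise ValueError("denominator B must be non-zero")
    r = a / b
    var_r = var_a / b ** 2 + (a ** 2) * var_b / b ** 4 - 2 * a * cov_ab / b ** 3
    # Guard against tiny negative values caused by rounding
    return r, math.sqrt(max(var_r, 0.0))

# Toy inputs (illustrative only): variance terms are 0.16 + 0.16 - 0.08 = 0.24
r_hat, se_r = ratio_delta(10.0, 5.0, 4.0, 1.0, 0.5)
```

A positive covariance between numerator and denominator reduces the variance of the ratio, which is why dropping the Cov( A, B ) term usually overstates the uncertainty.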
S304-5 (GLM normal-approx intervals)
CI_{1-alpha}( theta_j ) = hat{theta}_j ± z_{1-alpha/2} * SE( hat{theta}_j ); use t_{df} for small samples.
S304-6 (Bootstrap intervals)
Percentile: CI = [ q_{alpha/2}( theta^* ), q_{1-alpha/2}( theta^* ) ], where theta^* denotes the bootstrap replicates of hat{theta}; BCa is the default robust option.
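A percentile bootstrap can be sketched with the standard library alone. The sketch assumes unweighted i.i.d. resampling and a simple order-statistic rule for the quantiles; it does not implement BCa, weights, or stratification, and the function name is hypothetical.

```python
import math
import random
from statistics import mean

def percentile_bootstrap_ci(data, stat, B=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for stat(data) under i.i.d. resampling."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    n = len(data)
    reps = sorted(
        stat([data[rng.randrange(n)] for _ in range(n)]) for _ in range(B)
    )
    lo = reps[math.floor((alpha / 2) * (B - 1))]
    hi = reps[math.ceil((1 - alpha / 2) * (B - 1))]
    return lo, hi

# Toy data (illustrative only)
sample = list(range(1, 21))
boot_lo, boot_hi = percentile_bootstrap_ci(sample, mean)
```

Recording the seed alongside B and the resampling scheme keeps the interval reproducible on playback.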
S304-7 (Bayesian intervals and coverage factor)
- Central (equal-tailed): CI = [ q_{alpha/2}( p(theta|D) ), q_{1-alpha/2}( p(theta|D) ) ]; HPD: the shortest region containing posterior mass 1 - alpha.
- Metrology mapping: align U = k * u_c with frequentist intervals; under normal approximation, k ≈ z_{1-alpha/2}.
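When the posterior is available as draws, both interval types and the coverage factor can be computed as sketched below. The HPD routine assumes a unimodal posterior and returns the shortest window covering a fraction 1 - alpha of the sorted draws; all function names are hypothetical, and the exponential draws merely stand in for a skewed posterior.

```python
import math
import random
from statistics import NormalDist

def central_interval(draws, alpha=0.05):
    """Equal-tailed interval from posterior draws."""
    s = sorted(draws)
    n = len(s)
    return s[math.floor((alpha / 2) * (n - 1))], s[math.ceil((1 - alpha / 2) * (n - 1))]

def hpd_interval(draws, alpha=0.05):
    """Shortest interval containing a fraction 1 - alpha of the draws (unimodal case)."""
    s = sorted(draws)
    n = len(s)
    m = max(1, math.ceil((1 - alpha) * n))  # number of draws inside the window
    start = min(range(n - m + 1), key=lambda i: s[i + m - 1] - s[i])
    return s[start], s[start + m - 1]

def coverage_factor(alpha=0.05):
    """k for U = k * u_c under the normal approximation: k = z_{1-alpha/2}."""
    return NormalDist().inv_cdf(1 - alpha / 2)

# Stand-in for a skewed posterior (illustrative only)
rng = random.Random(0)
draws = [rng.expovariate(1.0) for _ in range(5000)]
central = central_interval(draws)
hpd = hpd_interval(draws)
k = coverage_factor(0.05)
```

For a skewed posterior the HPD interval is shorter than the central one and shifts toward the mode, so the manifest should state which convention was used.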
S304-8 (Posterior predictive checks)
ppc = P( T( y_rep ) ≥ T( y_obs ) | D ); publish the chosen statistic T(•) and the ppc value.
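The ppc value can be estimated by simulating one replicated data set per posterior draw and counting how often T( y_rep ) ≥ T( y_obs ). The sketch below assumes a toy model, y ~ Normal( mu, 1 ) with a flat prior so that mu | D ~ Normal( ybar, 1/n ); the model, the data, and the function names are illustrative assumptions only.

```python
import random
from statistics import mean

def ppc_pvalue(y_obs, posterior_draws, simulate, T, seed=0):
    """Monte Carlo estimate of P( T(y_rep) >= T(y_obs) | D )."""
    rng = random.Random(seed)
    t_obs = T(y_obs)
    n = len(y_obs)
    hits = sum(
        1 for theta in posterior_draws if T(simulate(theta, n, rng)) >= t_obs
    )
    return hits / len(posterior_draws)

def simulate_normal(mu, n, rng):
    """Replicated data set under the assumed model y ~ Normal(mu, 1)."""
    return [rng.gauss(mu, 1.0) for _ in range(n)]

# Toy data with one gross outlier (illustrative only)
y_obs = [0.1, -0.3, 0.2, 0.0, -0.1, 0.4, -0.2, 0.3, 0.1, 8.0]
draw_rng = random.Random(1)
n_obs = len(y_obs)
mu_draws = [draw_rng.gauss(mean(y_obs), (1 / n_obs) ** 0.5) for _ in range(2000)]
ppc_max = ppc_pvalue(y_obs, mu_draws, simulate_normal, max)
```

With T = max, the outlier makes the observed statistic far larger than anything the model replicates, so the ppc value sits near zero and flags the misfit.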
S304-9 (Two T_arr conventions discrepancy)
delta_form = | ( 1 / c_ref ) * ( ∫ n_eff d ell ) - ( ∫ ( n_eff / c_ref ) d ell ) |, with delta_form ≤ tol_Tarr asserted.
S304-10 (Placeholder for multiple comparisons)
Family-wise error control is executed in Chapter 6; here, interval widths are computed under the given alpha_budget, with alpha_used ≤ alpha_budget.

V. Statistical Process M30-4 (Ready → Estimate → Intervals → Diagnostics → Release)

Readiness
Load w_i/replicate weights, design info, and Delta_t; run check_dim and evaluate effective sample size n_eff.
Estimation
Compute hat{theta} (weighted/GLM/ratio); choose variance method (robust/replicate/bootstrap).
Intervals
Construct CI_{1-alpha}: prefer robust or replicate variance; use appropriate conventions for proportions/rates; in Bayesian mode, report posterior intervals and the U = k * u_c mapping.
Diagnostics
Coverage or PPC, residuals, and bias-variance analysis; record mcse or optimization convergence; if T_arr is involved, compute both conventions and delta_form.
Release
Write manifest.stats.estim.*: conventions, alpha, SE, CI, ppc/coverage, offset/skew/J, TraceID, and signature.

VI. Contracts and Assertions (C30-4xx)

C30-401 (Weight normalization): | ( ∑ w_i / N_hat ) - 1 | ≤ tol_w_norm.
C30-402 (Variance consistency): relative gap | SE_rep - SE_robust | / SE_robust ≤ tol_var_gap.
C30-403 (Coverage/ppc): offline playback coverage error | cov_hat - ( 1 - alpha ) | ≤ tol_cov or ppc ∈ [tol_ppc_low, tol_ppc_high].
C30-404 (Numerical stability): mcse( theta_j ) ≤ tol_mcse or optimization residual ≤ tol_opt.
C30-405 (Arrival-time discrepancy): delta_form ≤ tol_Tarr.
C30-406 (Dimensional conservation): assert check_dim( y - f(x) ) == true; missing units constitute a violation.
C30-407 (Multiple-comparison budget): alpha_used ≤ alpha_budget (linked to Chapter 6).
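Several of these contracts reduce to simple threshold comparisons. The sketch below evaluates C30-401, C30-402, the coverage branch of C30-403, C30-405, and C30-407 on a dictionary of metrics. The dictionary keys, the function name, and every numeric value are hypothetical; the actual manifest schema and tolerance values are not specified in this chapter.

```python
def evaluate_contracts(m, tol):
    """Return a pass/fail flag per contract ID from metrics m and tolerances tol."""
    return {
        "C30-401": abs(m["sum_w"] / m["N_hat"] - 1) <= tol["tol_w_norm"],
        "C30-402": abs(m["SE_rep"] - m["SE_robust"]) / m["SE_robust"]
        <= tol["tol_var_gap"],
        "C30-403": abs(m["cov_hat"] - (1 - m["alpha"])) <= tol["tol_cov"],
        "C30-405": m["delta_form"] <= tol["tol_Tarr"],
        "C30-407": m["alpha_used"] <= m["alpha_budget"],
    }

# Hypothetical metrics and tolerances (illustrative only)
metrics = {
    "sum_w": 1003.0, "N_hat": 1000.0,
    "SE_rep": 0.105, "SE_robust": 0.100,
    "cov_hat": 0.93, "alpha": 0.05,
    "delta_form": 1e-9,
    "alpha_used": 0.06, "alpha_budget": 0.05,
}
tolerances = {
    "tol_w_norm": 0.01, "tol_var_gap": 0.10,
    "tol_cov": 0.03, "tol_Tarr": 1e-6,
}
results = evaluate_contracts(metrics, tolerances)
failed = sorted(cid for cid, ok in results.items() if not ok)
```

In this toy run only the multiple-comparison budget is exceeded, so `failed` contains the single entry "C30-407".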

VII. Implementation Bindings I30-*

I30-41 fit_glm(ds, formula, family, variance="robust") -> {hat_theta, V_hat, SE}
Supports family ∈ {gaussian, binomial, poisson, gamma}, variance ∈ {model, robust, cluster, replicate}.
I30-42 fit_bayes(ds, model_spec, priors, draws, chains) -> posterior
Returns posterior samples, mcse, rhat, HPD, and PPC; includes sampling seed and version.
I30-43 bootstrap_metric(fn, ds, B, scheme) -> {est, SE, CI}
scheme ∈ {pairs, residual, wild}, with support for weights and stratification.
I30-44 delta_propagate(g, hat_theta, V_hat) -> {est_g, SE_g, CI_g}
Auto-derives Jacobian G and computes G V G^T.
I30-45 compose_interval(est, SE, alpha, mode) -> CI
mode ∈ {normal, t, wilson, byar, bca}.
I30-46 emit_estimation_manifest(results, policy) -> manifest.stats.estim
Writes contract evaluations, conventions, thresholds, and signature; links to manifest.stats.sampling.

VIII. Cross-References

Sampling and weight provenance: Chapter 3 of this volume.
Multiple comparisons and sequential control: Chapter 6 of this volume.
Drift monitoring and baseline refresh: Chapter 7 of this volume.
Timeline and the two T_arr conventions: Methods.Cleaning v1.0, Ch. 5–6.
Metrology and unit consistency: Methods.Cleaning v1.0, Ch. 4.

IX. Quality and Risk Control

SLI/SLO
Coverage error | cov_hat - ( 1 - alpha ) |, interval width width_p50/p90, var_gap, mcse_p95, latency_ms_p99.
Risk control and rollback
- Triggers: failures of C30-402/403/405 or rhat > cap_rhat.
- Actions: switch between robust and replicate variance; downgrade the release to “experimental”; revert to the previous signed manifest and alert.

Summary

This chapter enforces unified conventions for estimation and intervals via P304-*, provides general formulas S304-*, operationalizes the release pipeline M30-4 with quality gates C30-4xx, and delivers consistent frequentist/Bayesian outputs through I30-*. It forms a stable base for subsequent chapters on multiple comparisons, drift, and causal assessment.

Copyright & License (CC BY 4.0)

Copyright: Unless otherwise noted, the copyright of “Energy Filament Theory” (text, charts, illustrations, symbols, and formulas) belongs to the author “Guanglin Tu”.
License: This work is licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0). You may copy, redistribute, excerpt, adapt, and share for commercial or non‑commercial purposes with proper attribution.
Suggested attribution: Author: “Guanglin Tu”; Work: “Energy Filament Theory”; Source: energyfilament.org; License: CC BY 4.0.

First published: 2025-11-11 | Current version: v5.1
License link: https://creativecommons.org/licenses/by/4.0/