EFT.WP.Methods.CrossStats v1.0

Chapter 15 — Use Cases & Reference Implementations


One-line objective: Demonstrate three end-to-end paradigms (offline evaluation, online A/B testing with sequential and multiplicity control, and cross-domain calibration transfer) that operationalize P30x/S30x/M30/I30 as an auditable, reproducible, and rollback-capable statistical practice loop.


I. Scope & Targets

  1. Scope
    Offline model evaluation and release gates; online experimentation (sequential and multiple testing) with closed-loop decisioning; cross-domain calibration transfer and baseline updates.
  2. Targets
    • Inputs: D_clean, manifest.*, ref, slo_policy, alpha_budget, bins, sync_ref.
    • Outputs: eval_report, ab_decision, calibration_map, drift_report, slo_attainment, manifest.stats.*.
    • Constraints: check_dim(expr) passes; metric windows computed on tau_mono; if T_arr is involved, record both formulations in parallel and persist delta_form.
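The delta_form constraint above (defined as S315-6 in Section IV) can be illustrated with a minimal sketch. It assumes n_eff is sampled at discrete path coordinates ell, that c_ref is a constant, and that trapezoidal integration is acceptable. The function names trapezoid and arrival_time_forms are hypothetical and are not part of the I30 interface.

```python
def trapezoid(values, ell):
    """Trapezoidal integral of sampled values over path coordinates ell."""
    return sum(
        0.5 * (values[i] + values[i + 1]) * (ell[i + 1] - ell[i])
        for i in range(len(ell) - 1)
    )


def arrival_time_forms(n_eff, ell, c_ref):
    """Return (factored form, in-integrand form, delta_form) for constant c_ref.

    Assumption: c_ref is a scalar constant, so the two formulations of S315-6
    differ only by floating-point rounding.
    """
    t_factored = trapezoid(n_eff, ell) / c_ref              # (1/c_ref) * integral of n_eff
    t_inside = trapezoid([n / c_ref for n in n_eff], ell)   # integral of n_eff/c_ref
    return t_factored, t_inside, abs(t_factored - t_inside)
```

With a constant c_ref the two formulations agree analytically, so a persisted delta_form well above rounding error would point to a discretization or unit problem rather than a physical difference.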

II. Terms & Symbols


III. Axioms P315-*


IV. Minimal Equations S315-*

  1. S315-1 (Weighted Mean/Variance)
    • hat{mu}_w = ( ∑ w_i y_i ) / ( ∑ w_i )
    • hat{sigma}_w^2 = ( ∑ w_i ( y_i - hat{mu}_w )^2 ) / ( ∑ w_i )
  2. S315-2 (Two-Arm Sample Size Approximation)
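    n_per_arm ≈ ( ( z_{1 - alpha/2} + z_{power} )^2 * 2 * sigma^2 ) / MDE^2 (two-sided test with equal variance sigma^2 per arm; z_q is the standard normal quantile at q, so z_{power} = z_{1 - beta}).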
  3. S315-3 (ECE)
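    ECE = ∑_{b=1..B} ( n_b / N ) * | acc_b - conf_b | (with n_b the sample count in bin b, acc_b the bin's empirical accuracy, and conf_b its mean predicted confidence).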
  4. S315-4 (Sequential Stopping)
    tau = inf { t : S_t ≥ h_upper or S_t ≤ h_lower } (with S_t the cumulative log-likelihood ratio).
  5. S315-5 (Drift Metrics)
    W1(p,q), KL(p||q), psi = ∑ ( (q_i - p_i) * ln( q_i / p_i ) ) (binned).
  6. S315-6 (Arrival-Time Gap)
    delta_form = | ( 1 / c_ref ) * ( ∫ n_eff d ell ) - ( ∫ ( n_eff / c_ref ) d ell ) |
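As a rough illustration of S315-1, S315-2, S315-3, and the psi term of S315-5, the following sketch uses only the Python standard library. The function names are hypothetical. The equal-width binning on [0, 1] and the eps clipping of empty bins are assumptions made for the example, not requirements stated in this chapter.

```python
import math
from statistics import NormalDist


def weighted_mean_var(y, w):
    """S315-1: weighted mean and weighted variance (normalized by sum of w)."""
    total = sum(w)
    mu = sum(wi * yi for wi, yi in zip(w, y)) / total
    var = sum(wi * (yi - mu) ** 2 for wi, yi in zip(w, y)) / total
    return mu, var


def n_per_arm(alpha, power, sigma, mde):
    """S315-2: two-arm sample size approximation, rounded up."""
    z = NormalDist().inv_cdf  # standard normal quantile function
    n = (z(1 - alpha / 2) + z(power)) ** 2 * 2 * sigma ** 2 / mde ** 2
    return math.ceil(n)


def ece(conf, correct, n_bins):
    """S315-3: ECE with equal-width confidence bins (binning is an assumption)."""
    bins = [[] for _ in range(n_bins)]
    for c, ok in zip(conf, correct):
        b = min(int(c * n_bins), n_bins - 1)  # confidence 1.0 falls in the last bin
        bins[b].append((c, ok))
    n_total = len(conf)
    err = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        conf_b = sum(c for c, _ in members) / len(members)
        acc_b = sum(ok for _, ok in members) / len(members)
        err += len(members) / n_total * abs(acc_b - conf_b)
    return err


def psi(p, q, eps=1e-12):
    """S315-5: psi over binned proportions; eps clipping is an assumption."""
    total = 0.0
    for pi, qi in zip(p, q):
        pi, qi = max(pi, eps), max(qi, eps)  # avoid log(0) and division by zero
        total += (qi - pi) * math.log(qi / pi)
    return total
```

For example, n_per_arm(0.05, 0.8, 1.0, 0.5) evaluates to 63 samples per arm, and psi of two identical binned distributions is 0.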

V. Metrology Flow M30-15 (Three End-to-End Use Cases)

  1. Use Case A: Offline Evaluation & Release Gate (Batch → Freeze)
    • Ready
      1. Run standardize_names and repair_units (see Methods.Cleaning).
      2. time_align_for_stats(ds, sync_ref); compute_weights(ds, scheme).
    • Estimate
      1. fit_glm(ds, formula, family) or import model scores.
      2. bootstrap_metric(fn, ds, B) to produce {est, CI}; calibration_report(pred, obs, bins).
    • Check
      detect_drift(ref, cur, metrics); evaluate_stat_contracts(metrics, rules); if needed, backtest_coverage(ds, plan).
    • Persist
      emit_stats_manifest(results, policy); coordinate with Methods.Cleaning freeze_release(ds, tag) for artifact release.
  2. Use Case B: Online A/B & Sequential Decisions (Stream → Decide → Rollback/Ship)
    • Ready
      design_experiment(pop, constraints, alpha, power); register alpha_budget and slo_policy.
    • Run
      run_ab_test(stream, metric, alpha_spending) to emit real-time S_t and interim decisions; track_alpha_spending(seq_tests).
    • Guard
      drift_monitor(ref, cur, methods); latency_summary(traces); when exposure bias is detected, estimate_ate(ds, method=DR).
    • Close
      compute_slo_attainment(metrics, slo); audit_decision(trace, manifest); on violation, execute rollback plan and re-experiment.
  3. Use Case C: Cross-Domain Calibration Transfer & Baseline Update (Domain A → Domain B)
    • Ready
      Collect samples from A/B; harmonize units via repair_units and align time via time_align_for_stats.
    • Transfer
      calibration_transfer(src=A, dst=B, method ∈ {Platt, Isotonic, BBQ}) -> map; enforce monotonicity and guard against overfitting.
    • Validate
      On domain B, evaluate ECE, Brier before/after; detect_drift to ensure W1/KL/psi stay within thresholds.
    • Publish
      emit_stats_manifest to manifest.stats.calibration.*, including map.version, bins, ECE_before/after, then sign and archive.
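The Run step of Use Case B emits S_t and stops according to S315-4. A minimal sketch of that stopping rule follows, assuming a 0/1 outcome stream and Wald's classical boundaries h_upper = ln((1 - beta) / alpha) and h_lower = ln(beta / (1 - alpha)). The function sprt_bernoulli and its parameters p0, p1, and beta are hypothetical. This is not a description of how run_ab_test or its alpha_spending argument behave.

```python
import math


def sprt_bernoulli(stream, p0, p1, alpha, beta):
    """Wald SPRT for H0: p = p0 versus H1: p = p1 on a stream of 0/1 outcomes.

    Returns (decision, tau, S_tau), where decision is 'accept_h1',
    'accept_h0', or 'continue' if the stream ended before a boundary was hit.
    """
    h_upper = math.log((1 - beta) / alpha)   # assumption: Wald upper boundary
    h_lower = math.log(beta / (1 - alpha))   # assumption: Wald lower boundary
    llr_success = math.log(p1 / p0)
    llr_failure = math.log((1 - p1) / (1 - p0))
    s_t = 0.0  # cumulative log-likelihood ratio S_t
    t = 0
    for t, x in enumerate(stream, start=1):
        s_t += llr_success if x else llr_failure
        if s_t >= h_upper:
            return "accept_h1", t, s_t
        if s_t <= h_lower:
            return "accept_h0", t, s_t
    return "continue", t, s_t
```

A 'continue' result means no boundary was crossed on the data seen so far, so the experiment keeps collecting observations rather than making a decision.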

VI. Contracts & Assertions (Use-Case Mapping C30-151x)


VII. Implementation Bindings I30-* (Use-Case Subsets)


VIII. Cross-References


IX. Quality & Risk Control

  1. Use Case A (Offline)
    • SLIs: coverage_rate, ECE, Brier, W1/KL/psi.
    • Rollback: switch to more conservative intervals (bootstrap/Bayesian quantiles), increase B, or roll back to ref.
  2. Use Case B (Online)
    • SLIs: latency_ms_p99, alpha_spent, FDR, decision_sign_stability.
    • Rollback: staged (gray) rollback with traffic reduction, freeze stopping boundaries, withdraw variants while conserving alpha_budget.
  3. Use Case C (Cross-Domain)
    • SLIs: ECE_after - ECE_before, Brier_after, drift_level.
    • Rollback: disable map, revert to in-domain calibration, or trigger resampling.
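The Use Case C SLIs above can be turned into a simple gate. The sketch below computes the Brier score and then compares ECE_after - ECE_before and Brier_after against thresholds. The function names and the threshold parameters max_ece_delta and max_brier are hypothetical, since this chapter does not state threshold values.

```python
def brier_score(prob, outcome):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    return sum((p - o) ** 2 for p, o in zip(prob, outcome)) / len(prob)


def calibration_gate(ece_before, ece_after, brier_after, max_ece_delta, max_brier):
    """Return 'publish' or 'rollback' from the Use Case C SLIs.

    Assumption: rollback is triggered when calibration error grows by more
    than max_ece_delta or when the post-transfer Brier score exceeds max_brier.
    """
    ece_delta = ece_after - ece_before
    if ece_delta > max_ece_delta or brier_after > max_brier:
        return "rollback"
    return "publish"
```

A 'rollback' result would correspond to one of the listed actions, such as disabling the map or reverting to in-domain calibration. Which action to take is left to the rollback plan.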

Summary

The three use cases cover the critical paths for offline evaluation, online decisioning, and cross-domain transfer. Each adopts P315-* as non-negotiable premises, S315-* as computational baselines, M30-15 as the process spine, and C30-151x as release gates. Via I30-* interfaces, statistical gauges are unified with cleaning/time-base/audit systems into an integrated, fully traceable production practice.

Copyright & License (CC BY 4.0)

Copyright: Unless otherwise noted, the copyright of “Energy Filament Theory” (text, charts, illustrations, symbols, and formulas) belongs to the author “Guanglin Tu”.
License: This work is licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0). You may copy, redistribute, excerpt, adapt, and share for commercial or non‑commercial purposes with proper attribution.
Suggested attribution: Author: “Guanglin Tu”; Work: “Energy Filament Theory”; Source: energyfilament.org; License: CC BY 4.0.

First published: 2025-11-11 | Current version: v5.1
License link: https://creativecommons.org/licenses/by/4.0/