Chapter 7 — Versioning, Release, and Change
I. Scope and Objectives
- This chapter defines the semantic versioning of the schema (schema_version) and data releases (data_release), the processes and decisions for diff / patch / freeze_release, and the minimum requirements for compatibility matrices, rollback, and migration adapters.
- It covers coordinated versioning for D, S, manifest, pk/idx_k, Trace, hash_sha256, signature, and the cross-volume anchors T_arr, gamma(ell), n_eff(x,t), c_ref.
II. Terms and Symbols
- semver def= major.minor.patch, each a non-negative integer.
- schema_version binds to S; data_release binds to D.
- BC (backward compatibility); DC (downward compatibility: “new data can be read by old readers”).
- A \ B (set difference); keys(r) def= the field set of record r.
III. Postulates (P67-*)
- P67-1 Semantic Versioning: both schema_version and data_release must adopt semver and be explicitly recorded in the manifest.
- P67-2 Frozen Immutability: once freeze_release(D, tag) succeeds, that release’s content, statistics, and fingerprints are globally immutable; any correction must be published as a new data_release.
- P67-3 Reproducibility: given {schema_version, data_release, manifest, Trace}, it must be possible to reconstruct the same hash_sha256(blob).
- P67-4 Contract-First: any candidate release that fails assert_contract shall not enter the “frozen” state.
- P67-5 Arrival-Time Consistency: any data version containing T_arr must record both formulations (T_arr_factored and T_arr_general) together with delta_form, and declare tol_Tarr; if delta_form > tol_Tarr, the release fails.
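As an illustration of P67-3, the following is a minimal sketch of a reproducible fingerprint. It assumes the blob is obtained by canonical JSON serialization (sorted keys, fixed separators) of the four inputs; the helper name canonical_blob, this serialization choice, and the sample values are assumptions and are not defined by this specification.

```python
import hashlib
import json


def canonical_blob(schema_version, data_release, manifest, trace):
    """Hypothetical helper: serialize the release inputs deterministically."""
    payload = {
        "schema_version": schema_version,
        "data_release": data_release,
        "manifest": manifest,
        "trace": trace,
    }
    # Sorted keys and fixed separators remove ordering and whitespace variance.
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")


def hash_sha256(blob):
    """Return the hex SHA-256 digest of a byte string."""
    return hashlib.sha256(blob).hexdigest()


# Illustrative inputs only; the two manifests differ in key order, not content.
trace = [{"source": "raw", "method": "ingest", "artifact": "D"}]
first = hash_sha256(canonical_blob("1.2.0", "1.2.3", {"tag": "r1", "notes": "x"}, trace))
second = hash_sha256(canonical_blob("1.2.0", "1.2.3", {"notes": "x", "tag": "r1"}, trace))
other = hash_sha256(canonical_blob("1.2.0", "1.2.4", {"tag": "r1", "notes": "x"}, trace))
```

Under these assumptions, first and second are identical while other differs, which is the property P67-3 asks for: the same inputs always rebuild the same fingerprint.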
IV. Version Semantics and Comparison (S67-1)
- semver_compare(a,b) compares the integer triples (major, minor, patch) lexicographically, so 1.10.0 > 1.9.5.
- Compatibility conditions:
- BC (schema vX.Y.Z -> vX.Y'.Z') iff the major is unchanged and all changes are additive or ignorable.
- DC has no default guarantee; declare reader_min explicitly.
- Recommended binding:
- Schema versioning: major+1 for breaking changes; minor+1 for field additions or non-breaking index changes; patch+1 for descriptive corrections.
- Data release: any bit-level change to the data, including corrections and backfills, advances at least patch+1.
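The comparison rule of S67-1 can be sketched as follows. The helper parse_semver and the three-way integer return value of semver_compare are assumptions; this chapter names semver_compare but does not fix its signature or return type.

```python
def parse_semver(text):
    """Parse 'major.minor.patch' into a tuple of non-negative integers."""
    parts = text.split(".")
    if len(parts) != 3:
        raise ValueError(f"expected major.minor.patch, got {text!r}")
    triple = tuple(int(p) for p in parts)
    if min(triple) < 0:
        raise ValueError(f"negative component in {text!r}")
    return triple


def semver_compare(a, b):
    """Return -1, 0, or 1; tuples compare component by component as integers."""
    ta, tb = parse_semver(a), parse_semver(b)
    return (ta > tb) - (ta < tb)


def same_major(a, b):
    """Necessary condition for BC in S67-1: the major component is unchanged."""
    return parse_semver(a)[0] == parse_semver(b)[0]
```

Comparing integer tuples rather than strings matters: as strings, "1.10.0" would sort before "1.9.5".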
V. Change Classes and Determination
- Additive & non-breaking (minor):
- Add field f_new with nullable(f_new)=True or a determinate default;
- Add secondary indexes, segment statistics, or Bloom filters;
- Add manifest.notes or Trace elements.
- Conditional non-breaking (minor, adapter required):
- Change unit(f) to an equivalent-dimension explicit unit (provide corr_env(x; RefCond) and a conversion function);
- Adjust the physical clustering order of pk while preserving pk semantics and uniqueness.
- Breaking (major):
- Change pk definition or field type incompatibly (e.g., int32 -> string with no injective mapping);
- Drop a field with no alias backfill;
- Change partition key K or path pid/ell_bucket scheme such that old paths are not parseable;
- Change CRS without an automatic reprojection path.
- Data-level fixes (data patch):
- Value corrections, missing backfills, late-arrival window writes, T_arr recomputation (both forms and delta_form must be kept in sync).
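A minimal sketch of how the change classes above could drive the bump decision. The change labels in CHANGE_CLASS are hypothetical names chosen for illustration, and treating an unrecognized change as major is a conservative assumption, not a rule stated in this chapter.

```python
BUMP_RANK = {"patch": 0, "minor": 1, "major": 2}

# Hypothetical change labels mapped to the classes of Section V.
CHANGE_CLASS = {
    "add_nullable_field": "minor",
    "add_secondary_index": "minor",
    "change_unit_with_converter": "minor",
    "change_pk_definition": "major",
    "drop_field_without_alias": "major",
    "change_crs_without_reprojection": "major",
    "descriptive_correction": "patch",
    "value_correction": "patch",
}


def required_bump(changes):
    """Return the strongest bump level implied by a list of change labels."""
    if not changes:
        raise ValueError("no changes supplied")
    # Assumption: unknown change labels are treated as breaking.
    levels = [CHANGE_CLASS.get(change, "major") for change in changes]
    return max(levels, key=BUMP_RANK.__getitem__)
```

For example, a release that adds a nullable field and fixes a description needs minor+1, while the same release plus a dropped field needs major+1.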
VI. Diff Model (S67-2)
- Record diff:
- diff_R = { r | r ∈ R_new \ R_old } ∪ { r | r ∈ R_old \ R_new } ∪ { r | keys(r) identical but values differ }.
- Schema diff:
- diff_S = (Fields_add, Fields_drop, Type_change, Unit_change, Constraint_change).
- Index diff:
- diff_I = (Index_add, Index_drop, Index_rebuild).
- Change summary:
- changelog = { scope ∈ {schema, data, index} -> diff_* }.
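The diff model can be sketched with plain dictionaries. This sketch assumes records are matched by pk (each dataset is a dict mapping pk to a record dict) and reads the third term of diff_R as "same pk, different values"; both readings are assumptions. Unit_change and Constraint_change are omitted for brevity.

```python
def diff_records(old, new):
    """Compute diff_R for two dicts mapping pk -> record dict."""
    shared = old.keys() & new.keys()
    return {
        "added": {pk: new[pk] for pk in new.keys() - old.keys()},
        "removed": {pk: old[pk] for pk in old.keys() - new.keys()},
        "changed": {pk: (old[pk], new[pk]) for pk in shared if old[pk] != new[pk]},
    }


def diff_schema(old_fields, new_fields):
    """Compute part of diff_S for two dicts mapping field name -> type name."""
    shared = old_fields.keys() & new_fields.keys()
    return {
        "Fields_add": sorted(new_fields.keys() - old_fields.keys()),
        "Fields_drop": sorted(old_fields.keys() - new_fields.keys()),
        "Type_change": sorted(f for f in shared if old_fields[f] != new_fields[f]),
    }


# Illustrative data only.
r_old = {1: {"v": 1}, 2: {"v": 2}}
r_new = {2: {"v": 20}, 3: {"v": 3}}
d_r = diff_records(r_old, r_new)
d_s = diff_schema({"pid": "int32", "ell": "float"}, {"pid": "string", "n_eff": "float"})
```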
VII. Patching and Externalization (S67-3)
- Minimal operation set for patch_dataset(ds, patch):
- op=ADD_FIELD(name, type, default|nullable)
- op=DROP_FIELD(name)
- op=ALTER_FIELD(name, type|unit|dim)
- op=UPSERT_RECORD(pk, values)
- op=DELETE_RECORD(pk)
- op=REINDEX(keys)
- Constraint: after patching, both validate_dataset(..., strict=True) and assert_contract must pass; otherwise the patch is rolled back.
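The patch-then-validate-or-roll-back constraint can be sketched as below. The dataset layout (a dict with "fields" and "records"), the op dictionaries, and the validate callback standing in for validate_dataset(..., strict=True) plus assert_contract are all assumptions; ALTER_FIELD and REINDEX are left out of this sketch.

```python
import copy


def patch_dataset(ds, patch, validate):
    """Apply ops to a copy of ds; return ds unchanged if validation fails."""
    candidate = copy.deepcopy(ds)
    for op in patch:
        kind = op["op"]
        if kind == "ADD_FIELD":
            candidate["fields"][op["name"]] = op["type"]
            for record in candidate["records"].values():
                record.setdefault(op["name"], op.get("default"))
        elif kind == "DROP_FIELD":
            candidate["fields"].pop(op["name"], None)
            for record in candidate["records"].values():
                record.pop(op["name"], None)
        elif kind == "UPSERT_RECORD":
            candidate["records"].setdefault(op["pk"], {}).update(op["values"])
        elif kind == "DELETE_RECORD":
            candidate["records"].pop(op["pk"], None)
        else:
            raise ValueError(f"unsupported op in this sketch: {kind}")
    # Rollback: the original object is returned untouched on failure.
    return candidate if validate(candidate) else ds


def strict_validate(d):
    """Stand-in validator: every record carries exactly the schema's fields."""
    return all(set(r) == set(d["fields"]) for r in d["records"].values())


base = {"fields": {"pid": "int"}, "records": {1: {"pid": 1}}}
good = patch_dataset(base, [{"op": "ADD_FIELD", "name": "ell", "type": "float", "default": 0.0}], strict_validate)
bad = patch_dataset(base, [{"op": "UPSERT_RECORD", "pk": 2, "values": {"bogus": 1}}], strict_validate)
```

Working on a deep copy keeps the rollback trivial: a failed patch never touches the input dataset.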
VIII. Release Workflow (Mx-2)
- Preparation:
- Finalize the candidate dataset and manifest, bump schema_version and data_release (bump_version).
- Attach provenance: attach_provenance(D, Trace); compute hash_sha256(blob).
- Validation:
- Run validate_dataset and assert_contract;
- For T_arr: compute both formulations and delta_form, and assert delta_form <= tol_Tarr.
- Metric gates:
- Ensure quality_metrics(D) meet thresholds; verify latency baselines and partition-pruning rates for key queries.
- Indexes and statistics:
- build_index(D, keys); refresh segment stats and distinct_est.
- Signing and freeze:
- sign_data(D, keyref); freeze_release(D, tag); mark immutable.
- Publish and announce:
- Update registries and the compatibility matrix; produce the changelog and migration guide; start the observation window and rollback window.
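The validation and metric gates of the workflow can be sketched as a single pass that collects failures before any freeze is attempted. The function run_release_gates, the candidate layout, and the lambda checks standing in for validate_dataset and quality_metrics are assumptions; delta_form is taken as an already recorded per-record value rather than recomputed here.

```python
def run_release_gates(candidate, tol_Tarr, checks):
    """Return the names of failed gates; an empty list means freezing may proceed."""
    failures = [name for name, check in checks if not check(candidate)]
    # P67-5: every recorded delta_form must stay within the declared tolerance.
    worst = max((r["delta_form"] for r in candidate["records"]), default=0.0)
    if worst > tol_Tarr:
        failures.append("T_arr delta_form")
    return failures


# Illustrative candidate and stand-in checks.
candidate = {
    "records": [{"pid": 1, "delta_form": 1e-9}, {"pid": 2, "delta_form": 3e-9}],
    "quality": 0.99,
}
checks = [
    ("validate_dataset", lambda c: len(c["records"]) > 0),
    ("quality_metrics", lambda c: c["quality"] >= 0.95),
]
passed = run_release_gates(candidate, 1e-8, checks)
failed = run_release_gates(candidate, 2e-9, checks)
```

Collecting every failure, instead of stopping at the first one, gives the release owner the full list to fix in one iteration.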
IX. Rollback and Freeze Policies
- Hard freeze: only “sideband” releases may be added; no overwrites; tags are permanent.
- Soft freeze: within the observation window, allow hotfixes as patch+1, but a new tag must point to the new artifact.
- Rollback principle: rollback(tag_prev) must restore the previous viable data_release, and update consumer routing (readers point to tag_prev).
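A minimal sketch of the freeze and rollback policies, assuming an in-memory registry. The class ReleaseRegistry and its attributes are hypothetical; in this sketch freeze_release takes a tag and a fingerprint rather than the dataset itself.

```python
class ReleaseRegistry:
    """Hypothetical registry: tags are permanent, routing points at one tag."""

    def __init__(self):
        self._frozen = {}    # tag -> fingerprint of the frozen artifact
        self.current = None  # tag that readers are routed to

    def freeze_release(self, tag, fingerprint):
        # Hard freeze: an existing tag can never be overwritten.
        if tag in self._frozen:
            raise ValueError(f"tag {tag!r} is permanent and already frozen")
        self._frozen[tag] = fingerprint
        self.current = tag

    def rollback(self, tag_prev):
        # Rollback only switches routing; frozen artifacts are left untouched.
        if tag_prev not in self._frozen:
            raise KeyError(f"unknown tag {tag_prev!r}")
        self.current = tag_prev


registry = ReleaseRegistry()
registry.freeze_release("r-1.0.0", "aaa")
registry.freeze_release("r-1.0.1", "bbb")
registry.rollback("r-1.0.0")
```

A hotfix during a soft freeze would then be published under a new tag such as r-1.0.2, never by reusing r-1.0.1.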
X. Compatibility Matrix (Indicative Rules)
- Dimensions: reader(schema_version) × dataset(schema_version) × data_release.
- Rules:
- If reader.major == dataset.major, assume compatibility by default;
- If reader.minor < dataset.minor and required fields were added, an adapter layer is needed;
- Different CRS or K requires a converter or routing rules;
- pk changes are major and require synchronized reader upgrades.
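The first two indicative rules and the major-change rule can be sketched as a small decision function. The function name, the flag required_fields_added, and the three result labels are assumptions; CRS and K differences are not modeled here.

```python
def _major_minor(version):
    """Split 'major.minor.patch' and return (major, minor) as integers."""
    major, minor, _patch = (int(p) for p in version.split("."))
    return major, minor


def reader_compatibility(reader_schema, dataset_schema, required_fields_added=False):
    """Classify a reader/dataset pair as compatible, adapter, or upgrade_reader."""
    r_major, r_minor = _major_minor(reader_schema)
    d_major, d_minor = _major_minor(dataset_schema)
    if r_major != d_major:
        # Major changes (for example pk changes) need a synchronized reader upgrade.
        return "upgrade_reader"
    if r_minor < d_minor and required_fields_added:
        return "adapter"
    return "compatible"
```

A compatibility matrix can then be filled by evaluating this function for every reader and dataset schema_version pair of interest.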
XI. Migration and Adapter Layers
- Field aliases: use register_field(..., aliases=[...]) and normalize_field for per-field compatibility.
- Units and dimensions: supply unit/dim converters and record corr_env(x; RefCond) under manifest.transforms.
- Partition/path: provide a query rewrite and routing table from old K to new K' until historical backfill completes.
- Indexes: when idx_k changes, provide a “write-new / read-old” window with dual-write flags.
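A minimal sketch of per-field alias handling. The specification elides the full signature of register_field, so the two-argument form below, the module-level alias table, the helper normalize_record, and the alias names used in the example are all assumptions.

```python
_ALIASES = {}  # alias or canonical name -> canonical name


def register_field(name, aliases=()):
    """Register a canonical field name together with its legacy aliases."""
    _ALIASES[name] = name
    for alias in aliases:
        _ALIASES[alias] = name


def normalize_field(name):
    """Map an alias to its canonical name; unknown names pass through."""
    return _ALIASES.get(name, name)


def normalize_record(record):
    """Rewrite a record's keys to canonical field names."""
    return {normalize_field(key): value for key, value in record.items()}


# Hypothetical legacy names for illustration.
register_field("T_arr_general", aliases=["T_arr", "t_arrival"])
legacy = {"t_arrival": 1.5, "pid": 7}
normalized = normalize_record(legacy)
```

With such a table in place, a renamed field stays readable by old queries, which is what keeps a rename from being a breaking drop.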
XII. Manifest Extensions (Versioning and Release)
- versioning: { schema_version, data_release, prev_release, reader_min, notes }
- changelog: { diff_S, diff_R, diff_I, breaking: bool }
- integrity: { hash_sha256, signature, signed_by, signed_at }
- freeze: { tag, frozen_at, immutable: true }
- adapters: { fields_alias, unit_converters, routing_rules }
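The extension blocks above can be checked mechanically before a release. The sketch below lists the keys exactly as given in this section; the function missing_manifest_keys and the sample manifest values are assumptions.

```python
REQUIRED_KEYS = {
    "versioning": {"schema_version", "data_release", "prev_release", "reader_min", "notes"},
    "changelog": {"diff_S", "diff_R", "diff_I", "breaking"},
    "integrity": {"hash_sha256", "signature", "signed_by", "signed_at"},
    "freeze": {"tag", "frozen_at", "immutable"},
    "adapters": {"fields_alias", "unit_converters", "routing_rules"},
}


def missing_manifest_keys(manifest):
    """Return 'section.key' names that the manifest does not yet provide."""
    missing = []
    for section, keys in REQUIRED_KEYS.items():
        present = set(manifest.get(section, {}))
        missing.extend(f"{section}.{key}" for key in sorted(keys - present))
    return missing


# Illustrative, deliberately incomplete manifest.
draft = {
    "versioning": {
        "schema_version": "1.3.0",
        "data_release": "1.3.2",
        "prev_release": "1.3.1",
        "reader_min": "1.0.0",
        "notes": "",
    },
    "freeze": {"tag": "r-1.3.2", "immutable": True},
}
gaps = missing_manifest_keys(draft)
```

An empty result would indicate that all five extension blocks are fully populated.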
XIII. Alignment with Cross-Volume Anchors (Example: T_arr)
- Field set: { pid, ell, n_eff, c_ref, T_arr_factored, T_arr_general, delta_form }.
- Change rules:
- If c_ref reference updates but can be recovered via corr_env(x; RefCond), it is non-breaking (minor);
- If the path parameterization ell or the measure d ell changes, it is breaking (major); bump schema_version.major+1 and provide migration scripts.
XIV. Implementation Bindings (Aligned with I60 7)
- bump_version(schema:SRef, semver:str) -> None: update schema_version or data_release and write to manifest.versioning.
- diff_datasets(a:any, b:any, keys:list[str]) -> dict: return diff_R / diff_S / diff_I and a breaking flag.
- patch_dataset(ds:any, patch:any) -> any: execute the operation set in S67-3 and enforce strict validation.
- freeze_release(ds:any, tag:str) -> None: finalize signing, fingerprinting, and immutability marking.
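A sketch of the version bump as it might write into manifest.versioning. It deviates from the binding above by taking a manifest dict, a field name, and a bump level instead of (schema:SRef, semver:str); that signature, the helper next_version, and the resetting of lower components to zero (borrowed from the Semantic Versioning convention) are assumptions.

```python
def next_version(version, level):
    """Return the semver string after applying a major, minor, or patch bump."""
    major, minor, patch = (int(p) for p in version.split("."))
    if level == "major":
        return f"{major + 1}.0.0"
    if level == "minor":
        return f"{major}.{minor + 1}.0"
    if level == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown bump level {level!r}")


def bump_version(manifest, field, level):
    """Bump schema_version or data_release inside manifest['versioning']."""
    if field not in ("schema_version", "data_release"):
        raise ValueError(f"unexpected field {field!r}")
    versioning = manifest["versioning"]
    if field == "data_release":
        # Keep the link to the release being superseded.
        versioning["prev_release"] = versioning["data_release"]
    versioning[field] = next_version(versioning[field], level)


manifest = {"versioning": {"schema_version": "1.2.3", "data_release": "1.2.3"}}
bump_version(manifest, "schema_version", "minor")
bump_version(manifest, "data_release", "patch")
```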
XV. Audit and Governance
- Retention: enforce_retention(ds, ttl_days) cleans up temporary releases without affecting frozen artifacts.
- Traceability: Trace must include at least { source -> method -> artifact } with version fingerprints.
- Compliance: signature and hash_sha256 must be visible both on the release page and in the manifest.
XVI. Release Checklist (Executive Summary)
- Inputs: D, S, manifest, Trace, semver, target_tag
- Checks: validate_dataset, assert_contract, quality_metrics, T_arr delta_form <= tol_Tarr
- Artifacts: tagged dataset, manifest.versioning, indexes, signature, changelog
- Post: update compatibility matrix, switch reader routing, set observation and rollback windows
Copyright & License (CC BY 4.0)
Copyright: Unless otherwise noted, the copyright of “Energy Filament Theory” (text, charts, illustrations, symbols, and formulas) belongs to the author “Guanglin Tu”.
License: This work is licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0). You may copy, redistribute, excerpt, adapt, and share for commercial or non‑commercial purposes with proper attribution.
Suggested attribution: Author: “Guanglin Tu”; Work: “Energy Filament Theory”; Source: energyfilament.org; License: CC BY 4.0.
First published: 2025-11-11 | Current version: v5.1
License link: https://creativecommons.org/licenses/by/4.0/