06-EFT.WP.Core.DataSpec v1.0 | Chapter 6 — Partitioning, Indexing, and Query

Home ／ Docs-Technical WhitePaper ／ 06-EFT.WP.Core.DataSpec v1.0

Chapter 6 — Partitioning, Indexing, and Query

I. Scope and Objects

Define a unified approach for the physical partitioning, logical indexing, and query execution over D, so that partition / build_index / query deliver predictable latency and verifiable correctness across fmt backends.
Support time, space, and path gamma(ell) scenarios, coordinating with pk / idx_k, manifest, schema_version, CRS, unit(dim), and m ∈ {0,1}.

II. Terminology and Objectives

Objective: minimize scan volume and I/O amplification, maximize pruning and hit-rate, and guarantee determinism for pk and contract-bound queries.
Terms:
- K = [k1,k2,...,kd] (partition key order), B_i = card(partition on ki).
- sel(P) = |{ r ∈ R : P(r) }| / N (selectivity).
- C_proj (projected column set), alpha = |C_proj| / |Fields_total| (projection ratio).

III. Postulates (P66-*)

P66-1 Partition-First: whenever sel(P) is stable and ≪ 1, promote the predicate’s fields to a prefix of the partition key K.
P66-2 Monotone-Order: within a partition, ts and ell must be non-decreasing to enable range pruning and sequential scans.
P66-3 Unambiguous-pk: the primary key is globally unique across all partition levels; cross-partition updates materialize only via new parts.
P66-4 Metric Consistency: index ranges and manifest.stats intervals [min,max] must not conflict with data; on conflict, rollback the release.

IV. Partitioning Strategies and Naming

Time: date=YYYY-MM-DD/ or hour=YYYY-MM-DD-HH/; window width Delta_t and fs are declared in the manifest.
Space: crs=EPSG:xxxx/gh=p{p}/, where gh = geohash(lon,lat,p) or equivalent s2cell code.
Path: pid=<path id>/ell_bucket=[ell_lo,ell_hi)/, recording alongside L_gamma = ( ∫_gamma 1 d ell ).
Labels: sid=<site>/tid=<trajectory>/ and other low-cardinality dimensions for secondary bucketing.
Composite: K = [date, sid, pid] (keys ordered left-to-right by decreasing selectivity).

V. Cost Model for Partition-Key Selection (S66-1)

With top-to-depth d partition counts B = [B1,...,Bd] and predicate match fractions f = [f1,...,fd],
Under approximate independence (declare approx independence):
- E[parts(P)] approx prod_{i=1..d} ceil( fi * Bi ).
- V_scan approx E[parts(P)] * avg_bytes_per_part * ( alpha / ratio_compress ).
- L approx seek_cost * E[parts(P)] + V_scan / throughput_io.
Key selection rule: over candidates, minimize E[parts] and L, while constraining each B_i not to exceed the target metadata overhead.

VI. Time-Series Partitioning

Uniform sampling: each part covers fixed M * Delta_t; record { ts_start, ts_end, fs } and missing statistics gap_count.
Non-uniform sampling: enforce non_decreasing(ts); align row groups to time segments; enable interval stats { min(ts), max(ts) }.
Common key orders: K = [date, sid] or K = [hour, sid] (for high-concurrency ingestion).

VII. Space and Path Partitioning

Space:
- gh = geohash(lon,lat,p) with tile angular scale tile_deg ≈ 180 / 2^p; choose p from target spatial selectivity and hotspot distribution.
- Declare CRS and use float64 for lon/lat/alt; normalize across CRS before pushing predicates.
Path gamma(ell):
- Partition with K = [pid, ell_bucket], and align row groups to contiguous ell intervals;
- For T_arr datasets, retain both formulations and the delta_form field, enabling segment-level aggregation.

VIII. Directory and File Naming (Aligned with manifest)

Template:
.../dataset=DS/schema=S/version=vX.Y.Z/date=YYYY-MM-DD/sid=.../pid=.../ell=[a,b)/part-00000-of-000K.parquet.
Each part is recorded under manifest.files[*] as { path, bytes, hash_sha256, row_count, row_group_count, min,max }.

IX. Index Types and Storage Bindings

Primary index:
Clustered pk (ordered by pk or hash-bucketed), guaranteeing O(log N) or O(1) bucket location for equality lookups.
Secondary indexes (idx_k):
- B+tree: range queries and ordered scans (ts, ell, numeric intervals).
- hash: high-cardinality equality (uid, rid).
- bitmap: low-cardinality dims (sid, quality labels, enums); supports bitwise composite predicates.
- inverted: text and multivalue tags (tags[]).
Columnar auxiliaries:
- Zone maps: [min_i,max_i]; prune if disjoint with predicate intervals.
- Bloom filters: enable on hot columns; false-positive rate p_fp approx ( 1 - exp( -k * n_elem / m_bits ) )^k.
- Dictionary & RLE: amplify bitmap and range-pruning effectiveness.

X. Composite Indexes and Prefix Rules (S66-2)

Composite key order mirrors K: build idx(k1,k2,...).
Prefix usability: if predicates cover the prefix k1,...,kj, the index is directly usable; skipping prefixes degrades to full scan or suboptimal seeks.
Covering index: when idx includes C_proj, perform index-only scans, skipping data pages.

XI. Query Specification and Pushdown Order

Predicate normal form: P = CNF( conj_i disj_{i,j} pred_{i,j} ); normalize field and constant units via unit(dim).
Pushdown order:
- Partition pruning (directory keys & manifest.stats);
- Column segment pruning (zone maps & Bloom);
- Secondary index lookups (idx_k hits);
- Row filtering and applying m=1 mask;
- Projection C_proj and aggregation.
Quality filter convention: WHERE m = 1 AND q_score >= q_min.

XII. Canonical Query Templates

pk equality: SELECT * WHERE pk = rid → idx(pk) locate → single-row return.
Time window: SELECT C_proj WHERE ts ∈ [t0,t1) → partition prune date/hour → zone map → row filter.
Spatial window: SELECT * WHERE gh IN {tiles} AND CRS = ref → tile-set pruning → secondary index or row filter.
Path segment: SELECT C_proj WHERE pid = p AND ell ∈ [a,b) → prune with K=[pid,ell_bucket] → B+tree (pid,ell) scan.
T_arr segment aggregation:
- T_arr(factored) = ( 1 / c_ref ) * ( ∑ over rows n_eff * Δell )；
- T_arr(general) = ( ∑ over rows ( n_eff / c_ref ) * Δell )；
- delta_form = | T_arr(factored) - T_arr(general) | and assert delta_form ≤ tol_Tarr.

XIII. Consistency and Contract Checks

Partition consistency: assert unique(pk) across all partitions; non_decreasing(ts) and non_decreasing(ell) hold within partitions.
Index consistency: count(idx_lookup(all)) == N; index range stats match actual data.
Dimensional consistency: check_dim(expr) passes; units unified when equations are involved.

XIV. Workflow Mx-4 (Partitioning and Index Construction)

Workload profiling: estimate sel(P), hotspots, and write concurrency; generate candidate K sets.
Costing: use S66-1 to estimate E[parts], V_scan, L; select K and bucket counts B.
Physical write: materialize by K into part-*; generate manifest.files and segment stats.
Index build: build_index(ds, keys) to create pk and idx_k; prebuild Bloom filters where needed.
Contract checks: run assert_contract for uniqueness, time/path monotonicity, m and q_score rules.
Publish & freeze: export_manifest, attach_provenance, freeze_release(ds, tag).

XV. Runtime Parameters and Baseline Suggestions

Wide-table analytics: K=[date,sid]; columnar with stats and Bloom; row_group_size_bytes ≈ 128 MiB.
Path-heavy loads: K=[pid,ell_bucket]; choose ell_bucket so each partition meets V_scan and latency targets.
Spatial hotspots: raise p in hot regions, or enable hierarchical grids (multi-level gh).
High-cardinality equality: prefer hash or clustered pk; mixed range+equality: B+tree.

XVI. Implementation Bindings (Aligned with I60)

partition(ds:any, by:list[str]) -> dict[str, any]: materialize partitions by K, returning a partition map and stats.
build_index(ds:any, keys:list[str]) -> IRef: build pk / idx_k and return a reference.
query(ds:any, expr:str) -> any: execute normalized queries following “partition prune → index → filter → project.”

XVII. Manifest Extension Fields (Partitioning & Indexing)

partitioning: { keys: K, buckets: {k1:B1,...}, strategy, notes }.
indexes: [ { name, keys, type, covered_cols, bloom: {enabled, p_fp_target} } ].
stats: per part and per column { min, max, null_count, distinct_est }.

XVIII. Change and Compatibility

Changing the partitioning scheme or index types is a breaking change; bump schema_version.major and provide a migration plan.
Compatibility shim: provide query rewrite and routing from old K to new K' until historical backfill completes.

Copyright & License (CC BY 4.0)

Copyright: Unless otherwise noted, the copyright of “Energy Filament Theory” (text, charts, illustrations, symbols, and formulas) belongs to the author “Guanglin Tu”.
License: This work is licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0). You may copy, redistribute, excerpt, adapt, and share for commercial or non‑commercial purposes with proper attribution.
Suggested attribution: Author: “Guanglin Tu”; Work: “Energy Filament Theory”; Source: energyfilament.org; License: CC BY 4.0.

First published： 2025-11-11｜Current version：v5.1
License link：https://creativecommons.org/licenses/by/4.0/