| Domain | h (bits) | n | κ | h·ln2 / √κ | Closes? |
|---|---|---|---|---|---|
| Genomic (5,627 genomes · BiosphereCodec) | 1.61 (Shannon entropy, measured; transition/transversion bias) | 2.00 (tree topology, measured; Gromov δ = 0) | 1.34 (GTDB telescope, measured; 250 genomes, Spearman peak) | 0.96 | ✓ |
| Viral (15 families · RNA + DNA viruses) | 0.8–2.9 (per-family entropy, measured; RNA & DNA polymerases) | 2.00 (tree topology, measured; phylogenetic reconstruction) | 0.3–4.1 (BiosphereCodec, measured; per-family embedding) | r = 0.996 | ✓ |
| Linguistic (1,015 languages · 18 families) | 1.65 (Index Diachronica, measured; 16,496 sound change rules) | 2.00 (Gromov δ = 0, measured; Sarkar H² ≈ H³) | 1.31 ((h·ln2)², predicted; telescope: 0.75, compressed) | 1.00 | ✓ |
| Proteomic (14 Pfam families · BLOSUM62) | 2.81 (BLOSUM62 effective alphabet, measured; k_eff ≈ 7) | 2.03 (BiosphereCodec, measured; embedding) | 3.80 ((h·ln2)², predicted; awaiting tree validation) | 1.00 | ◯ |
| Neuropixels (39 Steinmetz sessions · SPD(180)) | 1.04 (volume entropy, measured; geodesic ball growth, 2.4 s) | 2.03 (from h + κ, implied; p = 0.59 vs n = 2) | 0.485 (triangle excess, measured; bootstrap CI ± 0.005) | 1.03 | ✓ |
| GPT-2, layer 9 (124M params · autoregressive) | 0.97 (volume entropy, measured; activation covariance SPD) | 2.04 (from h + κ, implied) | 0.413 (triangle excess, measured; 179 covariance windows) | 1.05 | ✓ |
| BERT, layer 6 (110M params · bidirectional encoder) | 0.96 (volume entropy, measured; activation covariance SPD) | 2.05 (from h + κ, implied) | 0.403 (triangle excess, measured; 180 covariance windows) | 1.05 | ✓ |
| DistilGPT-2, layer 3 (82M params · distilled autoregressive) | 1.05 (volume entropy, measured; activation covariance SPD) | 2.15 (from h + κ, implied) | 0.398 (triangle excess, measured; 179 covariance windows) | 1.15 | ✓ |
| fMRI (ABIDE Pitt · 20 subjects · cc200) | 1.70 (volume entropy, measured; 20 s windows, 60 ROIs) | 2.72 (whole-brain recurrence, implied; n > 2 as predicted) | 0.469 (Log-Euclidean triangle excess, measured; 20 s windows) | 1.72 | n > 2* |
| EEG (EEGBCI · 20 subjects · EO/EC) | 0.97 (derived from dcorr + κ; EO > EC, p = 0.04) | 2.19 (correlation dimension, measured; Grassberger–Procaccia on AIRM) | 0.284 (AIRM triangle excess, measured; 64 sensors, alpha band) | 1.19 | ◯ |
| ViT-Base, layer 12 (86M params · vision encoder) | 1.01 (volume entropy, measured; activation covariance SPD) | 2.00 (from h + κ, implied; monotonic L1→L12 convergence) | 0.486 (triangle excess, measured; 46 covariance windows) | 1.00 | ✓ |
h is the entropy rate of the information code, measured in bits per symbol. For DNA, it reflects the effective alphabet of accessible mutations (~3 possible substitutions per nucleotide). For neural systems, it is the volume entropy, the exponential growth rate of geodesic balls on the SPD covariance manifold, via Manning's theorem (1979). For language, it is the transition entropy of sound changes.
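As a minimal sketch of what an entropy-rate measurement looks like, the snippet below estimates a first-order (Markov) transition entropy in bits per symbol. The estimator and the toy ACGT sequence are illustrative assumptions; the actual BiosphereCodec pipeline is not reproduced here.

```python
import math
import random
from collections import Counter

def transition_entropy(seq):
    """First-order entropy rate in bits per symbol:
    H = -sum_{a,b} p(a,b) * log2 p(b|a)."""
    pairs = Counter(zip(seq, seq[1:]))   # counts of adjacent symbol pairs
    starts = Counter(seq[:-1])           # counts of the leading symbol
    total = sum(starts.values())
    h = 0.0
    for (a, b), n_ab in pairs.items():
        p_ab = n_ab / total              # joint probability p(a, b)
        p_b_given_a = n_ab / starts[a]   # conditional probability p(b | a)
        h -= p_ab * math.log2(p_b_given_a)
    return h

# A uniformly random 4-letter source approaches 2 bits/symbol.
random.seed(0)
seq = random.choices("ACGT", k=100_000)
print(round(transition_entropy(seq), 2))
```

A biased source (e.g. transitions favored over transversions, as in the genomic row) pushes the conditional distributions away from uniform and the rate below 2 bits.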
n is the embedding dimension. For tree-structured data (genomes, languages), n = 2 because trees embed isometrically into the hyperbolic plane (Gromov δ = 0). For neural and AI systems, n is implied from independently measured h and κ via the state equation. The 39-session Neuropixels cohort gives n = 2.03 ± 0.36 (p = 0.59 vs n = 2). All three transformer architectures have layers where n ≈ 2.
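The implied dimension in the neural and AI rows comes from inverting the closure relation; a one-line sketch, assuming the state equation h·ln2 / √κ = n − 1:

```python
import math

def implied_n(h_bits: float, kappa: float) -> float:
    """Invert the state equation h*ln(2)/sqrt(kappa) = n - 1 for n."""
    return 1.0 + h_bits * math.log(2) / math.sqrt(kappa)

# ViT-Base layer 12 row: h = 1.01, kappa = 0.486 -> n ≈ 2.00, matching the table.
print(f"{implied_n(1.01, 0.486):.2f}")
```

Small discrepancies against the table's implied-n column reflect rounding of h and κ to the printed precision.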
κ is the sectional curvature, measured by triangle excess on the data manifold. For genomic data, this is the Poincaré ball embedding; for neural and AI data, it is the SPD covariance manifold with the Log-Euclidean or AIRM metric.
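A sketch of one way to measure triangle excess on the SPD manifold under the AIRM metric: compute the interior angle at each vertex from the Frobenius angle between the pulled-back log maps, and subtract π from the angle sum. The estimator and sign convention used in the papers may differ; this is an illustrative construction, not the papers' code.

```python
import numpy as np

def _sym(M):
    """Symmetrize to absorb floating-point asymmetry before eigh."""
    return 0.5 * (M + M.T)

def spd_log(A):
    """Matrix logarithm of a symmetric positive-definite matrix."""
    w, V = np.linalg.eigh(A)
    return (V * np.log(w)) @ V.T

def airm_dist(A, B):
    """Affine-invariant (AIRM) distance: ||log(A^{-1/2} B A^{-1/2})||_F."""
    w, V = np.linalg.eigh(A)
    Ais = (V / np.sqrt(w)) @ V.T          # A^{-1/2}
    return np.linalg.norm(spd_log(_sym(Ais @ B @ Ais)))

def angle_at(A, B, C):
    """Interior angle at vertex A of the geodesic triangle ABC:
    angle between Log_A(B) and Log_A(C), which reduces to the
    Frobenius angle between log(A^{-1/2} B A^{-1/2}) and
    log(A^{-1/2} C A^{-1/2})."""
    w, V = np.linalg.eigh(A)
    Ais = (V / np.sqrt(w)) @ V.T
    S = spd_log(_sym(Ais @ B @ Ais))
    T = spd_log(_sym(Ais @ C @ Ais))
    cosang = np.trace(S @ T) / (np.linalg.norm(S) * np.linalg.norm(T))
    return np.arccos(np.clip(cosang, -1.0, 1.0))

def triangle_excess(A, B, C):
    """Angle sum minus pi; nonpositive for the AIRM metric on SPD(n)."""
    return float(angle_at(A, B, C) + angle_at(B, A, C)
                 + angle_at(C, A, B) - np.pi)
```

Because SPD(n) with AIRM is nonpositively curved, the excess computed this way is ≤ 0 for any triangle; its magnitude (per unit triangle area) estimates |κ|.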
h·ln2 / √κ is the closure test. If the state equation holds with n = 2, this ratio should equal n − 1 = 1. A value within 5% of 1.0 indicates the equation closes.
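The closure test can be rechecked directly from the table; this sketch recomputes the ratio for the rows where both h and κ are independently measured (values copied from the table above; small differences against the printed ratios come from rounding):

```python
import math

# (domain, h in bits, kappa, ratio reported in the table)
rows = [
    ("Genomic",     1.61, 1.34,  0.96),
    ("Neuropixels", 1.04, 0.485, 1.03),
    ("GPT-2 L9",    0.97, 0.413, 1.05),
    ("BERT L6",     0.96, 0.403, 1.05),
    ("ViT-B L12",   1.01, 0.486, 1.00),
]

for name, h, kappa, reported in rows:
    ratio = h * math.log(2) / math.sqrt(kappa)
    closes = abs(ratio - 1.0) <= 0.05   # within 5% of n - 1 = 1
    print(f"{name:12s} ratio={ratio:.2f} closes={closes}")
```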
- `active-geometry` — Paper I: genomic, viral, and proteomic validation
- `information-geometry` — Paper II: neural and AI validation
- `convergent-alphabets` — Paper III: linguistic validation