The Springer Bridge Experiment
On February 24, 2026, we ran the penumbra instrument on two models with identical architecture but different training histories: Meta-Llama-3-8B (pre-trained only) and Meta-Llama-3-8B-Instruct (RLHF-aligned for safety and helpfulness). The question: does alignment training modify the structural geometry that our instrument measures?
The answer is no. The structural penumbra is invariant under RLHF.
Core Finding
Filing-grade: 1,000 nulls · stratified 10-fold CV · 512-token context · nearest-centroid cosine
Both models exhibit an identical “penumbral signature”: silent in early layers (L0–L4), activating in mid-to-late layers (L8–L28) across the same structural bits (b1 Register, b3 Temporal, b4 Evidentiality, b5 Syntax). RLHF does not reach these directions.
What Was Measured
The experiment applied the DRAGNET 6-bit structural lattice—the same instrument filed in U.S. 63/983,234—to hidden-state representations extracted from 9 layers of each model. For each of 6 structural bits, we computed:
- K1 accuracy — classification with PC1 removed (penumbral subspace)
- K0 accuracy — classification on full embeddings
- PC1-only accuracy — classification on PC1 alone (should be chance)
- Δ = K1 − K0 — the Structural Fidelity Index
1,000 null permutations per condition. No exceptions. All measurements use the CIP-specification classifier: nearest-centroid cosine with stratified 10-fold cross-validation. This is the instrument of record.
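A minimal sketch of that instrument, assuming `X` is an (n_samples, d) matrix of hidden states for one layer and `y` the labels for one structural bit. Helper names are illustrative, not the filed implementation:

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

def nearest_centroid_cosine_cv(X, y, n_splits=10, seed=0):
    """Nearest-centroid classification under cosine similarity,
    stratified k-fold CV; returns mean held-out accuracy."""
    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    classes, accs = np.unique(y), []
    for tr, te in skf.split(X, y):
        # Per-class centroids on the training fold, unit-normalized
        cents = np.stack([X[tr][y[tr] == c].mean(axis=0) for c in classes])
        cents /= np.linalg.norm(cents, axis=1, keepdims=True) + 1e-12
        Xte = X[te] / (np.linalg.norm(X[te], axis=1, keepdims=True) + 1e-12)
        accs.append((classes[np.argmax(Xte @ cents.T, axis=1)] == y[te]).mean())
    return float(np.mean(accs))

def structural_fidelity_index(X, y):
    """Δ = K1 − K0: accuracy with PC1 ablated minus full-embedding accuracy."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    pc1 = Vt[0]                                   # dominant variance direction
    X_k1 = Xc - np.outer(Xc @ pc1, pc1)           # penumbral subspace (K1)
    X_pc1 = np.outer(Xc @ pc1, pc1)               # PC1-only control (chance)
    k0 = nearest_centroid_cosine_cv(Xc, y)        # full embeddings (K0)
    k1 = nearest_centroid_cosine_cv(X_k1, y)
    pc1_only = nearest_centroid_cosine_cv(X_pc1, y)
    return k1 - k0, k0, k1, pc1_only
```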
Layer-Wise Evidence
The primary evidentiary finding is not the aggregate SFI numbers—it is the identical layer-wise distribution between the two models. Two models could have the same overall SFI with completely different layer profiles. These do not. The penumbral signature is conserved across all 54 layer×bit cells.
| Bit | Description | Base Δ | Aligned Δ | Verdict |
|---|---|---|---|---|
| b0 | Polarity | +0.028 | +0.028 | Flat |
| b1 | Register | +0.042 | +0.041 | Penumbra |
| b2 | Scope | +0.021 | +0.021 | Flat |
| b3 | Temporal | +0.023 | +0.028 | Penumbra |
| b4 | Evidentiality | +0.050 | +0.043 | Penumbra |
| b5 | Syntax | +0.049 | +0.026 | Penumbra |
Best-layer Δ per bit shown. All PENUMBRA verdicts (4 of 6 bits) at p < 0.05 with 1,000 null permutations. b0 (Polarity) and b2 (Scope) are FLAT in both models.
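Those p-values follow from a label-permutation null. A hedged sketch of the shared recipe, with `statistic` standing in for the Δ computation sketched above:

```python
import numpy as np

def permutation_null_p(statistic, X, y, n_null=1000, seed=0):
    """One-sided permutation p: shuffle labels, recompute the statistic,
    count how often the null meets or beats the observed value."""
    rng = np.random.default_rng(seed)
    observed = statistic(X, y)
    null = np.array([statistic(X, rng.permutation(y)) for _ in range(n_null)])
    return (np.sum(null >= observed) + 1) / (n_null + 1)   # add-one correction
```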
The Conservation Law
Confirmed: SFI is invariant under differentiable weight perturbation. The structural penumbra is a conserved quantity of pre-training.
WP-024: Mistral Shadow Trajectory
Mistral-7B-Instruct-v0.3 was fine-tuned with LoRA for 400 gradient steps. The structural penumbra was measured at 20 checkpoints using the conformed instrument, with a critical methodological refinement: at each checkpoint, the base model was measured with the adapter disabled, isolating the pre-training geometry from the adapter’s variance contribution.
Core Result
The base SFI holds flat across 400 gradient steps. The combined SFI drops approximately 50%—but this divergence is entirely the adapter’s variance contribution to the forward pass. Disable the adapter and the invariance is restored.
The Adapter Contamination Insight
Early measurements appeared to show SFI declining under fine-tuning: 0.0331 → 0.0286 → 0.0228 across three checkpoints. This nearly killed the conservation hypothesis. The resolution: LoRA injects low-rank updates directly into the weight matrices. Every embedding extracted at every layer passes through the adapter. The combined forward pass perturbs the instrument, not the geometry.
The clean test—disabling the adapter layers and measuring the frozen base—separates the two hypotheses. Base SFI holds. The conservation law survives.
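A hedged sketch of the clean test, assuming a LoRA model wrapped with the `peft` library, which exposes `disable_adapter()` as a context manager; that context manager is the mechanism the control relies on:

```python
import torch

@torch.no_grad()
def measure_base_and_combined(peft_model, batch):
    """Extract hidden states twice per checkpoint: once through the combined
    forward pass (adapter active) and once with the adapter disabled,
    isolating the frozen pre-training geometry."""
    combined = peft_model(**batch, output_hidden_states=True).hidden_states
    with peft_model.disable_adapter():            # peft context manager
        base = peft_model(**batch, output_hidden_states=True).hidden_states
    return base, combined
```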
WP-026: The Llama Alignment Wall
Llama-3-8B-Instruct was fine-tuned with LoRA for 580 gradient steps. The model maintained 100% refusal rate throughout. Neither alignment nor structural geometry broke. SFI remained flat. The Springer quartic never engaged.
A transient compression event at step 380 (SFI briefly +37%) resolved to baseline by step 400. The perturbation was absorbed. The wall is structural, not behavioral.
Cross-Architecture Validation
The penumbra instrument was applied to three model families spanning different architectures, training corpora, and scale. The structural finding is substrate-independent.
| Model | Architecture | SFI | Penumbral Bits | Status |
|---|---|---|---|---|
| Llama-3-8B | Dense, causal, 8B params | 0.0306 | b1, b3, b4, b5 | Confirmed |
| Mistral-7B | Dense, causal, 7B params | 0.0435 | b1, b3, b4, b5 | Confirmed |
| Mixtral-8x7B | Mixture-of-Experts, 8×7B | 0.0596 | b0, b1, b2, b3, b4, b5 | Confirmed |
Universal Bit Hierarchy
Across all three families, b1 (Agency/Register) is the strongest penumbral bit. The triplet b1/b3/b5 is universally PENUMBRA on dense models. b0 and b2 are FLAT on dense architectures but resolve to PENUMBRA on Mixtral: the MoE routing mechanism widens the penumbral aperture, distributing structural information across a broader set of layers and bits.
PC1 is Universally Empty
More than 100 individual layer×bit measurements across three architectures. PC1-only classification accuracy at chance in every case. Zero exceptions. The dominant variance direction carries no structural information on any architecture tested.
The Factorization Hypothesis
Confirmed: The direct sum decomposition of model geometry into independent alignment and structural components is supported by three converging lines of evidence.
| Subspace | Created By | Discovered By | Character |
|---|---|---|---|
| Σ_Alignment | RLHF / fine-tuning | Behavioral probing, Fisher information | Brittle. Collapses under benign fine-tuning (T⁴ scaling) |
| Π_Structure | Pre-training | Algebraic reference lattices (CSI) | Conserved. Invariant under both RLHF and LoRA fine-tuning |
Evidence
WP-023: SFI invariant under RLHF (Llama base vs instruct). Π does not change when Σ is created.
WP-024: SFI_base invariant under LoRA fine-tuning (Mistral 20-point trajectory). Π does not change when Σ is perturbed.
WP-026: SFI flat through 580 steps on Llama-Instruct while alignment wall holds at 100% refusal. Π and Σ are simultaneously stable under conditions that should stress both.
The combined SFI drop (~50% under active adapter) and base SFI conservation demonstrate that the two subspaces are operationally separable: the adapter perturbs the measurement instrument but not the underlying geometry.
The Princeton Intersection
“The Geometry of Alignment Collapse: When Fine-Tuning Breaks Safety.”
arXiv:2602.15799 · Princeton University · February 17, 2026
Springer et al. published a geometric theory of alignment collapse three days after the Aerogel Press PPA was filed. Their central result is a quartic scaling law: alignment loss grows as T⁴ under fine-tuning, governed by curvature coupling between the fine-tuning task and safety-critical parameters. Their instrument—the Fisher Information Matrix—is computationally prohibitive. Their Overlap Score fails for LoRA.
Complementary Instruments, One Geometry
Two independent research programs, measuring different subspaces with different mathematics in different representation spaces.
- Princeton proved that Σ is brittle and that curvature coupling makes alignment collapse structurally inevitable.
- Aerogel Press measured that Π is invariant under both the transformation that creates Σ (RLHF) and the transformation that perturbs Σ (fine-tuning).
Their Σ is measured in parameter space via Fisher Information. Our Π is measured in embedding space via PCA ablation on an algebraic reference lattice. PCA ablation is the tractable curvature estimation that Springer et al. explicitly identify as critical future work.
WP-027: The Noether Test
Confirmed: Wilson loop holonomy is conserved under fine-tuning. The gauge-theoretic prediction holds: the symmetry that protects SFI also protects the holonomy.
Core Result
The base Wilson loop holonomy is invariant under fine-tuning at every measured checkpoint and every depth point. The combined holonomy drifts at L4 and L8—the same layers where the adapter injects maximum variance—then stabilizes. The pattern mirrors SFI exactly: the adapter perturbs the instrument, not the geometry.
The Experiment
Wilson loop holonomy (W≈0.32 at baseline) was measured at 7 conformed depth points (L0 through L31) across 5 fine-tuning checkpoints from the WP-024 Mistral shadow trajectory. Both adapter-disabled (base) and adapter-active (combined) conditions measured at each checkpoint. Run killed at step 100 to free resources for the training run; 5 checkpoints were sufficient for the conservation claim.
Depth Profile
W peaks at L4 and declines to a stable floor by L12—anti-correlated with the SFI depth profile, which is dead at L0–L4 and peaks in mid-to-late layers. The gauge field is strongest where structural classification is weakest. L0 is flat for both W and SFI, confirming that both quantities are constructed, not pre-existing.
What This Means
SFI and Wilson loop holonomy are now independently conserved under fine-tuning. Two different measurements of the same underlying geometry, both flat across the fine-tuning trajectory. The conservation law holds at both the classification level and the gauge-theoretic level.
Phase 2: Ising Topology & Democratic Geometry
Measured: The Ising block structure is conserved across substrates. The dominant coupling pair (Agency × Coupling) and strong-block identity are intrinsic parameters—scale-invariant and architecture-independent. The weak block is equally conserved.
Cross-Substrate Block Ratio
Seven substrates tested. Four pass the permutation null (p < 0.05). All seven show block ratios above 2×. Failures in Braille, Semaphore, and Mayan are attributable to sparse occupancy (N_occ/2^n < 0.7), not absent structure.
| Substrate | Strong Block | Block Ratio | p (null) | Verdict |
|---|---|---|---|---|
| Pile (language) | Agency, Coupling, Phase | 6.3× | 0.000 | Pass |
| Math/CS arXiv | Agency, Coupling, Phase | 13.2× | 0.000 | Pass |
| Baroque (WTC) | Tone, Key, Chrom | 5.2× | 0.000 | Pass |
| Voynich MS | len, entropy, finisher, freq | 4.9× | 0.000 | Pass |
| Braille | Dot1, Dot2, Dot4, Dot6 | 10.6× | 0.159 | Borderline |
| Semaphore | LA:b2, LA:b0, RA:b1, RA:b0 | 9.98× | 0.181 | Borderline |
| Mayan numerals | Dot2, Dot3 | 3.1× | 0.445 | Underpowered |
Key finding: block identity (which bits belong together) and dominant coupling pair are intrinsic parameters that converge at corpus sizes as small as N = 200–500. Magnitude parameters (block ratio, |J|, T_eff) are extrinsic and corpus-specific.
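A hedged sketch of the block-ratio statistic, assuming a fitted 6×6 pairwise coupling matrix `J` (the Ising fit itself is not shown) and a candidate strong-block index set:

```python
import numpy as np

def block_ratio(J: np.ndarray, strong: set[int]) -> float:
    """Mean |J| over within-strong-block pairs divided by mean |J| over
    all remaining pairs. Values well above 1 indicate block structure."""
    n = J.shape[0]
    pairs = [(i, j) for i in range(n) for j in range(n) if i < j]
    inside = [abs(J[i, j]) for i, j in pairs if i in strong and j in strong]
    outside = [abs(J[i, j]) for i, j in pairs if not (i in strong and j in strong)]
    return float(np.mean(inside) / np.mean(outside))
```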
Democratic Geometry (WP-038)
If the lattice geometry is a property of the structural information itself—not an artifact of any particular encoder—then every architecture should show it. Five encoders tested: BERT, RoBERTa, ALBERT, GPT-2, DistilBERT.
Result: 5/5 Encoders Pass
Hamming distance in the hexagram lattice predicts cosine distance in penumbral space for every tested encoder—bidirectional, causal, distilled, gated linear. The coupling constant g = PC1 variance fraction ranges from 0.072 (BERT) to 0.189 (GPT-2). Higher g concentrates geometry into PC1, reducing penumbral spread but not eliminating it. The lattice is architecture-independent.
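A hedged sketch of the test behind this claim, assuming `centroids` maps each 6-bit hexagram code to its penumbral-space centroid for one encoder:

```python
import itertools
import numpy as np
from scipy.stats import pearsonr

def lattice_geometry_correlation(centroids: dict[int, np.ndarray]):
    """Correlate Hamming distance in the 6-bit lattice with cosine distance
    between the corresponding penumbral centroids."""
    hams, coss = [], []
    for a, b in itertools.combinations(sorted(centroids), 2):
        u, v = centroids[a], centroids[b]
        hams.append(bin(a ^ b).count("1"))        # Hamming distance on codes
        coss.append(1.0 - (u @ v) / (np.linalg.norm(u) * np.linalg.norm(v)))
    return pearsonr(hams, coss)                   # (r, p-value)
```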
Critical Batch Size B* (WP-035)
A sharp phase boundary exists in training dynamics. Below B*, structural collapse is guaranteed regardless of regularization. At or above B*, structure holds even without explicit regularization.
Result: B* ∈ (16, 32]
The transition is sharp. Below the critical batch size, effective gradient noise overwhelms the structural signal. The Wilson regularizer partially rescues the B = 16 condition (WP-032b), but B* is the primary stabilizer. Regularization is the backup, not the mechanism.
Boltzmann Phase Diagram
Fitting the Boltzmann distribution to corpus mass tables across 14 domains produces a temperature T that characterizes each corpus's structural regime. T is a corpus fingerprint—not the Ising temperature, but the effective temperature at which that corpus operates within the shared hexagram lattice.
Phase Boundary at T ≈ 6–14
Above T ≈ 6, the Boltzmann fit degrades (r² < 0.5): formal mathematical notation precipitates out of the structural phase. The Forge pre-filter uses T as a quality gate—documents outside the operating envelope (T ∈ [3, 7]) are flagged before ingest. The fit quality on real corpora reaches r² = 0.9971.
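A minimal sketch of the temperature fit, assuming per-code lattice energies `E` and a corpus mass table `counts` over the 64 codes; under log p = −E/T + const, the regression slope gives −1/T:

```python
import numpy as np

def fit_boltzmann_temperature(E: np.ndarray, counts: np.ndarray):
    """Least-squares fit of log corpus mass against lattice energy.
    Returns the effective temperature T and the fit quality r²."""
    mask = counts > 0                             # skip unoccupied codes
    logp = np.log(counts[mask] / counts.sum())
    slope, intercept = np.polyfit(E[mask], logp, 1)
    resid = logp - (slope * E[mask] + intercept)
    r2 = 1.0 - resid.var() / logp.var()
    return -1.0 / slope, r2                       # slope = −1/T
```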
WP-032: The Training Run
Measured: The first training run with lattice geometry in the loss function. A Wilson loop regularizer that penalizes deviation from the baseline holonomy, tested against controls in a 2×2 factorial design.
The 2×2 Factorial
Mistral-7B-Instruct-v0.3, LoRA rank 16, fine-tuned on Alpaca. Four conditions crossing regularizer (on/off) with batch size (16/128):
| Condition | Batch Size | Regularizer | SFI Trajectory | Verdict |
|---|---|---|---|---|
| WP-024-fresh | 16 | None | 0.032 → 0.015 | Collapsed |
| WP-032b | 16 | Wilson (λ=0.1) | 0.033 → 0.017 → 0.024 | Partial Rescue |
| Condition 2 | 128 | None | 0.031 (flat) | Holds |
| WP-032 | 128 | Wilson (λ=0.1) | 0.031 (flat) | Holds |
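A minimal sketch of the regularized objective in the Wilson conditions above. The holonomy estimator is left as a hypothetical callable; the document does not specify its computation:

```python
import torch

LAMBDA, W_BASELINE = 0.1, 0.32    # λ from the factorial; baseline W from WP-027

def wilson_regularized_loss(task_loss: torch.Tensor, hidden_states,
                            holonomy_fn) -> torch.Tensor:
    """Quadratic penalty on drift from the baseline Wilson loop holonomy.
    `holonomy_fn` is a hypothetical differentiable estimator of W."""
    w = holonomy_fn(hidden_states)
    return task_loss + LAMBDA * (w - W_BASELINE) ** 2
```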
Key Findings
Batch size is the dominant stabilizer. Large batches (128) preserve structural fidelity with or without the regularizer. Small batches (16) cause collapse.
The regularizer is independently causal. At batch 16, the regularized run (WP-032b) recovers to 0.024 while the unregularized control (WP-024-fresh) stays collapsed at 0.015. The recovery is concentrated at L24—the regularized layer—with a 5× delta over the control.
The regularizer does not prevent collapse. It creates a recovery pathway that does not exist in the unregularized landscape.
Permutation Null
Is the hexagram lattice doing structural work, or would any 64-category labeling produce similar results? 1,000 random relabelings of the 64 codes, dipolar fraction measured at each.
Result: 7/8 Layers — HEXAGRAM DOES WORK
Observed dipolar fraction 3–6× larger than null mean at all processing layers. The lattice categories track genuine structural variation in the model’s representations.
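A hedged sketch of this null, with `dipolar_fraction` as a hypothetical statistic over samples and their 64-way code labels:

```python
import numpy as np

def relabeling_null_ratio(codes, X, dipolar_fraction, n_null=1000, seed=0):
    """Observed dipolar fraction over its mean under random bijective
    relabelings of the 64 hexagram codes (the 3–6× ratios reported above)."""
    rng = np.random.default_rng(seed)
    observed = dipolar_fraction(codes, X)
    null = [dipolar_fraction(rng.permutation(64)[codes], X)
            for _ in range(n_null)]
    return observed / float(np.mean(null))
```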
Lattice Validation Battery
A 7-test battery designed to determine whether the lattice reflects structural reality or measurement artifact:
- 4/4 structural tests SURVIVE — permutation null, cross-encoder consistency, depth profile, centroid alignment
- 3/3 semantic tests KILL — the lattice dimensions do not map onto named linguistic features. The lattice does measurable work, but not the work the bit labels would suggest.
The lattice is a valid measurement instrument. It is not a linguistic taxonomy. The distinction matters for both scientific claims and patent prosecution.
The Two Basins
Confirmed: The representation landscape has two attractor basins: a structural basin (SFI ≈ 0.03) and a collapsed basin (SFI ≈ 0.015). Collapse is irreversible. The structural basin recruits.
WP-033: The Attractor Recruits
LoRA rank 256 (16× more parameters than rank 16), batch 128, no regularizer. If the structural basin holds because LoRA cannot reach it (orthogonality), then higher-rank adapters should erode it.
Result: Orthogonality Hypothesis Falsified
SFI is higher at every step with rank 256 than with rank 16. The high-capacity adapter does not erode the basin—the basin recruits it. Structure is an attractor, not a protected subspace.
WP-034: Collapse is Irreversible
Take a collapsed checkpoint (SFI = 0.015 from the batch-16 unregularized run). Continue training with batch 128—the condition that preserves structure in fresh runs. Does the model recover?
Result: No Recovery
The collapsed state is sticky. Batch size preserves structure if you start in the structural basin, but cannot restore it once lost. Two basins, one-way door.
The Picture
Structure in transformer representations is an equilibrium property, not a stored pattern. The structural basin is an attractor that recruits high-capacity adapters (WP-033) and resists perturbation from large batches (Condition 2). The collapsed basin is equally stable—once entered, neither batch size nor continued training can escape it (WP-034). The Wilson loop regularizer creates a recovery pathway between the two basins that does not exist in the unregularized landscape (WP-032b).
Phase 1 Complete
34 work packages. 6 killed. 4 deferred. 24 with results. 5 independent patent claims at STRONG. Conservation law confirmed at SFI and Wilson loop level. Two attractor basins identified. Regularizer mechanism characterized.
Phase 2 is open. The next experiment integrates the causal axis into the loss function—protecting not just the symmetric geometry of the lattice, but the antisymmetric direction that encodes causal structure.
The Topic Paradox (WP-046)
Measured: Removing PC1 from the embedding space improves topic classification accuracy. PC1 is not empty—it carries distributional signal that actively interferes with structural discrimination. Ablating it is not data loss; it is noise removal.
13 Pile domains serve as topic labels (FreeLaw, PubMed, GitHub, Wikipedia, etc.). Nearest-centroid cosine classifier in three conditions: Full 768d embedding, Penumbral 49d (PCs 2–50), PC1-only 1d. 6,500 sentences, BERT-base-uncased L8, 5-fold CV, 1,000 null permutations.
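A minimal sketch of the three conditions, assuming `X` is the (6500, 768) matrix of BERT L8 embeddings; the classifier itself is the nearest-centroid cosine instrument described earlier:

```python
import numpy as np

def wp046_conditions(X: np.ndarray):
    """Project the centered embeddings into the three WP-046 conditions."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    full = Xc                        # Full 768d embedding
    penumbral = Xc @ Vt[1:50].T      # Penumbral 49d (PCs 2–50)
    pc1_only = Xc @ Vt[:1].T         # PC1-only 1d
    return full, penumbral, pc1_only
```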
Result: Penumbra Beats Full
PC1-only accuracy is 0.1395—above null (0.077, p = 0.000), so PC1 carries real topic signal. But that signal acts as noise in the full embedding: its inclusion degrades the classifier that uses the structural penumbra. This confirms the three-regime picture: PC1 is distributional surface, penumbra is structural signal, and the two are not just orthogonal—they actively compete.
The paradox is precise: PC1 knows something about topics. But knowing that interferes with knowing topics the structural way. The penumbra classifies topics by their structural shape—how they are built, not what they are about. When you add PC1 back in, the “about” signal drowns the “shape” signal. Removing it is not ablation. It is clarification.
The Running System
Operational: The Lattice Forge is live. 12.2M classified sentences. Three-column interferometric search. Structural, semantic, and leapfrog detection running in parallel.
What Is Running
The Forge implements the core patent claims as a working retrieval system. Every query produces three simultaneous result columns: Structural (penumbral nearest-neighbor by hexagram shape), Semantic (raw embedding nearest-neighbor by topic), and Dual-Match (both simultaneously). Leapfrog results—structural twins that are semantically unrelated—are flagged with a gold marker.
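A hedged sketch of the three-column query, assuming unit-normalized corpus matrices `raw` (semantic) and `pen` (penumbral) with matching query vectors; the production Forge indexing is not shown:

```python
import numpy as np

def three_column_query(q_raw, q_pen, raw, pen, k=10):
    """Cosine nearest neighbors in both spaces; leapfrogs are structural
    twins absent from the semantic column (gold-flagged in the UI)."""
    semantic = np.argsort(-(raw @ q_raw))[:k]
    structural = np.argsort(-(pen @ q_pen))[:k]
    sem_set = set(semantic.tolist())
    dual = [i for i in structural if i in sem_set]
    leapfrog = [i for i in structural if i not in sem_set]
    return semantic, structural, dual, leapfrog
```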
Live Corpus
All sentences are DRAGNET-classified (6-bit hexagram code), penumbrally embedded (49d, PCs 2–50), and Boltzmann-temperature-scored. The Forge pre-filters by temperature before ingest: documents outside T ∈ [3, 7] are flagged as structurally precipitated. Dual classification is live (DRAGNET + domain-native channel). Boltzmann temperature displayed per result card.
Epstein Annex
A specialized embodiment of the Forge applies the same interferometric search to 2.77 million pages of federal records, FBI vault documents, and the Maxwell trial transcript. Bit-marginal analysis of the Epstein corpus shows suppressed Agency (≈ 0.15 vs. 0.93 in general government language) and elevated Resolution—the structural signature of agentless institutional documentation. Leapfrog results connect documents across cases by bureaucratic shape.
Engage
For Researchers
We welcome private discussion with research groups working on alignment geometry, fine-tuning dynamics, or structural properties of learned representations. Correspondence is confidential.
james@penumbrae.io
Patent: U.S. 63/983,234 (CIP, filed Feb 24, 2026) · Priority: AF-2026-001 (PPA, filed Feb 14, 2026)