Not what text
is about.
How it is built.

Interferometric Search finds sentences that make the same logical move — regardless of subject, domain, or vocabulary. A line from a legal deposition and a sentence from a mathematics paper can be structural twins. Semantic search will never show you that. This one does.

What it looks like

Here is a real query run against 66,000 sentences drawn from academic literature, technical prose, and general language corpora.

Query
"When in the course of human events it becomes necessary for two people to break ties."
▲ Structural results
"In many situations one encounters an entity that resembles a monoid."
arxiv · math (leapfrog)
"When, to a primitive laxation a consecutive one succeeds, several days may concur in its production."
general corpus (leapfrog)
"The scientist often assumes that to a pupil a scientific fact or law is its own excuse for being."
general corpus
— Semantic results
"In the process where the evolution between systems is made in this way, a situation may occur where it is necessary to support both the communication systems before and after the evolution."
general corpus
"Especially, in a network of wireless communications, a certain period of time is needed to determine whether or not communications can be performed on the same network."
general corpus
The structural column returned a category theory abstract, a sentence about algebra, alongside sentences from entirely different domains. The semantic column returned sentences about communication systems and networks. The query was about human events and breaking ties; none of these results share its subject matter, and no topic-based search would ever surface them together. What they share is logical shape. The results marked leapfrog appear in the structural column but not the semantic column. That gap is the signal.

The question you are probably asking

A mathematics sentence and a sentence about human events are structural twins. How is that possible? What does "same logical move" actually mean? The answer is in what we removed.

Step 1 — The noise

Every modern language model encodes text as a high-dimensional vector. When you run principal component analysis on a large corpus of these vectors, the first principal component — PC1 — dominates. It captures the loudest signal in the data: surface meaning, topic, vocabulary, domain. "Mathematics" lands far from "legal brief" in PC1 space. That is useful for topic search. It is noise for structural search.
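The dominance of PC1 is easy to see even on synthetic data. A minimal sketch with NumPy, using a toy corpus where a single "topic" direction carries most of the energy; the corpus, dimensions, and function name are illustrative, not the production pipeline:

```python
import numpy as np

def pc1_share(embeddings: np.ndarray) -> float:
    """Fraction of total variance captured by the first principal component."""
    X = embeddings - embeddings.mean(axis=0)       # center the cloud
    s = np.linalg.svd(X, compute_uv=False)         # singular values, descending
    var = s ** 2                                   # component variances
    return float(var[0] / var.sum())

# Toy corpus: one dominant "topic" direction plus small isotropic noise,
# mimicking how topic/domain signal dominates PC1 in real embeddings.
rng = np.random.default_rng(0)
topic = rng.normal(size=384)
X = np.outer(rng.normal(size=1000), topic) + 0.1 * rng.normal(size=(1000, 384))
print(f"PC1 variance share: {pc1_share(X):.2f}")
```

On real embedding corpora the share is lower than on this toy, but PC1 still dwarfs every other component.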

Step 2 — The penumbral subspace

We remove PC1. What remains — principal components 2 through 50 — is the penumbral subspace. This subspace carries structural information: the shape of how a sentence is constructed, independent of what it is constructed about. A passive-stative construction about a legal obligation and a passive-stative construction about a topological invariant land in the same neighborhood of penumbral space. They should. They are the same move.

This is not a hypothesis. The effect has been measured across six embedding architectures, transformer and non-transformer alike. The structural signal survives model fine-tuning. It survives domain shift. It is a geometric property of the embedding, not a classifier artifact.

Step 3 — The hexagram lattice

We classify every sentence against a 64-cell reference lattice — the hexagram — derived from six binary properties of the root verb morphism: polarity, agency, coupling, phase, scope, and resolution. Every sentence in every formal domain has a structural address between 0 and 63. Structural search is retrieval by address.
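Addressing is then bit-packing. A sketch, assuming each of the six properties arrives as a boolean already decided by the classifier; the bit ordering here is illustrative:

```python
# The six binary properties named in the text. How each is detected from the
# root verb is the classifier's job; here they are given as booleans.
PROPERTIES = ("polarity", "agency", "coupling", "phase", "scope", "resolution")

def hexagram_address(bits: dict) -> int:
    """Pack six binary properties into a structural address 0..63."""
    addr = 0
    for i, name in enumerate(PROPERTIES):
        addr |= int(bool(bits[name])) << i
    return addr

print(hexagram_address(dict.fromkeys(PROPERTIES, False)))  # → 0
print(hexagram_address(dict.fromkeys(PROPERTIES, True)))   # → 63
```

Two sentences are lattice neighbors when their addresses differ in one bit, i.e. in exactly one property.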

The 64 cells are not arbitrary. The mass distribution of a 66,000-sentence corpus across these cells follows a Boltzmann distribution — R² = 0.996 at octupole order. The lattice is a real reference frame, not a labeling convention.
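Under a Boltzmann model, temperature is read off a linear fit of log occupancy against cell energy. A sketch on synthetic counts, assuming per-cell energies are supplied by the lattice model; `fit_temperature` and the energy ladder are illustrative, not the production fit:

```python
import numpy as np

def fit_temperature(counts: np.ndarray, energies: np.ndarray) -> float:
    """log n_i = -E_i / T + const  →  T is the negative inverse slope."""
    mask = counts > 0                                  # empty cells carry no fit info
    slope, _ = np.polyfit(energies[mask], np.log(counts[mask]), 1)
    return -1.0 / slope

# Synthetic check: generate counts from a known T and recover it.
E = np.linspace(0.0, 5.0, 64)          # one energy per hexagram cell (assumed given)
T_true = 4.0
counts = 1e5 * np.exp(-E / T_true)
print(f"recovered T = {fit_temperature(counts, E):.2f}")   # → recovered T = 4.00
```

On real corpora the fit quality (the R² quoted above) is itself a diagnostic, which is what the phase-boundary check below relies on.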

Step 4 — The dual channel

Every sentence carries two simultaneous signals: the grammatical chassis of the statement (read by DRAGNET, our morphism classifier) and the conceptual machinery of the domain (read by a domain-native classifier, where one exists). Running both produces a dual-channel structural fingerprint. The Epstein deposition and the legal brief from an unrelated case share a DRAGNET code. That is not coincidence. That is the same logical operation appearing in two different proceedings.
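Retrieval over the dual channel reduces to lookup on an address pair. A sketch with illustrative names and data; the real index, DRAGNET codes, and domain codes are not shown here:

```python
from collections import defaultdict

index = defaultdict(list)   # (dragnet_addr, domain_addr) -> sentence ids

def add(sentence_id, dragnet_addr, domain_addr=None):
    index[(dragnet_addr, domain_addr)].append(sentence_id)

def structural_twins(dragnet_addr, domain_addr=None):
    """Tightest match first: both channels agree. Fall back to DRAGNET alone."""
    exact = index.get((dragnet_addr, domain_addr), [])
    if exact:
        return exact
    return [sid for (d, _), sids in index.items() if d == dragnet_addr
            for sid in sids]

add("deposition-17", 42, "LEGAL-09")   # toy entries with made-up addresses
add("brief-03", 42, "LEGAL-09")
add("abstract-88", 42, "CT-12")
print(structural_twins(42, "LEGAL-09"))   # → ['deposition-17', 'brief-03']
```

The fallback path is what produces cross-domain results: same DRAGNET code, different (or absent) domain code.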

What is new here

I. PC1 ablation as structural signal
The first principal component of transformer embeddings is noise for structural classification. Its removal is the key that unlocks the penumbral subspace. Without ablation: SFI = 0. With ablation: SFI = 0.031, p < 0.001.

II. The hexagram as reference frame
A 64-cell lattice derived from verb morphism properties, validated against a 4.4-million-sentence corpus. The mass distribution is Boltzmann. The structure is real. It is the index.

III. Dual-channel fingerprint
Universal NL structure (DRAGNET) plus domain-native structure (CT, Legal, Medical) in the same query. Two independent structural addresses for every sentence. The intersection is the tightest structural twin.

The system knows its limits

Not all text is structurally indexable. The Boltzmann model has a phase boundary. Cold formal prose — mathematics abstracts, legal filings, scientific literature — sits well within the operating domain. Raw mathematical notation sits outside it. The system measures this before indexing and refuses to classify documents that would produce unreliable structural codes.
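The gate itself can be a pair of thresholds on the measured temperature and fit quality. A sketch; the cutoff values below are illustrative, not the system's actual limits:

```python
# Pre-indexing gate, assuming a measured temperature and Boltzmann-fit R²
# per document set. Both thresholds here are illustrative placeholders.
T_MAX = 6.0    # beyond this, the corpus sits outside the Boltzmann model
R2_MIN = 0.5   # below this, the fit itself is too unreliable to trust

def indexable(temperature: float, r2: float) -> bool:
    """Refuse to classify documents outside the model's operating domain."""
    return temperature <= T_MAX and r2 >= R2_MIN

print(indexable(3.14, 0.72))    # arXiv math: True
print(indexable(13.86, 0.05))   # raw LaTeX: False
```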

Domain                     Temperature   R²     Regime
arXiv math · CT            3.14          0.72   Cold
Legal (FreeLaw)            3.70          0.84   Cold
PubMed                     4.05          0.82   Cold
Pile (general)             4.15          0.86   Baseline
Wikipedia                  5.46          0.66   Warm
Raw mathematics (LaTeX)    13.86         0.05   Outside model
Temperature is not a quality score. It is a measurement of how much structural variety a corpus expresses. Cold domains use a narrow band of the hexagram. Warm domains use more. The leapfrog — finding a cold-domain result for a warm-domain query — is what makes cross-domain structural search interesting.
Try it yourself
66,000 sentences. Structural search, semantic search, dual-match.
Enter any sentence. See what makes the same logical move.
Open Lattice Forge →