Three searches. One query.
Lattice Forge runs three parallel searches on every query you enter.
One of them is what you already know. The other two are new.
The three columns
The right column: Semantic. Standard similarity search. It finds sentences
about the same topic as your query, the kind of results you get from any
modern search engine. "Bridge collapse" returns other text about bridges, infrastructure,
structural failure. This is your baseline. It is here so you can see what the other
two columns do differently.
The left column: Structural. This finds sentences that make the same logical move
as your query, regardless of topic. "Bridge collapse" might return a horse race,
a legal verdict, or a sentence from a mathematics paper, because all of them
share the same underlying grammatical shape: an agent pressing forward, a dynamic
process, a resolved outcome. The vocabulary is different. The structure is the same.
Results marked with a gold star (✦) are
leapfrogs: they appear here but not in the semantic column.
These are the cross-domain discoveries — sentences no keyword search would ever connect.
The middle column: Dual-Match. The tightest filter. These results match your query
on both the continuous structural geometry and the discrete hexagram code
(explained below). When two sentences agree on both measurements independently, they are
confirmed structural twins. Fewer results, higher confidence.
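The mechanics reduce to three retrievals over the same corpus. Below is a minimal
Python sketch, assuming precomputed semantic vectors, structural vectors, and
hexagram codes; the function and variable names are illustrative, not Lattice
Forge's actual internals.

import numpy as np

def nearest(query_vec, vectors, k=20):
    """Indices of the k rows of `vectors` most similar to `query_vec` (cosine)."""
    sims = vectors @ query_vec / (
        np.linalg.norm(vectors, axis=1) * np.linalg.norm(query_vec))
    return list(np.argsort(-sims)[:k])

def three_columns(q_sem, q_struct, q_code, sem_vecs, struct_vecs, codes, k=20):
    semantic = nearest(q_sem, sem_vecs, k)          # right column
    structural = nearest(q_struct, struct_vecs, k)  # left column
    # Middle column: structural neighbors whose discrete hexagram code
    # also matches the query's code exactly.
    dual = [i for i in structural if codes[i] == q_code]
    # Leapfrogs (✦): structural hits absent from the semantic column.
    leapfrogs = [i for i in structural if i not in semantic]
    return semantic, structural, dual, leapfrogs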
Reading a result card
Every result card shows you more than the text. Here is what the annotations mean.
What you see on each card
Source badge
Where the sentence came from: pile (general corpus),
CT (category theory), efta (government documents),
epstein (Epstein files), fbi_vault (FBI releases).
d = 0.34
Distance from your query in the search space. Lower = closer match.
Structural distances and semantic distances are not directly comparable —
they measure different things.
Six dots
The structural fingerprint. Six binary properties of the sentence,
read from its root verb. Lit = on. Dark = off. Hover to see which property.
D:38
The hexagram code — a number between 0 and 63. It is the
six dots packed into a single integer. Two sentences with the same code answered
the same way on all six structural questions.
T = 3.70
Boltzmann temperature of the source domain. Cold domains (T < 4)
use a narrow structural vocabulary — formal prose. Warm domains (T > 5)
use a wider range.
✦ Leapfrog
This result appears in the structural column but not in the semantic
column. It crossed the topic barrier. This is the signal.
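In data terms, a card could be a small record like the following. A sketch only;
the field names are illustrative, not the app's actual schema.

from dataclasses import dataclass

@dataclass
class ResultCard:
    text: str            # the matched sentence
    source: str          # badge: "pile", "CT", "efta", "epstein", "fbi_vault"
    distance: float      # d: distance from the query in this column's space
    fingerprint: tuple   # the six dots: six binary structural properties
    code: int            # D: the hexagram code, 0-63
    temperature: float   # T: Boltzmann temperature of the source domain
    leapfrog: bool       # ✦: in the structural column but not the semantic one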
The six structural dimensions
The structural search does not read meaning. It reads shape — six binary
properties extracted from the root verb of the sentence. These properties describe
what the sentence does, not what it is about.
Example: "The investigation revealed financial irregularities."
111101 = code 47
Growing polarity (something emerged). Active agent (the investigation acts).
Transitive coupling (it acts on something). Dynamic phase (a change occurred).
Local scope (a specific event, not a general condition). Resolved outcome
(indicative mood: it happened).
The six properties
Polarity
Is the verb accumulating or depleting? Built vs collapsed.
Agency
Is the subject an active agent or a passive recipient?
The committee reviewed vs It was reviewed.
Coupling
Does the verb take a direct object (transitive) or act alone (intransitive)?
Filed the motion vs The motion expired.
Phase
Does it describe a state or a change?
The account exists vs The account closed.
Scope
Is it a particular event or a general condition?
He flew on March 3 vs Flights occurred regularly.
Resolution
Is the outcome settled or open?
The court ruled (resolved) vs The court may rule (unresolved).
Pack those six answers into a 6-bit number and you get a structural address
between 0 and 63. Every sentence in the corpus has one. Every query you enter gets one.
The Structural column finds sentences that are nearby in the continuous structural
geometry from which these bits are read (the penumbral space, described below).
The Dual-Match column additionally requires the discrete code to match exactly.
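The packing itself is one line of bit arithmetic. A sketch, assuming the displayed
bit string lists the properties in the order above (polarity first) and that
polarity occupies the least significant bit; that convention reproduces the worked
example, where 111101 packs to 47.

PROPERTIES = ["polarity", "agency", "coupling", "phase", "scope", "resolution"]

def pack(bits):
    """Pack six binary answers, polarity first, into a code from 0 to 63.
    Assumes polarity is the least significant bit."""
    return sum(bit << i for i, bit in enumerate(bits))

def unpack(code):
    """Recover the six answers from a hexagram code."""
    return [(code >> i) & 1 for i in range(6)]

# The worked example: "The investigation revealed financial irregularities."
assert pack([1, 1, 1, 1, 0, 1]) == 47
assert unpack(47) == [1, 1, 1, 1, 0, 1]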
Why the results look surprising
Modern language models encode every sentence as a high-dimensional vector.
The first principal component of the embedding space, PC1, carries
topic, vocabulary, and domain. It is the loudest signal. It is what semantic search uses.
We remove it.
What remains — principal components 2 through 50 — is the
penumbral subspace. It carries the structural shape of the sentence,
independent of subject matter. A legal deposition and a mathematics abstract that make
the same logical move land in the same neighborhood of penumbral space.
They should. They are the same move.
That is why the Structural column returns results that look topically unrelated to your query.
They are structurally related. The topic signal was removed on purpose.
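In code, the removal is a single projection. A minimal sketch with NumPy, assuming
plain SVD on centered embeddings; the real pipeline's normalization details may differ.

import numpy as np

def penumbral(embeddings):
    """Project sentence embeddings onto principal components 2 through 50.
    PC1 (topic, vocabulary, domain) is discarded."""
    X = embeddings - embeddings.mean(axis=0)        # center the cloud
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return X @ Vt[1:50].T                           # keep PCs 2-50, drop PC1

Nearest neighbors in the 49-dimensional space this returns are what the
Structural column ranks.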
This is not a hypothesis. The structural signal has been measured
across six model architectures, including non-transformer designs.
The hexagram lattice's mass distribution across a 4.4-million-sentence corpus follows
a Boltzmann distribution (R² = 0.996). The structure is real.
The removal of PC1 is the key that makes it visible.
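One way such a fit could be set up, assuming the code energies come from global
corpus frequencies (E = -log p_global; the actual energy definition is not
specified here): fit each domain's log-probabilities against those energies.
The slope of the fit gives the domain's temperature.

import numpy as np

def domain_temperature(domain_counts, global_counts):
    """Fit log p_domain(code) = -E(code)/T + const over the 64 codes.
    Energies are assumed to come from global frequencies: E = -log p_global.
    Returns the fitted temperature T and the R^2 of the fit."""
    p_global = global_counts / global_counts.sum()
    p_domain = domain_counts / domain_counts.sum()
    seen = (domain_counts > 0) & (global_counts > 0)  # fit occupied codes only
    E = -np.log(p_global[seen])
    y = np.log(p_domain[seen])
    slope, intercept = np.polyfit(E, y, 1)            # slope = -1/T
    resid = y - (slope * E + intercept)
    r2 = 1 - (resid ** 2).sum() / ((y - y.mean()) ** 2).sum()
    return -1.0 / slope, r2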