Pragmatic Semiotics and Large Language Models
Sign Navigation vs. Information Transmission in LLMs
We examine how the Pragmatic Semiotics Model (PSM) framework applies to understanding the emergent communicative behaviour of large language models, arguing that LLMs navigate sign systems rather than transmit information.
Introduction
The emergence of large language models (LLMs) as dominant tools for text generation has prompted renewed interest in the underlying mechanisms of meaning-making. The dominant paradigm treats language generation as a probabilistic prediction task — yet this framing elides fundamental questions about how these systems participate in sign relations.
In this paper we apply the Pragmatic Semiotics Model (PSM) to analyse LLM communication. The PSM holds that communication is fundamentally navigational: a speaker selects among available sign vehicles to steer an interpreter toward a target interpretant, rather than encoding a pre-formed thought for decoding.[1]
We argue that LLMs, despite lacking intentional states, exhibit behaviour structurally consistent with the navigational account — and that this framing yields sharper predictions about failure modes and alignment challenges than the transmission model.
Sign Systems in LLMs
A sign, in Peircean terms, is a triadic relation: a sign vehicle (representamen) mediates between an object and an interpretant.[2] The token embeddings of an LLM approximate a compressed representation of the sign-vehicle space of a training corpus, while attention patterns encode contextual constraints on which interpretants are licensed.
The key insight is that next-token prediction selects among sign vehicles constrained by prior context — it is irreducibly semiotic in nature. The model navigates a high-dimensional sign space to produce a vehicle that, in context, tends to elicit appropriate interpretants in human readers.
This reframing dissolves several puzzles in LLM research. For example, apparent hallucinations are sign-selection failures — a plausible vehicle attached to the wrong object — rather than encoding errors.
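The "plausible vehicle, wrong object" diagnosis can be sketched as a small decision procedure (entirely illustrative; the scores, threshold, and names are hypothetical, not measurements any model exposes):

```python
from dataclasses import dataclass

@dataclass
class Generation:
    vehicle_plausibility: float  # fluency / contextual fit of the sign vehicle
    object_grounding: float      # agreement with the actual object (the facts)

def diagnose(g: Generation, threshold: float = 0.5) -> str:
    """Under the navigational account, a hallucination is a sign-selection
    failure: the vehicle is well-formed but points at the wrong object."""
    if g.vehicle_plausibility >= threshold and g.object_grounding < threshold:
        return "hallucination (sign-selection failure)"
    if g.vehicle_plausibility < threshold:
        return "encoding-level failure (ill-formed vehicle)"
    return "successful sign"

print(diagnose(Generation(vehicle_plausibility=0.9, object_grounding=0.1)))
```

The point of the sketch is that the two failure modes are separable: fluency metrics track the first coordinate, factuality checks track the second.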
The probability of selecting sign vehicle $s_i$ given context $c$ can be written:

$$P(s_i \mid c) = \frac{\exp(e_i^{\top} c / \tau)}{\sum_j \exp(e_j^{\top} c / \tau)}$$

where $e_i$ is the token embedding, $c$ is the context representation, and $\tau$ is the temperature parameter. At $\tau \to 0$ the model becomes deterministic (greedy decoding); at $\tau \to \infty$ it samples uniformly over the sign space.
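The two temperature limits can be checked with a toy softmax sampler (a minimal sketch using raw logits in place of the embedding–context dot products; the vocabulary and function names are hypothetical):

```python
import math
import random

def sample_sign_vehicle(logits, tau, rng=None):
    """Select a token (sign vehicle) index from temperature-scaled logits.

    Low tau sharpens the distribution toward greedy decoding; high tau
    flattens it toward uniform sampling over the sign space.
    """
    rng = rng or random.Random(0)
    scaled = [l / tau for l in logits]
    m = max(scaled)                          # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    r = rng.random()                         # inverse-CDF sampling
    acc = 0.0
    for i, p in enumerate(probs):
        acc += p
        if r <= acc:
            return i, probs
    return len(probs) - 1, probs

logits = [2.0, 1.0, 0.1]
i_cold, p_cold = sample_sign_vehicle(logits, tau=0.01)   # near-greedy
i_hot, p_hot = sample_sign_vehicle(logits, tau=100.0)    # near-uniform
```

At `tau=0.01` essentially all probability mass sits on the highest-scoring vehicle; at `tau=100.0` the three probabilities are nearly equal, as the limiting cases above predict.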
Implications for Alignment
If LLMs navigate sign systems rather than transmit information, alignment must be understood as semiotic alignment: ensuring that the sign vehicles selected by the model reliably elicit the intended interpretants in target interpreters.
Several practical implications follow. First, evaluation metrics that measure surface similarity (BLEU, ROUGE) conflate sign vehicles with objects and are therefore theoretically inadequate. Second, instruction-following failures may stem from sign-space divergence between training distribution and deployment context, not from capacity limitations.
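The first point can be made concrete with a toy unigram-overlap score (a crude stand-in for BLEU-1; the sentences and function are illustrative): a paraphrase that elicits the same interpretant scores poorly because its sign vehicles differ from the reference's.

```python
def unigram_overlap(candidate: str, reference: str) -> float:
    """Fraction of candidate tokens appearing in the reference:
    a vehicle-level similarity metric in the spirit of BLEU-1."""
    cand = candidate.lower().split()
    ref = set(reference.lower().split())
    return sum(1 for t in cand if t in ref) / len(cand)

reference  = "the cat sat on the mat"
paraphrase = "a feline rested atop the rug"   # same object, different vehicles
verbatim   = "the cat sat on the mat"

print(unigram_overlap(paraphrase, reference))  # 1/6: only "the" overlaps
print(unigram_overlap(verbatim, reference))    # 1.0
```

The metric rewards reuse of the reference's vehicles, not production of the reference's interpretant — which is exactly the conflation the transmission model encourages.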
Future work should develop semiotic benchmarks that measure interpretant-level outcomes rather than vehicle-level similarity.