Pragmatic Semiotics and Large Language Models
Sign Navigation vs. Information Transmission in LLMs
We examine how the Pragmatic Semiotics Model (PSM) framework applies to understanding the emergent communicative behaviour of large language models, arguing that LLMs navigate sign systems rather than transmit information.
Introduction
The emergence of large language models (LLMs) as dominant tools for text generation has prompted renewed interest in the underlying mechanisms of meaning-making. The dominant paradigm treats language generation as a probabilistic prediction task — yet this framing elides fundamental questions about how these systems participate in sign relations.
In this paper we apply the Pragmatic Semiotics Model (PSM) to analyse LLM communication. The PSM holds that communication is fundamentally navigational: a speaker selects among available sign vehicles to steer an interpreter toward a target interpretant, rather than encoding a pre-formed thought for decoding.[1]
We argue that LLMs, despite lacking intentional states, exhibit behaviour structurally consistent with the navigational account — and that this framing yields sharper predictions about failure modes and alignment challenges than the transmission model.
Sign Systems in LLMs
A sign, in Peircean terms, is a triadic relation: a sign vehicle (representamen) mediates between an object and an interpretant.[2] The token embeddings of an LLM approximate a compressed representation of the sign-vehicle space of a training corpus, while attention patterns encode contextual constraints on which interpretants are licensed.
The key insight is that next-token prediction selects among sign vehicles constrained by prior context — it is irreducibly semiotic in nature. The model navigates a high-dimensional sign space to produce a vehicle that, in context, tends to elicit appropriate interpretants in human readers.
This reframing dissolves several puzzles in LLM research. For example, apparent hallucinations are sign-selection failures — a plausible vehicle attached to the wrong object — rather than encoding errors.
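The "plausible vehicle, wrong object" diagnosis can be sketched as a small decision procedure (entirely illustrative; the scores, threshold, and names are hypothetical, not measurements any model exposes):

```python
from dataclasses import dataclass

@dataclass
class Generation:
    vehicle_plausibility: float  # fluency / contextual fit of the sign vehicle
    object_grounding: float      # agreement with the actual object (the facts)

def diagnose(g: Generation, threshold: float = 0.5) -> str:
    """Under the navigational account, a hallucination is a sign-selection
    failure: the vehicle is well-formed but points at the wrong object."""
    if g.vehicle_plausibility >= threshold and g.object_grounding < threshold:
        return "hallucination (sign-selection failure)"
    if g.vehicle_plausibility < threshold:
        return "encoding-level failure (ill-formed vehicle)"
    return "successful sign"

print(diagnose(Generation(vehicle_plausibility=0.9, object_grounding=0.1)))
```

The point of the sketch is that the two failure modes are separable: fluency metrics track the first coordinate, factuality checks track the second.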
The probability of selecting sign vehicle $s_i$ given context $c$ can be written:

$$P(s_i \mid c) = \frac{\exp(e_i^{\top} c / \tau)}{\sum_j \exp(e_j^{\top} c / \tau)}$$

where $e_i$ is the token embedding, $c$ is the context representation, and $\tau$ is the temperature parameter. At $\tau \to 0$ the model becomes deterministic (greedy decoding); at $\tau \to \infty$ it samples uniformly over the sign space.
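The two temperature limits can be checked with a toy softmax sampler (a minimal sketch using raw logits in place of the embedding–context dot products; the vocabulary and function names are hypothetical):

```python
import math
import random

def sample_sign_vehicle(logits, tau, rng=None):
    """Select a token (sign vehicle) index from temperature-scaled logits.

    Low tau sharpens the distribution toward greedy decoding; high tau
    flattens it toward uniform sampling over the sign space.
    """
    rng = rng or random.Random(0)
    scaled = [l / tau for l in logits]
    m = max(scaled)                          # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    r = rng.random()                         # inverse-CDF sampling
    acc = 0.0
    for i, p in enumerate(probs):
        acc += p
        if r <= acc:
            return i, probs
    return len(probs) - 1, probs

logits = [2.0, 1.0, 0.1]
i_cold, p_cold = sample_sign_vehicle(logits, tau=0.01)   # near-greedy
i_hot, p_hot = sample_sign_vehicle(logits, tau=100.0)    # near-uniform
```

At `tau=0.01` essentially all probability mass sits on the highest-scoring vehicle; at `tau=100.0` the three probabilities are nearly equal, as the limiting cases above predict.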
Implications for Alignment
If LLMs navigate sign systems rather than transmit information, alignment must be understood as semiotic alignment: ensuring that the sign vehicles selected by the model reliably elicit the intended interpretants in target interpreters.
Several practical implications follow. First, evaluation metrics that measure surface similarity (BLEU, ROUGE) conflate sign vehicles with objects and are therefore theoretically inadequate. Second, instruction-following failures may stem from sign-space divergence between training distribution and deployment context, not from capacity limitations.
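The first point can be made concrete with a toy unigram-overlap score (a crude stand-in for BLEU-1; the sentences and function are illustrative): a paraphrase that elicits the same interpretant scores poorly because its sign vehicles differ from the reference's.

```python
def unigram_overlap(candidate: str, reference: str) -> float:
    """Fraction of candidate tokens appearing in the reference:
    a vehicle-level similarity metric in the spirit of BLEU-1."""
    cand = candidate.lower().split()
    ref = set(reference.lower().split())
    return sum(1 for t in cand if t in ref) / len(cand)

reference  = "the cat sat on the mat"
paraphrase = "a feline rested atop the rug"   # same object, different vehicles
verbatim   = "the cat sat on the mat"

print(unigram_overlap(paraphrase, reference))  # 1/6: only "the" overlaps
print(unigram_overlap(verbatim, reference))    # 1.0
```

The metric rewards reuse of the reference's vehicles, not production of the reference's interpretant — which is exactly the conflation the transmission model encourages.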
Future work should develop semiotic benchmarks that measure interpretant-level outcomes rather than vehicle-level similarity.