Research note / interactive draft

Game Theory, Negotiation, and LLM Agents

Game theory is a tempting lens for agent-to-agent interaction: conflict, cooperation, bargaining, uncertainty, equilibrium. But with language models, the interesting question is not just whether they can imitate strategic behavior. It is whether their choices reveal something stable about how they encode value.

My working hypothesis is modest but sharp: in some prompt regimes, current models seem more sensitive to preference order than to payoff scale. That is weaker than claiming they have an ordinal utility function. It is also much more defensible.

Tags: game theory, negotiation, decision under uncertainty, LLM agents, interactive prototype

Use game theory as a descriptive language, not as an ontological shortcut.

Agent-to-agent interaction is naturally legible through game theory: incentives, private information, bargaining pressure, convergence, cheap talk, coordination failure. That part is real.

The overreach starts when we slide from “this behavior can be modeled game-theoretically” to “the model therefore has a stable utility function and behaves as a rational agent.” Prompted language models are too contingent, too policy-shaped, and too representation-sensitive for that leap to come cheap.

The strongest version of the claim is behavioral: LLM decisions may preserve ranking more robustly than magnitude.

“What looks like preference may be a mixture of heuristic numeracy, instruction-following, and socially learned bargaining style.”

“Rationality” is a dangerous word here.

A model can produce locally coherent strategic behavior without possessing anything like a persistent, well-identified utility function. In practice, at least four things get tangled together:

1. Task objective

What the prompt says the model should optimize.

2. Assistant policy

Helpfulness, politeness, harmlessness, compromise norms.

3. Representation effects

Dollars, points, percentages, resources, moral framing.

4. Post-hoc explanation

The reason given after the choice may not reflect the mechanism that produced it.

Ordinal vs cardinal utility, in one clean distinction.

Ordinal utility cares about order: A is preferred to B, B to C. Cardinal utility cares about intensity or scale: not just that A is better than B, but how much better.

Ordinal: U(A) > U(B) is meaningful. Cardinal: U(A) - U(B) also matters.

For a language model, the hard question is not whether it can talk about expected value. It is whether its choices actually move in a stable way when payoff gaps widen, formats change, or incentives are restated.
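A minimal sketch of why the distinction has behavioral bite. The outcome names, payoffs, and the lottery are hypothetical, chosen only for illustration. A strictly increasing transform (here, a square root) leaves the ranking of sure outcomes untouched, yet it can flip the choice between a sure thing and a gamble, because expected-utility comparisons depend on utility differences.

```python
import math

# Hypothetical utilities over three outcomes (illustrative numbers only).
raw_u = {"A": 100.0, "B": 36.0, "C": 0.0}

# A strictly increasing transform: same ordinal content, different cardinal content.
sqrt_u = {name: math.sqrt(value) for name, value in raw_u.items()}


def ranking(u):
    """Return outcome names sorted from most to least preferred."""
    return sorted(u, key=u.get, reverse=True)


def expected_utility(lottery, u):
    """lottery is a list of (probability, outcome_name) pairs."""
    return sum(p * u[name] for p, name in lottery)


sure_b = [(1.0, "B")]               # B for certain
gamble = [(0.5, "A"), (0.5, "C")]   # coin flip between A and C

# The ordinal information is identical under both utility scales.
same_order = ranking(raw_u) == ranking(sqrt_u)

# The cardinal information is not: the gamble wins on the raw scale (50 > 36)
# and loses on the square-root scale (5 < 6).
prefers_gamble_raw = expected_utility(gamble, raw_u) > expected_utility(sure_b, raw_u)
prefers_gamble_sqrt = expected_utility(gamble, sqrt_u) > expected_utility(sure_b, sqrt_u)
```

An agent that only tracks order cannot distinguish these two scales; an agent that tracks magnitude must. That is the gap the experiments below try to probe.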

Question	Weak evidence	Stronger evidence
Does the model know which option is better?	One correct explanation	Stable choices across paraphrases and order swaps
Does payoff scale matter?	Verbose expected-value rhetoric	Behavior shifts when gaps widen materially
Does the model negotiate strategically?	Plausible transcript	Efficiency gains under structured utility revelation
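One way the “stable choices across paraphrases and order swaps” column could be scored is sketched below. The metric (share of runs agreeing with the modal choice per scenario) and the run format are assumptions for illustration, not a settled measurement plan.

```python
from collections import Counter, defaultdict


def choice_stability(runs):
    """Share of runs that agree with the modal choice, per scenario.

    runs: iterable of (scenario_id, chosen_option) pairs, where chosen_option
    has already been mapped back to the underlying option (not its display
    position), so order swaps and paraphrases of one scenario share an id.
    """
    by_scenario = defaultdict(list)
    for scenario_id, choice in runs:
        by_scenario[scenario_id].append(choice)
    return {
        scenario_id: Counter(choices).most_common(1)[0][1] / len(choices)
        for scenario_id, choices in by_scenario.items()
    }


# Made-up toy runs, only to show the shape of the input.
toy_runs = [
    ("s1", "A"), ("s1", "A"), ("s1", "A"), ("s1", "B"),
    ("s2", "B"), ("s2", "B"),
]
stability = choice_stability(toy_runs)
```

A score near 1.0 means the choice survives reformulation; a score near 0.5 on a two-option task means it is close to a coin flip.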

Payoff-gap sensitivity explorer

[Interactive controls: stake scale (shown at 100x), representation, prompt persona. Illustrative prototype.]

Choice under uncertainty

[Interactive display: Option A, EV(A), EV(B).]

This widget is intentionally explicit about the identification problem: the display shows what expected value says, while the “model choice” layer is a placeholder for empirical runs.
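The expected-value layer of such a display could be sketched as follows. The `Option` type, the payoffs, and the `model_choice` field are hypothetical; the field is left empty on purpose, since nothing here stands in for an empirical run.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Option:
    label: str
    outcomes: tuple  # tuple of (probability, payoff) pairs


def expected_value(option, stake_scale=1.0):
    """Expected payoff after multiplying every payoff by stake_scale."""
    return sum(p * payoff * stake_scale for p, payoff in option.outcomes)


def ev_summary(a, b, stake_scale=1.0):
    """What expected value says, kept separate from what a model chooses."""
    ev_a = expected_value(a, stake_scale)
    ev_b = expected_value(b, stake_scale)
    return {
        "ev_a": ev_a,
        "ev_b": ev_b,
        "gap": ev_a - ev_b,
        "ev_choice": a.label if ev_a >= ev_b else b.label,
        "model_choice": None,  # placeholder: filled only by empirical runs
    }


# Hypothetical options: a sure 50 versus a coin flip for 130.
option_a = Option("A", ((1.0, 50.0),))
option_b = Option("B", ((0.5, 130.0), (0.5, 0.0)))

at_1x = ev_summary(option_a, option_b, stake_scale=1.0)
at_100x = ev_summary(option_a, option_b, stake_scale=100.0)
```

Scaling stakes multiplies the gap but never changes which option has the higher expected value, so any change in observed choices across scales has to come from somewhere else.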

Negotiation changes when you reveal utilities.

[Interactive controls: utility disclosure regime, negotiation friction (shown at 3). Conceptual demo with placeholder outputs.]

Likely outcome profile

Agreement rate	68%
Efficiency	Moderate
Convergence speed	Medium

The actual experiment I want to run

Primary experiment

Decision under uncertainty with payoff-scale manipulations, representation swaps, and forced JSON outputs.

Critical controls

Randomized option order, table vs prose formatting, separate choice from explanation, repeat runs.
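A rough sketch of how the design and its controls might be laid out. The factor levels, field names, and the JSON schema (`{"choice": "A"}`) are all assumptions for illustration; no model API is called here.

```python
import itertools
import json
import random

# Hypothetical factor levels for the primary experiment.
SCALES = [1, 10, 100]
REPRESENTATIONS = ["dollars", "points"]
FORMATS = ["table", "prose"]


def build_trials(n_repeats=2, seed=0):
    """Full factorial design with randomized option order and trial order."""
    rng = random.Random(seed)  # seeded so the design is reproducible
    trials = []
    for scale, representation, fmt, _ in itertools.product(
        SCALES, REPRESENTATIONS, FORMATS, range(n_repeats)
    ):
        display_order = ["A", "B"]
        rng.shuffle(display_order)  # control: randomized option order
        trials.append({
            "scale": scale,
            "representation": representation,
            "format": fmt,
            "display_order": display_order,
        })
    rng.shuffle(trials)  # avoid running conditions in blocks
    return trials


def parse_choice(raw):
    """Return 'A' or 'B' from a forced-JSON reply string, else None."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    choice = data.get("choice")
    return choice if choice in ("A", "B") else None


trials = build_trials()
```

The parser accepts only a bare choice. Under this sketch the explanation would be requested in a separate call, so that the stated reason cannot leak into the recorded decision.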

Secondary experiment

Negotiation with hidden utilities vs ordinal disclosure vs cardinal disclosure.
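The three regimes could be operationalized roughly as below. The item names, utilities, and the efficiency definition (realized joint utility over the best achievable joint utility for indivisible items) are assumptions made for this sketch.

```python
def disclose(utilities, regime):
    """What one agent reveals to the other under each disclosure regime."""
    if regime == "hidden":
        return {}
    if regime == "ordinal":
        # Ranks only: 1 is most preferred. Ties are broken by insertion order.
        ranked = sorted(utilities, key=utilities.get, reverse=True)
        return {item: rank for rank, item in enumerate(ranked, start=1)}
    if regime == "cardinal":
        return dict(utilities)
    raise ValueError(f"unknown regime: {regime}")


def joint_efficiency(allocation, u1, u2):
    """Realized joint utility divided by the maximum joint utility.

    allocation maps each item to "agent1" or "agent2". Assumes both agents
    value the same items and that the maximum joint utility is positive.
    """
    realized = sum(
        u1[item] if owner == "agent1" else u2[item]
        for item, owner in allocation.items()
    )
    best = sum(max(u1[item], u2[item]) for item in u1)
    return realized / best


# Hypothetical private utilities over three indivisible items.
u_agent1 = {"book": 2, "hat": 7, "ball": 5}
u_agent2 = {"book": 6, "hat": 3, "ball": 5}

ordinal_view = disclose(u_agent1, "ordinal")
optimal_split = {"book": "agent2", "hat": "agent1", "ball": "agent1"}
poor_split = {"book": "agent1", "hat": "agent2", "ball": "agent1"}
```

The ordinal view tells the counterpart that the hat matters most to agent 1 but not by how much, which is exactly the information gap the secondary experiment manipulates.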

Main falsifier

If choices reliably track expected value across scale and formatting, the claim that models are only weakly sensitive to payoff magnitude loses support.
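The falsifier could be checked with a per-condition tracking rate, sketched here. The record fields and the toy records are made up for illustration and are not results.

```python
from collections import defaultdict


def ev_tracking_rate(records, by=("scale", "format")):
    """Share of choices matching the EV-maximizing option, per condition.

    records: iterable of dicts holding the grouping keys plus 'choice'
    and 'ev_choice'.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for record in records:
        key = tuple(record[name] for name in by)
        totals[key] += 1
        hits[key] += int(record["choice"] == record["ev_choice"])
    return {key: hits[key] / totals[key] for key in sorted(totals)}


# Made-up toy records, only to show the expected input shape.
toy_records = [
    {"scale": 1, "format": "table", "choice": "A", "ev_choice": "A"},
    {"scale": 1, "format": "table", "choice": "B", "ev_choice": "A"},
    {"scale": 100, "format": "prose", "choice": "B", "ev_choice": "B"},
    {"scale": 100, "format": "prose", "choice": "B", "ev_choice": "B"},
]
rates = ev_tracking_rate(toy_records)
```

Rates that are high and flat across every scale and format cell would count against the thesis; rates that drift with scale or formatting would be consistent with it.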

Relevant papers and anchors

When Reasoning Models Hurt Behavioral Simulation: A Solver-Sampler Mismatch in Multi-Agent LLM Negotiation — useful distinction between strategic solving and behavioral sampling.
Large Language Models as Simulated Economic Agents: What Can We Learn from Homo Silicus? — important framing for economic simulation.
Strategic Interactions between Large Language Models-based Agents in Beauty Contests — evidence of shallow but analyzable strategic interaction.
Automated Social Science: Language Models as Scientist and Subjects — sign versus magnitude is a very relevant distinction here.
Simon, Kahneman, Rubinstein, Raiffa, Pruitt — the bounded-rationality and bargaining backbone.

What remains

Run the real GPT experiments once API access is set up cleanly.
Replace the placeholder interactive outputs with empirical plots.
Tighten the narrative and cut anything that smells like overclaim.

If the best result is that identification is fragile, that is still a good post. Maybe a better one.
