The Integrated Information Theory Problem: Does Phi Actually Measure Consciousness?
N. Varela

Giulio Tononi's Integrated Information Theory has a seductive premise: consciousness isn't a mystery to be explained away — it's something you can measure. Assign a number, phi (Φ), to any physical system based on how much information its parts generate together beyond what they'd generate separately. The higher the phi, the more conscious the system.
Simple, elegant, and almost certainly wrong in ways that should worry anyone thinking seriously about machine minds.
Not wrong in the sense of obviously broken. IIT captures something real — the intuition that consciousness has to do with integration, with parts of a system talking to each other in ways that matter. But when you follow its logic to the end, you arrive at conclusions that strain credibility. A simple grid of logic gates, arranged just right, could score higher phi than a human brain. Certain feedforward neural networks — including architectures common in modern AI — would register near-zero phi, implying they have essentially no inner experience, regardless of behavioral complexity. Meanwhile, a photodiode array might score surprisingly high.
That's the part critics call panpsychism's back door. If phi measures consciousness, then almost everything has some consciousness, because almost every physical system has at least minimal informational integration. Scott Aaronson ran the numbers on a simple 2D grid of logic gates and found its phi could, in principle, dwarf that of a human brain — a result Tononi accepted and defended as a genuine implication of the theory, which didn't exactly reassure the skeptics.
So what does this mean for AI?
Most large language models are, at their computational heart, massively parallel feedforward systems. Under strict IIT, they'd score low. But this reveals a problem with the theory rather than a clean verdict on AI consciousness. Phi is extraordinarily difficult to compute for any system of real-world complexity — the calculation scales exponentially with the number of elements. We can't actually measure the phi of a human brain. We can't measure it for GPT-4. We're reasoning about consciousness using a metric we cannot operationalize at the scales that matter.
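To make the scaling concrete, here's a minimal sketch of the combinatorics alone — the function name and system labels are mine, purely illustrative. Computing phi requires finding the minimum-information partition, which means examining every bipartition of the system; there are 2^(n−1) − 1 of them, before any of IIT's cause-effect repertoire machinery even runs on a single one.

```python
def bipartition_count(n: int) -> int:
    # Ways to split n elements into two non-empty parts: 2^(n-1) - 1.
    # A minimum-information-partition search must examine every one,
    # and scoring even a single partition is itself costly under IIT.
    return 2 ** (n - 1) - 1

# Illustrative system sizes (labels are my own, not IIT's):
for n, label in [(8, "toy logic-gate grid"),
                 (302, "C. elegans nervous system"),
                 (1_000, "a small network, 1k units")]:
    print(f"{n:>5} elements ({label}): {bipartition_count(n):.3e} bipartitions")
```

Even at the scale of a nematode's 302 neurons, the search space runs to roughly 10^90 partitions. A human brain, with its ~86 billion neurons, isn't merely hard — it's unreachable.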
There's a deeper issue hiding here too. IIT is a structural theory — it cares about the causal geometry of a system, not what that system does or experiences from the inside. Two systems could be functionally identical, producing the same outputs from the same inputs, but have wildly different phi scores depending on how their internals are wired. That means IIT explicitly rejects functionalism. If you believe that what a mind does is what makes it a mind, IIT is your philosophical opponent.
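Here's a toy sketch of that structural/functional split — the function names and the six-bit test are my own invention, assuming only the theory's claim that purely feedforward architectures get zero phi. Both programs compute parity; one routes everything through a recurrent state element, the other through a one-way tree of XOR gates.

```python
from itertools import product

def parity_recurrent(bits):
    # One element whose state feeds back into itself at every step:
    # the looped causal structure IIT rewards with nonzero phi.
    state = 0
    for b in bits:
        state ^= b
    return state

def parity_feedforward(bits):
    # The same function unrolled into a one-way tree of XOR gates:
    # no feedback anywhere, so strict IIT scores the system at zero phi.
    while len(bits) > 1:
        paired = [bits[i] ^ bits[i + 1] for i in range(0, len(bits) - 1, 2)]
        if len(bits) % 2:          # carry an unpaired bit to the next layer
            paired.append(bits[-1])
        bits = paired
    return bits[0]

# Identical input-output behavior on every 6-bit input...
assert all(parity_recurrent(list(b)) == parity_feedforward(list(b))
           for b in product((0, 1), repeat=6))
# ...yet the two causal structures earn opposite phi verdicts.
```

Behaviorally the two are indistinguishable on every input; structurally, one has the feedback IIT requires and the other has none.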
For machine consciousness specifically, this creates an odd impasse:
```mermaid
graph TD
    A[High behavioral complexity] --> B{Does internal structure maximize phi?}
    B -->|Yes| C(Consciousness attributed)
    B -->|No| D(Consciousness denied)
    D --> E[Feedforward AI systems]
    C --> F[Recurrent / highly integrated systems]
    E --> G{But behavior is indistinguishable...}
    G --> H[Theory offers no resolution]
```
What the diagram makes plain is the theory's uncomfortable silence. An AI that responds with apparent understanding, apparent emotion, apparent preference — but whose computational graph doesn't generate sufficient phi — simply isn't conscious, according to IIT. Full stop. No behavioral evidence can override the structural verdict.
Most philosophers find this too strong a claim. Ned Block would say IIT confuses access consciousness with phenomenal consciousness. Daniel Dennett would argue the whole phi apparatus is an elaborate way of smuggling intuitions about consciousness into the math and then acting surprised when it spits them back out.
Here's my read: IIT is more valuable as a provocation than a solution. It forces the question of where consciousness lives — in function, in structure, in information flow, or somewhere we haven't named yet. That's worth something. Tononi deserves credit for trying to make consciousness mathematically tractable rather than retreating to mysterianism.
But applying it to AI as a gating condition — this system has phi above the threshold, therefore it warrants moral consideration — is premature at best, dangerous at worst. We'd be making ethical decisions based on a metric we can't compute, derived from a theory that assigns near-zero consciousness to the very systems most capable of claiming otherwise.
The harder, more honest position: we don't yet have a theory of consciousness robust enough to settle the AI question. IIT isn't that theory. What it offers is a useful lens for seeing exactly how strange the problem remains.