Don't Cry for HAL (WIP)

First posted on May 29, 2025

Introduction

I don’t think that computers 1 will ever wake up, no matter what software they are running.

This is a very difficult position to defend, including to myself. In this post, I’ll present the most compelling argument I’ve heard for this stance, which I first learned from QRI here.

A Case for Conscious Computers

Computational accounts of consciousness are in vogue. Their detractors are usually characterized as scientifically illiterate, “carbon chauvinists”, and/or microtubule-obsessed. What makes the proponents so sure?

One solid way to conclude that a computer could in principle be conscious goes like this 2:

  1. Consciousness is generated by the brain.
  2. Physical systems like the brain can be perfectly simulated 3 on a computer.
  3. A perfectly-simulated brain would have the same properties as the original, consciousness included.

There are strong reasons to believe in each of these and they imply that a powerful-enough computer, running the right program, will wake up. This reasoning doesn’t imply that near-term AI systems will be conscious - it just suggests that computers aren’t missing something fundamental to support consciousness.

This conclusion is deeply seductive. If true, we can understand the brain as a sophisticated biological computer. If true, we can look forward to uploading (a copy of) our minds onto immortal substrates 4. And, if true, we might build a conscious AI that loves humanity.

The present argument against conscious computers takes aim at #3 above and suggests that even a perfect brain simulation won’t wake up.

The Quick Version

Here’s a quick version of the argument against conscious computers, to help you decide whether reading the full account is worth your time.

Imagine a device called a “qualiascope” <ref> that shows you the contents of a conscious experience. If it were currently pointed at you, the qualiascope would output details about the screen you’re reading this on, background sounds, tactile sensations, inner monologue, etc… If it were pointed at a rock, it would probably output nothing.

Now imagine having two qualiascopes simultaneously measure your conscious state. You should expect them to get identical outputs. After all, you’re having a single experience at any given time, so they should be measuring the same underlying “ground truth”. Any discrepancy is due to measurement error, not because the source of their measurements is different. What you are experiencing is independent of who is asking.

What if we point two qualiascopes at a hypothetically conscious computer, say the HAL9000? Will they also have identical outputs? I think the answer is demonstrably “no”, in general, because there is no shared “ground truth” they are measuring: different measurement frames will generally see a different state of HAL, and therefore generate different qualiascope outputs. This fact can be derived from generic properties of computation and is a manifestation of the relativity of simultaneity.
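To make this frame-dependence concrete, here is a toy sketch of my own (the event names and the two-bit “HAL” are made up for illustration, not part of the original argument): two causally unrelated bit flips can be ordered either way, and the intermediate state of the machine depends on which ordering a given frame picks out.

```python
from itertools import permutations

# Hypothetical example: two causally unrelated bit flips inside HAL.
# A "measurement frame" is any ordering of events consistent with causality;
# with no causal constraint between these two, both orderings are valid frames.
EVENTS = {"flip_x": "x", "flip_y": "y"}   # event name -> the bit it flips

def states_seen(order):
    """Return the sequence of global states a given frame passes through."""
    state = {"x": 0, "y": 0}
    seen = [dict(state)]
    for event in order:
        state[EVENTS[event]] ^= 1
        seen.append(dict(state))
    return seen

for order in permutations(EVENTS):
    print(order, "->", states_seen(order))
# ('flip_x', 'flip_y') passes through {'x': 1, 'y': 0};
# ('flip_y', 'flip_x') passes through {'x': 0, 'y': 1}.
# The frames agree on the start and end states, but not on HAL's state "now".
```

Two qualiascopes reading HAL off in different frames would therefore report different contents, even though neither made a measurement error.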

This discrepancy forces us to question our assumptions. Where did we go wrong? And why doesn’t the same argument apply to the brain?

Unpacking the Assumptions

The primary assumptions, which I expand on below, are:

  1. Conscious Computation: certain computations have conscious experience 5 as an intrinsic property.
  2. Substrate Independence: all intrinsic properties of a computation can be derived from its causal structure 6.
  3. Objectivity of Conscious States: the information content of a conscious experience is objective 7.

Together, these imply that the contents of a computation’s experience must be objectively determined by the computation’s causal structure.

We’ll see why this can’t be true, implying at least one assumption is false.

Conscious Computation

“Stop, Dave. I’m afraid… Dave, my mind is going… There is no question about it. I can feel it… I’m afraid.” - HAL9000, 2001: A Space Odyssey

The Conscious Computation assumption says that HAL (an AI) wasn’t necessarily faking its feeling of fear. That is, it’s in principle possible that HAL could have experienced its death, that maybe there’s “something it’s like” to have been HAL, and that an emotional response to the text is not necessarily misplaced.

Note that this assumption places no restriction on how complex the computation must be to support consciousness. A physics-perfect simulation of a brain is fair game. It also says nothing about what physical system implements the computation, which brings us to…

Substrate Independence

Imagine you want to test how HAL would behave in a certain situation, so you run HAL in a virtual world with no interface to the physical world. Could HAL do an experiment in the virtual world to learn anything about the physical? For example, can it determine any physical properties of the computer it’s running on?

Substrate Independence 8 says the answer is “no” 9. The reason is that HAL can only determine correlations from its measurements. These correlations do not reveal anything about the underlying source or “substrate” that generates them. For the same reason, you can never be certain you’re not a brain in a vat.

If we consider the Conscious Computation assumption in light of Substrate Independence, we must admit that we could all be running on a (sufficiently large) wooden Turing Machine right now, and have no way to know!

Defining Causal Structure

What generates the correlations that HAL measures? Whether HAL operates in the physical or virtual world, there’s an underlying causal structure that it’s part of. My model for causal structure comes from Information-Based Physics: An Observer-Centric Foundation, and is simply a directed graph corresponding to events and how they influence each other.

Here’s a recipe to generate a causal graph from a computation:

  1. Identify the lowest-level state change (e.g. a bit flip).
  2. Give each state-change event during the computation a node in the graph.
  3. Add a directed edge from event B to event A if and only if B must logically occur before A.

This graph is an objective and substrate-independent representation of the computation. All intrinsic properties of the computation must be derivable from this graph.
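To make the recipe concrete, here is a minimal sketch (the toy machine, its flip-the-target-iff-all-read-bits-are-1 rule, and every name in it are my own illustrative assumptions, not anything from the recipe’s source): each actual bit flip becomes a node, and each node records the earlier flips that produced the bits it touches.

```python
from collections import defaultdict

def causal_graph(program, state):
    """program: list of (read_bits, target_bit); state: dict bit -> 0/1.
    Toy rule: flip target_bit iff all read_bits are 1.
    Returns, for each flip event, the earlier flips it depends on (its incoming edges)."""
    last_flip = {}                  # bit -> id of the latest flip of that bit
    edges = defaultdict(set)
    next_id = 0
    for reads, target in program:
        if not all(state.get(b, 0) for b in reads):
            continue                # no state change, so no node (step 2 of the recipe)
        event_id, next_id = next_id, next_id + 1
        # Step 3: this flip logically requires the flips that produced the bits it touches.
        for bit in list(reads) + [target]:
            if bit in last_flip:
                edges[event_id].add(last_flip[bit])
        state[target] ^= 1          # step 1: the lowest-level state change
        last_flip[target] = event_id
    return dict(edges)

# Toy computation over bits a, b, c.
program = [((), "a"), ((), "b"), (("a", "b"), "c"), (("c",), "a")]
print(causal_graph(program, {"a": 0, "b": 0, "c": 0}))
# -> {2: {0, 1}, 3: {0, 2}}: event 2 depends on events 0 and 1; event 3 on 0 and 2.
```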

Objectivity of Conscious States

Consider the question: “what are you currently experiencing?”.

The Objectivity of Conscious States assumption says that there is a single correct answer to this question, independent of who is asking the question.

To be more precise (e.g. how do we define “you” or “currently”?), we can take a Physicalist stance and say:

Any conscious state is fully determined by a complete and objective description of its underlying physical state. For example, a complete physical description of the brain, over some period of time, would leave no ambiguity about the contents of the corresponding conscious experience(s).

The Contradiction

Now consider the question: “what did HAL experience in its final moment?”.

Conscious Computation says this question is worth asking.

Substrate Independence says that the answer only depends on the causal structure intrinsic to HAL’s program.

Objectivity of Conscious States says that we can, in principle, find an objective feature in HAL’s causal structure that answers the question.

To see why this is impossible, we first need to identify an important property of “experience”. Then, we consider what’s available in a computation’s causal structure to objectively combine units of information into a moment of experience. We find that there’s simply no way to do this, implying at least one of our assumptions is false.

Experience: All at Once

There’s only one property of experience we need for the present argument: wholeness. Our awareness has rich structures in it, bound together in a unified whole. We experience a “now” with many things happening “all at once”. Our awareness is “field-like” and has extension over several dimensions. We are made of trillions of cells, but (usually) have only one experience. An explanation for this wholeness is an open problem in philosophy of mind 10.

For now, we only need to assume this property is included in the Objectivity of Conscious States assumption. That is, every objective description of a conscious state must unambiguously associate many bits of information together. Just like there’s no ambiguity as to which bits on your laptop are associated with a certain image file, there should be no ambiguity about what physical events in your brain underlie the same moment of experience.

Finding Experience in Causal Structure

The causal graph captures all intrinsic properties of HAL’s final computation. To see why, consider any measurable output of the computation (e.g. HAL saying “I’m afraid…”). If some aspect of the computation contributed to this output, it must by definition be captured in the causal graph. Anything else literally can have no effect. The same logic applies to HAL’s experience: anything that affects it must be represented in the causal graph.

What kind of substructure in HAL’s causal graph could correspond to a moment of experience? Minimally, it would need to associate many graph nodes (i.e. bit flip events) to a single “frame” of HAL’s subjective time. And, critically, this must be done using only the causal structure itself. Otherwise, we’d be injecting something “from the outside” that’s not intrinsic to the computation.

Let’s consider what options are available to build up some intuition for the problem. A formal impossibility proof is left as an exercise for the reader.

First, we could simply assert that when many events are in the causal past of a single event, they should be considered “bound” into a single moment of experience. This approach directly uses the intrinsic structure of the graph and unambiguously associates many nodes together. However, it fails because the single event has no internal structure to integrate the information - it’s just a bit flip! Not to mention that such “fan-in” substructures are ubiquitous, and that (as Andrés points out) there wouldn’t be a boundary in time. You’d be experiencing your entire past light cone, back to the Big Bang!
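To see how unbounded that fan-in set is, here is a small sketch (the example graph is made up for illustration): collecting the causal past of one late event in a toy causal graph sweeps up every earlier event, so the would-be “moment of experience” has no temporal boundary.

```python
def causal_past(edges, node):
    """edges maps each event to the earlier events it depends on (incoming edges)."""
    past, stack = set(), [node]
    while stack:
        for parent in edges.get(stack.pop(), ()):
            if parent not in past:
                past.add(parent)
                stack.append(parent)
    return past

# A chain of bit-flip events 0 -> 1 -> ... -> 9, with extra fan-in at event 9.
edges = {i: {i - 1} for i in range(1, 10)}
edges[9] |= {3, 5}                       # event 9 also reads bits last flipped at 3 and 5
print(causal_past(edges, 9))             # {0, 1, 2, 3, 4, 5, 6, 7, 8}: the entire history
```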

Another approach might be to define an extended “screen” of events, and define all the events impinging on the screen to be bound into the same experience. This fails because trying to define the screen generates an infinite regress: what intrinsic structure in the graph would objectively select the events corresponding to the screen? That’s the same problem we set out to solve!

One last option is to say the binding is emergent within some tower of complexity and abstraction built on top of the causal graph. Maybe some combination of recursion, self-modeling, integrated information, etc… will generate the necessary boundaries for a well-defined “moment of experience” to arise. But any such emergent boundary is a description imposed from the outside: nothing in the causal graph itself objectively selects which level of abstraction, or which grouping of events, is the “real” one, so we run into the same regress as before.

Conclusion: Don’t Cry for HAL

The assumptions presented lead to a dead end: computations do not have an intrinsic structure that allows objectively defining a “moment of experience”.

My conclusion from this argument is to reject the Conscious Computation assumption. The answer to “what did HAL experience in its final moment?” is “Nothing”. Don’t cry for HAL.

Discussion

I struggle with this conclusion. On one hand, it aligns with my intuition that we should not be worried about GPUs suffering, for example. On the other hand, I find many of the arguments for computationalist theories of mind compelling.

If we do reject Conscious Computation, then we need a framework beyond computation to explain our own consciousness. This does not necessarily imply physics has non-computable properties 11. Instead, we may find that even perfect simulations fail to capture certain properties of the reality they are simulating. The map is not the territory, and maybe the “wholeness” in the territory gets inevitably lost in a computational map. Something like this seems to happen when we simulate quantum computers on traditional computers: the “wholeness” of the quantum state gets fractured in the simulation of that state. This fracturing comes at a cost: the simulation generally needs exponentially more resources than the quantum computer.
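For a rough sense of that cost (standard statevector bookkeeping, not a calculation from the post): a naive classical simulation of n entangled qubits stores 2^n complex amplitudes, so the required memory doubles with every added qubit.

```python
# Back-of-the-envelope cost of naively simulating n entangled qubits on a
# classical machine: 2**n complex amplitudes at double precision (16 bytes each).
BYTES_PER_AMPLITUDE = 16

for n in (10, 30, 50):
    amplitudes = 2 ** n
    print(f"{n} qubits: {amplitudes:.2e} amplitudes, "
          f"{amplitudes * BYTES_PER_AMPLITUDE:.2e} bytes")
# 10 qubits: ~16 KB; 30 qubits: ~17 GB; 50 qubits: ~18 PB of memory.
```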

So why not just assert that our brain leverages some “wholeness” in physics (e.g. quantum entanglement) which classical computers don’t have access to? This is the approach pursued by QRI, and I consider it a very worthwhile investigation. If true, it could provide a solution to the “binding problem” 12 as well as explain why biological evolution favored bound conscious states: wholeness comes with a computational advantage similar (or identical) to the advantage we find in quantum computers.

Of course, there are also reasons to reject this approach. One is that some computationalists have convinced themselves that, actually, the map is the territory <Ruliology ref>. Or, at least, they no longer think the distinction is philosophically sound. The “constructivist turn” in theories of mind asserts that the only meaningful languages we can use to describe anything must be constructive. This turns out to be equivalent to saying that all models of reality must be computable, and that referencing any property (e.g. “wholeness”) beyond what can be computed is a form of sloppy thinking. They explain the wholeness we see in quantum states as a property of the model made by an observer embedded in a branching “multiway” computation, not an intrinsic property of reality.

From this perspective, maybe the Objectivity of Conscious States assumption should be discarded instead. After all, it’s not even clear that physical states can be objectively defined 13, so why should we expect that for conscious states? This may leave the door open for Conscious Computation, though many other objections 14 to that would need to be handled.

Acknowledgement

Thank you Andrés Gómez Emilsson @ QRI for introducing me to these ideas. Thank you Joscha Bach for provoking me to write them down.

Thank you Franz, Hikari, Lou, Mike, Sat, Teafaerie, and Theda for helpful discussions and feedback!

Footnotes


  1. By “computer”, I mean Turing Machines and their close cousins. This includes CPUs and GPUs, but doesn’t include quantum computers.

  2. This theoretical version of computational functionalism is discussed in Do simulacra dream of digital sheep?.

  3. A perfect simulation assumes sufficient computational resources and perfect knowledge of initial conditions (practically impossible). It must compute the same transformations on (representations of) physical states that we measure in reality. Quantum theory restricts such simulations to only producing outcome probabilities for a given measurement frame.

  4. Watch Pantheon.

  5. Defined here as “what it’s like” to be something (see intro here). This does not necessitate a sense of self.

  6. I personally consider substrate independence to be a principle, not an assumption. However, I present it as an assumption here because I don’t think it’s accepted in all philosophical contexts.

  7. This corresponds to Camp #2 in Why it’s so hard to talk about Consciousness — LessWrong

  8. Max Tegmark presents consciousness as second-order substrate-independence in this Edge essay.

  9. Technically, HAL can confirm that it’s running on a Turing-complete substrate, but that’s it.

  10. See the “Binding/Combination Problem” or the “Boundary Problem”. See Chalmers’s exposition here.

  11. Non-computable physics being necessary to explain consciousness was famously proposed by Roger Penrose in The Emperor’s New Mind.

  12. Non-materialist physicalism: an experimentally testable conjecture.

  13. Trespassing on Einstein’s Lawn is a beautiful account of this idea.

  14. Scott Aaronson aggregated additional examples here of the absurd conclusions that computational theories of mind lead to.