What is computational neuroscience? (XXXIV) Is the brain a computer (2)

In a previous post, I argued that the way the brain works is not algorithmic, and therefore that it is not a computer in the common sense of the term. This contradicts a popular view in computational neuroscience that the brain is a kind of computer that implements algorithms. That view comes from formal neural network theory, and the argument goes as follows. Formal neural networks can implement any computable function, a computable function being one that can be implemented by an algorithm. Thus the brain can implement algorithms for computable functions, and therefore is by definition a computer. There are multiple errors in this reasoning. The most salient is a semantic drift on the concept of algorithm; the second major error is a confusion about what a computer is.

Algorithms

A computable function is a function that can be implemented by an algorithm. But the converse “if a function is computable, then whatever implements this function runs an algorithm” is not true. To see this, we need to be a bit more specific about what is meant by “algorithm” and “computable function”.

Loosely speaking, an algorithm is simply a set of explicit instructions to solve a problem. A cooking recipe is an algorithm in this sense. For example, to cook pasta: put water in a pan; heat it up; when the water boils, put in the pasta; wait for 10 minutes. The execution of this algorithm occurs in continuous time in a real environment. But what is algorithmic about this description is the discrete sequential flow of instructions. Water boiling is not itself algorithmic; the high-level instructions are: “when condition A is true (water boils), then do B (put in the pasta)”. Thus, when we speak of algorithms, we must define what counts as an elementary instruction, that is, what is beneath the algorithmic level (water boils, put in the pasta).

The textbook definition of an algorithm in computer science is: "a sequence of computational steps that transform the input into the output" (Cormen et al., Introduction to Algorithms; possibly the most used textbook on the subject). Computability is a way to formalize the notion of algorithm for functions of integers (in particular logical functions). To formalize it, one needs to specify what is considered an elementary instruction. Thus, computability does not formalize the loose notion of algorithm above, i.e., any recipe to calculate something, for otherwise any function would be computable and the concept would be empty (to calculate f(x), apply f to x). A computable function is a function that can be calculated by a Turing machine, or equivalently, one that can be generated by a small set of elementary functions on integers (with composition and recursion). Thus, an algorithm in the sense of computability theory is a discrete-time sequence of arithmetic and logical operations (and recursion). Note that this readily extends to any countable alphabet instead of integers, and of course you can replace arithmetic and logical operations with higher-order instructions, as long as they are themselves computable (i.e., a high-level programming language). But it is not any kind of specification of how to solve a problem. For example, there are various algorithms to calculate pi. But we could also calculate pi by drawing a circle, measuring both the diameter and the perimeter, then dividing the perimeter by the diameter. This is not an algorithm in the sense of computability theory. It could be called an algorithm in the broader sense, but again note that what is algorithmic about it is the discrete structure of the instructions.
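
To make the distinction concrete, here is a minimal sketch (my own illustration, not part of the original argument) of an algorithm for pi in the strict sense: a discrete sequence of arithmetic steps, in this case the Leibniz series.

```python
# A pi algorithm in the strict sense: a discrete sequence of arithmetic
# operations (Leibniz series: pi/4 = 1 - 1/3 + 1/5 - 1/7 + ...).
def pi_leibniz(n_terms):
    total = 0.0
    for k in range(n_terms):            # step 1, step 2, ...: a discrete flow of instructions
        total += (-1) ** k / (2 * k + 1)
    return 4 * total

print(pi_leibniz(1_000_000))  # ~3.14159
```

Measuring a drawn circle yields the same number, but there is no discrete sequence of elementary steps to point to, which is exactly the distinction made above.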

Thus, a device could calculate a computable function using an algorithm in the strict sense of computability, or in the broader sense (a cooking recipe), or in a non-algorithmic way (i.e., without any discrete structure of instructions). In any case, what the brain or any device manages to do bears no necessary relation to how it does it.

As pointed out above, what is algorithmic about a description of how something works is the discrete structure (first do A; if B is true, then do C; etc.). If we removed this condition, then we would be left with the more general concept of a model, not an algorithm: a description of how something works. Thus, if we want to say anything specific by claiming that the brain implements algorithms, then we must insist on the discrete-time structure (steps). Otherwise, we are merely saying that the brain can be described by a model.

Now that we have more precisely defined what an algorithm is, let us examine whether the brain might implement algorithms. Clearly, it does not literally implement algorithms in the narrow sense of computability theory, i.e., with elementary operations on integers and recursion. But could it be that it implements algorithms in the broader sense? To get some perspective, consider the following two physical systems:

(A) shows dominoes, (B) shows a tent (illustration taken from my essay “Is coding a relevant metaphor for the brain?”). Both are physical systems that interact with an environment, in particular systems that can be perturbed by mechanical stimuli. The response of the dominoes to mechanical stimuli might be likened to an algorithm, but that of the tent cannot. The fact that we can describe unambiguously (with physics) how the tent reacts to mechanical stimuli does not make the dynamics of the tent algorithmic, and the same is true of the brain. Formal neural networks (e.g. perceptrons or deep learning networks) are algorithmic, but the brain is a priori more like the tent: a set of coupled neurons that interact in continuous time, with each other and with the environment, with no evident discrete structure similar to an algorithm. As argued above, a specification of how these real neural networks work and solve problems is not an algorithm: it’s a model – unless we manage to map the brain’s dynamics to the discrete flow of an algorithm.

Computers

Thus, if a computer is something that solves problems by running algorithms, then the brain is not a computer. We may however consider a broader definition: a computer is something that computes, i.e., something able to calculate computable functions. As pointed out above, this does not require the computer to run algorithms. For example, consider a box of gas with a heater (input = temperature T) and a pressure sensor (output = P). The device computes the function P = nRT/V by virtue of physical laws, not by an algorithm.
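
To sharpen the contrast, here is a minimal sketch (my illustration, with arbitrary numbers): evaluating the same gas law on a digital machine is an algorithm, a discrete sequence of arithmetic operations, whereas the box arrives at P directly, by virtue of physics, with no such steps.

```python
# The ideal gas law P = nRT/V evaluated algorithmically, in contrast with
# the gas-filled box, which "computes" the same function by physics alone.
R = 8.314  # gas constant, J/(mol*K)

def pressure(n_moles, T_kelvin, V_m3):
    return n_moles * R * T_kelvin / V_m3

print(pressure(1.0, 300.0, 0.02))  # ~124710 Pa for 1 mol at 300 K in 20 L
```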

This box, however, is not a computer; otherwise, any physical system would be called a computer. To be called a computer, the device should be able to implement any computable function. But what does that mean exactly? To run an arbitrary computable function, some parameters of the device need to be appropriately adjusted. Who adjusts these parameters, and how? If we do not specify how this adjustment is made, then the claim that the brain is a computer is essentially empty. It just says that for each function, there is a way to arrange the structure of the brain so that this function is achieved. It is essentially equivalent to the claim that atoms can calculate any computable function, depending on how we arrange them.

To call such a device a computer, we must additionally include a mechanism to adjust the parameters so that it does actually perform a particular computable function. This leads us to the conventional definition of a computer: something that can be instructed via computer programming. The notion of program is central to the definition of computers, whatever form this program takes. A crucial implication is that a computer is a device that is dependent on an external operator for its function. The external operator brings the software to the computer; without the ability to receive software, the device is not a computer.

In this sense, the brain cannot be a computer. We may then consider the following metaphorical extension: the brain is a self-programmed computer. But the circularity in this assertion is problematic. If the program is a result of the program itself, then the “computer” cannot actually implement any computable function, but only those that result from its autonomous functioning. A cat, a mouse, an ant and a human do not actually do the same things, and cannot even in principle do the same tasks.

Finally, is computability theory the right framework to describe the activity of the brain in the first place? It is certainly not the right framework to describe the interaction of a tent with its environment, so why would it be appropriate for the brain, an embodied dynamical system in circular relation with its environment? Computability theory is a theory about functions. But a dynamical system is not a function. You can of course define functions on dynamical systems, even though they do not fully characterize the system. For example, you can define the function that maps the current state to the state at some future time. In the case of the brain, we might want to define a function that maps an external perturbation of the system (i.e. a stimulus) to the state of the system at some future time. However, this is not well defined, because it depends on the state of the system at the time of the perturbation. This problem does not occur with formal neural networks precisely because these are not dynamical systems but mappings. The brain is spontaneously active, whether there is a “stimulus” or not. The very notion of the organism as something that responds to stimuli is the most naïve version of behaviorism. The organism has an endogenous activity and a circular relation to its environment. Consider for example central pattern generators: these are neural circuits that produce rhythmic patterns in the absence of any input. Not all dynamical systems can be framed in terms of computability theory; in fact most of them, including the brain, cannot, because they are not mappings.

Conclusion

As I have argued in my essay on neural coding, there are two core problems with the computer metaphor of the brain (it should be clear by now that this is a metaphor and not a property). One is that it tries to match two causal structures that are totally incongruent, just like dominoes and a tent. The other is that the computer metaphor, just as the coding metaphor, implicitly assumes an external operator – who programs it / interprets the code. Thus, what these two metaphors fundamentally miss is the epistemic autonomy of the organism.

Is the coding metaphor relevant for the genome?

I have argued that the neural coding metaphor is highly misleading (see also similar arguments by Mark Bickhard in cognitive science). The coding metaphor is very popular in neuroscience, but there is another domain of science where it is also very popular: genetics. Is there a genetic code? Many scientists have criticized the idea of a genetic code (and of a genetic program). A detailed criticism can be found in Denis Noble’s book “The music of life” (see also Noble 2011 for a short review).

Many of the arguments I have made in my essay on neural coding readily apply to the “genetic code”. Let us start with the technical use of the metaphor. The genome is a sequence of DNA base triplets called “codons” (ACG, TGA, etc.). Each codon specifies a particular amino-acid, and proteins are made of amino-acids. So there is a correspondence between DNA and amino-acids. This seems an appropriate use of the term “code”. But even in this limited sense, it should be used with caution. The fact that a base triplet encodes an amino-acid is conditional on this triplet being effectively translated into an amino-acid (note that there are two stages: transcription into RNA, then translation into a protein). But in fact only a small fraction of a genome is actually translated, about 10% (depending on the species); the rest is called “non-coding DNA”. So the same triplets can result in the production of an amino-acid, or they can influence the transcription-translation system in various ways, for example by interacting with various molecules involved in the production of RNA and proteins, thereby regulating transcription and translation (and this is just one example).
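
As a side illustration of this limited, technical sense of “code” (my sketch, with just a few entries of the standard codon table), the codon-to-amino-acid correspondence is literally a lookup table; note that the table by itself says nothing about whether a given triplet is ever actually transcribed and translated, which is precisely the point made above.

```python
# A few entries of the standard codon table (DNA coding-strand triplets),
# illustrating the narrow technical sense in which DNA "encodes" amino acids.
# Whether a triplet is actually translated depends on the cellular context,
# about which this table says nothing.
CODON_TABLE = {
    "ATG": "Met",   # also the usual start codon
    "ACG": "Thr",
    "TGG": "Trp",
    "AAA": "Lys",
    "GGA": "Gly",
    "TGA": "STOP",  # a stop signal, not an amino acid
}

def translate(coding_sequence):
    """Map successive triplets to amino acids, stopping at a stop codon."""
    peptide = []
    for i in range(0, len(coding_sequence) - 2, 3):
        aa = CODON_TABLE.get(coding_sequence[i:i + 3], "?")
        if aa == "STOP":
            break
        peptide.append(aa)
    return peptide

print(translate("ATGACGTGGTGA"))  # ['Met', 'Thr', 'Trp']
```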

Even when DNA does encode amino-acids, it does not follow that a gene encodes a protein. What might be said is that a gene encodes the primary structure of a protein, that is, its sequence of amino-acids; but it does not by itself specify the shape that the protein will take (which determines its chemical properties), the various modifications that occur after translation, or the position that the protein will take in the cellular system. All of these crucial properties depend on the interaction of the gene products with the cellular system. In fact, even the primary structure of proteins is not fully determined by the gene, because of splicing.

Thus, the genome is not just a book, as suggested by the coding metaphor (some have called the genome the “book of life”); it is a chemically active substance that interacts with its chemical environment, a part of a larger cellular system.

At the other end of the genetic code metaphor, genes encode phenotypes, traits of the organism. For example, the gene for blue eyes. A concept that often appears in the media is the idea of genes responsible for diseases. One hope behind the human genome project was that by scrutinizing the human genome, we might be able to identify the genes responsible for every disease (at least for every genetic disease). Some diseases are monogenic, i.e., due to a single gene defect, but the most common diseases are polygenic, i.e., are due to a combination of genetic factors (and generally environmental factors).

But even the idea of monogenic traits is misleading. There is no single gene that encodes a given trait. What has been demonstrated in some cases is that mutations in a single gene can impact a given trait. But this does not mean that the gene is responsible by itself for that trait (surprisingly, this fallacy is quite common in the scientific literature, as pointed out by Yoshihara & Yoshihara 2018). A gene by itself does nothing. It needs to be embedded into a system, namely a cell, in order to produce any phenotype. Consequently, the expressed phenotype depends on the system in which the gene is embedded, in particular the rest of the genome. There cannot be a gene for blue eyes if there are no eyes. So no gene can encode the color of eyes; this encoding is at best contextual (in the same way as “neural codes” are always contextual, as discussed in my neural coding essay).

So the concept of a “genetic code” can only be correct in a trivial sense: that the genome, as a whole, specifies the organism. This clearly limits the usefulness of the concept. Unfortunately, even this trivial claim is incorrect. An obvious objection is that the genome specifies the organism only in conjunction with the environment. The deeper objection is that the immediate environment of the genome is the cell itself. No entity smaller than the cell can live or reproduce. The genome is not a viable system, and as such it cannot produce an organism, nor can it reproduce. An interesting experiment is the following: the nucleus (and thus the DNA) of an animal cell is transferred into the egg of an animal of another species, from which the nucleus has been removed (Sun et al., 2005). The “genetic code” theory would predict that the egg develops into an animal of the donor species. What actually happens (this was done in related fish species) is that the egg develops into some kind of hybrid, with a development process closer to that of the recipient species. Thus, even in the most trivial sense, the genome does not encode the organism. Finally, since no entity smaller than the cell can reproduce, it follows that the genome is not the unique basis of heritability – the entire cell is (see Fields & Levin, 2018).

In summary, the genome does not encode much except for amino-acids (for about 10% of it). It should be conceptualized as a component that interacts with the cellular system, not as a “book” that would be read by some cellular machinery.

What is computational neuroscience? (XXXIII) The interactivist model of cognition

The interactivist model of cognition has been developed by Mark Bickhard over the last 40 years or so. It is related to the viewpoints of Gibson and O’Regan, among others. The model is described in a book (Bickhard and Terveen, 1996) and a more recent review (Bickhard 2008).

It starts with a criticism of what Bickhard calls “encodingism”, the idea that mental representations are constituted by encodings, i.e., correspondences between things in the world and symbols (this is very similar to my criticism of the neural coding metaphor, except that Bickhard’s angle is cognitive science while mine was neuroscience). The basic argument is that the encoding “crosses the boundary of the epistemic agent”: the perceptual system stands on only one side of the correspondence, so there is no way it can interpret symbols in terms of things in the world, since it never has access to things in the world at any point. The interpretation of the symbols in terms of things in the world would require an interpreter, some entity that makes sense of a priori arbitrary symbols. But this was precisely the epistemic problem to be solved, so the interpreter is a homunculus and the view is incoherent. This is related to the skeptic argument about knowledge: there cannot be valid knowledge since we acquire knowledge by our senses and we cannot step outside of ourselves to check that it is valid. Encodingism fails the skeptic objection. Note that Bickhard refutes neither the possibility of representations nor even the possibility of encodings, but rather the idea that encodings can be foundational for representations. There can be derivative encodings, based on existing representations (for example, Morse code is a derivative encoding, which presupposes that we know about both letters and dots and dashes).

A key feature that a representational system must have is what Bickhard calls “system-detectable errors”. A representational system must be able to test whether its representations are correct or not. This is not possible in encodingism because the system does not have access to what is being represented (knowledge that cannot be checked is what I called “metaphysical knowledge” in my Subjective physics paper). No learning is possible if there are no system-detectable errors. This is the problem of normativity.

The interactivist model proposes the following solution: representations are anticipations of potential interactions and their expected impact on future states of the system, or on the future course of processes of the system (this is close to Gibson’s “affordances”). I give an example taken from Subjective physics. Consider a sound source located somewhere in space. What does it mean to know where the sound came from? In the encoding view, we would say that the system has a mapping between the angle of the source and properties of the sounds, and so it infers the source’s angle from the captured sounds. But what can this mean? Is the inferred angle in radians or degrees? Surely radians and degrees cannot make sense for the perceiver and cannot have been learned (this is what I called “metaphysical knowledge”), so in fact the representation cannot actually be in the form of the physical angle of the source. Rather, what it means that the source is at a given position is that (for example) you would expect that moving your eyes in a particular way would make the source appear in your fovea (see more detail about the Euclidean structure of space and related topics in Subjective physics). Thus, the notion of space is a representation of the expected consequences of certain types of actions.

The interactivist model of representations has the desirable property that it has system-detectable errors: a representation can be correct or not, depending on whether the anticipation turns out to be correct or not. Importantly, what is anticipated is internal states, and therefore the representation does not cross the boundary of the epistemic agent. Contrary to standard models of representation, the interactivist model successfully addresses the skeptic argument.
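
Here is a toy sketch of that idea (my own illustration, not Bickhard’s formalism): the representation is stored as an anticipated sensory consequence of an action, and the error signal is computed entirely from quantities available inside the agent, so nothing crosses the boundary of the epistemic agent.

```python
# Toy illustration of a representation with system-detectable errors:
# "the source is over there" is held as an anticipation of what a given
# action (an eye movement) will do to future sensory signals.
class Agent:
    def __init__(self):
        # Representation: "if I make this eye movement, the source will be foveal."
        self.anticipations = {"saccade_left": "source_on_fovea"}

    def act_and_check(self, action, sensorimotor_loop):
        """Act, then compare the anticipated sensory outcome with the actual
        one. The mismatch is an error the system can detect by itself."""
        anticipated = self.anticipations.get(action)
        actual = sensorimotor_loop(action)   # the new internal sensory state
        return anticipated == actual         # computed without any access to "the world"

# Stand-in for the world + body (hypothetical, for illustration only):
def sensorimotor_loop(action):
    return "source_on_fovea" if action == "saccade_left" else "source_off_fovea"

agent = Agent()
print(agent.act_and_check("saccade_left", sensorimotor_loop))  # True: the representation held up
```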

The interactivist model is described at a rather abstract level, often referring to abstract machine theory (states of automata). Thus, it leaves aside the problem of its naturalization: how is it instantiated by the brain? Important questions to address are: What is a ‘state’ of the brain (in particular given that the brain is a continuously active dynamical system where no “end state” can be identified)? How do we cope with its distributed nature, that is, with the fact that the epistemic agent is itself constituted of a web of interacting elementary epistemic agents? How are representations built and instantiated?

Better than the grant lottery

Funding rates for most research grant systems are currently very low, typically around 10%. This means that 90% of the time spent on writing and evaluating grant applications is wasted. It means that if each grant spans 5 years, then a PI has to write about 2 grants per year to be continuously funded; in practice, to reduce risk it should be more than 2 per year. It is an enormous waste, and in addition to that, it is accepted that below a certain funding rate, grant selection is essentially random (Fang et al., 2016). Such competition also introduces conservative biases (since only those applications that are consensual can make it to the top 10%), for example against interdisciplinary studies. Thus, low funding rates are a problem not only because of waste but also because they introduce distortions.

For these reasons, a number of scientists have proposed to introduce a lottery system (Fang 2016; see also Mark Humphries’ post): after a first selection of, say, the top 20-30%, the winners are picked at random. This would reduce bias without impacting quality. Thus, it would certainly be progress. However, it does not address the problem of waste: 90% of applications would still be written in vain.

First, there is a very elementary enhancement to be implemented: pick at random before you evaluate the grants, i.e., directly reject every other grant, then select the best 20%. This gives exactly the same result, except the cost of evaluation is divided by two.

Now I am sure it would feel quite frustrating for an applicant to write a full grant only to have it immediately rejected by the flip of a coin. So there is again a very simple enhancement: decide who will get rejected before they write the application. Pick 50% of scientists at random and invite them to submit a grant. Again, the result is the same, but in addition you divide the time spent on grant writing by two.

At this point we might wonder: why do this initial selection at random? It introduces variance for no good reason; you never know in advance whether you will be allowed to apply for funding next year, and this seems arbitrary. Thus, there is an obvious enhancement: replace the lottery by a rotation. Every PI is allowed to submit a grant only every two years. Again, this is equivalent on average to the initial lottery system, except there is less variance and less waste.

This reasoning leads me to a more general point. There is a simple way to increase the success rate of a grant system, which is to reduce the number of applications. The average funding rate of labs does not depend on the number of applications; it depends on the budget, and only on the budget. If you bar 50% of scientists from applying in any given call, you do not halve the average budget of each lab: the average budget allocated to each lab is the same, but the success rate is doubled. A toy calculation below makes this explicit.
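
Here is the toy calculation (my numbers are made up for illustration): the budget funds a fixed number of grants per year, so halving the number of applications doubles the success rate while leaving the expected funding per lab unchanged.

```python
# Toy calculation: the success rate depends on how many labs apply,
# but the expected funding per lab depends only on the budget.
n_labs = 1000
grants_funded_per_year = 100   # fixed by the budget

def stats(n_applicants):
    success_rate = grants_funded_per_year / n_applicants
    # With a rotation, every lab applies the same fraction of years,
    # so the long-run expected number of grants per lab per year is:
    grants_per_lab_per_year = grants_funded_per_year / n_labs
    return success_rate, grants_per_lab_per_year

print(stats(n_labs))        # everyone applies: 10% success, 0.1 grants/lab/year
print(stats(n_labs // 2))   # half apply (rotation): 20% success, 0.1 grants/lab/year
```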

The counter-intuitive part is that, individually, you increase your personal success rate if you apply to more calls. But collectively it is exactly the opposite: the global success rate decreases if there are more calls (for the same overall budget), since there are more applications. Your success rate is low because of other people submitting, not because you are submitting. This is a tragedy of the commons.

There is a simple way to solve it, which is to add constraints. There are different ways to do this: 1) reduce the frequency of calls and merge redundant calls; 2) introduce a rotation (e.g. those born in even years submit in even years); 3) do not allow submission by those who are already funded (or, say, who are in the first years of a grant). Any of these constraints mechanically increases the success rate, and thus reduces both waste and bias, with no impact on average funding. It is better than a lottery.

 

p.s.: There is also an obvious and efficient way to reduce the problem, which is to increase base funding, so that scientists do not need grants in order to survive (see this and other ideas in a previous post).

What is computational neuroscience? (XXXI) The problem of biological measurement (1)

We tend to think of sensory receptors (photoreceptors, inner hair cells) or sensory neurons (retinal ganglion cells; auditory nerve fibers) as measuring physical dimensions, for example light intensity or acoustical pressure, or some function of it. The analogy is with physical instruments of measure, like a thermometer or a microphone. This confers a representational quality to the activity of neurons, an assumption that is at the core of the neural coding metaphor. I explain at length why that metaphor is misleading in many ways in an essay (Brette (2018) Is coding a relevant metaphor for the brain?). Here I want to examine more specifically the notion of biological measurement and the challenges it poses.

This notion comes about not only in classical representationalist views, where neural activity is seen as symbols that the brain then manipulates (the perception-cognition-action model, also called the sandwich model), but also, less obviously, in alternative views. For example, one alternative is to see the brain not as a computer system (encoding symbols, then manipulating them) but as a control system (see Paul Cisek’s behavior as interaction, William Powers’ perceptual control theory, Tim van Gelder’s dynamical view of cognition). In this view, the activity of neurons does not encode stimuli. In fact there is no stimulus per se, as Dewey pointed out: “the motor response determines the stimulus, just as truly as sensory stimulus determines the movement”.

A simple case is feedback control: the system tries to maintain some input at a target value. To do this, the system must compare the input with an internal value. We could imagine, for example, something like an idealized version of the stretch reflex: when the muscle is stretched, sensory feedback triggers a contraction, and we want to maintain muscle length constant. But this apparently trivial task raises a number of deep questions, as does, more generally, the application of control theory to biological systems. Suppose there is a sensor, a neuron that transduces some physical dimension into spike trains, for example the stretch of a muscle. There is also an actuator, which reacts to a spike by a physical action, for example contracting the muscle with a particular time course. I chose a spike-based description not just because it corresponds to the physiology of the stretch reflex, but also because it will illustrate some fundamental issues (which would exist also with graded transduction, but less obviously so).

Now we have a neuron, or a set of neurons, which receive these sensory inputs and send spikes to the actuator. For this discussion, it is not critical that these are actually neurons; we can just consider that there is a system there, and we ask how this system should be designed so as to successfully achieve a control task.

The major issue here is that the control system does not directly deal with the physical dimension. At first sight, we could think this is a minor issue: the physical dimension gets transduced, and we could simply define the target value in the transduced dimension (e.g. the transduced current). But the problem is more serious. What the control system deals with is not simply a function of the physical dimension. More accurately, transduction is a nonlinear dynamical system influenced by a physical signal. The physical signal can be constant, for example, while the transduced current decays (adaptation) and the sensory neuron outputs spike trains, i.e., a highly variable signal. This poses a much more serious problem than a simple calibration problem. When the controlled physical value is at the target value, the sensory neuron might be spiking, perhaps not even at a regular rate. The control system should react to that particular kind of signal by not acting, while it should act when the signal deviates from it. But how can the control system identify the target state, or even know whether to act in one direction or the other?
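
To make the difficulty tangible, here is a minimal toy model (my own sketch, with arbitrary constants; it is not meant to describe any specific receptor): a constant physical signal drives an adapting sensor, whose transduced current decays and whose spike train slows down, so the controller never receives anything that looks like “the input is constant at its target value”.

```python
# Toy adapting sensor: a constant physical signal produces a decaying
# transduced current (adaptation) and an irregular, slowing spike train.
dt = 0.001          # time step (s)
T = 2.0             # duration (s)
s = 1.0             # constant physical signal (e.g., muscle stretch)

adaptation = 0.0    # slow adaptation variable
v = 0.0             # integrate-and-fire membrane variable
spike_times = []
currents = []

for step in range(int(T / dt)):
    t = step * dt
    current = s - adaptation                          # transduced current
    adaptation += dt * (0.3 * s - 0.5 * adaptation)   # slow build-up of adaptation
    currents.append(current)
    v += dt * (10 * current - v)                      # leaky integration
    if v > 1.0:                                       # threshold crossing: spike
        spike_times.append(t)
        v = 0.0

print(f"transduced current: {currents[0]:.2f} at t=0, {currents[-1]:.2f} at t={T}s")
print(f"{len(spike_times)} spikes, at increasing intervals, for a signal that never changed")
```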

Adaptation in neurons is often depicted as an optimization of information transmission, in line with the metaphor of the day (coding). But the relevant question is: how does the receiver of this “information” know how the neuron has adapted? Does it have to de-adapt, to somehow be matched to the adaptive process of the encoding neuron? (This problem has to do with the dualistic structure of the neural coding metaphor.)

There are additional layers of difficulty. We have first recognized that transduction is not a simple mapping from a physical dimension to a biological (e.g. electrochemical) dimension, but rather a dynamical system influenced by a physical signal. Now, this dynamical system depends on the structure of the sensory neuron. It depends, for example, on the number of ionic channels and their properties, and we know these are highly plastic and indeed quite variable both across time and across cells. This dynamical system also depends on elements of the body, or let’s say more generally on the neuron’s environment. For example, the way acoustical pressure is transduced into current by an inner hair cell obviously depends on the acoustical pressure at the eardrum, but that physical signal depends on the shape of the ear, which filters sounds. Properties of neurons change with time too, through development and aging. Thus, we cannot assume that the dynamical transformation from physical signal to biological signal is fixed. Somehow, the control system has to work despite this huge plasticity and the dynamical nature of the sensors.

Let us pause for a moment and outline a number of differences between physical measurements, as with a thermometer, and biological measurements (or “sensing”):

  • The physical meter is calibrated with respect to an external reference, for example 0°C is when water freezes, while 100°C is when it boils. The biological sensor cannot be calibrated with respect to an external reference.
  • The physical meter produces a fixed value for a stationary signal. The biological sensor produces a dynamical signal in response to a stationary signal. More accurately, the biological sensor is a nonlinear dynamical system influenced by the physical signal.
  • The physical meter is meant to be stable, in that the mapping from physical quantity to measurement is fixed. When it is not, this is considered an error. The biological sensor does not have fixed properties. Changes in properties occur in the normal course of life, from birth to death, and some changes in properties are interpreted as adaptations, not errors.

From these differences, we realize that biological sensors do not provide physical measurements in the usual sense. The next question, then, is how can a biological system control a physical dimension with biological sensors that do not act as measurements of that dimension?

What is computational neuroscience? (XXIX) The free energy principle

The free energy principle is the theory that the brain manipulates a probabilistic generative model of its sensory inputs, which it tries to optimize by either changing the model (learning) or changing the inputs (action) (Friston 2009; Friston 2010). The “free energy” is related to the error between predictions and actual inputs, or “surprise”, which the organism wants to minimize. It has a more precise mathematical formulation, but the conceptual issues I want to discuss here do not depend on it.

Thus, it can be seen as an extension of the Bayesian brain hypothesis that accounts for action in addition to perception. It shares the conceptual problems of the Bayesian brain hypothesis, namely that it focuses on statistical uncertainty, inferring the variables of a model (called “causes”), when the challenge is to build and manipulate the structure of the model. It also shares issues with the predictive coding concept, namely that there is a conflation between a technical sense of “prediction” (expectation of the future signal) and a broader sense that is more ecologically relevant (if I do X, then Y will happen). In my view, these are the main issues with the free energy principle. Here I will focus on an additional issue that is specific to the free energy principle.

The specific interest of the free energy principle lies in its formulation of action. It resonates with a very important psychological theory called cognitive dissonance theory. That theory says that you try to avoid dissonance between facts and your system of beliefs, by either changing the beliefs in a small way or avoiding the facts. When there is a dissonant fact, you generally don’t throw away your entire system of beliefs: rather, you alter the interpretation of the fact (think of political discourse, or in fact scientific discourse). Another strategy is to avoid the dissonant facts: for example, to read newspapers that tend to have the same opinions as yours. So there is some support in psychology for the idea that you act so as to minimize surprise.

Thus, the free energy principle acknowledges the circularity of action and perception. However, it is quite difficult to make it account for a large part of behavior. A large part of behavior is directed towards goals; for example, to get food and sex. The theory anticipates this criticism and proposes that goals are ingrained in priors. For example, you expect to have food. So, for your state to match your expectations, you need to seek food. This is the theory’s solution to the so-called “dark room problem” (Friston et al., 2012): if you want to minimize surprise, why not shut off stimulation altogether and go to the closest dark room? Solution: you are not expecting a dark room, so you are not going there in the first place.

Let us consider a concrete example to show that this solution does not work. There are two kinds of stimuli: food, and no food. I have two possible actions: to seek food, or to sit and do nothing. If I do nothing, then with 100% probability, I will see no food. If I seek food, then with, say, 20% probability, I will see food.

Let’s say this is the world in which I live. What does the free energy principle tell us? To minimize surprise, it seems clear that I should sit: I am certain to see no food. No surprise at all. The proposed solution is that you have a prior expectation to see food. So, to minimize surprise, you should put yourself in a situation where you might see food, i.e., seek food. This seems to work. However, if there is any learning at all, then you will quickly observe that the probability of seeing food is actually 20%, and your expectations should be adjusted accordingly. I will also observe that between two food expeditions, the probability of seeing food is 0%. Once this has been observed, surprise is minimal when I do not seek food. So, I die of hunger. It follows that the free energy principle does not survive Darwinian competition.
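
Here is the arithmetic behind this toy example (a minimal sketch; I take surprise to be the negative log-probability of the observed outcome, as in the usual formulation): once the contingencies have been learned, sitting is the policy with minimal expected surprise.

```python
import math

# Expected surprise (in bits) of the food / no-food observation,
# where surprise of an outcome = -log2 p(outcome) under the learned model.
def expected_surprise(p_food):
    total = 0.0
    for p in (p_food, 1.0 - p_food):
        if p > 0:
            total -= p * math.log2(p)
    return total

print(expected_surprise(0.0))   # sit and do nothing: 0 bits (no surprise, and no food)
print(expected_surprise(0.2))   # seek food: ~0.72 bits of expected surprise
```

So once the model is learned, “minimize surprise” selects the policy that starves, which is the point of the example.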

Thus, either there is no learning at all and the free energy principle is just a way of calling predefined actions “priors”; or there is learning, but then it doesn’t account for goal-directed behavior.

The idea to act so as to minimize surprise resonates with some aspects of psychology, like cognitive dissonance theory, but that does not constitute a complete theory of mind, except possibly of the depressed mind. See for example the experience of flow (as in surfing): you seek a situation that is controllable but sufficiently challenging that it engages your entire attention; in other words, you voluntarily expose yourself to a (moderate amount of) surprise; in any case certainly not a minimum amount of surprise.

Is a thermostat conscious?

A theory of consciousness initially proposed by David Chalmers (in his book The Conscious Mind) is that consciousness (or experience) is a property of information-processing systems. It is an additional property, not logically implied by physical laws; a new law of nature. The theory was later formalized by Giulio Tononi into Integrated Information Theory (IIT), based on Shannon’s mathematical concept of information. One important feature of this theory is that it is a radical form of panpsychism: it assigns consciousness (to different degrees) to virtually anything in the world, including a thermostat.

The Bewitched thought experiment

I have criticized IIT previously on the grounds that it fails to define in a sensible way what makes a conscious subject (e.g. a subsystem of a conscious entity would be another conscious entity, so for example your brain would produce an infinite number of minds). But here I want to comment specifically on the example of the thermostat. It is an interesting example brought up by Chalmers in his book. The reasoning is as follows: a human brain is conscious; a mouse brain is probably conscious, but to a somewhat lower degree (for example, with no self-consciousness). As we go down the scale of information-processing systems, the system might be less and less conscious, but why would there be a definite threshold for consciousness? Why would a billion neurons be conscious but not a million? Why would a million neurons be conscious but not one thousand? And how about just one neuron? How about a thermostat? A thermostat is an elementary information-processing system with just two states, so maybe, Chalmers argues, the thermostat has a very elementary form of experience.

To claim that a thermostat is conscious defies intuition, but I would not follow Searle in insisting that the theory must be wrong because it assigns consciousness to things that we wouldn’t intuitively think are conscious. As I argued in a previous post, to claim that biology tells us that only brains are conscious is to use a circular argument. We don’t know whether anything other than a brain is conscious, and since consciousness is subjective, deciding whether anything is conscious is going to involve some theoretical aspects. Nonetheless, I am skeptical that a thermostat is conscious.

I propose to examine the Bewitched thought experiment. In the TV series Bewitched, Samantha the housewife twitches her nose and everyone freezes except her. Then she twitches her nose again and everyone unfreezes, without noticing that anything happened. For them, time has effectively stopped. The question is: was anyone experiencing anything during that time? To me, it is clear that no one can experience anything if time is frozen. In fact, that whole time has not existed at all for the conscious subject. It follows that a substrate with a fixed state (e.g. hot/cold) cannot experience anything, because time is effectively frozen for that substrate. Experience requires a flow of time, a change in structure through time. I leave it open whether the interaction of the thermostat with the room might produce experience for that coupled system (see below for some further thoughts).

What is “information”?

In my view, the fallacy in the initial reasoning is to put the thermostat and the brain on the same scale, the scale of information-processing systems. But as I have argued before (mostly following Gibson’s arguments), it is misleading to see the brain as an information-processing system. The brain can only be seen to transform information of one kind into information of another kind by an external observer, because the very concept of information is something that makes sense to a cognitive/perceptual system. The notion of information used by IIT is Shannon information, a notion from communication theory. This is an extrinsic notion of information: for example, neural activity is informative about objects in the world in the sense that properties of those objects can be inferred from neural activity. But this is totally unhelpful for understanding how the brain, which only ever gets to deal with neural signals and not with things in the world, sees the world (see this argument in more detail in my paper Is coding a relevant metaphor for the brain?).

Let’s clarify with a concrete case: does the thermostat perceive temperature? The thermostat can be in different states depending on temperature, but from its perspective, there is no temperature. There are changes in state that seem to be unrelated to anything else (there is literally nothing else for the thermostat). One could replace the temperature sensor with some other sensor, or with a random number generator, and there would be literally no functional change in the thermostat itself. Only an external observer can link the thermostat’s state with temperature, so the thermostat cannot possibly be conscious of temperature.

Thus, Shannon’s notion of information is inappropriate for understanding consciousness. Instead of extracting information in the sense of communication theory, what the brain might do is build models of sensory (sensorimotor) signals from its subjective perspective, in the same way as scientists make models of the world with observations (= sensory signals) and experiments (= actions). But this intrinsic notion of information, which corresponds e.g. to laws of physics, is crucially not Shannon’s notion of information. And it is also not the kind of information that a thermostat deals with.

This inappropriate notion of information leads to what is in my view a rather absurd quantitative scale of consciousness, according to which entities are more or less conscious along a graded scale (phi). Differences in consciousness are qualitative, not quantitative: there is dreaming, being awake, being self-conscious or not, etc. These are not different numbers. This odd analog scale arises because Shannon information is counted in bits. But information in the sense of knowledge (science) is not counted in bits; there are different kinds of knowledge, with different structures, relations between them, etc.

Subjective physics of a thermostat

But let us not throw away Chalmers’ interesting thought experiment just yet. Let us ask, following Chalmers: what does it feel like to be a thermostat? We will examine it not with Shannon’s unhelpful notion of information but with what I called “subjective physics”: the laws that govern sensory signals and their relations to actions, from the perspective of the subject. This will define my world from a functional viewpoint. Let’s say I am a conscious thermostat, a homunculus inside the thermostat. All I can observe is a binary signal. Then there is a binary action that I can make, which for an external observer corresponds to turning on the heat. What kind of world does that make for me? Let’s say I’m a scientist homunculus: what kind of laws about the world can I infer?

If I’m a conventional thermostat, then the action will be automatically triggered when the signal is in a given state (“cold”). After some time, the binary signal will switch and so will the action. So in fact there is an identity between signal and action, which means that all I really observe is just the one binary signal, switching on and off, probably with some kind of periodicity. This is the world I might experience, as a homunculus inside the thermostat (note that to experience the periodicity requires memory, which a normal thermostat doesn’t have). In a way, I’m a “locked-in” thermostat: I can make observations, but I cannot freely act.

Let’s say that I am not locked-in and have a little more free will, so I can decide whether to act (heat) or not. Then my world is a little more interesting: my action can trigger a switch of the binary signal, after some latency (again requiring some memory), and when I stop, the binary signal switches back, after a time that depends on how long my previous action lasted. So here I have a world that is much more structured, with relatively complex laws which, in a way, define the concept of “temperature” from the perspective of the thermostat.
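
Here is a minimal simulation of this little world (my toy sketch, with arbitrary constants): from the inside, all there is is one binary observation, one binary action, and the lawful, delayed relation between the two.

```python
# Toy world of the (non-locked-in) thermostat homunculus: one binary
# observation ("cold" or not) and one binary action (heat or not).
temperature = 15.0     # hidden physical variable, unknown to the homunculus
threshold = 20.0

history = []
for t in range(40):
    observation = temperature < threshold    # the one binary signal: "cold"
    action = (t % 20) < 10                   # the homunculus heats for 10 steps, rests for 10
    history.append((t, observation, action))
    temperature += 1.0 if action else -1.0   # hidden physics of the room

# The homunculus's "subjective physics": the observation switches some time
# after the action switches, with a latency that depends on the recent past.
for t, obs, act in history[:15]:
    print(t, "cold" if obs else "warm", "heating" if act else "idle")
```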

So if a thermostat were conscious, then we have a rough idea of the kind of world it might experience (although not of what that would feel like), and even in this elementary example, you can't measure these experiences in bits - setting aside the fact that a thermostat is not conscious anyway.

A brief critique of predictive coding

Predictive coding is becoming a popular theory in neuroscience (see for example Clark 2013). In a nutshell, the general idea is that brains encode predictions of their sensory inputs. This is an appealing idea because superficially, it makes a lot of sense: functionally, the only reason why you would want to process sensory information is if it might impact your future, so it makes sense to try to predict your sensory inputs.

There are substantial problems in the details of predictive coding theories, for example the arbitrariness of the metric by which you judge that your prediction matches sensory inputs (what is important?), or the fact that predictive coding schemes encode both noise and signal. But I want to focus on the more fundamental problems. One has to do with “coding”, the other with “predictive”.

It makes sense that brains anticipate. But does it make sense that brains code? Coding is a metaphor of a communication channel, and this is generally not a great metaphor for what the brain might do, unless you fully embrace dualism. I discuss this at length in a recent paper (Is coding a relevant metaphor for the brain?) so I won’t repeat the entire argument here. Predictive coding is a branch of efficient coding, so the same fallacy underlies its logic: 1) neurons encode sensory inputs; 2) living organisms are efficient; => brains must encode efficiently. (1) is trivially true in the sense that one can define a mapping from sensory inputs to neural activity. (2) is probably true to some extent (evolutionary arguments). So the conclusion follows. Critiques of efficient coding have focused on the “efficient” part: maybe the brain is not that efficient after all. But the error is elsewhere: living organisms are certainly efficient, but it doesn’t follow that they are efficient at coding. They might be efficient at surviving and reproducing, and it is not obvious that it entails coding efficiency (see the last part of the abovementioned paper for a counter-example). So the real strong assumption is there: the main function of the brain is to represent sensory inputs.

The second problem has to do with “predictive”. It makes sense that an important function of brains, or in fact of any living organism, is to anticipate (see the great Anticipatory Systems by Robert Rosen). But to what extent do predictive coding schemes actually anticipate? First, in practice, these are generally not prediction schemes but compression schemes, in the sense that they do not tell us what will happen next but what happens now. This is at least the case for the classical Rao & Ballard (1999) model. Neurons encode the difference between the expected input and the actual input: this is compression, not prediction. It uses a sort of prediction in order to compress: other neurons (in higher layers) produce predictions of the inputs to those neurons, but the term prediction is used in the sense that the inputs are not known to the higher-layer neurons, not that the “prediction” occurs before the inputs. Thus the term “predictive” is misleading because it is not used in a temporal sense.
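
A minimal sketch of this point (loosely in the spirit of Rao & Ballard, not their actual model): the “prediction” is a top-down reconstruction of the current input from a lower-dimensional cause, and what the lower layer transmits is the residual. This compresses the present; it says nothing about what the input will be next.

```python
import numpy as np

# "Predictive coding" as compression of the current input: a higher layer
# explains the present input with a few hidden causes, and the lower layer
# transmits only the prediction error (the residual).
rng = np.random.default_rng(0)
weights = rng.normal(size=(2, 10))          # top-down generative weights (2 causes -> 10 inputs)
x = 3.0 * weights[0] + 1.0 * weights[1]     # the current input, generated by two causes

# Infer the causes that best explain the *current* input (least squares),
# then form the top-down "prediction" and the transmitted error.
causes, *_ = np.linalg.lstsq(weights.T, x, rcond=None)
prediction = causes @ weights               # reconstruction of the present input
error = x - prediction                      # what the lower layer would transmit

print(np.round(causes, 3))       # ~[3. 1.]: a compressed description of x
print(np.abs(error).max())       # ~0: the input is explained, not forecast
```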

However, it is relatively easy to imagine how predictive coding might be about temporal predictions, although the neural implementation is not straightforward (delays, etc.). So I want to make a deeper criticism. I started by claiming that it is useful to predict sensory inputs. I am taking this back (I can, because I said it was superficial reasoning). It is not useful to know what will happen. What is useful is to know what might happen, depending on what you do. If there is nothing you can do about the future, what is the functional use of predicting it? So what is useful is to predict the future conditionally on a set of potential actions. This is about manipulating models of the world, not representing the present.

The substrate of consciousness

Here I want to stir some ideas about the substrate of consciousness. Let us start with a few intuitive ideas: a human brain is conscious; an animal brain is probably conscious; a stone is not conscious; my stomach is not conscious; a single neuron or cell is not conscious; the brainstem or the visual cortex is not a separate conscious entity; two people do not form a single conscious entity.

Many of these ideas are in fact difficult to justify. Let us start with single cells. To see the problem, think first of organisms that consist of a single cell, for example bacteria or ciliates. In this video, an amoeba engulfs and then digests two paramecia. At some point, you can see the paramecia jumping all around as if they were panicking. Are these paramecia conscious, do they feel anything? If I did not know anything about their physiology or size, my first intuition would be that they do feel something close to fear. However, knowing that these are unicellular organisms and therefore do not have a nervous system, my intuition is rather that they are not actually conscious. But why?

Why do we think a nervous system is necessary for consciousness? One reason is that organisms to which we ascribe consciousness (humans and animals, or at least some animals) all have a nervous system. But it’s a circular argument, which has no logical validity. A more convincing reason is that in humans, the brain is necessary and sufficient for consciousness. A locked-in patient is still conscious. On the other hand, any large brain lesion has an impact on conscious experience, and specific experiences can be induced by electrical stimulation of the brain.

However, this tends to prove that the brain is the substrate of my experience, but it says nothing about, say, the stomach. The stomach also has a nervous system; it receives sensory signals and controls muscles. If it were conscious, I could not experience it, by definition, since you can only experience your own consciousness. So it could also be, just as for the brain, that the stomach is necessary and sufficient for the consciousness of the gut mind: perhaps if you stimulate it electrically, it triggers some specific experience. As ridiculous as it might sound, I cannot discard the idea that the stomach is conscious just because I don’t feel that it’s conscious; I will need arguments of a different kind.

I know I am conscious, but I do not know whether there are other conscious entities in my body. Unfortunately, this applies not just to the stomach, but more generally to any other component of my body, whether it has a nervous system or not. What tells me that the liver is not conscious? Imagine I am a conscious liver. From my perspective, removing one lung, or a foot, or a large part of the visual cortex, has no effect on my conscious experience. So the fact that the brain is necessary and sufficient for your conscious experience doesn’t rule out the fact that some other substrate is necessary and sufficient for the conscious experience of another entity in your body. Now I am not saying that the question of liver consciousness is undecidable, only that we will need more subtle arguments than those exposed so far (discussed later).

Let us come back to the single cell. Although I feel that a unicellular organism is not conscious because it doesn’t have a nervous system, so far I have no valid argument for this intuition. In addition, it turns out that Paramecium, like many other unicellular organisms including (at least some) bacteria, is an excitable cell with voltage-gated channels, structurally very similar to a neuron. So perhaps it has some limited form of consciousness after all. If this is true, then I would be inclined to say that all unicellular organisms are conscious, for example bacteria. But then what about a single cell (e.g. a neuron) in your body: is it conscious? One might object that a single cell in a multicellular organism is not an autonomous organism. To address this objection, I will go one level below the cell.

Eukaryotic cells (e.g. your cells) have little energy factories called mitochondria. It turns out that mitochondria are in fact bacteria that were engulfed by cells a very long (evolutionary) time ago. They have their own DNA, but they now live and reproduce inside cells. This is a case of endosymbiosis. If mitochondria were conscious before they lived in cells, why would they have lost consciousness when they started living in cells? So if we think bacteria are conscious, then we must admit that we have trillions of conscious entities in the cells of our body – not counting the bacteria in our digestive system. The concept of an autonomous organism is an illusion: any living organism depends on interactions with an ecosystem, and that ecosystem might well be a cell or a multicellular organism.

By the same argument, if we think unicellular organisms are conscious, then single neurons should be conscious, as well as all single cells in our body. This is not exclusive of the brain being conscious as a distinct entity.

A plausible alternative, of course, is that single cells are not conscious, although I have not yet proposed a good argument for this alternative. Before we turn to a new question, I will let you contemplate the fact that bacteria can form populations that are tightly coupled by electrical communication. Does this make a bacterial colony conscious?

Let us now turn to another question. We can imagine that a cell is somehow minimally conscious, and that at the same time a brain forms a conscious entity of a different nature. Of course it might not be true, but there is a case for that argument. So now let us consider two people living their own life on opposite sides of the planet. Can this pair form a new conscious entity? Here, there are arguments to answer negatively. This is related to a concept called the unity of consciousness.

Suppose I see a red book. In the brain, some areas might respond to the color and some other areas might respond to the shape. It could then be that the color area experiences redness, and the shape area experiences bookness. But I, as a single conscious unit, experience a red book as a whole. Now if we consider two entities that do not interact, then there cannot be unified experiences: somehow the redness and the bookness must be put together. So the substrate of a conscious entity cannot be made of parts that do not interact with the rest. Two separated people cannot form a conscious entity. But this does not rule out the possibility that two closely interacting people form a conscious superentity. Again, I do not believe this is the case, but we need to find new arguments to rule this out.

Now we finally have something a little substantial: a conscious entity must be made of components in interaction. From this idea follow a few remarks. First, consciousness is not a property of a substrate, but of the activity of a substrate (see a previous blog post on this idea). For example, if we freeze the brain in a particular state, it is not conscious. This rules out a number of inanimate objects (rocks) as conscious. Second, interactions take place in time. For example, it takes some time, up to a few tens of ms, for an action potential to travel from one neuron to another. This implies that a 1 ms time window cannot enclose a conscious experience. The “grain” of consciousness for a human brain should thus be no less than a few tens of milliseconds. In the same way, if a plant is conscious, then that consciousness cannot exist on a short timescale. This puts a constraint on the kind of experiences that can be ascribed to a particular substrate. Does consciousness require a nervous system? Maybe it doesn’t, but at least for large organisms, a nervous system is required to produce experiences on a short timescale.

I want to end with a final question. We are asking what kind of substrate gives rise to consciousness. But does consciousness require a fixed substrate? After all, the brain is dynamic. Synapses appear and disappear all the time, and all the proteins get renewed regularly. The brain is literally a different set of molecules and a different structure from one day to the next. But the conscious entity remains. Or at least it seems so. This is what Buddhists call the illusion of self: contrary to your intuition, you are not the same person today as ten years ago; the self has no objective permanent existence. However, we can say that there is a continuity in conscious experience. That continuity, however, does not rely on a fixed material basis but more likely on some continuity of the underlying activity. Imagine for example a fictional worm that is conscious, but whose substrate of consciousness is local. At some point consciousness is produced by the interaction of neurons at some particular place in the nervous system, then that activity travels along the worm’s spine. The conscious entity remains and doesn’t feel like it’s travelling; it is simply grounded on a dynamic substrate.

Now I don’t think that this is true of the brain (or of the worm); rather, long-range synchronization has something to do with the generation of a global conscious entity. However, it is conceivable that different subsets of neurons, even though they might span the same global brain areas, are involved in conscious experience at different times. In fact, this is even plausible. Most neurons don’t fire much, perhaps a few Hz on average. But one can certainly have a definite conscious experience over a fraction of a second, and that experience can thus only involve the interaction of a subset of all neurons. We must conclude that the substrate of consciousness is actually not fixed but involves dynamic sets of neurons.

A summary of these remarks. I certainly have raised more questions than I have answered. In particular, it is not clear whether a single cell or a component of the nervous system (stomach, brainstem) is conscious. However, I have argued that: 1) any conscious experience requires the interaction of the components that produce it, and this interaction takes place in time; 2) the set of components that are involved in any particular experience is dynamic, despite the continuity in conscious experience.

Project: Binaural cues and spatial hearing in ecological environments

I previously laid out a few ideas for future research on spatial hearing:

  1. The ecological situation and the computational problem.
  2. Tuning curves.
  3. The coding problem.

This year I wrote a grant that addresses the first point. My project was rather straightforward:

  1. To make binaural recordings with ear mics in real environments, with real sound sources (actual sounding objects) placed at predetermined positions. This way we obtain distributions of binaural cues conditioned on source direction, capturing the variability due to context.
  2. To measure human localization performance in those situations.
  3. To see whether a Bayesian model can account for these results, and possibly for previous psychophysical results.

The project was preselected but unfortunately not funded. I probably won't resubmit it next year, except perhaps with a collaboration. So here it is for everyone to read: my grant application, "Binaural cues and spatial hearing in ecological environments". If you like it, if you want to do these recordings and experiments, please do so. I am interested in the results, but I'm happy if someone else does it. Please contact me if you would like to set up a collaboration, or to discuss the project. I am especially interested in the theoretical analysis (i.e. the third part of the project). Our experience in the lab is primarily on the theoretical side, but also in signal analysis, and we have also done a number of binaural recordings and some psychophysics.
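
For anyone interested in the theoretical part (point 3 above), here is a minimal sketch of the kind of Bayesian model I have in mind, with entirely made-up numbers: given measured distributions of a binaural cue (say, ITD) conditioned on source direction, localization amounts to posterior inference over direction.

```python
import numpy as np

# Minimal Bayesian localization sketch (made-up numbers): given distributions
# of a binaural cue (here an idealized ITD) conditioned on source direction,
# infer the direction from an observed cue value.
directions = np.linspace(-90, 90, 37)                  # candidate azimuths (degrees)
itd_mean = 0.7e-3 * np.sin(np.radians(directions))     # idealized mean ITD per direction (s)
itd_sd = 0.1e-3                                        # spread due to context (here fixed)

def posterior(observed_itd, prior=None):
    """p(direction | observed ITD), assuming Gaussian cue distributions."""
    prior = np.ones_like(directions) if prior is None else prior
    likelihood = np.exp(-0.5 * ((observed_itd - itd_mean) / itd_sd) ** 2)
    p = likelihood * prior
    return p / p.sum()

p = posterior(0.3e-3)
print(directions[np.argmax(p)])   # most probable azimuth for this cue value (~25 degrees)
```

In the project, the Gaussian would be replaced by the empirical, context-dependent cue distributions obtained from the recordings.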