What is computational neuroscience? (XXXVI) Codes and processes

There are two classes of problems with the concept of neural codes. Initially, while working on my critique of the neural coding metaphor, I focused mostly on the epistemic problem (the first two parts of the paper). The epistemic problem is that when we say that Y is a neural code for X and that Y is metaphorically decoded by the brain, we imply that Y is informative about X by simple virtue of being in lawful relation with X. But this is a kind of information that is only available to an external observer who can see both X and Y, and who knows the two domains and the correspondence between them. If some other variable X’, rather than X, had caused the neural activity Y, the organism could never tell from observing Y alone. Therefore Y cannot be a primary representation of X for the organism. Of course it could be a secondary representation of X, if the organism could observe that Y is in lawful relation with Z, a primary representation of X. But then we need to account for the existence of that primary representation, which cannot be based on an encoding. A number of other authors have made similar criticisms, in particular Mark Bickhard.

The epistemic problem is, I would say, the “easy problem” of neural codes. Addressing it gives rise to alternative notions of information based on internal relations, such as O’Regan’s sensorimotor contingencies, Gibson’s invariance structure, and my subjective physics.

But there is a deeper, more fundamental problem. It has to do with substance vs. process metaphysics and the way time is conceived (or in this case, disregarded). I address it in the third part of the neural coding essay, and in my response to commentaries (especially the third part). To explain it, I will compare the neural code with the genetic code. There are some problematic aspects to the idea of a “genetic code”, but in its most unproblematic form, there is a lawful correspondence between triplets of nucleotides and amino acids, which we can call a code. Nucleotides and amino acids are two types of substances, that is, stable entities (molecules). Nucleotides are transformed into amino acids by a process that unfolds through time (translation). A process is not a substance; it may involve some substance, for sure (e.g. enzymes), but it is the activity that defines the process. The code refers to a lawful relation between two types of substances, disregarding the process.
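To make the distinction concrete, here is a toy sketch in Python (a handful of real codons only, chosen for illustration): the dictionary is the “code”, a static correspondence between two kinds of stable entities, while the loop stands in for the process (translation), which is nowhere to be found in the table itself.

```python
# The "code": a lawful correspondence between codons and amino acids.
# Note that the table says nothing about the process that uses it.
GENETIC_CODE = {
    "AUG": "Met",   # methionine (also the start codon)
    "UUU": "Phe",   # phenylalanine
    "GGC": "Gly",   # glycine
    "GAA": "Glu",   # glutamate
    "UAA": "STOP",  # stop codon
}

def translate(rna: str) -> list[str]:
    """A stand-in for the process: read codons in order, stop at a stop codon."""
    peptide = []
    for i in range(0, len(rna) - 2, 3):
        amino_acid = GENETIC_CODE[rna[i:i + 3]]
        if amino_acid == "STOP":
            break
        peptide.append(amino_acid)
    return peptide

print(translate("AUGUUUGGCUAA"))  # ['Met', 'Phe', 'Gly']
```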

With this analysis in mind, “neural codes” now look very peculiar. The neural end of the code is not a substance at all. It is a particular measurement of the activity of neurons done at a particular time, for example the number of spikes during a particular time window. We then treat this number as the output of some process, as a kind of stable entity that can be further manipulated and transformed by other processes. Of course this is exactly what it is for the experimenter, who manipulates those measurements, makes calculations, etc. But from the organism’s perspective this view is very puzzling: the activity of neurons is the process, not the result of a process (what other process?). Neurons do not produce stable entities like amino acids which can participate in various processes. A spike is not a stable entity; it is a timed event in the process of neural interaction (like, say, the binding of an enzyme to RNA), and measurements like spike counts are simply “snapshots” of that process. It is not coherent to treat signatures of processes as if they were substances.

What is computational neuroscience? (XXXV) Metaphors as morphisms

What is a metaphor? Essentially, a metaphor is an analogy that doesn’t say its name. We use metaphors all the time without even noticing it, as was beautifully demonstrated by Lakoff & Johnson (1980). When I say for example, “let me cast some light on this issue”, I am using a fairly sophisticated metaphor in which I make an analogy between understanding and seeing. In that analogy, an explanation allows you to understand, in the same way as light allows you to see. You might then reply: I see what you mean, it is clearer! Chances are that, in normal conversation, we would not have noticed that we both used a metaphor.

Metaphors are everywhere in neuroscience, and in biology more generally (see these posts). For example: evolution optimizes traits (see the excellent article of Gould & Lewontin (1979) for a counterpoint); the genome is a code for the organism (see Denis Noble (2011a; 2011b)); the brain runs algorithms, or is a computer (see also Paul Cisek (1999) or Francisco Varela); neural activity is a code.

These metaphors are so ingrained in neuroscientific thinking that many object to the very idea that they are metaphorical. The objection is that “evolution is optimization” or “the brain runs algorithms” is not a metaphor but a theory. Or, for the more dogmatic, these are not metaphors, these are facts.

Indisputable truths belong to theology, not science, so any claim that a general proposition is a fact should be seen as suspect – it is an expression of dogmatism. But there is a case to be made that we are actually talking about theories. In the case of neural codes or brains as computers, one might insist that the terms “code” or “computer” refer to abstract properties, not to concrete objects like a desktop computer. But this is a misunderstanding of what a metaphor, or more generally an analogy, is. When I am “casting light on this issue”, I am not referring to any particular lamp, but to an abstract concept of light which does not actually involve photons. The question is not whether words are actually some sort of photons, but whether the functional relation between light and seeing is similar to the functional relation between explanation and understanding. There is no doubt that these concepts are abstracted from actual properties of concrete situations (of light and perception), but so are the concepts of code and computer. In the metaphor, it is the abstract properties that are at stake, so the objection “it is not a metaphor, it is a theory” either misunderstands what a metaphor is (a metaphor is a theory), or perhaps really means “the theory is correct” – again dogmatism.

For the mathematically minded, a mathematical concept that captures this idea is “morphism”. A morphism is a map that preserves structure. For example, a group homomorphism f from X to Y is such that f(a ∗ b) = f(a) · f(b): the operation ∗ defined on X is mapped to the operation · defined on Y (of course “metaphors are morphisms” is a metaphor!).
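A concrete instance, as a minimal Python check: the exponential is a group homomorphism from (ℝ, +) to (ℝ>0, ×), since it maps addition in the first domain to multiplication in the second.

```python
import math

# exp preserves structure: the operation + on the source domain is mapped
# to the operation * on the target domain, i.e. exp(a + b) == exp(a) * exp(b).
a, b = 1.3, -0.7
assert math.isclose(math.exp(a + b), math.exp(a) * math.exp(b))
print(math.exp(a + b), math.exp(a) * math.exp(b))
```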

For example, in the “let me cast light on this issue” metaphor, I am mapping the domain of visual perception to the domain of linguistic discourse: light -> words; visual object -> issue; seeing -> understanding. What makes the metaphor interesting is that some relations within the first domain are mapped to relations in the other domain: use of light on an object causes seeing; use of words on an issue causes understanding.

Another example in science is the analogy between the heart and a pump. Each element of the pump (e.g. valve, liquid) is mapped to an element of the heart, and the analogy is relevant because functional relations between elements of the pump are mapped to corresponding relations between elements of the heart. Thus, the analogy has explanatory power. What makes a metaphor or an analogy interesting is not the fact that the two domains are similar (they are generally not), but the richness of the structure preserved by the implied morphism.

In other words, a metaphor or an analogy is a theory that takes inspiration from another domain (e.g. computer science), by mapping some structure from one domain to the other. There is nothing intrinsically wrong with this, on the contrary. Why then is the term “metaphor” so vehemently opposed in science? Because the term implies that the theory is questionable (hence, again, dogmatism). There are ways in which understanding is like seeing, but there are also ways in which it is different.

Let us consider the metaphor “the brain implements algorithms”, which I previously discussed. Some are irritated by the very suggestion that this might even be a metaphor. The rhetorical strategy is generally two-fold: 1) by “algorithm”, we mean some abstract property, not programs written in C++; 2) the definition of “algorithm” is made general enough that it is trivially true, in which case it is not a metaphor since it is literally true. As argued, (1) is a misunderstanding of linguistics because metaphor is about abstract properties. And if we follow (2), then nothing can be inferred from the statement. Thus, it is only to the extent that “the brain implements algorithms” is metaphorical that it is insightful (and it is to some extent, but in my view to a limited extent).

The key question, thus, is what we mean by “algorithm”. A natural starting point would be to take the definition from a computer science textbook. The most used textbook on the subject is probably Cormen et al., Introduction to algorithms. It proposes the following definition: “a sequence of computational steps that transform the input into the output”. One would need to define what “computational” means in this context, but it is not key for this discussion. With this definition, to say that the brain implements an algorithm means that there exists a morphism between brain activity and a sequence of computational steps. That is, intermediate values of the algorithm are mapped to properties of brain activity (e.g. firing rates measured over some time window) – this is the “encoding”. Then we claim that this mapping has the property that a computational step linking two values is mapped to the operation of the dynamics of the brain linking the two corresponding neural measurements. I explain in the third part of my essay on neural coding why this claim cannot be correct in general, but at best approximately (one reason is that a measurement of neural activity must be done over some time window, and thus cannot be considered as an initial state of a dynamical system, from which you could deduce the future dynamics). But this is not the point of this discussion. The point is that this claim, that there is a morphism between an algorithm and brain activity, is not trivial and it has explanatory value. In other words, it is interesting. This stems from the rich structure that is being mapped between the two domains.

Since it is not trivial (as is in fact any metaphor), a discussion will necessarily arise about whether and to what extent the implied mapping does in fact preserve structure between the two domains. You could accept this state of affairs and provide empirical or theoretical arguments. Or you could dismiss the metaphorical nature entirely. But by doing so, you are also dismissing what is interesting about the metaphor, that is, the fact that there might be a morphism between two domains. We could for example redefine “algorithm” in a more general way as a computable function, even though that is not what is usually meant by the term (as the Cormen textbook shows). But in that case, the claim loses all explanatory value because no structure at all is transported between the two domains. We are just calling sensory signals “input”, motor commands “output”, and whatever happens in between “algorithm”. In mathematical terms, this is a mapping but not a morphism.

Thus, metaphors are interesting because they are morphisms between domains, which is what gives them scientific value (they are models). The problem, however, is that metaphors are typically covert, and the failure to recognize them as such leads to dogmatism. When one objects to the use of words like “code”, “algorithm”, “representation” or “optimization”, a common reaction is that the issue “is just semantic”. What this means is that it is just about arbitrary labels, and the labels themselves do not really matter. As if scientific discourse were essentially uninteresting and trivial (we just observe things and give them names). This reaction reveals a naïve view of language in which words are mappings (between objects and arbitrary labels), when what matters is the structured concepts that words refer to through morphisms, not just mappings. This is what metaphor is about.

What is computational neuroscience? (XXXIV) Is the brain a computer (2)

In a previous post, I argued that the way the brain works is not algorithmic, and therefore it is not a computer in the common sense of the term. This contradicts a popular view in computational neuroscience that the brain is a kind of computer that implements algorithms. That view comes from formal neural network theory, and the argumentation goes as follows. Formal neural networks can implement any computable function, that is, any function that can be implemented by an algorithm. Thus the brain can implement algorithms for computable functions, and is therefore by definition a computer. There are multiple errors in this reasoning. The most salient is a semantic drift on the concept of algorithm; the second major error is a confusion about what a computer is.

Algorithms

A computable function is a function that can be implemented by an algorithm. But the converse “if a function is computable, then whatever implements this function runs an algorithm” is not true. To see this, we need to be a bit more specific about what is meant by “algorithm” and “computable function”.

Loosely speaking, an algorithm is simply a set of explicit instructions to solve a problem. A cooking recipe is an algorithm in this sense. For example, to cook pasta: put water in a pan; heat it up; when the water boils, add the pasta; wait for 10 minutes. The execution of this algorithm occurs in continuous time in a real environment. But what is algorithmic about this description is the discrete sequential flow of instructions. Water boiling is not itself algorithmic; the high-level instructions are: “when condition A is true (water boils), then do B (put pasta)”. Thus, when we speak of algorithms, we must define what counts as an elementary instruction, that is, what lies beneath the algorithmic level (water boils, put pasta).
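A minimal sketch of the recipe in Python (with a crudely simulated kitchen, purely for illustration): what is algorithmic here is the sequencing and the test, not the physics of boiling, which sits below the algorithmic level.

```python
# Toy simulation of the recipe as a discrete flow of instructions.
water_temp = 20.0  # °C

def heat_step():
    """Stand-in for the non-algorithmic part: the physics of heating water."""
    global water_temp
    water_temp += 15.0

def water_boils() -> bool:
    return water_temp >= 100.0

def cook_pasta():
    while not water_boils():   # "when condition A is true (water boils)..."
        heat_step()
    print("add pasta")         # "...then do B (put pasta)"
    print("wait 10 minutes")

cook_pasta()
```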

The textbook definition of algorithm in computer science is: “a sequence of computational steps that transform the input into the output” (Cormen et al., Introduction to algorithms; possibly the most used textbook on the subject). Computability is a way to formalize the notion of algorithm for functions of integers (in particular logical functions). To formalize it, one needs to specify what is considered an elementary instruction. Thus, computability does not formalize the loose notion of algorithm above, i.e., any recipe to calculate something, for otherwise any function would be computable and the concept would be empty (to calculate f(x), apply f to x). A computable function is a function that can be calculated by a Turing machine, or equivalently, one that can be generated by a small set of elementary functions on integers (with composition and recursion). Thus, an algorithm in the sense of computability theory is a discrete-time sequence of arithmetic and logical operations (and recursion). Note that this readily extends to any countable alphabet instead of integers, and of course you can replace arithmetic and logical operations with higher-order instructions, as long as they are themselves computable (i.e., a high-level programming language). But it is not any kind of specification of how to solve a problem. For example, there are various algorithms to calculate pi. But we could also calculate pi by drawing a circle, measuring both the diameter and the perimeter, then dividing perimeter by diameter. This is not an algorithm in the sense of computability theory. It could be called an algorithm in the broader sense, but again note that what is algorithmic about it is the discrete structure of the instructions.
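For instance, here is one of the many algorithms for pi (the Leibniz series), as a short Python sketch; measuring a drawn circle’s perimeter and diameter and dividing them yields the same number, but involves no such discrete sequence of arithmetic steps.

```python
# Pi as the result of an algorithm in the strict sense: a discrete sequence
# of arithmetic operations (the Leibniz series 1 - 1/3 + 1/5 - 1/7 + ...).
def leibniz_pi(n_terms: int) -> float:
    total = 0.0
    for k in range(n_terms):
        total += (-1) ** k / (2 * k + 1)
    return 4 * total

print(leibniz_pi(1_000_000))  # ≈ 3.141592 (converges slowly)
```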

Thus, a device could calculate a computable function using an algorithm in the strict sense of computability, or in the broader sense (cooking recipe), or in a non-algorithmic way (i.e., without any discrete structure of instructions). In any case, what the brain or any device manages to do bears no relation to how it does it.

As pointed out above, what is algorithmic about a description of how something works is the discrete structure (first do A; if B is true, then do C, etc). If we removed this condition, then we would be left with the more general concept of model, not algorithm: a description of how something works. Thus, if we want to say anything specific by claiming that the brain implements algorithms, then we must insist on the discrete-time structure (steps). Otherwise, we are just saying that the brain has a model.

Now that we have more precisely defined what an algorithm is, let us examine whether the brain might implement algorithms. Clearly, it does not literally implement algorithms in the narrow sense of computability theory, i.e., with elementary operations on integers and recursion. But could it be that it implements algorithms in the broader sense? To get some perspective, consider the following two physical systems:

(A) is a set of dominoes, (B) is a tent (illustration taken from my essay “Is coding a relevant metaphor for the brain?”). Both are physical systems that interact with an environment; in particular, both can be perturbed by mechanical stimuli. The response of dominoes to mechanical stimuli might be likened to an algorithm, but that of the tent cannot. The fact that we can describe unambiguously (with physics) how the tent reacts to mechanical stimuli does not make the dynamics of the tent algorithmic, and the same is true of the brain. Formal neural networks (e.g. perceptrons or deep learning networks) are algorithmic, but the brain is a priori more like the tent: a set of coupled neurons that interact in continuous time, together and with the environment, with no evident discrete structure similar to an algorithm. As argued above, a specification of how these real neural networks work and solve problems is not an algorithm: it’s a model – unless we manage to map the brain’s dynamics to the discrete flow of an algorithm.

Computers

Thus, if a computer is something that solves problems by running algorithms, then the brain is not a computer. We may however consider a broader definition: a computer is something that computes, i.e., something able to calculate computable functions. As pointed out above, this does not require the computer to run algorithms. For example, consider a box with some gas, a heater (input = temperature T) and a pressure sensor (output = P). The device computes the function P = nRT/V by virtue of physical laws, not by an algorithm.

This box, however, is not a computer. Otherwise, any physical system would be called a computer. To be called a computer, the device should be able to implement any computable function. But what does it mean exactly? To run an arbitrary computable function, some parameters of the device need to be appropriately adjusted. Who adjusts these parameters and how? If we do not specify how this adjustment is being made, then the claim that the brain is a computer is essentially empty. It just says that for each function, there is a way to arrange the structure of the brain so that this function is achieved. It is essentially equivalent to the claim that atoms can calculate any computable function, depending on how we arrange them.

To call such a device a computer, we must additionally include a mechanism to adjust the parameters so that it does actually perform a particular computable function. This leads us to the conventional definition of a computer: something that can be instructed via computer programming. The notion of program is central to the definition of computers, whatever form this program takes. A crucial implication is that a computer is a device that is dependent on an external operator for its function. The external operator brings the software to the computer; without the ability to receive software, the device is not a computer.

In this sense, the brain cannot be a computer. We may then consider the following metaphorical extension: the brain is a self-programmed computer. But the circularity in this assertion is problematic. If the program is a result of the program itself, then the “computer” cannot actually implement any computable function, but only those that result from its autonomous functioning. A cat, a mouse, an ant and a human do not actually do the same things, and cannot even in principle do the same tasks.

Finally, is computability theory the right framework to describe the activity of the brain in the first place? It is certainly not the right framework to describe the interaction of a tent with its environment, so why would it be appropriate for the brain, an embodied dynamical system in circular relation with the environment? Computability theory is a theory about functions. But a dynamical system is not a function. You can of course define functions on dynamical systems, even though they do not fully characterize the system. For example, you can define the function that maps the current state to the state at some future time. In the case of the brain, we might want to define a function that maps an external perturbation of the system (i.e. a stimulus) to the state of the system at some future time. However, this is not well defined, because it depends on the state of the system at the time of the perturbation. This problem does not occur with formal neural networks precisely because these are not dynamical systems but mappings. The brain is spontaneously active, whether there is a “stimulus” or not. The very notion of the organism as something that responds to stimuli is the most naïve version of behaviorism. The organism has an endogenous activity and a circular relation to its environment. Consider for example central pattern generators: these are rhythmic patterns produced in the absence of any input. Not all dynamical systems can be framed within computability theory, and in fact most of them, including the brain, cannot, because they are not mappings.

Conclusion

As I have argued in my essay on neural coding, there are two core problems with the computer metaphor of the brain (it should be clear by now that this is a metaphor and not a property). One is that it tries to match two causal structures that are totally incongruent, just like dominoes and a tent. The other is that the computer metaphor, just as the coding metaphor, implicitly assumes an external operator – who programs it / interprets the code. Thus, what these two metaphors fundamentally miss is the epistemic autonomy of the organism.

What is computational neuroscience? (XXXIII) The interactivist model of cognition

The interactivist model of cognition has been developed by Mark Bickhard over the last 40 years or so. It is related to the viewpoints of Gibson and O’Regan, among others. The model is described in a book (Bickhard and Terveen, 1995) and a more recent review (Bickhard 2008).

It starts with a criticism of what Bickhard calls “encodingism”, the idea that mental representations are constituted by encodings, correspondences between things in the world and symbols (this is very similar to my criticism of the neural coding metaphor, except that Bickhard’s angle is cognitive science while mine was neuroscience). The basic argument is that the encoding “crosses the boundary of the epistemic agent”: the perceptual system stands on only one side of the correspondence, so there is no way it can interpret symbols in terms of things in the world, since it never has access to things in the world at any point. The interpretation of the symbols in terms of things in the world would require an interpreter, some entity that makes sense of a priori arbitrary symbols. But this was precisely the epistemic problem to be solved, so the interpreter is a homunculus and this is an incoherent view. This is related to the skeptic argument about knowledge: there cannot be valid knowledge since we acquire knowledge by our senses and we cannot step outside of ourselves to check that it is valid. Encodingism fails the skeptic objection. Note that Bickhard refutes neither the possibility of representations nor even the possibility of encodings, but rather the claim that encodings can be foundational for representations. There can be derivative encodings, based on existing representations (for example Morse code is a derivative encoding, which presupposes that we know about both letters and dots and dashes).

A key feature that a representational system must have is what Bickhard calls “system-detectable errors”. A representational system must be able to test whether its representations are correct or not. This is not possible in encodingism because the system does not have access to what is being represented (knowledge that cannot be checked is what I called “metaphysical knowledge” in my Subjective physics paper). No learning is possible if there are no system-detectable errors. This is the problem of normativity.

The interactivist model proposes the following solution: representations are anticipations of potential interactions and their expected impact on future states of the system, or on the future course of processes of the system (this is close to Gibson’s “affordances”). I give an example taken from Subjective physics. Consider a sound source located somewhere in space. What does it mean to know where the sound came from? In the encoding view, we would say that the system has a mapping between the angle of the source and properties of the sounds, and so it infers the source’s angle from the captured sounds. But what can this mean? Is the inferred angle in radians or degrees? Surely radians and degrees cannot make sense for the perceiver and cannot have been learned (this is what I called “metaphysical knowledge”), so in fact the representation cannot actually be in the form of the physical angle of the source. Rather, what it means for the source to be at a given position is that (for example) you would expect that moving your eyes in a particular way would make the source appear in your fovea (see more detail about the Euclidean structure of space and related topics in Subjective physics). Thus, the notion of space is a representation of the expected consequences of certain types of actions.

The interactivist model of representations has the desirable property that it has system-detectable errors: a representation can be correct or not, depending on whether the anticipation turns out to be correct or not. Importantly, what is anticipated is internal states, and therefore the representation does not cross the boundary of the epistemic agent. Contrary to standard models of representation, the interactivist model successfully addresses the skeptic argument.

The interactivist model is described at a rather abstract level, often referring to abstract machine theory (states of automata). Thus, it leaves aside the problem of its naturalization: how is it instantiated by the brain? Important questions to address are: what is a ‘state’ of the brain (in particular given that the brain is a continuously active dynamical system in which no “end state” can be identified)? How do we cope with its distributed nature, that is, with the fact that the epistemic agent is itself constituted of a web of interacting elementary epistemic agents? How are representations built and instantiated?

What is computational neuroscience? (XXXII) The problem of biological measurement (2)

In the previous post, I pointed out differences between biological sensing and physical measurement. A direct consequence is that it is not so straightforward to apply the framework of control theory to biological systems. At the level of behavior, it seems clear that animal behavior involves control; this is well documented in the case of motor control. But this is the perspective of an external observer: the target value, the actual value and the error criterion are identified with physical measurements made by an external observer. But how does the organism achieve this control, from its own perspective?

What the organism does not do, at least not directly, is measure the physical dimension and compare it to a target value. Rather, the biological system is influenced by the physical signal and reacts in a way that makes the physical dimension closer to a target value. How? I do not have a definite answer to this question, but I will explore a few possibilities.

Let us first explore a conventional possibility. The sensory neuron encodes the sensory input (e.g. muscle stretch) in some way; the control system decodes it, and then compares it to a target value. So for example, let us say that the sensory neuron is an integrate-and-fire neuron. If the input is constant, then the interspike interval can be mapped back to the input value. If the input is not constant, it is more complicated, but estimates are possible. There are various studies relevant to this problem (for example Lazar (2004); see also the work of Sophie Denève, e.g. 2013). But all these solutions require knowing quite precisely how the input has been encoded. Suppose for example that the sensory neuron adapts with some time constant. Then the decoder somehow needs to de-adapt. But to do this correctly, one needs to know the time constant accurately enough, otherwise biases are introduced. If we consider that the encoder itself learns, e.g. by adapting to signal statistics (as in the efficient coding hypothesis), then the properties of the encoder must be considered unknown by the decoder.
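For the constant-input case, here is a sketch of what such decoding amounts to for the textbook leaky integrate-and-fire neuron (membrane time constant τ, threshold θ, reset to 0, input I expressed in voltage units), assuming the decoder knows τ and θ exactly:

\[
\tau \frac{dV}{dt} = I - V
\quad\Rightarrow\quad
T = \tau \ln\frac{I}{I - \theta}
\quad\Rightarrow\quad
I = \frac{\theta}{1 - e^{-T/\tau}} ,
\]

so the interspike interval T can be inverted to recover the constant input. Any unknown change in the encoder, such as adaptation or a drifting time constant, biases this inversion.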

Can the decoder learn to decode the sensory spikes? The problem is that it does not have access to the original signal. The key question then is: what could the error criterion be? If the system has no access to the original signal but only to streams of spikes, then how could it evaluate an error? One idea is to make an assumption about some properties of the original signal. One could for example assume that the original signal varies slowly, in contrast with the spike train, which is a highly fluctuating signal. Thus we may look for a slow reconstruction of the signal from the spike train; this is in essence the idea of slow feature analysis. But the original signal might not be slowly fluctuating, as it is influenced by the actions of the controller, so it is not clear that this criterion will work.

Thus it is not so easy to think of a control system which would decode the sensory neuron’s activity into the original signal so as to compare it to a target value. But beyond this technical issue (how to learn the decoder), there is a more fundamental question: why split the work into two units (encoder/decoder) if the function of the second one is essentially to undo the work of the first one?

An alternative is to examine the system as a whole. We consider the physical system (environment), the sensory neuron, the actuator, and the interneurons (corresponding to the control system). Instead of seeing the sensory neuron as involved in an act of measurement and communication, and the interneurons as involved in an act of interpretation and command, we see the entire system as a distributed dynamical system with a number of structural parameters. In terms of dynamical systems (rather than control), the question becomes: is the target value for the physical dimension an attractive fixed point of this system or, more generally, is there such a fixed point (as opposed to mere fluctuations)? We can then derive complementary questions:

  • robustness: is the fixed point robust to perturbations, for example changes in properties of the sensor, actuator or environment?
  • optimality: are there ways to adjust the structure of the system so that the firing rate is minimized (for example)?
  • control: can we change the fixed point by an intervention on this system? (e.g. on the interneurons)

Thus, the problem becomes one of designing a spiking system that has an attractive fixed point in the physical dimension, with some desirable properties. Framing the problem in this way does not necessarily require that the physical dimension is explicitly extracted (“decoded”) from the activity of the sensory neuron. If we look at such a system, we might not be able to identify in any of the neurons a quantity that corresponds to the physical signal, or to the target value. Rather, physical signal and target value are to be found in the physical environment, and it is a property of the coupled dynamical system (neurons-environment) that the physical signal tends to approach the target value.
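As a toy illustration of this viewpoint (entirely hypothetical parameters): a physical variable drifts upward, a sensory neuron integrates it and spikes, and each spike triggers a reflex-like actuator kick that pulls the variable back down. Nothing in the loop decodes or represents the regulated value; it emerges as a property of the coupled dynamics.

```python
import numpy as np

# Minimal closed-loop sketch: environment + sensory neuron + reflex actuation.
dt, T = 0.001, 5.0       # time step and total duration (s)
drift = 2.0              # upward drift of the physical variable x
tau_x = 1.0              # relaxation time of x (s)
tau_v = 0.02             # membrane time constant of the sensory neuron (s)
theta = 0.5              # spike threshold
kick = 0.05              # effect of one actuator event on x

x, v, xs = 0.0, 0.0, []
for _ in range(int(T / dt)):
    x += dt * (drift - x / tau_x)    # physics of the environment
    v += dt * (x - v) / tau_v        # sensory neuron driven by x
    if v > theta:                    # a spike: a timed event, not a symbol
        v = 0.0
        x -= kick                    # reflex-like actuation
    xs.append(x)

# Without feedback x would settle near drift * tau_x = 2.0; with the loop it
# hovers around a much lower value that no single unit ever "encodes".
print("mean x over the last second:", np.mean(xs[-int(1 / dt):]))
```

In such a sketch, changing the regulated value amounts to intervening on parameters of the loop (e.g. the kick size or the threshold), not on any explicitly represented target.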

What is computational neuroscience? (XXXI) The problem of biological measurement (1)

We tend to think of sensory receptors (photoreceptors, inner hair cells) or sensory neurons (retinal ganglion cells; auditory nerve fibers) as measuring physical dimensions, for example light intensity or acoustical pressure, or some function of it. The analogy is with physical instruments of measure, like a thermometer or a microphone. This confers a representational quality to the activity of neurons, an assumption that is at the core of the neural coding metaphor. I explain at length why that metaphor is misleading in many ways in an essay (Brette (2018) Is coding a relevant metaphor for the brain?). Here I want to examine more specifically the notion of biological measurement and the challenges it poses.

This notion comes about not only in classical representationalist views, where neural activity is seen as symbols that the brain then manipulates (the perception-cognition-action model, also called the sandwich model), but also in alternative views, although it is less obvious there. For example, one alternative is to see the brain not as a computer system (encoding symbols, then manipulating them) but as a control system (see Paul Cisek’s behavior as interaction, William Powers’ perceptual control theory, Tim van Gelder’s dynamical view of cognition). In this view, the activity of neurons does not encode stimuli. In fact there is no stimulus per se, as Dewey pointed out: “the motor response determines the stimulus, just as truly as sensory stimulus determines the movement.”

A simple case is feedback control: the system tries to maintain some input at a target value. To do this, the system must compare the input with an internal value. We could imagine for example something like an idealized version of the stretch reflex: when the muscle is stretched, a sensory feedback triggers a contraction, and we want to maintain muscle length constant. But this apparently trivial task raises a number of deep questions, as does, more generally, the application of control theory to biological systems. Suppose there is a sensor, a neuron that transduces some physical dimension into spike trains, for example the stretch of a muscle. There is also an actuator, which reacts to a spike with a physical action, for example contracting the muscle with a particular time course. I chose a spike-based description not just because it corresponds to the physiology of the stretch reflex, but also because it illustrates some fundamental issues (which would also exist with graded transduction, but less obviously so).

Now we have a neuron, or a set of neurons, which receive these sensory inputs and send spikes to the actuator. For this discussion, it is not critical that these are actually neurons; we can just consider that there is a system there, and we ask how this system should be designed so as to successfully achieve a control task.

The major issue here is that the control system does not directly deal with the physical dimension. At first sight, we could think this is a minor issue. The physical dimension gets transduced, and we could simply define the target value in the transduced dimension (e.g. the transduced current). But the problem is more serious than that. What the control system deals with is not simply a function of the physical dimension. More accurately, transduction is a nonlinear dynamical system influenced by a physical signal. The physical signal can be constant, for example, while the transduced current decays (adaptation) and the sensory neuron outputs spike trains, i.e., a highly variable signal. This poses a much more serious problem than a simple calibration problem. When the controlled physical value is at the target value, the sensory neuron might be spiking, perhaps not even at a regular rate. The control system should react to that particular kind of signal by not acting, while it should act when the signal deviates from it. But how can the control system identify the target state, or even know whether to act in one direction or the opposite one?
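A small sketch of this point, with an adapting integrate-and-fire sensor and made-up parameters: the physical input is perfectly constant, yet the output spike train is a transient, decelerating signal from which the constancy of the input is anything but obvious.

```python
# Adapting integrate-and-fire sensor driven by a constant physical signal.
dt, T = 0.0001, 2.0
tau, tau_a = 0.02, 0.5      # membrane and adaptation time constants (assumed)
theta, jump = 1.0, 0.3      # threshold and adaptation increment per spike
I = 2.0                     # the physical signal: constant throughout
v = a = 0.0
spikes = []
for step in range(int(T / dt)):
    v += dt * (I - a - v) / tau     # transduction dynamics
    a += dt * (-a / tau_a)          # adaptation variable decays...
    if v > theta:
        v = 0.0
        a += jump                   # ...and grows at each spike
        spikes.append(step * dt)

# Same input, very different output early vs late:
early = sum(1 for t in spikes if t < 0.5) / 0.5
late = sum(1 for t in spikes if t >= T - 0.5) / 0.5
print(f"firing rate: {early:.1f} Hz at first, {late:.1f} Hz later")
```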

Adaptation in neurons is often depicted as an optimization of the information transmitted, in line with the metaphor of the day (coding). But the relevant question is: how does the receiver of this “information” know how the neuron has adapted? Does it have to de-adapt, to somehow be matched to the adaptive process of the encoding neuron? (This problem has to do with the dualistic structure of the neural coding metaphor.)

There are additional layers of difficulty. We have first recognized that transduction is not a simple mapping from a physical dimension to a biological (e.g. electrochemical) dimension, but rather a dynamical system influenced by a physical signal. Now this dynamical system depends on the structure of the sensory neuron. It depends for example on the number of ionic channels and their properties, and we know these are highly plastic and indeed quite variable both across time and across cells. This dynamical system also depends on elements of the body, or let’s say more generally the neuron’s environment. For example, the way acoustical pressure is transduced into current by an inner hair cell depends obviously on the acoustical pressure at the eardrum, but that physical signal depends on the shape of the ear, which filters sounds. Properties of neurons change with time too, through development and aging. Thus, we cannot assume that the dynamical transformation from physical signal to biological signal is a fixed one. Somehow, the control system has to work despite this huge plasticity and the dynamical nature of the sensors.

Let us pause for a moment and outline a number of differences between physical measurements, as with a thermometer, and biological measurements (or “sensing”):

  • The physical meter is calibrated with respect to an external reference, for example 0°C is when water freezes, while 100°C is when it boils. The biological sensor cannot be calibrated with respect to an external reference.
  • The physical meter produces a fixed value for a stationary signal. The biological sensor produces a dynamical signal in response to a stationary signal. More accurately, the biological sensor is a nonlinear dynamical system influenced by the physical signal.
  • The physical meter is meant to be stable, in that the mapping from physical quantity to measurement is fixed. When it is not, this is considered an error. The biological sensor does not have fixed properties. Changes in properties occur in the normal course of life, from birth to death, and some changes in properties are interpreted as adaptations, not errors.

From these differences, we realize that biological sensors do not provide physical measurements in the usual sense. The next question, then, is how can a biological system control a physical dimension with biological sensors that do not act as measurements of that dimension?

What is computational neuroscience? (XXX) Is the brain a computer?

It is sometimes stated as an obvious fact that the brain carries out computations. Computational neuroscientists sometimes see themselves as looking for the algorithms of the brain. Is it true that the brain implements algorithms? My point here is not to answer this question, but rather to show that the answer is not self-evident, and that it can only be true (if at all) at a fairly abstract level.

One line of argumentation is that models of the brain that we find in computational neuroscience (neural network models) are algorithmic in nature, since we simulate them on computers. And wouldn’t it be a sort of vitalistic claim to say that neural networks cannot be (in principle) simulated on a computer?

There is an important confusion in this argument. At a low level, neural networks are modelled biophysically as dynamical systems, in which the temporality corresponds to the actual temporality of the real world (as opposed to the discrete temporality of algorithms). Mathematically, those are typically differential equations, possibly hybrid systems (i.e. coupled by timed pulses), in which time is a continuous variable. Those models can of course be simulated on a computer using discretization schemes. For example, we choose a time step and compute the state of the network at time t+dt from the state at time t. This algorithm, however, implements a simulation of the model; it is not the model that implements the algorithm. The discretization is nowhere to be found in the model. The model itself, being a continuous-time dynamical system, is not algorithmic in nature. It is not described as a discrete sequence of operations; it is only the simulation of the model that is algorithmic, and different algorithms can simulate the same model.
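To make the distinction concrete, here is a minimal sketch (Euler scheme, made-up parameters) of simulating a continuous-time membrane equation dv/dt = (−v + I(t))/τ: the time step dt belongs to the simulation, not to the model, and a different dt or a different solver would simulate the very same model.

```python
# Euler discretization of a continuous-time model: the loop is the algorithm,
# the differential equation is the model.
tau, dt, T = 0.02, 0.0001, 0.1   # model time constant; simulation step; duration

def I(t: float) -> float:
    """External input: a step current switched on at t = 20 ms."""
    return 1.0 if t > 0.02 else 0.0

v = 0.0
for step in range(int(T / dt)):
    t = step * dt
    v += dt * (-v + I(t)) / tau   # v(t + dt) computed from v(t)

print(v)  # approximates v(T) of the model; nothing in the model mentions dt
```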

If we put this confusion aside, then the claim that neural networks implement algorithms becomes not so obvious. It means that trajectories of the dynamical system can be mapped to the discrete flow of an algorithm. This requires: 1) identifying states with representations of some variables (for example stimulus properties, symbols); 2) identifying trajectories from one state to another as specific operations. In addition to that, for the algorithmic view to be of any use, there should be a sequence of operations, not just one operation (i.e., describing the output as a function of the input is not an algorithmic description).

A key difficulty in this identification is temporality: the state of the dynamical system changes continuously, so how can this be mapped to discrete operations? A typical approach in neuroscience is to consider not states but properties of trajectories. For example, one would consider the average firing rate in a population of neurons in a given time window, and the rate of another population in another time window. The relation between these two rates in the context of an experiment would define an operation. As stated above, a sequence of such relations should be identified in order to qualify as an algorithm. But this mapping seems possible only within a feedforward flow; coupling poses a greater challenge for an algorithmic description. No known nervous system, however, has a feedforward connectome.

I am not claiming here that the function of the brain (or mind) cannot possibly be described algorithmically. Probably some of it can be. My point is rather that a dynamical system is not generically algorithmic. A control system, for example, is typically not algorithmic (see the detailed example of Tim van Gelder, What might cognition be if not computation?). Thus a neural dynamical system can only be seen as an algorithm at a fairly abstract level, which can probably address only a restricted subset of its function. It could be that control, which also attaches function to dynamical systems, is a more adequate metaphor of brain function than computation. Is the brain a computer? Given the rather narrow application of the algorithmic view, the reasonable answer should be: quite clearly not (maybe part of cognition could be seen as computation, but not brain function generally).

What is computational neuroscience? (XXIX) The free energy principle

The free energy principle is the theory that the brain manipulates a probabilistic generative model of its sensory inputs, which it tries to optimize by either changing the model (learning) or changing the inputs (action) (Friston 2009; Friston 2010). The “free energy” is related to the error between predictions and actual inputs, or “surprise”, which the organism wants to minimize. It has a more precise mathematical formulation, but the conceptual issues I want to discuss here do not depend on it.
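For reference, one common formulation of the variational free energy (not needed for the argument below), for sensory inputs s, hidden causes x, generative model p(s, x) and recognition density q(x), is:

\[
F = \mathbb{E}_{q(x)}\big[\ln q(x) - \ln p(s, x)\big]
  = -\ln p(s) + \mathrm{KL}\big[\,q(x)\,\|\,p(x \mid s)\,\big] \;\ge\; -\ln p(s).
\]

Minimizing F with respect to q (perception and learning) tightens the bound on the surprise −ln p(s); minimizing it through action changes s itself.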

Thus, it can be seen as an extension of the Bayesian brain hypothesis that accounts for action in addition to perception. It shares the conceptual problems of the Bayesian brain hypothesis, namely that it focuses on statistical uncertainty, inferring variables of a model (called “causes”) when the challenge is to build and manipulate the structure of the model. It also shares issues with the predictive coding concept, namely that there is a conflation between a technical sense of “prediction” (expectation of the future signal) and a broader sense that is more ecologically relevant (if I do X, then Y will happen). In my view, these are the main issues with the free energy principle. Here I will focus on an additional issue that is specific to the free energy principle.

The specific interest of the free energy principle lies in its formulation of action. It resonates with a very important psychological theory called cognitive dissonance theory. That theory says that you try to avoid dissonance between facts and your system of beliefs, by either changing the beliefs in a small way or avoiding the facts. When there is a dissonant fact, you generally don’t throw away your entire system of beliefs: rather, you alter the interpretation of the fact (think of political discourse, or in fact scientific discourse). Another strategy is to avoid the dissonant facts: for example, to read newspapers that tend to have the same opinions as yours. So there is some support in psychology for the idea that you act so as to minimize surprise.

Thus, the free energy principle acknowledges the circularity of action and perception. However, it is quite difficult to make it account for a large part of behavior. A large part of behavior is directed towards goals; for example, to get food and sex. The theory anticipates this criticism and proposes that goals are ingrained in priors. For example, you expect to have food. So, for your state to match your expectations, you need to seek food. This is the theory’s solution to the so-called “dark room problem” (Friston et al., 2012): if you want to minimize surprise, why not shut off stimulation altogether and go to the closest dark room? Solution: you are not expecting a dark room, so you are not going there in the first place.

Let us consider a concrete example to show that this solution does not work. There are two kinds of stimuli: food, and no food. I have two possible actions: to seek food, or to sit and do nothing. If I do nothing, then with 100% probability, I will see no food. If I seek food, then with, say, 20% probability, I will see food.

Let’s say this is the world in which I live. What does the free energy principle tell us? To minimize surprise, it seems clear that I should sit: I am certain to see no food. No surprise at all. The proposed solution is that I have a prior expectation to see food. So to minimize surprise, I should put myself into a situation where I might see food, i.e., seek food. This seems to work. However, if there is any learning at all, then I will quickly observe that the probability of seeing food is actually 20%, and my expectations should be adjusted accordingly. I will also observe that between two food expeditions, the probability of seeing food is 0%. Once this has been observed, surprise is minimal when I do not seek food. So, I die of hunger. It follows that the free energy principle does not survive Darwinian competition.
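A back-of-the-envelope check, equating surprise with negative log probability under the learned model of this toy world:

\[
\begin{aligned}
\text{sit:}\quad & -\log_2 P(\text{no food}\mid\text{sit}) = -\log_2 1 = 0 \ \text{bits},\\
\text{seek:}\quad & -\big[\,0.2\log_2 0.2 + 0.8\log_2 0.8\,\big] \approx 0.72 \ \text{bits (expected surprise)}.
\end{aligned}
\]

Once the true contingencies are learned, sitting minimizes expected surprise, which is precisely the problem.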

Thus, either there is no learning at all and the free energy principle is just a way of calling predefined actions “priors”; or there is learning, but then it doesn’t account for goal-directed behavior.

The idea to act so as to minimize surprise resonates with some aspects of psychology, like cognitive dissonance theory, but that does not constitute a complete theory of mind, except possibly of the depressed mind. See for example the experience of flow (as in surfing): you seek a situation that is controllable but sufficiently challenging that it engages your entire attention; in other words, you voluntarily expose yourself to a (moderate amount of) surprise; in any case certainly not a minimum amount of surprise.

What is computational neuroscience? (XXVIII) The Bayesian brain

Our sensors give us incomplete, noisy, and indirect information about the world. For example, estimating the location of a sound source is difficult because in natural contexts, the sound of interest is corrupted by other sound sources, reflections, etc. Thus it is not possible to know the position of the source with certainty. The ‘Bayesian coding hypothesis’ (Knill & Pouget, 2004) postulates that the brain represents not the most likely position, but the entire probability distribution of the position. It then uses those distributions to do Bayesian inference, for example when combining different sources of information (say, auditory and visual). This would allow the brain to optimally infer the most likely position. There is indeed some evidence for optimal inference in psychophysical experiments – although there is also some contradicting evidence (Rahnev & Denison, 2018).
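For concreteness, the textbook case behind such optimality claims: if the auditory and visual cues give independent Gaussian estimates (call them x_A and x_V, with variances σ_A² and σ_V²), the optimal combined estimate is the precision-weighted average,

\[
\hat{x} = \frac{\sigma_V^{2}\,\hat{x}_A + \sigma_A^{2}\,\hat{x}_V}{\sigma_A^{2} + \sigma_V^{2}},
\qquad
\frac{1}{\sigma^{2}} = \frac{1}{\sigma_A^{2}} + \frac{1}{\sigma_V^{2}},
\]

which is the kind of prediction tested in those psychophysical experiments.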

The idea has some appeal. The problem is that, by framing perception as a statistical inference problem, it focuses on the most trivial type of uncertainty, statistical uncertainty. This is illustrated by the following quote: “The fundamental concept behind the Bayesian approach to perceptual computations is that the information provided by a set of sensory data about the world is represented by a conditional probability density function over the set of unknown variables”. Implicit in this representation is a particular model, for which the variables are defined. Typically, one model describes a particular experimental situation. For example, the model would describe the distribution of auditory cues associated with the position of the sound source. Another situation would be described by a different model; for example, one with two sound sources would require a model with two variables. Or if the listening environment is a room and the size of that room might vary, then we would need a model with the dimensions of the room as variables. In any of these cases, where we have identified and fixed the parametric sources of variation, the Bayesian approach works fine, because we are indeed facing a problem of statistical inference. But that framework doesn’t fit any real-life situation. In real life, perceptual scenes have variable structure, which corresponds to the model in statistical inference (there is one source, or two sources, we are in a room, the second source comes from the window, etc). The perceptual problem is therefore not just to infer the parameters of the model (dimensions of the room etc), but also the model itself, its structure. Thus, it is not possible in general to represent an auditory scene by a probability distribution on a set of parameters, because the very notion of a parameter already assumes that the structure of the scene is known and fixed.

Inferring parameters for a known statistical model is relatively easy. What is really difficult, and is still challenging for machine learning algorithms today, is to identify the structure of a perceptual scene: what constitutes an object (object formation), how objects are related to each other (scene analysis). These fundamental perceptual processes do not exist in the Bayesian brain. This touches on two very different types of uncertainty: statistical uncertainty, i.e., variations that can be interpreted and expected within the framework of a model; and epistemic uncertainty, where the model itself is unknown (the difference has been famously explained by Donald Rumsfeld).

Thus, the “Bayesian brain” idea addresses an interesting problem (statistical inference), but it trivializes the problem of perception, by missing the fact that the real challenge is epistemic uncertainty (building a perceptual model), not statistical uncertainty (tuning the parameters): the world is not noisy, it is complex.

What is computational neuroscience? (XXVII) The paradox of the efficient code and the neural Tower of Babel

A pervasive metaphor in neuroscience is the idea that neurons “encode” stuff: some neurons encode pain; others encode the location of a sound; maybe a population of neurons encodes some other property of objects. What does this mean? In essence, that there is a correspondence between some objective property and neural activity: when I feel pain, this neuron spikes; or, the image I see is “represented” in the firing of visual cortical neurons. The mapping between the objective properties and neural activity is the “code”. How insightful is this metaphor?

An encoded message is understandable to the extent that the reader knows the code. But the problem with applying this metaphor to the brain is that only the encoded message is communicated, not the code, and not the original message. Mathematically, original message = encoded message + code, but only one term is communicated. This could still work if there were a universal code that we could assume all neurons can read, the “language of neurons”, or if somehow some information about the code could be gathered from the encoded messages themselves. Unfortunately, this is in contradiction with the main paradigm in neural coding theory, “efficient coding”.

The efficient coding hypothesis stipulates that neurons encode signals into spike trains in an efficient way, that is, using a code such that all redundancy is removed from the original message while information is preserved, in the sense that the encoded message can be mapped back to the original message (Barlow, 1961; Simoncelli, 2003). This implies that with a perfectly efficient code, encoded messages are indistinguishable from random. Since the code is determined by the statistics of the inputs and only the encoded messages are communicated, a code is efficient to the extent that it is not understandable by the receiver. This is the paradox of the efficient code.
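A loose illustration with generic data compression (not a neural code, just the same information-theoretic point): after redundancy is removed, the message looks close to random, and without the codec the receiver has nothing to go on.

```python
import math
import random
import zlib
from collections import Counter

def byte_entropy(data: bytes) -> float:
    """Empirical entropy of the byte distribution, in bits per byte."""
    counts = Counter(data)
    n = len(data)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

# A highly redundant "original message": random words from a tiny vocabulary.
random.seed(0)
words = ["the", "quick", "brown", "fox", "jumps", "over", "lazy", "dog"]
original = " ".join(random.choice(words) for _ in range(20000)).encode()
encoded = zlib.compress(original, 9)   # a redundancy-removing (efficient) code

# The redundant original is far from random; the efficiently encoded message
# is much closer to the 8 bits/byte of a random byte stream.
print(f"original: {byte_entropy(original):.2f} bits/byte, {len(original)} bytes")
print(f"encoded:  {byte_entropy(encoded):.2f} bits/byte, {len(encoded)} bytes")
```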

In the neural coding metaphor, the code is private and specific to each neuron. If we follow this metaphor, this means that all neurons speak a different language, a language that allows expressing concepts very concisely but that no one else can understand. Thus, according to the coding metaphor, the brain is a Tower of Babel.

Can this work?