In a previous post, I criticized the notion of “neural code”. One of my main points was that information can only make sense in conjunction with a particular observer. I am certainly not the first one to make this remark: for example, it is presented in a highly cited review by deCharms and Zador (2000). More recently Buzsaki defended this point of view in a review (Neuron 2010), and from the notes in the supplemental material, it appears that he is clearly more philosophically lucid than the average neuroscientist on these issues (check the first note). I want to come back to this issue in more detail.
When one speaks of information or code in neuroscience, it is generally meant in the sense of Shannon. This is a very specific notion of information coming from communication theory. There is an emitter who wants to transmit some message to a receiver. The message is transmitted in an altered form called “code”, for example Morse code, which contains “information” insofar as it can be “decoded” by the observer into the original message. The metaphor is generally carried to neuroscience in the following form: there are things in the external world that are described in some way by the experimenter, for example bars with a variable orientation, and the activity of the nervous system is seen as a “code” for this description. It may carry “information” about the orientation of the bar insofar as one can reconstruct the orientation from the neural activity.
It is important to realize how limited this metaphor is, and indeed that it is a metaphor. In a communication channel, the two ends agree upon a code, for example on the correspondence between letters and Morse code. For the receiving end, the fact that the message is information in the common sense of the word relies on two things: 1) that the correspondence is known, 2) that the initial message itself makes sense for the receiver. For example, imagine a few centuries ago, someone is given a papyrus with ancient Egyptian hieroglyphs. Probably it will represent very little information for that person because she has no way to make sense of it. The papyrus becomes informative with the Rosetta stone, where the same text is written in ancient Egyptian and in ancient Greek, so that the papyrus can be translated to ancient Greek. But of course this becomes information only if ancient Greek makes sense for the person that reads it!
So the metaphor of a “neural code”, understood in Shannon’s sense, is problematic in two ways: 1) the experimenter and the nervous system obviously do not agree upon a code, and 2) how the original “message” makes sense for the nervous system is left entirely unspecified. I will give another example to make it clearer. Imagine you have a vintage thermometer (non-digital), but that thermometer does not have any graduation. You could replace the thermometer by the activity of a temperature-sensitive neuron. From the point of view of information theory, there is just as much information about temperature in the liquid level as if temperature were given as a number of degrees Celsius. But clearly for an observer, there is very little information because one does not know the relationship between the level of the liquid and the physical temperature, so it is essentially useless. Perhaps one could say that the level says something relative about temperature, that is, whether a temperature is hotter than another one. But even this is not true, because it relies on the prior knowledge that the level of the liquid increases when the temperature increases, a physical law that is not obvious at all. So to make sense of the liquid level, one would actually rely on association with other sources of information that are not given by the thermometer, e.g. that for some level one feels cold and that for another level one feels hot. But now this means that the information in the liquid level is actually limited (and in fact defined) not by the “communication channel” (how accurate the thermometer is) but by the external source of knowledge that provides meaning to the liquid level. This limitation comes from the fact that at no moment in time is the true temperature in Kelvin given as an objective truth to the observer. The only way the observer gets information is through its own sensors.
This is why Shannon’s information is highly misleading as a metaphor for information in biological systems: there can be no agreed code between the environment and the organism. The organism has to learn ancient Egyptian just with hieroglyphs.
To finish with this example, imagine now that the thermometer is graduated, so you can read the temperature. Wouldn’t this provide the objective information that was previously missing? As a matter of fact, not really. For example, as a European, if I am given the temperature in degrees Fahrenheit, I have no idea whether it is hot or cold. So the situation is no different for me than before. Of course if I am also given the correspondence between Fahrenheit and Celsius, then it will start making sense for me. But how can degrees Celsius make sense for me in the first place? Again these are just numbers with arbitrary units. Degrees Celsius make sense because they can be related to physical processes linked with temperature: water freezes at 0° and boils at 100°. Presumably, the same thing applies to our perception of temperature: the body senses a change in firing rate of some temperature-sensitive neuron, and this becomes information about temperature because it can be associated with a number of biophysical processes linked with temperature, say sweating, and all these effects can be noticed. In fact, what this example shows is that the activity of the temperature-sensitive neuron does not provide information about physical temperature (a number of kelvins), but rather about the occurrence of various other events that can be captured with other sensors. This set of relationships between events is, in a way, the definition of temperature for the organism, rather than some number in arbitrary units.
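The claim that the ungraduated liquid level carries just as much Shannon information as a graduated reading can be checked on a toy case. The sketch below is only an illustration, and its assumptions are not in the text above: eight equally likely temperatures, a noiseless thermometer, a made-up linear law for the liquid level, and an arbitrary scrambled labelling. The function and variable names are hypothetical.

```python
import math
from collections import Counter

def mutual_information(pairs):
    """Mutual information (in bits) of the empirical joint distribution of (x, y) pairs."""
    n = len(pairs)
    joint = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    return sum(
        (c / n) * math.log2(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )

# Assumption: eight equally likely temperatures, in degrees Celsius.
temperatures = [0, 5, 10, 15, 20, 25, 30, 35]

# Graduated thermometer: the reading is the temperature itself.
graduated = [(t, t) for t in temperatures]

# Ungraduated thermometer: a liquid level in arbitrary units. The linear law
# is made up for the example and is unknown to the observer.
ungraduated = [(t, round(3.7 + 0.42 * t, 2)) for t in temperatures]

# Arbitrary invertible relabelling: not even the ordering is preserved.
scrambled = list(zip(temperatures, "hcafgbed"))

print(mutual_information(graduated))    # 3.0 bits
print(mutual_information(ungraduated))  # 3.0 bits
print(mutual_information(scrambled))    # 3.0 bits
```

All three readings give the same number because Shannon's measure depends only on the joint statistics, which an external observer holding both columns can compute. It says nothing about whether the reading means anything to someone who only ever sees the second column.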
Let us summarize. In Shannon’s information theory, it is implicitly assumed that there are two ends in a communication channel, and that 1) both ends agree upon a code, i.e., a correspondence between descriptive elements of information on both ends, and that 2) the initial message on the emitter end makes sense for the observer at the other end. Neither of these two assumptions applies to a biological organism because there is only one end. All the information that it can ever get about the world comes from that end, and so in this context Shannon’s information only makes sense for an external observer who can see both ends. A typical error coming from the failure to realize this fact is to greatly overestimate the information in neural activity about some experimental quantity. I discussed this specific point in detail in a recent paper. The overestimation comes simply from the fact that detailed knowledge about the experiment is implicitly assumed on behalf of the nervous system.
Followed to its logical conclusions, the information-processing line of reasoning leads to what Daniel Dennett called the “Cartesian theater”. If neural activity gives information about the world in Shannon’s sense, then this means that at some final point this neural activity has to be analyzed and related to the external world. Indeed if this does not happen, then we cannot be speaking about Shannon information, for there is no link with the initial message. So this means that there is some critical stage at which neural activity is interpreted in objective terms. As Dennett noted, this is conceptually not very far from the dualism of Descartes, who thought that there is a non-material mind that reads the activity of the nerves and interprets it in terms of the outside physical world. The “Cartesian theater” is the brain seen as a screen where the world is projected, that a homunculus (the mind) watches.
Most neuroscientists reject dualism, but if one is to reject dualism, then there must be no final stage at which the observer end of the communication channel (the senses) is put in relationship with the emitter end (the world). All information about the world must come from the senses, and the senses alone. Therefore, this “information” cannot be meant in Shannon’s sense.
This, I believe, is essentially what James Gibson meant when he criticized the information-processing view of cognition. It is also related to Hubert Dreyfus’s criticism of artificial intelligence. More recently, Kevin O’Regan made similar criticisms. In his most cited paper with Noë (O’Regan and Noë, BBS 2001), there is an illuminating analogy, the “villainous monster”. Imagine you are exploring the sea with an underwater vessel. But a villainous monster mixes up all the cables, so all the sensors and actuators are now related to the external world in a new way. How can you know anything about the world? The only way is to analyze the structure of sensor data and their relationships with actions that you can perform. So if one rejects dualism, then this is the kind of information that is available to the nervous system. A salient feature of this notion of information is that, contrary to Shannon’s information, it is defined not as numbers but as relations or statements: if I do action A, then sensory property B happens; if sensory property A happens, then another property B will happen next; if I do action A in sensory context B, then C happens.
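To make the “villainous monster” situation concrete, here is a minimal sketch of an agent that recovers statements of the form “if I do action A, then sensory property B happens” purely by acting and sensing. Everything in it is a hypothetical simplification that is not in the original analogy: the `ScrambledVessel` class, the one-to-one wiring between actuators and sensors, and the binary sensor readings.

```python
import random

class ScrambledVessel:
    """Hypothetical vessel whose actuator-to-sensor wiring has been shuffled."""

    def __init__(self, n_channels, seed=0):
        rng = random.Random(seed)
        self.n_channels = n_channels
        # The wiring is hidden: the agent below never reads this attribute.
        self._wiring = list(range(n_channels))
        rng.shuffle(self._wiring)

    def act(self, actuator):
        """Fire one actuator and return the resulting sensor readings (0 or 1)."""
        readings = [0] * self.n_channels
        readings[self._wiring[actuator]] = 1
        return readings

def learn_contingencies(vessel):
    """Try each action once and record which sensor responds to it."""
    statements = {}
    for action in range(vessel.n_channels):
        readings = vessel.act(action)
        statements[action] = readings.index(1)
    return statements

vessel = ScrambledVessel(n_channels=5)
statements = learn_contingencies(vessel)
for action, sensor in statements.items():
    print(f"if I do action {action}, then sensor {sensor} responds")
```

What the agent ends up with is a set of relations between its own actions and its own sensors. The indices are arbitrary labels, and nothing in the learned statements refers to the world as described by an outside observer.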
Philosophy of knowledge
We have concluded that, if dualism is to be rejected, then the right notion of information for a biological organism is in terms of statements. This makes the problem of perception quite similar to that of science. Science is made of universal statements, such as the law of gravitation. But not all statements are scientific, for example “there is a God”. In philosophy of knowledge, Karl Popper proposed that a scientific statement is one that can potentially be falsified by an observation, whereas a metaphysical statement is a statement that cannot be falsified. For example, the statement “all penguins are black” is scientific, because I could imagine that one day I see a white penguin. On the other hand, the statement “there is a God” is metaphysical, because there is no way I can check. Closer to the matter of this text, the statement “the world is actually five-dimensional but we live in a three-dimensional subspace” is also metaphysical because independently of whether it is true or not, we have no way to confirm it or falsify it.
To come back to the matter of this text, I propose to qualify as metaphysical for an organism all knowledge that cannot be falsified, given the senses and possibilities for action. For example, in an experiment, one could relate the firing rate of a neuron with the orientation of a bar presented in front of the eyes. There is information in Shannon’s sense about the orientation in the firing rate. This means that we can “decode” the firing rate into the parameter “orientation”. However, this decoding requires metaphysical knowledge because “orientation” is defined externally by the experimenter; it does not come out of the neuron’s activity itself. From the neuron’s point of view, there is no way to falsify the statement “10 Hz means horizontal bar”, because the notion of horizontal (or bar) is either defined in relation to something external to the neuron, or by its activity itself (horizontal is when the activity is 10 Hz) and in this latter case the statement is a tautology.
Therefore it appears that there can be very little information without metaphysical knowledge in the response of a single neuron, or in its input. Note that it is not completely empty, for there could be information about the future state of the neuron in the present state.
The structure of information and “neural assemblies”
When information is understood as statements rather than numbers to be decoded, it appears that information to be represented by the brain is much richer than implied by the usual notion inspired by Shannon’s communication theory. In particular, the problem of perception is not just to relate a vector of numbers (e.g. firing rates) to a particular set of parameters representing an object in the world. What is to be perceived is much richer than that. For example, in a visual scene, there could be Paul, a person I know, wearing a new sweater, sitting in a car. What is important here is that a scene is not just a “bag of objects”: objects have relationships with each other, and there are many possible different relationships. For example there is a car and there is Paul, and Paul is in a specific relationship with the car, that of “sitting in it”.
Unfortunately this does not fit well with the concept of “neural assemblies”, which is the mainstream assumption about how things we perceive are represented in the brain. If it is true that any given object is represented by the firing of a given assembly of neurons, then several objects should be represented by the firing of a bigger assembly of neurons, the union of all assemblies, one for each object. Several authors have noted that this may lead to the “superposition catastrophe”, i.e., there may be different sets of objects whose representations are fused into the same big assembly. But let us assume that this problem has somehow been solved and that there is no possible confusion. Still, the representation of a scene can be nothing else than an unstructured “bag of objects”: there are no relationships between objects in the assembly representation. One way to save the assembly concept is to consider that there are combination assemblies, which code for specific combinations of things, perhaps in a particular relationship. But this cannot work if it is the first time I see Paul in that sweater. There is a fundamental problem with the concept of neural assembly, which is that there is no representation of relations, only of things to be related. In analogy with language, there is no syntax in the concept of neural assemblies. This is actually the analogy chosen by Buzsaki in his recent Neuron review (2010).
This remark, mostly made in the context of the binding problem, has led authors such as von der Malsburg to postulate that synchrony is used to bind the features of an object, as represented by neural firing. This avoids the superposition catastrophe because at a given time, only one object is represented by neural firing. It also addresses the problem of composition: by defining different timescales for synchrony, one may build representations for objects composed of parts, possibly in a recursive manner. However, the analogy of language shows that this is not going to be enough, because only one type of relation can be represented in this way. But the same analogy also shows that it is conceptually possible to represent structures as complex as linguistic structure by using time, in analogy with the flow of a sentence. Just for the sake of argument, and I do not mean that this is a plausible proposition (although it could be), you could imagine that assemblies can code either things (Paul, a car, a sweater) or relations between things (sitting, wearing), that only one assembly would be active at a time, and that the order of activation indicates which things a relation applies to. Here not only synchrony is important, but also the order of spikes. This idea is quite similar to Buzsaki’s “neural syntax” (based on oscillations), but I would like to emphasize a point that I believe has not been noticed: that assemblies must stand not only for things but also for relations between things (note that “things” can also be thought of as relations, and in this case we are speaking of relations of different orders).
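Both the superposition problem and the sequential scheme imagined above can be illustrated with a toy example. This is only a sketch for the sake of argument: the feature names, the use of Python sets for assemblies, and the subject-relation-object ordering convention are hypothetical choices, not a model of real neural activity.

```python
# Each object is represented by an "assembly": here, a set of feature labels.
ASSEMBLIES = {
    "red square": {"red", "square"},
    "blue circle": {"blue", "circle"},
    "red circle": {"red", "circle"},
    "blue square": {"blue", "square"},
}

def superpose(objects):
    """Represent a scene as the union of the assemblies of its objects."""
    active = set()
    for name in objects:
        active |= ASSEMBLIES[name]
    return active

# Superposition catastrophe: two different scenes fuse into the same assembly.
scene_1 = superpose(["red square", "blue circle"])
scene_2 = superpose(["red circle", "blue square"])
print(scene_1 == scene_2)  # True

# Sequential scheme: one assembly active at a time, standing either for a
# thing or for a relation. Assumed convention: subject, relation, object.
sequence_1 = ["Paul", "sitting", "car", "Paul", "wearing", "sweater"]
sequence_2 = ["Paul", "wearing", "car", "Paul", "sitting", "sweater"]

def read_relations(sequence):
    """Group an activation sequence into (subject, relation, object) triples."""
    return [tuple(sequence[i:i + 3]) for i in range(0, len(sequence), 3)]

print(set(sequence_1) == set(sequence_2))  # True: same "bag" of assemblies
print(read_relations(sequence_1))
print(read_relations(sequence_2))
```

As bags, the two sequences are indistinguishable. Only the order of activation tells apart “Paul sitting in the car, wearing a sweater” from the nonsensical alternative, which is the sense in which order plays the role of syntax here.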
All this discussion, of course, is only meant to save the concept of neural assembly and perhaps one might simply consider that a completely different concept should be looked for. I do not discard this more radical possibility. However, I note that if it is admitted that neurons interact mostly with spikes, then somehow the spatio-temporal pattern of spikes is the only way that information can be carried. Unless, perhaps, we are completely misled by the notion of “information”.