In this blog, I have argued many times that if there are neural representations, these must be about relations: for example, a relation between two sensory signals, or between a potential action and its effect on the sensory signals. But what does it mean exactly that something (say neural activity) “represents” a relation? It turns out that the answer is not so obvious.
The classical way to understand it is to consider that a representation is something (an event, a number of spikes, etc.) that stands for the thing being represented. That is, there is a mapping between the thing being represented and the thing that represents it. For example, in the Jeffress model of sound localization, the identity of the most active binaural neuron stands for the location of the sound, or in terms of relation, for the fact that the right acoustical signal is a delayed copy of the left acoustical signal, with a specific delay. The difficulty here is that a representation always involves three elements: 1) the thing to be represented, 2) the thing that represents it, 3) the mapping between the first two things. But in the classical representational view, we are left with only the second element. In what sense does the firing of a binaural neuron tell us that there is such a specific relation between the monaural signals? Well, it doesn’t, unless we already know in advance that this is what the firing of that neuron stands for. But from observing the firing of the binaural neurons, there is no way we can ever know that: we just see neurons lighting up sometimes.
There are different ways to address this issue. The simplest is to say: it doesn’t matter. The activity of the binaural neurons represents a relationship between the monaural signals, at least for us external observers, but the organism doesn’t care: what matters is that their activity can be related to the location of the sound source, defined for example as the movement of the eyes that puts the sound source in the fovea. In operational terms, the organism must be able to take an action conditional on the validity of a given relation, but what this relation exactly is in terms of the acoustical signals doesn’t matter.
An important remark is in order here. There is a difference between representing a relation and representing a quantity (or vector), even in this simple notion of representation. A relation is a statement that may be true or not. This is different from a quantity resulting from an operation. For example, one may always calculate the peak lag in the cross-correlation function between two acoustical signals, and call this “the ITD” (interaural time difference). But such a number is obtained whether there is one source, several sources or no source at all. Thus, this is not the same as a relation of the form: the right signal equals the left signal delayed by 500 µs. Therefore, we are not speaking of a mapping between acoustical signals and action, which would be unconditional, but of actions conditional on a relation in the acoustical signals.
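To make the distinction concrete, here is a minimal sketch in Python. Everything in it is an illustrative assumption rather than a model of the auditory system: the function names `peak_lag` and `is_delayed_copy`, the white-noise signals, and the 44.1 kHz sampling rate at which 500 µs is about 22 samples. The first function computes a quantity and always returns a number; the second tests a relation and may return false.

```python
import random

def peak_lag(left, right, max_lag):
    """Lag (in samples) that maximizes the cross-correlation of left and right.

    This is a quantity: it returns a number whatever the signals are.
    """
    n = len(left)
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = sum(left[t - lag] * right[t]
                    for t in range(max(0, lag), min(n, n + lag)))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

def is_delayed_copy(left, right, lag, tol=1e-9):
    """Relation: right[t] equals left[t - lag] wherever both are defined."""
    n = len(left)
    return all(abs(right[t] - left[t - lag]) <= tol
               for t in range(max(0, lag), min(n, n + lag)))

rng = random.Random(0)
left = [rng.gauss(0, 1) for _ in range(2000)]

delay = 22  # about 500 µs at an assumed 44.1 kHz sampling rate
right_source = [0.0] * delay + left[:-delay]           # one source: delayed copy
right_noise = [rng.gauss(0, 1) for _ in range(2000)]   # no common source at all

lag_source = peak_lag(left, right_source, 40)
lag_noise = peak_lag(left, right_noise, 40)

# A lag is obtained in both cases, but the relation holds only in the first.
relation_source = is_delayed_copy(left, right_source, lag_source)
relation_noise = is_delayed_copy(left, right_noise, lag_noise)
```

With unrelated signals, `peak_lag` still returns some lag between -40 and 40, but nothing in that number says that no delayed-copy relation holds; only the explicit test does.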
Now there is another way to understand the phrase “representing a relation”, which is in a predictive way: if there is a relation between A and B, then representing the relation means that from A and the relation, it is possible to predict B. For example: saying that the right signal is a delayed copy of the left signal, with delay 500 µs, means that if I know that the relation is true and I have the left signal, then I can predict the right signal. In the Jeffress model, or in fact in any model that represents the relation in the previous sense, it is possible to infer the right signal from the left signal and the representation, but only if the meaning of that representation is known, i.e., if it is known that a given neuron firing stands for “B comes 500 µs after A”. This is an important difference from the previous notion of representation, where the meaning of the relation in terms of acoustics was irrelevant.
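The predictive reading can be sketched in a few lines of Python. The names `predict_right` and `neuron_to_delay`, and the delays in samples, are hypothetical choices made for illustration. The point is only that the index of the active neuron is useless for prediction without the table that says what each neuron stands for.

```python
def predict_right(left, delay):
    """Predict the right signal as the left signal delayed by `delay` samples."""
    return [0.0] * delay + left[:len(left) - delay]

# Hypothetical "meaning" of the representation: the delay (in samples)
# that each binaural neuron stands for. This table is not contained
# in the neuron's firing itself.
neuron_to_delay = {0: 0, 1: 11, 2: 22}

left = [float(t % 7) for t in range(100)]
active_neuron = 2  # all that the firing pattern gives us

predicted = predict_right(left, neuron_to_delay[active_neuron])
```

Remove `neuron_to_delay` and `active_neuron = 2` is just a label: the same left signal could be paired with any right signal.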
We now have a substantial problem: where does the meaning of the representation come from? The firing of binaural neurons in itself does not tell us anything about how to reconstruct signals. To see the problem more clearly, imagine that the binaural neurons develop by selecting axons from both sides. In the end there is a set of binaural neurons whose firing stands for binaural relations with different ITDs. But by just looking at the activity of the binaural neurons after development, or at the activity of both the binaural neurons and the left monaural acoustical signal, it is impossible to know what the ITD is, or what the right acoustical signal is at any time. To be able to do this, one actually needs to have learned the meaning of the representation carried by the binaural neurons, and this learning seems to require both monaural inputs.
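A toy version of this thought experiment, again in Python and under loudly stated assumptions: the "development" step is replaced by a random shuffle of delays, the neurons are idealized coincidence detectors, and the names `hidden_delays`, `most_active_neuron` and `measure_delay` are made up for the sketch. It is meant to show only that the table from neuron index to delay has to be filled in by something that has access to both monaural signals.

```python
import random

rng = random.Random(1)

# Assumed outcome of development: neuron i is tuned to hidden_delays[i],
# in an arbitrary order that a downstream reader cannot see.
hidden_delays = list(range(0, 30, 3))
rng.shuffle(hidden_delays)

def delayed(signal, delay):
    """Copy of `signal` delayed by `delay` samples, padded with zeros."""
    return [0.0] * delay + signal[:len(signal) - delay]

def most_active_neuron(left, right):
    """Index of the coincidence detector with the largest summed product."""
    scores = [sum(l * r for l, r in zip(delayed(left, d), right))
              for d in hidden_delays]
    return scores.index(max(scores))

def measure_delay(left, right, max_delay=30):
    """Delay found by comparing BOTH monaural signals directly."""
    for d in range(max_delay):
        if delayed(left, d) == right:
            return d
    return None

# Learning the meaning: pair each winning neuron index with a delay
# measured from the two monaural signals.
learned = {}
for true_delay in range(0, 30, 3):
    left = [rng.gauss(0, 1) for _ in range(500)]
    right = delayed(left, true_delay)
    learned[most_active_neuron(left, right)] = measure_delay(left, right)
```

After the loop, `learned` recovers the hidden tuning, but only because `measure_delay` was given the right signal as well as the left one. The winning index together with the left signal alone would not have been enough.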
It now seems that this second notion of representation is not very useful, since in any case it requires all terms of the relation. This brings us to the notion that to represent a relation, and not just a specific instantiation of it (i.e., these particular signals have such a property), it must be represented in a sense that may apply to any instantiation. For example, if I know that a source is at a given location, I can imagine for any left signal what the right signal should be. Or, given a signal on the left, I can imagine what the right signal should be if the source were at a given location.
I’m ending this post with probably more confusion than when I started. This is partly intended. I want to stress here that once we start thinking of perceptual representations in terms of relations, then classical notions of neural representations quickly seem problematic or at least insufficient.