Why do neurons spike?

Why do neurons produce those all-or-none electrical events named action potentials?

One theory, based on the coding paradigm, is that the production of action potentials is like analog-to-digital conversion, which is necessary if a cell wants to communicate with a distant cell. It would not be necessary if neurons were only communicating with their neighbors. For example, in the retina, most neurons do not spike but interact through graded potentials, and only retinal ganglion cells produce spikes, which travel over long distances (note that there is actually some evidence of spikes in bipolar cells). In converting graded signals into discrete events, some information is lost, but that is the price to pay in order to transmit any signal at all over a long distance. There is some theoretical work on this trade-off by Manwani and Koch (1999).

Incidentally, this theory is sometimes (wrongly) used to argue that spike timing does not matter, because spikes would only be a proxy for an analog signal, reflected in the firing rate. This theory is probably not correct, or at least incomplete.

First, neurons start spiking before they make any synaptic contact, and that activity is important for normal development (Pineda and Ribera, 2009). Apparently, normal morphology and mature properties of ionic channels depend on the production of spikes. In many neuron types, those early spikes are long calcium spikes.

A more convincing argument, to me, is the fact that a number of unicellular organisms produce spikes. For example, in Paramecium, calcium spikes are triggered in response to various sensory stimuli and trigger an avoidance reaction, in which the cell swims backward (reversing the beating direction of its cilia). An interesting point here is that those sensory stimuli produce graded depolarizations in the cell, so from a pure coding perspective, the conversion of that signal to an all-or-none spike within the same cell seems very weird, since it reduces information about the stimuli. Clearly, coding is the wrong perspective here (as I have tried to argue in my recent review on the spike vs. rate debate). The spike should not be seen as a code for the stimulus, but rather as a decision or action, in this case to reverse the beating direction. This argues for another theory: that action potentials mediate decisions, which are by definition all-or-none.

Action potentials are also found in plants. For example, Mimosa pudica produces spikes in response to various stimuli, for example when it is touched, and those spikes mediate an avoidance reaction in which the leaves fold. Those are long spikes, mostly mediated by chloride (an outward current rather than an inward one). Again, the spike mediates a timed action. It also propagates along the plant: here spike propagation allows organism-wide coordination of responses.

It is also interesting to take an evolutionary perspective. I have read two related propositions that I found quite interesting (and neither is about coding). Andrew Goldsworthy proposed that spikes started as an aid to repair a damaged membrane. There is a lot of calcium in the extracellular space, so when the membrane is ruptured, calcium ions rush into the cell, and they are toxic. Goldsworthy argues that the flow of ions can be reduced by depolarizing the cell while repair takes place. We can immediately make two objections: 1) if the depolarization is itself mediated by calcium, then it defeats the purpose; 2) to stop calcium ions from flowing in, one needs to raise the potential to the reversal potential of calcium, which is very high (above 100 mV). I can think of two possible solutions. One is to trigger a sodium spike, but that doesn't really solve problem #2. Another might be to consider evenly distributed calcium channels on the membrane, perhaps together with calcium buffers/stores near them. When the membrane is ruptured, lots of calcium ions enter through the hole, and the concentration increases locally by a large amount, which probably immediately starts damaging and invading the cell. But if the depolarization quickly triggers the opening of calcium channels all over the membrane, then the membrane potential would increase quickly with relatively small, distributed changes in concentration. The electric field would then reduce the flow of ions through the hole. It's an idea, but I'm not sure the mechanism would be so efficient in protecting the cell.

Another related idea was proposed in a recent review by Brunet and Arendt. When the cell is ruptured, cellular events are triggered to repair the membrane. Brunet and Arendt propose that stretch-sensitive calcium channels evolved to anticipate damage: when the membrane is stretched, calcium enters through the channels and triggers the repair mechanisms before the damage actually happens. In this theory, it is the high toxicity of calcium that makes it a universal cellular signal. The theory does not directly explain why the response should be all-or-none, however. An important aspect, maybe, is cell-wide coordination: the opening of local channels must trigger a depolarization strong enough to open other calcium channels all over the membrane of the cell (or at least around the stretched point). If the stretch is very local, then this requires active amplification of the signal, which at a distance can only be electrical. In other words, fast coordination at the cell-wide level requires a positive electrical feedback, aka an action potential. Channels must also close (inactivate) once the cellular response has taken place, since calcium ions are toxic.

Why would there be sodium channels? It seems obvious: sodium ions are not as toxic as calcium, so it is advantageous to use sodium rather than calcium. However, this is not an entirely convincing response, since in the end calcium is still the intracellular signal. But a possible theory is the following: sodium channels appear whenever amplification is necessary but no cellular response is required at that location. In other words, sodium channels are useful for quickly propagating signals across the cell. It is interesting to note that developing neurons generally produce calcium spikes, which are converted to sodium spikes when the neurons start to grow axons and make synaptic contacts.

These ideas lead us to the following view: the primary function of action potentials is cell-wide coordination of timed cellular decisions, which is more general than fast intercellular communication.

Rate vs. timing (XXII) What Robert Rosen would say about rates and spikes

Robert Rosen was an influential theoretical biologist who worked in particular on the nature of life. He made the point that living organisms are very special kinds of natural systems, in that they are anticipatory systems (“Anticipatory Systems” is also the title of one of his most important books). He spends a substantial part of that book on epistemological issues, in particular on what a model is. The following figure is, in my opinion, a brilliant illustration of what a model is:

[Figure: Rosen's modeling relation – a natural system and a formal system linked by encoding and decoding mappings]

The natural system is what is being modeled, for example the brain or the solar system. The formal system is the model, for example a set of differential equations describing Newtonian mechanics. The model has state variables that represent observables of the natural system. The mapping from the natural system to those state variables is called “encoding” here – it corresponds to measurement. Decoding is the converse process. Causality describes changes occurring in the natural system, while implication describes changes in the formal system. A good model is one for which this diagram commutes, that is, causality in the natural system corresponds to implication in the formal system.
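To make the commutation requirement concrete, here is a toy sketch (my own example, not Rosen's; the cooling cup of coffee and all names are made up for illustration): the diagram commutes if measuring the natural system after it evolves gives the same result as evolving the measurement within the formal system.

```python
import numpy as np

# Toy "natural system": a cooling cup of coffee, simulated finely.
k, T_env, dt = 0.1, 20.0, 0.01

def causality(T, t):
    # evolution of the natural system over time t (fine-grained simulation)
    for _ in range(int(t / dt)):
        T += dt * (-k * (T - T_env))
    return T

def encode(T):
    # "encoding" = measurement: map the natural state to the model's variable
    return T  # here, temperature is directly observable

def implication(x, t):
    # evolution of the formal system: closed-form solution of the model
    return T_env + (x - T_env) * np.exp(-k * t)

T0, t = 90.0, 5.0
print(encode(causality(T0, t)))    # measure after the natural system evolves
print(implication(encode(T0), t))  # evolve the measurement within the model
# the two numbers agree (approximately): the diagram commutes, the model is good
```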

Let us apply this framework to the debate at hand: rate-based vs. spike-based theories of the brain. The question then is: can there be a good rate-based model of the brain, i.e., a model in which observables are rates, or is it necessary to include spikes in the set of observables? The question has little to do with the question of coding (how much information there is in either spike timing or rates about some other observable). It has to do with whether rates, as observables, sufficiently characterize the natural system so that the evolution of a formal system based on them can be mapped to the evolution of the natural system. In other words: do rates have a causal value in the dynamics of neural networks? It is easy to imagine how spikes might, because neurons (mainly) communicate with spikes and there are some biophysical descriptions of the effect of a single spike on various biophysical quantities. It is not so easy for rates. The problem is that in our current biophysical understanding of neurons, spikes are observables that have a causal role (e.g. the notion of a postsynaptic potential), but rates are primarily described as observables (averages of some sort) with no causal nature. To support rate-based theories is to demonstrate the causal nature of rates. As far as I know, it has not been done, and I have heard no convincing reason why rates might have a causal nature.

In fact, given that a rate is an observable that is defined on top of another observable, spikes, the question reduces to a more formal question about the relation between two formal systems: can a spike-based model of the brain be approximated by a rate model (in the same sense as depicted in the figure above)? This is an interesting remark, because now the question is not primarily empirical but formal, and therefore it can be addressed theoretically. In fact, this question has already been addressed: it is precisely the goal of all studies trying to derive mean-field descriptions of spiking neural networks. So far, the results of those studies are that 1) it is not possible in the general case; 2) it is possible under some specific assumptions about the structure of the spiking model, which are known not to be empirically valid (typically: random sparse connectivity, independent external noise to all neurons).

Rate vs. timing (XXI) Rate coding in motor control

Motor control is sometimes presented as the prototypical example of rate coding. That is, muscle contraction is determined by the firing rate of motoneurons, so ultimately the “output” of the nervous system follows a rate code. This is a very interesting example, precisely because it is actually not an example of coding, which I previously argued is a problematic concept.

I will briefly recapitulate what “neural coding” means and why it is a problematic concept. “Coding” means presenting some property of things in the world (the orientation of a bar, an image) in another form (spikes, rates). That a neuron “codes” for something means nothing more than that its activity co-varies with that thing. For example, pupillary diameter encodes the amount of light captured by the retina (because of the pupillary light reflex). Or blood flow in the primary visual cortex encodes local visual orientation (this is what is actually measured by intrinsic optical imaging). So coding is really about observations made by an external observer; it does not tell us much about how the system works. It is a common source of confusion, because when one speaks of neural coding there is generally the implicit assumption that the nervous system “decodes” it somehow. But presumably the brain does not “read out” blood flow to infer local visual orientation. The coding perspective leaves the interesting part (what is the “representation” for?) largely unspecified, which is the essence of the homunculus fallacy.

The control of muscles by motoneurons does not fit this framework, because each spike produced by a motoneuron has a causal impact on muscle contraction: its activity does not simply co-vary with muscle contraction, it causes it. So first of all, motor control is not an example of rate coding because it is not really an example of coding. But still, we might consider that it conforms to rate-based theories of neural computation. I examine this statement now.

I will now summarize a few facts about muscle control by motoneurons, which can be found in neuroscience textbooks. First of all, a motoneuron controls a number of muscle fibers, and each fiber is contacted by a single motoneuron (I will only discuss α motoneurons here). There is indeed a clear correlation between muscle force and the firing rate of the motoneurons. In fact, each single action potential produces a “muscle twitch”, i.e., the force increases for some time. There is also some amount of temporal summation, in the same way as temporal summation of postsynaptic potentials, so there is a direct relationship between the number of spikes produced by the motoneurons and muscle force.

Up to this point, it seems fair to say that firing rate is what determines muscle force. But what do we mean by that exactly? If we look at muscle tension as a function of time, resulting from a spike train produced by a motoneuron, what we see is a time-varying function that is determined by the timing of every spike. The rate-based view would be that the precise timing of spikes does not make a significant difference to that function. But it does make a difference, although perhaps a small one: for example, the variability of muscle tension is not the same if the spike train is regular (small variability) or random, e.g. Poisson (larger variability). Now this gets interesting: during stationary muscle contraction (no movement), motoneurons generate constant muscle tension and they fire regularly, unlike cortical neurons (for example). Two remarks: 1) this does not conform at all to the standard rate-based picture in which rate is the intensity of a Poisson process, since there is little stochasticity; 2) firing regularly is exactly what motoneurons should be doing to minimize the variability of muscle tension. This latter remark is particularly significant. It means that, beyond the average firing rate, spikes are precisely timed so as to minimize tension variability, and so spikes do matter. Thus motor control rather seems to support spike-based theories.
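This point is easy to check numerically. Below is a minimal sketch (my own toy illustration, not a physiological model): muscle tension is modeled as a sum of stereotypical twitches triggered by motoneuron spikes, and we compare a regular and a Poisson spike train with the same mean rate.

```python
import numpy as np

rng = np.random.default_rng(0)
dt, duration, rate = 0.001, 5.0, 20.0     # s, s, Hz
t = np.arange(0, duration, dt)

def twitch(tau=0.05):
    # twitch kernel: force transiently rises then decays after each spike
    s = np.arange(0, 5 * tau, dt)
    return (s / tau) * np.exp(1 - s / tau)

def tension(spike_train):
    # tension = temporal summation of one twitch per spike
    return np.convolve(spike_train, twitch())[:len(t)]

regular = np.zeros(len(t))
regular[::int(1 / (rate * dt))] = 1.0                     # one spike every 1/rate s
poisson = (rng.random(len(t)) < rate * dt).astype(float)  # same mean rate

for name, train in [("regular", regular), ("Poisson", poisson)]:
    x = tension(train)[len(t) // 2:]      # discard the initial transient
    print(name, "mean tension:", x.mean(), "std:", x.std())
```

The regular train yields a much smaller standard deviation of tension at the same mean, which is exactly the asymmetry discussed above.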

Rate vs. timing (XX) Flavors of spike-based theories (6) Predictive coding and spike-based inference

In two companion papers in Neural Computation (followed by a related paper on working memory), Sophie Denève developed a spike-based theory of Bayesian inference. It can be categorized as a representational spike-based theory, in the sense that spikes collectively represent some objective variable of the world, for which there is some uncertainty. It follows a typical Marr-ian approach, in which the function of the neural network (level 1) is first postulated, in terms of external properties of the world, and then the properties of the network (dynamics, connectivity) are derived. But unlike Marr’s approach, the algorithmic and physical levels are not considered independent, that is, the algorithm is defined directly at the level of spikes. In the first paper, it is assumed that a neuron codes for a binary hidden variable, corresponding to an external property of the world. The neuron must infer that variable from a set of observations, which are independent Poisson inputs whose rates depend on the binary value. The neuron codes for other neurons (as opposed to the external observer), that is, it is postulated that the log-odds of the hidden variable are estimated from the spike train produced by the neuron, as a sum of PSPs in a target neuron. Thus, the decoding process is fixed, and the dynamics of the neuron can then be deduced from an optimization principle, that is, so that the decoded quantity is as close as possible to the true quantity.

One can write a differential equation that describes how the log-odds evolve with the inputs. At any time, the neuron can also estimate the log-odds from its own output spike train. A spike is then produced if it brings that estimate closer to the true value of the log-odds, calculated from the inputs.
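Here is a minimal sketch of this greedy spiking rule (a simplified caricature of my own, not Denève's actual equations): the neuron tracks a target log-odds signal driven by the inputs, maintains a decoded estimate as a sum of decaying PSPs from its own spikes, and spikes whenever doing so reduces the decoding error.

```python
import numpy as np

rng = np.random.default_rng(1)
dt, duration = 0.001, 2.0
tau, w = 0.1, 1.0                   # PSP time constant (s) and PSP weight
n = int(duration / dt)

# target log-odds driven by the inputs (a random walk as a placeholder)
L = np.cumsum(rng.normal(0, 0.05, n))

L_hat, spikes = 0.0, []
for i in range(n):
    L_hat *= np.exp(-dt / tau)      # the decoded estimate decays like a PSP
    # spike only if it brings the estimate closer to the target:
    if abs(L[i] - (L_hat + w)) < abs(L[i] - L_hat):
        L_hat += w
        spikes.append(i * dt)

print(len(spikes), "spikes; final decoding error:", abs(L[-1] - L_hat))
```

Note that spiking here is entirely deterministic: each spike is a timed corrective decision, not a random sample from a rate.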

In this proposition, the spiking process is deterministic and spiking is seen as a timed decision. This is completely different from rate-based theory, in which spikes are random and instantiate a time-varying quantity. Although this is a rather abstract sensory scenario, the idea that spikes could be corrective signals is powerful. It connects to the point I made in the previous post: the key point in spike-based theories is not temporal precision or reproducibility, as is sometimes wrongly claimed, but the fact that spikes from different neurons are coordinated. When this theory is extended to a population of neurons, an immediate consequence is that spiking decisions depend on the decisions made by other neurons, and it follows that although spiking is deterministic at the cellular level and precise at the functional level (estimation of the hidden variable), it may not be reproducible between trials. In fact, even the relative timing of neurons may not be reproducible – for exactly the same reason as in sparse coding theory.

Rate vs. timing (XIX) Spike timing precision and sparse coding

Spike-based theories are sometimes discarded on the basis that spike timing is not reproducible in vivo, in response to the same stimulus. I already argued that, in addition to the fact that this is a controversial statement (because for example this could be due to a lack of control of independent variables such as attentional state), this is not a case for rate-based theories but for stochastic theories.

But I think it also reveals a misunderstanding of the nature of spike-based theories, because in fact even deterministic spike-based theories may predict irreproducible spike timing. Underlying the noise argument is the assumption that spikes are produced by applying some operation to the stimulus and thresholding the result. If the timing of these spikes is not reproducible between trials, so the argument goes, then there must be noise inserted at some point in the operation. However, spike-based theories, at least some of them, do not fit this picture. Rather, the hypothesis is that spikes produced by different neurons are coordinated so as to produce some function. But then there is no reason why spikes need to be produced at the same time by the same neurons in all trials in order to produce the same global result. What matters is that spikes are precisely coordinated, which means that the firing of one neuron depends on the previous firing of other neurons. So if one neuron misses a spike, for example, this will affect the firing of other neurons, precisely so as to preserve the result of the computation. In other words, the hypothesis of precise spike-based coordination implies that the firing of a spike by a single neuron impacts the firing of all other neurons, which makes individual firing non-reproducible.

The theory of sparse coding is in line with this idea. In this theory, it is postulated that the stimulus can be reconstructed from the firing of neurons. That is, each spike contributes a “kernel” to the reconstruction, at the time of the spike, and all such contributions are added together so that the reconstruction is as close as possible to the original stimulus. Note how this principle is in some way the converse of the previously described principle: the spikes are not described as the result of a function applied to the stimulus, rather the stimulus is described as a function of the spikes. So spike encoding is defined as an inverse problem. This theory has been rather successful in explaining receptive fields in the visual (Olshausen) and auditory (Lewicki) systems. It is also meant to make sense from the point of view of minimizing energy consumption, as it minimizes the number of spikes required to encode the stimulus with a given precision. There are two interesting points here, regarding our present discussion. First, it appears that spikes are coordinated in the way I just described above: if one spike is missed, then the other spikes should be produced so as to compensate for this loss, which means there is a precise spike-based coordination between neurons. Second, the pattern of spikes is seen as a solution to an inverse problem. This implies that if the problem is degenerate, then there are several solutions that are equally good in terms of reconstruction error. Imagine for example that two neurons contribute exactly the same kernel to the reconstruction – which is not useless, if one considers that the firing rate of each neuron is limited by its refractory period. Then on a given trial, either of these two neurons may spike. From the observer's point of view, this represents a lack of reproducibility. However, this lack of reproducibility is precisely due to the fact that there is a precise spike-based coordination between neurons: to minimize the reconstruction error, just one of the two neurons should be active, and its timing should be precise too.
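A minimal sketch of this scheme (a greedy caricature in the spirit of matching pursuit; my own illustration, not Olshausen's or Lewicki's actual algorithms): the stimulus is reconstructed as a sum of fixed kernels, one per spike, and a spike is fired at the time where it most reduces the reconstruction error. With a degenerate pair of neurons carrying identical kernels, either twin may fire at that time, so the reconstruction is reproducible while single-neuron spike times are not.

```python
import numpy as np

rng = np.random.default_rng(2)
n, klen = 300, 15
stimulus = np.convolve(rng.normal(0, 1, n), np.hanning(25), mode="same")
kernel = np.hanning(klen)
kernel /= np.linalg.norm(kernel)          # unit-norm kernel

def atom(t):
    # the kernel contributed by a spike at time bin t, as a length-n signal
    x = np.zeros(n + klen)
    x[t:t + klen] = kernel
    return x[:n]

atoms = np.array([atom(t) for t in range(n)])

recon, spikes = np.zeros(n), []
for _ in range(500):
    gains = atoms @ (stimulus - recon)    # projection of the residual on each atom
    t = int(np.argmax(gains))
    if gains[t] <= 0.5:                   # a spike at t no longer reduces the error
        break
    neuron = rng.integers(2)              # degenerate pair: either twin may fire
    spikes.append((int(neuron), t))
    recon += atoms[t]

print(len(spikes), "spikes; relative reconstruction error:",
      np.linalg.norm(stimulus - recon) / np.linalg.norm(stimulus))
```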

Sparse coding with spikes also implies that reproducibility should depend on the stimulus. That is, a stimulus that is highly redundant, such as a sinusoidal grating, makes for a degenerate inverse problem, leading to a lack of reproducibility of spikes, precisely because of the coordination between spikes; a stimulus that is highly informative, such as a movie of a natural scene, should lead to higher reproducibility of spikes. Therefore, in the sparse coding framework, the spike-based coordination hypothesis predicts, contrary to rate-based theories, that spike time reproducibility should depend on the information content of the stimulus – in the sense that a more predictable stimulus leads to more irreproducible spiking. But even when spiking is not reproducible, it is still precise.

Rate vs. timing (XVIII) Spiking as analog-digital conversion: the evolutionary argument

Following on the previous post, with the analog-digital analogy often comes the idea that the relation between rates and spikes is that of an analog-to-digital conversion; or spiking is seen as an analog-to-digital conversion of the membrane potential. I believe this comes from the evolutionary argument that spikes appeared for the fast propagation of information over long distances, not because there is anything special about them in terms of computation. It is quite possible that this was indeed the evolutionary constraint that led to the appearance of action potentials (although this is pure speculation), but even if it is true, the reasoning is wrong: for example, the ability of humans to make tools might have developed because they stood up, yet standing up does not explain tool-making at all. Standing up allows new possibilities, but these possibilities follow a distinct logic. Spikes might have appeared primarily to transmit information over long distances, but once they are there, they have properties that can be used, possibly for other purposes, in new ways. In addition, that they appeared to transmit information, and that the information was analog, does not mean the information is now used in the same way. Consider: to transmit information over long distances, one uses Morse code on the telegraph. Do you speak to the telegraph? No, you change the code and use a discrete code that has little connection with the actual sound wave. Finally, even if all this makes sense, it still is not an argument in favor of rate-based theories, because rate is an abstract quantity that is derived from spikes. So if we wanted to make the case that spikes are only there to carry a truly analog value, the membrane potential, then this would lead us to discard spikes as a relevant descriptive quantity, and a fortiori to discard rates as well. From a purely informational viewpoint (in the sense of Shannon), the spikes produced by a neuron carry less information than its membrane potential, but the rate carries even less information, since it is abstracted from spikes.

Rate vs. timing (XVII) Analog vs. digital

It is sometimes stated that rate-based computing is like analog computing, while spike-based computing is like digital computing. The analogy comes, of course, from the fact that spikes are discrete whereas rates are continuous. But like any analogy, it has its limits. First of all, spikes are not discrete in the way digital numbers are discrete. In digital computing, the input is a stream of binary digits, coming one after another in a cadenced sequence. The digits are gathered in blocks, say of 16 or 32, to form words that stand for instructions or numbers. Let us examine these two facts with respect to spikes. Spikes do not arrive in a cadenced sequence: they arrive at irregular times, and time is continuous, not digital. What is presumably meant by digital is that there can be a spike or no spike, but nothing in between. However, given that there is also a continuous timing associated with the occurrence of a spike, a spike is better described as a timed event than as a binary digit. Of course, one could decide to divide the time axis into small time bins and associate a digit 0 with bins where there is no spike and 1 where there is a spike. This is certainly possible, but as one makes the bins finer to approximate the real spike train, it appears that there are very few 1s drowned in a sea of 0s. This is what is meant by “event”: information is carried by the occurrence of 1s at specific times, rather than by specific combinations of 0s and 1s as in digital computing. So in this sense, spike-based computing is not very similar to digital computing.
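As a toy illustration of this binning argument (my own example), discretizing a 10 Hz spike train at 1 ms resolution yields a binary sequence that is almost entirely zeros:

```python
import numpy as np

rng = np.random.default_rng(3)
dt, duration, rate = 0.001, 10.0, 10.0       # 1 ms bins, 10 s, 10 Hz
bits = (rng.random(int(duration / dt)) < rate * dt).astype(int)
print("fraction of 1s:", bits.mean())        # ~0.01: a few events in a sea of 0s
```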

The second aspect of the analogy is that digits are gathered in words (of say 32 digits), and these words are assigned a meaning in terms of either an instruction or a number. Transposed to spikes, these “words” could be the temporal pattern of spikes of a single neuron, or perhaps more meaningfully a pattern of spikes across neurons, as in synchrony-based schemes, or across neurons and time, as in polychronization. Now there are two ways of understanding the analogy. Either a spike pattern stands for a number, and in this case the analogy is not very interesting, since this is pretty much saying that spikes implement an underlying continuous value, in other words this is the rate-based view of neural computation. Or a spike pattern stands for a symbol. This case is more interesting, and it may apply to some proposed spike-based schemes (like polychronization). It emphasizes the idea that unlike rate-based theories, spike-based theories are not (necessarily) related to usual mathematical notions of calculus (e.g. adding numbers), but possibly to more symbolic manipulations.

However, this does not apply to all spike-based theories. For example, in Sophie Denève’s spike-based theory of inference (which I will describe in a future post), spike-based computation actually implements some form of calculus. But in her theory, analog signals are reconstructed from spikes, in the same way as the membrane potential results from the action of incoming spikes, rather than the other way around as in rate-based theories (i.e., a rate description is postulated, then spikes are randomly produced to implement that description). So in this case the theory describes some form of calculus, but based on timed events.

This brings me to the fact that neurons do not always interact with spikes. For example, in the retina, many neurons do not spike. There are also gap junctions, through which the membrane potentials of several neurons directly interact. There are also ephaptic interactions (through the extracellular field potential). There is also evidence that the shape of action potentials can influence downstream synapses (see a recent review by Dominique Debanne). In these cases, we may speak of analog computation. But this does not bring us closer to rate-based theories. In fact, quite the opposite: rates are abstracted from spikes, and stereotypical spikes are an approximation of what really goes on, which may involve other physical quantities. The point here is that firing rate is not a physical quantity like the membrane potential; it is an abstract variable. In this sense, spike-based theories, because they are based on actual biophysical quantities in neurons, might be closer to what we might call “analog descriptions” of computation than rate-based theories.

Rate vs. timing (XVI) Flavors of spike-based theories (5) Rank order coding

I started with an overview of spike-based theories based on synchrony. I would like to stress that synchrony-based theories should not be mistaken for theories that predict widespread synchrony in neural networks. In fact, quite the opposite: since synchrony is considered a meaningful event, it is implied that it is a rare event (otherwise it would not be informative). But there are also theories that do not assign any particular role to synchrony, which I will discuss now.

One popular theory based on asynchrony is rank order coding, or “first spike” theories. It was popularized in particular by Simon Thorpe, who showed that humans can categorize faces in so short a time that any neuron in the processing chain could fire little more than one spike. This observation rules out theories based on temporal averages, both rate-based and interval-based. Instead, Simon Thorpe and colleagues proposed that the information is carried by the order in which spikes are fired. Indeed, receptors that are more excited (receiving more light) generally fire earlier, so the order in which receptors fire carries information that is isomorphic to the pattern of light on the retina. However, by itself, the speed of processing does not rule out processing schemes that do not rely on temporal averages, for example synfire chains or rate-based schemes based on spatial averages. Indeed, it is known that the speed at which the instantaneous firing rate of a population of noisy neurons can track a time-varying input is very fast and is not limited by the membrane time constant – an interesting point is that this fact is consistent with integrate-and-fire models but not with isopotential Hodgkin-Huxley models, but this is another story. However, one argument against rate-based schemes is that they are much less energetically efficient, since information is used only after averaging. To be more precise, a quantity can theoretically be estimated from the firing of N neurons with a precision of order 1/N if their responses are coordinated, but only of order 1/√N if the neurons are independent. In other words, the same level of precision requires N² neurons in a rate-based scheme vs. N neurons in a spike-based scheme.
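This scaling is easy to illustrate numerically (a toy example of my own, not taken from the papers above): we estimate a rate from the population spike count in a one-second window, comparing independent Poisson neurons with a coordinated population whose spikes are evenly interleaved, so that individual errors cancel instead of adding.

```python
import numpy as np

rng = np.random.default_rng(4)
rate = np.pi                                   # the quantity to estimate (Hz)
trials = 10000
rmse = lambda x: np.sqrt(((x - rate) ** 2).mean())

for N in [10, 100, 1000]:
    # independent neurons: each count is Poisson(rate), errors add up
    indep = rng.poisson(rate, size=(trials, N)).sum(axis=1) / N
    # coordinated neurons: the population interleaves its spikes like a single
    # regular train of rate N*rate (with a random global phase), errors cancel
    coord = np.floor(N * rate + rng.random(trials)) / N
    print(f"N={N}: independent ~ {rmse(indep):.4f} (1/sqrt(N)), "
          f"coordinated ~ {rmse(coord):.4f} (1/N)")
```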

Computationally, first-spike codes are not fundamentally different, at a conceptual level, from standard rate-based codes, because first-spike latency is monotonically related to input intensity. However, one interesting difference is that if only the rank order, and not the exact timing, is taken into account, then this code becomes invariant to monotonic transformations of input intensity, for example global changes in contrast or luminance. It is not invariant to more complex transformations, however.
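A quick check of this invariance (a toy example of mine, with a made-up latency function): under any monotonic transformation of the intensities, the latencies change but their rank order does not.

```python
import numpy as np

rng = np.random.default_rng(5)
intensity = rng.random(8) + 0.1                 # 8 receptors, positive intensities
latency = lambda I: 1.0 / I                     # toy model: stronger input fires earlier
order_before = np.argsort(latency(intensity))
order_after = np.argsort(latency(0.2 * intensity ** 3 + 0.01))  # monotonic transform
print(np.array_equal(order_before, order_after))  # True: rank order is preserved
```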

Rank order codes are also different at a physiological level. Indeed an interesting aspect of this theory is that it acknowledges a physiological fact that is ignored by both rate-based theories and synchrony-based theories, namely the asymmetry between excitation and inhibition in neurons. How can a neuron be sensitive to the temporal order of its inputs? In synchrony-based theories, which rely on excitation, neurons are sensitive to the relative timing of their inputs rather than to their temporal order. Indeed temporal order is discontinuous with respect to relative timing: it abruptly switches at time lag 0. Such a discontinuity is provided by inhibition: excitation followed by inhibition is more likely to trigger a spike than inhibition followed by excitation. The asymmetry is due to the fact that spikes are produced when the potential exceeds a positive threshold (i.e., the trajectory crosses the threshold from below).
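To make this asymmetry concrete, here is a minimal integrate-and-fire sketch (my own toy example, with arbitrary parameters): an excitatory input followed by an inhibitory one reaches threshold, while the reverse order does not.

```python
import numpy as np

dt, tau, threshold = 0.1, 10.0, 1.0            # ms, membrane time constant, a.u.

def spiked(t_exc, t_inh, w_exc=1.2, w_inh=-1.2, duration=50.0):
    v, crossed = 0.0, False
    for t in np.arange(0, duration, dt):
        v += dt * (-v / tau)                   # leaky integration
        if abs(t - t_exc) < dt / 2: v += w_exc # excitatory input
        if abs(t - t_inh) < dt / 2: v += w_inh # inhibitory input
        crossed = crossed or v >= threshold    # threshold crossed from below
    return crossed

print("E then I:", spiked(t_exc=10, t_inh=15))  # True: threshold reached before inhibition
print("I then E:", spiked(t_exc=15, t_inh=10))  # False: excitation lands on a hyperpolarized cell
```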

One criticism of rank order coding is that it requires a time reference. Indeed, when comparing two spike trains, any spike is both followed and preceded by a spike from the other train, unless only the “first spike” is considered. Such a time reference, which defines the notion of “first spike”, could be the occurrence of an ocular saccade, or the start of an oscillation period if there is a global oscillation in the network that can provide a common reference.

Rate vs. timing (XV) Flavors of spike-based theories (4) Synchrony as a sensory invariant

I finish this overview of synchrony-based theories with my recent proposal (PLoS Comp Biol 2012). In the next posts, I will discuss theories based on asynchrony. In the theories I have described so far, the starting point is a code based on spike timing, in general a spatiotemporal pattern of spikes assumed to represent some sensory input. But the connection between the sensory input and the spike pattern is not addressed, or at least not considered a central issue. My proposition connects spike-based computation with the psychological theory of James Gibson, specifically the notion of structural invariant. Gibson starts his book “The Ecological Approach to Visual Perception” by criticizing the idea that perception is the process of inferring the objective properties of the world from ambiguous patterns of sensory data, as is often postulated. Indeed, since perception is the source of all knowledge, it is inconsistent to view the objective properties of the world as preexisting perception. But how then can one know anything about the world?

I will rephrase Gibson’s thinking in a different way by using the dictionary analogy. Inferring the objective world from an image or some sensory data is like looking in a dictionary for the translation of a word in one’s native language. In fact, this is precisely what is generally meant by the “neural coding” metaphor. But this cannot be used to understand a new word in one’s own native language. Instead one uses a different kind of dictionary, in which the word is defined in relationship with other words. Thus the definition of objects in the world is relational, not inferential. Inference can only be secondary, since one must first know what is to be inferred.

How does this relate to perception? Gibson argues that information about the world is present in the invariant structure of sensory inputs, that is, in properties of sensory inputs (relationships) that persist through time – which is to say, the laws that sensory inputs follow. More precisely, an invariant structure is a relationship that is invariant with respect to some change. The notion has been extended to sensorimotor relationships by Kevin O’Regan. Two simple examples in hearing are pitch perception and sound localization. Sounds that evoke a pitch are generally periodic. Periodicity is a relationship on the sensory input, i.e., S(t+T) = S(t) for all times t (where T is the period and S(t) is the acoustical pressure), and it is precisely this relationship, rather than the spectrum of the sound, that defines the pitch (for example, pitch is unchanged if the fundamental frequency is missing). This relationship is not spatial, because it is unaffected by one's movements. In the same way, a sound source produces two acoustical waves at the two ears that have a specific relationship, for example (if sound diffraction is neglected) the wave at the contralateral ear is a delayed version of the wave at the ipsilateral ear. This relationship is spatial, because it is affected by one’s movements. Besides, there is a systematic relationship between interaural delay and head position that is isomorphic to the source direction. Therefore this relationship can be identified with the source direction, without the need for an externally defined notion of physical angle. When a sound is presented in a noisy environment, the direction has to be inferred, since it is ambiguous, but what is inferred is the relationship that defines the direction.

How does this relate to synchrony? Simply put, synchrony is a relationship defined through time, so it qualifies as invariant structure. In my paper, I show how this relationship between spike timings can correspond to a relationship between sensory inputs by introducing the concept of “synchrony receptive field” (SRF). The SRF of a given pair of neurons is the set of sensory signals that elicit synchronous spiking in the two neurons (it can be extended to a group of neurons). Suppose the two neurons receive different versions of the sensory signal S: F(S) and G(S). Then assuming a deterministic mapping from signals to spikes, synchrony reflects the relationship F(S) = G(S), a relationship defined on the sensory inputs. Therefore, across the neural population, synchrony patterns reflect the set of relationships on the sensory inputs, and neurons that respond to coincidences signal these relationships.
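Here is a minimal sketch of the idea (a toy version of my own, loosely in the spirit of the sound localization scheme in Goodman and Brette 2010, not the actual model): two neurons receive differently delayed copies of a source signal and spike at threshold crossings; their spikes are synchronous exactly when the delays match, i.e., when the stimulus satisfies F(S) = G(S).

```python
import numpy as np

rng = np.random.default_rng(6)
n = 5000
source = np.convolve(rng.normal(0, 1, n), np.hanning(50), mode="same")

def spike_times(signal, threshold=0.5):
    # spike whenever the signal crosses the threshold from below
    above = signal > threshold
    return np.flatnonzero(~above[:-1] & above[1:])

def neuron_input(ear_delay, axonal_delay):
    # ear signal delayed by the source's interaural delay, then by an axonal delay
    return np.roll(source, ear_delay + axonal_delay)

itd = 7                                           # interaural time difference (bins)
left = spike_times(neuron_input(itd, axonal_delay=0))    # neuron fed by the left ear
for d in [0, 7, 12]:                              # candidate axonal delays, right side
    right = spike_times(neuron_input(0, axonal_delay=d))
    sync = np.intersect1d(left, right).size / max(left.size, 1)
    print(f"axonal delay {d}: fraction of synchronous spikes = {sync:.2f}")
# only the pair whose delay matches the ITD (d = 7) fires in perfect synchrony
```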

This mechanism can be used practically to recognize relationships, for example: detecting an odor defined by ratios of receptor affinities, estimating the pitch of a sound, estimating the location of a sound source (Goodman and Brette, PLoS CB 2010), estimating binocular disparities. The key computational benefit is that it solves the difficult problem of invariance, that is, the fact that objects of perception are invariant under many different perspectives. For example a face can be seen under different angles, or the same sound source can produce different sounds at the same location. To be more precise, the problem is dissolved by this approach rather than solved. Indeed, the key insight is that invariance is only a problem for an inferential process. When the objects to be perceived are defined instead by relationships, then there is no invariance problem, since a relationship is itself an invariant. For example, periodicity is a relationship that is invariant to the spectrum of a sound.

The theory connects to the other spike-based theories I mentioned previously. Indeed, sensory relationships are reflected by synchrony between neurons (or relative spike timing, considering conduction delays), and it addresses the problem of binding in the same way as synfire chains: sensory signals that are not temporally coherent, and therefore not originating from the same object, cannot produce synchronous firing. It also connects with the polychronous theory of working memory: the spike pattern that is stored in that theory corresponds here to a sensory relationship. This makes it possible to store sensory relationships in the form of spike timing relationships, without the need for an explicit conversion to a “rate code”.

On the empirical side, the theory relies on the fact that neurons operate in a fluctuation-driven regime, in which excitation and inhibition are approximately balanced (or inhibition dominant), as empirically observed. But shouldn’t this theory predict widespread synchrony in neural populations, unlike what is observed in the brain? In fact it should not. First of all, synchrony is only informative if it is a rare event. This is precisely what is captured by the concept of synchrony receptive field: synchrony occurs only for specific sensory signals (or more precisely, sensory relationships). Even though I did not include it in the paper, it would actually make sense that correlations that are not stimulus-specific (i.e., those that can be predicted) are minimized as much as possible. This would support the idea that recurrent inhibition is tuned to cancel excitatory correlations (see my previous post), which would produce weak correlations on average.

Rate vs. timing (XIV) The neural "code"

I am taking a break before continuing my review of spike-based theories, because I want to comment on the notion of “neural code”. These are keywords used in a large part of the neuroscience literature, and I think they are highly misleading. For example, you could say that neurons in V1 “code for orientation”. But what this statement refers to, in reality, is simply that if we record the response of such a neuron to an oriented bar, we observe that its firing rate is modulated by the orientation, peaking at an orientation that is then called the “preferred orientation”. First of all, the notion of a “preferred orientation” is just a definition tied to the specific experimental protocol (the same is true of the notion of best delay in sound localization). Empirically speaking, it is an empty statement. In particular, by itself it does not mean that the cell actually “prefers” some orientations to others in any way, because a preferred orientation can be defined in any case and could be different for different protocols – it is just the stimulus parameter giving the maximum response. So the only empirical statement associated with the claim “the neuron codes for orientation” is in fact: the neuron’s firing rate varies with orientation. Therefore, using the word “codes” is just a more appealing way of saying “varies”; the empirical content is no more than “varies”.

In what sense, then, can we say that the neuron “codes” for orientation? Coding means presenting some information in a way that can be decoded. That is, the neuron codes for an orientation with its firing rate in the sense that the orientation can be inferred from its firing rate. Here we get to the first big problem with the notion of a “neural code”. If the firing rate varies with orientation and one knows exactly how (quantitatively), then of course it is possible to infer some information about orientation from the firing rate. The way you would decode a particular firing rate into an estimated orientation is by looking at the tuning curve, obtained by the experimental protocol, and looking for the orientation that gives the best-matching firing rate. But this means that the decoding process, and therefore the code, is meant from the experimenter’s point of view, not from the organism’s point of view. The organism does not know the experimental protocol, so it cannot make the required inference. If all the organism can use to decode the orientation is a number of spikes, then clearly this task is nearly impossible, because without additional knowledge, a tremendous number of stimuli could produce that same number of spikes (e.g. by varying contrast, or simply presenting something other than a bar). Thus the first point is that this notion of a code is experimenter-centric, so talking about a “neural code” in this sense is highly misleading, as the reader of the code is not the neurons but the experimenter.

So the first point is that, if the notion of a neural code is to make any sense at all, it should be refined so as to remove any reference to the experimental protocol. One clear implication is that the idea that a single neuron can code for anything is highly questionable: is it possible to infer anything meaningful about the world from a single number (a spike count), with no a priori knowledge? Perhaps the joint activity of a set of neurons makes more sense. This reduces the interest of “tuning curves” in terms of coding – they may still be informative about what neurons “care about”, but not about how they represent information, if there is such a thing. Secondly, removing any reference to the experimental protocol means that one can speak of a neural code for orientation only if it does not depend on other aspects, e.g. contrast. Indeed, if the responses were sensitive to orientation but also to everything else in the stimulus, how could one claim that the neuron codes for orientation? Finally, thinking of a code with a neural observer in mind means that, perhaps, not all codes make sense. Indeed, is the function of V1 to “represent” the maximum amount of visual information? This view, and the search for “optimal codes” in general, seems very odd from the organism’s point of view: why devote so many neurons and so much energy to representing exactly the same amount of information that is already present in the retina? If a representation has any use, then it must be different from the original presentation in nature, and not just in content. So the point is not how much information there is, but in what form it is represented. This means that codes cannot be envisaged independently of a potential decoder, i.e., a specific way in which neurons use the information.

I now come to a deeper criticism of the notion of neural code. I started by showing that the notion is often meant in the sense of a code for the observer, not for the brain. But let us say that we have fixed that error and we are now looking for neural codes with a neural-centric view rather than an experimenter-centric view. Still, the methodology is: looking at neural responses (rates or spike timings) and trying to find how much information there is and in what form. Clearly, then, the notion that neurons code for things is not an empirical finding: it is an underlying assumption of the methodology. It starts, not ends, by assuming that neurons fire so that the rest of the brain can observe this firing and extract the information in it. It is postulated, not observed, that what neurons do is produce some form of representation for the rest of the brain to see. This appears to be very centered on the way we, external observers, acquire knowledge about the brain, and it has a strong flavor of the homunculus fallacy.

I suggest we consider another perspective on what neurons do. Neurons are cells that continuously change in many respects, molecular and electrical. Even though we may want to describe some properties of their responses, spikes are transient signals; there is nothing persistent in them, in the way there is in a painting. So neurons do not represent the world the way a painter would. Second, spikes are not things that a neuron leaves there for observers to see, like the pigments on a painting. On the contrary, a neuron produces a spike and actively sends it to target neurons, where changes occur because of this spike. This is much more like an action than like a representation. Thus it is wrong to say that the postsynaptic neuron “observes” the activity of presynaptic neurons; rather, it is influenced by it. So neural activity is not a representation, it is an action.

To briefly summarize this post: neurons do not code. Coding is a view that can only be adopted by an external observer; it is not a very meaningful description of what neurons do. Perhaps it is more relevant to say that neurons compute. But the best description, probably, is to say that neurons act on other neurons by means of their electrical activity. To connect with the general theme of this series: these observations emphasize that the basis of neural computation is truly spikes, and that rates are an external-observer-centric description of neural activity.