Affordances, subsumption, evolution and consciousness

James Gibson defended the idea that what we perceive of the environment is affordances, that is, the possibilities of interaction that things in the environment allow. For example, a knob affords twisting, or the ground affords support. The concept of affordance makes a lot of sense, but Gibson also insisted that we directly perceive these affordances. It has never been very clear to me what he meant by that. But following recent discussions, I have thought of a way in which this statement might make sense - although I have no idea whether this is what Gibson meant.

The way sensory systems work is traditionally described as an early extraction of “features”, like edges, which are then combined through a hierarchical architecture into more and more complex things, until one gets “an object”. In this view, affordances are obtained at the end of the chain, so their perception is not direct at all. In robotics, another kind of architecture was proposed by Rodney Brooks in the 1980s: the “subsumption architecture”. It was meant as a way to build robots incrementally, by progressively adding layers of complexity. In his example, the first layer of the robot is a simple control system in which external motor commands produce movement; a sonar computes a repulsive force when there is a wall in front of the robot, and this force is sent to the motor module. Then there is a second layer that makes the robot wander: it randomly chooses a direction at regular intervals and combines it with the force computed by the sonar in the first layer. The second layer is said to “subsume” the first one, i.e., it takes over. Then another level sits on top of it. The idea in this architecture is that the set of levels below any given level is functional: it can do something on its own. This is quite different from standard hierarchical sensory systems, in which the only purpose of each level is to send information to the next level. Here we get to Gibson’s affordances: if the most elementary level must be functional, then what it senses is not simple features, but rather simple affordances, simple ways to interact with its environment. So in this view, what is elementary in perception is affordances, rather than elementary sensations.
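To make the contrast with a hierarchical architecture concrete, here is a minimal sketch in Python of a two-layer controller in the spirit of Brooks’ example (the module names, thresholds and numbers are mine, not Brooks’ actual implementation). The point is simply that the lower layer is functional on its own, and that the upper layer subsumes it.

```python
import random

def avoid_layer(sonar_distance, heading=0.0):
    """Layer 0: functional on its own. Turns away from obstacles
    detected by the sonar; otherwise keeps the current heading."""
    if sonar_distance < 1.0:                         # obstacle closer than 1 m (arbitrary threshold)
        repulsion = (1.0 - sonar_distance) * 90.0    # stronger turn when closer
        return heading + repulsion
    return heading

def wander_layer(step, sonar_distance):
    """Layer 1: subsumes layer 0. Picks a random heading at regular
    intervals and passes it down; layer 0 still handles obstacles."""
    if step % 10 == 0:                               # choose a new direction every 10 steps
        wander_layer.heading = random.uniform(-180.0, 180.0)
    return avoid_layer(sonar_distance, wander_layer.heading)

wander_layer.heading = 0.0

# A toy run: the robot wanders, but still turns away when the sonar reports
# a nearby wall. Removing the wander layer leaves a robot that only avoids
# obstacles -- still a functional behaviour on its own.
for step in range(30):
    distance = random.uniform(0.2, 5.0)              # fake sonar reading (metres)
    command = wander_layer(step, distance)
    print(f"step {step:2d}  sonar {distance:4.1f} m  heading command {command:7.1f} deg")
```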

I think it makes a lot of sense from an evolutionary point of view that sensory systems should look like subsumption architectures rather than standard hierarchical perception systems. If each new structure (say, the cortex) is added on top of an existing set of structures, then the old set of structures has a function by itself, independently of the new structure. Somehow the old set is “subsumed” by the new structure, and the information this new structure gets must then already have a functional meaning. This would mean that affordances are the basis, and not the end result, of the sensory system. In this sense, perhaps, one might say that affordances are “directly” perceived.

When thinking about what it means for consciousness, I like to refer to the man and horse analogy. The horse is perfectly functional by itself. It can run, it can see, etc. Now the man on top of it can “subsume” the horse. He sends commands to it so as to move it where he wants, and also gets signals from the horse. The man is conscious, but he has no idea of what the horse feels, for example how the horse feels the ground. All the sensations that underlie the man-horse’s ability to move around are inaccessible to the conscious man, but it is not a problem at all for the man to go where he wants to.

Now imagine that the man is blind. If there is an obstacle in front of the horse, the horse might stop, perhaps get nervous, things that the man can feel. The man cannot feel the wall in terms of “raw sensations”, but he can perceive that there is something that blocks the way. In other words, he can perceive the affordance of the wall – something that affords blocking, without seeing the wall.

So in this sense, it does not seem crazy anymore that what we directly perceive (we = our conscious self) is made of affordances rather than raw sensations.

What is computational neuroscience? (II) What is theory good for?

To answer this question, I need to write about basic notions of epistemology (the philosophy of knowledge). Epistemology is concerned in particular with what knowledge is and how it is acquired.

What is knowledge? Essentially, knowledge is statements about the world. There are two types of statements. First, there are specific statements, or “observations”, for example “James has two legs”. But “All men have two legs” is a universal statement: it applies to an infinite number of observations, about the men I have seen but also about the men I might see in the future. We also call universal statements “theories”.

How is knowledge acquired? The naive view, classical inductivism, consists in collecting a large number of observations and generalizing from them. For example, one notes that all the men one has seen so far have two legs, and concludes that all men have two legs. Unfortunately, inductivism cannot produce universal statements with certainty. It is quite possible that one day you might see a man with only one leg. The problem is that there are always infinitely many universal statements consistent with any finite set of observations. For example, you can continue a finite sequence of numbers with any numbers you want, and it will still be a possible sequence of numbers.
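As a toy illustration of this last point (my own, not part of the original argument), here is a short Python sketch showing that infinitely many polynomials agree with the observed sequence 1, 2, 3, 4, 5 and yet continue it with any value whatsoever.

```python
def continuation(c):
    """A polynomial that equals 1, 2, 3, 4, 5 at x = 1..5, but whose value
    at x = 6 depends on the free parameter c."""
    def p(x):
        bump = 1.0
        for k in range(1, 6):
            bump *= (x - k)          # zero at every observed point
        return x + c * bump
    return p

for c in (0.0, 0.01, -0.5, 3.0):
    p = continuation(c)
    observed = [p(x) for x in range(1, 6)]    # always 1, 2, 3, 4, 5
    print(observed, "-> next term:", p(6))    # anything we like
```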

Therefore, inductivism cannot guide the development of knowledge. Karl Popper, probably the most influential philosopher of science of the twentieth century, proposed to solve this problem with the notion of falsifiability. What distinguishes a scientific statement from a metaphysical statement is that it can be disproved by an experiment. For example, “all men have two legs” is a scientific statement, because the theory could be disproved by observing a man with one leg. But “there is a God” is not a scientific statement. This is not to say that these statements are true or not true, but that they have a scientific nature or not (but note that, by definition, a metaphysical statement can have no predictable impact on any of our experience, otherwise this would produce a test of that statement).

Popper’s concept of falsifiability has had a huge influence on modern science, and it essentially determines what we call “experimental work” and “theoretical work”. In Popper’s view, an experiment is an empirical test designed to falsify a theory. More generally, it is a situation for which different theories predict different outcomes. Note how this concept is different from the naive idea of “observing the laws of nature”. Laws of nature cannot be “observed” because an experiment is a single observation, whereas a law is a universal statement. Therefore, from a logical standpoint, the role of an experiment is rather to distinguish between otherwise consistent theories.

The structure of a typical experimental paper follows this logic: 1) Introduction, in which the theoretical issues are presented (the different hypotheses about some specific subject), 2) Methods, in which the experiment is described in detail, so as to be reproducible, 3) Results, in which the outcomes are presented, 4) Discussion, in which the outcomes are shown to corroborate or invalidate various theories. Thus, an experimental paper is about formulating and performing a critical test of one, or usually several, theories.

Popper’s line of thinking seems to imply that knowledge can only progress through experimental work. Indeed, on purely logical grounds, theories can only be consistent or inconsistent, so logic alone cannot distinguish between logically consistent theories. Only empirical tests can corroborate or invalidate theories, and therefore produce knowledge. Hence the occasional demeaning comments that any theoretician has heard, to the effect that theories are mind games for a bunch of smart math-oriented people. That is, theory is useless, since only empirical work can produce scientific knowledge.

This is a really paradoxical remark, for theory is the goal of scientific progress. Science is not about accumulating data, it is about finding the laws of nature, a.k.a. theories. It is precisely the predictive nature of science that makes it useful. How can it be that science is about making theories, but that science can only progress through empirical work?

Maybe this is a misunderstanding of Popper’s reasoning. Falsifiability is about how to distinguish between theories. It clarifies what empirical work is about, and what distinguishes science from metaphysics. But it says nothing about how theories are formulated in the first place. Falsifiability is about empirical validation of theories, not about the mysterious process of making theories, which we might say is the “hard problem” of philosophy of science. Yet making theories is a central part of the development of science. Without theory, there is simply no experiment to be done. But more importantly, science is made of theories.

So I can now answer the question I started with. Theories constitute the core of any science. Theoretical work is about the development of theories. Experimental work is about the testing of theories. Accordingly, theoretical papers are organized quite differently from experimental papers, because the methodology is very different, but also because there is no standardized methodology (“how it should be”). A number of computational journals insist on enforcing the structure of experimental papers (introduction / methods / results / discussion), but I believe this is due to the view that simulations are experiments (Winsberg, Philosophy of Science 2001), which I will discuss in another post.

Theory is often depicted as speculative. This is quite right. Theory is, in essence, speculative, since it is about making universal statements. But this does not mean that theory is nonsense. Theories are usually developed so as to be consistent with a body of experimental data, i.e., they have an empirical basis. Biological theories also often include a teleonomic element, i.e., they “make sense”. These two elements impose hard constraints on theories. In fact, they are so constraining that I do not know of any theory that is consistent with all (or even most) experimental data and that makes sense in a plausible ecological context. So theory making is about finding principled ways to explain existing data, and at the same time to explain biological function. Because this is such a difficult task, theoretical work can have some autonomy, in the sense that it can produce knowledge in the absence of new empirical work.

This last point is worth stressing, because it departs significantly from the standard Popperian view of scientific progress, which makes it a source of misunderstandings between theoreticians and experimenters. I am referring to the complexity of biological organisms, shaped by millions of years of evolution. Biological organisms are made of physical things that we understand at some level (molecules), but at the same time they serve a project (the global project being reproductive invariance, in the words of Jacques Monod). That they serve a project is not the simple result of the interaction of these physical elements; rather, it is the result of evolutionary pressure. This means that even though, on one hand, we understand physics, or biophysics, to a high degree of sophistication, and, on the other hand, there are well-established theories of biological function, there is still a huge explanatory gap between the two. This gap is largely theoretical, in the sense that we are looking for a way to make these two aspects logically consistent. This is why I believe theoretical work is so important in biology. It also has two consequences that can be hard to digest for experimenters: 1) theory can be autonomous to some extent (i.e., there can be “good” and “bad” theories, independently of new empirical evidence), 2) theoretical work is not necessarily aimed at making experimental predictions.

This discussion raises many questions that I will try to answer in the next posts:

- Why are theoretical and experimental journals separate?

- Should theories make predictions?

- Should theories be consistent with data?

- What is a “biologically plausible” model? And by the way, what is a model?

- Is simulation a kind of experiment?

Rate vs. timing (VI) Synaptic unreliability

How much intrinsic noise is there in a neuron? This question would deserve a longer post, but here I will just make a few remarks. In vitro, when the membrane potential is recorded in current-clamp, little noise is seen. There could be hidden noise in the spike generating process (i.e., in the sodium channels), but when a time-varying current is injected somatically into a cortical neuron, the spike trains are also highly reproducible (Mainen & Sejnowski, 1995). This means that the main source of intrinsic noise in vivo is synaptic unreliability.

Transmission at a given synapse is unreliable, in general. That is, there is a high probability of transmission failure, in which there is a presynaptic spike but no postsynaptic potential. However, an axon generally contacts a postsynaptic neuron at multiple release sites, which we may consider independent. If there are N sites with a transmission probability p, then the variance of the noise represents a fraction x=(1-p)/(pN) of the squared expected PSP size (the signal). We can pick some numbers from Branco & Staras (2009). The numbers differ quite a bit between studies, but they give an order of magnitude. For cat and rat L2/3 pyramidal cells, we have for example N=4 and p=0.5 (ref. 148). This gives x=0.25. Another reference (ref. 149) gives x=0.07 for the same cells.

These numbers are not that big. But it is possible that transmission probability is lower in vivo. So we have to recognize that synaptic noise might be substantial. However, even if this is true, it is an argument in favor of the stochasticity of neural computation, not in favor of rate-based computation. In addition, I would like to add that synaptic unreliability has little impact on theories based on synchrony and coincidence detection. Indeed, a volley of synchronous presynaptic spikes arriving at a postsynaptic neuron has an essentially deterministic effect, by the law of large numbers. That is, synchronous input spikes are equivalent to multiple release sites. If there are m synchronous spikes, then the variance of the noise represents a fraction x=(1-p)/(pmN) of the squared compound PSP (the signal). Taking the same numbers as above, if there are 10 synchronous spikes then we get x=0.025 (ref. 148) and x=0.007 (ref. 149), i.e., an essentially deterministic compound PSP. And we have shown that neurons are very sensitive to fast depolarizations in a background of noise (Rossant et al. 2011). The theory of synfire chains is also about the propagation of synchronous activity in a background of noise, i.e., taking into account synaptic unreliability.
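For the reader who wants to check the arithmetic, here is a small Python sketch of the calculations above. It assumes independent binomial release at each site, so that x is the noise variance relative to the squared mean PSP; the numbers are those quoted from Branco & Staras (2009), ref. 148.

```python
def noise_fraction(p, N, m=1):
    """Noise variance relative to the squared mean PSP, for m synchronous
    presynaptic spikes onto N independent release sites with release
    probability p. Assumes binomial release: mean = m*N*p*q,
    variance = m*N*p*(1-p)*q^2, so the ratio is (1-p)/(p*m*N)."""
    return (1 - p) / (p * m * N)

# Single presynaptic spike (N=4, p=0.5, from ref. 148)
print(noise_fraction(p=0.5, N=4))          # 0.25
# Ten synchronous presynaptic spikes
print(noise_fraction(p=0.5, N=4, m=10))    # 0.025
```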

In summary, the main source of intrinsic noise in neurons is synaptic noise. Experimental figures from the literature indicate that it is not extremely large, but possibly substantial. However, as I noted in previous posts, the presence of large intrinsic noise does not invalidate spike-based theories, only deterministic ones. In addition, synaptic noise has little impact on synchronous events, and therefore it is essentially irrelevant for synchrony-based theories.

What is computational neuroscience? (I) Definitions and the data-driven approach

What is computational neuroscience? Simply put, it is the field that is concerned with how the brain computes. The word “compute” is not necessarily an analogy with the computer, and it must be understood in a broad sense. It simply refers to the operations that must be carried out to perform cognitive functions (walking, recognizing a face, speaking). Put this way, it might seem that this is pretty much the entire field of neuroscience. What distinguishes computational neuroscience, then, is that this field seeks a mechanistic understanding of these operations, to the point that they could potentially be simulated on a computer. Note that this means neither that computational neuroscience is mostly about simulating the brain, nor that the brain is thought of as a computer. It simply refers to the materialistic assumption that, if all the laws that underlie cognition are known in detail, then it should be possible to artificially reproduce them (assuming sufficient equipment).

Another related terminology is “theoretical neuroscience”. This is somewhat broader than computational neuroscience, and is probably an analogy to theoretical physics, a branch of physics that relies heavily on mathematical models. Theoretical neuroscience is not necessarily concerned with computation, at least not directly. One example could be the demonstration that action potential velocity is proportional to diameter in myelinated axons, and to the square root of the diameter in unmyelinated axons. This demonstration uses cable theory, a biophysical theory describing the propagation of electrical activity in axons and dendrites.
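As a purely illustrative sketch of these scaling laws (the proportionality constants below are arbitrary; this is of course not the cable-theoretic derivation itself):

```python
import math

def velocity_myelinated(d, k=6.0):
    """Myelinated axon: conduction velocity proportional to diameter.
    k is an arbitrary illustrative constant, not a fitted value."""
    return k * d

def velocity_unmyelinated(d, k=1.0):
    """Unmyelinated axon: velocity proportional to the square root of diameter
    (again with an arbitrary constant)."""
    return k * math.sqrt(d)

for d in (0.5, 1.0, 2.0, 4.0):   # diameters, arbitrary units
    print(f"d = {d:3.1f}: myelinated ~ {velocity_myelinated(d):5.2f}, "
          f"unmyelinated ~ {velocity_unmyelinated(d):5.2f} (arbitrary units)")
```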

“Quantitative neuroscience” also refers to the use of quantitative mathematical models as a tool to understand brain function or dynamics, but the substitution of “quantitative” for “theoretical” suggests that the field is more concerned with data analysis (as opposed to theories of how the brain works).

Finally, “neural modeling” is concerned with the use of quantitative neural models, generally biophysical models. The terminology suggests a data-driven approach, i.e., building models of neural networks from experimental measurements, based on existing theories. This is why I am somewhat uneasy with this terminology, for epistemological reasons. The data-driven approach implicitly assumes that it is possible and meaningful to build a functioning neural network from a set of measurements alone. This raises two critical issues. One is that it is based on what Francisco Varela called “neurophysiological subjectivism” (see this related post), the idea that perception is the result of neural network dynamics. Neurophysiological subjectivism is problematic because (in particular) it fails to fully recognize the defining property of living beings, which is teleonomy (in other words, function). Living organisms are constrained on one hand by their physical substrate, but on the other hand this substrate is tightly constrained by evolution – this is precisely what makes them living beings and not just spin glasses. The data-driven approach only considers the constraints deriving from measurements, not the functional constraints, but this essentially amounts to denying the fact that the object of study is part of a living being. Alternatively, it assumes that measurements are sufficiently constraining that function is entirely implied, which seems naive.

The second major issue with the data-driven approach is that it has a strong flavor of inductivism. That is, it implicitly assumes that a functioning model is directly implied by a finite set of measurements. But inductivism is a philosophical error, for there are an infinite number of theories (or “models”) consistent with any finite set of observations (an error pointed out by Hume, for example). In fact, Popper and his followers also noted that inductivism commits another philosophical error, which is to think that there is such a thing as a “pure observation”. Experimental results are always to be interpreted in a specific theoretical context (a.k.a. the “Methods” section). One does not “measure” a model. One performs a specific experiment and observes the outcome with tools, which are themselves based on currently accepted theories. In other words, an experimental result is the answer to a specific question. But the type of question is not “What is the time constant of the model?”, but rather “What exponential function can I best fit to the electrical response of this neuron to a current pulse?”. Measurements may then provide constraints on possible models, but they never imply a model. In addition, as I noted above, physical constraints (implied by measurements) are only one side of the story, functional constraints are the other side. Neglecting this other side means studying a “soup of neurons”, not the brain.
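To make this concrete, here is a minimal Python sketch (my own, using SciPy, with made-up numbers) of what that second question looks like in practice: one fits a chosen exponential charging curve to a simulated response to a current pulse, and the “time constant” only exists as a parameter of that fit, i.e., within a theoretical context.

```python
import numpy as np
from scipy.optimize import curve_fit

# Simulated "recording": voltage response of a neuron to a current step,
# generated here by a single-compartment RC model plus measurement noise.
t = np.linspace(0.0, 100.0, 500)                     # time (ms)
true_tau, true_amplitude = 20.0, 10.0                # ms, mV (made-up values)
v = true_amplitude * (1.0 - np.exp(-t / true_tau))
v += np.random.normal(0.0, 0.3, t.shape)             # recording noise

def charging_curve(t, amplitude, tau):
    """The exponential charging curve we choose to fit (the 'theoretical context')."""
    return amplitude * (1.0 - np.exp(-t / tau))

params, _ = curve_fit(charging_curve, t, v, p0=[5.0, 10.0])
print(f"best-fit amplitude = {params[0]:.1f} mV, best-fit tau = {params[1]:.1f} ms")
```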

In summary, it is often stated or implied that “realistic” models are those that are based on measurements: this is 1) an inductivist mistake, 2) a tragic disregard of what defines living beings, i.e. functional constraints.

I will end this post by asking a question: what is a better description of the brain? A soup of “realistic” neurons or a more conceptual mechanistic description of how interacting neurons support cognitive functions?

Rate vs. timing (V) Fast rate-based coding

Misconception #4: “A stochastic spike-based theory is nothing else than a rate-based theory, only at a finer timescale”.

It is sometimes claimed or implied that there is no conceptual difference between the two kinds of theories, the only difference being the timescale of the description (short timescale for spike-based theories, long timescale for rate-based theories). This is a more subtle misconception, which stems from a confusion between coding and computation. If one only considers the response of a neuron to a stimulus and how much information there is in that response about the stimulus, then yes, this statement makes sense.

But rate-based and spike-based theories are not simply theories of coding, they are also theories of computation, that is, of how responses of neurons depend on the responses of other neurons. The key assumption of rate-based theories is that it is possible and meaningful to reduce this transformation to a transformation between analog variables r(t), the underlying time-varying rates of the neurons. These are hidden variables, since only the spike trains are observable. The state of the network is then entirely defined by the set of time-varying rates. Therefore there are two underlying assumptions: 1) that the output spike train can be derived from its rate r(t) alone, 2) that a sufficiently accurate approximation of the presynaptic rates can be derived from the presynaptic spike trains, so that the output rate can be calculated.

Since spike trains are considered as stochastic with (expected) instantaneous rate r(t), assumption #1 means that spike trains are stochastic point processes defined from and consistent with the time-varying rate r(t) – they could be Poisson processes, but not necessarily. The key point here is that the spiking process is only based on the quantity r(t). This means in particular that the source of noise is independent between neurons.
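As an illustration of assumption #1, here is a minimal Python sketch (mine, with arbitrary numbers): two spike trains generated as inhomogeneous Poisson processes from the same underlying rate r(t), with sources of noise that are independent between neurons.

```python
import numpy as np

def poisson_spikes(rate_fn, duration, dt=0.1e-3, rng=None):
    """One realization of an inhomogeneous Poisson spike train with
    instantaneous rate rate_fn(t) in Hz, drawn independently in fine bins."""
    rng = rng or np.random.default_rng()
    t = np.arange(0.0, duration, dt)
    rates = rate_fn(t)
    spikes = rng.random(t.shape) < rates * dt   # independent draws in each bin
    return t[spikes]

# A shared time-varying rate r(t): 20 Hz modulated at 5 Hz (made-up numbers)
r = lambda t: 20.0 * (1.0 + np.sin(2 * np.pi * 5.0 * t))

# Two neurons "driven" by the same r(t): same statistics, independent spikes
train1 = poisson_spikes(r, duration=1.0)
train2 = poisson_spikes(r, duration=1.0)
print(len(train1), "and", len(train2), "spikes; shared rate, independent realizations")
```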

The second assumption means that the operation performed on input spike trains is essentially independent of the specific realizations of the random processes. There are two possible cases. One alternative is that the law of large numbers can be applied, so that integrating the inputs produces a deterministic value that depends on the presynaptic rates. But then the source of noise, which produces stochastic spike trains from a deterministic quantity, must be entirely intrinsic to the neuron. Given what we know from experiments in vitro (Mainen and Sejnowski, 1995), this is a fairly strong assumption. The other alternative is that the output rate depends on higher-order statistics of the total input (e.g. its variance) and not only on its mean (e.g. through the central limit theorem). But in this case, the inputs must be independent, for otherwise it would not be possible to describe the output rate r(t) as a single quantity, since the transformation would also depend on higher-order quantities (correlations).

In other words, the assumptions of rate-based theories mean that spike trains are realizations of independent random processes, with a source of stochasticity entirely intrinsic to the neuron. This is a strong assumption that has little to do with the description timescale.

This assumption is also known to be inconsistent, in general, in spiking neural network theory. Indeed, it is possible to derive self-consistent equations that describe the transformation between the input rates of independent spike trains and the output rate of an integrate-and-fire model (Brunel 2001), but these equations fail unless one postulates that connections between neurons are sparse and random. This postulate means that there are no short cycles in the connectivity graph, so that the inputs to a neuron are effectively independent. Otherwise, the assumption of independent outputs is inconsistent with the overlap in inputs between neurons. Unfortunately, neural networks in the brain are known to be non-random and to contain short cycles (Song et al. 2005).

To be fair, it is still possible that neurons that share inputs have weakly correlated outputs, if inhibition precisely tracks excitation (Renart et al. 2010). But it should be stressed that it is the assumptions of rate-based theories that require a specific non-trivial mechanism, rather than those of spike-based theories. It is ironic that spike-based theories are sometimes depicted as exotic by proponents of rate-based theories, while the burden of proof should in fact rest with the latter.

To summarize this post: the debate of rate vs. timing is not about the description timescale, but about the notion that neural activity and computation may be entirely and consistently defined by the time-varying rates r(t) in the network. This boils down to whether neurons spike in a stochastic, independent manner, conditionally on the input rates. It is worth noting that this is a very strong assumption, with currently very little evidence in favor, and a lot of evidence against.

Rate vs. timing (IV) Chaos

Misconception #3: “Neural codes can only be based on rates because neural networks are chaotic”. Whether this claim is true or not (and I will comment on it below), chaos does not imply that spike timing is irrelevant. To draw this conclusion is to commit the same category error as I discussed in the previous post, i.e., confusing rate vs. timing and stochastic vs. deterministic.

In a chaotic system, nearby trajectories quickly diverge. This means that it is not possible to predict the future state from the present state, because any uncertainty in estimating the present state will result in large errors in the predicted future state. For this reason, the state of the system at a distant time in the future can be seen as stochastic, even though the system itself is deterministic.

Specifically, in vitro experiments suggest that individual neurons are essentially deterministic devices (Mainen and Sejnowski 1995) – at least the variability seen in in vitro recordings is often orders of magnitude lower than in vivo. But a system composed of interacting neurons can be chaotic, and therefore for all practical purposes its state can be seen as random – so the chaos argument goes.

The fallacy of this argument can be seen by considering the prototypical chaotic system: the weather. It is well known that the weather cannot be predicted more than about 15 days into the future, because even tiny uncertainties in measurements make weather models diverge very quickly. But this does not mean that all you can do is pick a random temperature according to the seasonal distribution. It is still possible to make short-term predictions, for example. It also does not mean that the dynamics of the weather can be meaningfully described only in terms of mean temperatures (and other mean parameters). For example, there are very strong correlations between weather events occurring at nearby geographical locations. Chaos implies that it is not possible to make accurate predictions in the distant future. It does not imply that temperatures are random.

In the same way, the notion that neural networks are chaotic only implies that one cannot predict the state of the network in the distant future. This has nothing to do with the distinction between rate and spike timing. Rate (like the mean seasonal temperature) may still be inadequate to describe the dynamics of the system, and firing may still be correlated across neurons.

In fact, the chaos argument is an argument against rate-based theories, precisely because a chaotic system is not a random system. In particular, in a chaotic system, there are lawful relationships between the different variables. Taking the example of the weather again, the solutions of the Lorenz equations (a model of atmospheric convection) live in a low-dimensional manifold with a butterfly shape, known as the Lorenz attractor. Even though one cannot predict the values of the variables in the distant future, these variables evolve in a very coordinated way. It would be a mistake to replace them by their average values. Therefore, if it is true that neural networks are chaotic, then it is probably not true that their dynamics can be described in terms of rates only.
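For the curious, here is a minimal Python sketch of this point on the Lorenz system (standard parameter values, crude Euler integration): two nearby initial conditions end up far apart, yet the variables remain strongly correlated along the attractor.

```python
import numpy as np

def lorenz_trajectory(x0, steps=20000, dt=0.001, sigma=10.0, rho=28.0, beta=8.0/3.0):
    """Integrate the Lorenz equations with a simple Euler scheme."""
    traj = np.empty((steps, 3))
    x, y, z = x0
    for i in range(steps):
        dx = sigma * (y - x)
        dy = x * (rho - z) - y
        dz = x * y - beta * z
        x, y, z = x + dt * dx, y + dt * dy, z + dt * dz
        traj[i] = (x, y, z)
    return traj

a = lorenz_trajectory((1.0, 1.0, 1.0))
b = lorenz_trajectory((1.0, 1.0, 1.000001))      # tiny perturbation

# Sensitivity to initial conditions: the trajectories end up far apart
print("distance after 20 time units:", np.linalg.norm(a[-1] - b[-1]))
# Yet the variables are not independent: x and y are strongly correlated
print("corr(x, y) along the trajectory:", np.corrcoef(a[:, 0], a[:, 1])[0, 1])
```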

I will end this post by commenting on the notion that neural networks are chaotic. I very much doubt that chaos is an adequate concept to describe spiking dynamics. There are different definitions of a chaotic system, but essentially they state that a chaotic system is very sensitive to initial conditions, in the sense that two trajectories that are initially very close can be very far apart after a relatively short time. Now take a neuron and inject a constant current: it will fire regularly. In a second trial, inject the exact same current, but 1 ms later. Initially the state of the neuron is almost identical in both trials. But when the neuron fires in the first trial, its membrane potential diverges very quickly from the trajectory of the second trial. Is this chaos? Of course not, because the trajectories meet again about 1 ms later. In fact, I showed in a study of spike time reliability in spiking models (Brette and Guigon, 2003) that even if the trajectories diverge between spikes (as with the model dv/dt=v/tau), spike timing can still be reliable in the long run in response to fluctuating inputs. This counter-intuitive property can be seen as nonlinear entrainment.
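Here is a minimal Python sketch of the reliability phenomenon (for simplicity it uses a plain leaky integrate-and-fire neuron rather than the diverging model mentioned above, and all numbers are arbitrary): two trials with very different initial potentials, driven by the same fluctuating input, end up producing the same spike times after a transient.

```python
import numpy as np

def lif_spikes(current, v0, dt=0.1, tau=20.0, threshold=1.0, reset=0.0):
    """Leaky integrate-and-fire neuron driven by a given input signal.
    Returns spike times in ms (units are arbitrary throughout)."""
    v = v0
    spikes = []
    for i, I in enumerate(current):
        v += dt * (-v + I) / tau          # leaky integration
        if v >= threshold:
            spikes.append(i * dt)
            v = reset
    return spikes

rng = np.random.default_rng(0)
dt = 0.1                                              # ms
raw = rng.standard_normal(50000)                      # 5 seconds of frozen noise
kernel = np.exp(-np.arange(50) * dt / 5.0)            # smooth over ~5 ms
fluct = np.convolve(raw, kernel, mode="same")
current = 0.9 + 0.6 * fluct / fluct.std()             # same fluctuating input in both trials

trial1 = lif_spikes(current, v0=0.0, dt=dt)
trial2 = lif_spikes(current, v0=0.9, dt=dt)           # very different initial condition

# The first spikes differ, but after a transient the spike times should coincide
print("first spikes:", trial1[:3], "vs", trial2[:3])
print("last spikes: ", trial1[-3:], "vs", trial2[-3:])
```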

In summary, 1) chaos does not support rate-based theories, it rather invalidates them, and 2) chaos is probably not a very meaningful concept to describe spiking dynamics.

Rate vs. timing (III) Another category error

Misconception #2: “Neural responses are variable in vivo, therefore neural codes can only be based on rates”. Again, this is a category error. Neural variability (assuming this means randomness) is about determinism vs. stochasticity, not about rate vs. timing. There can be stochastic or deterministic spike-based theories.

I will expand on this point, because it is central to many arguments in favor of rate-based theories. There are two ways to understand the term “variable”, and I will first discard the meaning based on temporal variability. Interspike intervals (ISIs) are highly variable in the cortex (Softky and Koch, 1993), and their distribution is close to an exponential (or Gamma) function, as for Poisson processes (possibly with a refractory period). This could be interpreted as a sign that spike trains are realizations of random point processes. This argument is very weak, because the exponential distribution is also the maximum-entropy distribution for a given average rate, which means that maximizing the information content in the timing of spikes of a single train also implies an exponential distribution of ISIs. Temporal variability therefore cannot distinguish between rate-based and spike-based theories.

Therefore, the only reasonable variability-based argument in support of the rate-based view is the variability of spike trains across trials. In the cortex (but not so much in some early sensory areas, such as the retina and some parts of the auditory brainstem), both the timing and the number of spikes produced by a neuron in response to a given stimulus vary from one trial to another. This means that the response of a neuron to a stimulus cannot be described by a deterministic function. In other words, the stimulus-output relationship of neurons is stochastic. That is all this observation tells us (note that we may also argue that stochasticity only reflects uncertainty about hidden variables). That this stochasticity is entirely captured by an intrinsic time-varying rate signal is pure speculation at this stage. Therefore, the argument of spike train variability is about stochastic vs. deterministic theories, not about rate-based vs. spike-based theories. It only discards deterministic spike-based theories based on absolute spike timing. However, the prevailing spike-based theories are based on relative timing across different neurons (for example, synchrony or rank order), not on absolute timing.

In fact, the argument can be turned against rate-based theories. It is often written or implied that rate-based theories take into account biological variability, whereas spike-based theories do not. But actually, quite the opposite is true. Rate-based theories are fundamentally deterministic, and a deterministic description is obtained at the cost of averaging noisy responses over many neurons, or over a long integration time. On the other hand, spike-based theories take individual spikes into account, and therefore do not rely on averaging. In other words, it is not that rate-based descriptions account for more of the observed variability; they merely acknowledge that neural responses are noisy, without accounting for any of that variability. Accounting for more variability would require stochastic spike-based accounts. This confusion may stem from the fact that spike-based theories are often described in deterministic terms. But as stressed above, rate-based theories are also described in deterministic terms.

Throwing dice can be described by deterministic laws of mechanics. The fact that the outcomes are variable does not invalidate the laws of mechanics. It simply means that noise (or chaos) is involved in the process. Therefore criticizing spike-based theories for not being stochastic is not a fair point, and stochasticity of neural responses cannot be a criterion to distinguish between rate-based and spike-based theories.

Rate vs. timing (II) Rate in spike-based theories

To complement the previous post, I will comment on what firing rate means in spike-based theories. First of all, rate is important in spike-based theories. The timing of a spike can only exist if there is a spike. Therefore, the firing rate determines the rate of information in spike-based theories, but it does not determine the content of information.

A related point is energy consumption. The energy consumption of a cell is essentially proportional to the number of spikes it produces (taking into account the cost of synaptic transmission to target neurons) (Attwell and Laughlin, 2001). It seems reasonable to think that the organism tries to avoid any waste of energy; therefore, a cell that fires at a high rate must be doing something important. In terms of information, it is likely that the amount of information transmitted by a neuron is roughly proportional to, or at least correlated with, its firing rate.

From these two observations, it follows that, in spike-based theories, firing rate is a necessary correlate of information processing in a neuron. This stands in contrast with rate-based theories, in which rate is the basis of information processing. But both types of theories predict that firing rates correlate with various aspects of stimuli – and therefore that there is information about stimuli in firing rates, at least for an external observer.

Rate vs. timing (I) A category error

This post starts a series on the debate between rate-based and spike-based theories of neural computation and coding. My primary goal is to clarify the concepts. I will start by addressing a few common misconceptions about the debate.

Misconception #1: “Both rate and spike timing are important for coding, so the truth is in between”. This statement, I will argue, rests on what philosophers would call a “category error”: the point is not that only one of the alternatives can be right, it is that the two alternatives belong to different categories.

Neurons mainly communicate with each other using trains of spikes – at least this is what the rate vs. timing debate is concerned with. A spike train is completely characterized by the timing of its spikes. The firing rate, on the other hand, is an abstract quantity that is only defined in a limit involving an infinite number of spikes. For example, it can be defined for a single neuron as a temporal average: the inverse of the mean inter-spike interval. Note that rate is thus defined from the timing of spikes. These are therefore two different kinds of concepts: spike timing is what defines spike trains, whereas rate is an abstract mathematical construction on spike trains. The rate vs. timing debate is thus not about which one is right, but about whether rate is a sufficiently good description of neural activity. Spike-based theories do not necessarily claim that rate does not matter; they reject the notion that rate is the essential quantity that matters.

There are different ways to define the firing rate: over time (number of spikes divided by the duration, in the limit of infinite duration), over neurons (average number of spikes in a population of neurons, in the limit of an infinite number of neurons) or over trials (average number of spikes over an infinite number of trials). In the third definition (which might be the prevailing view), the rate is seen as an intrinsic time-varying signal r(t) and spikes are seen as random events occurring at rate r(t). In all these definitions, rate is an abstract quantity defined on the spike trains. Therefore when stating that the neural “code” is based on rates rather than spike timing, what is meant is that the concept of rate captures most of the important details of neural activity and computation, while precise spike timing is essentially meaningless. On the other hand, when stating that spike timing matters, it is not meant that rate is meaningless; it simply means that precise timing information cannot be discarded. Thus, these are not two symmetrical views: the stronger assumptions are on the side of the rate-based view. Now of course each specific spike-based theory makes a number of possibly strong assumptions. But the general idea that the neural “code” is based on individual spikes and not just rates is not based on strong assumptions. The rate-based view is based on an approximation, which may be a good one or a bad one. This is the nature of the rate vs. timing debate.
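To make these three definitions concrete, here is a small Python sketch (my own, on synthetic Poisson spike trains with a constant rate) computing the three kinds of averages from spike times; each one only approximates the abstract quantity, which is defined in a limit.

```python
import numpy as np

rng = np.random.default_rng(1)
duration = 10.0        # seconds
true_rate = 20.0       # Hz, constant for simplicity

def poisson_train(rate, duration):
    """One realization of a homogeneous Poisson spike train (spike times in s)."""
    n = rng.poisson(rate * duration)
    return np.sort(rng.uniform(0.0, duration, n))

# 1) Temporal average: one neuron, one long trial
train = poisson_train(true_rate, duration)
rate_time = len(train) / duration

# 2) Population average: many neurons, one short window
window = 0.1   # s
population = [poisson_train(true_rate, window) for _ in range(1000)]
rate_pop = np.mean([len(tr) for tr in population]) / window

# 3) Trial average (PSTH-style): one neuron, many repetitions, counts in a small bin
bin_width = 0.01   # s
trials = [poisson_train(true_rate, duration) for _ in range(500)]
counts = [np.sum((tr >= 1.0) & (tr < 1.0 + bin_width)) for tr in trials]
rate_trial = np.mean(counts) / bin_width

print(rate_time, rate_pop, rate_trial)   # all approximate 20 Hz, exactly 20 only in the limit
```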

On the role of voluntary action in perception

The sensorimotor theory of perception considers that to perceive is to understand the effect of active movements on sensory signals. Gibson’s ecological theory also places an emphasis on movements: information about the visual world is obtained by producing movements and registering how the visual field changes in lawful ways. Poincaré also defined the notion of space in terms of the movements required to reach an object or compensate for movements of an object.

Information about the world is contained in the sensorimotor “contingencies” or “invariants”, but why should it be important that actions are voluntary? Indeed, one could see movements as just another kind of sensory information (e.g. proprioceptive information, or “efference copy”), and a sensorimotor law would then just be a law defined on the entire set of accessible signals. I will propose two answers below. I only address the computational problem (why voluntary action is useful), not the problem of consciousness.

Why would it make a difference that action is voluntary? The first answer I will give comes from ideas discussed in robotics and machine learning, known as active learning, curiosity, or optimal experiment design. Gibson remarked that the term “information” is misleading when talking about sensory inputs. The senses cannot be seen as a communication channel, because the world does not send messages to be decoded by the organism. In fact, rather the opposite is true: the organism actively seeks information about the world by making specific actions that improve its knowledge. A good analogy is the game “20 questions”. One participant thinks of an object or person. The other tries to discover it by asking questions that can only be answered by yes or no. She wins if she can guess the object within 20 questions. Clearly it is very difficult to guess from the answers to randomly chosen questions. But by asking smart questions, one can quickly narrow the search down to the right object. In fact, with 20 questions one can discover up to 2^20, about a million, objects. Thus voluntary action is useful for efficiently exploring the world. Here, by “voluntary” it is simply meant that the action is a decision based on previous knowledge, intended to maximally increase future knowledge.
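A toy Python sketch of the arithmetic (mine, not part of the original argument): 20 well-chosen yes/no questions identify one object among 2^20 by bisection, whereas random guesses barely narrow the search.

```python
import random

n = 2 ** 20                      # about a million candidate "objects"
secret = random.randrange(n)     # the object the first player has in mind

# Smart questions: each answer halves the set of remaining candidates
low, high = 0, n
questions = 0
while high - low > 1:
    mid = (low + high) // 2
    questions += 1
    if secret >= mid:            # question: "is it in the upper half?"
        low = mid
    else:
        high = mid

print("guessed correctly:", low == secret, "after", questions, "questions")
# A random question ("is it object k?") eliminates at most one candidate,
# so 20 of them would still leave about a million possibilities.
```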

I can see another way in which voluntary action is useful, by drawing an analogy with philosophy of science. If perception is about inferring sensory or sensorimotor laws, then it raises an issue common to the development of science, which is how to infer universal laws from a finite set of observations. Indeed there are an infinite number of universal laws that are consistent with any finite set of observations – this is the problem of inductivism. Karl Popper argued that science progresses not by inferring laws, but by postulating falsifiable theories and testing them with critical experiments. Thus action can be seen as the test of a perceptual hypothesis. Perception without action is like science based on inductivism. Action can decide between several consistent hypotheses, and the fact that it is voluntary is what makes it possible to distinguish between causality and correlation (a fundamental problem raised by Hume). Here “voluntary” means that the action could have been different.

In summary, voluntary action can be understood as the test of a perceptual hypothesis, and it is useful both in establishing causal relationships and in efficiently exploring relevant hypotheses.