I have submitted a contribution to the Assises de la recherche entitled "Une analyse économique du système de recherche" ("An economic analysis of the research system"). It consists of a few simple remarks, general in scope, on the organization of the research system.
In a recent paper, I explained how to compute with neural synchrony, by relating synchrony with the Gibsonian notion of sensory invariants. Here I will briefly recapitulate the arguments and try to explain what can and cannot be done with this approach.
First of all, neural synchrony, as any other concept of neural code, should be defined from the observer point of view, that is, from the postsynaptic point of view. Detecting synchrony is detecting coincidences. That is, a neural observer of neural synchrony is a coincidence detector. Now coincidences are observed when they occur in the postsynaptic neuron, not when the spikes are produced by the presynaptic neurons. Spikes travel along axons and therefore generally arrive after some delay, which we may consider fixed. This means that in fact, coincidence detectors do not detect synchrony but rather specific time differences between spike trains.
I will call these spike trains Ti(t), where i is the index of the presynaptic neuron. Detecting coincidences means detecting relationships Ti(t)=Tj(t-d), where d is a delay (for all t). Of course we may interpret this relationship in a probabilistic (approximate) way. Now if one assumes that the neuron is a somewhat deterministic device that transforms a time-varying signal S(t) into a spike train T(t), then detecting coincidences is about detecting relationships Si(t)=Sj(t-d) between analog input signals.
To make the connection with perception, I then assume that the input signals are determined by the sensory input X(t) (which could be a vector of inputs), so that Si(t)=Fi(X)(t). So computing with neural synchrony means detecting relationships Fi(X)(t)=Fj(X)(t-d), that is, specific properties of the stimulus X (Fi is a linear or nonlinear filter). You could see this as a sensory law that the stimulus X(t) follows, or with Gibson’s terminology, a sensory invariant (some property of the sensory inputs that does not change with time).
So this theory describes computing with synchrony as the extraction of sensory invariants. The first question is, can we extract all sensory invariants in this way? The answer is no: only those relationships that can be written as Fi(X)(t) = Fj(X)(t-d) can be detected. But then isn’t the computation already done by the primary neurons themselves, through the filters Fi? This would imply that synchrony does not achieve anything, computationally speaking. But this is not true. The set of relationships between signals Fi(X)(t) is not the same thing as the set of signals themselves. For one, there are more relationships than signals: if there are N encoding neurons, then there are N² relationships, times the number of allowed delays. But more importantly, a relationship between signals does not have the same nature as a signal. To see this, consider just two auditory neurons, one that responds to sounds from the left ear only, and one that responds to sounds from the right ear only (and neglect sound diffraction by the head to simplify things). Neither of these neurons is sensitive at all to the location of the sound source. But the relationships between the input signals to these two neurons are informative about sound location. Relationships and signals are two different things: a signal is a stream of numbers, while a relationship is a universal statement on these numbers (aka “invariant”). So to summarize: synchrony represents sensory invariants, which are not represented in the individual neurons, but only a limited number of sensory invariants. For example, if the filters Fi are linear, then only linear properties of the sensory input can be detected. Thus, sensory laws are not produced but rather detected, among a set of possible laws.
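The two-ear example can be made concrete with a small Python sketch. This is only an illustration under simplifying assumptions, not code from the paper: the "right" signal is taken to be an exact delayed copy of the "left" one, the 7-sample delay is an arbitrary choice, and the helper names (`delayed_copy`, `mismatch`, `best_delay`) are made up for this example. Neither signal alone says anything about the delay, but testing the relationship Si(t) = Sj(t-d) over a set of candidate delays recovers it, whatever the source signal is.

```python
import random

def delayed_copy(signal, delay):
    """Return the signal delayed by `delay` samples (zero-padded at the start)."""
    return [0.0] * delay + signal[:len(signal) - delay]

def mismatch(s_i, s_j, delay):
    """Mean squared difference between s_i(t) and s_j(t - delay)."""
    diffs = [(s_i[t] - s_j[t - delay]) ** 2 for t in range(delay, len(s_i))]
    return sum(diffs) / len(diffs)

def best_delay(s_i, s_j, max_delay):
    """Candidate delay d for which the relationship s_i(t) = s_j(t - d) holds best."""
    return min(range(max_delay + 1), key=lambda d: mismatch(s_i, s_j, d))

def estimate_for_source(seed, true_delay=7):
    """Draw an arbitrary noise source, build both ear signals, estimate the delay."""
    rng = random.Random(seed)
    source = [rng.gauss(0.0, 1.0) for _ in range(2000)]
    left = source                              # left ear: the source itself
    right = delayed_copy(source, true_delay)   # right ear: delayed copy (assumption)
    return best_delay(right, left, max_delay=20)

# Three different source signals, same spatial relationship between the ears.
estimates = [estimate_for_source(seed) for seed in (0, 1, 2)]
print(estimates)
```

The source changes from one run to the next, so both input signals change, but the detected relationship does not. That is the sense in which the relationship, rather than either signal, carries the location.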
Now the second question: is computing with synchrony only about extracting sensory invariants? The answer is also no, because the theory is based on the assumption that the input signals to the neurons and their synchrony are mostly determined by the sensory inputs. But they could also depend on “top-down” signals. Synchrony could be generated by recurrent connections, that is, synchrony could be the result of a computation rather than (or in addition to) the basis of computation. Thus, to be more precise, this theory describes what can be computed with stimulus-induced synchrony. In Gibson’s terminology, this would correspond to the “pick-up” of information, i.e., the information is present in the primary input, preexisting in the form of the relationships between transformed sensory signals (Fi(X)), and one just needs to observe these relationships.
But there is an entire part of the field that is concerned with the computational role of neural oscillations, for example. If oscillations are spatially homogeneous, then it does not affect the theory – it may in fact be simply a way to transform similarity of slowly varying signals into synchrony (this mechanism is the basis of Hopfield and Brody’s olfactory model). If they are not, in particular if they result from interactions between neurons, then this is a different thing.
I am writing this post from the Sensory Coding and Natural Environment conference in Vienna. It’s a very interesting conference about a topic that I like very much, but it strikes me that many approaches I have seen seem to miss the point of what is natural about natural sensory signals.
So what is natural about natural sensory signals? It seems that a large part of the field, from what I have heard, answers that these are signals that have natural statistics. For example, they have particular second and higher order statistics, both spatially and temporally. While this is certainly true to some extent, I don’t find it a very satisfying answer.
Suppose I throw a rock in the air, and I can see its movement until it reaches the ground. The visual signals that I capture can be considered “natural”. What is natural about the motion of the rock, is it that the visual signals have particular statistics? Probably they do, but to me a more satisfying answer is that it follows the law of gravitation. Efficient coding approaches often tend to focus on statistics, because “the world is noisy” (or, “the brain is noisy”). However, even though there is turbulence in the air, describing the motion of the rock as obeying the law of gravitation (possibly with some noise) is still more satisfying than describing its higher order statistics – and possibly more helpful for an animal too.
In other words, I propose that what is natural about sensory signals is that they follow the laws of nature.
By the way, this view is completely in agreement with Barlow’s efficient coding principle, which postulates that neurons encode sensory information in an efficient way, i.e., they convey a maximum amount of information with a minimum number of spikes. Indeed representing the laws that govern sensory signals leads to a parsimonious description of these signals.
In his book on vision, David Marr acknowledges the fact that a major computational issue for sensory systems is to extract relevant information in a way that is invariant to a number of changes in the world. For example, to recognize a face independently of its orientation and distance. Here we hit a major difference between representational theories and what I shall call structural theories, such as Gibson’s ecological theory (see my post on the difference between these two theories). In a representational theory, invariant processing is obtained by building a representation that is itself invariant to a number of transformations (e.g. translations, rotations). How can this representation be built? There are two ways: either it is wired (innate) or it is acquired, learned by associating many transformed instances of the same object with the same “percept”. So in a representational theory, dealing with invariance is a tedious learning process requiring supervision. In a structural theory, the problem actually does not exist, because the basis of perception is precisely invariants.
I will give an example in hearing. There are two theories of pitch perception. Pitch is the percept associated with how low or high a musical note is. It mostly corresponds to the periodicity of the sound wave. Two periodic sounds with the same repetition rate will generally have the same pitch. But they may have different timbres, i.e., different spectral contents. In the spectral or template theory, there is an initial representation of sounds consisting of a spectral pattern. It is then compared with the spectral patterns of reference periodic sounds with various pitches, the templates. These templates need to be learned, and the task is not entirely trivial because periodic sounds with the same pitch can have non-overlapping spectra (for example a pure tone, and a complex tone without the first harmonic). The spectral theory of pitch is a representational theory of pitch. In this account, there is nothing special about pitch, it is just a category of sound spectra.
The temporal theory of pitch, on the other hand, postulates that the period of a sound is detected. I call it a structural theory because pitch corresponds to a structural property of sounds, their periodicity. One can observe that the same pattern in the sound wave is repeated, at a particular rate, and this observation does not require learning. Now this means that if two sounds with the same period are presented, I can immediately recognize that they share the same structural property, i.e., they have the same pitch. Learning, in a structural theory, only means associating a particular structure with a label (say, the name of a musical note). The invariance problem disappears in a structural theory, because the basis of the percept is an invariant: the periodicity does not depend on the sound’s spectrum. This also means that sounds that elicit a pitch percept are special because they have a particular structure. In particular, periodic sounds are predictable. White noise, on the other hand, has no structure and does not elicit a pitch percept.
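As a rough illustration of the structural view, here is a Python sketch of a very simple repetition detector. It is an assumption-laden toy, not a model of pitch from the literature: the period of 100 samples, the choice of harmonics, and the function names `tone` and `detect_period` are all made up for the example. It takes a pure tone and a complex tone that lacks the first harmonic, so their spectra do not overlap, and looks for the shift at which each waveform best repeats itself.

```python
import math

def tone(period, harmonics, n_samples):
    """Sum of sine harmonics of a fundamental with the given period (in samples)."""
    return [sum(math.sin(2 * math.pi * k * t / period) for k in harmonics)
            for t in range(n_samples)]

def detect_period(signal, max_lag):
    """Lag at which the waveform best matches a shifted copy of itself."""
    def mismatch(lag):
        diffs = [(signal[t] - signal[t - lag]) ** 2 for t in range(lag, len(signal))]
        return sum(diffs) / len(diffs)
    return min(range(1, max_lag + 1), key=mismatch)

pure = tone(100, [1], 1000)                 # pure tone: first harmonic only
complex_tone = tone(100, [2, 3, 4], 1000)   # complex tone without the first harmonic

# max_lag is kept below twice the period so that only one repetition is in range.
period_pure = detect_period(pure, 150)
period_complex = detect_period(complex_tone, 150)
print(period_pure, period_complex)
```

Both sounds yield the same period even though they share no spectral component, and nothing had to be learned to establish that. A template-based scheme would instead need stored spectra for both sounds.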
In his book “Vision”, David Marr briefly comments on James Gibson’s ecological approach, and rejects it. He makes a couple of criticisms that I think are fair, for example the fact that Gibson seemed to believe that extracting meaningful invariants from sensory signals is somehow trivial, while it is a difficult computational problem. But David Marr seems to have missed the important philosophical points in James Gibson’s work. These points have also been made by others, for example Kevin O’Regan, Alva Noë, but also Merleau-Ponty and many others. I will try to summarize a few of these points here.
I quote from David Marr: “Vision is a process that produces from images of the external world a description that is useful to the viewer and not cluttered with irrelevant information”. There are two philosophical errors in this sentence. First, that perception is the production of a representation. This is a classical philosophical mistake, the homunculus fallacy. Who then sees this representation? Marr even explicitly mentions a “viewer” of this representation. One would have to explain the perception of this viewer, and this reasoning leads to an infinite regress.
The second philosophical mistake is more subtle. It is to postulate that there is an external source of information, the images in the retina, that the sensory system interprets. This is made explicit later in the book: “(...) the initial representation is in no doubt – it consists of arrays of image intensity values as detected by the photoreceptors in the retina”. This fact is precisely what Gibson doubts at the very beginning of his book, The Ecological Approach to Visual Perception. Although it is convenient to speak of information in sensory signals, it can be misleading. It makes a parallel with Shannon’s theory of communication, but the environment does not communicate with the observer. Surfaces reflect light waves in all directions. There is no message in these waves. So the analogy between a sensory system and a communication channel is misleading. The fallacy of this view is fully revealed when one considers the voluntary movements of the observer. The observer can decide to move and capture different sensory signals. In Gibson’s terminology, the observer samples the ambient optic array. So what is primary is not the image, it is the environment. Gibson insists that a sensory system cannot be reduced to the sensory organ (say, the eyes and the visual cortex). It must include active movements, embedded in the environment. This is related to the embodiment theory.
We tend to feel that what we see is like the image of a high-resolution camera. This is a mistake due to the immediate availability of visual information (by eye movements). In reality, a very small part of the visual field has high resolution, and a large part of the retina has no photoreceptors (the blind spot). We do not feel this because when we need the information, we can immediately direct our eyes towards the relevant target in the visual field. There is no need to postulate that there is an internal high-resolution representation in which we can move our “inner eye”. Rodney Brooks, a successful researcher in artificial intelligence and robotics, once stated “the world is its own best model”. The fact that we actually do not have a high-resolution mental representation of the visual world (an image in the mind) has been demonstrated spectacularly through the phenomena of change blindness and inattentional blindness, in which a major change in an image or movie goes unnoticed (see for example this movie).
What is the difference between neural correlation and neural synchrony? As I am interested in the role of synchrony in neural computation, I often hear the question. I will try to give a few answers here.
A simple answer is: it’s a question of timescale. That is, synchrony is correlation at a fine timescale, or more precisely, at a timescale shorter than the integration time constant of the neuron. In this sense, the term synchrony implicitly acknowledges that there is an observer of these correlations. This usage is consistent with the fact that neurons are very sensitive to the relative timing of their inputs within their integration time constant (see our recent paper in J Neurosci on the subject).
However, although I have been satisfied with this simple answer in the past, I now feel that it misses the point. I think the distinction rather has to do with the distinction between the two main theories of neural computation, rate-based theories vs. spike-based theories. The term “correlation” is often used in the context of rate-based theories, whereas the term “synchrony” is used in general in the context of spike-based theories (as in my recent paper on computing with neural synchrony). The difference is substantial, and it does not really have to do with the timescale. A correlation is an average, just as a firing rate is an average. Therefore, by using the term correlation, one implicitly assumes that the quantities of interest are averages. In this view, correlations are generally seen as modulating input-output properties of neurons, in a rate-based framework, rather than being the substance of computation. But when using the term synchrony, one does not necessarily refer to an average, simply to the fact that two spikes occur at a similar time. For example, in my recent paper on computing with neural synchrony, I view coincidence detection as the detection of a rare event, that is, a synchrony event that is unlikely to occur by chance. If one takes this view further, then meaningful synchrony is in fact transient, and therefore the concept cannot be well captured by an average, i.e., by correlation.
The distinction might not be entirely obvious, so I will give a simple example here. Consider two Poisson inputs with rate F. Consider one spike from neuron A. The probability that neuron B spikes within time T after this spike can be calculated (integral of an exponential distribution), and for small T it is essentially proportional to T and to F (so the rate of such chance coincidences is proportional to T and to F squared). If T is very small and the two inputs are independent, this event will almost never happen. So if it does happen, even just once, then it is unexpected and therefore meaningful, since it means that the assumption of independence was probably wrong. In a way, a coincidence detector can be seen as a statistical test: it tests the coincidence of input spikes against the null hypothesis, which is that inputs are independent. A single synchrony event can make this test fail, and so the concept cannot be fully captured by correlation, which is an average.
To summarize, synchrony is not about determinism vs. stochasticity, it is not about correlation on a very fine timescale, or about very strong correlation, it is about relative timing in individual spiking events, and about how likely such an event is to occur by chance under an independence hypothesis.
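The numbers behind this argument can be checked with a few lines of Python. The values of F and T below are purely illustrative assumptions. For a Poisson neuron, the waiting time to the next spike is exponentially distributed, so the probability of at least one spike of B within T after a given spike of A is 1 - exp(-F·T), which is about F·T for small T. The rate of chance coincidences is then about F²·T per unit time.

```python
import math

def prob_spike_within(rate, window):
    """P(a Poisson neuron with this rate fires at least once within `window`)."""
    return 1.0 - math.exp(-rate * window)

F = 10.0    # firing rate of each input, in Hz (illustrative value)
T = 0.001   # coincidence window, in seconds (illustrative value)

p_exact = prob_spike_within(F, T)   # chance coincidence after one given spike of A
p_approx = F * T                    # small-window approximation
chance_rate = F * p_exact           # chance coincidences per second, about F**2 * T

print(p_exact, p_approx, chance_rate)
```

With these values a given spike of A is followed by a chance coincidence about once in a hundred times, and shrinking T makes the event proportionally rarer. This is why a single observed coincidence within a very short window already counts as evidence against independence.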
Boris Gourévitch and I have just published a paper in ecological acoustics:
Gourévitch B and Brette R (2012). The impact of early reflections on binaural cues. JASA 132(1):9-27.
This is a rather technical paper in which we investigate how binaural cues (ITDs, ILDs) are modified in an ecological environment in which there are reflections. Indeed most sound localization studies use HRTFs recorded in anechoic conditions, but apart perhaps from flying animals, anechoic conditions are highly unecological. That is, even in free field, there is always at least a ground on which sound waves reflect. In this paper, we focus on early reflections. In the introduction, we motivate this choice by the fact that the precedence effect (perceptual suppression of echoes) only acts when echoes arrive after a few ms, and therefore early reflections should not be suppressed. Another, perhaps simpler, argument is that in a narrow frequency band, a sound will always interfere with its echo when the echo arrives less than a couple of periods after the direct sound. Therefore, early reflections produce interference, which is seen in the binaural cues. An important point is that these are deterministic effects, not variability. In the paper, we analyze these effects quantitatively with models (rigid spheres and sophisticated models of sound absorption by the ground). One implication is that in ecological environments and even with a single sound source and in the absence of noise, there may be very large interaural time differences, which carry spatial information.
It seems that I haven't written a blog entry for three years now. Oops! I figured I could use this blog to announce new papers. Since I have not written here for a while, it will be a long post!
Here is my most recently published paper:
Brette R (2012). Computing with neural synchrony. PLoS Comp Biol. 8(6): e1002561. doi:10.1371/journal.pcbi.1002561.
It is a theoretical paper (no experimental data). I try to address two questions: 1) does synchrony matter?, 2) what is synchrony good for, from a functional point of view?
Does synchrony matter?
In fact, we addressed the first question in a paper published last year, combining theory and slice experiments:
Rossant C, Leijon S, Magnusson AK, Brette R (2011). Sensitivity of noisy neurons to coincident inputs. J Neurosci 31(47):17193-17206.
In the PLoS paper above, I complement this analysis with another point of view, so I'll try to summarize the results of both papers here. Synchrony (that is, correlations at a short timescale, comparable to the integration time constant of neurons) has been observed in a number of experimental studies. But some authors would argue that it is a meaningless correlate of neural network dynamics. Our approach is very straightforward: what is, quantitatively, the impact of coincident input spikes on the output of a postsynaptic neuron? Is it a mild modulation, or is it very large? There are different ways to approach the issue, but essentially the result is: extremely large.
One approach is based on the analysis of the impact of correlations on the membrane potential distribution. Neurons generally live (in vivo) in a fluctuation-driven regime: they fire irregularly, in response to fluctuations of their total synaptic input above the spike threshold. In this regime, the output rate depends not only on the mean input, but also on its variance. The mean does not depend on input correlations. But the variance does. When the inputs are independent, the variance scales as N, the number of synapses. When they are not, the variance is the sum of covariances between all pairs of inputs, and there are about N² of them. This means that even if pairwise correlations are of order 1/N, they still have a major impact on the neuron's output. Therefore, for correlations not to matter would require specific cellular or network mechanisms that cancel these correlations at all times.
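The scaling argument can be spelled out numerically. The sketch below assumes, for simplicity, that all N inputs have the same variance and that every pair has the same correlation coefficient; both are assumptions made for the illustration, as is the function name `summed_input_variance`. The variance of the sum is then N diagonal terms plus N(N-1) covariance terms.

```python
def summed_input_variance(n_inputs, pairwise_corr, unit_variance=1.0):
    """Variance of a sum of n_inputs identically distributed inputs
    with a uniform pairwise correlation coefficient."""
    diagonal = n_inputs * unit_variance
    off_diagonal = n_inputs * (n_inputs - 1) * pairwise_corr * unit_variance
    return diagonal + off_diagonal

N = 10000  # number of synapses (illustrative value)
independent = summed_input_variance(N, 0.0)
weakly_correlated = summed_input_variance(N, 1.0 / N)
ratio = weakly_correlated / independent

print(independent, weakly_correlated, ratio)
```

With pairwise correlations as small as 1/N, which would be very hard to measure in paired recordings, the variance of the total input roughly doubles. In a fluctuation-driven regime that is a large change.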
The second approach shows that even if these correlations are somehow cancelled, the neuron still remains highly sensitive to coincident inputs. Consider background synaptic activity, resulting in a membrane potential distribution peaking below threshold. If this is the sum of N independent inputs (or with cancelled correlations), the standard deviation is of order N^(1/2). Now add p coincident input spikes. You can easily show that these input spikes cause a number of extra output spikes that increases supralinearly with p. In the J Neurosci paper, we calculate it and the formula agrees very well with slice experiments (in a cortical neuron). The calculations show that a very small number of input spikes are enough to produce an output. This is related to the "coincidence advantage" discussed by Moshe Abeles in the 1980s, and a number of experimental papers by Usrey, Alonso and Reid on the LGN->V1 projection. We also show that very tiny modifications in spike timing completely change the output rate, even though pairwise correlations remain very close to zero.
In the PLoS paper I show another point of view, based on signal detection theory. You consider the problem of detecting coincident inputs in a background noise, from just observing the membrane potential. Again, because the signal scales linearly with the number p of coincident spikes but the noise (independent inputs) scales with the square root of the total number N of inputs, even a very small number of coincident input spikes is highly detectable.
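To get a feeling for the orders of magnitude, here is a back-of-the-envelope Python sketch. It is not the calculation from the paper, and it rests on several assumptions: background inputs are independent Poisson spike trains, each postsynaptic potential is an instantaneous jump that decays exponentially with time constant tau, and the background variance is given by Campbell's theorem (rate times the integral of the squared PSP). All parameter values and function names are illustrative.

```python
import math

def background_std(n_inputs, rate, tau, psp_size=1.0):
    """Std of the summed background input (Campbell's theorem, exponential PSPs)."""
    return psp_size * math.sqrt(n_inputs * rate * tau / 2.0)

def detectability(p_coincident, n_inputs, rate, tau, psp_size=1.0):
    """Depolarization caused by p coincident spikes, in units of background std."""
    return p_coincident * psp_size / background_std(n_inputs, rate, tau, psp_size)

# 10 coincident spikes among 10000 inputs firing at 1 Hz, tau = 5 ms
d_small = detectability(10, 10000, 1.0, 0.005)
# same 10 coincident spikes among four times as many inputs
d_large = detectability(10, 40000, 1.0, 0.005)

print(d_small, d_large)
```

Under these assumptions, ten coincident spikes out of ten thousand inputs already shift the membrane potential by about two standard deviations of the background. Quadrupling the number of inputs only halves that figure, which is the square-root scaling described above.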
Coincidence detection is also enhanced by the fact that the spike threshold adapts to the membrane potential. In this paper we explain what it means for synaptic integration and coincidence detection:
Platkiewicz J and Brette R (2011). Impact of Fast Sodium Channel Inactivation on Spike Threshold Dynamics and Synaptic Integration. PLoS Comp Biol 7(5): e1001129. doi:10.1371/journal.pcbi.1001129.
Finally in the synchrony paper, I also show that synchronously activated synapses tend to be potentiated by STDP, and so synchrony has a huge effect both on input-output properties and on learning. This finding about correlated inputs being selected is not new, I just put it in a functional context.
What is synchrony good for?
But the more important point in the PLoS paper is about the possible function of synchrony. To be useful, synchrony has to be stimulus-dependent. So I define the "synchrony receptive field" of a given pair or group of neurons as the set of stimuli that produce synchronous responses in these neurons. I give a toy example with duration-selective neurons, but the more interesting stuff is the application to sensory modalities.
The idea is the following. One of the main functions of a sensory system is to extract invariant structure in sensory signals. "Structure" simply means something in the sensory signals that distinguishes them from random inputs. For example, a sound produced by an actual source in the environment produces two binaural signals that have very precise relationships between them, which depend on the source location. The task of the sensory system is to identify these relationships. "Invariant" refers to the idea that, in a fixed environment, relationships that remain fixed when something else changes (for example the observer's position) tell you something about the environment. For example, in binaural hearing, the source of change is simply the source signal itself, and what persists is the relationship between the two binaural signals, which only depends on the location of the sound source. This idea has been articulated in particular by James Gibson, a visual psychologist who developed an "ecological theory" of perception. From a computational point of view, an interesting thing in this theory is that the problem of invariance is solved from the start: instead of trying to design operations that are invariant to a number of transformations, you trigger these transformations and observe the relationships that persist (e.g. for vision, by moving and looking at the changes in the visual input). Note that this is very different from the standard point of view in most neural network theory, in which the main paradigm is pattern recognition, inspired largely by David Marr's work in vision. Here we do not compare the sensory inputs with previously stored patterns, we look for structure (relationships) in these sensory signals.
What's the relationship with synchrony? The idea is very simple. If there is some structure, that is, some regularity in sensory signals, then neurons with different receptive fields will encode these sensory signals into spike trains, in which sensory structure is seen as synchrony patterns. In the paper, I show that these synchrony patterns are intrinsically invariant to a number of transformations, just because the structure itself is invariant. I then develop mostly a (simplified) example of odor recognition, and I show how a simple network based on these ideas can perform concentration-invariant odor recognition, and how the network can learn to recognize an odor and can generalize across a large range of concentrations.
Two years ago, we applied this principle to binaural hearing:
Goodman DF and Brette R (2010). Spike-timing-based computation in sound localization. PLoS Comp Biol 6(11): e1000993. doi:10.1371/journal.pcbi.1000993.
This was really a "proof-of-principle" study, where we showed how we can use these ideas to identify the location of a sound source using realistic acoustic signals, coming from a virtual auditory environment (with real diffraction properties and natural sounds). We have refined the model and are currently trying to test it with in vivo recordings.
At the end of the synchrony paper, I also mention an application in vision: the synchrony receptive field of two LGN neurons is an edge with a particular orientation. I was very excited to read this week that a group has actually found exactly that, using multiple recordings in the LGN with gratings:
Stanley et al. (2012). Visual Orientation and Directional Selectivity through Thalamic Synchrony. The Journal of Neuroscience, 27 June 2012, 32(26):9073-9088.
And they show that this could provide the contrast-invariance property that is seen in V1 neurons. In fact, during my PhD I worked exactly on this problem, using simple models, and I predicted these findings: it's only in my thesis in French, unfortunately.
I have to clean up my code before it is usable, but I will upload it soon to modeldb and Brian examples!
I recently came across a web page that described a computer working with water instead of electricity. There is a very smart idea of a logical gate with two input jets of water and two outputs: when there is no input water, there is no output; when there is one input, the water flows through output #1; when there are two input jets, they collide and the water gets diverted to output #2. Therefore output #1 is a XOR gate and output #2 is an AND gate.
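The behaviour of this gate is easy to write down as a truth table. Here is a small Python sketch; the function name `water_gate` is made up for the example, and the jets are simply modelled as booleans.

```python
def water_gate(jet_a, jet_b):
    """Return (output_1, output_2) for two boolean input jets."""
    n_jets = int(jet_a) + int(jet_b)
    output_1 = n_jets == 1   # a single jet flows straight through: XOR
    output_2 = n_jets == 2   # two jets collide and are diverted: AND
    return output_1, output_2

truth_table = {(a, b): water_gate(a, b)
               for a in (False, True) for b in (False, True)}

for inputs, outputs in truth_table.items():
    print(inputs, "->", outputs)
```

Taken together, the two outputs form a half adder: output #1 is the sum bit and output #2 is the carry bit.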
I was wondering how to make a hydraulic model of a neuron (which would be analog rather than digital). It could be an interesting educational tool. You could imagine a container where water would flow from the top, analogous to the input current, and the water level would be the membrane potential v. In this simple configuration, it corresponds to the perfect integrator model: $C\,dv/dt = I$, where I is the input flow (in units of volume/time) and C is the area of the container section. I chose C for this parameter because it clearly plays the role of the membrane capacitance.
Now a simple way to implement the current leak is to cut a hole at the bottom of the container. Then the water flow through that hole is proportional to $a\sqrt{v}$, where a is the area of the hole. So we get the following equation:
$$C\,\frac{dv}{dt} = I - k\,a\sqrt{v}$$
where k is a proportionality factor. If the hole is cut at level $v_0$ (rather than at the bottom), we obtain:
$$C\,\frac{dv}{dt} = I - k\,a\sqrt{[v - v_0]^+}$$
which is a nonlinear leaky neuron model ($[x]^+ = \max(x, 0)$).
The hard problem now is to implement spiking. Here I think we need mechanical elements: when the level reaches some mechanical element at the top, it would trigger the opening of a sink at the bottom, which would remain open as long as water flows through it (or as long as the weight of water is above some critical level). Alternatively, when the weight of the water reaches a critical threshold, then the sink at the bottom opens, and it remains open as long as water is flowing (but I am not sure how to implement that property).
p.s.: to have a linear leaky neuron instead of nonlinear, one idea is to have the area a change with v as $a \propto \sqrt{v - v_0}$. To achieve that, one can imagine that the floor is mounted on a spring, so that the area of the hole increases with the weight of the water. If the width of the hole goes as $1/\sqrt{x}$ (where x is the vertical position on the hole), then the flow through the hole is proportional to $v - v_0$. If we want to avoid the rectification ($[v - v_0]^+$), i.e., if we want water to flow in when the level is below $v_0$, then we need to immerse the container in a very large (ideally infinite) container with water level $v_0$.
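To see how such a container would behave, here is a minimal simulation sketch in Python. It rests on several assumptions: the leak through the hole is taken to be proportional to the hole area times the square root of the water height above the hole (Torricelli's law), the equation is integrated with a simple Euler scheme, and the mechanical sink is idealized as an instantaneous emptying of the container down to a reset level. All parameter values and the function name `simulate_hydraulic_neuron` are hypothetical.

```python
import math

def simulate_hydraulic_neuron(inflow, duration, dt=0.001, C=1.0, k=1.0, a=0.5,
                              v0=0.0, v_threshold=1.0, v_reset=0.0):
    """Euler integration of C dv/dt = inflow - k*a*sqrt(max(v - v0, 0)).

    When the water level v reaches v_threshold, a "spike" is recorded and the
    container is emptied to v_reset (idealized, instantaneous sink).
    Returns the list of spike times.
    """
    v = v_reset
    spike_times = []
    for step in range(int(round(duration / dt))):
        leak = k * a * math.sqrt(max(v - v0, 0.0))   # outflow through the hole
        v += dt * (inflow - leak) / C
        if v >= v_threshold:
            spike_times.append((step + 1) * dt)
            v = v_reset
    return spike_times

# The level settles where inflow equals leak, i.e. at v0 + (inflow / (k * a)) ** 2.
strong = simulate_hydraulic_neuron(inflow=1.0, duration=10.0)  # settles above threshold
weak = simulate_hydraulic_neuron(inflow=0.4, duration=10.0)    # settles below threshold

print(len(strong), len(weak))
```

With these values the container fires regularly for the strong inflow and never fires for the weak one. The minimal inflow needed to fire is k·a·sqrt(v_threshold - v0), which plays the role of the rheobase.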
I recently read a very interesting essay entitled "The importance of stupidity in scientific research" and I wanted to share it with you. The author observes that feeling stupid is an essential aspect of research, because you are supposed to understand something that nobody currently does (as opposed to trying to solve a very difficult exercise). That might be why many very good PhD students feel discouraged, because they don't realize that feeling stupid means you are probably doing something valuable! Here is a quote that inspires me as a PhD supervisor:
We don't do a good enough job of teaching our students how to be productively stupid