What is computational neuroscience? (XII) Why do scientists disagree?

A striking fact about the making of science is that in any field of research, there are considerable disagreements between scientists. This is an interesting observation, because it contradicts the naive view of science as a progressive accumulation of knowledge. Indeed, if science worked in this way, then any disagreement should concern empirical data only (e.g. whether the measurements are correct). On the contrary, disagreements often concern the interpretation of data rather than the data themselves. The interpretative framework is provided by a scientific theory, and there are often several of them in any field of research. Another type of disagreement concerns the judgment of how convincingly some specific piece of data demonstrates a particular claim.

There are two possibilities: either a large proportion of scientists are bad scientists who do not correctly apply sound scientific methodology, or adherence to a theory and the judgment of particular claims are not entirely based on scientific principles. The difficulty with the first possibility, of course, is that there is no systematic and objective criterion for judging what “good science” and “bad science” are. In fact, the very nature of this question is epistemological: how is knowledge acquired, and how do we distinguish between different scientific theories? Thus, part of the disagreement between scientists is not scientific but epistemological. Epistemological questions are in fact at the core of scientific activity, and failure to recognize this point leads to the belief that there is a single way to do science, and therefore to dogmatism.

So why do scientists favor one theory rather than another, given the same body of empirical data? Since the choice is not purely empirical, it must rely on other factors that are not entirely scientific. I would argue that a major determinant of adherence to a particular theory, at least in neuroscience, is its consonance with philosophical conceptions that the scientist holds. These conceptions may not be recognized as such, because many scientists have limited knowledge of or interest in philosophy. One such conception would be, for example, that the objects of perception exist independently of the organism and that the function of a perceptual system is to represent them. Such a conception provides a framework in which empirical data are collected and interpreted, and therefore it is not generally part of the theoretical claims that are questioned by data. It is a point of view rather than a scientific statement, but it guides our scientific enquiry. Once we realize that we are in fact guided by philosophical conceptions, we can start questioning these conceptions. For example, why would the organism need to represent the external world if the world is already there to be seen? Shouldn’t a perceptual system provide ways to act in the world rather than represent it? Who reads the “representation” of the world? Given that the world can only be accessed through the senses, how can this representation be interpreted in terms of the external world?

Many scientists deny that philosophy is relevant to their work, because they consider that only science can answer scientific questions. However, given that a scientist’s adherence to a particular scientific theory (and therefore also the making of a scientific theory) is in fact guided by philosophical preconceptions, rejecting philosophy only means that the scientist may be guided by naive philosophical conceptions.

Finally, another determinant of adherence to a particular scientific theory is psychological, linked to the personal history of the scientist. The theory of cognitive dissonance, perhaps the most influential theory in psychology, claims that human psychology is driven to minimize the dissonance between different cognitive elements. For example, when a piece of evidence is presented that contradicts the beliefs of the scientist, this produces cognitive dissonance and a drive to reduce it. There are different ways to reduce it. One is that the scientist changes her mind and adopts another theory that is consistent with the new piece of data. Another is that the piece of data is rejected or interpreted in a way that is consonant with the beliefs of the scientist, possibly by adding an ad hoc hypothesis. Another is to add consonant elements, e.g. by providing new pieces of evidence that support the beliefs of the scientist. Another is to seek consonant information and to avoid dissonant information (e.g. to read only those papers that are most likely to support the beliefs of the scientist). The theory of cognitive dissonance predicts that the first way rarely occurs. Indeed, as the scientist develops her career within a given scientific theory, she develops more and more ways to discard dissonant pieces of information, seeks information that is consonant with the theory, and, by taking all these decisions, many of them public, increases the dissonance between her behavior and contradictory elements. An important and counter-intuitive prediction of the theory of cognitive dissonance is that contradictory evidence generally reinforces the beliefs of a scientist who is deeply committed to a particular theory.

In summary, a large part of scientific activity, including the making of and adherence to scientific theories, relies on epistemological, philosophical and psychological elements.

What is computational neuroscience? (XI) Reductionism

Computational neuroscience is a field that seeks a mechanistic understanding of cognition. It has the ambition to explain how cognition arises from the interaction of neurons, to the point that if the rules that govern the brain are understood in sufficient detail, it should in principle be possible to simulate them on a computer. Therefore, the field of computational neuroscience is intrinsically reductionist: it is assumed that the whole (how the brain works) can be reduced to the elementary components that compose it.

To be more precise, this view refers to ontological reductionism. A view that rejects ontological reductionism would be, for example, vitalism: the idea that life is due to the existence of a vital force, without which any given set of molecules would not live. A similar view is that the mind comes from a non-material soul, which is not scientifically accessible, or at least not describable in terms of the interaction of material elements. One could also imagine that the mind arises from matter, but that there is no final intelligible element – e.g. neurons are as complex as the whole mind, and smaller elements are not more intelligible.
In modern science in general and in neuroscience in particular, ontological reductionism is fairly consensual. Computational neuroscience relies on this assumption. This is why criticisms of reductionism are sometimes perceived as criticisms of the entire scientific enterprise. This perception is mistaken, because such criticisms are generally not about ontological reductionism but about other forms of reductionism, which are more questionable and controversial.

Methodological reductionism is the idea that the right way, or the only way, to understand the whole is to understand the elements that compose it. It is then assumed that the understanding of the whole (e.g. function) derives from this atomistic knowledge. For example, one would consider that the problem of memory is best addressed by understanding the mechanics of synaptic plasticity – e.g. how the activity of neurons changes the synapses between them. In genetics, one may consider that memory is best addressed by understanding which genes are responsible for memory, and how they control the production of the proteins involved in the process. This assumption is less consensual, in computational neuroscience and in science in general, including physics. Historically, it is certainly not true that scientific enquiry in physics started from microscopic laws and proceeded to macroscopic laws: classical mechanics came before quantum mechanics. In addition, macroscopic principles (such as thermodynamics and energy in general) and symmetry principles are widely used in physics in place of microscopic laws (for example, to understand why soap makes spherical bubbles). However, this is a relatively weak criticism, as it can be conceived that macroscopic principles derive from microscopic laws, even if this does not reflect the history of physics.

In the life sciences, there are specific reasons to criticize methodological reductionism. The most common criticism in computational neuroscience is that, while function derives from the interaction of neurons, it can also be said that the way neurons interact is indirectly determined by function, since living organisms are adapted to their environment through evolution. Therefore, unlike the objects of physics, living beings are characterized by a circular rather than a one-way causal relationship between microscopic and macroscopic laws. This view underlies “principle-based” or “top-down” approaches in computational neuroscience. Note that this is a criticism of methodological reductionism, but not of ontological reductionism.

There is also a deeper criticism of methodological reductionism, following the theme of circularity. It stems from the view that the organization of life is circular. This view has been developed by Humberto Maturana and Francisco Varela under the name “autopoiesis”, and by Robert Rosen under the name “M-R systems” (M for metabolism and R for repair). What defines an entity as living, prior even to its ability to reproduce, is the fact that it is able to live. This is such an obvious truth about life that it is easy to forget, but maintaining one’s existence as an energy-consuming organism is not trivial at all. Therefore, a living entity is viewed as a set of physical processes in interaction with the environment that are organized in such a way that they maintain their own existence. It follows that, while a part of a rock is a smaller rock, a part of a living being is generally not a living being. Each component of the living entity exists in relation to the organization that defines the entity as living. For this reason, the organism cannot be fully understood by examining each element of its structure in isolation. This is so because the relationship between structure and organization is not causal but circular, while methodological reductionism assumes a causal relationship between the elements of structure and higher-order constructs (“function”). This criticism is deep, because it claims not only that the whole cannot be understood by looking at the parts alone, but also that the parts themselves cannot be fully understood without understanding the whole. That is, to understand what a neuron does, one must understand in what way it contributes to the organization of the brain (or, more generally, of the living entity).

Finally, there is another type of criticism of reductionism that has been formulated against attempts to simulate the brain. The criticism is that, even if we did manage to successfully simulate the entire brain, this would not imply that we understand it. In other words, to reproduce is not to understand. Indeed, we can clone an animal, and this fact alone does not give us a deep understanding of the biology of that animal. One could object that the cloned animal is never exactly the same animal, but the same could certainly be said of the simulated brain. But proponents of the view that simulating a brain would necessarily imply understanding the brain may rather mean that such a simulation requires a detailed knowledge of the entire structure of the brain (ionic channels in neurons, connections between neurons, etc.), and that by having this detailed knowledge of everything that is in the brain, we would necessarily understand it. This form of reductionism is called epistemic reductionism. It is in a sense the converse of ontological reductionism. According to ontological reductionism, if you claim to have a full mechanistic understanding of the brain, then you should be able to simulate it (provided adequate resources); that is, the ability to simulate is a necessary condition of full understanding. Epistemic reductionism claims that it is also a sufficient condition: if you are able to simulate the brain, then you fully understand it. This is a much stronger form of reductionism.

Criticisms of reductionism can be summarized by their answers to the question: “Can we (in principle, one day) simulate the brain?”. Critics of ontological reductionism would answer negatively, arguing that there is something critical (e.g., the soul) that cannot be simulated. Critics of epistemic reductionism would answer: yes, but this would not necessarily help us understand the brain. Critics of methodological reductionism would answer: yes, and it would probably require a global understanding of the brain, but that understanding could only be achieved by examining the organism as a system with an organization, rather than as a set of independent elements in interaction.

What is computational neuroscience? (X) Reverse engineering the brain

One phrase that occasionally pops up when speaking of the goal of computational neuroscience is “reverse engineering the brain”. This is quite an interesting phrase from an epistemological point of view. The analogy is to see the brain as an engineered device, the “engineer” being evolution, of which we do not possess the design plans. We are supposed to understand it by opening it, and trying to guess what mechanisms are at play.

What is interesting is that observing and trying to understand the mechanisms is basically what science is about, not only neuroscience, so there must be something else in this analogy. For example, we would not describe the goal of astronomy as reverse engineering the planets. What is implied in the phrase is the notion that there is a plan, and that this plan is meant to achieve a function. It is a reference to the teleonomic nature of life in general, and of the nervous system in particular: the brain is not just a soup of neurons, these neurons coordinate their action so as to achieve some function (to survive, to reproduce, etc).

So the analogy is meaningful from this point of view, but like any analogy it has its limits. Is there no difference between a living being and an engineered artifact? This question touches on what life is, which is a very broad question, but here I will focus on just two differences that I think are relevant to the present matter.

There is one very important specificity that was well explained by the philosopher Humberto Maturana (“The Organization of the living”, 1974). Engineered things have a structure that is designed so as to fulfill some function, that is, they are made of specific components that have to be arranged in a specific way, according to a plan. So all you need to understand is the structure, and its relation with the function. But as Maturana pointed out, living things have a structure (the body, the wiring of neurons, etc) but they also have an organization that produces that structure. The organization is a set of processes that produce the structure, which is itself responsible for the organization. But what defines the living being is its organization, not its structure, which can change. In the case of the nervous system, the wiring between neurons changes dramatically in the course of life, or even in the course of one hour, and the living being remains the same. The function of the organization is to maintain the conditions for its existence, and since it exists in a body interacting with an external environment, it is in fact necessary that the structure changes so as to maintain the organization. This is what is usually termed “plasticity” or “learning”. Therefore living things are defined by their organization, while engineered things are defined by their structure.

This is one aspect in which the engineering analogy is weak, because it misses this important distinction. Another is that an engineered thing is made by an engineer, that is, by someone external to the object. Therefore the function is defined with respect to an external point of view. The plan would typically include elements that are defined in terms of physics, concepts that can only be grasped and measured by some external observer with appropriate tools. But a living organism only has its own senses and ways of interacting with the environment to make sense of the world. This is true of the nervous system as a whole, but also of individual cells: a cell has ways of interacting with other cells and possibly with the outside world, but it does not have a global picture of the organism. For example, an engineer’s plan would specify where each component should go, e.g. with Euclidean coordinates. But this is not how development can work in a living thing. Instead, the plan should come in the form of mechanisms that specify not “where” a thing is, but rather “how to get there”, or perhaps even when a component should transform into a new component – specific ways of interacting that end up producing the desired result.

Therefore the nature of the “plan” is really quite different from the plan of an engineer. To make my point, I will draw an analogy with philosophy of knowledge. A plan is a form of knowledge, or at least it includes some knowledge. For example, if the plan includes the statement “part A should be placed at such coordinates”, then the plan implicitly assumes knowledge of Euclidean geometry on the part of the organism that executes it. For an engineer, knowledge comes from physics, and is based on the use of specific tools to measure things in the world. But for a cell, knowledge about the world comes only from interaction with the world: different ways to sense it (e.g. incoming spikes for a neuron), different ways to act on it (e.g. producing a spike, releasing molecules into the extracellular medium). A plan can be specified in terms of physics if it is to be executed by an engineer, but it cannot be specified in these terms if it is to be executed by a cell: instead, it would be specified in terms of mechanisms that make sense given the ways the cell can interact with the world. The implicit knowledge about the world that is included in an engineer’s plan is what I would call “metaphysical knowledge”, in relation to the corresponding notion in philosophy of science.

Science is made of universal statements, such as the law of gravitation. But not all statements are scientific, for example “there is a God”. In philosophy of science, Karl Popper proposed that a scientific statement is one that can potentially be falsified by an observation, whereas a metaphysical statement is one that cannot be falsified. For example, the statement “all penguins are black” is scientific, because I could imagine that one day I see a white penguin. On the other hand, the statement “there is a God” is metaphysical, because there is no way I can check. Closer to the matter of this text, the statement “the world is actually five-dimensional but we live in a three-dimensional subspace” is also metaphysical, because whether it is true or not, we have no way to confirm or falsify it given the way we interact with the world.

So what I call “metaphysical knowledge” in an engineer’s plan is knowledge that cannot be corroborated or falsified by the organism that executes the plan, given its senses and possibilities for action. For example, consider the following statement: neurons in the lateral geniculate nucleus project to the occipital region of the brain. This includes metaphysical knowledge about where that region is, which is specified from the point of view of an external observer. This cannot be a biological plan. Instead, a biological plan would have to specify what kind of interaction a growing axon should have with its environment in order to end up in the desired region.

In summary, although the phrase “reverse engineering” acknowledges the fact that, contrary to physical things of nature such as planets, living things have a function, it misses several important specificities of life. One is that living things are defined by their organization, rather than by the changing structure that the organization produces, while engineered things are defined by their structure. Another one is that the “plan”, which defines that organization, is of a very different nature than the plan made by and for an engineer, because in the latter case the function and the design are conceived from an external point of view, which generally includes “metaphysical knowledge”, i.e., knowledge that cannot be grasped from the perspective of the organism.

What is computational neuroscience? (IX) The epistemological status of simulations

Computational neuroscience is not only about making theories. A large part of the field is also about simulations of neural models on a computer. In fact, there is little theoretical work in neuroscience that does not involve simulations at some stage. The epistemological status of simulations is quite interesting, and studies about it in philosophy of knowledge are quite recent. There is for example the work of Eric Winsberg, but I believe it mostly addresses questions related to physics. In particular, he starts one of his most cited papers (“Simulations, models and theories”, 2001) by stating: “I will be talking about the use of computers for modeling very complex physical phenomena for which there already exist good, well-understood theories of the processes underlying the phenomena in question”. This is an important distinction, and I will come back to it.

What is interesting about simulations is that, from a strictly Popperian viewpoint, they are useless. Indeed a simulation looks like a sort of experiment, but there is no interaction with the world. It starts from a theory and a set of factual statements, and derives another set of factual statements. It is neither the making of a theory (no universal statement is produced), nor the test of a theory. So why is simulation used so broadly?

In fact there are different types of simulation work. Broadly speaking, we may think of two categories: theory-driven simulations, and data-driven simulations.

I will start with theory-driven simulations. There are in fact two different motivations for using simulations in theoretical work. One is exploratory: simulations are used in the process of making theories, because the models are so complex that it may be difficult to predict their behavior. This is a general problem with so-called complex systems. Simulations are then used, for example, to explore the effect of various parameters on the behavior of the model, or to see whether some property can appear given a set of rules, etc. Another motivation is to test a theory. This may seem odd, since we are not speaking of an empirical test. This apparent oddity perhaps stems from the myth that theoretical work is essentially about making logical deductions from initial statements. But in reality, especially in biology where models can be very complex, theoretical work almost invariably involves some guesswork, approximations, and sometimes vaguely justified intuitions. Therefore, it makes sense to check the validity of these approximations in a number of scenarios. For example, in my paper with Jonathan Platkiewicz about the spike threshold, we derived an equation for the spike threshold from the Hodgkin-Huxley equations. It involved approximations of the sodium current, and we also developed the theory in an isopotential neuron. Therefore, in that paper, we checked the theory against the numerical simulation of a complex multicompartmental neuron model, and it was not obvious that it would work.

There is another motivation, which is more specific to computational neuroscience. Theories in this field are about how the interaction of neurons produces behavior, or in other words, about linking physiology, at the neuron level, and function, at the systems or organism level. But to speak of function, one needs an environment. This external element is not part of the neural model, yet it is critical to the relevance of the model. Theories generally do not include explicit models of the environment, or only simplistic versions. For example, in my paper about sound localization with Dan Goodman, we proposed a mechanism by which selective synchrony occurs when a sound is presented at specific locations, leading to a simple spiking model that can accurately estimate the location of a sound source in the presence of realistic diffraction properties. In principle it works perfectly, but of course in a real environment the acoustical signals are unknown but not arbitrary (they may have a limited spectrum, there may be noise), diffraction properties are also unknown but not arbitrary, and there may be ambiguities (e.g. the cones of confusion). For this reason, the model needed to be implemented and its performance tested, which we did with recorded sounds, measured acoustical filters and acoustical noise. Thus it appears that even for theory-driven work, simulation is unavoidable, because the theory applies to the interaction with an unknown, complex environment. In fact, ideally, models should be simulated, embodied (in a robot) and allowed to interact with a real (non-simulated) environment. Since theories in computational neuroscience claim to link physiology and function, this would be the kind of empirical work required to substantiate such claims.

The other type of simulation work is data-driven. I believe this is usually what is meant by “simulation-based science”. In this kind of work, there is little specific theory – that is, only established theories are used, such as cable theory. Instead, models are built from measurements. The simulations are then truly used as a kind of experiment, to observe what might emerge from the complex interaction of neuron models. It is sometimes said that simulations are used to do “virtual experiments” when the actual experiments would be impractical. Another typical use is to test the behavior of a complex model when parameters are varied within a range that is considered plausible.

In physics, such computer simulations are also used, for example to simulate the explosion of a nuclear bomb. But as Winsberg noted, there is a very important epistemological distinction between simulations in physics and in biology: in the former, there is extremely detailed knowledge of both the laws that govern the underlying processes and the arrangement of the individual elements in the simulation. Note that even in this case, the value of such simulations is controversial. But in biology, and especially in neuroscience, the situation is quite different. This difference is in fact implicitly acknowledged by the typical use cases mentioned above.

Consider the statement that a simulation is used to perform a “virtual experiment” when actual experiments are impractical. This seems similar to the simulation of a nuclear explosion. In that case, one is interested in the large-scale behavior of the system, and at such a large scale the experiment is difficult to do. But in neuroscience, the situation is exactly the opposite. The experiment with a full organism is actually what is easy to do (or at least feasible): it is a behavioral experiment. So simulations are not used to observe how an animal behaves; they are used to observe the microstructure of the system. But this means that the microstructure was not known at the time the model was built, and so the properties to be observed are assumed to be sufficiently constrained by the initial set of measurements to be derived from them.

The second, and generally complementary, use case is to simulate the model while varying a number of parameters so as to find the viable region in which the model produces results consistent with some higher-order measurements (for example, local field potentials). If the parameters are varied, then this means they are actually not known with great certainty. Thus it is clear that biophysical models based on measurements are in fact much less reliable than physical models such as those of nuclear explosions.

One source of uncertainty is the values of parameters in the models, for example channel densities. This is already a serious problem. Probably the biggest issue here is not so much the uncertainty about parameters, which is an issue in models in all fields, but the fact that the parameters are most likely not independent, i.e., they covary within a given cell or between cells. This lack of independence comes from the fact that the model is of a living thing, and in a living thing all components and processes contribute to the function of the organism, which implies tight relations between them. The study of these relations is a defining part of biology as a field, but if a model does not explicitly include these relations, then it would seem extraordinary that proper function could arise without them, given that they are hidden under the uncertainty in the parameters. For example, consider action potential generation. Sodium channels are responsible for initiation, potassium channels for repolarization. A number of recent studies show that their properties and densities are precisely tuned with respect to each other so that energy consumption is minimized: energy is wasted if they are simultaneously open, because they have opposite effects. If this functional relation were unknown and only channel densities were known within some range, then the coordination would go unnoticed, and a naive model simply using independent values from these distributions would display inefficient action potential generation, unlike real neurons. A toy numerical illustration of this point is sketched below.
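Here is a minimal numerical sketch of that point, not a biophysical model: if two channel densities are reported only as independent ranges, sampling them independently destroys the quantity that is (hypothetically) controlled in the cell, here their ratio, even though each density taken alone stays within its measured range. All numbers (the conductance ranges, the 0.35 ratio, the 5% variability) are made up for illustration.

```python
# Toy illustration of independent vs. coordinated parameter sampling.
# All numerical values are hypothetical, chosen only for illustration.
import numpy as np

rng = np.random.default_rng(0)
n = 10000

# Independent sampling: each density drawn separately from its reported range
g_na_ind = rng.uniform(50.0, 150.0, n)   # sodium channel density (arbitrary units)
g_k_ind = rng.uniform(10.0, 60.0, n)     # potassium channel density (arbitrary units)

# Coordinated sampling: g_k tied to g_na by a hypothetical functional relation,
# with small cell-to-cell variability; the marginal ranges can look similar
g_na_cor = rng.uniform(50.0, 150.0, n)
g_k_cor = 0.35 * g_na_cor * rng.normal(1.0, 0.05, n)

# Suppose proper function requires the ratio g_k/g_na to be tightly controlled:
print("independent sampling: std of g_k/g_na =", np.std(g_k_ind / g_na_ind).round(3))
print("coordinated sampling: std of g_k/g_na =", np.std(g_k_cor / g_na_cor).round(3))
```

The independent samples span a wide range of ratios, so most of them would miss the (hypothetical) tuning, whereas the coordinated samples keep it, with the same apparent uncertainty on each parameter taken in isolation.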

I will try to summarize the above point. Such simulations are based on the assumption that the laws that govern the underlying processes are very well understood. This may well be true for the laws of neural electricity (cable equations, Hodgkin-Huxley equations). However, in biology in general and in neuroscience in particular, the relevant laws also include those that describe the relations between the different elements of the model. This is a completely different set of laws. In the example of action potential generation, these laws concern the co-expression of channels, which relates more to the molecular machinery of the cell than to its electrical properties.

Now these laws, which relate to the molecular and genetic machinery, are certainly not so well known. And yet they are more relevant to what defines a living thing than those describing the propagation of electrical activity, since these are the laws that maintain the structure that keeps the cell alive. Thus, models based on measurements attempt to reproduce biological function without capturing the logic of the living, and this seems rather wishful. There are also many examples in recent research showing that our current knowledge of neural function is rather poor compared to what remains to be found. For example, glial cells (which make up most of the cells in the brain) are now thought to play a much more important role in brain function than previously believed, and they are generally ignored in models. Another example is action potential initiation. Detailed biophysical models are based on morphological reconstructions of the axon, but in the axon initial segment there is also a scaffold that presumably alters the electrical properties along the axon (for example, the axial resistivity should be higher).

All these remarks are meant to point out that it is in fact illusory to think that there are, or will be in the near future, realistic models of neural networks based on measurements. Worse, such models seem to miss a critical point in the study of living systems: these are not defined only by their structure (values of parameters, shapes of cells) but by the processes that maintain that structure and produce function. To quote Maturana (1974), there is a difference between the structure (channel densities, etc.) and the organization, which is the set of processes that set up that structure, and it is the organization, not the structure, that defines a living thing. Epistemologically speaking, the idea that things not accessible to experiment can be simulated based on measurements that constrain a model is induction. But the predictive power of induction is rather limited when there is such uncertainty.

I do not want to sound as if I were entirely dismissing data-driven simulations. Such simulations can still be useful as an exploratory tool. For example, one may simulate a neuron using measured channel densities and test whether the results are consistent with what the actual cell does. If they are not, then we know we are missing some important property. But it is wrong to claim that such models are more realistic simply because they are based on measurements. On the one hand, they are based on empirical measurements; on the other hand, they dismiss mechanisms (or “principles”), which are another empirical aspect to be accounted for in living things. I will come back in a later post to the notion of a “realistic model”.

What is computational neuroscience? (VIII) Technical development and observations

In the previous posts, I have strongly insisted on the epistemological notion that theory precedes empirical observation, in the sense that experiments are designed to test theories. I insisted on this point because computational neuroscience seems to be understood by many scientists through the prism of naive inductivism: the view that theory derives more or less directly from observation (you make experimental measurements, and then you “make a model” from them). I do not need to insist again on why this view is flawed in many ways. But of course it would be absurd to advocate the opposite view, that is, that observation cannot provide knowledge unless it is designed to test a theory. This post is meant to nuance my previous arguments.

In fact, historically, science has progressed by two very different means: one is the introduction of radically new perspectives (“paradigm shifts”), another is the development of new tools. A salient example in neuroscience is the development of the patch clamp, which allows recording the currents flowing through single ionic channels. The technique earned Neher and Sakmann the Nobel Prize in 1991. The discoveries they made with this technique were not revolutionary in Kuhn’s sense; that is, they did not fundamentally contradict the prevailing views, and there was no conceptual change of paradigm. It had been thought since the time of Hodgkin and Huxley that membrane currents came from the flow of ions through channels in the membrane, even though this could not be directly observed at the time. But still, the ability to make observations that were not possible before led to considerable new knowledge, for example the fact that channel opening is binary and stochastic.

At the present time, many think that the discoverers of optogenetics are on the shortlist for the Nobel Prize in the coming years. Optogenetics is a very recent technique in which channelrhodopsin, a light-activated channel, is expressed in the membrane of target neurons through genetic manipulation. Using lasers, one can then control the firing of neurons in vivo at a millisecond timescale. It allows probing the causal role of different neurons in behavior, while most previous techniques, which relied mostly on recordings, could only measure correlates of behavior. Although it is probably too early to see it clearly, I anticipate that the technique will trigger not only new empirical knowledge, but also conceptually new theories. Indeed, there is a strong bias on the development of theories from what can be experimentally tested and observed. For example, many current theories in neuroscience focus on the “neural code”, that is, how neurons “represent” different types of information. This is an observer-centric view, which in my opinion stems from the fact that our current empirical view of the brain comes from recordings and imaging: we observe responses to stimuli. The neural coding view is a perspective that one has to adopt to explain such experimental data, rather than a hypothesis about what neurons do. But once we switch to different types of experimental data, in which we observe the effects of neural firing rather than what neurons “encode”, not only does it become unnecessary to adopt the stimulus-response perspective, but in fact one has to adopt the opposite perspective to explain the experimental data: neurons act on postsynaptic neurons with spikes, rather than observe the firing of presynaptic neurons. This is a conceptual change of perspective, but one that is triggered by a new experimental technique. Note that it still requires the development of these new theories: by itself, the change in perspective is not a theory. But the new technique is responsible for this development in a sociological and historical sense.

Another way in which I anticipate new theories will arise from empirical observations is in the understanding of dendritic function. Almost all theories in computational neuroscience, at least those that address the functional or network level, rely on a view of synaptic integration based on isopotential neurons. That is, it is assumed that the location of synapses on the dendritic tree shapes postsynaptic potentials and perhaps the total conductance, but that it is otherwise irrelevant to synaptic integration. This is not exactly a hypothesis, because we know that it is not true, but rather a methodological assumption, an approximation. Why do we make this assumption if we know it is not true? Simply because removing it does not give us an alternative theory; it leaves us with nothing: there are so many possibilities for how dendritic integration might work that we do not know where to start. But this will change (and certainly started changing in recent years) once we have a better general idea of how synapses are distributed on the dendritic tree, and perhaps of the mechanisms by which this distribution arises. Indeed, one thing at least is clear from recent experimental work: this distribution is not random at all, and it obeys different rules for excitation and inhibition. In other words: even though theory does not derive from observations, it needs a starting point, and therefore observations are critical.

What is computational neuroscience? (VII) Incommensurability and relativism

I explained in previous posts that new theories should not be judged by their agreement with the current body of empirical data, because these data were produced by the old theory. In the new theory, they may be interpreted very differently or even considered irrelevant. A few philosophers have gone so far as to state that different theories are incommensurable, that is, they cannot be compared with each other because they have different logics (e.g. observations are not described in the same way in different theories). This reasoning may lead to relativistic views of science, that is, the idea that all theories are equally “good” and that their choice is a matter of personal taste or fashion. In this post I will try to explain these arguments, and also to discard relativism.

In “Against Method”, Feyerabend explains that scientific theories are defined in a relational way, that is, elements of a theory make sense only in reference to other elements of the theory. I believe this is a very deep remark that applies to theories of knowledge in the broadest sense, including perception for example. Below, I drew a schematic figure to illustrate the arguments.

Theories are systems of thought that relate to the world. Concepts in a theory are meant to relate to the world, and they are defined with respect to other concepts in the theory. A given concept in a given theory may have a similar concept in another theory, but it is, in general, a different concept. To explain his arguments, Feyerabend uses the analogy of language. It is a good analogy because languages relate to the world, and they have an internal relational structure. Imagine theories A and B are two languages. A word in language A is defined (e.g. in the dictionary) using other words from language A. A child learns her native language by picking up the relationships between the words, and how they relate to the world she can see. To understand language A, a native speaker of language B may translate the words. However, translation is not definition. It is imprecise because the two words often do not have exactly the same meaning in both languages. Some words may not even exist in one language. A deeper understanding of language A requires going beyond translation, and capturing the meaning of words by acquiring a more global understanding of the language, both in its internal structure and in its relationship with the world.

Another analogy one could make is political theories, in how they view society. Clearly, a given observation can be interpreted in opposite ways in conservative and liberal political views. For example, the same economic crisis could be seen as the result of public debt or as the result of public cuts in spending (due to public acquisition of private debt).

These analogies support the argument that an element of a new theory may not be satisfactorily explained within the framework of the old theory. It may only make full sense when embedded in the full structure of the new theory – which means that new theories may initially be unclear and their concepts not well defined. This remark can certainly make different theories difficult to compare, but I would not conclude that theories are incommensurable. This conclusion would be valid if theories were closed systems, because then a given statement would make no sense outside the context of the theory in which it is formulated. Axiomatic systems in mathematics could be said to be incommensurable (for example, Euclidean and non-Euclidean geometries). But theories of knowledge, unlike axiomatic systems, are systems that relate to the world, and the world is shared between different theories (as illustrated in the drawing above). For this reason, translation is imprecise but not arbitrary, and one may still assess the degree of consistency between a scientific theory and the part of the world it is meant to explain.

One may find an interesting example in social psychology. In the theory of cognitive dissonance, new facts that seem to contradict our belief system are taken into account by minimally adjusting that belief system (minimizing the “dissonance” between the facts and the theory). In philosophy of knowledge, these adjustments would be called “ad hoc hypotheses”. When it becomes too difficult to account for all the contradictory facts (making the theory too cumbersome), the belief system may ultimately collapse. This is very similar to the theory of knowledge defended by Imre Lakatos, where belief systems are replaced by research programs. Cognitive dissonance theory was introduced through a field study of a small American sect whose members believed that the end of the world would occur on a specific date (Festinger, Riecken and Schachter (1956), When Prophecy Fails. University of Minnesota Press). When the date arrived and the world did not end, strangely enough, the sect did not collapse. On the contrary, the failed prophecy made it stronger, with the followers believing even more firmly in their view of the world. They considered that the world did not end because they had prayed so much that God heard their prayers and postponed the event. So they made a new prediction, which of course turned out to be false. The sect ultimately collapsed, although only after a surprisingly long time.

The example illustrates two points. Firstly, a theory does not collapse because one prediction is falsified; instead, the theory is adjusted with a minor modification so as to account for the seemingly contradictory observation. Secondly, this process does not go on forever, because of the theory’s interaction with the world: when predictions are systematically falsified, the theory ultimately loses its followers, and for good reason.

In summary, a theory of knowledge is a system in interaction with the world. It has an internal structure, and it also relates to the world. And although it may relate to the world in its own words, one may still assess the adequacy of this relationship. For this reason, one cannot defend scientific relativism in its strongest version.

For the reader of my other posts in this blog, this definition of theories of knowledge might sound familiar. Indeed, it is highly related to the theories of perception defended by Gibson, O’Regan and Varela, for example. After all, perception is a form of knowledge about the world. These authors have in common that they define perception in a relational way, as the relationship between the actions of the organism in the world (driven by “theory”) and the effects of these actions on the organism (“tests” of the theory). This is in contrast with “neurophysiological subjectivism”, for which meaning is intrinsically produced by the brain (a closed system, in my drawing above), and with “computational objectivism”, in which there is a pre-defined objective world (related to the idea of translation).

What is computational neuroscience? (VI) Deduction, induction, counter-induction

At this point, it should be clear that there is not a single type of theoretical work. I believe most theoretical work can be categorized into three broad classes: deduction, induction, and counter-induction. Deduction is deriving theoretical knowledge from previous theoretical knowledge, with no direct reference to empirical facts. Induction is the process of making a theory that accounts for the available empirical data, in general in a parsimonious way (Occam’s razor). Counter-induction is the process of making a theory based on non-empirical considerations (for example philosophical principles or analogy) or on a subset of empirical observations that are considered significant, and re-interpreting empirical facts so that they agree with the new theory. Note that 1) all these processes may lead to new empirical predictions, 2) a given line of research may use all three types of processes.

For illustration, I will discuss the work done in my group on the dynamics of spike threshold (see these two papers with Jonathan Platkiewicz: “A Threshold Equation for Action Potential Initiation” and “Impact of Fast Sodium Channel Inactivation on Spike Threshold Dynamics and Synaptic Integration”). It is certainly not the most well-known line of research and therefore it will require some explanation. However, since I know it so well, it will be easier to highlight the different types of theoretical thinking – I will try to show how all three types of processes were used.

I will first briefly summarize the scientific context. Neurons communicate with each other by spikes, which are triggered when the membrane potential reaches a threshold value. It turns out that, in vivo, the spike threshold is not a fixed value, even within a given neuron. Many empirical observations show that it depends on the stimulation and on various aspects of the previous activity of the neuron, e.g. its previous membrane potential and the previously triggered spikes. For example, the spike threshold tends to be higher when the membrane potential was previously higher. By induction, one may infer that the spike threshold adapts to the membrane potential. One may then derive a first-order differential equation describing the process, in which the threshold adapts to the membrane potential with some characteristic time constant. Such phenomenological equations have been proposed in the past by a number of authors, and they are qualitatively consistent with a number of properties seen in the empirical data. But note that an inductive process can only produce a hypothesis; the data could be explained by other hypotheses. For example, the threshold could be modulated by an external process, say inhibition targeted at the spike initiation site, which would co-vary with the somatic membrane potential. However, the hypothesis could potentially be tested. For example, an experiment could be done in which the membrane potential is actively modified by a current-injecting electrode: if threshold modulation is external, the spike threshold should not be affected by this perturbation. So an inductive process can be a fruitful theoretical methodology. A minimal sketch of such a phenomenological adaptive-threshold model is given below.
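To make this concrete, here is a minimal sketch of a phenomenological adaptive-threshold model: a leaky integrate-and-fire neuron whose threshold relaxes toward a linear function of the membrane potential with its own time constant. The equations and all parameter values are generic choices for illustration only; the exact equations studied in the papers with Jonathan Platkiewicz differ.

```python
# Minimal sketch of an adaptive-threshold integrate-and-fire model (illustrative
# generic form and hypothetical parameter values, not the equations of the papers).
import numpy as np

dt = 0.1e-3                           # time step (s)
t = np.arange(0.0, 0.5, dt)           # 500 ms of simulated time
tau_m, tau_theta = 10e-3, 50e-3       # membrane and threshold time constants (s)
E_L, theta0, a = -70e-3, -50e-3, 0.3  # resting potential, baseline threshold, coupling
I = 25e-3 * (t > 0.1)                 # step input, expressed directly in volts (I*R)

V = np.full_like(t, E_L)              # membrane potential
theta = np.full_like(t, theta0)       # adaptive spike threshold
spike_times = []
for i in range(1, len(t)):
    dV = (E_L - V[i-1] + I[i-1]) / tau_m
    dtheta = (theta0 + a * (V[i-1] - E_L) - theta[i-1]) / tau_theta
    V[i] = V[i-1] + dt * dV
    theta[i] = theta[i-1] + dt * dtheta
    if V[i] >= theta[i]:              # spike when V crosses the adaptive threshold
        spike_times.append(t[i])
        V[i] = E_L                    # reset the membrane potential after the spike

print(f"{len(spike_times)} spikes, final threshold = {theta[-1] * 1e3:.1f} mV")
```

With these (made-up) numbers, the threshold rises during sustained depolarization, so the neuron spikes a few times and then adapts, which is the qualitative behavior inferred by induction in the paragraph above.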

In my work with Jonathan Platkiewicz, we started from this inductive insight, and then followed a deductive process. The biophysics of spike initiation is described by the Hodgkin-Huxley equations. Hodgkin and Huxley received the Nobel Prize in 1963 for showing how ionic mechanisms interact to generate spikes in the squid giant axon. They used a quantitative model (four differential equations) that they fitted to their measurements. They were then able to accurately predict the velocity of spike propagation along the axon. As a side note, this mathematical model, which explicitly refers to ionic channels, was established well before these channels could be directly observed (by Neher and Sakmann, who later also received the Nobel Prize, in 1991). Thus this discovery was not data-driven at all, but rather hypothesis-driven.

In the Hodgkin-Huxley model, spikes are initiated by the opening of sodium channels, which let a positive current enter the cell when the membrane potential is high enough, triggering a positive feedback process. These channels also inactivate (more slowly) when the membrane potential increases, and when they inactivate, the spike threshold increases. This is one mechanism by which the spike threshold can adapt to the membrane potential. Another, in the Hodgkin-Huxley equations, is the opening of potassium channels when the membrane potential increases. From this model, we derived an equation describing how the spike threshold depends on these ionic channels, and then a differential equation describing how it evolves with the membrane potential. This is a purely deductive process (which also involves approximations), and it also predicts that the spike threshold adapts to the membrane potential. Yet it provides new theoretical knowledge compared to the inductive process. First, it shows that threshold adaptation is consistent with the Hodgkin-Huxley equations, an established biophysical theory. This is not so surprising, but given that other hypotheses could be formulated (see e.g. the axonal inhibition hypothesis I mentioned above), it strengthens this hypothesis. Second, it shows under what conditions on ionic channel properties the theory can be consistent with the empirical data. This provides new ways to test the theory (by measuring ionic channel properties) and therefore increases its empirical content. Third, the equation we proposed is slightly different from those previously proposed by induction: the theory predicts that the spike threshold only adapts above a certain potential, and is otherwise fixed. This is a prediction that is not obvious from the published data, and therefore could not have been made by a purely inductive process. Thus, a deductive process is also a fruitful theoretical methodology, even though it is in some sense “purely theoretical”, that is, accounting for empirical facts is not part of the theory-making process itself (except for motivating the work). A schematic form of this prediction is sketched below.
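Schematically, the prediction that the threshold adapts only above a certain potential corresponds to a rectified threshold target, as in the sketch below. This is a simplified illustrative form with hypothetical parameter values, not the exact equation derived in the paper; it could be used in place of the linear target in the earlier sketch (replacing `theta0 + a * (V[i-1] - E_L)` with `theta_inf(V[i-1])`).

```python
# Schematic threshold target (illustration only, not the equation from the paper):
# below V_i the threshold stays fixed at theta0, above V_i it adapts to V.
def theta_inf(V, theta0=-55e-3, k=0.5, V_i=-60e-3):
    return theta0 + k * max(V - V_i, 0.0)   # all values in volts, hypothetical
```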

In the second paper, we also used a deductive process to understand what threshold adaptation implies for synaptic integration. For example, we showed that incoming spikes interact on the timescale of threshold adaptation, rather than that of the membrane time constant. Note how the goal of this theoretical work is no longer to account for empirical facts or explain mechanisms, but to provide a new interpretative framework for these facts. The theory redefines what should be considered significant – in this case, the distance to threshold rather than the absolute membrane potential. This is an important remark, because it implies that theoretical work is not only about making new experimental predictions, but also about interpreting experimental observations and possibly orienting future experiments.

We then concluded the paper with a counter-inductive line of reasoning. Different ionic mechanisms may contribute to threshold adaptation, in particular sodium channel inactivation and potassium channel activation. We argued that the former was more likely, because it is more energetically efficient (the latter requires sodium and potassium channels to be simultaneously open and counteract each other, implying considerable ionic traffic). This argument is not empirical: it relies on the idea that neurons should be efficient, based on evolutionary theory (a theoretical argument), and on the fact that the brain has been shown to be efficient in many other circumstances (an argument by analogy). It is not based on empirical evidence, and worse, it is contradicted by empirical evidence: blocking Kv1 channels abolishes threshold dynamics. I then reason counter-inductively to make my theoretical statement compatible with this observation. First, I note that removing a man’s heart prevents him from thinking, but this does not imply that thoughts are produced by the heart. This is an epistemological argument (discarding the methodology as inappropriate). Secondly, I was told by a colleague (unpublished observation) that suppressing Kv1 moves the spike initiation site to the node of Ranvier (discarding the data as irrelevant or abnormal). Thirdly, I can quantitatively account for the results with our theory, by noting that suppressing any channel can globally shift the spike threshold and possibly move the minimum threshold below the half-inactivation voltage of sodium channels, in which case there is no more threshold variability. These are three counter-inductive arguments that are perfectly reasonable. One might not be convinced by them, but they cannot be discarded as intrinsically wrong. Since it is possible that I am right, counter-inductive reasoning is a useful scientific methodology. Note also how counter-inductive reasoning can suggest new experiments, for example testing whether suppressing Kv1 moves the initiation site to the node of Ranvier.

In summary, there are different types of theoretical work. They differ not so much in content as in methodology: deduction, induction and counter-induction. All three types of methodologies are valid and fruitful, and they should be recognized as such, noting that they have different logics and possibly different aims.

 

Update. It occurred to me that I use the word “induction” to refer to the making of a law from a series of observations, but this process is often subdivided into two different processes: induction and abduction. In this sense, induction is the making of a law from a series of observations in the sense of “generalizing”: for example, reasoning by analogy or fitting a curve to empirical data. Abduction is the finding of a possible underlying cause that would explain the observations. Thus abduction is more creative and seems more uncertain: it is the making of a hypothesis (among other possible hypotheses), while induction is rather the direct generalization of empirical data together with accepted knowledge. For example, data-driven neural modeling is a sort of inductive process. One builds a model from measurements and implicitly accepted knowledge about neural biophysics – which generally comes with an astounding number of implicit hypotheses and approximations, e.g. electrotonic compactness or the idea that ionic channel properties are similar across cells and related species. The model accounts for the set of measurements, but it also predicts responses in an infinite number of situations. In my view, induction is the weakest form of theoretical process, because there is no attempt to go beyond the data. Empirical data are seen as a series of unconnected observations (like weather reports) that just need to be included in the already existing theory.

What is computational neuroscience? (V) A side note on Paul Feyerabend

Paul Feyerabend was a philosopher of science who defended an anarchist view of science (in his book “Against Method”). That is, he opposed the idea that there should be methodologies imposed in science, because he considered that these are the expression of conservatism. One may not agree with all his conclusions (some think of him as defending relativistic views), but his arguments are worth considering. By looking at the Copernican revolution, Feyerabend makes a strong case that the methodologies proposed by philosophers (e.g. falsificationism) have failed both as a description of scientific activity and as a prescription of "good" scientific activity. That is, in the history of science, new theories that ultimately replace established theories are initially in contradiction with established scientific facts. If they had been judged by the standards of falsificationism, for example, they would have been immediately falsified. Yet the Copernican view (the Earth revolves around the Sun) ultimately prevailed over the Ptolemaic system (the Earth is at the center of the universe). Galileo firmly believed in heliocentrism not for empirical reasons (it did not explain more data) but because it “made more sense”, that is, it seemed like a more elegant explanation of the apparent trajectories of planets. See e.g. the picture below (taken from Wikipedia) showing the motion of the Sun, the Earth and Mars in both systems:

It appears clearly in this picture that the heliocentric view has no additional empirical content, but it seems more satisfactory. At the time, though, heliocentrism could easily be disproved with simple arguments, such as the tower argument: when a stone falls from the top of a tower, it falls right beneath it, while it should be “left behind” if the Earth were moving. This is a solid empirical fact, easily reproducible, which falsifies heliocentrism. It might seem foolish to us today, but only because we know that the Earth moves. If we look again at the picture above, we see two theories that both account for the apparent trajectories of planets, but the tower argument corroborates geocentrism while it falsifies heliocentrism. Therefore, so Feyerabend concludes, scientific methodologies that are still widely accepted today (falsificationism) would immediately discard heliocentrism. It follows that these are not only a poor description of how scientific theories are made, but also a dangerous prescription of scientific activity, for they would not allow the Copernican revolution to occur.

Feyerabend then goes on to argue that the development of new theories follows a counter-inductive process. This, I believe, is a very deep observation. When a new theory is introduced, it is initially contradictory with a number of established scientific facts, such as the tower argument. Therefore, the theory develops by making the scientific facts agree with the theory, for example by finding an explanation for the fact that the stone falls right beneath the point where it was dropped. Note that these explanations may take a long time to be made convincing, and that they do not constitute the core of the theory. This stands in sharp contrast with induction, in which a theory is built so as to account for the known facts. Here it is the theory itself (e.g. a philosophical principle) that is considered true, while the facts are re-interpreted so as to agree with it.

I want to stress that these arguments do not support relativism, i.e., the idea that all scientific theories are equally valid, depending on the point of view. To make this point clear, I will make an analogy with a notion familiar to physicists, the energy landscape:

This is very schematic, but perhaps it helps make the argument. In the picture above, I represent on the vertical axis the amount of disagreement between a theory (on the horizontal axis) and empirical facts. This disagreement could be seen as the “energy” that one wants to minimize. The standard inductive process consists in incrementally improving a theory so as to minimize this energy (a sort of “gradient descent”). This process may stabilize into an established theory (the “current theory” in the picture). However, it is very possible that a better theory, empirically speaking, cannot be developed by this process, because it requires a change in paradigm, something that cannot be obtained by incremental changes to the established theory. That is, there is an “energy barrier” between the two theories. Passing through this barrier requires an empirical regression, in which the newly introduced theory is initially worse than the current theory in accounting for the empirical facts.
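To make the analogy concrete, here is a toy numerical version of it (nothing more than an illustration of the metaphor, with an arbitrary landscape):

```python
# Gradient descent on a one-dimensional "disagreement" landscape with two minima:
# incremental improvement of the current theory gets stuck behind the energy barrier,
# even though a better (lower-energy) theory exists on the other side.
import numpy as np

def energy(x):
    # Double-well landscape: local minimum near x = -1, deeper minimum near x = +1.
    return (x**2 - 1)**2 - 0.3 * x

def gradient(x, eps=1e-6):
    return (energy(x + eps) - energy(x - eps)) / (2 * eps)

x = -1.2                       # start near the "current theory"
for _ in range(1000):          # incremental improvements (gradient descent)
    x -= 0.01 * gradient(x)

print(f"descent converges to x = {x:.3f} with energy {energy(x):.3f}")
print(f"the better theory near x = +1 has energy {energy(1.0):.3f}")
# The descent stabilizes in the local minimum; reaching the better theory would
# require temporarily increasing the "energy", i.e. an empirical regression.
```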

This analogy illustrates the idea that it can be necessary to temporarily deviate from the empirical facts so as to ultimately explain more of them. This does not mean that empirical facts do not matter, but simply that explaining more and more empirical facts should not be elevated to the rank of “the good scientific methodology”. There are other scientific processes that are both valid as methodologies and necessary for scientific progress. I believe this is how the title of Feyerabend’s book, “Against Method”, should be understood.

What is computational neuroscience? (IV) Should theories explain the data?

Since there is such an obvious answer, you might anticipate that I am going to question it! More precisely, I am going to analyze the following statement: a good theory is one that explains the maximum amount of empirical data while being as simple as possible. I will argue that 1) this is not stupid at all, but that 2) it cannot be a general criterion to distinguish good and bad theories, and finally that 3) it is only a relevant criterion for orthodox theories, i.e., theories that are consistent with the theories that produced the data. The arguments are not particularly original; I will mostly summarize points made by a number of philosophers.

First of all, given a finite set of observations, there are an infinite number of universal laws that agree with the observations, so the problem is underdetermined. This is the skeptical criticism of inductivism. Which theory to choose, then? One approach is "Occam's razor", i.e., the idea that among competing hypotheses, the most parsimonious one should be preferred. But of course, Karl Popper and others would argue that it cannot be a valid criterion to distinguish between theories, because it could still be that the more complex hypothesis predicts future observations better than the simpler hypothesis - there is just no way to know without doing the new experiments. Yet it is not absurd as a heuristic to develop theories. This is a well-known fact in the field of machine learning, related to the problem of "overfitting". If one wants to describe the relationship between two quantities x and y from a set of n examples (xi,yi), one can always fit the data perfectly with a polynomial of degree n-1. It would completely explain the data, yet it would be very unlikely to fit a new example. In fact, a lower-order relationship is more likely to account for new data, and this can be shown rigorously with the tools of statistical learning theory. Thus there is a trade-off between how much of the data is accounted for and the simplicity of the theory. So Occam's razor is actually a very sensible heuristic to produce theories. But it should not be confused with a general criterion to discard theories.
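To make the overfitting point concrete, here is a minimal sketch (synthetic data and parameters chosen purely for illustration):

```python
# Fit n noisy samples of a simple relationship with a low-order polynomial and with
# a polynomial of degree n-1 (which passes through every data point), then compare
# how well each predicts new points taken between the original samples.
import numpy as np

rng = np.random.default_rng(0)
n = 10
x = np.linspace(0, 1, n)
y = np.sin(2 * np.pi * x) + 0.2 * rng.standard_normal(n)   # noisy observations

x_new = (x[:-1] + x[1:]) / 2                               # "future" observations
y_new = np.sin(2 * np.pi * x_new)

for degree in (3, n - 1):
    coeffs = np.polyfit(x, y, degree)
    error_data = np.mean((np.polyval(coeffs, x) - y) ** 2)
    error_new = np.mean((np.polyval(coeffs, x_new) - y_new) ** 2)
    print(f"degree {degree}: error on the data = {error_data:.4f}, "
          f"error on new points = {error_new:.4f}")
# The degree-9 polynomial "explains" the data almost exactly, yet it typically
# predicts the new points much worse than the simpler degree-3 fit.
```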

The interim conclusion is: a theory should account for the data, but not at the expense of being as complicated as the data itself. Now I will make deeper criticisms, mostly based on post-Popper philosophers such as Kuhn, Lakatos and Feyerabend. In a nutshell, the argument is that insisting that a theory should explain empirical data is a kind of inversion of what science is about. Science is about understanding the real world, by making theories and testing them with carefully designed experiments. These experiments are usually done in conditions that are very unecological, and this is justified by the fact that they are designed to test a specific hypothesis in a controlled way. For example, the laws of mechanics would be tested in conditions where there is no friction, a condition that almost never occurs in the real world - and this is perfectly sound methodology. But then insisting that a new theory should be evaluated by how much it explains the empirical data is what I would call the "empiricist inversion": empirical data were produced, under very peculiar conditions justified by the theory that motivated the experiments, and now we demand that any new theory should explain these data. One obvious point, made by Kuhn and Feyerabend, is that this gives a highly unfair advantage to the first theory, just because it was there first. But it is actually worse than this, because it also means that the criterion used to judge theories is now disconnected from what the theory that produced the data was meant to explain in the first place. Here is the empiricist inversion: we consider that theories should explain data, when actually data are produced to test theories. What a theory is meant to explain is the world; data are only a methodological tool to test theories of the world.

In summary, this criterion tends to produce theories of data, not theories of the world. This point in fact relates to the arguments of Gibson, who criticized psychological research for focusing on laboratory stimuli rather than ecological conditions. Of course, simplified laboratory stimuli are used to control experiments precisely, but it should always be kept in mind that these simplified stimuli are methodological tools, not the things that are meant to be explained. In neural modeling, I find that many models are developed to explain experimental data while ignoring the function of the system being modeled (i.e., the “computational level” in Marr’s analysis framework). In my view, this is characteristic of the empiricist inversion, which results in models of the data, not models of the brain.

At this point, my remarks might start to sound confusing. On the one hand, I am saying that it is a good idea to try to account for the data with a simple explanation; on the other hand, I am saying that we should not care so much about the data. These seemingly contradictory statements can still make sense because they apply to different types of theories. This is related to what Thomas Kuhn termed “normal science” and “revolutionary science”. These terms might sound a bit too judgmental, so I will rather speak of “orthodox theories” and “non-orthodox theories”. The idea is that science is structured by paradigm shifts. Between such shifts, a central paradigm dominates. Data are obtained through this paradigm, anomalies are also explained through this paradigm (rather than being seen as falsifications), and a lot of new scientific results are produced by “puzzle solving”, i.e., trying to explain data. At some point, for various reasons (e.g. too many unexplained anomalies), the central paradigm shifts to a new one and the process starts again, but with new data, new methods, or new ways to look at the observations.

“Orthodox theories” are theories developed within the central paradigm. They try to explain the data obtained within this paradigm (the “puzzle-solving” activity). Here it makes sense to consider that a good theory is a simple explanation of the empirical data. But this kind of criterion cannot explain paradigm shifts. A paradigm shift requires the development of non-orthodox theories, for which the existing empirical data may not be adequate. Therefore the making of non-orthodox theories follows a different logic. Because the existing data were obtained within a different paradigm, these theories are not driven by the data, although they may be motivated by some anomalous set of data. For example, they may be developed from philosophical considerations or by analogy. The logic of their construction might be better described by counter-induction (a concept proposed by Feyerabend) than by induction. That is, their development starts from a theoretical principle rather than from data, and existing data are deconstructed so as to fit the theory. By this process, implicit assumptions of the central paradigm are uncovered, and this might ultimately trigger new experiments and produce new data that may be favorable to the new theory.

Recently, there have been a lot of discussions in the fields of neuroscience and computational neuroscience about the availability of massive amounts of data. Many consider it a great opportunity that should change the way we work and build models. It certainly seems like a good thing to have more data, but I would like to point out that it mostly matters for the development of orthodox theories. Putting too much emphasis (and resources) on it also raises the danger of driving the field away from non-orthodox theories, which in the end are the ones that bring scientific revolutions (with the caveat that, of course, most non-orthodox theories turn out to be wrong). Being myself unhappy with current orthodox theories in neuroscience, I see this danger as quite significant.

This was a long post, so I will now try to summarize. I started with the provocative question: should a theory explain the data? First of all, a theory that explains every single bit of data is an enumeration of data, not a theory. It is unlikely to predict any new significant fact. This point is related to overfitting, or the “curse of dimensionality”, in statistical learning. A better theory is one that explains a lot of the data with a simple explanation, a principle known as Occam’s razor. However, this criterion should be thought of as a heuristic to develop theories, not a clear-cut general decision criterion between theories. In fact, this criterion is relevant mostly for orthodox theories, i.e., those theories that follow the central paradigm with which most data have been obtained. Non-orthodox theories, on the other hand, cannot be expected to explain most of the data obtained through a different paradigm (at least initially). In fact, they are developed through a counter-inductive process, by which the data are made consistent with the theory. This process may fail to produce new empirical facts consistent with the new theory (most often), or it may succeed and subsequently become the new central paradigm – but this is usually a long process.

What is computational neuroscience? (III) The different kinds of theories in computational neuroscience

Before I try to answer the questions I asked at the end of the previous post, I will first describe the different types of approaches in computational neuroscience. Note that this does not cover everything in theoretical and quantitative neuroscience (see my first post).

David Marr, a very important figure in computational neuroscience, proposed that cognitive systems can be described at three levels:

1) The computational level: what does the system do? (for example: estimating the location of a sound source)

2) The algorithmic/representational level: how does it do it? (for example: by calculating the maximum of the cross-correlation between the two monaural signals; a sketch is given after this list)

3) The physical level: how is it physically realized? (for example: with axonal delay lines and coincidence detectors)
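To make the sound localization example concrete, here is a minimal sketch of the algorithmic level (signal parameters are arbitrary choices for illustration; in the example, the physical level would implement the same computation with delay lines and coincidence detectors):

```python
# Estimate the interaural time difference (and hence the direction of a sound source)
# as the lag that maximizes the cross-correlation between the two monaural signals.
import numpy as np

fs = 44100                                    # sampling rate (Hz)
t = np.arange(0, 0.05, 1 / fs)
rng = np.random.default_rng(1)

true_delay = 20                               # delay of the right ear, in samples
source = rng.standard_normal(t.size)          # broadband source signal
left = source
right = np.roll(source, true_delay) + 0.05 * rng.standard_normal(t.size)

corr = np.correlate(right, left, mode="full") # cross-correlation at all lags
lags = np.arange(-left.size + 1, left.size)
estimated_delay = lags[np.argmax(corr)]

print(f"true delay: {true_delay} samples, estimated delay: {estimated_delay} samples")
# The lag of the maximum is the interaural time difference, which maps onto the
# direction of the source.
```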

Theories in computational neuroscience differ by which level is addressed, and by the postulated relationships between the three levels (see also my related post).

David Marr considered that these three levels are independent. Francisco Varela described this view as “computational objectivism”. This means that the goal of the computation is defined in terms that are external to the organism. The two other levels describe how this goal is achieved, but they have no influence on what is achieved. It is implied that evolution shapes levels 2 and 3 by imposing the first level. It is important to realize that theories that follow this approach necessarily start from the highest level (defining the object of information processing), and only then analyze the lower levels. Such approaches can be restricted to the first level, or the first two levels, but they cannot address only the third level, or the second level, because these are defined by the higher levels. It can be described as a “top-down” approach.

The opposite view is that both the algorithmic and computational levels derive from the physical level, i.e., they emerge from the interactions between neurons. Varela described it as “neurophysiological subjectivism”. In this view, one would start by analyzing the third level, and then possibly go up to the higher levels – this is a “bottom-up” approach. This is the logic followed by the data-driven approaches that I criticized in my first post. I criticized it because this view fails to acknowledge the fact that living beings are intensely teleonomic, i.e., the physical level serves a project (invariant reproduction, in the words of Jacques Monod). This is not to say that function is not produced by the interaction of neurons – it has to be, in a materialistic view. But as a method of scientific inquiry, analyzing the physical level independently of the higher levels, as if it were a non-living object (e.g. a gas), does not seem adequate – at best it seems overly optimistic. As far as I know, this type of approach has produced theories of neural dynamics, rather than theories of neural computation. For example, such theories show how oscillations or some other large-scale aspect of neural networks might emerge from the interaction of neurons. In other words, in Marr’s hierarchy, such studies are restricted to the third level. Therefore, I would categorize them as theoretical neuroscience rather than computational neuroscience.

These two opposite views roughly correspond to externalism and internalism in philosophy of perception. It is important to realize that these are deep philosophical distinctions, with considerable epistemological implications, in particular for what is considered a “realistic” model. Computational objectivists would insist that a biological model must serve a function, otherwise it is simply not about biology. Neurophysiological subjectivists would insist that models must agree with certain physiological experiments, otherwise they are empirically wrong.

There is another class of approaches in philosophy of perception, which can be seen as intermediate between these two, the embodied approaches. These consider that the computational level cannot be defined independently of the physical level, because the goal of computation can only be defined in terms that are accessible to the organism. In the more external views (Gibson/O’Regan), this means that the computational level actually includes the body, but the neural implementation is seen as independent from the computational level. For example, in Gibson’s ecological approach and in O’Regan’s sensorimotor theory, the organism looks for information about the world implicit in its sensorimotor coupling. This differs quite substantially from computational objectivism in the way the goal of the computation is defined. In computational objectivism, the goal is defined externally. For example: to estimate the angle between a sound source and the head. Sensorimotor theories acknowledge that the notion of “angle” is one of an external observer with some measurement apparatus; it cannot be one of an organism. Instead, in sensorimotor approaches, direction is defined subjectively (contrary to computational objectivism), but still in reference to an external world (contrary to neurophysiological subjectivism), as the self-generated movement that would make the sound move to the front (an arbitrary reference point). In the more internal views (e.g. Varela), the notion of computation itself is questioned, as it is considered that the goal is defined by the organism itself. This is Varela’s concept of autopoiesis, according to which a living entity acts so as to maintain its own organization. “Computation” is then a by-product of this process. This last class of approaches is currently less developed in computational neuroscience.

The three types of approaches I have described differ mostly in the relationship they postulate between the computational and physical levels, and they are tightly linked with different views in philosophy of perception. There is another dividing line between theories of neural computation, which has to do with the relationship between the algorithmic and physical levels. This is related to the rate-based vs. spike-based theories of neural computation (see my series of posts on the subject).

In Marr’s view, and in general in rate-based views, the algorithmic and physical levels are mostly independent. Because algorithms are generally described in terms of calculus on analog values, spikes are seen as implementing this analog calculus. In other words, spikes only reflect an underlying analog quantity, the firing rate of a neuron, on which the algorithms are defined. The usual view is that spikes are produced randomly with some probability reflecting the underlying rate (an abstract quantity).
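As a minimal sketch of this view (with arbitrary parameters), spikes can be generated as an inhomogeneous Poisson process driven by an underlying analog rate:

```python
# Generate spikes randomly from a time-varying firing rate: in each small time bin,
# a spike occurs with probability rate * dt, independently of the other bins.
import numpy as np

dt = 0.001                                    # time step (s)
t = np.arange(0, 1, dt)
rate = 20 + 15 * np.sin(2 * np.pi * 2 * t)    # underlying analog firing rate (Hz)

rng = np.random.default_rng(0)
spikes = rng.random(t.size) < rate * dt       # Bernoulli approximation of a Poisson process

print(f"{spikes.sum()} spikes; expected about {np.sum(rate) * dt:.0f}")
# In this view the spike train is just a noisy read-out of the rate: averaging over
# trials or neurons recovers the analog signal on which computation is defined.
```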

On the contrary, another view holds that algorithms are defined at the level of spikes, not of rates. Such theories include the idea of binding by synchrony (Singer/von der Malsburg), in which neural synchrony is the signature of a coherent object, the related idea of synfire chains (Abeles), and more recently the theories developed by Sophie Denève and by myself (there is also Thorpe’s rank-order coding theory, but it is more on the side of coding than of computation). In these last two theories, spiking is seen as a decision. In Denève’s approach, the neuron spikes so as to reduce an error criterion. In my recent paper on computing with synchrony, the neuron spikes when it observes unlikely coincidences, which signal some invariant structure (in the sense of Gibson). In both cases, the algorithm is defined directly at the level of spikes.
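As a caricature of the synchrony-based idea (far simpler than either of the actual models mentioned above), a neuron can be made to spike only when the coincidences among its inputs exceed what would be expected by chance:

```python
# The "neuron" counts, in each time bin, how many of its inputs fire together, and
# emits a spike when this count is well above the chance level expected from
# independent inputs -- an unlikely coincidence that signals structure in the input.
import numpy as np

rng = np.random.default_rng(2)
n_inputs, n_bins, p = 50, 1000, 0.02          # inputs, 1 ms bins, baseline firing probability

inputs = rng.random((n_inputs, n_bins)) < p   # independent background spikes
inputs[:10, 500] = True                       # a synchronous event among 10 inputs at bin 500

counts = inputs.sum(axis=0)                   # coincidence count per bin
chance = n_inputs * p                         # expected count under independence
decision_threshold = chance + 5 * np.sqrt(chance)   # well above chance fluctuations

output_spikes = np.flatnonzero(counts > decision_threshold)
print("output spikes at bins:", output_spikes)
# Chance-level coincidences essentially never cross the threshold, whereas the
# synchronous event at bin 500 does: spiking is a decision that signals an unlikely
# coincidence, not a graded read-out of a rate.
```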

In summary: theories of neural computation can be classified according to the implicit relationships between the three levels of analysis described by Marr. It is important to realize that these are not purely scientific differences (by this I mean that they are not simply empirical disputes), but philosophical and/or epistemological differences. In my view this is a big issue for the peer-review system, because it is difficult to have a paper accepted when the reviewers or editors do not share the same epistemological views.