What is computational neuroscience? (X) Reverse engineering the brain

One phrase that occasionally pops up when speaking of the goal of computational neuroscience is “reverse engineering the brain”. This is quite an interesting phrase from an epistemological point of view. The analogy is to see the brain as an engineered device, the “engineer” being evolution, of which we do not possess the design plans. We are supposed to understand it by opening it, and trying to guess what mechanisms are at play.

What is interesting is that observing and trying to understand the mechanisms is basically what science is about, not only neuroscience, so there must be something else in this analogy. For example, we would not describe the goal of astronomy as reverse engineering the planets. What is implied in the phrase is the notion that there is a plan, and that this plan is meant to achieve a function. It is a reference to the teleonomic nature of life in general, and of the nervous system in particular: the brain is not just a soup of neurons, these neurons coordinate their action so as to achieve some function (to survive, to reproduce, etc).

So the analogy is meaningful from this point of view, but like any analogy it has its limits. Is there no difference between a living being and an engineered artifact? This question points to what life is, which is a very broad question, but here I will just focus on two differences that I think are relevant for the present matter.

There is one very important specificity that was well explained by the philosopher Humberto Maturana (“The Organization of the living”, 1974). Engineered things have a structure that is designed so as to fulfill some function, that is, they are made of specific components that have to be arranged in a specific way, according to a plan. So all you need to understand is the structure, and its relation with the function. But as Maturana pointed out, living things have a structure (the body, the wiring of neurons, etc) but they also have an organization that produces that structure. The organization is a set of processes that produce the structure, which is itself responsible for the organization. But what defines the living being is its organization, not its structure, which can change. In the case of the nervous system, the wiring between neurons changes dramatically in the course of life, or even in the course of one hour, and the living being remains the same. The function of the organization is to maintain the conditions for its existence, and since it exists in a body interacting with an external environment, it is in fact necessary that the structure changes so as to maintain the organization. This is what is usually termed “plasticity” or “learning”. Therefore living things are defined by their organization, while engineered things are defined by their structure.

This is one aspect in which the engineering analogy is weak, because it misses this important distinction. Another one is that an engineered thing is made by an engineer, that is, by someone external to the object. Therefore the function is defined with respect to an external point of view. The plan would typically include elements that are defined in terms of physics, concepts that can only be grasped and measured by some external observer with appropriate tools. But a living organism only has its own senses and ways of interacting with the environment to make sense of the world. This is true of the nervous system as a whole, but also of individual cells: a cell has ways of interacting with other cells and possibly with the outside world, but it does not have a global picture of the organism. For example, an engineer's plan would specify where each component should go, e.g. with Euclidean coordinates. But this is not how development can work in a living thing. Instead, the plan should come in the form of mechanisms that specify not “where” a thing is, but rather “how to get there”, or perhaps even when a component should transform into a new component: specific ways of interacting that end up in the desired result.

Therefore the nature of the “plan” is really quite different from the plan of an engineer. To make my point, I will draw an analogy with philosophy of knowledge. A plan is a form of knowledge, or at least it includes some knowledge. For example, if the plan includes the statement “part A should be placed at such coordinates”, then the organism that executes the plan is implicitly assumed to have some knowledge of Euclidean geometry. For an engineer, knowledge comes from physics, and is based on the use of specific tools to measure things in the world. But for a cell, knowledge about the world comes just from the interaction with the world: different ways to sense it (e.g. incoming spikes for a neuron), different ways to act on it (e.g. producing a spike, releasing some molecules in the extracellular medium). A plan can be specified in terms of physics if it is to be executed by an engineer, but it cannot be specified in these terms if it is to be executed by a cell: instead, it would be specified in terms of mechanisms that make sense given the ways the cell can interact with the world. Implicit knowledge about the world that is included in an engineer's plan is what I could call “metaphysical knowledge”, in relationship with the corresponding notion in philosophy of science.

Science is made of universal statements, such as the law of gravitation. But not all statements are scientific, for example “there is a God”. In philosophy of science, Karl Popper proposed that a scientific statement is one that can potentially be falsified by an observation, whereas a metaphysical statement is a statement that cannot be falsified. For example, the statement “all penguins are black” is scientific, because I could imagine that one day I see a white penguin. On the other hand, the statement “there is a God” is metaphysical, because there is no way I can check. Closer to the matter of this text, the statement “the world is actually five-dimensional but we live in a three-dimensional subspace” is also metaphysical because independently of whether it is true or not, we have no way to confirm it or to falsify it given the way we interact with the world.

So what I call “metaphysical knowledge” in an engineer's plan is knowledge that cannot be corroborated or falsified by the organism that executes the plan, given its senses and possibilities for action. For example, consider the following statement: neurons in the lateral geniculate nucleus project to the occipital region of the brain. This includes metaphysical knowledge about where that region is, which is specified from the point of view of an external observer. This cannot be a biological plan. Instead, a biological plan would rather have to specify what kind of interaction a growing axon should have with its environment in order to end up in the desired region.

In summary, although the phrase “reverse engineering” acknowledges the fact that, contrary to physical things of nature such as planets, living things have a function, it misses several important specificities of life. One is that living things are defined by their organization, rather than by the changing structure that the organization produces, while engineered things are defined by their structure. Another one is that the “plan”, which defines that organization, is of a very different nature than the plan made by and for an engineer, because in the latter case the function and the design are conceived from an external point of view, which generally includes “metaphysical knowledge”, i.e., knowledge that cannot be grasped from the perspective of the organism.

What is computational neuroscience? (IX) The epistemological status of simulations

Computational neuroscience is not only about making theories. A large part of the field is also about simulations of neural models on a computer. In fact, there is little theoretical work in neuroscience that does not involve simulations at some stage. The epistemological status of simulations is quite interesting, and studies about it in philosophy of knowledge are quite recent. There is for example the work of Eric Winsberg, but I believe it mostly addresses questions related to physics. In particular, he starts one of his most cited papers (“Simulations, models and theories”, 2001) by stating: “I will be talking about the use of computers for modeling very complex physical phenomena for which there already exist good, well-understood theories of the processes underlying the phenomena in question”. This is an important distinction, and I will come back to it.

What is interesting about simulations from an epistemological viewpoint is that from a strictly Popperian viewpoint, simulation is useless. Indeed it looks like a sort of experiment, but there is no interaction with the world. It starts from a theory and a set of factual statements, and derives another set of factual statements. It is neither the making of a theory (no universal statement is produced), nor the test of a theory. So why is it that simulation is used so broadly?

In fact there are different types of simulation work. Broadly speaking, we may think of two categories: theory-driven simulations, and data-driven simulations.

I will start with theory-driven simulations. There are in fact two different motivations to use simulations in theoretical work. One is exploratory: simulations are used in the process of making theories, because the models are so complex that it may be difficult to predict their behavior. This is a general problem with so-called complex systems. Simulations are then used for example to explore the effect of various parameters on the behavior of the model, or to see whether some property can appear given a set of rules, etc. Another motivation is to test a theory. Now this may seem odd since we are not speaking of an empirical test. First of all, this apparent oddity perhaps stems from the myth that theoretical work is essentially about making logical deductions from initial statements. But in reality, especially in biology where models can be very complex, theoretical work almost invariably involves some guesswork, approximations, and sometimes vaguely justified intuitions. Therefore, it makes sense to check the validity of these approximations in a number of scenarios. For example, in my paper with Jonathan Platkiewicz about the spike threshold, we derived an equation for the spike threshold from the Hodgkin-Huxley equations. It involved approximations of the sodium current, and we also developed the theory in an isopotential neuron. Therefore in that paper, we checked the theory against the numerical simulation of a complex multicompartmental neuron model, and it was not obvious that it would work.

There is another motivation, which is more specific to computational neuroscience. Theories in this field are about how the interaction of neurons produces behavior, or in other words, about linking physiology, at the neuron level, and function, at the systems or organism level. But to speak of function, one needs an environment. This external element is not part of the neural model, yet it is critical to the relevance of the model. Theories generally do not include explicit models of the environment, or only simplistic versions. For example, in my paper about sound localization with Dan Goodman, we proposed a mechanism by which selective synchrony occurs when a sound is presented at specific locations, leading to a simple spiking model that can accurately estimate the location of a sound source in the presence of realistic diffraction properties. In principle it works perfectly, but of course in a real environment the acoustical signals are unknown but not arbitrary: they may have a limited spectrum, there may be noise, diffraction properties are also unknown but not arbitrary, there may be ambiguities (e.g. the cones of confusion), etc. For this reason, the model needed to be implemented and its performance tested, which we did with recorded sounds, measured acoustical filters and acoustical noise. Thus it appears that even for theory-driven work, simulation is unavoidable because the theory applies to the interaction with an unknown, complex environment. In fact, ideally, models should be simulated, embodied (in a robot) and allowed to interact with a real (non-simulated) environment. Since theories in computational neuroscience claim to link physiology and function, this would be the kind of empirical work required to substantiate such claims.

The other type of simulation work is data-driven. I believe this is usually what is meant by “simulation-based science”. In this kind of work, there is little specific theory; that is, only established theories are used, such as cable equation theory. Instead, models are built based on measurements. The simulations are then truly used as a kind of experiment, to observe what might emerge from the complex interaction of neuron models. It is sometimes said that simulations are used to do “virtual experiments” when the actual experiments would be impractical. Another typical use is to test the behavior of a complex model when parameters are varied in a range that is considered plausible.

In physics, such computer simulations are also used, for example to simulate the explosion of a nuclear bomb. But as Winsberg noted, there is a very important epistemological distinction between simulations in physics and in biology: in the former, there is an extremely detailed knowledge of both the laws that govern the underlying processes and of the arrangement of the individual elements in the simulations. Note that even in this case, the value of such simulations is controversial. But in the case of biology and especially neuroscience, the situation is quite different. It is in fact acknowledged by the typical use cases mentioned above.

Consider the statement that a simulation is used to perform a “virtual experiment” when actual experiments are impractical. This seems similar to the simulation of a nuclear explosion. In that case, one is interested in the large scale behavior of the system, and at such a large scale the experiment is difficult to do. But in neuroscience, the situation is exactly the opposite. The experiment with a full organism is actually what is easy to do (or at least feasible), it is a behavioral experiment. So simulations are not used to observe how an animal behaves. They are used to observe the microstructure of the system. But then this means that this microstructure was not known at the time when the model was built, and so these properties that are to be observed are considered as sufficiently constrained by the initial set of measurements to be derived from them.

The second, and generally complementary, use case is to simulate the model while varying a number of parameters so as to find the viable region in which the model produces results consistent with some higher-order measurements (for example, local field potentials). If the parameters are varied, then this means they are actually not known with great certainty. Thus it is clear that biophysical models based on measurements are in fact much less reliable than physical models such as those of nuclear explosions.
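To make this second use case concrete, here is a minimal sketch of such a parameter sweep. Everything in it is an assumption chosen for illustration: a leaky integrate-and-fire neuron stands in for the detailed biophysical model, its steady firing rate stands in for the “higher-order measurement”, and the parameter ranges and the target range of 5 to 20 Hz are arbitrary.

```python
import numpy as np

def lif_rate(drive, tau, v_th=15.0):
    """Steady firing rate (Hz) of a leaky integrate-and-fire neuron
    with no refractory period (toy stand-in for a detailed model).

    drive: constant input R*I in mV (relative to rest)
    tau:   membrane time constant in seconds
    v_th:  spike threshold in mV (relative to rest, reset at rest)
    """
    drive = np.asarray(drive, dtype=float)
    supra = drive > v_th
    # Placeholder value for subthreshold entries so the log stays defined
    safe = np.where(supra, drive, v_th + 1.0)
    return np.where(supra, 1.0 / (tau * np.log(safe / (safe - v_th))), 0.0)

# Hypothetical "plausible" ranges for the two uncertain parameters
drives = np.linspace(10.0, 30.0, 41)   # mV
taus = np.linspace(0.005, 0.03, 26)    # s
D, T = np.meshgrid(drives, taus, indexing="ij")

rates = lif_rate(D, T)

# "Viable region": parameter pairs whose output matches the target range
viable = (rates >= 5.0) & (rates <= 20.0)
print(f"{viable.sum()} of {viable.size} parameter pairs are viable")
```

The fact that a whole region of the grid is compatible with the target illustrates the point: the measurement constrains the parameters only loosely, and the sweep itself is an admission that they are not known.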

One source of uncertainty is the values of parameters in the models, for example channel densities. This is already one great problem. Probably the biggest issue here is not so much the uncertainty about parameters, which is an issue in models of all fields, but the fact that the parameters are most likely not independent, i.e., they covary in a given cell or between cells. This lack of independence comes from the fact that the model is of a living thing, and in a living thing all components and processes contribute to the function of the organism, which implies tight relations between them. The study of these relations is a defining part of biology as a field, but if a model does not explicitly include these relations, then it would seem extraordinary that proper function can arise without them, given that they are hidden under the uncertainty in the parameters. For example, consider action potential generation. Sodium channels are responsible for initiation, potassium channels for repolarization. There are a number of recent studies showing that their properties and densities are precisely tuned with respect to each other so that energy consumption is minimized: indeed energy is lost if they are simultaneously open because they have opposite effects. If this functional relation were unknown and only channel densities were known within some range, then the coordination would go unnoticed and a naive model simply using independent values from these distributions would display inefficient action potential generation, unlike real neurons.
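The loss of coordination under independent sampling can be shown with a toy numerical sketch. All numbers here are invented for illustration: the “measured” population, the tuning ratio of 0.3 and the noise levels are assumptions, and the “mismatch” is only a crude proxy for the functional cost, not a model of energy consumption.

```python
import numpy as np

rng = np.random.default_rng(0)
n_cells = 5000
ratio = 0.3  # hypothetical tuning of K density to Na density

# Hypothetical population in which each cell co-regulates its two densities
g_na = rng.normal(100.0, 15.0, n_cells)
g_k = ratio * g_na + rng.normal(0.0, 1.0, n_cells)

# Naive model: each parameter drawn independently from its measured
# distribution. Shuffling g_k keeps its distribution exactly the same
# but destroys the pairing with g_na within each cell.
g_k_naive = rng.permutation(g_k)

corr_measured = np.corrcoef(g_na, g_k)[0, 1]
corr_naive = np.corrcoef(g_na, g_k_naive)[0, 1]

# Crude proxy for loss of coordination: distance from the tuned ratio
mismatch_measured = np.mean(np.abs(g_k - ratio * g_na))
mismatch_naive = np.mean(np.abs(g_k_naive - ratio * g_na))

print(f"correlation: measured {corr_measured:.2f}, naive {corr_naive:.2f}")
print(f"mismatch:    measured {mismatch_measured:.2f}, naive {mismatch_naive:.2f}")
```

Both populations have identical distributions for each parameter taken alone, so a model constrained only by those distributions cannot tell them apart, even though only one of them is coordinated.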

I will try to summarize the above point. Such simulations are based on the assumption that the laws that govern the underlying processes are very well understood. This may well be true for the laws of neural electricity (cable equations, Hodgkin-Huxley equations). However, in biology in general and in neuroscience in particular, the relevant laws are also those that describe the relations between the different elements of the model. This is a completely different set of laws. For the example of action potential generation, the laws are related to the co-expression of channels, which is more related to the molecular machinery of the cell than to its electrical properties.

Now these laws, which relate to the molecular and genetic machinery, are certainly not so well known. And yet, they are more relevant to what defines a living thing than those describing the propagation of electrical activity, since these are the laws that maintain the structure that keeps the cells alive. Thus, models based on measurements attempt to reproduce biological function without capturing the logic of the living, and this seems rather optimistic. There are also many examples in recent research that show that the knowledge we have of neural function is rather poor, compared to what is to be found. For example, glial cells (which make up most of the cells in the brain) are now thought to play a much more important role in brain function than previously thought, and these are generally ignored in models. Another example is in action potential initiation. Detailed biophysical models are based on morphological reconstructions of the axon, but in fact in the axon initial segment, there is also a scaffold that presumably alters the electrical properties along the axon (for example the axial resistivity should be higher).

All these remarks are meant to point out that in fact, it is illusory to think that there are, or will be in the near future, realistic models of neural networks based on measurements. What is worse, such models seem to miss a critical point in the study of living systems: these are not defined only by their structure (values of parameters, shape of cells) but by processes to maintain that structure and produce function. To quote Maturana (1974), there is a difference between the structure (channel densities etc) and the organization, which is the set of processes that set up that structure, and it is the organization, not the structure, that defines a living thing. Epistemologically speaking, the idea that things not accessible to experiment can be simulated based on measurements that constrain a model is induction. But the predictive power of induction is rather limited when there is such uncertainty.

I do not want to sound as if I were entirely dismissing data-driven simulations. Such simulations can still be useful, as an exploratory tool. For example, one may simulate a neuron using measured channel densities and test whether the results are consistent with what the actual cell does. If they are not, then we know we are missing some important property. But it is wrong to claim that such models are more realistic because they are based on measurements. On one hand, they are based on empirical measurements, on the other hand, they are dismissing mechanisms (or “principles”), which is another empirical aspect to be accounted for in living things. I will come back in a later post to the notion of “realistic model”.

What is computational neuroscience? (VIII) Technical development and observations

In the previous posts, I have strongly insisted on the epistemological notion that theory precedes empirical observation, in the sense that experiments are designed to test theories. I insisted on this point because computational neuroscience seems to be understood by many scientists through the prism of naive inductivism: the view that theory derives more or less directly from observation (you make experimental measurements, and then you “make a model” from them). I do not need to insist again on why this view is flawed in many ways. But of course it would be absurd to advocate the opposite view, that is, that observation cannot provide knowledge unless it is designed to test a theory. This post is meant to nuance my previous arguments.

In fact, historically, science has progressed by two very different means: one is the introduction of radically new perspectives (“paradigm shifts”), another one is the development of new tools. A salient example in neuroscience is the development of patch-clamp, which allows recording currents flowing through single ionic channels. The technique led to the Nobel Prize of Neher and Sakmann in 1991. The discoveries they made with this technique were not revolutionary in Kuhn’s sense, that is, they did not fundamentally contradict the prevailing views and it was not a conceptual change of paradigm. It was already thought since the times of Hodgkin and Huxley that membrane currents came from the flow of ions through channels in the membrane, even though they could not directly observe it at the time. But still, the ability to make observations that were not possible before led to considerable new knowledge, for example the fact that channel opening is binary and stochastic.

At the present time, many think that the discoverers of optogenetics are on the shortlist to get the Nobel Prize in the coming years. Optogenetics is a very recent technique in which channelrhodopsin, a light-activated channel, is expressed in the membrane of target neurons through genetic manipulation. Using lasers, one can then control the firing of neurons in vivo at a millisecond timescale. It allows probing the causal role of different neurons in behavior, while most previous techniques, which relied mostly on recordings, could only measure correlates of behavior. Although it is probably too early to see it clearly, I anticipate that the technique will trigger not only new empirical knowledge, but also conceptually new theories. Indeed, the development of theories is strongly biased by what can be experimentally tested and observed. For example, many current theories in neuroscience focus on the “neural code”, that is, how neurons “represent” different types of information. This is an observer-centric view, which in my opinion stems from the fact that our current empirical view of the brain comes from recordings and imaging: we observe responses to stimuli. The neural coding view is a perspective that one has to adopt to explain such experimental data, rather than a hypothesis on what neurons do. But once we switch to different types of experimental data, in which we observe the effect of neural firing, rather than what neurons “encode”, not only does it become unnecessary to adopt the stimulus-response perspective, but in fact one has to adopt the opposite perspective to explain the experimental data: neurons act on postsynaptic neurons with spikes, rather than observe the firing of presynaptic neurons. This is a conceptual change of perspective, but one that is triggered by a new experimental technique. Note that it still requires the development of these new theories: by itself, the change in perspective is not a theory. But the new technique is responsible for this development in a sociological/historical sense.

Another way in which I anticipate new theories will arise from empirical observations is in the understanding of dendritic function. Almost all theories in computational neuroscience, at least those that address the functional or network level, are based on a view of synaptic integration based on isopotential neurons. That is, it is assumed that the location of synapses on the dendritic tree shapes postsynaptic potentials and perhaps total conductance, but is otherwise irrelevant to synaptic integration. This is not exactly a hypothesis, because we know that it is not true, but rather a methodological assumption, an approximation. Why do we make this assumption if we know it is not true? Simply because removing this assumption does not give us an alternative theory, it leaves us with nothing: there are so many possibilities in which dendritic integration might work, we do not know where to start. But this will change (and has certainly started changing in recent years) once we have a better general idea of how synapses are distributed on the dendritic tree, and perhaps the mechanisms by which this distribution arises. Indeed, one thing at least is clear from recent experimental work: this distribution is not random at all, and obeys different rules for excitation and inhibition. In other words: even though theory does not derive from observations, it needs a starting point, and therefore observations are critical.

What is computational neuroscience? (VII) Incommensurability and relativism

I explained in previous posts that new theories should not be judged by their agreement with the current body of empirical data, because these data were produced by the old theory. In the new theory, they may be interpreted very differently or even considered irrelevant. A few philosophers have gone so far as to state that different theories are incommensurable, that is, they cannot be compared with each other because they have different logics (e.g. observations are not described in the same way in the different theories). This reasoning may lead to relativistic views of science, that is, the idea that all theories are equally “good” and that the choice among them is a matter of personal taste or fashion. In this post I will try to explain the arguments, and also to discard relativism.

In “Against Method”, Feyerabend explains that scientific theories are defined in a relational way, that is, elements of a theory make sense only in reference to other elements of the theory. I believe this is a very deep remark that applies to theories of knowledge in the broadest sense, including perception for example. Below, I drew a schematic figure to illustrate the arguments.

Theories are systems of thought that relate to the world. Concepts in a theory are meant to relate to the world, and they are defined with respect to other concepts in the theory. A given concept in a given theory may have a similar concept in another theory, but it is a different concept, in general. To explain his arguments, Feyerabend uses the analogy of language. It is a good analogy because languages relate to the world, and they have an internal relational structure. Imagine theories A and B are two languages. A word in language A is defined (e.g. in the dictionary) by using other words from language A. A child learns her native language by picking up the relationship between the words, and how they relate to the world she can see. To understand language A, a native speaker of language B may translate the words. However, translation is not definition. It is imprecise because the two words often do not have exactly the same meaning in both languages. Some words may not even exist in one language. A deeper understanding of language A requires going beyond translation and capturing the meaning of words by acquiring a more global understanding of the language, both in its internal structure and in its relationship with the world.

Another analogy one could make is political theories, in how they view society. Clearly, a given observation can be interpreted in opposite ways in conservative and liberal political views. For example, the same economic crisis could be seen as the result of public debt or as the result of public cuts in spending (due to public acquisition of private debt).

These analogies support the argument that an element of a new theory may not be satisfactorily explained in the framework of the old theory. It may only make full sense when embedded in the full structure of the new theory, which means that new theories may be initially unclear and that the concepts may not be well defined. This remark can certainly make different theories difficult to compare, but I would not conclude that theories are incommensurable. This conclusion would be valid if theories were closed systems, because then a given statement would make no sense elsewhere than in the context of the theory in which it is formulated. Axiomatic systems in mathematics could be said to be incommensurable (for example, Euclidean and non-Euclidean geometries). But theories of knowledge, unlike axiomatic systems, are systems that relate to the world, and the world is shared between different theories (as illustrated in the drawing above). For this reason, translation is imprecise but not arbitrary, and one may still assess the degree of consistency between a scientific theory and the part of the world it is meant to explain.

One may find an interesting example in social psychology. In the theory of cognitive dissonance, new facts that seem to contradict our belief system are taken into account by minimally adjusting that belief system (minimizing the “dissonance” between the facts and the theory). In philosophy of knowledge, these adjustments would be called “ad hoc hypotheses”. When it becomes too difficult to account for all the contradictory facts (making the theory too cumbersome), the belief system may ultimately collapse. This is very similar to the theory of knowledge defended by Imre Lakatos, where belief systems are replaced by research programs. Cognitive dissonance theory was introduced through a field study of a small American sect that believed that the end of the world would occur at a specific date (Festinger, Riecken and Schachter (1956), When Prophecy Fails. University of Minnesota Press). When the said date arrived and the world did not end, strangely enough, the sect did not collapse. On the contrary, the failed prediction made it stronger, with the followers more firmly believing in their view of the world. They considered that the world did not end because they prayed so much and God heard their prayers and postponed the event. So they made a new prediction, which of course turned out to be false. The sect ultimately collapsed, although only after a surprisingly long time.

The example illustrates two points. Firstly, a theory does not collapse because one prediction is falsified. Instead, the theory is adjusted with a minor modification so as to account for the seemingly contradicting observation. Secondly, this process does not go on forever, because of its interaction with the world: when predictions are systematically falsified, the theory ultimately loses its followers, and for a good reason.

In summary, a theory of knowledge is a system in interaction with the world. It has an internal structure, and it also relates to the world. And although it may relate to the world in its own words, one may still assess the adequacy of this relationship. For this reason, one may not defend scientific relativism in its strongest version.

For the reader of my other posts in this blog, this definition of theories of knowledge might sound familiar. Indeed it is highly related to theories of perception defended by Gibson, O’Regan and Varela, for example. After all, perception is a form of knowledge about the world. These authors have in common that they define perception in a relational way, the relationship between the actions of the organism in the world (driven by “theory”) and the effects of these actions on the organism (“tests” of the theory). This is in contrast with “neurophysiological subjectivism”, for which meaning is intrinsically produced by the brain (a closed system, in my drawing above) and “computational objectivism”, in which there is a pre-defined objective world (related to the idea of translation).

What is computational neuroscience? (VI) Deduction, induction, counter-induction

At this point, it should be clear that there is not a single type of theoretical work. I believe most theoretical work can be categorized into three broad classes: deduction, induction, and counter-induction. Deduction is deriving theoretical knowledge from previous theoretical knowledge, with no direct reference to empirical facts. Induction is the process of making a theory that accounts for the available empirical data, in general in a parsimonious way (Occam’s razor). Counter-induction is the process of making a theory based on non-empirical considerations (for example philosophical principles or analogy) or on a subset of empirical observations that are considered significant, and re-interpreting empirical facts so that they agree with the new theory. Note that 1) all these processes may lead to new empirical predictions, 2) a given line of research may use all three types of processes.

For illustration, I will discuss the work done in my group on the dynamics of spike threshold (see these two papers with Jonathan Platkiewicz: “A Threshold Equation for Action Potential Initiation” and “Impact of Fast Sodium Channel Inactivation on Spike Threshold Dynamics and Synaptic Integration”). It is certainly not the most well-known line of research and therefore it will require some explanation. However, since I know it so well, it will be easier to highlight the different types of theoretical thinking – I will try to show how all three types of processes were used.

I will first briefly summarize the scientific context. Neurons communicate with each other by spikes, which are triggered when the membrane potential reaches a threshold value. It turns out that, in vivo, the spike threshold is not a fixed value even within a given neuron. Many empirical observations show that it depends on the stimulation, and on various aspects of the previous activity of the neuron, e.g. its previous membrane potential and the previously triggered spikes. For example, the spike threshold tends to be higher when the membrane potential was previously higher. By induction, one may infer that the spike threshold adapts to the membrane potential. One may then derive a first-order differential equation describing the process, in which the threshold adapts to the membrane potential with some characteristic time constant. Such phenomenological equations have been proposed in the past by a number of authors, and they are qualitatively consistent with a number of properties seen in the empirical data. But note that an inductive process can only produce a hypothesis. The data could be explained by other hypotheses. For example, the threshold could be modulated by an external process, say inhibition targeted at the spike initiation site, which would co-vary with the somatic membrane potential. However, the hypothesis could potentially be tested. For example, an experiment could be done in which the membrane potential is actively modified by an electrode injecting current: if threshold modulation is external, spike threshold should not be affected by this perturbation. So an inductive process can be a fruitful theoretical methodology.
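To make the phenomenological equation concrete, here is a minimal sketch of a first-order threshold adaptation model. The linear steady-state relation, the function name and all parameter values are assumptions chosen for illustration; they are not taken from the papers cited above.

```python
def simulate_threshold(v_trace, dt=0.1, tau=5.0, theta0=-55.0,
                       alpha=0.5, v_rest=-70.0):
    """Euler integration of tau * dtheta/dt = theta_inf(V) - theta.

    Assumed (illustrative) steady state:
        theta_inf(V) = theta0 + alpha * (V - v_rest)
    Units are mV for potentials and ms for dt and tau.
    """
    theta = theta0 + alpha * (v_trace[0] - v_rest)  # start at steady state
    thetas = []
    for v in v_trace:
        theta_inf = theta0 + alpha * (v - v_rest)
        theta += dt * (theta_inf - theta) / tau
        thetas.append(theta)
    return thetas


# Membrane potential steps from -70 mV to -60 mV after 20 ms.
v_trace = [-70.0] * 200 + [-60.0] * 800
thetas = simulate_threshold(v_trace)
```

In this sketch the threshold stays put while the potential is constant, then relaxes towards a higher value with time constant `tau` once the neuron is depolarized, which is the qualitative behavior described above.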

In our work with Jonathan Platkiewicz, we started from this inductive insight, and then followed a deductive process. The biophysics of spike initiation is described by the Hodgkin-Huxley equations. Hodgkin and Huxley got the Nobel prize in 1963 for showing how ionic mechanisms interact to generate spikes in the squid giant axon. They used a quantitative model (four differential equations) that they fitted to their measurements. They were then able to accurately predict the velocity of spike propagation along the axon. As a side note, this mathematical model, which explicitly refers to ionic channels, was established long before these channels could be directly observed (by Neher and Sakmann, who then also got the Nobel prize in 1991). Thus this discovery was not data-driven at all, but rather hypothesis-driven.

In the Hodgkin-Huxley model, spikes are initiated by the opening of sodium channels, which let a positive current enter the cell when the membrane potential is high enough, triggering a positive feedback process. These channels also inactivate (more slowly) when the membrane potential increases, and when they inactivate the spike threshold increases. This is one mechanism by which the spike threshold can adapt to the membrane potential. Another way, in the Hodgkin-Huxley equations, is by the opening of potassium channels when the membrane potential increases. In this model, we then derived an equation describing how the spike threshold depends on these ionic channels, and then a differential equation describing how it evolves with the membrane potential. This is a purely deductive process (which also involves approximations), and it also predicts that the spike threshold adapts to the membrane potential. Yet it provides new theoretical knowledge, compared to the inductive process. First, it shows that threshold adaptation is consistent with Hodgkin-Huxley equations, an established biophysical theory. This is not so surprising, but given that other hypotheses could be formulated (see e.g. the axonal inhibition hypothesis I mentioned above), it strengthens this hypothesis. Secondly, it shows under what conditions on ionic channel properties the theory can be consistent with the empirical data. This provides new ways to test the theory (by measuring ionic channel properties) and therefore increases its empirical content. Thirdly, the equation we proposed is slightly different from those previously proposed by induction. That is, the theory predicts that the spike threshold only adapts above a certain potential, otherwise it is fixed. This is a prediction that is not obvious from the published data, and therefore could not have been made by a purely inductive process. 
Thus, a deductive process is also a fruitful theoretical methodology, even though it is in some sense “purely theoretical”, that is, accounting for empirical facts is not part of the theory-making process itself (except for motivating the work).
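The prediction that the threshold only adapts above a certain potential can be illustrated with a small sketch. It assumes a Boltzmann-shaped steady-state inactivation curve and a threshold that rises with the logarithm of the fraction of non-inactivated channels; the symbol names and parameter values are illustrative assumptions, and the exact equations should be taken from the papers themselves.

```python
import math


def steady_state_threshold(v, v_t=-55.0, v_i=-63.0, k_a=5.0, k_i=6.0):
    """Steady-state threshold (mV) as a function of membrane potential v (mV).

    Assumptions (illustrative):
      h_inf(v) = 1 / (1 + exp((v - v_i) / k_i))   Boltzmann inactivation
      theta    = v_t - k_a * log(h_inf(v))
    v_t: minimum threshold, v_i: half-inactivation voltage,
    k_a, k_i: activation and inactivation slope factors.
    """
    h_inf = 1.0 / (1.0 + math.exp((v - v_i) / k_i))
    return v_t - k_a * math.log(h_inf)


low = steady_state_threshold(-100.0)   # far below v_i: close to v_t
slope_high = (steady_state_threshold(-30.0)
              - steady_state_threshold(-40.0)) / 10.0  # close to k_a / k_i
```

Well below `v_i` the threshold is essentially fixed at `v_t`; well above it, the threshold grows roughly linearly with the membrane potential, with slope close to `k_a / k_i`.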

In the second paper, we also used a deductive process to understand what threshold adaptation implies for synaptic integration. For example, we show that incoming spikes interact at the timescale of threshold adaptation, rather than of the membrane time constant. Note how the goal of this theoretical work now is not to account for empirical facts or explain mechanisms, but to provide a new interpretative framework for these facts. The theory redefines what should be considered significant – in this case, the distance to threshold rather than the absolute membrane potential. This is an important remark, because it implies that theoretical work is not only about making new experimental predictions, but also about interpreting experimental observations and possibly orienting future experiments.

We then concluded the paper with a counter-inductive line of reasoning. Different ionic mechanisms may contribute to threshold adaptation, in particular sodium channel inactivation and potassium channel activation. We argued that the former was more likely, because it is more energetically efficient (the latter requires both sodium and potassium channels to be open and counteract each other, implying considerable ionic traffic). This argument is not empirical: it relies on the idea that neurons should be efficient based on evolutionary theory (a theoretical argument) and on the fact that the brain has been shown to be efficient in many other circumstances (an argument by analogy). It is not based on empirical evidence, and worse, it is contradicted by empirical evidence. Indeed, blocking Kv1 channels abolishes threshold dynamics. I then reason counter-inductively to make my theoretical statement compatible with this observation. I first note that removing the heart of a man prevents him from thinking, but it does not imply that thoughts are produced by the heart. This is an epistemological argument (discarding the methodology as inappropriate). Secondly, I was told by a colleague (unpublished observation) that suppressing Kv1 moves the spike initiation site to the node of Ranvier (discarding the data as being irrelevant or abnormal). Thirdly, I can quantitatively account for the results with our theory, by noting that suppressing any channel can globally shift the spike threshold and possibly move the minimum threshold below the half-inactivation voltage of sodium channels, in which case there is no more threshold variability. These are three counter-inductive arguments that are perfectly reasonable. One might not be convinced by them, but they cannot be discarded as being intrinsically wrong. Since it is possible that I am right, counter-inductive reasoning is a useful scientific methodology. 
Note also how counter-inductive reasoning can suggest new experiments, for example testing whether suppressing Kv1 moves the initiation site to the node of Ranvier.

In summary, there are different types of theoretical work. They differ not so much in content as in methodology: deduction, induction and counter-induction. All three types of methodologies are valid and fruitful, and they should be recognized as such, noting that they have different logics and possibly different aims.


Update. It occurred to me that I use the word “induction” to refer to the making of a law from a series of observations, but it seems that this process is often subdivided in two different processes, induction and abduction. In this sense, induction is the making of a law from a series of observations in the sense of “generalizing”: for example, reasoning by analogy or fitting a curve to empirical data. Abduction is the finding of a possible underlying cause that would explain the observations. Thus abduction is more creative and seems more uncertain: it is the making of a hypothesis (among other possible hypotheses), while induction is rather the direct generalization of empirical data together with accepted knowledge. For example, data-driven neural modeling is a sort of inductive process. One builds a model from measurements and implicit accepted knowledge about neural biophysics – which generally comes with an astounding number of implicit hypotheses and approximations, e.g. electrotonic compactness or the idea that ionic channel properties are similar across cells and related species. The model accounts for the set of measurements, but it also predicts responses in an infinite number of situations. In my view, induction is the weakest form of theoretical process because there is no attempt to go beyond the data. Empirical data are seen as a series of unconnected weather observations that just need to be included in the already existing theory.

What is computational neuroscience? (V) A side note on Paul Feyerabend

Paul Feyerabend was a philosopher of science who defended an anarchist view of science (in his book “Against Method”). That is, he opposed the idea that there should be methodologies imposed in science, because he considered that these are the expression of conservatism. One may not agree with all his conclusions (some think of him as defending relativistic views), but his arguments are worth considering. By looking at the Copernican revolution, Feyerabend makes a strong case that the methodologies proposed by philosophers (e.g. falsificationism) have failed both as a description of scientific activity and as a prescription of "good" scientific activity. That is, in the history of science, new theories that ultimately replace established theories are initially in contradiction with established scientific facts. If they had been judged by the standards of falsificationism for example, they would have been immediately falsified. Yet the Copernican view (the Earth revolves around the sun) ultimately prevailed over the Ptolemaic system (the Earth is at the center of the universe). Galileo firmly believed in heliocentrism not because of empirical reasons (it did not explain more data) but because it “made more sense”, that is, it seemed like a more elegant explanation of the apparent trajectories of planets. See e.g. the picture below (taken from Wikipedia) showing the motion of the Sun, the Earth and Mars in both systems:

It appears clearly in this picture that there is no more empirical content in the heliocentric view, but it seems more satisfactory. At the time though, heliocentrism could be easily disproved with simple arguments, such as the tower argument: when a stone falls from the top of a tower, it falls right beneath it, while it should be “left behind” if the Earth were moving. This is a solid empirical fact, easily reproducible, which falsifies heliocentrism. It might seem foolish to us today, but it does so only because we know that the Earth moves. If we look again at the picture above, we see two theories that both account for the apparent trajectories of planets, but the tower argument corroborates geocentrism while it falsifies heliocentrism. Therefore, so Feyerabend concludes, scientific methodologies that are still widely accepted today (falsificationism) would immediately discard heliocentrism. It follows that these are not only a poor description of how scientific theories are made, but they are also a dangerous prescription of scientific activity, for they would not allow the Copernican revolution to occur.

Feyerabend then goes on to argue that the development of new theories follows a counter-inductive process. This, I believe, is a very deep observation. When a new theory is introduced, it is initially in contradiction with a number of established scientific facts, such as the tower argument. Therefore, the theory develops by making the scientific facts agree with the theory, for example by finding an explanation for the fact that the stone falls right beneath the point where it was dropped. Note that these explanations may take a lot of time to be made convincingly, and that they do not constitute the core of the theory. This stands in sharp contrast with induction, in which a theory is built so as to account for the known facts. Here it is the theory itself (e.g. a philosophical principle) that is considered true, while the facts are re-interpreted so as to agree with it.

I want to stress that these arguments do not support relativism, i.e., the idea that all scientific theories are equally valid, depending on the point of view. To make this point clearly, I will make an analogy with a notion that is familiar to physicists, energy landscape:

This is very schematic but perhaps it helps make the argument. In the picture above, I represent on the vertical axis the amount of disagreement between a theory (on the horizontal axis) and empirical facts. This disagreement could be seen as the “energy” that one wants to minimize. The standard inductive process consists in incrementally improving a theory so as to minimize this energy (a sort of “gradient descent”). This process may stabilize into an established theory (the “current theory” in the picture). However, it is very possible that a better theory, empirically speaking, cannot be developed by this process, because it requires a change in paradigm, something that cannot be obtained by incremental changes to the established theory. That is, there is an “energy barrier” between the two theories. Passing through this barrier requires an empirical regression, in which the newly introduced theory is initially worse than the current theory in accounting for the empirical facts.
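The analogy can be played out numerically with a toy landscape. The double-well function below is purely hypothetical (it measures nothing about any real theory); it only shows that incremental descent stays in the nearest valley even when a deeper one exists on the other side of a barrier.

```python
def disagreement(x):
    # Hypothetical "energy": a tilted double well with a shallow valley
    # near x = -1 and a deeper valley near x = +1.
    return (x ** 2 - 1.0) ** 2 - 0.3 * x + 1.0


def slope(x):
    # Derivative of disagreement(x) with respect to x.
    return 4.0 * x * (x ** 2 - 1.0) - 0.3


def descend(x, rate=0.01, steps=5000):
    """Plain gradient descent: repeated small moves downhill."""
    for _ in range(steps):
        x -= rate * slope(x)
    return x


current_theory = descend(-1.2)  # incremental improvement from the left
better_theory = descend(0.5)    # starting point already past the barrier
```

Starting from the left, descent settles in the shallow valley, where the disagreement remains higher than in the valley on the right. Reaching the right valley from there requires going uphill first, which is the “empirical regression” of the analogy.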

This analogy illustrates the idea that it can be necessary to temporarily deviate from the empirical facts so as to ultimately explain more of them. This does not mean that empirical facts do not matter, but simply that explaining more and more empirical facts should not be elevated to the rank of “the good scientific methodology”. There are other scientific processes that are both valid as methodologies and necessary for scientific progress. I believe this is how the title of Feyerabend’s book, “Against Method”, should be understood.

What is computational neuroscience? (IV) Should theories explain the data?

Since there is such an obvious answer, you might anticipate that I am going to question it! More precisely, I am going to analyze the following statement: a good theory is one that explains the maximum amount of empirical data while being as simple as possible. I will argue that 1) this is not stupid at all, but that 2) it cannot be a general criterion to distinguish good and bad theories, and finally that 3) it is only a relevant criterion for orthodox theories, i.e., theories that are consistent with theories that produced the data. The arguments are not particularly original, I will mostly summarize points made by a number of philosophers.

First of all, given a finite set of observations, there are an infinite number of universal laws that agree with the observations, so the problem is underdetermined. This is the skeptic criticism of inductivism. Which theory to choose then? One approach is "Occam's razor", i.e., the idea that among competing hypotheses, the most parsimonious one should be preferred. But of course, Karl Popper and others would argue that it cannot be a valid criterion to distinguish between theories, because it could still be that the more complex hypothesis predicts future observations better than the simpler hypothesis - there is just no way to know without doing the new experiments. Yet it is not absurd as a heuristic to develop theories. This is a known fact in the field of machine learning for example, related to the problem of "overfitting". If one wants to describe the relationship between two quantities x and y, from a set of n examples (xi,yi), one could fit a polynomial of degree n-1 that passes exactly through all the data points. It would completely explain the data, yet it would be very unlikely to fit a new example. In fact, a lower-dimensional relationship is more likely to account for new data, and this can be shown more rigorously with the tools of statistical learning theory. Thus there is a trade-off between how much of the data is accounted for and the simplicity of the theory. So, Occam's Razor is actually a very sensible heuristic to produce theories. But it should not be confused with a general criterion to discard theories.
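The overfitting argument can be checked on a toy data set. In the sketch below, the underlying law (y = 2x + 1) and the noise values are invented for illustration: an interpolating polynomial explains the six examples exactly, while a straight line explains them only approximately, and the two are then compared on a new example.

```python
def interpolating_polynomial(xs, ys, x):
    """Value at x of the unique polynomial passing through all (xs, ys).

    Uses the Lagrange form; with n points the polynomial has degree n - 1.
    """
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


def fitted_line(xs, ys, x):
    """Value at x of the least-squares straight line through (xs, ys)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
             / sum((a - mean_x) ** 2 for a in xs))
    return mean_y + slope * (x - mean_x)


# Invented data: y = 2x + 1 plus small hand-picked "measurement noise".
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
noise = [0.3, -0.4, 0.5, -0.3, 0.4, -0.5]
ys = [2.0 * x + 1.0 + e for x, e in zip(xs, noise)]

new_x, new_y = 6.0, 13.0  # a new example from the same law, without noise
poly_error = abs(interpolating_polynomial(xs, ys, new_x) - new_y)
line_error = abs(fitted_line(xs, ys, new_x) - new_y)
```

On this data set the polynomial reproduces every example exactly but misses the new one by a wide margin, whereas the line stays close to it. This is only one hand-made example, not a proof; the general statement belongs to statistical learning theory.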

The interim conclusion is: a theory should account for the data, but not at the expense of being as complicated as the data itself. Now I will make criticisms that are deeper, and mostly based on post-Popper philosophers such as Kuhn, Lakatos and Feyerabend. In a nutshell, the argument is that insisting that a theory should explain empirical data is a kind of inversion of what science is about. Science is about understanding the real world, by making theories and testing them with carefully designed experiments. These experiments are usually done using conditions that are very unecological, and this is justified by the fact that they are designed to test a specific hypothesis in a controlled way. For example, the laws of mechanics would be tested in conditions where there is no friction, a condition that actually almost never happens in the real world - and this is absolutely fine methodology. But then insisting that a new theory should be evaluated by how much it explains the empirical data is what I would call the "empiricist inversion": empirical data were produced, using very peculiar conditions justified by the theory that motivated the experiments, and now we demand that any theory should explain this data. One obvious point, which was made by Kuhn and Feyerabend, is that it gives a highly unfair advantage to the first theory, just because it was there first. But it is actually worse than this, because it also means that the criterion to judge theories is now disconnected from what was meant to be explained in the first place by the theory that produced the data. Here is the empiricist inversion: we consider that theories should explain data, when actually data is produced to test theories. What a theory is meant to explain is the world; data is only used as a methodological tool to test theories of the world.

In summary, this criterion then tends to produce theories of data, not theories of the world. This point in fact relates to the arguments of Gibson, who criticized psychological research for focusing on laboratory stimuli rather than ecological conditions. Of course simplified laboratory stimuli are used to control experiments precisely, but it should always be kept in mind that these simplified stimuli are used as methodological tools and not as the things that are meant to be explained. In neural modeling, I find that many models are developed to explain experimental data, ignoring the function of the models (i.e., the “computational level” in Marr’s analysis framework). In my view, this is characteristic of the empiricist inversion, which results in models of the data, not models of the brain.

At this point, my remarks might start being confusing. On the one hand I am saying that it is a good idea to try to account for the data with a simple explanation, on the other hand I am saying that we should not care so much about the data. These seemingly contradictory statements can still make sense because they apply to different types of theories. This is related to what Thomas Kuhn termed “normal science” and “revolutionary science”. These terms might sound a bit too judgmental so I will rather speak of “orthodox theories” and “non-orthodox theories”. The idea is that science is structured by paradigm shifts. Between such shifts, a central paradigm dominates. Data are obtained through this paradigm, anomalies are also explained through this paradigm (rather than being seen as falsifications), and a lot of new scientific results are produced by “puzzle solving”, i.e., trying to explain data. At some point, for various reasons (e.g. too many unexplained anomalies), the central paradigm shifts to a new one and the process starts again, but with new data, new methods, or new ways to look at the observations.

“Orthodox theories” are theories developed within the central paradigm. These try to explain the data obtained with this paradigm, the “puzzle-solving” activity. Here it makes sense to consider that a good theory is a simple explanation of the empirical data. But this kind of criterion cannot explain paradigm shifts. A paradigm shift requires the development of non-orthodox theories, for which the existing empirical data may not be adequate. Therefore the making of non-orthodox theories follows a different logic. Because the existing data were obtained with a different paradigm, these theories are not driven by the data, although they may be motivated by some anomalous set of data. For example they may be developed from philosophical considerations or by analogy. The logic of their construction might be better described by counter-induction rather than induction (a concept proposed by Feyerabend). That is, their development starts from a theoretical principle, rather than from data, and existing data are deconstructed so as to fit the theory. By this process, implicit assumptions of the central paradigm are uncovered, and this might ultimately trigger new experiments and produce new experimental data that may be favorable to the new theory.

Recently, there have been a lot of discussions in the fields of neuroscience and computational neuroscience about the availability of massive amounts of data. Many consider it as a great opportunity, which should change the way we work and build models. It certainly seems like a good thing to have more data, but I would like to point out that it mostly matters for the development of orthodox theories. Putting too much emphasis (and resources) on it also raises the danger of driving the field away from non-orthodox theories, which in the end are the ones that bring scientific revolutions (with the caveat that of course most non-orthodox theories turn out to be wrong). Being myself unhappy with current orthodox theories in neuroscience, I see this danger as quite significant.

This was a long post and I will now try to summarize. I started with the provocative question: should a theory explain the data? First of all, a theory that explains every single bit of data is an enumeration of data, not a theory. It is unlikely to predict any new significant fact. This point is related to overfitting or the “curse of dimensionality” in statistical learning. A better theory is one that explains a lot of the data with a simple explanation, a principle known as Occam’s razor. However, this criterion should be thought of as a heuristic to develop theories, not a clear-cut general decision criterion between theories. In fact, this criterion is relevant mostly for orthodox theories, i.e., those theories that follow the central paradigm with which most data have been obtained. Non-orthodox theories, on the other hand, cannot be expected to explain most of the data obtained through a different paradigm (at least initially). It can be seen that in fact they are developed through a counter-inductive process, by which data are made consistent with the theory. This process may fail to produce new empirical facts consistent with the new theory (most often) or it may succeed and subsequently become the new central paradigm - but this is usually a long process.

What is computational neuroscience? (III) The different kinds of theories in computational neuroscience

Before I try to answer the questions I asked at the end of the previous post, I will first describe the different types of approaches in computational neuroscience. Note that this does not cover everything in theoretical and quantitative neuroscience (see my first post).

David Marr, a very important figure in computational neuroscience, proposed that cognitive systems can be described at three levels:

1) The computational level: what does the system do? (for example: estimating the location of a sound source)

2) The algorithmic/representational level: how does it do it? (for example: by calculating the maximum of cross-correlation between the two monaural signals)

3) The physical level: how is it physically realized? (for example: with axonal delay lines and coincidence detectors)
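The algorithmic-level example (finding the maximum of the cross-correlation between the two monaural signals) can be sketched in a few lines. The signals here are synthetic noise and the delay is expressed in samples; the function name and parameters are illustrative, not part of any particular model.

```python
import random


def best_lag(left, right, max_lag):
    """Lag d (in samples) maximizing sum over t of left[t] * right[t + d]."""
    def xcorr(lag):
        return sum(left[t] * right[t + lag]
                   for t in range(len(left))
                   if 0 <= t + lag < len(right))
    return max(range(-max_lag, max_lag + 1), key=xcorr)


# Synthetic example: the right-ear signal is the left-ear signal
# delayed by 7 samples (zero-padded at the start).
rng = random.Random(0)
left = [rng.gauss(0.0, 1.0) for _ in range(500)]
true_delay = 7
right = [0.0] * true_delay + left[:-true_delay]

estimated_delay = best_lag(left, right, max_lag=20)
```

Dividing the estimated lag by the sampling rate gives the interaural time difference. Note that this sketch says nothing about the physical level: the same computation could be realized with delay lines and coincidence detectors, or in some other way.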

Theories in computational neuroscience differ by which level is addressed, and by the postulated relationships between the three levels (see also my related post).

David Marr considered that these three levels are independent. Francisco Varela described this view as “computational objectivism”. This means that the goal of the computation is defined in terms that are external to the organism. The two other levels describe how this goal is achieved, but they have no influence on what is achieved. It is implied that evolution shapes levels 2 and 3 by imposing the first level. It is important to realize that theories that follow this approach necessarily start from the highest level (defining the object of information processing), and only then analyze the lower levels. Such approaches can be restricted to the first level, or the first two levels, but they cannot address only the third level, or the second level, because these are defined by the higher levels. It can be described as a “top-down” approach.

The opposite view is that both the algorithmic and computational levels derive from the physical level, i.e., they emerge from the interactions between neurons. Varela described it as “neurophysiological subjectivism”. In this view, one would start by analyzing the third level, and then possibly go up to the higher levels – this is a “bottom-up” approach. This is the logic followed by data-driven approaches that I criticized in my first post. I criticized it because this view fails to acknowledge the fact that living beings are intensely teleonomic, i.e., the physical level serves a project (invariant reproduction, in the words of Jacques Monod). This is not to say that function is not produced by the interaction of neurons – it has to be, in a materialistic view. But as a method of scientific inquiry, analyzing the physical level independently of the higher levels, as if it were a non-living object (e.g. a gas), does not seem adequate – at the very least it seems highly optimistic. As far as I know, this type of approach has produced theories of neural dynamics, rather than theories of neural computation. For example, showing how oscillations or some other large scale aspect of neural networks might emerge from the interaction of neurons. In other words, in Marr’s hierarchy, such studies are restricted to the third level. Therefore, I would categorize them as theoretical neuroscience rather than computational neuroscience.

These two opposite views roughly correspond to externalism and internalism in philosophy of perception. It is important to realize that these are important philosophical distinctions, which have considerable epistemological implications, in particular on what is considered a “realistic” model. Computational objectivists would insist that a biological model must serve a function, otherwise it is simply not about biology. Neurophysiological subjectivists would insist that the models must agree with certain physiological experiments, otherwise they are empirically wrong.

There is another class of approaches in philosophy of perception, which can be seen as intermediate between these two, the embodied approaches. These consider that the computational level cannot be defined independently of the physical level, because the goal of computation can only be defined in terms that are accessible to the organism. In the more external views (Gibson/O’Regan), this means that the computational level actually includes the body, but the neural implementation is seen as independent from the computational level. For example, in Gibson’s ecological approach and in O’Regan’s sensorimotor theory, the organism looks for information about the world implicit in its sensorimotor coupling. This differs quite substantially from computational objectivism in the way the goal of the computation is defined. In computational objectivism, the goal is defined externally. For example: to estimate the angle between a sound source and the head. Sensorimotor theories acknowledge that the notion of “angle” is one of an external observer with some measurement apparatus, it cannot be one of an organism. Instead in sensorimotor approaches, direction is defined subjectively (contrary to computational objectivism), but still in reference to an external world (contrary to neurophysiological subjectivism), as the self-generated movement that would make the sound move to the front (an arbitrary reference point). In the more internal views (e.g. Varela), the notion of computation itself is questioned, as it is considered that the goal is defined by the organism itself. This is Varela’s concept of autopoiesis, according to which a living entity acts so as to maintain its own organization. “Computation” is then a by-product of this process. This last class of approaches is currently less developed in computational neuroscience.

The three types of approaches I have described are mostly about the relationships between the computational and physical levels, and they are tightly linked with different views in philosophy of perception. There is also another dividing line between neural computation theories, which has to do with the relationship between the algorithmic and physical levels. This is related to the rate-based vs. spike-based theories of neural computation (see my series of posts on the subject).

In Marr’s view and in general in rate-based views, the algorithmic and physical levels are mostly independent. Because algorithms are generally described in terms of calculus with analog values, spikes are generally seen as implementing analog calculus. In other words, spikes only reflect an underlying analog quantity, the firing rate of a neuron, on which the algorithms are defined. The usual view is that spikes are produced randomly with some probability reflecting the underlying rate (an abstract quantity).
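As an illustration of the rate-based view, here is a minimal sketch in which spikes are drawn at random from an underlying rate. It uses one independent draw per small time bin, a common approximation of a Poisson process; the function name and the parameter values are chosen for the example only.

```python
import random


def random_spike_train(rate_hz, duration_s, dt=0.001, seed=0):
    """Spike times (s) drawn independently in each bin of width dt.

    In each bin a spike occurs with probability rate_hz * dt, which
    approximates a Poisson process when rate_hz * dt is small.
    """
    rng = random.Random(seed)
    n_bins = round(duration_s / dt)
    return [i * dt for i in range(n_bins) if rng.random() < rate_hz * dt]


spikes = random_spike_train(rate_hz=20.0, duration_s=50.0)
estimated_rate = len(spikes) / 50.0  # spike count divided by duration
```

In this picture the individual spike times carry no meaning of their own: only the rate, recovered here by counting spikes over a long window, enters the algorithm.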

On the contrary, another view holds that algorithms are defined at the level of spikes, not of rates. Such theories include the idea of binding by synchrony (Singer/von der Malsburg), in which neural synchrony is the signature of a coherent object, the related idea of synfire chains (Abeles), and more recently the theories developed by Sophie Denève and by myself (there is also Thorpe’s rank-order coding theory, but it is more on the side of coding than computation). In the latter two theories (Denève’s and mine), spiking is seen as a decision. In Denève’s approach, the neuron spikes so as to reduce an error criterion. In my recent paper on computing with synchrony, the neuron spikes when it observes unlikely coincidences, which signals some invariant structure (in the sense of Gibson). In both cases, the algorithm is defined directly at the level of spikes.

In summary: theories of neural computation can be classified according to the implicit relationships between the three levels of analysis described by Marr. It is important to realize that these are not purely scientific differences (by this, I mean not simply about empirical disputes), but really philosophical and/or epistemological differences. In my view this is a big issue for the peer-reviewing system, because it is difficult to have a paper accepted when the reviewers or editors do not share the same epistemological views.

What is computational neuroscience? (II) What is theory good for?

To answer this question, I need to write about basic notions of epistemology (the philosophy of knowledge). Epistemology is concerned in particular with what knowledge is and how it is acquired.

What is knowledge? Essentially, knowledge is statements about the world. There are two types of statements. First, there are specific statements or “observations”, for example, “James has two legs”. Second, there are universal statements, such as “All men have two legs”: such a statement applies to an infinite number of observations, about men I have seen but also about men I might see in the future. We also call universal statements “theories”.

How is knowledge acquired? The naive view, classical inductivism, consists in collecting a large number of observations and generalizing from them. For example, one notes that all the men one has seen so far have two legs, and concludes that all men have two legs. Unfortunately, inductivism cannot produce universal statements with certainty. It is quite possible that one day you will see a man with only one leg. The problem is that there are always an infinite number of universal statements that are consistent with any finite set of observations. For example, you can continue a finite sequence of numbers with any numbers you want, and the result is still a possible sequence of numbers.
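The sequence example can be sketched in a few lines. The code below assumes that a "rule" is a polynomial (one convenient family among many) and uses Lagrange interpolation to build, for any value you choose as the fifth term, a rule that agrees with the four observed terms. The observed sequence and the names `lagrange_value` and `rule_with_next` are hypothetical, chosen only for illustration.

```python
from fractions import Fraction


def lagrange_value(points, x):
    """Evaluate at x the lowest-degree polynomial passing through all (x, y) points."""
    total = Fraction(0)
    for i, (xi, yi) in enumerate(points):
        term = Fraction(yi)
        for j, (xj, _) in enumerate(points):
            if i != j:
                term *= Fraction(x - xj, xi - xj)
        total += term
    return total


# Four observations: the sequence 1, 2, 3, 4 at positions 1 to 4.
observed = [(1, 1), (2, 2), (3, 3), (4, 4)]


def rule_with_next(next_value):
    """Return a polynomial rule that fits every observation and predicts next_value at position 5."""
    points = observed + [(5, next_value)]
    return lambda x: lagrange_value(points, x)


rule_a = rule_with_next(5)   # the "obvious" continuation
rule_b = rule_with_next(42)  # an arbitrary continuation

for x, y in observed:
    # Both rules reproduce every observation exactly.
    print(x, y, rule_a(x), rule_b(x))
print("predictions at position 5:", rule_a(5), rule_b(5))
```

Both rules are perfectly consistent with the data, yet they disagree about every future observation. Nothing in the four observations favors one over the other.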

Therefore, inductivism cannot guide the development of knowledge. Karl Popper, probably the most influential philosopher of science of the twentieth century, proposed to solve this problem with the notion of falsifiability. What distinguishes a scientific statement from a metaphysical statement is that it can be disproved by an experiment. For example, “all men have two legs” is a scientific statement, because the theory could be disproved by observing a man with one leg. But “there is a God” is not a scientific statement. This is not to say that these statements are true or not true, but that they have a scientific nature or not (but note that, by definition, a metaphysical statement can have no predictable impact on any of our experience, otherwise this would produce a test of that statement).

Popper’s concept of falsifiability has had a huge influence on modern science, and it essentially determines what we call “experimental work” and “theoretical work”. In Popper’s view, an experiment is an empirical test designed to falsify a theory. More generally, it is a situation for which different theories predict different outcomes. Note how this concept is different from the naive idea of “observing the laws of nature”. Laws of nature cannot be “observed” because an experiment is a single observation, whereas a law is a universal statement. Therefore, from a logical standpoint, the role of an experiment is rather to distinguish between otherwise consistent theories.

The structure of a typical experimental paper follows this logic: 1) Introduction, in which the theoretical issues are presented (the different hypotheses about some specific subject), 2) Methods, in which the experiment is described in detail, so as to be reproducible, 3) Results, in which the outcomes are presented, 4) Discussion, in which the outcomes are shown to corroborate or invalidate various theories. Thus, an experimental paper is about formulating and performing a critical test of one, or usually several, theories.

Popper’s line of thinking seems to imply that knowledge can only progress through experimental work. Indeed theories can either be logically consistent or inconsistent, so there is no way to distinguish between logically consistent theories. Only empirical tests can corroborate or invalidate theories, and therefore produce knowledge. Hence the occasional demeaning comments that any theoretician has heard, around the idea that theories are mind games for a bunch of smart math-oriented people. That is, theory is useless since only empirical work can produce scientific knowledge.

This is a really paradoxical remark, for theory is the goal of scientific progress. Science is not about accumulating data, it is about finding the laws of nature, a.k.a. theories. It is precisely the predictive nature of science that makes it useful. How can it be that science is about making theories, but that science can only progress through empirical work?

Maybe this is a misunderstanding of Popper’s reasoning. Falsifiability is about how to distinguish between theories. It clarifies what empirical work is about, and what distinguishes science from metaphysics. But it says nothing about how theories are formulated in the first place. Falsifiability is about empirical validation of theories, not about the mysterious process of making theories, which we might say is the “hard problem” of philosophy of science. Yet making theories is a central part of the development of science. Without theory, there is simply no experiment to be done. But more importantly, science is made of theories.

So I can now answer the question I started with. Theories constitute the core of any science. Theoretical work is about the development of theories. Experimental work is about the testing of theories. Accordingly, theoretical papers are organized quite differently from experimental papers, because the methodology is very different, but also because there is no normalized methodology (“how it should be”). A number of computational journals insist on enforcing the structure of experimental papers (introduction / methods / results / discussion), but I believe this is due to the view that simulations are experiments (Winsberg, Philosophy of Science 2001), which I will discuss in another post.

Theory is often depicted as speculative. This is quite right. Theory is, in essence, speculative, since it is about making universal statements. But this does not mean that theory is nonsense. Theories are usually developed so as to be consistent with a body of experimental data, i.e., they have an empirical basis. Biological theories also often include a teleonomic element, i.e., they “make sense”. These two elements impose hard constraints on theories. In fact, they are so constraining that I do not know of any theory that is consistent with all (or even most) experimental data and that makes sense in a plausible ecological context. So theory making is about finding principled ways to explain existing data, and at the same time to explain biological function. Because this is such a difficult task, theoretical work can have some autonomy, in the sense that it can produce knowledge in the absence of new empirical work.

This last point is worth stressing, because it departs significantly from the standard Popperian view of scientific progress, which makes it a source of misunderstandings between theoreticians and experimenters. I am referring to the complexity of biological organisms, shaped by millions of years of evolution. Biological organisms are made of physical things that we understand at some level (molecules), but at the same time they serve a project (the global project being reproductive invariance, in the words of Jacques Monod). That they serve a project is not the simple result of the interaction of these physical elements; rather, it is the result of evolutionary pressure. This means that even though on one hand we understand physics, or biophysics, to a high degree of sophistication, and on the other hand there are well established theories of biological function, there still is a huge explanatory gap between the two. This gap is largely theoretical, in the sense that we are looking for a way to make these two aspects logically consistent. This is why I believe theoretical work is so important in biology. It also has two consequences that can be hard to digest for experimenters: 1) theory can be autonomous to some extent (i.e., there can be “good” and “bad” theories, independently of new empirical evidence), 2) theoretical work is not necessarily aimed at making experimental predictions.

This discussion raises many questions that I will try to answer in the next posts:

- Why are theoretical and experimental journals separate?

- Should theories make predictions?

- Should theories be consistent with data?

- What is a “biologically plausible” model? And by the way, what is a model?

- Is simulation a kind of experiment?

What is computational neuroscience? (I) Definitions and the data-driven approach

What is computational neuroscience? Simply put, it is the field that is concerned with how the brain computes. The word “compute” is not necessarily an analogy with the computer, and it must be understood in a broad sense. It simply refers to the operations that must be carried out to perform cognitive functions (walking, recognizing a face, speaking). Put this way, it might seem that this is pretty much the entire field of neuroscience. What distinguishes computational neuroscience, then, is that this field seeks a mechanistic understanding of these operations, to the point that they could potentially be simulated on a computer. Note that this means neither that computational neuroscience is mostly about simulating the brain, nor that the brain is thought of as a computer. It simply refers to the materialistic assumption that, if all the laws that underlie cognition are known in detail, then it should be possible to artificially reproduce them (assuming sufficient equipment).

Another related terminology is “theoretical neuroscience”. This is somewhat broader than computational neuroscience, and is probably an analogy to theoretical physics, a branch of physics that relies heavily on mathematical models. Theoretical neuroscience is not necessarily concerned with computation, at least not directly. One example could be the demonstration that action potential velocity is proportional to diameter in myelinated axons, and to the square root of the diameter in unmyelinated axons. This demonstration uses cable theory, a biophysical theory describing the propagation of electrical activity in axons and dendrites.
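The two scaling laws can be written down as a small sketch. It encodes only the proportionality stated above: the prefactors, which depend on membrane and myelin properties, are deliberately left out, so the function returns a ratio of velocities rather than a velocity. The name `velocity_ratio` is a hypothetical choice.

```python
import math


def velocity_ratio(diameter_ratio, myelinated):
    """Factor by which conduction velocity changes when axon diameter is multiplied by diameter_ratio.

    Myelinated axons: velocity scales linearly with diameter.
    Unmyelinated axons: velocity scales with the square root of diameter.
    """
    if diameter_ratio <= 0:
        raise ValueError("diameter_ratio must be positive")
    return diameter_ratio if myelinated else math.sqrt(diameter_ratio)


# Quadrupling the diameter:
print("myelinated:  ", velocity_ratio(4.0, myelinated=True))
print("unmyelinated:", velocity_ratio(4.0, myelinated=False))
```

Nothing here is about computation in the sense of cognitive function; it is a statement about biophysics, which is why it counts as theoretical rather than computational neuroscience.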

“Quantitative neuroscience” also refers to the use of quantitative mathematical models as a tool to understand brain function or dynamics, but the substitution of “quantitative” for “theoretical” suggests that the field is more concerned with data analysis (as opposed to theories of how the brain works).

Finally, “neural modeling” is concerned with the use of quantitative neural models, in general biophysical models. The terminology suggests a data-driven approach, i.e., building models of neural networks from experimental measurements, based on existing theories. This is why I am somewhat uneasy with this terminology, for epistemological reasons. The data-driven approach implicitly assumes that it is possible and meaningful to build a functioning neural network from a set of measurements alone. This raises two critical issues. One is that it is based on what Francisco Varela called “neurophysiological subjectivism” (see this related post), the idea that perception is the result of neural network dynamics. Neurophysiological subjectivism is problematic because (in particular) it fails to fully recognize the defining property of living beings, which is teleonomy (in other words, function). Living organisms are constrained on one hand by their physical substrate, but on the other hand this substrate is tightly constrained by evolution – this is precisely what makes them living beings and not just spin glasses. The data-driven approach only considers the constraints deriving from measurements, not the functional constraints, but this essentially amounts to denying the fact that the object of study is part of a living being. Alternatively, it assumes that measurements are sufficiently constraining that function is entirely implied, which seems naive.

The second major issue with the data-driven approach is that it has a strong flavor of inductivism. That is, it implicitly assumes that a functioning model is directly implied by a finite set of measurements. But inductivism is a philosophical error, for there are an infinite number of theories (or “models”) consistent with any finite set of observations (an error pointed out by Hume, for example). In fact, Popper and his followers also noted that inductivism commits another philosophical error, which is to think that there is such a thing as a “pure observation”. Experimental results are always to be interpreted in a specific theoretical context (a.k.a. the “Methods” section). One does not “measure” a model. One performs a specific experiment and observes the outcome with tools, which are themselves based on currently accepted theories. In other words, an experimental result is the answer to a specific question. But the type of question is not “What is the time constant of the model?”, but rather “What exponential function can I best fit to the electrical response of this neuron to a current pulse?”. Measurements may then provide constraints on possible models, but they never imply a model. In addition, as I noted above, physical constraints (implied by measurements) are only one side of the story, functional constraints are the other side. Neglecting this other side means studying a “soup of neurons”, not the brain.
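The question “What exponential function can I best fit?” can be made explicit with a short sketch. It assumes the response decays back to rest after the end of a current pulse, with the resting potential already subtracted, and fits a single exponential by least squares on the logarithm of the trace. The trace is synthetic (generated in the code, not measured), and the names and numerical values are hypothetical.

```python
import math
import random


def fit_exponential_decay(times, values):
    """Least-squares fit of values ~ amplitude * exp(-times / tau), performed on log(values).

    Returns (amplitude, tau). All values must be strictly positive.
    """
    logs = [math.log(v) for v in values]
    n = len(times)
    mean_t = sum(times) / n
    mean_log = sum(logs) / n
    cov = sum((t - mean_t) * (l - mean_log) for t, l in zip(times, logs))
    var = sum((t - mean_t) ** 2 for t in times)
    slope = cov / var
    intercept = mean_log - slope * mean_t
    return math.exp(intercept), -1.0 / slope


# Synthetic trace: a 5 mV deflection decaying with a 20 ms time constant,
# sampled every 0.5 ms, with 1% multiplicative noise.
rng = random.Random(1)
true_amplitude_mv, true_tau_ms = 5.0, 20.0
times_ms = [0.5 * i for i in range(80)]
trace_mv = [
    true_amplitude_mv * math.exp(-t / true_tau_ms) * (1.0 + 0.01 * rng.gauss(0.0, 1.0))
    for t in times_ms
]

amplitude_mv, tau_ms = fit_exponential_decay(times_ms, trace_mv)
print("fitted amplitude (mV):", amplitude_mv)
print("fitted time constant (ms):", tau_ms)
```

The number returned is the answer to the fitting question, under the assumption that a single exponential is the right description. A different assumption (two exponentials, a different fitting window, a different error criterion) would yield a different “time constant” from the same trace.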

In summary, it is often stated or implied that “realistic” models are those that are based on measurements: this is 1) an inductivist mistake, 2) a tragic disregard of what defines living beings, i.e. functional constraints.

I will end this post by asking a question: what is a better description of the brain? A soup of “realistic” neurons or a more conceptual mechanistic description of how interacting neurons support cognitive functions?