What is computational neuroscience? (VI) Deduction, induction, counter-induction

At this point, it should be clear that there is not a single type of theoretical work. I believe most theoretical work can be categorized into three broad classes: deduction, induction, and counter-induction. Deduction is deriving theoretical knowledge from previous theoretical knowledge, with no direct reference to empirical facts. Induction is the process of making a theory that accounts for the available empirical data, in general in a parsimonious way (Occam’s razor). Counter-induction is the process of making a theory based on non-empirical considerations (for example philosophical principles or analogy) or on a subset of empirical observations that are considered significant, and re-interpreting empirical facts so that they agree with the new theory. Note that 1) all these processes may lead to new empirical predictions, 2) a given line of research may use all three types of processes.

For illustration, I will discuss the work done in my group on the dynamics of spike threshold (see these two papers with Jonathan Platkiewicz: “A Threshold Equation for Action Potential Initiation” and “Impact of Fast Sodium Channel Inactivation on Spike Threshold Dynamics and Synaptic Integration”). It is certainly not the most well-known line of research and therefore it will require some explanation. However, since I know it so well, it will be easier to highlight the different types of theoretical thinking – I will try to show how all three types of processes were used.

I will first briefly summarize the scientific context. Neurons communicate with each other by spikes, which are triggered when the membrane potential reaches a threshold value. It turns out that, in vivo, the spike threshold is not a fixed value even within a given neuron. Many empirical observations show that it depends on the stimulation, and on various aspects of the previous activity of the neuron, e.g. its previous membrane potential and the previously triggered spikes. For example, the spike threshold tends to be higher when the membrane potential was previously higher. By induction, one may infer that the spike threshold adapts to the membrane potential. One may then derive a first-order differential equation describing the process, in which the threshold adapts to the membrane potential with some characteristic time constant. Such phenomenological equations have been proposed in the past by a number of authors, and they are qualitatively consistent with a number of properties seen in the empirical data. But note that an inductive process can only produce a hypothesis. The data could be explained by other hypotheses. For example, the threshold could be modulated by an external process, say inhibition targeted at the spike initiation site, which would co-vary with the somatic membrane potential. However, the hypothesis could potentially be tested. For example, an experiment could be done in which the membrane potential is actively modified by an electrode injecting current: if threshold modulation is external, the spike threshold should not be affected by this perturbation. So an inductive process can be a fruitful theoretical methodology.
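In equation form, such a phenomenological description might read as follows (a minimal sketch; the notation and the affine form of the target threshold are my own choices for illustration, not taken from any particular paper):

```latex
\tau_\theta \frac{d\theta}{dt} = \theta_\infty(V) - \theta, \qquad \theta_\infty(V) = a\,V + b
```

where θ is the spike threshold, V the membrane potential, τ_θ the adaptation time constant, and a, b parameters fitted to data; a spike is triggered whenever V(t) reaches θ(t).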

In our work with Jonathan Platkiewicz, we started from this inductive insight, and then followed a deductive process. The biophysics of spike initiation is described by the Hodgkin-Huxley equations. Hodgkin and Huxley received the Nobel prize in 1963 for showing how ionic mechanisms interact to generate spikes in the squid giant axon. They used a quantitative model (four differential equations) that they fitted to their measurements. They were then able to accurately predict the velocity of spike propagation along the axon. As a side note, this mathematical model, which explicitly refers to ionic channels, was established well before these channels could be directly observed (by Neher and Sakmann, who then also got the Nobel prize in 1991). Thus this discovery was not data-driven at all, but rather hypothesis-driven.
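For reference, the four equations have the now-standard form (one for the membrane potential V, three for the gating variables m, h and n):

```latex
C_m \frac{dV}{dt} = I - \bar g_{\mathrm{Na}}\, m^3 h\,(V - E_{\mathrm{Na}}) - \bar g_{\mathrm{K}}\, n^4 (V - E_{\mathrm{K}}) - g_L (V - E_L),
\qquad
\frac{dx}{dt} = \alpha_x(V)(1 - x) - \beta_x(V)\,x, \quad x \in \{m, h, n\}
```

where m and h are the sodium activation and inactivation variables, n is the potassium activation variable, and the voltage-dependent rate functions α_x, β_x were fitted by Hodgkin and Huxley to their voltage-clamp measurements.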

In the Hodgkin-Huxley model, spikes are initiated by the opening of sodium channels, which let a positive current enter the cell when the membrane potential is high enough, triggering a positive feedback process. These channels also inactivate (more slowly) when the membrane potential increases, and when they inactivate the spike threshold increases. This is one mechanism by which the spike threshold can adapt to the membrane potential. Another way, in the Hodgkin-Huxley equations, is by the opening of potassium channels when the membrane potential increases. From this model, we then derived an equation describing how the spike threshold depends on these ionic channels, and then a differential equation describing how it evolves with the membrane potential. This is a purely deductive process (which also involves approximations), and it also predicts that the spike threshold adapts to the membrane potential. Yet it provides new theoretical knowledge, compared to the inductive process. First, it shows that threshold adaptation is consistent with the Hodgkin-Huxley equations, an established biophysical theory. This is not so surprising, but given that other hypotheses could be formulated (see e.g. the axonal inhibition hypothesis I mentioned above), it strengthens this hypothesis. Secondly, it shows under what conditions on ionic channel properties the theory can be consistent with the empirical data. This provides new ways to test the theory (by measuring ionic channel properties) and therefore increases its empirical content. Thirdly, the equation we proposed is slightly different from those previously proposed by induction: the theory predicts that the spike threshold only adapts above a certain potential, and is otherwise fixed. This is a prediction that is not obvious from the published data, and therefore could not have been made by a purely inductive process. Thus, a deductive process is also a fruitful theoretical methodology, even though it is in some sense “purely theoretical”, that is, accounting for empirical facts is not part of the theory-making process itself (except for motivating the work).
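To make this concrete, here is a minimal simulation sketch of an integrate-and-fire neuron with such an adaptive threshold. All parameter values are arbitrary, and the rectified-linear form of the target threshold is only meant to illustrate the qualitative prediction (adaptation above a fixed potential, constant threshold below it), not the exact equation derived in the papers:

```python
import numpy as np

# All parameter values are arbitrary, chosen for illustration only
dt = 0.1           # time step (ms)
tau_m = 10.0       # membrane time constant (ms)
tau_theta = 50.0   # threshold adaptation time constant (ms)
E_L = -70.0        # resting potential (mV)
theta0 = -55.0     # minimum threshold (mV)
V_i = -60.0        # potential above which the threshold adapts (mV)
k = 0.5            # adaptation strength (dimensionless)

rng = np.random.default_rng(0)
n_steps = 5000
V = np.full(n_steps, E_L)
theta = np.full(n_steps, theta0)
spike_times = []

for t in range(1, n_steps):
    I = 18.0 + 30.0 * rng.standard_normal()    # fluctuating drive (in mV, i.e. R*I)
    # Leaky integration of the membrane potential
    V[t] = V[t-1] + dt / tau_m * (E_L - V[t-1] + I)
    # The threshold relaxes towards a target that increases only above V_i
    theta_inf = theta0 + k * max(V[t-1] - V_i, 0.0)
    theta[t] = theta[t-1] + dt / tau_theta * (theta_inf - theta[t-1])
    if V[t] >= theta[t]:                       # spike and reset
        spike_times.append(t * dt)
        V[t] = E_L

print(f"{len(spike_times)} spikes; threshold ranged from "
      f"{theta.min():.1f} to {theta.max():.1f} mV")
```

Running this shows the threshold sitting at θ0 during hyperpolarized periods and rising when the membrane potential stays above V_i, which is the qualitative behavior described above.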

In the second paper, we also used a deductive process to understand what threshold adaptation implies for synaptic integration. For example, we show that incoming spikes interact at the timescale of threshold adaptation, rather than at the membrane time constant. Note how the goal of this theoretical work is now not to account for empirical facts or explain mechanisms, but to provide a new interpretative framework for these facts. The theory redefines what should be considered significant – in this case, the distance to threshold rather than the absolute membrane potential. This is an important remark, because it implies that theoretical work is not only about making new experimental predictions, but also about interpreting experimental observations and possibly orienting future experiments.

We then concluded the paper with a counter-inductive line of reasoning. Different ionic mechanisms may contribute to threshold adaptation, in particular sodium channel inactivation and potassium channel activation. We argued that the former was more likely, because it is more energetically efficient (the latter requires both sodium and potassium channels to be open and counteract each other, implying considerable ionic traffic). This argument is not empirical: it relies on the idea that neurons should be efficient based on evolutionary theory (a theoretical argument) and on the fact that the brain has been shown to be efficient in many other circumstances (an argument by analogy). It is not based on empirical evidence, and worse, it is contradicted by empirical evidence. Indeed, blocking Kv1 channels abolishes threshold dynamics. I then reason counter-inductively to make my theoretical statement compatible with this observation. I first note that removing the heart of a man prevents him from thinking, but it does not imply that thoughts are produced by the heart. This is an epistemological argument (discarding the methodology as inappropriate). Secondly, I was told by a colleague (unpublished observation) that suppressing Kv1 moves the spike initiation site to the node of Ranvier (discarding the data as being irrelevant or abnormal). Thirdly, I can quantitatively account for the results with our theory, by noting that suppressing any channel can globally shift the spike threshold and possibly move the minimum threshold below the half-inactivation voltage of sodium channels, in which case there is no more threshold variability. These are three counter-inductive arguments that are perfectly reasonable. One might not be convinced by them, but they cannot be discarded as being intrinsically wrong. Since it is possible that I am right, counter-inductive reasoning is a useful scientific methodology. Note also how counter-inductive reasoning can suggest new experiments, for example testing whether suppressing Kv1 moves the initiation site to the node of Ranvier.

In summary, there are different types of theoretical work. They differ not so much in content as in methodology: deduction, induction and counter-induction. All three types of methodologies are valid and fruitful, and they should be recognized as such, noting that they have different logics and possibly different aims.

 

Update. It occurred to me that I use the word “induction” to refer to the making of a law from a series of observations, but it seems that this process is often subdivided into two different processes, induction and abduction. In this sense, induction is the making of a law from a series of observations in the sense of “generalizing”: for example, reasoning by analogy or fitting a curve to empirical data. Abduction is the finding of a possible underlying cause that would explain the observations. Thus abduction is more creative and seems more uncertain: it is the making of a hypothesis (among other possible hypotheses), while induction is rather the direct generalization of empirical data together with accepted knowledge. For example, data-driven neural modeling is a sort of inductive process. One builds a model from measurements and implicit accepted knowledge about neural biophysics – which generally comes with an astounding number of implicit hypotheses and approximations, e.g. electrotonic compactness or the idea that ionic channel properties are similar across cells and related species. The model accounts for the set of measurements, but it also predicts responses in an infinite number of situations. In my view, induction is the weakest form of theoretical process because there is no attempt to go beyond the data. Empirical data are seen as a series of unconnected weather observations that just need to be included in the already existing theory.

What is computational neuroscience? (V) A side note on Paul Feyerabend

Paul Feyerabend was a philosopher of science who defended an anarchist view of science (in his book “Against Method”). That is, he opposed the idea that there should be methodologies imposed in science, because he considered that these are the expression of conservatism. One may not agree with all his conclusions (some think of him as defending relativistic views), but his arguments are worth considering. By looking at the Copernican revolution, Feyerabend makes a strong case that the methodologies proposed by philosophers (e.g. falsificationism) have failed both as a description of scientific activity and as a prescription of "good" scientific activity. That is, in the history of science, new theories that ultimately replace established theories are initially in contradiction with established scientific facts. If they had been judged by the standards of falsificationism, for example, they would have been immediately falsified. Yet the Copernican view (the Earth revolves around the sun) ultimately prevailed over the Ptolemaic system (the Earth is at the center of the universe). Galileo firmly believed in heliocentrism not for empirical reasons (it did not explain more data) but because it “made more sense”, that is, it seemed like a more elegant explanation of the apparent trajectories of planets. See e.g. the picture below (taken from Wikipedia) showing the motion of the Sun, the Earth and Mars in both systems:

It appears clearly in this picture that there is no additional empirical content in the heliocentric view, but it seems more satisfactory. At the time though, heliocentrism could be easily disproved with simple arguments, such as the tower argument: when a stone falls from the top of a tower, it falls right beneath it, while it should be “left behind” if the Earth were moving. This is a solid empirical fact, easily reproducible, which falsifies heliocentrism. It might seem foolish to us today, but only because we know that the Earth moves. If we look again at the picture above, we see two theories that both account for the apparent trajectories of planets, but the tower argument corroborates geocentrism while it falsifies heliocentrism. Therefore, so Feyerabend concludes, scientific methodologies that are still widely accepted today (falsificationism) would have immediately discarded heliocentrism. It follows that these are not only a poor description of how scientific theories are made, but also a dangerous prescription of scientific activity, for they would not have allowed the Copernican revolution to occur.

Feyerabend then goes on to argue that the development of new theories follows a counter-inductive process. This, I believe, is a very deep observation. When a new theory is introduced, it is initially contradictory with a number of established scientific facts, such as the tower argument. Therefore, the theory develops by making the scientific facts agree with the theory, for example by finding an explanation for the fact that the stone falls right beneath the point where it was dropped. Note that these explanations may take a lot of time to be made convincingly, and that they do not constitute the core of the theory. This stands in sharp contrast with induction, in which a theory is built so as to account for the known facts. Here it is the theory itself (e.g. a philosophical principle) that is considered true, while the facts are re-interpreted so as to agree with it.

I want to stress that these arguments do not support relativism, i.e., the idea that all scientific theories are equally valid, depending on the point of view. To make this point clearly, I will make an analogy with a notion familiar to physicists, the energy landscape:

This is very schematic but perhaps it helps make the argument. In the picture above, I represent on the vertical axis the amount of disagreement between a theory (on the horizontal axis) and empirical facts. This disagreement could be seen as the “energy” that one wants to minimize. The standard inductive process consists in incrementally improving a theory so as to minimize this energy (a sort of “gradient descent”). This process may stabilize into an established theory (the “current theory” in the picture). However, it is very possible that a better theory, empirically speaking, cannot be developed by this process, because it requires a change of paradigm, something that cannot be obtained by incremental changes to the established theory. That is, there is an “energy barrier” between the two theories. Passing through this barrier requires an empirical regression, in which the newly introduced theory is initially worse than the current theory at accounting for the empirical facts.

This analogy illustrates the idea that it can be necessary to temporarily deviate from the empirical facts so as to ultimately explain more of them. This does not mean that empirical facts do not matter, but simply that explaining more and more empirical facts should not be elevated to the rank of “the good scientific methodology”. There are other scientific processes that are both valid as methodologies and necessary for scientific progress. I believe this is how the title of Feyerabend’s book, “Against Method”, should be understood.

What is computational neuroscience? (IV) Should theories explain the data?

Since there is such an obvious answer, you might anticipate that I am going to question it! More precisely, I am going to analyze the following statement: a good theory is one that explains the maximum amount of empirical data while being as simple as possible. I will argue that 1) this is not stupid at all, but that 2) it cannot be a general criterion to distinguish good and bad theories, and finally that 3) it is only a relevant criterion for orthodox theories, i.e., theories that are consistent with the theories that produced the data. The arguments are not particularly original; I will mostly summarize points made by a number of philosophers.

First of all, given a finite set of observations, there are an infinite number of universal laws that agree with the observations, so the problem is underdetermined. This is the skeptical criticism of inductivism. Which theory to choose then? One approach is "Occam's razor", i.e., the idea that among competing hypotheses, the most parsimonious one should be preferred. But of course, Karl Popper and others would argue that it cannot be a valid criterion to distinguish between theories, because it could still be that the more complex hypothesis predicts future observations better than the simpler hypothesis - there is just no way to know without doing the new experiments. Yet it is not absurd as a heuristic to develop theories. This is a known fact in the field of machine learning for example, related to the problem of "overfitting". If one wants to describe the relationship between two quantities x and y from a set of n examples (xi,yi), one could fit a polynomial of degree n-1 that passes exactly through the data points. It would completely explain the data, yet it would be very unlikely to fit a new example. In fact, a lower-dimensional relationship is more likely to account for new data, and this can be shown more rigorously with the tools of statistical learning theory. Thus there is a trade-off between how much of the data is accounted for and the simplicity of the theory. So, Occam's razor is actually a very sensible heuristic to produce theories. But it should not be confused with a general criterion to discard theories.
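Here is a small numerical illustration of this trade-off, with made-up data (the exact numbers depend on the noise, but the high-order fit typically generalizes worse):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 8
x = np.linspace(0.0, 1.0, n)
y = 2.0 * x + rng.normal(scale=0.2, size=n)      # noisy linear relationship

# A polynomial of degree n-1 passes (essentially) through all n points...
p_high = np.polyfit(x, y, deg=n - 1)
# ...whereas a first-order fit only captures the overall trend.
p_low = np.polyfit(x, y, deg=1)

# Compare predictions on new examples drawn from the same process
x_new = rng.uniform(0.0, 1.0, size=200)
y_new = 2.0 * x_new + rng.normal(scale=0.2, size=200)
err_high = np.mean(np.abs(np.polyval(p_high, x_new) - y_new))
err_low = np.mean(np.abs(np.polyval(p_low, x_new) - y_new))
print(f"mean prediction error: degree {n-1} fit = {err_high:.2f}, "
      f"degree 1 fit = {err_low:.2f}")
```

The high-order fit "explains" every data point but tends to predict new examples poorly, which is the overfitting problem mentioned above.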

The interim conclusion is: a theory should account for the data, but not at the expense of being as complicated as the data itself. Now I will make deeper criticisms, mostly based on post-Popperian philosophers such as Kuhn, Lakatos and Feyerabend. In a nutshell, the argument is that insisting that a theory should explain empirical data is a kind of inversion of what science is about. Science is about understanding the real world, by making theories and testing them with carefully designed experiments. These experiments are usually done in conditions that are very unecological, and this is justified by the fact that they are designed to test a specific hypothesis in a controlled way. For example, the laws of mechanics would be tested in conditions where there is no friction, a condition that almost never occurs in the real world - and this is absolutely fine methodology. But then insisting that a new theory should be evaluated by how much it explains the empirical data is what I would call the "empiricist inversion": empirical data were produced, using very peculiar conditions justified by the theory that motivated the experiments, and now we demand that any theory should explain these data. One obvious point, which was made by Kuhn and Feyerabend, is that it gives a highly unfair advantage to the first theory, just because it was there first. But it is actually worse than this, because it also means that the criterion to judge theories is now disconnected from what was meant to be explained in the first place by the theory that produced the data. Here is the empiricist inversion: we consider that theories should explain data, when actually data are produced to test theories. What a theory is meant to explain is the world; data are only used as a methodological tool to test theories of the world.

This criterion thus tends to produce theories of data, not theories of the world. This point in fact relates to the arguments of Gibson, who criticized psychological research for focusing on laboratory stimuli rather than ecological conditions. Of course simplified laboratory stimuli are used to control experiments precisely, but it should always be kept in mind that these simplified stimuli are used as methodological tools and not as the things that are meant to be explained. In neural modeling, I find that many models are developed to explain experimental data, ignoring the function of the modeled system (i.e., the “computational level” in Marr’s analysis framework). In my view, this is characteristic of the empiricist inversion, which results in models of the data, not models of the brain.

At this point, my remarks might start being confusing. On one hand I am saying that it is a good idea to try to account for the data with a simple explanation, on the other hand I am saying that we should not care so much about the data. These seemingly contradictory statements can still make sense because they apply to different types of theories. This is related to what Thomas Kuhn termed “normal science” and “revolutionary science”. These terms might sound a bit too judgmental so I will rather speak of “orthodox theories” and “non-orthodox theories”. The idea is that science is structured by paradigm shifts. Between such shifts, a central paradigm dominates. Data are obtained through this paradigm, anomalies are also explained through this paradigm (rather than being seen as falsifications), and a lot of new scientific results are produced by “puzzle solving”, i.e., trying to explain data. At some point, for various reasons (e.g. too many unexplained anomalies), the central paradigm shifts to a new one and the process starts again, but with new data, new methods, or new ways to look at the observations.

“Orthodox theories” are theories developed within the central paradigm. These try to explain the data obtained with this paradigm, the “puzzle-solving” activity. Here it makes sense to consider that a good theory is a simple explanation of the empirical data. But this kind of criterion cannot explain paradigm shifts. A paradigm shift requires the development of non-orthodox theories, for which the existing empirical data may not be adequate. Therefore the making of non-orthodox theories follows a different logic. Because the existing data were obtained with a different paradigm, these theories are not driven by the data, although they may be motivated by some anomalous set of data. For example they may be developed from philosophical considerations or by analogy. The logic of their construction might be better described by counter-induction rather than induction (a concept proposed by Feyerabend). That is, their development starts from a theoretical principle, rather than from data, and existing data are deconstructed so as to fit the theory. By this process, implicit assumptions of the central paradigm are uncovered, and this might ultimately trigger new experiments and produce new experimental data that may be favorable to the new theory.

Recently, there have been a lot of discussions in the fields of neuroscience and computational neuroscience about the availability of massive amounts of data. Many consider it a great opportunity, which should change the way we work and build models. It certainly seems like a good thing to have more data, but I would like to point out that it mostly matters for the development of orthodox theories. Putting too much emphasis (and resources) on it also raises the danger of driving the field away from non-orthodox theories, which in the end are the ones that bring scientific revolutions (with the caveat that, of course, most non-orthodox theories turn out to be wrong). Being myself unhappy with current orthodox theories in neuroscience, I see this danger as quite significant.

This was a long post and I will now try to summarize. I started with the provocative question: should a theory explain the data? First of all, a theory that explains every single bit of data is an enumeration of data, not a theory. It is unlikely to predict any new significant fact. This point is related to overfitting or the “curse of dimensionality” in statistical learning. A better theory is one that explains a lot of the data with a simple explanation, a principle known as Occam’s razor. However, this criterion should be thought of as a heuristic to develop theories, not a clear-cut general decision criterion between theories. In fact, this criterion is relevant mostly for orthodox theories, i.e., those theories that follow the central paradigm with which most data have been obtained. Non-orthodox theories, on the other hand, cannot be expected to explain most of the data obtained through a different paradigm (at least initially). It can be seen that in fact they are developed through a counter-inductive process, by which data are made consistent with the theory. This process may fail to produce new empirical facts consistent with the new theory (most often) or it may succeed and subsequently become the new central paradigm - but this is usually a long process.

What is computational neuroscience? (III) The different kinds of theories in computational neuroscience

Before I try to answer the questions I asked at the end of the previous post, I will first describe the different types of approaches in computational neuroscience. Note that this does not cover everything in theoretical and quantitative neuroscience (see my first post).

David Marr, a very important figure in computational neuroscience, proposed that cognitive systems can be described at three levels:

1) The computational level: what does the system do? (for example: estimating the location of a sound source)

2) The algorithmic/representational level: how does it do it? (for example: by calculating the maximum of the cross-correlation between the two monaural signals; see the sketch after this list)

3) The physical level: how is it physically realized? (for example: with axonal delay lines and coincidence detectors)
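To illustrate the algorithmic level in this example, here is a minimal sketch (synthetic signals, made-up delay, my own variable names) of estimating an interaural time difference by taking the maximum of the cross-correlation between the two monaural signals:

```python
import numpy as np

fs = 44100                               # sampling rate (Hz)
rng = np.random.default_rng(0)
source = rng.standard_normal(fs // 10)   # 100 ms of noise as the sound source

itd_samples = 20                         # true interaural delay, in samples (made up)
left = source
right = np.roll(source, itd_samples)     # the right ear receives a delayed copy

# Cross-correlate the two monaural signals and find the lag of maximal correlation
lags = np.arange(-50, 51)
corr = [np.dot(left, np.roll(right, -lag)) for lag in lags]
estimated_itd = lags[int(np.argmax(corr))]
print(f"true ITD: {itd_samples} samples, estimated: {estimated_itd} samples")
```

The physical level (delay lines and coincidence detectors) would be one way of realizing this same operation in neural hardware.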

Theories in computational neuroscience differ by which level is addressed, and by the postulated relationships between the three levels (see also my related post).

David Marr considered that these three levels are independent. Francisco Varela described this view as “computational objectivism”. This means that the goal of the computation is defined in terms that are external to the organism. The two other levels describe how this goal is achieved, but they have no influence on what is achieved. It is implied that evolution shapes levels 2 and 3 by imposing the first level. It is important to realize that theories that follow this approach necessarily start from the highest level (defining the object of information processing), and only then analyze the lower levels. Such approaches can be restricted to the first level, or the first two levels, but they cannot address only the third level, or the second level, because these are defined by the higher levels. It can be described as a “top-down” approach.

The opposite view is that both the algorithmic and computational levels derive from the physical level, i.e., they emerge from the interactions between neurons. Varela described it as “neurophysiological subjectivism”. In this view, one would start by analyzing the third level, and then possibly go up to the higher levels – this is a “bottom-up” approach. This is the logic followed by the data-driven approaches that I criticized in my first post. I criticized it because this view fails to acknowledge the fact that living beings are intensely teleonomic, i.e., the physical level serves a project (invariant reproduction, in the words of Jacques Monod). This is not to say that function is not produced by the interaction of neurons – it has to be, in a materialistic view. But as a method of scientific inquiry, analyzing the physical level independently of the higher levels, as if it were a non-living object (e.g. a gas), does not seem adequate – at least it seems overly optimistic. As far as I know, this type of approach has produced theories of neural dynamics, rather than theories of neural computation: for example, showing how oscillations or some other large-scale aspect of neural networks might emerge from the interaction of neurons. In other words, in Marr’s hierarchy, such studies are restricted to the third level. Therefore, I would categorize them as theoretical neuroscience rather than computational neuroscience.

These two opposite views roughly correspond to externalism and internalism in philosophy of perception. It is important to realize that these are important philosophical distinctions, which have considerable epistemological implications, in particular on what is considered a “realistic” model. Computational objectivists would insist that a biological model must serve a function, otherwise it is simply not about biology. Neurophysiological subjectivists would insist that the models must agree with certain physiological experiments, otherwise they are empirically wrong.

There is another class of approaches in philosophy of perception, which can be seen as intermediate between these two: the embodied approaches. These consider that the computational level cannot be defined independently of the physical level, because the goal of computation can only be defined in terms that are accessible to the organism. In the more external views (Gibson/O’Regan), this means that the computational level actually includes the body, but the neural implementation is seen as independent from the computational level. For example, in Gibson’s ecological approach and in O’Regan’s sensorimotor theory, the organism looks for information about the world implicit in its sensorimotor coupling. This differs quite substantially from computational objectivism in the way the goal of the computation is defined. In computational objectivism, the goal is defined externally. For example: to estimate the angle between a sound source and the head. Sensorimotor theories acknowledge that the notion of “angle” is one of an external observer with some measurement apparatus; it cannot be one of an organism. Instead, in sensorimotor approaches, direction is defined subjectively (contrary to computational objectivism), but still in reference to an external world (contrary to neurophysiological subjectivism), as the self-generated movement that would make the sound move to the front (an arbitrary reference point). In the more internal views (e.g. Varela), the notion of computation itself is questioned, as it is considered that the goal is defined by the organism itself. This is Varela’s concept of autopoiesis, according to which a living entity acts so as to maintain its own organization. “Computation” is then a by-product of this process. This last class of approaches is currently less developed in computational neuroscience.

The three types of approaches I have described mostly concern the relationship between the computational and physical levels, and they are tightly linked with different views in philosophy of perception. There is also another dividing line between theories of neural computation, which has to do with the relationship between the algorithmic and physical levels. This is related to the rate-based vs. spike-based theories of neural computation (see my series of posts on the subject).

In Marr’s view, and in general in rate-based views, the algorithmic and physical levels are mostly independent. Because algorithms are generally described in terms of operations on analog values, spikes are generally seen as implementing analog computation. In other words, spikes only reflect an underlying analog quantity, the firing rate of a neuron, on which the algorithms are defined. The usual view is that spikes are produced randomly with some probability reflecting the underlying rate (an abstract quantity).
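For instance, under this view a spike train could be generated from a time-varying rate as an (inhomogeneous) Poisson process; a minimal sketch with an arbitrary rate profile:

```python
import numpy as np

dt = 0.001                                       # time step (s)
t = np.arange(0.0, 1.0, dt)                      # one second of simulated time
rate = 20.0 + 15.0 * np.sin(2 * np.pi * 2 * t)   # arbitrary underlying rate (Hz)

rng = np.random.default_rng(0)
# In each small bin, a spike occurs with probability rate*dt, independently of the past
spikes = rng.random(t.size) < rate * dt
print(f"{spikes.sum()} spikes, expected about {rate.sum() * dt:.0f}")
```

Here the spike times carry no information beyond the rate they sample from, which is exactly the assumption that spike-based theories question.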

On the contrary, another view holds that algorithms are defined at the level of spikes, not of rates. Such theories include the idea of binding by synchrony (Singer/von der Malsburg), in which neural synchrony is the signature of a coherent object, the related idea of synfire chains (Abeles), and more recently the theories developed by Sophie Denève and by myself (there is also Thorpe’s rank-order coding theory, but it is more on the side of coding than computation). In these latter two theories, spiking is seen as a decision. In Denève’s approach, the neuron spikes so as to reduce an error criterion. In my recent paper on computing with synchrony, the neuron spikes when it observes unlikely coincidences, which signal some invariant structure (in the sense of Gibson). In both cases, the algorithm is defined directly at the level of spikes.

In summary: theories of neural computation can be classified according to the implicit relationships between the three levels of analysis described by Marr. It is important to realize that these are not purely scientific differences (by this, I mean not simply about empirical disputes), but really philosophical and/or epistemological differences. In my view this is a big issue for the peer-reviewing system, because it is difficult to have a paper accepted when the reviewers or editors do not share the same epistemological views.

What is computational neuroscience? (II) What is theory good for?

To answer this question, I need to write about basic notions of epistemology (the philosophy of knowledge). Epistemology is concerned in particular with what knowledge is and how it is acquired.

What is knowledge? Essentially, knowledge is statements about the world. There are two types of statements. First there are specific statements or “observations”, for example, “James has two legs”. But “All men have two legs” is a universal statement: it applies to an infinite number of observations, about men I have seen but also about men I might see in the future. We also call universal statements “theories”.

How is knowledge acquired? The naive view, classical inductivism, consists in collecting a large number of observations and generalizing from them. For example, one notes that all men one has seen so far have two legs, and concludes that all men have two legs. Unfortunately, inductivism cannot produce universal statements with certainty. It is quite possible that one day you might see a man with only one leg. The problem is that there are always an infinite number of universal statements that are consistent with any finite set of observations. For example, you can continue a finite sequence of numbers with any numbers you want, and it will still be a possible sequence of numbers.

Therefore, inductivism cannot guide the development of knowledge. Karl Popper, probably the most influential philosopher of science of the twentieth century, proposed to solve this problem with the notion of falsifiability. What distinguishes a scientific statement from a metaphysical statement is that it can be disproved by an experiment. For example, “all men have two legs” is a scientific statement, because the theory could be disproved by observing a man with one leg. But “there is a God” is not a scientific statement. This is not to say that such statements are true or false, but that they are scientific in nature or not (note that, by definition, a metaphysical statement can have no predictable impact on any of our experience, otherwise this would provide a test of that statement).

Popper’s concept of falsifiability has had a huge influence on modern science, and it essentially determines what we call “experimental work” and “theoretical work”. In Popper’s view, an experiment is an empirical test designed to falsify a theory. More generally, it is a situation for which different theories predict different outcomes. Note how this concept is different from the naive idea of “observing the laws of nature”. Laws of nature cannot be “observed” because an experiment is a single observation, whereas a law is a universal statement. Therefore, from a logical standpoint, the role of an experiment is rather to distinguish between otherwise consistent theories.

The structure of a typical experimental paper follows this logic: 1) Introduction, in which the theoretical issues are presented (the different hypotheses about some specific subject), 2) Methods, in which the experiment is described in detail, so as to be reproducible, 3) Results, in which the outcomes are presented, 4) Discussion, in which the outcomes are shown to corroborate or invalidate various theories. Thus, an experimental paper is about formulating and performing a critical test of one, or usually several, theories.

Popper’s line of thinking seems to imply that knowledge can only progress through experimental work. Indeed, on purely logical grounds theories can only be consistent or inconsistent, so there is no way to distinguish between logically consistent theories. Only empirical tests can corroborate or invalidate theories, and therefore produce knowledge. Hence the occasional demeaning comments that any theoretician has heard, around the idea that theories are mind games for a bunch of smart math-oriented people. That is, theory is useless since only empirical work can produce scientific knowledge.

This is a really paradoxical remark, for theory is the goal of scientific progress. Science is not about accumulating data, it is about finding the laws of nature, a.k.a. theories. It is precisely the predictive nature of science that makes it useful. How can it be that science is about making theories, but that science can only progress through empirical work?

Maybe this is a misunderstanding of Popper’s reasoning. Falsifiability is about how to distinguish between theories. It clarifies what empirical work is about, and what distinguishes science from metaphysics. But it says nothing about how theories are formulated in the first place. Falsifiability is about empirical validation of theories, not about the mysterious process of making theories, which we might say is the “hard problem” of philosophy of science. Yet making theories is a central part of the development of science. Without theory, there is simply no experiment to be done. But more importantly, science is made of theories.

So I can now answer the question I started with. Theories constitute the core of any science. Theoretical work is about the development of theories. Experimental work is about the testing of theories. Accordingly, theoretical papers are organized quite differently from experimental papers, because the methodology is very different, but also because there is no standardized methodology (“how it should be done”). A number of computational journals insist on enforcing the structure of experimental papers (introduction / methods / results / discussion), but I believe this is due to the view that simulations are experiments (Winsberg, Philosophy of Science 2001), which I will discuss in another post.

Theory is often depicted as speculative. This is quite right. Theory is, in essence, speculative, since it is about making universal statements. But this does not mean that theory is nonsense. Theories are usually developed so as to be consistent with a body of experimental data, i.e., they have an empirical basis. Biological theories also often include a teleonomic element, i.e., they “make sense”. These two elements impose hard constraints on theories. In fact, they are so constraining that I do not know of any theory that is consistent with all (or even most) experimental data and that makes sense in a plausible ecological context. So theory making is about finding principled ways to explain existing data, and at the same time to explain biological function. Because this is such a difficult task, theoretical work can have some autonomy, in the sense that it can produce knowledge in the absence of new empirical work.

This last point is worth stressing, because it departs significantly from the standard Popperian view of scientific progress, which makes it a source of misunderstandings between theoreticians and experimenters. I am referring to the complexity of biological organisms, shaped by millions of years of evolution. Biological organisms are made of physical things that we understand at some level (molecules), but at the same time they serve a project (the global project being reproductive invariance, in the words of Jacques Monod). That they serve a project is not the simple result of the interaction of these physical elements; rather, it is the result of evolutionary pressure. This means that even though, on one hand, we understand physics, or biophysics, to a high degree of sophistication, and, on the other hand, there are well-established theories of biological function, there still is a huge explanatory gap between the two. This gap is largely theoretical, in the sense that we are looking for a way to make these two aspects logically consistent. This is why I believe theoretical work is so important in biology. It also has two consequences that can be hard to digest for experimenters: 1) theory can be autonomous to some extent (i.e., there can be “good” and “bad” theories, independently of new empirical evidence), 2) theoretical work is not necessarily aimed at making experimental predictions.

This discussion raises many questions that I will try to answer in the next posts:

- Why are theoretical and experimental journals separate?

- Should theories make predictions?

- Should theories be consistent with data?

- What is a “biologically plausible” model? And by the way, what is a model?

- Is simulation a kind of experiment?

What is computational neuroscience? (I) Definitions and the data-driven approach

What is computational neuroscience? Simply put, it is the field that is concerned with how the brain computes. The word “compute” is not necessarily an analogy with the computer, and it must be understood in a broad sense. It simply refers to the operations that must be carried out to perform cognitive functions (walking, recognizing a face, speaking). Put this way, it might seem that this is pretty much the entire field of neuroscience. What distinguishes computational neuroscience, then, is that this field seeks a mechanistic understanding of these operations, to the point that they could potentially be simulated on a computer. Note that this means neither that computational neuroscience is mostly about simulating the brain, nor that the brain is thought of as a computer. It simply refers to the materialistic assumption that, if all the laws that underlie cognition are known in detail, then it should be possible to artificially reproduce them (assuming sufficient equipment).

Another related terminology is “theoretical neuroscience”. This is somewhat broader than computational neuroscience, and is probably an analogy to theoretical physics, a branch of physics that relies heavily on mathematical models. Theoretical neuroscience is not necessarily concerned with computation, at least not directly. One example could be the demonstration that action potential velocity is proportional to diameter in myelinated axons, and to the square root of the diameter in unmyelinated axons. This demonstration uses cable theory, a biophysical theory describing the propagation of electrical activity in axons and dendrites.
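The square-root scaling can be sketched with a back-of-the-envelope dimensional argument from cable theory (assuming a passive, unmyelinated cable; this is a sketch, not the full demonstration):

```latex
\lambda = \sqrt{\frac{r_m}{r_i}}, \qquad r_m \propto \frac{1}{d}, \quad r_i \propto \frac{1}{d^2}
\;\Rightarrow\; \lambda \propto \sqrt{d}, \qquad
\tau_m = r_m c_m \ \text{(independent of } d\text{)}, \qquad
v \sim \frac{\lambda}{\tau_m} \propto \sqrt{d}
```

where r_m and r_i are the membrane and axial resistances per unit length and c_m the membrane capacitance per unit length. For myelinated axons, the internode length and the effective space constant both scale roughly linearly with the diameter, which gives v proportional to d.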

“Quantitative neuroscience” also refers to the use of quantitative mathematical models as a tool to understand brain function or dynamics, but the substitution of “quantitative” for “theoretical” suggests that the field is more concerned with data analysis (as opposed to theories of how the brain works).

Finally, “neural modeling” is concerned with the use of quantitative neural models, in general biophysical models. The terminology suggests a data-driven approach, i.e., building models of neural networks from experimental measurements, based on existing theories. This is why I am somewhat uneasy with this terminology, for epistemological reasons. The data-driven approach implicitly assumes that it is possible and meaningful to build a functioning neural network from a set of measurements alone. This raises two critical issues. One is that it is based on what Francisco Varela called “neurophysiological subjectivism” (see this related post), the idea that perception is the result of neural network dynamics. Neurophysiological subjectivism is problematic because (in particular) it fails to fully recognize the defining property of living beings, which is teleonomy (in other words, function). Living organisms are constrained on one hand by their physical substrate, but on the other hand this substrate is tightly constrained by evolution – this is precisely what makes them living beings and not just spin glasses. The data-driven approach only considers the constraints deriving from measurements, not the functional constraints, but this essentially amounts to denying the fact that the object of study is part of a living being. Alternatively, it assumes that measurements are sufficiently constraining that function is entirely implied, which seems naive.

The second major issue with the data-driven approach is that it has a strong flavor of inductivism. That is, it implicitly assumes that a functioning model is directly implied by a finite set of measurements. But inductivism is a philosophical error, for there are an infinite number of theories (or “models”) consistent with any finite set of observations (an error pointed out by Hume, for example). In fact, Popper and his followers also noted that inductivism commits another philosophical error, which is to think that there is such a thing as a “pure observation”. Experimental results are always to be interpreted in a specific theoretical context (a.k.a. the “Methods” section). One does not “measure” a model. One performs a specific experiment and observes the outcome with tools, which are themselves based on currently accepted theories. In other words, an experimental result is the answer to a specific question. But the type of question is not “What is the time constant of the model?”, but rather “What exponential function can I best fit to the electrical response of this neuron to a current pulse?”. Measurements may then provide constraints on possible models, but they never imply a model. In addition, as I noted above, physical constraints (implied by measurements) are only one side of the story, functional constraints are the other side. Neglecting this other side means studying a “soup of neurons”, not the brain.
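To make the kind of question mentioned above concrete, here is a sketch (synthetic data, made-up parameter values) of fitting a single exponential to the voltage response to a current step, which is how a “membrane time constant” is typically reported:

```python
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(0)

# Synthetic "recording": charging curve of a passive membrane plus noise
t = np.arange(0.0, 100.0, 0.1)               # ms
tau_true, dV_true = 15.0, 8.0                # made-up values
v = dV_true * (1.0 - np.exp(-t / tau_true)) + rng.normal(scale=0.2, size=t.size)

def charging(t, dV, tau):
    # Exponential approach to steady state after a current step
    return dV * (1.0 - np.exp(-t / tau))

(dV_fit, tau_fit), _ = curve_fit(charging, t, v, p0=(5.0, 10.0))
print(f"fitted time constant: {tau_fit:.1f} ms (true value: {tau_true} ms)")
```

The number obtained is an answer to this specific question (a single-exponential fit under a passive, electrotonically compact assumption), not a direct “measurement” of a model parameter.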

In summary, it is often stated or implied that “realistic” models are those that are based on measurements: this is 1) an inductivist mistake, 2) a tragic disregard of what defines living beings, i.e. functional constraints.

I will end this post by asking a question: what is a better description of the brain? A soup of “realistic” neurons or a more conceptual mechanistic description of how interacting neurons support cognitive functions?