What is computational neuroscience? (XXIX) The free energy principle

The free energy principle is the theory that the brain manipulates a probabilistic generative model of its sensory inputs, which it tries to optimize by either changing the model (learning) or changing the inputs (action) (Friston 2009; Friston 2010). The “free energy” is related to the error between predictions and actual inputs, or “surprise”, which the organism wants to minimize. It has a more precise mathematical formulation, but the conceptual issues I want to discuss here do not depend on it.

Thus, it can be seen as an extension of the Bayesian brain hypothesis that accounts for action in addition to perception. It shares the conceptual problems of the Bayesian brain hypothesis, namely that it focuses on statistical uncertainty, inferring variables of a model (called “causes”) when the challenge is to build and manipulate the structure of the model. It also shares issues with the predictive coding concept, namely that there is a conflation between a technical sense of “prediction” (expectation of the future signal) and a broader sense that is more ecologically relevant (if I do X, then Y will happen). In my view, these are the main issues with the free energy principle. Here I will focus on an additional issue that is specific to the free energy principle.

The specific interest of the free energy principle lies in its formulation of action. It resonates with a very important psychological theory called cognitive dissonance theory. That theory says that you try to avoid dissonance between facts and your system of beliefs, by either changing the beliefs in a small way or avoiding the facts. When there is a dissonant fact, you generally don’t throw away your entire system of beliefs: rather, you alter the interpretation of the fact (think of political discourse or, in fact, scientific discourse). Another strategy is to avoid the dissonant facts: for example, to read newspapers that tend to have the same opinions as yours. So there is some support in psychology for the idea that you act so as to minimize surprise.

Thus, the free energy principle acknowledges the circularity of action and perception. However, it is quite difficult to make it account for a large part of behavior. A large part of behavior is directed towards goals; for example, to get food and sex. The theory anticipates this criticism and proposes that goals are ingrained in priors. For example, you expect to have food. So, for your state to match your expectations, you need to seek food. This is the theory’s solution to the so-called “dark room problem” (Friston et al., 2012): if you want to minimize surprise, why not shut off stimulation altogether and go to the closest dark room? Solution: you are not expecting a dark room, so you are not going there in the first place.

Let us consider a concrete example to show that this solution does not work. There are two kinds of stimuli: food, and no food. I have two possible actions: to seek food, or to sit and do nothing. If I do nothing, then with 100% probability, I will see no food. If I seek food, then with, say, 20% probability, I will see food.

Let’s say this is the world in which I live. What does the free energy principle tell us? To minimize surprise, it seems clear that I should sit: I am certain to not see food. No surprise at all. The proposed solution is that I have a prior expectation to see food. So to minimize surprise, I should put myself into a situation where I might see food, i.e., seek food. This seems to work. However, if there is any learning at all, then I will quickly observe that the probability of seeing food is actually 20%, and my expectations should be adjusted accordingly. I will also observe that between two food expeditions, the probability of seeing food is 0%. Once this has been observed, surprise is minimal when I do not seek food. So, I die of hunger. It follows that the free energy principle does not survive Darwinian competition.
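The argument can be checked numerically with a minimal sketch. It measures surprise as the negative log probability that my beliefs assign to what I actually observe, averaged over what the world delivers. The 0% and 20% figures come from the example above; the fixed prior of 0.9 ("I expect food whatever I do") and all function names are assumptions made for illustration, not part of the free energy formalism itself.

```python
import math

# World from the example: probability of seeing food for each action.
WORLD = {"sit": 0.0, "seek": 0.2}

def expected_surprise(p_true, q_belief):
    """Average of -log q(outcome), with outcomes drawn from the true world."""
    total = 0.0
    # Two outcomes: food (p_true, q_belief) and no food (complements).
    for p, q in ((p_true, q_belief), (1 - p_true, 1 - q_belief)):
        if p > 0:  # outcomes that never happen contribute nothing
            total += -p * math.log(q)
    return total

def best_action(beliefs):
    """Action with the smallest expected surprise under the given beliefs."""
    return min(WORLD, key=lambda a: expected_surprise(WORLD[a], beliefs[a]))

# Hypothetical fixed prior: I expect food with probability 0.9, whatever I do.
fixed_prior = {"sit": 0.9, "seek": 0.9}
# After learning, my beliefs match the actual statistics of the world.
learned = dict(WORLD)

print("fixed prior  ->", best_action(fixed_prior))  # seek
print("after learning ->", best_action(learned))    # sit
```

With the fixed prior, seeking food is the less surprising action; once beliefs match the world, sitting yields zero surprise and wins, which is the dilemma described above.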

Thus, either there is no learning at all and the free energy principle is just a way of calling predefined actions “priors”; or there is learning, but then it doesn’t account for goal-directed behavior.

The idea of acting so as to minimize surprise resonates with some aspects of psychology, like cognitive dissonance theory, but that does not constitute a complete theory of mind, except possibly of the depressed mind. See for example the experience of flow (as in surfing): you seek a situation that is controllable but sufficiently challenging that it engages your entire attention; in other words, you voluntarily expose yourself to a (moderate amount of) surprise; in any case certainly not a minimum amount of surprise.

Notes on consciousness. (VIII) Is a thermostat conscious?

A theory of consciousness initially proposed by David Chalmers (in his book The Conscious Mind) is that consciousness (or experience) is a property of information processing systems. It is an additional property, not logically implied by physical laws; a new law of nature. The theory was later formalized by Giulio Tononi into Integrated Information Theory, based on Shannon’s mathematical concept of information. One important feature of this theory is that it is a radical form of panpsychism: it assigns consciousness (to different degrees) to virtually anything in the world, including a thermostat.

The Bewitched thought experiment

I have criticized IIT previously on the grounds that it fails to define in a sensible way what makes a conscious subject (e.g. a subsystem of a conscious entity would be another conscious entity, so for example your brain would produce an infinite number of minds). But here I want to comment specifically on the example of the thermostat. It is an interesting example brought up by Chalmers in his book. The reasoning is as follows: a human brain is conscious; a mouse brain is probably conscious, but to a somewhat lower degree (for example, no self-consciousness). As we go down the scale of information-processing systems, the system might be less and less conscious, but why would it be that there is a definite threshold for consciousness? Why would a billion neurons be conscious but not a million? Why would a million neurons be conscious but not one thousand? And how about just one neuron? How about a thermostat? A thermostat is an elementary information-processing system with just two states, so maybe, Chalmers argues, the thermostat has a very elementary form of experience.

To claim that a thermostat is conscious defies intuition, but I would not follow Searle in insisting that the theory must be wrong because it assigns consciousness to things that we wouldn’t intuitively think are conscious. As I argued in a previous post, to claim that biology tells us that only brains are conscious is to use circular arguments. We don’t know whether anything other than a brain is conscious, and since consciousness is subjective, to decide whether anything is conscious is going to involve some theoretical aspects. Nonetheless, I am skeptical that a thermostat is conscious.

I propose to examine the Bewitched thought experiment. In the TV series Bewitched, Samantha the housewife twitches her nose and everyone freezes except her. Then she twitches her nose and everyone unfreezes, without noticing that anything happened. For them, time has effectively stopped. The question is: was anyone experiencing anything during that time? To me, it is clear that no one can experience anything if time is frozen. In fact, that whole time has not existed at all for the conscious subject. It follows that a substrate with a fixed state (e.g. hot/cold) cannot experience anything, because time is effectively frozen for that substrate. Experience requires a flow of time, a change in structure through time. I leave it open whether the interaction of the thermostat with the room might produce experience for that coupled system (see below for some further thoughts).

What is “information”?

In my view, the fallacy in the initial reasoning is to put the thermostat and the brain on the same scale. That scale is the set of information-processing systems. But as I have argued before (mostly following Gibson’s arguments), it is misleading to see the brain as an information-processing system. The brain can only be seen to transform information of one kind into information of another kind by an external observer, because the very concept of information is something that makes sense to a cognitive/perceptual system. The notion of information used by IIT is Shannon information, a notion from communication theory. This is an extrinsic notion of information: for example, neural activity is informative about objects in the world in the sense that properties of those objects can be inferred from neural activity. But this is totally unhelpful to understand how the brain, which only ever gets to deal with neural signals and not things in the world, sees the world (see this argument in more detail in my paper Is coding a relevant metaphor for the brain?).

Let’s clarify with a concrete case: does the thermostat perceive temperature? The thermostat can be in different states depending on temperature, but from its perspective, there is no temperature. There are changes in state that seem to be unrelated to anything else (there is literally nothing else for the thermostat). One could replace the temperature sensor with some other sensor, or with a random number generator, and there would be literally no functional change in the thermostat itself. Only an external observer can link the thermostat’s state with temperature, so the thermostat cannot possibly be conscious of temperature.

Thus, Shannon’s notion of information is inappropriate to understand consciousness. Instead of extracting information in the sense of communication theory, what the brain might do is build models of sensory (sensorimotor) signals from its subjective perspective, in the same way as scientists make models of the world with observations (=sensory signals) and experiments (=actions). But this intrinsic notion of information, which corresponds eg to laws of physics, is crucially not what Shannon’s notion of information is. And it is also not the kind of information that a thermostat is dealing with.

This inappropriate notion of information leads to what in my view is a rather absurd quantitative scale of consciousness, according to which entities are more or less conscious along a graded scale (phi). Differences in consciousness are qualitative, not quantitative: there is dreaming, being awake, being self-conscious or not, etc. These are not different numbers. This odd analog scale arises because Shannon information is counted in bits. But information in the sense of knowledge (science) is not counted in bits; there are different kinds of knowledge, they have different structure, relations between them etc.

Subjective physics of a thermostat

But let us not throw away Chalmers’ interesting thought experiment just now. Let us ask, following Chalmers: what does it feel like to be a thermostat? We will examine it not with Shannon’s unhelpful notion of information but with what I called “subjective physics”: the laws that govern sensory signals and their relations to actions, from the perspective of the subject. This will define my world from a functional viewpoint. Let’s say I am a conscious thermostat; a homunculus inside the thermostat. All I can observe is a binary signal. Then there is a binary action that I can make, which for an external observer corresponds to turning on the heat. What kind of world does that make for me? Let’s say I’m a scientist homunculus, what kind of laws about the world can I infer?

If I’m a conventional thermostat, then the action will be automatically triggered when the signal is in a given state (“cold”). After some time, the binary signal will switch and so will the action. So in fact there is an identity between signal and action, which means that all I really observe is just the one binary signal, switching on and off, probably with some kind of periodicity. This is the world I might experience, as a homunculus inside the thermostat (note that to experience the periodicity requires memory, which a normal thermostat doesn’t have). In a way, I’m a “locked-in” thermostat: I can make observations, but I cannot freely act.

Let’s say that I am not locked-in and have a little more free will, so I can decide whether to act (heat) or not. If I can, then my world is a little bit more interesting: my action can trigger a switch of the binary signal, after some latency (again requiring some memory), and then when I stop, the binary signal switches back, after a time that depends on how much time my previous action lasted. So here I have a world that is much more structured, with relatively complex laws which in a way define the concept of “temperature” from the perspective of the thermostat.
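To make this kind of law concrete, here is a toy simulation of the free-willed thermostat. The room dynamics (one degree gained per heating step, half a degree lost per idle step, a threshold of 20) and the names `Room` and `latency_back_to_off` are assumptions chosen purely for illustration. The important constraint is that the homunculus only ever sees the binary signal returned by `step`, never the temperature.

```python
class Room:
    """Hidden world. The homunculus never reads `temperature` directly."""

    def __init__(self, temperature=19.0, threshold=20.0):
        self.temperature = temperature
        self.threshold = threshold

    def step(self, heat):
        # Assumed toy dynamics: heating warms the room, idling lets it cool.
        self.temperature += 1.0 if heat else -0.5
        # The binary signal is the only thing observable from the inside.
        return self.temperature >= self.threshold


def latency_back_to_off(heat_steps):
    """Heat for `heat_steps` steps, stop, then count the steps until the
    signal switches back off. Uses only actions and the binary signal."""
    room = Room()
    signal = False
    for _ in range(heat_steps):
        signal = room.step(heat=True)
    steps = 0
    while signal:
        signal = room.step(heat=False)
        steps += 1
    return steps


# The "law" a scientist homunculus could discover by experimenting:
for duration in (1, 2, 3, 4):
    print(duration, "->", latency_back_to_off(duration))
```

In this toy world, the time for the signal to switch back grows with how long I heated. That regularity between action and signal is all that "temperature" can mean from the inside.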

So if a thermostat were conscious, then we have a rough idea of the kind of world it might experience (although not what it feels like), and even in this elementary example, you can’t measure these experiences in bits - let alone the fact that a thermostat is not conscious anyway.

A brief critique of predictive coding

Predictive coding is becoming a popular theory in neuroscience (see for example Clark 2013). In a nutshell, the general idea is that brains encode predictions of their sensory inputs. This is an appealing idea because superficially, it makes a lot of sense: functionally, the only reason why you would want to process sensory information is if it might impact your future, so it makes sense to try to predict your sensory inputs.

There are substantial problems in the details of predictive coding theories, for example with the arbitrariness of the metric by which you judge that your prediction matches sensory inputs (what is important?), or the fact that predictive coding schemes encode both noise and signal. But I want to focus on the more fundamental problems. One has to do with “coding”, the other with “predictive”.

It makes sense that brains anticipate. But does it make sense that brains code? Coding is a metaphor of a communication channel, and this is generally not a great metaphor for what the brain might do, unless you fully embrace dualism. I discuss this at length in a recent paper (Is coding a relevant metaphor for the brain?) so I won’t repeat the entire argument here. Predictive coding is a branch of efficient coding, so the same fallacy underlies its logic: 1) neurons encode sensory inputs; 2) living organisms are efficient; => brains must encode efficiently. (1) is trivially true in the sense that one can define a mapping from sensory inputs to neural activity. (2) is probably true to some extent (evolutionary arguments). So the conclusion follows. Critiques of efficient coding have focused on the “efficient” part: maybe the brain is not that efficient after all. But the error is elsewhere: living organisms are certainly efficient, but it doesn’t follow that they are efficient at coding. They might be efficient at surviving and reproducing, and it is not obvious that it entails coding efficiency (see the last part of the abovementioned paper for a counter-example). So the real strong assumption is there: the main function of the brain is to represent sensory inputs.

The second problem has to do with “predictive”. It makes sense that an important function of brains, or in fact of any living organism, is to anticipate (see the great Anticipatory Systems by Robert Rosen). But to what extent do predictive coding schemes actually anticipate? First, in practice, those are generally not prediction schemes but compression schemes, in the sense that they do not tell us what will happen next but what happens now. This is at least the case of the classical Rao & Ballard (1999). Neurons encode the difference between expected input and actual input: this is compression, not prediction. It uses a sort of prediction in order to compress: other neurons (in higher layers) produce predictions of the inputs to those neurons, but the term prediction is used in the sense that the inputs are not known to the higher layer neurons, not that the “prediction” occurs before the inputs. Thus the term “predictive” is misleading because it is not used in a temporal sense.
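The distinction can be illustrated with a toy residual code. This is not Rao & Ballard's model, only a minimal sketch under simple assumptions: a hypothetical "higher layer" summarizes the current input by its mean and feeds that back as a "prediction" of every sample, and the "lower layer" transmits only the difference. The signal values are made up for the example.

```python
# Made-up input samples, all present at the same time (e.g. an image row).
signal = [10.0, 10.5, 11.0, 11.2, 11.1, 10.8, 10.9, 11.3]

# Hypothetical higher layer: one number describing the *current* input.
mean = sum(signal) / len(signal)
prediction = [mean] * len(signal)

# Lower layer encodes only the mismatch between prediction and input.
residual = [x - p for x, p in zip(signal, prediction)]

# Nothing is lost: prediction plus residual gives back the input.
reconstructed = [p + r for p, r in zip(prediction, residual)]

def peak(values):
    """Largest magnitude, a crude proxy for the cost of encoding the values."""
    return max(abs(v) for v in values)

print("peak of raw signal:", peak(signal))
print("peak of residual  :", peak(residual))
```

The residual is much smaller than the raw signal, so it is cheaper to transmit, but the "prediction" concerns samples that are already there. Nothing in this scheme says anything about what the input will be next.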

However, it is relatively easy to imagine how predictive coding might be about temporal predictions, although the neural implementation is not straightforward (delays etc). So I want to make a deeper criticism. I started by claiming that it is useful to predict sensory inputs. I am taking this back (I can because I said it was superficial reasoning). It is not useful to know what will happen. What is useful is to know what might happen, depending on what you do. If there is nothing you can do about the future, what is the functional use of predicting it? So what is useful is to predict the future conditional on different potential actions. This is about manipulating models of the world, not representing the present.

Notes on consciousness. (VII) The substrate of consciousness

Here I want to stir some ideas about the substrate of consciousness. Let us start with a few intuitive ideas: a human brain is conscious; an animal brain is probably conscious; a stone is not conscious; my stomach is not conscious; a single neuron or cell is not conscious; the brainstem or the visual cortex is not a separate conscious entity; two people do not form a single conscious entity.

Many of these ideas are in fact difficult to justify. Let us start with single cells. To see the problem, think first of organisms that consist of a single cell. For example, bacteria, or ciliates. In this video, an amoeba engulfs and then digests two paramecia. At some point, you can see the paramecia jumping all around as if they were panicking. Are these paramecia conscious, do they feel anything? If I did not know anything about their physiology or size, my first intuition would be that they do feel something close to fear. However, knowing that these are unicellular organisms and therefore do not have a nervous system, my intuition is rather that they are not actually conscious. But why?

Why do we think a nervous system is necessary for consciousness? One reason is that organisms to which we ascribe consciousness (humans and animals, or at least some animals) all have a nervous system. But it’s a circular argument, which has no logical validity. A more convincing reason is that in humans, the brain is necessary and sufficient for consciousness. A locked-in patient is still conscious. On the other hand, any large brain lesion has an impact on conscious experience, and specific experiences can be induced by electrical stimulation of the brain.

However, this tends to prove that the brain is the substrate of my experience, but it says nothing about, say, the stomach. The stomach also has a nervous system, it receives sensory signals and controls muscles. If it were conscious, I could not experience it, by definition, since you can only experience your own consciousness. So it could also be, just as for the brain, that the stomach is sufficient and necessary for consciousness of the gut mind: perhaps if you stimulate it electrically, it triggers some specific experience. As ridiculous as it might sound, I cannot discard the idea that the stomach is conscious just because I don’t feel that it’s conscious; I will need arguments of a different kind.

I know I am conscious, but I do not know whether there are other conscious entities in my body. Unfortunately, this applies not just to the stomach, but more generally to any other component of my body, whether it has a nervous system or not. What tells me that the liver is not conscious? Imagine I am a conscious liver. From my perspective, removing one lung, or a foot, or a large part of the visual cortex, has no effect on my conscious experience. So the fact that the brain is necessary and sufficient for your conscious experience doesn’t rule out the possibility that some other substrate is necessary and sufficient for the conscious experience of another entity in your body. Now I am not saying that the question of liver consciousness is undecidable, only that we will need more subtle arguments than those exposed so far (discussed later).

Let us come back to the single cell. Although I feel that a unicellular organism is not conscious because it doesn’t have a nervous system, so far I have no valid argument for this intuition. In addition, it turns out that Paramecium, like many other unicellular organisms including (at least some) bacteria, is an excitable cell with voltage-gated channels, structurally very similar to a neuron. So perhaps it has some limited form of consciousness after all. If this is true, then I would be inclined to say that all unicellular organisms are also conscious, for example bacteria. But then what about a single cell (e.g. a neuron) in your body, is it conscious? One might object that a single cell in a multicellular organism is not an autonomous organism. To address this objection, I will go one level below the cell.

Eukaryotic cells (eg your cells) have little energy factories called mitochondria. It turns out that mitochondria are in fact bacteria which have been engulfed in cells a very long (evolutionary) time ago. They have their own DNA, but they now live and reproduce inside cells. This is a case of endosymbiosis. If mitochondria were conscious before they lived in cells, why would they have lost consciousness when they started living in cells? So if we think bacteria are conscious, then we must admit that we have trillions of conscious entities in the cells of our body – not counting the bacteria in our digestive system. The concept of an autonomous organism is an illusion: any living organism depends on interactions with an ecosystem, and that ecosystem might well be a cell or a multicellular organism.

By the same argument, if we think unicellular organisms are conscious, then single neurons should be conscious, as well as all single cells in our body. This is not exclusive of the brain being conscious as a distinct entity.

A plausible alternative, of course, is that single cells are not conscious, although I have not yet proposed a good argument for this alternative. Before we turn to a new question, I will let you contemplate the fact that bacteria can form populations that are tightly coupled by electrical communication. Does this make a bacteria colony conscious?

Let us now turn to another question. We can imagine that a cell is somehow minimally conscious, and that at the same time a brain forms a conscious entity of a different nature. Of course it might not be true, but there is a case for that argument. So now let us consider two people living their own life on opposite sides of the planet. Can this pair form a new conscious entity? Here, there are arguments to answer negatively. This is related to a concept called the unity of consciousness.

Suppose I see a red book. In the brain, some areas might respond to the color and some other areas might respond to the shape. It could be then that the color area experiences redness, and the shape area experiences bookness. But I, as a single conscious unit, experience a red book as a whole. Now if we consider two entities that do not interact, then there cannot be united experiences: somehow the redness and the bookness must be put together. So the substrate of a conscious entity cannot be made of parts that do not interact with the rest. Two separated people cannot form a conscious entity. But this does not rule out the possibility that two closely interacting people may form a conscious superentity. Again, I do not believe this is the case, but we need to find new arguments to rule this out.

Now we finally have something a little substantial: a conscious entity must be made of components in interaction. From this idea follow a few remarks. First, consciousness is not a property of a substrate, but of the activity of a substrate (see a previous blog post on this idea). For example, if we freeze the brain in a particular state, it is not conscious. This rules out a number of inanimate objects (rocks) as conscious. Second, interactions take place in time. For example, it takes some time, up to a few tens of ms, for an action potential to travel from one neuron to another. This implies that a 1 ms time window cannot enclose a conscious experience. The “grain” of consciousness for a human brain should thus be no less than a few tens of milliseconds. In the same way, if a plant is conscious, then that consciousness cannot exist on a short timescale. This puts a constraint on the kind of experiences that can be ascribed to a particular substrate. Does consciousness require a nervous system? Maybe it doesn’t, but at least for large organisms, a nervous system is required to produce experiences on a short timescale.

I want to end with a final question. We are asking what kind of substrate gives rise to consciousness. But does consciousness require a fixed substrate? After all, the brain is dynamic. Synapses appear and disappear all the time, all the proteins get renewed regularly. The brain is literally a different set of molecules and a different structure from one day to the next. But the conscious entity remains. Or at least it seems so. This is what Buddhists call the illusion of self: contrary to your intuition, you are not the same person today and ten years ago; the self has no objective permanent existence. However, we can say that there is a continuity in conscious experience. That continuity, however, does not rely on a fixed material basis but more likely on some continuity of the underlying activity. Imagine for example a fictional worm that is conscious, but the substrate of consciousness is local. At some point it is produced by the interaction of neurons at some particular place of the nervous system, then that activity travels along the worm’s spine. The conscious entity remains and doesn’t feel like it’s travelling, it is simply grounded on a dynamic substrate.

Now I don’t think that this is true of the brain (or of the worm), but rather that long-range synchronization has something to do with the generation of a global conscious entity. However, it is conceivable that different subsets of neurons, even though they might span the same global brain areas, are involved in conscious experience at different times. In fact, this is even plausible. Most neurons don’t fire much, perhaps a few Hz on average. But one can certainly have a definite conscious experience over a fraction of a second, and that experience thus can only involve the interaction of a subset of all neurons. We must conclude that the substrate of consciousness is actually not fixed but involves dynamic sets of neurons.

A summary of these remarks. I certainly have raised more questions than I have answered. In particular, it is not clear whether a single cell or a component of the nervous system (stomach, brainstem) is conscious. However, I have argued that: 1) any conscious experience requires the interaction of the components that produce it, and this interaction takes place in time; 2) the set of components that are involved in any particular experience is dynamic, despite the continuity in conscious experience.

Project: Binaural cues and spatial hearing in ecological environments

I previously laid out a few ideas for future research on spatial hearing:

  1. The ecological situation and the computational problem.
  2. Tuning curves.
  3. The coding problem.

This year I wrote a grant that addresses the first point. My project was rather straightforward:

  1. To make binaural recordings with ear mics in real environments, with real sound sources (actual sounding objects) placed at predetermined positions. This way we obtain distributions of binaural cues conditioned on source direction, capturing the variability due to context.
  2. To measure human localization performance in those situations.
  3. To try to see if a Bayesian model can account for these results, and possibly previous psychophysical results.

The project was preselected but unfortunately not funded. I probably won't resubmit it next year, except perhaps with a collaboration. So here it is for everyone to read: my grant application, "Binaural cues and spatial hearing in ecological environments". If you like it, if you want to do these recordings and experiments, please do so. I am interested in the results, but I'm happy if someone else does it. Please contact me if you would like to set up a collaboration, or discuss the project. I am especially interested in the theoretical analysis (ie the third part of the project). Our experience in the lab is primarily on the theoretical side, but also in signal analysis, and we have done a number of binaural recordings too and some psychophysics.

Are journals necessary filters?

In my previous post, I argued that one reason why many people cling to the idea that papers should be formally peer-reviewed before they are published, cited and discussed, despite the fact that this system is a recent historical addition to the scientific enterprise, is a philosophical misunderstanding about the nature of scientific truth. That is, the characteristic of science, as opposed to religion, is that it is never validated; it can and must be criticized. Therefore, no amount of peer reviewing can ever be a stamp of approval for “proven facts”. Instead what we need is public discussion of the science, not stamps of approval.

In response to that post came up another common reason why many people think it’s important to have journals that select papers after peer-review. The reason is that we are crowded with millions of papers and you can’t read everything, so you need some way to know which paper is important, based on peer-review. So here this is not just about peer-reviewing before publishing, but also about the hierarchy of journals. Journals must do an editorial selection so that you don’t have to waste your time reading low-quality papers, or uninteresting papers. What this means, quite literally, is that you only read papers from “top journals”.

Here I want to show that this argument is untenable, because selecting their readings based on journal names is not what scientists should do or actually do, and because the argument is logically inconsistent.

Why is it logically inconsistent? If the argument is correct, then those papers accepted in lower rank journals should not be read because they are not worth reading. But in that case, why publish them at all? There seems to be no reason for the existence of journals that people do not read because they do not have time to read bad papers. If we argue that those journals should exist because in some cases there are some papers worth reading there, for any sort of reason, then we must admit that we don’t actually use journals as filters, or that we should not use them as filters (see below).

Is it good scientific practice to use journal names as filters? What this implies is that you ignore any paper, including papers in your field and potentially relevant to your own studies, that is not published in a “top journal”. So for example, you would not cite a relevant study if it’s not from a top journal. It also means that you don’t check whether your own work overlaps with other studies. So you potentially take credit for ideas that you were not the first to have. Is this a professional attitude?

If in fact you don’t totally ignore those lower journals, then you don’t actually use journal name as a filter, you actually do look at the content of papers independently of the journal they are published in. Which is my final point: to use journal names as filters is not the normal practice of scientists (or maybe I’m optimistic?). When you look for relevant papers on your topic of interest, you typically do a search (e.g. pubmed). Do you only consider papers from “top journals”, blindly discarding all others? Of course not. You first look at the titles to see if they might be relevant; then you read the abstract if they are; if the abstract is promising you might open the paper and skim through it, and possibly read it carefully if you think it is worth it. Then you will look at cited papers; or at papers that cite the interesting paper you just read; or you will read a review; maybe a colleague or your advisor will suggest a few readings. In brief: you do a proper bibliographical search. I cannot believe that any good scientist considers that doing a bibliographical search consists in browsing the table of contents of top journals.

The only case when you do use journal names to select papers to read is indeed when you read tables of contents every month for a few selected journals. How many of the papers that you cite does this account for? You can get a rough idea of this by looking at the cited half-life of papers or journals. For Cell, it’s about 9 years. I personally also follow new papers on bioRxiv using keywords, while most new papers in journals are irrelevant to me because they cover too many topics.

In summary: using journals as filters is not professional because it means poor scholarship and misattribution of credit. Fortunately it’s not what scientists normally do anyway.

One related argument that came out in the discussion of my previous post is that having papers reviewed post-publication could not work because that would be too much work, and consequently most papers would not be reviewed, while at least in the current system every paper is peer reviewed. That is wrong in several ways. First, you can have papers published then peer-reviewed formally and publicly (as in F1000 Research), without this being coupled to editorial selection. Second, if anything, having papers submitted a single time instead of many times to different journals implies that there will be less work for reviewers, not more. Third, what exactly is the advantage of having each paper peer-reviewed if it is argued that those papers should not be read or cited? In the logic where peer review in “good journals” serves as a filter for important papers, it makes no difference whether the unimportant papers are peer reviewed or not, so this cannot count as a valid argument against post-publication review.

All this being said, there is still a case for editorial selection after publication, as one of the many ways to discover papers of interest, see for example my free journal of theoretical neuroscience.

The great misunderstanding about peer review and the nature of scientific facts

Last week I organized a workshop on the future of academic publication. My point was that our current system, based on private pre-publication peer review, is archaic. I noted that the way the peer review system is currently organized (where external reviewers judge both the quality of the science and the interest for the journal) represents just a few decades in the history of science. It can hardly qualify as the way science is or should be done. It is a historical feature. For example, only one of Einstein’s papers was formally peer-reviewed; Crick & Watson’s DNA paper was not formally peer-reviewed. Many journals introduced external peer review in the 1960s or 1970s to deal with the growth in the number and variety of submissions (see e.g. Baldwin, 2015); before that, editors would decide whether to publish the papers they received, depending on the number of pages they could print.

Given the possibilities that the internet offers, it seems that there is no reason anymore to couple the two current roles of peer review: editorial selection and scientific discussion. One could simply share their work online, get feedback from the community to discuss the work, and then let people recommend papers to their colleagues and compile all sorts of reader’s digests. No time wasted in multiple submissions, no prestige misattributed to publications in glamour journals, which do not do a better job than any other journal at pointing out errors and fraud. Just the science and the public discussion of science.

But there is a lot of resistance to this idea, rooted in the belief that papers should be formally approved by peer reviewers before they are published, because otherwise, so many people claim, the scientific world would be polluted by all sorts of unverified claims. It would not be science anymore, just gossip. I have attributed this attitude to conservatism, first because as noted above this system is a rather recent addition to the scientific enterprise, and second because papers are already published before peer review. We call those “preprints”, but really these are scientific papers made public, so by definition they are published. I follow the preprints in my field and I don’t see any particular loss in quality.

However, I think I was missing a key element. The more profound reason why many people, in particular experimental biologists, are so attached to peer review is in my view that they hold naive philosophical views about the notion of truth in science. A paper should be peer-reviewed because otherwise you can’t cite it as a true fact. Peer review validates science, thanks to experts who make sure that the claims of the authors are actually true. Of course it can go wrong and reviewers might miss something, but it is the purpose of peer review. This view is reflected in the tendency, especially in biology journals, to choose titles that look like established truths: “Hunger is controlled by HGRase”, instead of “The molecular control of hunger”. Scientists and journalists can then write revealed truths with a verse reference, such as “Hunger is controlled by HGRase (McDonald et al., 2017)”.

The great misunderstanding is that truth is a notion that applies to logical propositions (for example, mathematical theorems), not to empirical claims. This has been well argued by Popper, for example. Truth is by nature a theoretical concept. Everything said is said with words, and in this sense it always refers to theoretical concepts. One can only judge whether observations are congruent with the meaning attributed to the words, and that meaning necessarily has a theoretical nature. There is no such thing as an “established fact”. This is so even of what we might consider as direct observations. Take for example the claim “The resting potential of neurons is -70 mV”. This is a theoretical statement. Why? First, because to establish it, I have recorded a number of neurons. If you test it, it will be on a different neuron, which I have not measured. So I am making a theoretical claim. Probably, I also tested my neurons with a particular method (not to mention a particular region and species). But my claim makes no reference to the method by which I have made the inference. That would be the “methods” part of my paper, not the conclusion. When you cite my paper, you will cite it because of the conclusion, the “established fact”; you will not be referring to the methods, which you consider to be the means to establish the fact. It is the role of the reviewers to check the methods, to check that they do establish the fact.

But these are trivial remarks. It is not just that the method matters. The very notion of an observation always implicitly relies on a theoretical background. When I say that the resting potential is -70 mV, I mean that there is a potential difference of -70 mV across the membrane. But that’s not what I measure. I measure the difference in potential between some point outside the cell and the inside of a patch pipette whose solution is in contact with the cell’s inside. So I am assuming the potential is the same at all points of the cytosol, even though I have not tested it. I am also implicitly modeling the cytosol as a solution, even though the reality is more complex than that, given the mass of charged proteins in it. I am assuming that the extracellular potential is constant. I am assuming that my pipette solution reasonably matches the actual cytosol solution, given that “solution” is only a convenient model. I am implicitly making all sorts of theoretical assumptions, which have a lot of empirical support but are still of a theoretical nature.

I have tried with this example to show that even a very simple “fact” is actually a theoretical proposition, with many layers of assumptions. But of course in general, papers typically make claims that rely less firmly on accepted theoretical grounds, since they must be “novel”. So it is never the case that a paper definitely proves its conclusions. Because conclusions have a theoretical nature, all that can be checked is whether observations are consistent with the authors’ interpretation.

So the goal of peer review can’t be to establish the truth. If it were the case, then why would reviewers ever disagree? They disagree because they cannot actually judge whether a claim is true; they can only say whether they are personally convinced. This makes the current peer review system extremely poor, because all the information we get is: two anonymous people were convinced (and maybe others were not, but we’ll never find out). What would be more useful would be to have an open public discussion, with criticisms, qualifications and alternative interpretations fully disclosed for anyone to read and form their own opinion. In such a system, the notion of a stamp of approval on a paper would simply be absurd; why hide the disapprovals? There is the paper, and there is the scientific discussion of the paper, and that is all there needs to be.

There is some concern these days that peer reviewed research is unreliable. Well, science is unreliable. That is almost what defines it: it can be criticized and revised. Seeing peer review as the system that establishes the scientific truth is not only a historical error, it is a great philosophical error, and a dangerous bureaucratic view of science. We don’t need editorial decisions based on peer review. We need free publication (we have it) and we need open scientific discussion (it’s coming). That’s all we need.

What is computational neuroscience? (XXVII) The paradox of the efficient code and the neural Tower of Babel

A pervasive metaphor in neuroscience is the idea that neurons “encode” stuff: some neurons encode pain; others encode the location of a sound; maybe a population of neurons encodes some other property of objects. What does this mean? In essence, that there is a correspondence between some objective property and neural activity: when I feel pain, this neuron spikes; or, the image I see is “represented” in the firing of visual cortical neurons. The mapping between the objective properties and neural activity is the “code”. How insightful is this metaphor?

An encoded message is understandable to the extent that the reader knows the code. But the problem with applying this metaphor to the brain is that only the encoded message is communicated, not the code, and not the original message. Mathematically, original message = encoded message + code, but only one term is communicated. This could still work if there were a universal code that we could assume all neurons can read, the “language of neurons”, or if somehow some information about the code could be gathered from the encoded messages themselves. Unfortunately, this is in contradiction with the main paradigm in neural coding theory, “efficient coding”.

The efficient coding hypothesis stipulates that neurons encode signals into spike trains in an efficient way, that is, they use a code such that all redundancy is removed from the original message while preserving information, in the sense that the encoded message can be mapped back to the original message (Barlow, 1961; Simoncelli, 2003). This implies that with a perfectly efficient code, encoded messages are indistinguishable from random. Since the code is determined from the statistics of the inputs and only the encoded messages are communicated, a code is efficient to the extent that it is not understandable by the receiver. This is the paradox of the efficient code.
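
To make the point concrete, here is a rough sketch in Python. It uses a general-purpose compressor (zlib) as a stand-in for an efficient code; this is an assumption made for illustration, not a model of neurons, and the message and its symbol statistics are invented. The idea is only to show that once redundancy is removed, the encoded bytes look close to random, even though the original can still be recovered by someone who knows the code.

```python
import math
import random
import zlib
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Empirical entropy of the byte distribution, in bits per byte."""
    counts = Counter(data)
    total = len(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical "sensory input": a highly redundant message in which a few
# symbols occur far more often than the others.
rng = random.Random(0)
message = bytes(rng.choices(b"ABCD", weights=[70, 15, 10, 5], k=20000))

# The "code" here is zlib's compression scheme (an assumed stand-in).
encoded = zlib.compress(message, 9)

# A receiver that knows the code can map the encoded message back.
decoded = zlib.decompress(encoded)

entropy_message = byte_entropy(message)   # low: the message is redundant
entropy_encoded = byte_entropy(encoded)   # close to the 8-bit maximum

print(f"original: {len(message)} bytes, {entropy_message:.2f} bits/byte")
print(f"encoded:  {len(encoded)} bytes, {entropy_encoded:.2f} bits/byte")
```

A receiver that only sees `encoded`, without knowing that zlib produced it, has very little statistical structure left to exploit in order to guess the code.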

In the neural coding metaphor, the code is private and specific to each neuron. If we follow this metaphor, this means that all neurons speak a different language, a language that allows expressing concepts very concisely but that no one else can understand. Thus, according to the coding metaphor, the brain is a Tower of Babel.

Can this work?

10 simple rules to format a preprint

Submitting papers to preprint servers (e.g. bioRxiv) is finally getting popular in biology. Unfortunately, many of these papers are formatted in a way that is very inconvenient to read, possibly because authors stick to the format required by journals. Here are 10 simple rules to format your preprints:

  1. Format your preprint in the way you would like to read it. The next rules simply implement this first rule.
  2. Use single spacing. No one is going to write between the lines.
  3. Insert figures and their captions in the text, at the relevant place. It is really annoying when you have to continuously go back and forth between the text and the last pages. Putting figures at the end of the paper and captions in yet another place should be punished.
  4. Don’t forget the supplementary material.
  5. We don’t really need 10 rules. In fact the first rule is just fine.

What is computational neuroscience? (XXV) Are there biological models in computational neuroscience?

Computational neuroscience is the science of how the brain “computes”, that is, how the brain performs cognitive functions such as recognizing a face or walking. Here I will argue that most models of cognition developed in the field, especially as regards sensory systems, are actually not biological models but hybrid models consisting of a neural model together with an abstract model.

First of all, many neural models are not meant to be models of cognition. For example, there are models that are developed to explain the irregular spiking of cortical neurons, or oscillations. I will not consider them. According to the definition above, I categorize them in theoretical neuroscience rather than computational neuroscience. Here I consider for example models of perception, memory, motor control.

An example that I know well is the problem of localizing a sound source from timing cues. There are a number of models, including a spiking neuron model that we have developed (Goodman and Brette, 2010). This model takes as input two sound waves, corresponding to the two monaural sounds produced by the sound source, and outputs the estimated direction of the source. But the neural model, of course, does not output a direction. Rather, the output of the neural model is the activity of a layer of neurons. In the model, we consider that direction is encoded by the identity of the maximally active neuron. In another popular model in the field, direction is encoded by the relative total activity of two groups of neurons (see our comparison of models in Goodman et al. 2013). In all models, there is a final step which maps the activity of neurons to estimated sound location, and this step is not a neural model but an abstract model. This causes big epistemological problems when it comes to assessing and comparing the empirical value of models because a crucial part of the models is not physiological. Some argue that neurons are tuned to sound location; others that population activity varies systematically with sound location. Both are right, and thus neither of these observations is a decisive argument to discriminate between the models.
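
The structure of such hybrid models can be sketched in a few lines of Python. This is a toy illustration, not the model of Goodman and Brette (2010): the Gaussian tuning curves, the number of neurons, the tuning width and all function names are assumptions made up for the example. What it is meant to show is that the same layer activity can be read out by two different decoders, and that both decoders sit outside the neural part of the model.

```python
import math

# Hypothetical layer: 21 neurons with preferred interaural time differences
# (ITDs) evenly spaced from -500 to +500 microseconds.
PREFERRED_ITDS = [-500 + 50 * i for i in range(21)]
TUNING_WIDTH = 150.0  # microseconds; assumed Gaussian tuning width


def layer_activity(itd_us: float) -> list[float]:
    """The 'neural model': the firing rate of each neuron for a given ITD."""
    return [math.exp(-((itd_us - p) ** 2) / (2 * TUNING_WIDTH ** 2))
            for p in PREFERRED_ITDS]


def peak_decoder(activity: list[float]) -> float:
    """Abstract model 1: the preferred ITD of the maximally active neuron."""
    best = max(range(len(activity)), key=lambda i: activity[i])
    return PREFERRED_ITDS[best]


def hemispheric_decoder(activity: list[float]) -> float:
    """Abstract model 2: the relative total activity of two groups of neurons.

    Returns an index in [-1, 1]; turning it into a direction would require
    yet another mapping, chosen by the modeler.
    """
    left = sum(a for a, p in zip(activity, PREFERRED_ITDS) if p < 0)
    right = sum(a for a, p in zip(activity, PREFERRED_ITDS) if p > 0)
    return (right - left) / (right + left)


activity = layer_activity(200.0)                   # neural part
peak_estimate = peak_decoder(activity)             # abstract part, version 1
hemispheric_index = hemispheric_decoder(activity)  # abstract part, version 2
```

Neither `peak_decoder` nor `hemispheric_decoder` is implemented by neurons here; both are choices made by the modeler, and the same `activity` is compatible with either.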

The same is seen in other sensory modalities. The output is the identity of a face; or of an odor; etc. The symmetrical situation occurs in motor control models: this time the abstract model is on the side of the input (mapping from spatial position to neural activity or neural input). Memory models face this situation twice, with abstract models both on the input (the thing to be memorized) and the output (the recall).

Fundamentally, this situation has to do with the fact that most models in computational neuroscience take a representational approach: they describe how neural networks represent in their firing some aspect of the external world. The representational approach requires defining a mapping (called the “decoder”) from neural activity to objective properties of objects, and this mapping cannot be part of the neural model. Indeed, sound location is a property of objects and thus does not belong to the domain of neural activity. So no sound localization model can ever be purely neuronal.

Thus to develop biological models, it is necessary to discard the representational approach. Instead of “encoding” things, neurons control the body; neurons are agents (rather than painters in the representational approach). For example, a model of sound localization should be a model of an orientational response, including the motor command. The model explains not how space is “represented”, but how an animal orients its head (for example) to a sound source. When we try to model an actual behavior, we find that the nature of the problem changes quite significantly. For example, because a particular behavior is an event, neural firing must also be seen as events. In this context, counting spikes and looking at the mutual information between the count and some stimulus property is not very meaningful. What matters is the events that the spikes trigger in the targets (muscles or other neurons). The goal is not to represent the sensory signals but to produce an appropriate behavior. One also realizes that the relation between sensory signals and actions is circular, and therefore cannot be adequately described as “processing”: sensory signals make you turn the head, but if you turn the head, the sensory signals change.
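
The circularity can be made explicit with a minimal closed-loop sketch in Python. Everything in it is an assumption chosen for illustration (a single sinusoidal cue, a fixed source, a turn proportional to the cue, the constants); it is not a model of any real animal. The point is only that there is no decoding step: the cue directly drives a turn, and the turn changes the cue.

```python
import math

SOURCE_AZIMUTH = 60.0  # degrees; fixed position of the sound source (assumed)
TURN_GAIN = 40.0       # degrees of head turn per unit of cue (assumed)


def sensory_signal(head_azimuth: float) -> float:
    """Stand-in for a binaural cue: it depends on where the head points."""
    return math.sin(math.radians(SOURCE_AZIMUTH - head_azimuth))


def head_turn(signal: float) -> float:
    """Motor side: turn by an amount proportional to the cue, no decoding."""
    return TURN_GAIN * signal


head = 0.0
trajectory = [head]
signals = []
for _ in range(30):
    s = sensory_signal(head)  # the signal depends on the current posture
    signals.append(s)
    head += head_turn(s)      # the action changes the next signal
    trajectory.append(head)
```

In this sketch the head ends up facing the source although no variable ever holds an estimate of the source direction; the behavior is a property of the loop, not of a representation.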

Currently, most models of cognition in computational neuroscience are not biological models. They include neuron models together with abstract models, a necessity stemming from the representational approach. To make a biological model requires including a model of the sensorimotor loop. I believe this is the path that the community should take.