Is the coding metaphor relevant for the genome?

I have argued that the neural coding metaphor is highly misleading (see also similar arguments by Mark Bickhard in cognitive science). The coding metaphor is very popular in neuroscience, but there is another domain of science where it is also very popular: genetics. Is there a genetic code? Many scientists have criticized the idea of a genetic code (and of a genetic program). A detailed criticism can be found in Denis Noble’s book “The music of life” (see also Noble 2011 for a short review).

Many of the arguments I have made in my essay on neural coding readily apply to the “genetic code”. Let us start with the technical use of the metaphor. The genome is a sequence of DNA base triplets called “codons” (ACG, TGA, etc.). Each codon specifies a particular amino-acid, and proteins are made of amino-acids. So there is a correspondence between DNA and amino-acids. This seems an appropriate use of the term “code”. But even in this limited sense, it should be used with caution. The fact that a base triplet encodes an amino-acid is conditional on this triplet being effectively translated into an amino-acid (note that there are two stages: transcription into RNA, then translation into a protein). But in fact only a small fraction of a genome is actually translated, about 10% (depending on the species); the rest is called “non-coding DNA”. So the same triplets can result in the production of an amino-acid, or they can influence the transcription-translation system in various ways, for example by interacting with various molecules involved in the production of RNA and proteins, thereby regulating transcription and translation (and this is just one example).
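This narrow, technical sense of the "code" can be written down directly: a lookup table from codons to amino-acids. The sketch below uses a deliberately partial table (the standard genetic code has 64 entries); it illustrates only the correspondence itself, not transcription, regulation, or splicing, which is precisely what it leaves out.

```python
# Minimal sketch of the codon -> amino-acid correspondence.
# Partial table of the standard genetic code (64 codons in total);
# '*' marks a stop codon.
CODON_TABLE = {
    "ATG": "Met",  # methionine; also the usual start codon
    "TTT": "Phe", "TTC": "Phe",
    "ACG": "Thr",
    "GGA": "Gly",
    "TAA": "*", "TAG": "*", "TGA": "*",  # stop codons
}

def translate(dna):
    """Translate a coding DNA sequence codon by codon, stopping at a stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE.get(dna[i:i+3], "?")  # '?' = codon not in this partial table
        if aa == "*":
            break
        protein.append(aa)
    return "-".join(protein)

print(translate("ATGTTTACGTAA"))  # Met-Phe-Thr, then stop
```

Note that the function is only meaningful on the ~10% of the genome that is actually translated; applied to non-coding DNA, the same lookup produces output that corresponds to nothing in the cell.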

Even when DNA does encode amino-acids, it does not follow that a gene encodes a protein. What might be said is that a gene encodes the primary structure of proteins, that is, the sequence of amino-acids; but it does not specify by itself the shape that the protein will take (which determines its chemical properties), the various modifications that occur after translation, or the position that the protein will take in the cellular system. All of these crucial properties depend on the interaction of the product of transcription with the cellular system. In fact, even the primary structure of proteins is not fully determined by the gene, because of splicing.

Thus, the genome is not just a book, as suggested by the coding metaphor (some have called the genome the “book of life”); it is a chemically active substance that interacts with its chemical environment, a part of a larger cellular system.

At the other end of the genetic code metaphor, genes encode phenotypes, traits of the organism. For example, the gene for blue eyes. A concept that often appears in the media is the idea of genes responsible for diseases. One hope behind the human genome project was that by scrutinizing the human genome, we might be able to identify the genes responsible for every disease (at least for every genetic disease). Some diseases are monogenic, i.e., due to a single gene defect, but the most common diseases are polygenic, i.e., are due to a combination of genetic factors (and generally environmental factors).

But even the idea of monogenic traits is misleading. There is no single gene that encodes a given trait. What has been demonstrated in some cases is that mutations in a single gene can impact a given trait. But this does not mean that the gene is responsible by itself for that trait (surprisingly, this fallacy is quite common in the scientific literature, as pointed out by Yoshihara & Yoshihara 2018). A gene by itself does nothing. It needs to be embedded into a system, namely a cell, in order to produce any phenotype. Consequently, the expressed phenotype depends on the system in which the gene is embedded, in particular the rest of the genome. There cannot be a gene for blue eyes if there are no eyes. So no gene can encode the color of eyes; this encoding is at best contextual (in the same way as “neural codes” are always contextual, as discussed in my neural coding essay).

So the concept of a “genetic code” can only be correct in a trivial sense: that the genome, as a whole, specifies the organism. This clearly limits the usefulness of the concept, however. Unfortunately, even this trivial claim is also incorrect. An obvious objection is that the genome specifies the organism only in conjunction with the environment. The deeper objection is that the immediate environment of the genome is the cell itself. No entity smaller than the cell can live or reproduce. The genome is not a viable system, and as such it cannot produce an organism, nor can it reproduce. An interesting experiment is the following: the nucleus (and thus the DNA) from an animal cell is transferred to the egg of an animal of another species (where the nucleus has been removed) (Sun et al., 2005). The “genetic code” theory would predict that the egg would develop into an animal of the donor species. What actually happens (this was done in related fish species) is that the egg develops into some kind of hybrid, with the development process closer to that of the recipient species. Thus, even in the most trivial sense, the genome does not encode the organism. Finally, since no entity smaller than the cell can reproduce, it follows that the genome is not the unique basis of heritability – the entire cell is (see Fields & Levin, 2018).

In summary, the genome does not encode much except for amino-acids (for about 10% of it). It should be conceptualized as a component that interacts with the cellular system, not as a “book” that would be read by some cellular machinery.

What is computational neuroscience? (XXXIII) The interactivist model of cognition

The interactivist model of cognition has been developed by Mark Bickhard over the last 40 years or so. It is related to the viewpoints of Gibson and O’Regan, among others. The model is described in a book (Bickhard and Terveen, 1996) and a more recent review (Bickhard 2008).

It starts with a criticism of what Bickhard calls “encodingism”, the idea that mental representations are constituted by encodings, correspondences between things in the world and symbols (this is very similar to my criticism of the neural coding metaphor, except that Bickhard’s angle is cognitive science while mine was neuroscience). The basic argument is that the encoding “crosses the boundary of the epistemic agent”: the perceptual system stands on only one side of the correspondence, so there is no way it can interpret symbols in terms of things in the world, since it never has access to things in the world at any point. The interpretation of the symbols in terms of things in the world would require an interpreter, some entity that makes sense of a priori arbitrary symbols. But this was precisely the epistemic problem to be solved, so the interpreter is a homunculus and the view is incoherent. This is related to the skeptic argument about knowledge: there cannot be valid knowledge, since we acquire knowledge through our senses and we cannot step outside of ourselves to check that it is valid. Encodingism fails the skeptic objection. Note that Bickhard rejects neither the possibility of representations nor even the possibility of encodings, but rather the idea that encodings can be the foundation of representations. There can be derivative encodings, based on existing representations (for example, Morse code is a derivative encoding, which presupposes that we already know about both letters and dots and dashes).

A key feature that a representational system must have is what Bickhard calls “system-detectable errors”. A representational system must be able to test whether its representations are correct or not. This is not possible in encodingism because the system does not have access to what is being represented (knowledge that cannot be checked is what I called “metaphysical knowledge” in my Subjective physics paper). No learning is possible if there are no system-detectable errors. This is the problem of normativity.

The interactivist model proposes the following solution: representations are anticipations of potential interactions and their expected impact on future states of the system, or on the future course of processes of the system (this is close to Gibson’s “affordances”). I give an example taken from Subjective physics. Consider a sound source located somewhere in space. What does it mean to know where the sound came from? In the encoding view, we would say that the system has a mapping between the angle of the source and properties of the sounds, and so it infers the source’s angle from the captured sounds. But what can this mean? Is the inferred angle in radians or degrees? Surely radians and degrees cannot make sense for the perceiver and cannot have been learned (this is what I called “metaphysical knowledge”), so in fact the representation cannot actually be in the form of the physical angle of the source. Rather, what it means that the source is at a given position is (for example) that you would expect that moving your eyes in a particular way would make the source appear in your fovea (see more detail about the Euclidean structure of space and related topics in Subjective physics). Thus, the notion of space is a representation of the expected consequences of certain types of actions.

The interactivist model of representations has the desirable property that it has system-detectable errors: a representation can be correct or not, depending on whether the anticipation turns out to be correct or not. Importantly, what is anticipated is internal states, and therefore the representation does not cross the boundary of the epistemic agent. Contrary to standard models of representation, the interactivist model successfully addresses the skeptic argument.

The interactivist model is described at a rather abstract level, often referring to abstract machine theory (states of automata). Thus, it leaves aside the problem of its naturalization: how is it instantiated by the brain? Important questions to address are: What is a ‘state’ of the brain (in particular given that the brain is a continuously active dynamical system where no “end state” can be identified)? How do we cope with its distributed nature, that is, with the fact that the epistemic agent is itself constituted of a web of interacting elementary epistemic agents? How are representations built and instantiated?

Better than the grant lottery

Funding rates for most research grant systems are currently very low, typically around 10%. This means that 90% of the time spent on writing and evaluating grant applications is wasted. It also means that if each grant spans 5 years, a PI has to write about 2 grants per year to be continuously funded; in practice, to reduce risk, it should be more than 2 per year. This is an enormous waste, and in addition, it is accepted that below a certain funding rate, grant selection is essentially random (Fang et al., 2016). Such competition also introduces conservative biases (since only applications that are consensual can make it to the top 10%), for example against interdisciplinary studies. Thus, low funding rates are a problem not only because of waste but also because they introduce distortions.

For these reasons, a number of scientists have proposed to introduce a lottery system (Fang 2016; see also Mark Humphries’ post): after a first selection of, say, the top 20-30%, the winners are picked at random. This would reduce bias without impacting quality. Thus, it would certainly be progress. However, it does not address the problem of waste: 90% of applications would still be written in vain.

First, there is a very elementary enhancement to be implemented: pick at random before you evaluate the grants, i.e., directly reject every other grant, then select the best 20%. This gives exactly the same result, except the cost of evaluation is divided by two.

Now I am sure it would feel quite frustrating for an applicant to write a full grant only to get immediately rejected by the flip of a coin. So there is again a very simple enhancement: decide who will get rejected before they write the application. Pick at random 50% of scientists and invite them to submit a grant. Again, the result is the same, but in addition you reduce the time spent on grant writing by two.

At this point we might wonder: why do this initial selection at random? It introduces variance for no good reason. You never know in advance whether you will be allowed to seek funding next year, and this seems arbitrary. Thus, there is an obvious enhancement: replace the lottery with a rotation. Every PI is allowed to submit a grant only every two years. Again, this is equivalent on average to the initial lottery system, except there is less variance and less waste.

This reasoning leads me to a more general point. There is a simple way to increase the success rate of a grant system, which is to reduce the number of applications. The average funding rate of labs does not depend on the number of applications; it depends on the budget and only on the budget. If you bar 50% of scientists from applying, you do not thereby halve the average budget of each lab: the average budget allocated to each lab is the same, but the success rate is doubled.
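The arithmetic can be checked with a toy model. With a fixed budget of B funded grants per cycle and N labs each submitting s applications, the success rate per application is B/(N·s), while the expected number of funded grants per lab is B/N regardless of s. The numbers below are illustrative, not taken from the post.

```python
# Toy model of a grant system with a fixed budget.
# budget and labs are illustrative numbers.
budget = 100   # grants funded per cycle (fixed by the funder)
labs = 1000    # number of labs

for apps_per_lab in (1, 2, 4):
    applications = labs * apps_per_lab
    success_rate = budget / applications   # per-application success rate
    funded_per_lab = budget / labs         # independent of apps_per_lab
    print(f"{apps_per_lab} application(s)/lab: "
          f"success rate {success_rate:.1%}, "
          f"expected funded grants per lab {funded_per_lab:.2f}")
```

Doubling the number of applications per lab halves the success rate while leaving each lab's expected funding untouched, which is the tragedy-of-the-commons structure described below.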

The counter-intuitive part is that individually, you increase your personal success rate by applying to more calls. But collectively it is exactly the opposite: the global success rate decreases if there are more calls (for the same overall budget), since there are more applications. The success rate is low because of other people submitting, not because you are submitting. This is a tragedy of the commons.

There is a simple way to solve it, which is to add constraints. There are different ways to do this: 1) reduce the frequency of calls and merge redundant calls; 2) introduce a rotation (e.g., those born in even years submit in even years); 3) do not allow submission by those who are already funded (or, say, in the first years of a grant). Any of these constraints mechanically increases the success rate, and thus reduces both waste and bias, with no impact on average funding. It is better than a lottery.


p.s.: There is also an obvious and efficient way to reduce the problem, which is to increase base funding, so that scientists do not need grants in order to survive (see this and other ideas in a previous post).

Predatory journals: what is the problem?

A recent article in Le Monde warns about a growing phenomenon in scientific publishing: predatory journals (see also the editorial). These are commercial publishers that publish scientific articles online, for a fee, without any scientific ethics, in particular by accepting all articles without peer review. Similarly, fake conferences are multiplying: companies organize scientific conferences for purely commercial purposes, with no concern for scientific quality.

In reaction, some institutions are starting to compile blacklists of journals to avoid. This is understandable, since the phenomenon has a significant cost. But this response misses the fundamental problem. We must face the facts: commercial ethics (the pursuit of profit) is not compatible with scientific ethics (the pursuit of truth). The companies in question are not illegal, as far as I know. They organize conferences that are real; they publish journals that are real. They simply care not about scientific quality, but about their profit. We consider this immoral; but a commercial company has no moral dimension, it is simply an organization whose purpose is to generate profit. We cannot expect commercial interests to magically coincide exactly with scientific interests.

  1. The problem of commercial publishing

This is true at both ends of the academic publishing spectrum: for predatory journals as well as for prestigious journals. The Le Monde article speaks of “fake science”; but most cases of scientific fraud have been exposed in prestigious journals, not in predatory journals, which in any case are not read by the scientific community (see for example Brembs (2018) on the link between methodological quality and journal prestige). For prestigious commercial journals, the publishers’ commercial strategy is not to maximize the number of published articles, but to maximize the perceived prestige of these journals, which then serve as bait for selling the publisher’s journal bundles. In other words, it is a branding strategy. This involves, in particular, a drastic selection of submitted articles, carried out by professional editors, that is, not by professional scientists, on the basis of the perceived importance of the results, thereby pushing a generation of scientists to inflate the claims of their articles. It also involves promoting dubious metrics such as the impact factor to public institutions, and more generally promoting a mythology of prestigious publication, namely the false and dangerous idea that an article should be judged by the prestige of the journal in which it is published, rather than by its intrinsic scientific value, which is assessed by the scientific community, not by a commercial publisher, nor even by two anonymous scientists. Proposing to compile lists of bad journals does not solve the problem, because it implicitly subscribes to this perverse logic.

One only needs to look at the margins of the big multinational scientific publishers to understand that the commercial model is not appropriate. For Elsevier, for example, margins are around 40%. Merely reading this figure should immediately convince us that scientific publishing should be run by public, or at least non-commercial, institutions (for example learned societies, as is the case for a number of journals). What is the justification for using a commercial operator to run a public service, or any service? The rationale is that competition lowers costs and improves quality. But if margins are at 40%, competition is visibly not operating. Why? Simply because when a scientist submits an article, they do not choose the journal according to its price or even the service provided (which is in fact essentially provided by volunteer scientists), but according to the visibility and prestige of the journal. There is therefore no price competition. The worst thing that could happen to a commercial publisher is that scientific articles be judged on their intrinsic value rather than by the journal in which they are published, because this unique business model would then collapse and journals would have to compete on prices and services, like any other commercial enterprise. It is the worst thing that could happen to commercial publishers, and the best thing that could happen to the scientific community. This is why commercial and scientific interests diverge.

In any case, we must face the facts: such enormous margins mean that the commercial model is inefficient. We should therefore stop using commercial journals immediately. This is not very difficult: public institutions are perfectly capable of running scientific journals; such journals exist, and have for a long time. A recent example is eLife, currently one of the most innovative journals in biology. This should not be very surprising: the core of a journal’s activity, namely the reviewing of articles, is already done by scientists, including at commercial publishers, which use their services for free. This does not mean that private companies cannot be used to provide services, for example hosting servers, managing websites, or providing infrastructure. But journals should no longer belong to commercial companies, whose interest is to manage these journals as brands. Scientific ethics is not compatible with commercial ethics.

How can this be done? In fact, it is quite obvious. Public authorities should cancel all subscriptions to commercial publishers and stop paying publication fees to these publishers. Nowadays, it is not difficult to access the scientific literature without going through the journals (through preprints, or simply by writing to the authors, who are generally delighted that someone is interested in their work). Part of the money saved can be reinvested in non-commercial scientific publishing.

  2. The myth of peer review

I now want to turn to a more subtle but fundamental question of epistemology. What, at bottom, is the problem with predatory journals? Clearly, there is the waste of public money. But the Le Monde article also points to scientific problems, namely the fact that false information is propagated without having been checked. The editorial indeed speaks of the sacrosanct “peer review”, which these journals do not perform. But is that really the fundamental problem?

The idea that the value of a scientific article lies in its having been validated by peer review before publication is a tenacious but nonetheless mistaken myth. It is wrong both empirically and theoretically.

From an empirical point of view, at any moment there are contradictory conclusions in the literature on a large number of subjects, published in traditional journals. Recent cases of fraud concern articles that nevertheless went through peer review. But the same is true of a much larger number of non-fraudulent articles whose conclusions were later disputed. The history of science is full of contradictory, coexisting scientific theories, and of bitter debates between scientists. These debates take place, precisely, after publication, and scientific consensus generally forms rather slowly, almost never on the basis of a single article (see for example Imre Lakatos in the philosophy of science, or Thomas Kuhn). Moreover, scientific results are often disseminated in the scientific community before formal publication; this is the case today with online preprints, but it was already partially the case before with conferences. The published article remains the reference because it provides precise details, particularly methodological ones, but the contribution of the reviewers solicited by the journals is in most cases not essential, especially since it is generally not made public.

From a theoretical point of view, there is no reason why peer review should “validate” a scientific result. There is nothing magical about peer review: simply, two, sometimes three, scientists give their informed opinion on the manuscript. These scientists are no more expert than those who will read the article once it is published (I mean, of course, the scientific community, not the general public). The fact that an article is published in a journal says little in itself about the reception of the results by the community; when an article is rejected by one journal, it is resubmitted elsewhere. The final publication in no way attests to a scientific consensus. Moreover, in the case of empirical studies, the reviewers do not actually have the possibility of verifying the results, and in particular of checking that there was no fraud. All they can do is check that the methods used seem appropriate, and that the interpretations seem sensible (two points often subject to debate). To validate the results (though not the interpretations), one would at the very least have to be able to redo the experiments in question, which requires the necessary time and equipment. This indispensable work is done (or attempted), but it is not done at the time of publication, nor is it commissioned by the journal. It is done after publication, by the scientific community. The work of “verification” (an inappropriate word, since there is no absolute truth in science, which is precisely what distinguishes it from religion) is the long-term work of the scientific community, not the one-off work of the journal.

It is this received idea that must be deconstructed: that the journal’s internal review work somehow “validates” the scientific results. This is not the case, it has never been the case, and it cannot be the case. Scientific validation is the very nature of the scientific enterprise, which is a collective, long-term endeavor. One cannot read an article and conclude “it is true”; for that, one must integrate it into a body of scientific knowledge and confront the interpretation with different points of view (since any interpretation requires a theoretical framework).

It is precisely this received idea that prestigious journals try, on the contrary, to consolidate. It must be resisted. The antidote is to make scientific debate public and transparent, whereas it currently often remains confined to the corridors of laboratories and conferences. It is claimed that peer review validates scientific results, but the review reports are most of the time not published; and what about the unpublished reports when the article is rejected by a journal? How, then, can we know what the community thinks? On the contrary, scientific debate must be made public. This is, for example, the ambition of sites such as PubPeer, which has brought to light a number of frauds, but which can also simply be used for scientific debate in general. Rather than making publication conditional on the confidential agreement of anonymous scientists, this system should be inverted: publish the article (which is in fact already done through preprints), then solicit the opinions of the community, which will also be published, argued, and discussed by the authors and the rest of the community. This is how scientists, but also the wider public, can obtain a more accurate view of the scientific value of published articles. Peer review is a fundamental principle of science, yes, but not the review carried out confidentially by journals; rather, the review carried out in the open, with no time limit, by the scientific community.

What is computational neuroscience? (XXXII) The problem of biological measurement (2)

In the previous post, I pointed out differences between biological sensing and physical measurement. A direct consequence is that it is not so straightforward to apply the framework of control theory to biological systems. At the level of behavior, it seems clear that animal behavior involves control; this is well documented in the case of motor control. But this is the perspective of an external observer: the target value, the actual value and the error criterion are identified through physical measurements made by an external observer. But how does the organism achieve this control, from its own perspective?

What the organism does not do, at least not directly, is measure the physical dimension and compare it to a target value. Rather, the biological system is influenced by the physical signal and reacts in a way that makes the physical dimension closer to a target value. How? I do not have a definite answer to this question, but I will explore a few possibilities.

Let us first explore a conventional possibility. The sensory neuron encodes the sensory input (e.g., muscle stretch) in some way; the control system decodes it, and then compares it to a target value. So for example, let us say that the sensory neuron is an integrate-and-fire neuron. If the input is constant, then the interspike interval can be mapped back to the input value. If the input is not constant, it is more complicated, but estimates are possible. There are various studies relevant to this problem (for example Lazar (2004); see also the work of Sophie Denève, e.g. 2013). But all these solutions require knowing quite precisely how the input has been encoded. Suppose for example that the sensory neuron adapts with some time constant. Then the decoder somehow needs to de-adapt. But to do this correctly, one needs to know the time constant accurately enough, otherwise biases are introduced. If we consider that the encoder itself learns, e.g. by adapting to signal statistics (as in the efficient coding hypothesis), then the properties of the encoder must be considered unknown by the decoder.
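This difficulty can be made concrete with a toy example (my construction, not from any of the cited studies). For a leaky integrate-and-fire neuron with time constant τ, threshold θ, reset to 0, and constant suprathreshold input I, the interspike interval is T = τ·ln(I/(I−θ)). A decoder that knows τ and θ exactly can invert this map; a decoder with a mis-estimated τ produces a biased estimate.

```python
import math

tau = 0.02    # membrane time constant (s)
theta = 1.0   # spike threshold

def isi(I):
    """Interspike interval of a leaky integrate-and-fire neuron with
    constant suprathreshold input I (reset to 0, no refractory period)."""
    return tau * math.log(I / (I - theta))

def decode(T, tau_assumed):
    """Invert the ISI back to the input, given an assumed time constant."""
    return theta / (1.0 - math.exp(-T / tau_assumed))

I_true = 1.5
T = isi(I_true)

print(decode(T, tau_assumed=tau))        # exact model knowledge: recovers 1.5
print(decode(T, tau_assumed=1.5 * tau))  # mis-estimated tau: biased estimate
```

The point is not the particular formula but the structure of the problem: the decoder only works if it carries an accurate internal model of the encoder, which is exactly what cannot be assumed if the encoder adapts or learns.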

Can the decoder learn to decode the sensory spikes? The problem is that it does not have access to the original signal. The key question then is: what could the error criterion be? If the system has no access to the original signal but only to streams of spikes, then how could it evaluate an error? One idea is to make an assumption about some properties of the original signal. One could for example assume that the original signal varies slowly, in contrast with the spike train, which is a highly fluctuating signal. Thus we may look for a slow reconstruction of the signal from the spike train; this is in essence the idea of slow feature analysis. But the original signal might not be slowly fluctuating, as it is influenced by the actions of the controller, so it is not clear that this criterion will work.

Thus it is not so easy to think of a control system which would decode the sensory neuron activity into the original signal so as to compare it to a target value. But beyond this technical issue (how to learn the decoder), there is a more fundamental question: why split the work into two units (encoder/decoder) if the function of the second is essentially to undo the work of the first?

An alternative is to examine the system as a whole. We consider the physical system (environment), the sensory neuron, the actuator, and the interneurons (corresponding to the control system). Instead of seeing the sensory neuron as involved in an act of measurement and communication, and the interneurons as involved in an act of interpretation and command, we see the entire system as a distributed dynamical system with a number of structural parameters. In terms of dynamical systems (rather than control), the question becomes: is the target value for the physical dimension an attractive fixed point of this system (as opposed to mere fluctuations), or more generally, is there such a fixed point? We can then derive complementary questions:

  • robustness: is the fixed point robust to perturbations, for example changes in properties of the sensor, actuator or environment?
  • optimality: are there ways to adjust the structure of the system so that the firing rate is minimized (for example)?
  • control: can we change the fixed point by an intervention on this system? (e.g. on the interneurons)

Thus, the problem becomes one of designing a spiking system that has an attractive fixed point in the physical dimension, with some desirable properties. Framing the problem in this way does not necessarily require that the physical dimension is explicitly extracted (“decoded”) from the activity of the sensory neuron. If we look at such a system, we might not be able to identify in any of the neurons a quantity that corresponds to the physical signal, or to the target value. Rather, physical signal and target value are to be found in the physical environment, and it is a property of the coupled dynamical system (neurons-environment) that the physical signal tends to approach the target value.
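As a toy illustration of this "whole system" view (my construction, with arbitrary parameters): a stretch variable x drifts upward, a sensory unit integrates x and spikes, and each spike triggers a fixed contraction of x. No unit in the loop represents x or a target explicitly; yet the coupled system settles around an operating point x* = drift·θ·τ/kick determined entirely by the structural parameters of the loop.

```python
# Toy coupled system: stretch x drifts upward; a sensory unit integrates x
# and spikes; each spike triggers a fixed contraction of x.
# The "target" is nowhere represented: it emerges from the loop's parameters.
dt = 0.001
drift, tau, theta, kick = 2.0, 0.1, 1.0, 0.05

x, v = 0.0, 0.0
trace = []
for step in range(200_000):
    x += drift * dt          # environment: stretch increases
    v += (x / tau) * dt      # sensory unit integrates the stretch
    if v >= theta:           # a spike...
        v = 0.0
        x -= kick            # ...triggers a fixed contraction
    trace.append(x)

x_star = drift * theta * tau / kick   # predicted emergent operating point
late = trace[len(trace) // 2:]
print(f"predicted x* = {x_star:.2f}, simulated mean = {sum(late)/len(late):.2f}")
```

The fixed point is stable because the spike rate grows with x: when x exceeds x*, contractions outpace the drift, and conversely below it. Robustness, optimality and control can then be probed by perturbing the parameters of this loop, without ever "decoding" x from the spike train.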

What is computational neuroscience? (XXXI) The problem of biological measurement (1)

We tend to think of sensory receptors (photoreceptors, inner hair cells) or sensory neurons (retinal ganglion cells; auditory nerve fibers) as measuring physical dimensions, for example light intensity or acoustical pressure, or some function of it. The analogy is with physical measuring instruments, like a thermometer or a microphone. This confers a representational quality to the activity of neurons, an assumption that is at the core of the neural coding metaphor. I explain at length why that metaphor is misleading in many ways in an essay (Brette (2018), Is coding a relevant metaphor for the brain?). Here I want to examine more specifically the notion of biological measurement and the challenges it poses.

This notion comes about not only in classical representationalist views, where neural activity is seen as symbols that the brain then manipulates (the perception-cognition-action model, also called the sandwich model), but also, less obviously, in alternative views. For example, one alternative is to see the brain not as a computer system (encoding symbols, then manipulating them) but as a control system (see Paul Cisek’s behavior as interaction, William Powers’ perceptual control theory, Tim van Gelder’s dynamical view of cognition). In this view, the activity of neurons does not encode stimuli. In fact there is no stimulus per se, as Dewey pointed out: “the motor response determines the stimulus, just as truly as sensory stimulus determines the movement.”

A simple case is feedback control: the system tries to maintain some input at a target value. To do this, the system must compare the input with an internal value. We could imagine, for example, something like an idealized version of the stretch reflex: when the muscle is stretched, sensory feedback triggers a contraction, and we want to maintain muscle length constant. But this apparently trivial task raises a number of deep questions, as does, more generally, the application of control theory to biological systems. Suppose there is a sensor, a neuron that transduces some physical dimension into spike trains, for example the stretch of a muscle. There is also an actuator, which reacts to a spike with a physical action, for example contracting the muscle with a particular time course. I chose a spike-based description not just because it corresponds to the physiology of the stretch reflex, but also because it illustrates some fundamental issues (which would also exist with graded transduction, but less obviously so).

Now we have a neuron, or a set of neurons, which receive these sensory inputs and send spikes to the actuator. For this discussion, it is not critical that these are actually neurons; we can just consider that there is a system there, and we ask how this system should be designed so as to successfully achieve a control task.

The major issue here is that the control system does not directly deal with the physical dimension. At first sight, this might seem a minor issue: the physical dimension gets transduced, and we could simply define the target value in the transduced dimension (e.g., the current). But the problem is more serious: what the control system deals with is not simply a function of the physical dimension. More accurately, transduction is a nonlinear dynamical system influenced by a physical signal. The physical signal can be constant, for example, while the transduced current decays (adaptation) and the sensory neuron outputs spike trains, i.e., a highly variable signal. This poses a much more serious problem than simple calibration. When the controlled physical value is at the target value, the sensory neuron might be spiking, perhaps not even at a regular rate. The control system should react to that particular kind of signal by not acting, while it should act when the signal deviates from it. But how can the control system identify the target state, or even know in which direction to act?
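A minimal sketch of this difficulty (a made-up model, not a biophysical one): a constant physical signal drives an adapting transduction current, which in turn drives a spiking neuron. The physical signal never changes, yet everything the downstream system sees does:

```python
# Hypothetical adapting sensor: constant signal, decaying current, spikes.
dt = 0.001
tau_a = 0.5       # adaptation time constant (s)
tau_m = 0.02      # membrane time constant (s)
x = 1.0           # the physical signal: perfectly constant
a = 0.0           # adaptation variable
v = 0.0           # membrane potential (threshold 1, reset 0)
currents, spikes = [], []

for step in range(3000):               # 3 seconds
    current = x - a                    # transduced current adapts away
    a += dt * (x - a) / tau_a          # slow adaptation to the signal
    v += dt * (10.0 * current - v) / tau_m
    if v > 1.0:
        v = 0.0
        spikes.append(step * dt)
    currents.append(current)

# By the end, the current has nearly vanished and spiking has stopped,
# although the physical signal is still exactly 1.0.
print(currents[0], currents[-1], len(spikes))
```

A downstream controller reading only the current or the spikes has no fixed relation it can use to recover the physical value.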

Adaptation in neurons is often depicted as an optimization of the information transmitted, in line with the metaphor of the day (coding). But the relevant question is: how does the receiver of this “information” know how the neuron has adapted? Does it have to de-adapt, to somehow be matched to the adaptive process of the encoding neuron? (This problem has to do with the dualistic structure of the neural coding metaphor.)

There are additional layers of difficulty. We have first recognized that transduction is not a simple mapping from a physical dimension to a biological (e.g., electrochemical) dimension, but rather a dynamical system influenced by a physical signal. Now this dynamical system depends on the structure of the sensory neuron. It depends, for example, on the number of ionic channels and their properties, and we know these are highly plastic and indeed quite variable, both across time and across cells. This dynamical system also depends on elements of the body, or let’s say more generally the neuron’s environment. For example, the way acoustical pressure is transduced into current by an inner hair cell depends obviously on the acoustical pressure at the eardrum, but that physical signal depends on the shape of the ear, which filters sounds. Properties of neurons also change over time, through development and aging. Thus, we cannot assume that the dynamical transformation from physical signal to biological signal is fixed. Somehow, the control system has to work despite this huge plasticity and the dynamical nature of the sensors.

Let us pause for a moment and outline a number of differences between physical measurements, as with a thermometer, and biological measurements (or “sensing”):

  • The physical meter is calibrated with respect to an external reference, for example 0°C is when water freezes, while 100°C is when it boils. The biological sensor cannot be calibrated with respect to an external reference.
  • The physical meter produces a fixed value for a stationary signal. The biological sensor produces a dynamical signal in response to a stationary signal. More accurately, the biological sensor is a nonlinear dynamical system influenced by the physical signal.
  • The physical meter is meant to be stable, in that the mapping from physical quantity to measurement is fixed. When it is not, this is considered an error. The biological sensor does not have fixed properties. Changes in properties occur in the normal course of life, from birth to death, and some changes in properties are interpreted as adaptations, not errors.

From these differences, we realize that biological sensors do not provide physical measurements in the usual sense. The next question, then, is: how can a biological system control a physical dimension with biological sensors that do not provide measurements of that dimension?

What is computational neuroscience? (XXX) Is the brain a computer?

It is sometimes stated as an obvious fact that the brain carries out computations. Computational neuroscientists sometimes see themselves as looking for the algorithms of the brain. Is it true that the brain implements algorithms? My point here is not to answer this question, but rather to show that the answer is not self-evident, and that it can only be true (if at all) at a fairly abstract level.

One line of argumentation is that the models of the brain that we find in computational neuroscience (neural network models) are algorithmic in nature, since we simulate them on computers. And wouldn’t it be a sort of vitalistic claim to say that neural networks cannot (in principle) be simulated on a computer?

There is an important confusion in this argument. At a low level, neural networks are modelled biophysically as dynamical systems, in which the temporality corresponds to the actual temporality of the real world (as opposed to the discrete temporality of algorithms). Mathematically, those are typically differential equations, possibly hybrid systems (i.e., coupled by timed pulses), in which time is a continuous variable. Those models can of course be simulated on a computer using discretization schemes. For example, we choose a time step and compute the state of the network at time t+dt from the state at time t. This algorithm, however, implements a simulation of the model; it is not the model that implements the algorithm. The discretization is nowhere to be found in the model. The model itself, being a continuous-time dynamical system, is not algorithmic in nature. It is not described as a discrete sequence of operations; only the simulation of the model is algorithmic, and different algorithms can simulate the same model.
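The distinction can be illustrated with the simplest possible case, a leaky integrator (parameters here are arbitrary). The model is the differential equation dv/dt = (I - v)/tau; forward Euler schemes with different time steps are different algorithms that approximate the same trajectory, and the closed-form solution involves no algorithmic steps at all:

```python
import math

# The model: dv/dt = (I - v) / tau, a continuous-time dynamical system.
tau, I, T = 0.02, 2.0, 0.1   # time constant, drive, duration (arbitrary)

def euler(dt):
    """One possible simulation algorithm: forward Euler with time step dt."""
    v = 0.0
    for _ in range(int(round(T / dt))):
        v += dt * (I - v) / tau
    return v

# The model itself has a closed-form solution, with no algorithmic steps.
exact = I * (1.0 - math.exp(-T / tau))

print(euler(1e-3), euler(1e-5), exact)
```

The two calls to `euler` are different sequences of operations, yet they approximate the same model: the time step belongs to the simulation, not to the model.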

If we put this confusion aside, then the claim that neural networks implement algorithms becomes much less obvious. It means that trajectories of the dynamical system can be mapped to the discrete flow of an algorithm. This requires: 1) identifying states with representations of some variables (for example stimulus properties, or symbols); 2) identifying trajectories from one state to another as specific operations. In addition, for the algorithmic view to be of any use, there should be a sequence of operations, not just one (i.e., describing the output as a function of the input is not an algorithmic description).

A key difficulty in this identification is temporality: the state of the dynamical system changes continuously, so how can this be mapped to discrete operations? A typical approach in neuroscience is to consider not states but properties of trajectories. For example, one would consider the average firing rate of a population of neurons in a given time window, and the rate of another population in another time window. The relation between these two rates in the context of an experiment would define an operation. As stated above, a sequence of such relations should be identified in order to qualify as an algorithm. But this mapping seems possible only within a feedforward flow; coupling poses a greater challenge for an algorithmic description. No known nervous system, however, has a feedforward connectome.

I am not claiming here that the function of the brain (or mind) cannot possibly be described algorithmically. Probably some of it can be. My point is rather that a dynamical system is not generically algorithmic. A control system, for example, is typically not algorithmic (see the detailed example of Tim van Gelder, What might cognition be if not computation?). Thus a neural dynamical system can only be seen as an algorithm at a fairly abstract level, which can probably address only a restricted subset of its function. It could be that control, which also attaches function to dynamical systems, is a more adequate metaphor of brain function than computation. Is the brain a computer? Given the rather narrow application of the algorithmic view, the reasonable answer should be: quite clearly not (maybe part of cognition could be seen as computation, but not brain function generally).

What is computational neuroscience? (XXIX) The free energy principle

The free energy principle is the theory that the brain manipulates a probabilistic generative model of its sensory inputs, which it tries to optimize by either changing the model (learning) or changing the inputs (action) (Friston 2009; Friston 2010). The “free energy” is related to the error between predictions and actual inputs, or “surprise”, which the organism wants to minimize. It has a more precise mathematical formulation, but the conceptual issues I want to discuss here do not depend on it.

Thus, it can be seen as an extension of the Bayesian brain hypothesis that accounts for action in addition to perception. It shares the conceptual problems of the Bayesian brain hypothesis, namely that it focuses on statistical uncertainty, inferring the variables of a model (called “causes”), when the real challenge is to build and manipulate the structure of the model. It also shares issues with the predictive coding concept, namely that there is a conflation between a technical sense of “prediction” (the expectation of a future signal) and a broader sense that is more ecologically relevant (if I do X, then Y will happen). In my view, these are the main issues with the free energy principle. Here I will focus on an additional issue that is specific to the free energy principle.

The specific interest of the free energy principle lies in its formulation of action. It resonates with a very important psychological theory called cognitive dissonance theory. That theory says that you try to avoid dissonance between facts and your system of beliefs, either by changing the beliefs in a small way or by avoiding the facts. When there is a dissonant fact, you generally don’t throw away your entire system of beliefs; rather, you alter the interpretation of the fact (think of political discourse, or indeed scientific discourse). Another strategy is to avoid the dissonant facts: for example, to read newspapers that tend to share your opinions. So there is some support in psychology for the idea that you act so as to minimize surprise.

Thus, the free energy principle acknowledges the circularity of action and perception. However, it is quite difficult to make it account for a large part of behavior. A large part of behavior is directed towards goals; for example, to get food and sex. The theory anticipates this criticism and proposes that goals are ingrained in priors. For example, you expect to have food. So, for your state to match your expectations, you need to seek food. This is the theory’s solution to the so-called “dark room problem” (Friston et al., 2012): if you want to minimize surprise, why not shut off stimulation altogether and go to the closest dark room? Solution: you are not expecting a dark room, so you are not going there in the first place.

Let us consider a concrete example to show that this solution does not work. There are two kinds of stimuli: food, and no food. I have two possible actions: to seek food, or to sit and do nothing. If I do nothing, then with 100% probability, I will see no food. If I seek food, then with, say, 20% probability, I will see food.

Let’s say this is the world in which I live. What does the free energy principle tell us? To minimize surprise, it seems clear that I should sit: I am then certain to see no food. No surprise at all. The proposed solution is that you have a prior expectation of seeing food. So to minimize surprise, you should put yourself into a situation where you might see food, i.e., seek food. This seems to work. However, if there is any learning at all, then you will quickly observe that the probability of seeing food is actually 20%, and your expectations will be adjusted accordingly. I will also observe that between two food expeditions, the probability of seeing food is 0%. Once this has been observed, surprise is minimal when I do not seek food. So I die of hunger. It follows that the free energy principle does not survive Darwinian competition.
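The arithmetic of the example can be made explicit. Taking surprise as negative log-probability, the expected surprise of each policy under the learned contingencies is:

```python
import math

def surprise(p):
    """Surprise (in nats) of an observation that has probability p."""
    return math.log(1.0 / p)

# Learned contingencies of the example:
#   sitting -> no food with probability 1
#   seeking -> food with probability 0.2, no food with probability 0.8
expected_surprise_sit = 1.0 * surprise(1.0)
expected_surprise_seek = 0.2 * surprise(0.2) + 0.8 * surprise(0.8)

print(expected_surprise_sit)    # 0.0
print(expected_surprise_seek)   # about 0.5 nats
```

Once the 20% contingency is learned, seeking food carries about 0.5 nats of expected surprise while sitting carries none, which is exactly why the surprise-minimizing agent starves.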

Thus, either there is no learning at all and the free energy principle is just a way of calling predefined actions “priors”; or there is learning, but then it doesn’t account for goal-directed behavior.

The idea to act so as to minimize surprise resonates with some aspects of psychology, like cognitive dissonance theory, but that does not constitute a complete theory of mind, except possibly of the depressed mind. See for example the experience of flow (as in surfing): you seek a situation that is controllable but sufficiently challenging that it engages your entire attention; in other words, you voluntarily expose yourself to a (moderate amount of) surprise; in any case certainly not a minimum amount of surprise.

Draft of chapter 6, Spike initiation with an initial segment

I have just uploaded an incomplete draft of chapter 6, "Spike initiation with an initial segment". This chapter deals with how spikes are initiated in most vertebrate neurons (and also some invertebrate neurons), where there is a hotspot of excitability close to a large soma. This situation has a number of interesting implications which make spike initiation quite different from the situation investigated by Hodgkin and Huxley, that of stimulating the middle of an axon. Most of the chapter describes the theory that I have developed to analyze this situation, called "resistive coupling theory" because the axonal hotspot is resistively coupled to the soma.

The chapter is currently unfinished, because a few points require a little more research, which we have not finished. The presentation is also a bit more technical than I would like, so this is really a draft. I wanted nonetheless to release it now, as I have not uploaded a chapter for a while and it could be some time before the chapter is finished.

What is computational neuroscience? (XXVIII) The Bayesian brain

Our sensors give us incomplete, noisy, and indirect information about the world. For example, estimating the location of a sound source is difficult because in natural contexts, the sound of interest is corrupted by other sound sources, reflections, etc. Thus it is not possible to know the position of the source with certainty. The ‘Bayesian coding hypothesis’ (Knill & Pouget, 2004) postulates that the brain represents not the most likely position, but the entire probability distribution of the position. It then uses those distributions to do Bayesian inference, for example when combining different sources of information (say, auditory and visual). This would allow the brain to infer the most likely position optimally. There is indeed some evidence for optimal inference in psychophysical experiments – although there is also some contradicting evidence (Rahnev & Denison, 2018).
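The kind of optimal cue combination tested in these psychophysical experiments is easy to state for Gaussian cues: the fused estimate weights each cue by its inverse variance. The numbers below are made up for illustration:

```python
# Hypothetical cues: an auditory estimate of source position (mean 10,
# variance 4) and a more reliable visual estimate (mean 12, variance 1).

def fuse(mu_a, var_a, mu_v, var_v):
    """Posterior mean and variance for two independent Gaussian cues."""
    w_a, w_v = 1.0 / var_a, 1.0 / var_v          # inverse-variance weights
    mu = (w_a * mu_a + w_v * mu_v) / (w_a + w_v)  # weighted mean
    var = 1.0 / (w_a + w_v)                       # fused variance
    return mu, var

mu, var = fuse(10.0, 4.0, 12.0, 1.0)
print(mu, var)   # the fused estimate is pulled toward the more reliable cue
```

The fused variance is smaller than either cue's variance, which is the signature of optimal integration that these experiments look for.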

The idea has some appeal. The problem is that, by framing perception as a statistical inference problem, it focuses on the most trivial type of uncertainty: statistical uncertainty. This is illustrated by the following quote: “The fundamental concept behind the Bayesian approach to perceptual computations is that the information provided by a set of sensory data about the world is represented by a conditional probability density function over the set of unknown variables”. Implicit in this representation is a particular model, for which the variables are defined. Typically, one model describes a particular experimental situation. For example, the model would describe the distribution of auditory cues associated with the position of a sound source. Another situation would be described by a different model: for example, a situation with two sound sources would require a model with two variables. Or if the listening environment is a room whose size might vary, then we would need a model with the dimensions of the room as variables. In any of these cases, where we have identified and fixed the parametric sources of variation, the Bayesian approach works fine, because we are indeed facing a problem of statistical inference. But that framework doesn’t fit real-life situations. In real life, perceptual scenes have variable structure, and that structure corresponds to the model in statistical inference (there is one source, or two sources; we are in a room; the second source comes from the window; etc.). The perceptual problem is therefore not just to infer the parameters of the model (the dimensions of the room, etc.), but also the model itself, its structure. Thus, it is not possible in general to represent an auditory scene by a probability distribution on a set of parameters, because the very notion of a parameter already assumes that the structure of the scene is known and fixed.

Inferring the parameters of a known statistical model is relatively easy. What is really difficult, and still challenging for machine learning algorithms today, is to identify the structure of a perceptual scene: what constitutes an object (object formation), how objects are related to each other (scene analysis). These fundamental perceptual processes do not exist in the Bayesian brain. This touches on two very different types of uncertainty: statistical uncertainty, i.e., variations that can be interpreted and expected within the framework of a model; and epistemic uncertainty, where the model itself is unknown (a difference famously explained by Donald Rumsfeld).

Thus, the “Bayesian brain” idea addresses an interesting problem (statistical inference), but it trivializes the problem of perception, by missing the fact that the real challenge is epistemic uncertainty (building a perceptual model), not statistical uncertainty (tuning the parameters): the world is not noisy, it is complex.