What is computational neuroscience? (XXXIV) Is the brain a computer (2)

In a previous post, I argued that the way the brain works is not algorithmic, and therefore it is not a computer in the common sense of the term. This contradicts a popular view in computational neuroscience that the brain is a kind of computer that implements algorithms. That view comes from formal neural network theory, and the argument goes as follows. Formal neural networks can implement any computable function, which is a function that can be implemented by an algorithm. Thus the brain can implement algorithms for computable functions, and therefore is by definition a computer. There are multiple errors in this reasoning. The most salient error is a semantic drift on the concept of algorithm; the second major error is a confusion about what a computer is.

Algorithms

A computable function is a function that can be implemented by an algorithm. But the converse “if a function is computable, then whatever implements this function runs an algorithm” is not true. To see this, we need to be a bit more specific about what is meant by “algorithm” and “computable function”.

Loosely speaking, an algorithm is simply a set of explicit instructions to solve a problem. A cooking recipe is an algorithm in this sense. For example, to cook pasta: put water in a pan; heat up; when the water boils, add the pasta; wait for 10 minutes. The execution of this algorithm occurs in continuous time in a real environment. But what is algorithmic about this description is the discrete sequential flow of instructions. Water boiling itself is not algorithmic; what is algorithmic is the high-level instruction: “when condition A is true (water boils), then do B (add the pasta)”. Thus, when we speak of algorithms, we must define what is considered an elementary instruction, that is, what lies beneath the algorithmic level (water boils, add pasta).

The textbook definition of algorithm in computer science is: "a sequence of computational steps that transform the input into the output." (Cormen et al., Introduction to algorithms; possibly the most used textbook on the subject). Computability is a way to formalize the notion of algorithm for functions of integers (in particular logical functions). To formalize it, one needs to specify what is considered an elementary instruction. Thus, computability does not formalize the loose notion of algorithm above, i.e., any recipe to calculate something, for otherwise any function would be computable and the concept would be empty (to calculate f(x), apply f to x). A computable function is a function that can be calculated by a Turing machine, or equivalently, which can be generated by a small set of elementary functions on integers (with composition and recursion). Thus, an algorithm in the sense of computability theory is a discrete-time sequence of arithmetic and logical operations (and recursion). Note that this readily extends to any countable alphabet instead of integers, and of course you can replace arithmetic and logical operations with higher-order instructions, as long as they are themselves computable (i.e., a high-level programming language). But it is not any kind of specification of how to solve a problem. For example, there are various algorithms to calculate pi. But we could also calculate pi by drawing a circle, measuring both the diameter and the perimeter, then dividing perimeter by diameter. This is not an algorithm in the sense of computability theory. It could be called an algorithm in the broader sense, but again note that what is algorithmic about it is the discrete structure of the instructions.
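To make the contrast concrete, here is a minimal sketch of one of the many algorithms that calculate pi, in the strict sense of computability theory. The choice of the Leibniz series and the name `leibniz_pi` are mine, for illustration only; any other series would do. What matters is that the calculation is a discrete sequence of elementary arithmetic operations, which is exactly what the circle-measuring procedure lacks.

```python
import math


def leibniz_pi(n_terms: int) -> float:
    """Approximate pi with the Leibniz series: pi/4 = 1 - 1/3 + 1/5 - ..."""
    total = 0.0
    sign = 1.0
    for k in range(n_terms):          # one discrete step per term
        total += sign / (2 * k + 1)   # elementary arithmetic on numbers
        sign = -sign                  # alternate the sign for the next term
    return 4.0 * total


if __name__ == "__main__":
    # The approximation slowly approaches math.pi as n_terms grows.
    print(leibniz_pi(100_000), math.pi)
```

Nothing in this loop refers to circles, rulers or time: only the order of the steps matters, not how long each one takes.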

Thus, a device could calculate a computable function using an algorithm in the strict sense of computability, or in the broader sense (cooking recipe), or in a non-algorithmic way (i.e., without any discrete structure of instructions). In any case, what the brain or any device manages to do bears no relation to how it does it.

As pointed out above, what is algorithmic about a description of how something works is the discrete structure (first do A; if B is true, then do C, etc). If we removed this condition, then we would be left with the more general concept of model, not algorithm: a description of how something works. Thus, if we want to say anything specific by claiming that the brain implements algorithms, then we must insist on the discrete-time structure (steps). Otherwise, we are just saying that the brain has a model.

Now that we have more precisely defined what an algorithm is, let us examine whether the brain might implement algorithms. Clearly, it does not literally implement algorithms in the narrow sense of computability theory, i.e., with elementary operations on integers and recursion. But could it be that it implements algorithms in the broader sense? To get some perspective, consider the following two physical systems:

(A) are dominoes, (B) is a tent (illustration taken from my essay “Is coding a relevant metaphor for the brain?”). Both are physical systems that interact with an environment, in particular which can be perturbed by mechanical stimuli. The response of dominoes to mechanical stimuli might be likened to an algorithm, but that of the tent cannot. The fact that we can describe unambiguously (with physics) how the tent reacts to mechanical stimuli does not make the dynamics of the tent algorithmic, and the same is true of the brain. Formal neural networks (e.g. perceptrons or deep learning networks) are algorithmic, but the brain is a priori more like the tent: a set of coupled neurons that interact in continuous time, together and with the environment, with no evident discrete structure similar to an algorithm. As argued above, a specification of how these real neural networks work and solve problems is not an algorithm: it’s a model – unless we manage to map the brain’s dynamics to the discrete flow of an algorithm.

Computers

Thus, if a computer is something that solves problems by running algorithms, then the brain is not a computer. We may however consider a broader definition: the computer is something that computes, i.e., which is able to calculate computable functions. As pointed out above, this does not require the computer to run algorithms. For example, consider a box with some gas, a heater (input = temperature T) and a pressure sensor (output = P). The device computes the function P = nRT/V by virtue of physical laws, and not by an algorithm.

This box, however, is not a computer. Otherwise, any physical system would be called a computer. To be called a computer, the device should be able to implement any computable function. But what does it mean exactly? To run an arbitrary computable function, some parameters of the device need to be appropriately adjusted. Who adjusts these parameters and how? If we do not specify how this adjustment is being made, then the claim that the brain is a computer is essentially empty. It just says that for each function, there is a way to arrange the structure of the brain so that this function is achieved. It is essentially equivalent to the claim that atoms can calculate any computable function, depending on how we arrange them.

To call such a device a computer, we must additionally include a mechanism to adjust the parameters so that it does actually perform a particular computable function. This leads us to the conventional definition of a computer: something that can be instructed via computer programming. The notion of program is central to the definition of computers, whatever form this program takes. A crucial implication is that a computer is a device that is dependent on an external operator for its function. The external operator brings the software to the computer; without the ability to receive software, the device is not a computer.

In this sense, the brain cannot be a computer. We may then consider the following metaphorical extension: the brain is a self-programmed computer. But the circularity in this assertion is problematic. If the program is a result of the program itself, then the “computer” cannot actually implement any computable function, but only those that result from its autonomous functioning. A cat, a mouse, an ant and a human do not actually do the same things, and cannot even in principle do the same tasks.

Finally, is computability theory the right framework to describe the activity of the brain in the first place? It is certainly not the right framework to describe the interaction of a tent with its environment, so why would it be appropriate for the brain, an embodied dynamical system in circular relation with the environment? Computability theory is a theory about functions. But a dynamical system is not a function. You can of course define functions on dynamical systems, even though they do not fully characterize the system. For example, you can define the function that maps the current state to the state at some future time. In the case of the brain, we might want to define a function that maps an external perturbation of the system (i.e. a stimulus) to the state of the system at some future time. However, this is not well defined, because it depends on the state of the system at the time of the perturbation. This problem does not occur with formal neural networks precisely because these are not dynamical systems but mappings. The brain is spontaneously active, whether there is a “stimulus” or not. The very notion of the organism as something that responds to stimuli is the most naïve version of behaviorism. The organism has an endogenous activity and a circular relation to its environment. Consider for example central pattern generators: these are rhythmic patterns produced in the absence of any input. Not all dynamical systems can be framed into computability theory, and in fact most of them, including the brain, cannot because they are not mappings.
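The point that "stimulus → future state" is not a well-defined function can be illustrated with a toy system. The sketch below is my own illustration, not a model of any neural circuit: it assumes an undamped oscillator with unit frequency (x'' = -x), and the "stimulus" is an instantaneous kick to the velocity. The function name and parameters are hypothetical.

```python
import math


def state_after_kick(x0: float, v0: float, kick: float, t: float):
    """State (x, v) at time t of the oscillator x'' = -x, started in
    state (x0, v0) and given an instantaneous velocity kick at time 0."""
    v0 = v0 + kick                              # the "stimulus"
    x = x0 * math.cos(t) + v0 * math.sin(t)     # exact solution of x'' = -x
    v = -x0 * math.sin(t) + v0 * math.cos(t)
    return x, v


if __name__ == "__main__":
    t = math.pi / 2
    # Same stimulus (kick = 1.0), two different states at the time of the kick:
    print(state_after_kick(0.0, -1.0, 1.0, t))  # the kick cancels the motion
    print(state_after_kick(1.0, 0.0, 1.0, t))   # the kick adds to the motion
    # No stimulus at all: the system is still active.
    print(state_after_kick(1.0, 0.0, 0.0, t))
```

The same kick leaves the first system at rest and the second one moving, so the outcome is a function of the pair (state, stimulus), not of the stimulus alone. And with no kick at all, the state still changes.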

Conclusion

As I have argued in my essay on neural coding, there are two core problems with the computer metaphor of the brain (it should be clear by now that this is a metaphor and not a property). One is that it tries to match two causal structures that are totally incongruent, just like dominoes and a tent. The other is that the computer metaphor, just as the coding metaphor, implicitly assumes an external operator – who programs it / interprets the code. Thus, what these two metaphors fundamentally miss is the epistemic autonomy of the organism.

Is the coding metaphor relevant for the genome?

I have argued that the neural coding metaphor is highly misleading (see also similar arguments by Mark Bickhard in cognitive science). The coding metaphor is very popular in neuroscience, but there is another domain of science where it is also very popular: genetics. Is there a genetic code? Many scientists have criticized the idea of a genetic code (and of a genetic program). A detailed criticism can be found in Denis Noble’s book “The music of life” (see also Noble 2011 for a short review).

Many of the arguments I have made in my essay on neural coding readily apply to the “genetic code”. Let us start with the technical use of the metaphor. The genome is a sequence of DNA base triplets called “codons” (ACG, TGA, etc). Each codon specifies a particular amino-acid, and proteins are made of amino-acids. So there is a correspondence between DNA and amino-acids. This seems an appropriate use of the term “code”. But even in this limited sense, it should be used with caution. The fact that a base triplet encodes an amino-acid is conditional on this triplet being effectively translated into an amino-acid (note that there are two stages, transcription into RNA, then translation into a protein). But in fact only a small fraction of a genome is actually translated, about 10% (depending on species); the rest is called “non-coding DNA”. So the same triplets can result in the production of an amino-acid, or they can influence the transcription-translation system in various ways, for example by interacting with various molecules involved in the production of RNA and proteins, thereby regulating transcription and translation (and this is just one example).
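In its narrow technical sense, the "code" is just a lookup table from triplets to amino-acids. The sketch below shows this with a deliberately partial table; the function name `translate` and the example sequence are hypothetical, and only a handful of entries from the standard table are included. Note that in the standard table some triplets, such as TGA, do not specify an amino-acid but signal the end of translation.

```python
# Partial codon table (DNA coding-strand triplets), for illustration only.
CODON_TABLE = {
    "ATG": "Met",
    "ACG": "Thr",
    "GCT": "Ala",
    "AAA": "Lys",
    "TGG": "Trp",
    "TGA": "Stop",
}


def translate(dna: str) -> list:
    """Read a DNA string three bases at a time and look up each triplet.
    Raises KeyError for triplets missing from this partial table."""
    residues = []
    for i in range(0, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        residue = CODON_TABLE[codon]
        if residue == "Stop":      # stop codon: translation ends here
            break
        residues.append(residue)
    return residues


if __name__ == "__main__":
    print(translate("ATGACGGCTTGA"))
```

The lookup is context-free by construction: it returns the same answer whether or not the triplet is ever transcribed or translated in a cell, which is precisely the condition on which the correspondence depends.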

Even when DNA does encode amino-acids, it does not follow that a gene encodes a protein. What might be said is that a gene encodes the primary structure of proteins, that is, the sequence of amino-acids; but it does not specify by itself the shape that the protein will take (which determines its chemical properties), the various modifications that occur after translation, the position that the protein will take in the cellular system. All of those crucial properties depend on the interaction of the product of transcription with the cellular system. In fact, even the primary structure of proteins is not fully determined by the gene, because of splicing.

Thus, the genome is not just a book, as suggested by the coding metaphor (some have called the genome the “book of life”); it is a chemically active substance that interacts with its chemical environment, a part of a larger cellular system.

At the other end of the genetic code metaphor, genes encode phenotypes, traits of the organism. For example, the gene for blue eyes. A concept that often appears in the media is the idea of genes responsible for diseases. One hope behind the human genome project was that by scrutinizing the human genome, we might be able to identify the genes responsible for every disease (at least for every genetic disease). Some diseases are monogenic, i.e., due to a single gene defect, but the most common diseases are polygenic, i.e., are due to a combination of genetic factors (and generally environmental factors).

But even the idea of monogenic traits is misleading. There is no single gene that encodes a given trait. What has been demonstrated in some cases is that mutations in a single gene can impact a given trait. But this does not mean that the gene is responsible by itself for that trait (surprisingly, this fallacy is quite common in the scientific literature, as pointed out by Yoshihara & Yoshihara 2018). A gene by itself does nothing. It needs to be embedded into a system, namely a cell, in order to produce any phenotype. Consequently, the expressed phenotype depends on the system in which the gene is embedded, in particular the rest of the genome. There cannot be a gene for blue eyes if there are no eyes. So no gene can encode the color of eyes; this encoding is at best contextual (in the same way as “neural codes” are always contextual, as discussed in my neural coding essay).

So the concept of a “genetic code” can only be correct in a trivial sense: that the genome, as a whole, specifies the organism. This clearly limits the usefulness of the concept. Unfortunately, even this trivial claim is incorrect. An obvious objection is that the genome specifies the organism only in conjunction with the environment. The deeper objection is that the immediate environment of the genome is the cell itself. No entity smaller than the cell can live or reproduce. The genome is not a viable system, and as such it cannot produce an organism, nor can it reproduce. An interesting experiment is the following: the nucleus (and thus the DNA) from an animal cell is transferred to the egg of an animal of another species (where the nucleus has been removed) (Sun et al., 2005). The “genetic code” theory would predict that the egg would develop into an animal of the donor species. What actually happens (this was done in related fish species) is that the egg develops into some kind of hybrid, with the development process closer to that of the recipient species. Thus, even in the most trivial sense, the genome does not encode the organism. Finally, since no entity smaller than the cell can reproduce, it follows that the genome is not the unique basis of heritability: the entire cell is (see Fields & Levin, 2018).

In summary, the genome does not encode much except for amino-acids (for about 10% of it). It should be conceptualized as a component that interacts with the cellular system, not as a “book” that would be read by some cellular machinery.

What is computational neuroscience? (XXXIII) The interactivist model of cognition

The interactivist model of cognition has been developed by Mark Bickhard over the last 40 years or so. It is related to the viewpoints of Gibson and O’Regan, among others. The model is described in a book (Bickhard and Terveen, 1996) and a more recent review (Bickhard 2008).

It starts with a criticism of what Bickhard calls “encodingism”, the idea that mental representations are constituted by encodings, correspondences between things in the world and symbols (this is very similar to my criticism of the neural coding metaphor, except Bickhard’s angle is cognitive science while mine was neuroscience). The basic argument is that the encoding “crosses the boundary of the epistemic agent”: the perceptual system stands on only one side of the correspondence, so there is no way it can interpret symbols in terms of things in the world since it never has access to things in the world at any point. The interpretation of the symbols in terms of things in the world would require an interpreter, some entity that makes sense of a priori arbitrary symbols. But this was precisely the epistemic problem to be solved, so the interpreter is a homunculus and this is an incoherent view. This is related to the skeptic argument about knowledge: there cannot be valid knowledge since we acquire knowledge by our senses and we cannot step outside of ourselves to check that it is valid. Encodingism fails the skeptic objection. Note that Bickhard refutes neither the possibility of representations nor even the possibility of encodings, but rather the idea that encodings can be foundational for representations. There can be derivative encodings, based on existing representations (for example Morse is a derivative encoding, which presupposes that we know about both letters and dots and dashes).

A key feature that a representational system must have is what Bickhard calls “system-detectable errors”. A representational system must be able to test whether its representations are correct or not. This is not possible in encodingism because the system does not have access to what is being represented (knowledge that cannot be checked is what I called “metaphysical knowledge” in my Subjective physics paper). No learning is possible if there are no system-detectable errors. This is the problem of normativity.

The interactivist model proposes the following solution: representations are anticipations of potential interactions and their expected impact on future states of the system, or on the future course of processes of the system (this is close to Gibson’s “affordances”). Here is an example taken from Subjective physics. Consider a sound source located somewhere in space. What does it mean to know where the sound came from? In the encoding view, we would say that the system has a mapping between the angle of the source and properties of the sounds, and so it infers the source’s angle from the captured sounds. But what can this mean? Is the inferred angle in radians or degrees? Surely radians and degrees cannot make sense for the perceiver and cannot have been learned (this is what I called “metaphysical knowledge”), so in fact the representation cannot actually be in the form of the physical angle of the source. Rather, what it means for the source to be at a given position is that (for example) you would expect that moving your eyes in a particular way would make the source appear in your fovea (see more detail about the Euclidean structure of space and related topics in Subjective physics). Thus, the notion of space is a representation of the expected consequences of certain types of actions.

The interactivist model of representations has the desirable property that it has system-detectable errors: a representation can be correct or not, depending on whether the anticipation turns out to be correct or not. Importantly, what is anticipated is internal states, and therefore the representation does not cross the boundary of the epistemic agent. Contrary to standard models of representation, the interactivist model successfully addresses the skeptic argument.

The interactivist model is described at a rather abstract level, often referring to abstract machine theory (states of automata). Thus, it leaves aside the problem of its naturalization: how is it instantiated by the brain? Important questions to address are: what is a ‘state’ of the brain? (in particular given that the brain is a continuously active dynamical system where no “end state” can be identified); how do we cope with its distributed nature, that is, that the epistemic agent is itself constituted of a web of interacting elementary epistemic agents? how are representations built and instantiated?

Better than the grant lottery

Funding rates for most research grant systems are currently very low, typically around 10%. This means that 90% of the time spent on writing and evaluating grant applications is wasted. It means that if each grant spans 5 years, then a PI has to write about 2 grants per year to be continuously funded; in practice, to reduce risk it should be more than 2 per year. It is an enormous waste, and in addition to that, it is accepted that below a certain funding rate, grant selection is essentially random (Fang et al., 2016). Such competition also introduces conservative biases (since only those applications that are consensual can make it to the top 10%), for example against interdisciplinary studies. Thus, low funding rates are a problem not only because of waste but also because they introduce distortions.
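The arithmetic behind "about 2 grants per year" can be spelled out in a few lines. This is a back-of-the-envelope sketch under simplifying assumptions of mine: the PI needs exactly one grant at any time, and every application succeeds independently with the same probability. The function name is hypothetical.

```python
def applications_per_year(success_rate: float, grant_duration_years: float) -> float:
    """Average number of applications a PI must write per year to hold,
    on average, one grant at all times."""
    grants_needed_per_year = 1.0 / grant_duration_years  # one new grant every N years
    return grants_needed_per_year / success_rate         # each attempt succeeds with this rate


if __name__ == "__main__":
    rate = 0.10  # funding rate quoted in the text
    print(applications_per_year(rate, 5))  # applications per year for 5-year grants
    print(1.0 - rate)                      # fraction of applications written in vain
```

This is only an average: to keep the risk of a funding gap low, a PI would have to apply more often than this, as noted above.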

For these reasons, a number of scientists have proposed to introduce a lottery system (Fang 2016; see also Mark Humphries’ post): after a first selection of, say, the top 20-30%, the winners are picked at random. This would reduce bias without impacting quality. Thus, it would certainly be progress. However, it does not address the problem of waste: 90% of applications would still be written in vain.

First, there is a very elementary enhancement to be implemented: pick at random before you evaluate the grants, i.e., directly reject every other grant, then select the best 20%. This gives exactly the same result, except the cost of evaluation is divided by two.

Now I am sure it would feel quite frustrating for an applicant to write a full grant only to get immediately rejected by the flip of a coin. So there is again a very simple enhancement: decide who will get rejected before they write the application. Pick at random 50% of scientists and invite them to submit a grant. Again, the result is the same, but in addition you reduce the time spent on grant writing by two.

At this point we might wonder why do this initial selection at random? This introduces variance for no good reason. You never know in advance whether you will be allowed to get funding next year and this seems arbitrary. Thus, there is an obvious enhancement: replace lottery by rotation. Every PI is allowed to submit a grant only every two years. Again, this is equivalent on average to the initial lottery system, except there is less variance and less waste.

This reasoning leads me to a more general point. There is a simple way to increase the success rate of a grant system, which is to reduce the number of applications. The average funding rate of labs does not depend on the number of applications; it depends on the budget and only on the budget. If you bar 50% of scientists from applying, then you don’t divide by two the average budget of every lab. The average budget allocated to each lab is the same, but the success rate is doubled.
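Here is a small numerical sketch of this point. The numbers are made up and the model is deliberately crude: a fixed budget, grants of equal size, and one application per lab that is allowed to apply. The function and parameter names are hypothetical.

```python
def grant_system(total_budget: float, grant_size: float,
                 n_labs: int, applicant_fraction: float):
    """Return (success rate, average budget per lab) for one funding round."""
    n_funded = total_budget / grant_size           # set by the budget alone
    n_applications = n_labs * applicant_fraction   # set by who is allowed to apply
    success_rate = n_funded / n_applications
    avg_budget_per_lab = total_budget / n_labs     # independent of applicant_fraction
    return success_rate, avg_budget_per_lab


if __name__ == "__main__":
    # Hypothetical system: 1000 labs, budget for 100 grants of size 1.
    print(grant_system(100.0, 1.0, 1000, 1.0))  # everyone applies
    print(grant_system(100.0, 1.0, 1000, 0.5))  # half of the labs apply (rotation)
```

Halving the number of applicants doubles the success rate, while the average budget per lab, averaged over all labs and over rotation cycles, does not move.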

The counter-intuitive part is that individually, you increase your personal success rate if you apply to more calls. But collectively it is exactly the opposite: the global success rate decreases if there are more calls (for the same overall budget), since there are more applications. This is because the success rate is low because of other people submitting, not because you are submitting. This is a tragedy of the commons.

There is a simple way to solve it, which is to add constraints. There are different ways to do it: 1) reduce the frequency of calls, and merge redundant calls, 2) introduce a rotation (e.g. those born on even years submit on even years), 3) do not allow submission if you are already funded (or say, in the first years). Any of these constraints mechanically increases the success rate, thus reduces both waste and bias, with no impact on average funding. It is better than a lottery.

 

p.s.: There is also an obvious and efficient way to reduce the problem, which is to increase base funding, so that scientists do not need grants in order to survive (see this and other ideas in a previous post).

Predatory journals: what is the problem?

A recent article in Le Monde warns about a growing phenomenon in scientific publishing: predatory journals (see also the editorial). These are commercial publishers that publish scientific articles online, for a fee, with no scientific ethics whatsoever, in particular accepting all articles without peer review. Similarly, fake conferences are multiplying; companies organize scientific conferences for purely commercial purposes, with no concern for scientific quality.

In response, some institutions are starting to compile lists of journals to avoid. This is understandable, since the phenomenon has a significant cost. But this response neglects the fundamental problem. We must face the facts: commercial ethics (the pursuit of profit) is not compatible with scientific ethics (the pursuit of truth). The companies in question are not illegal, as far as I know. They organize conferences that are real; they publish journals that are real. They simply do not care about scientific quality, only about their profit. This is considered immoral; but a commercial company has no moral dimension, it is simply an organization whose goal is to generate profit. One cannot expect commercial interests to coincide exactly, as if by magic, with scientific interests.

  1. The problem with commercial publishing

This is true at both ends of the academic publishing spectrum: for predatory journals as well as for prestigious journals. The article speaks of “fake science”; but most cases of scientific fraud have been uncovered in prestigious journals, not in predatory journals, which in any case are not read by the scientific community (see for example Brembs (2018) on the link between methodological quality and journal prestige). For prestigious commercial journals, the publishers’ commercial strategy is not to maximize the number of published articles, but to maximize the perceived prestige of these journals, which then serve as bait to sell the publisher’s journal bundles. In other words, it is a brand strategy. This involves, in particular, a drastic selection of submitted articles, carried out by professional editors, that is, not by professional scientists, on the basis of the perceived importance of the results, thereby pushing a generation of scientists to inflate the claims of their articles. It also involves promoting dubious metrics such as the impact factor to public institutions, and more generally promoting a mythology of prestigious publication, namely the false and dangerous idea that an article should be judged by the prestige of the journal in which it is published rather than by its intrinsic scientific value, which is assessed by the scientific community, not by a commercial publisher, nor even by two anonymous scientists. By proposing to publish lists of bad journals, we do not solve the problem, because we implicitly subscribe to this perverse logic.

One only needs to look at the profit margins of the large multinational scientific publishers to understand that the commercial model is not appropriate. For Elsevier, for example, margins are on the order of 40%. Simply reading this figure should immediately convince us that scientific publishing should be run by public institutions, or at least non-commercial ones (for example learned societies, as is the case for a number of journals). What is the justification for using a commercial operator to run a public service, or any service? The motivation is that competition lowers costs and improves quality. But if margins are 40%, then competition is evidently not operating. Why? Simply because when scientists submit an article, they do not choose the journal on the basis of price, nor even of the service provided (which is in reality essentially provided by volunteer scientists), but on the basis of the journal’s visibility and prestige. There is therefore no competition on prices. The worst that could happen to a commercial publisher is for scientific articles to be judged on their intrinsic value rather than by the journal in which they are published, because then this unique business model would collapse and journals would compete on prices and on the services they must provide, like any other commercial company. This is the worst that could happen to commercial publishers, and the best that could happen to the scientific community. This is why commercial and scientific interests diverge.

In any case, we must face the facts: such enormous margins mean that the commercial model is inefficient. We should therefore immediately stop using commercial journals. This is not very difficult: public institutions are perfectly capable of running scientific journals; such journals exist and have existed for a long time. A recent example is eLife, one of the most innovative journals in biology today. This should not be very surprising: the core of a journal’s activity, namely the reviewing of articles, is already done by scientists, including at commercial publishers, which use their services for free. This does not mean that private companies cannot be used to provide services, for example hosting servers, managing websites, or providing infrastructure. But journals must no longer belong to commercial companies, whose interest is to manage these journals as brands. Scientific ethics is not compatible with commercial ethics.

How should we proceed? In fact it is fairly obvious. Public authorities should cancel all subscriptions to commercial publishers and stop paying publication fees to these publishers. Nowadays, it is not difficult to access the scientific literature without going through journals (through preprints, or simply by writing to the authors, who are generally delighted that someone is interested in their work). The money saved can be partly reinvested in non-commercial scientific publishing.

  2. The myth of peer review

I now want to turn to a more subtle but fundamental question of epistemology. What, fundamentally, is the problem with predatory journals? Clearly, there is the waste of public money. But the Le Monde article also points to scientific problems, namely the fact that false information is propagated without having been checked. The editorial indeed speaks of ‘the sacrosanct “peer review”’, which these journals do not perform. But is this really the fundamental problem?

The idea that what gives a scientific article its value is that it has been validated by peer review before publication is a persistent myth, but a mistaken one nonetheless. It is false from an empirical point of view and from a theoretical point of view.

From an empirical point of view, at any given time the literature contains contradictory conclusions on a large number of subjects, published in traditional journals. The recent cases of fraud concern articles that had nonetheless undergone peer review. But the same is true of a much larger number of non-fraudulent articles whose conclusions were later contested. The history of science is full of contradictory and coexisting scientific theories, and of bitter debates between scientists. These debates take place, precisely, after publication, and scientific consensus generally forms rather slowly, practically never on the basis of a single article (see for example Imre Lakatos in philosophy of science, or Thomas Kuhn). Moreover, scientific results are also often circulated in the scientific community before formal publication; this is the case today with preprints posted online, but it was already partly the case before with conferences. The published article remains the reference because it provides precise details, in particular methodological ones, but the contribution of the reviewers solicited by journals is in most cases not essential, all the more so as it is generally not made public.

From a theoretical point of view, there is no reason why peer review should “validate” a scientific result. There is nothing magical about peer review: two, sometimes three scientists simply give their informed opinion on the manuscript. These scientists are no more expert than those who will read the article once it is published (I am of course talking about the scientific community, not the general public). The fact that an article is published in a journal does not in itself say much about how the results are received by the community; when an article is rejected by one journal, it is resubmitted elsewhere. Final publication in no way attests to a scientific consensus. Moreover, for empirical studies, reviewers are not actually in a position to verify the results, and in particular to check that there has been no fraud. All they can do is check that the methods used seem appropriate and that the interpretations seem sensible (two points that are often open to debate). To validate the results (but not the interpretations), one would at the very least need to be able to redo the experiments in question, which requires the necessary time and equipment. This indispensable work is done (or attempted), but it is not done at the time of publication, nor is it commissioned by the journal. It is done after publication by the scientific community. The work of “verification” (an inappropriate word, since there is no absolute truth in science, which is precisely what distinguishes it from religion) is the long-term work of the scientific community, not the one-off work of the journal.

This is the received idea that must be deconstructed: that the journal's internal review somehow “validates” scientific results. This is not the case, it has never been the case, and it cannot be the case. Scientific validation is the very nature of the scientific enterprise, which is a collective and long-term effort. One cannot read an article and conclude “it is true”; to do so, one must integrate it into a body of scientific knowledge and confront the interpretation with different points of view (since any interpretation requires a theoretical framework).

It is precisely this received idea that prestigious journals, on the contrary, try to reinforce. We must resist it. The antidote is to make scientific debate public and transparent, whereas it currently often remains confined to the corridors of laboratories and conferences. It is claimed that peer review validates scientific results, but most of the time these reports are not published; and what about the unpublished reports when an article is rejected by a journal? How, then, can we know what the community thinks of it? Scientific debate must instead be made public. This is, for example, the ambition of sites such as PubPeer, which has uncovered a number of frauds but can also be used simply for scientific debate in general. Rather than making publication conditional on a confidential agreement between anonymous scientists, we should invert this system: publish the article (which is in fact already done through preprints), then solicit the opinions of the community, which will also be published, argued, and discussed by the authors and the rest of the community. This is how scientists, and the wider public as well, will be able to get a more accurate view of the scientific value of published articles. Peer review is a fundamental principle of science, yes, but not the review carried out confidentially by journals; rather, the review carried out in the open and without time limit by the scientific community.

What is computational neuroscience? (XXXII) The problem of biological measurement (2)

In the previous post, I pointed out differences between biological sensing and physical measurement. A direct consequence is that it is not so straightforward to apply the framework of control theory to biological systems. At the level of behavior, it seems clear that animal behavior involves control; this is well documented in the case of motor control. But this is the perspective of an external observer: the target value, the actual value and the error criterion are identified with physical measurements by an external observer. But how does the organism achieve this control, from its own perspective?

What the organism does not do, at least not directly, is measure the physical dimension and compare it to a target value. Rather, the biological system is influenced by the physical signal and reacts in a way that makes the physical dimension closer to a target value. How? I do not have a definite answer to this question, but I will explore a few possibilities.

Let us first explore a conventional possibility. The sensory neuron encodes the sensory input (e.g. muscle stretch) in some way; the control system decodes it, and then compares it to a target value. So for example, let us say that the sensory neuron is an integrate-and-fire neuron. If the input is constant, then the interspike interval can be mapped back to the input value. If the input is not constant, it is more complicated but estimates are possible. There are various studies relevant to this problem (for example Lazar (2004); see also the work of Sophie Denève, e.g. 2013). But all these solutions require knowing quite precisely how the input has been encoded. Suppose for example that the sensory neuron adapts with some time constant. Then the decoder needs somehow to de-adapt. But to do it correctly, one needs to know the time constant accurately enough, otherwise biases are introduced. If we consider that the encoder itself learns, e.g. by adapting to signal statistics (as in the efficient coding hypothesis), then the properties of the encoder must be considered unknown by the decoder.
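To make the decoding problem concrete, here is a minimal sketch in Python. It rests on assumptions of mine that are not part of the argument above: the sensory neuron is a leaky integrate-and-fire neuron in normalized units (tau dv/dt = -v + I, threshold 1, reset 0), the input I is constant, and the function names and parameter values are purely illustrative. In that case the interspike interval is T = tau ln(I/(I-1)), so the decoder can invert it, but only if it uses the right tau. The mismatch here is in the membrane time constant, as a stand-in for the adaptation time constant discussed above.

```python
import math

def lif_isi(I, tau=0.010, dt=1e-6):
    """Interspike interval of tau*dv/dt = -v + I (threshold 1, reset 0).

    Forward Euler simulation; requires I > 1, otherwise the neuron never spikes.
    """
    if I <= 1.0:
        raise ValueError("constant input must exceed threshold (I > 1)")
    v, t = 0.0, 0.0
    while v < 1.0:
        v += dt * (-v + I) / tau
        t += dt
    return t

def decode_input(isi, tau):
    """Invert T = tau*ln(I/(I-1)), using the tau the decoder *assumes*."""
    return 1.0 / (1.0 - math.exp(-isi / tau))

I_true = 1.5
isi = lif_isi(I_true, tau=0.010)

I_matched = decode_input(isi, tau=0.010)     # decoder knows the encoder
I_mismatched = decode_input(isi, tau=0.012)  # decoder is 20% off on tau

print(f"true input: {I_true}")
print(f"decoded with correct tau: {I_matched:.3f}")
print(f"decoded with wrong tau:   {I_mismatched:.3f}")
```

With the correct time constant the input is recovered almost exactly; with a 20% error on the time constant the estimate is biased, even though the spike train itself is noiseless.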

Can the decoder learn to decode the sensory spikes? The problem is it does not have access to the original signal. The key question then is: what could the error criterion be? If the system has no access to the original signal but only streams of spikes, then how could it evaluate an error? One idea is to make an assumption about some properties of the original signal. One could for example assume that the original signal varies slowly, in contrast with the spike train, which is a highly fluctuating signal. Thus we may look for a slow reconstruction of the signal from the spike train; this is in essence the idea of slow feature analysis. But the original signal might not be slowly fluctuating, as it is influenced by the actions of the controller, so it is not clear that this criterion will work.

Thus it is not so easy to think of a control system which would decode the sensory neuron activity into the original signal so as to compare it to a target value. But beyond this technical issue (how to learn the decoder), there is a more fundamental question: why split the work into two units (encoder/decoder), if the function of the second one is essentially to undo the work of the first one?

An alternative is to examine the system as a whole. We consider the physical system (environment), the sensory neuron, the actuator, and the interneurons (corresponding to the control system). Instead of seeing the sensory neuron as involved in an act of measurement and communication and the interneurons as involved in an act of interpretation and command, we see the entire system as a distributed dynamical system with a number of structural parameters. In terms of dynamical systems (rather than control), the question becomes: is the target value for the physical dimension an attractive fixed point of this system, or more generally, is there such a fixed point (as opposed to fluctuations)? We can then derive complementary questions:

  • robustness: is the fixed point robust to perturbations, for example changes in properties of the sensor, actuator or environment?
  • optimality: are there ways to adjust the structure of the system so that the firing rate is minimized (for example)?
  • control: can we change the fixed point by an intervention on this system? (e.g. on the interneurons)

Thus, the problem becomes one of designing a spiking system that has an attractive fixed point in the physical dimension, with some desirable properties. Framing the problem in this way does not necessarily require that the physical dimension is explicitly extracted (“decoded”) from the activity of the sensory neuron. If we look at such a system, we might not be able to identify in any of the neurons a quantity that corresponds to the physical signal, or to the target value. Rather, physical signal and target value are to be found in the physical environment, and it is a property of the coupled dynamical system (neurons-environment) that the physical signal tends to approach the target value.
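As an illustration of this framing, here is a toy closed loop in Python. Everything in it is an assumption made for the sketch, not a model of the stretch reflex: the environment pushes a physical variable x upward at a constant rate, a leaky integrate-and-fire sensory neuron is driven by gain*x, and each spike makes the actuator pull x back by a fixed amount. No neuron decodes x or stores a target value; we simply ask where x settles on average.

```python
def simulate_loop(x0, gain=1.0, drift=1.0, kick=0.01, tau=0.010,
                  dt=5e-5, duration=8.0):
    """Simulate the coupled neuron-environment system (hypothetical parameters).

    Returns the mean of the physical variable x over the last second.
    """
    x, v = x0, 0.0
    n_steps = int(round(duration / dt))
    n_tail = int(round(1.0 / dt))
    tail_sum = 0.0
    for step in range(n_steps):
        x += drift * dt                   # environment pushes x upward
        v += dt * (-v + gain * x) / tau   # sensory neuron is driven by x
        if v >= 1.0:                      # spike
            v = 0.0
            x -= kick                     # actuator pulls x back
        if step >= n_steps - n_tail:
            tail_sum += x
    return tail_sum / n_tail

x_from_below = simulate_loop(x0=0.5)
x_from_above = simulate_loop(x0=3.0)
x_high_gain = simulate_loop(x0=0.5, gain=2.0)

print(f"start at 0.5: x settles near {x_from_below:.2f}")
print(f"start at 3.0: x settles near {x_from_above:.2f}")
print(f"sensor gain doubled: x settles near {x_high_gain:.2f}")
```

In this sketch x ends up hovering around the same value from both initial conditions, so the coupled system has an attractive operating point (with a small sawtooth around it). Doubling the sensor gain moves that point, which is the robustness question in the list above: nothing in this particular design keeps the fixed point in place when the sensor changes.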

What is computational neuroscience? (XXXI) The problem of biological measurement (1)

We tend to think of sensory receptors (photoreceptors, inner hair cells) or sensory neurons (retinal ganglion cells; auditory nerve fibers) as measuring physical dimensions, for example light intensity or acoustical pressure, or some function of them. The analogy is with physical measuring instruments, like a thermometer or a microphone. This confers a representational quality to the activity of neurons, an assumption that is at the core of the neural coding metaphor. I explain at length why that metaphor is misleading in many ways in an essay (Brette (2018) Is coding a relevant metaphor for the brain?). Here I want to examine more specifically the notion of biological measurement and the challenges it poses.

This notion comes about not only in classical representationalist views, where neural activity is seen as symbols that the brain then manipulates (the perception-cognition-action model, also called sandwich model), but also in alternative views, although it is less obvious. For example, one alternative is to see the brain not as a computer system (encoding symbols, then manipulating them) but as a control system (see Paul Cisek’s behavior as interaction, William Powers’ perceptual control theory, Tim van Gelder’s dynamical view of cognition). In this view, the activity of neurons does not encode stimuli. In fact there is no stimulus per se, as Dewey pointed out: “the motor response determines the stimulus, just as truly as sensory stimulus determines the movement.”

A simple case is feedback control: the system tries to maintain some input at a target value. To do this, the system must compare the input with an internal value. We could imagine for example something like an idealized version of the stretch reflex: when the muscle is stretched, a sensory feedback triggers a contraction, and we want to keep muscle length constant. But this apparently trivial task raises a number of deep questions, as does, more generally, the application of control theory to biological systems. I suppose there is a sensor, a neuron that transduces some physical dimension into spike trains, for example the stretch of a muscle. There is also an actuator, which reacts to a spike by a physical action, for example contracting the muscle with a particular time course. I chose a spike-based description not just because it corresponds to the physiology of the stretch reflex, but also because it will illustrate some fundamental issues (which would exist also with graded transduction, but less obviously so).

Now we have a neuron, or a set of neurons, which receive these sensory inputs and send spikes to the actuator. For this discussion, it is not critical that these are actually neurons; we can just consider that there is a system there, and we ask how this system should be designed so as to successfully achieve a control task.

The major issue here is that the control system does not directly deal with the physical dimension. At first sight, we could think this is a minor issue. The physical dimension gets transduced, and we could simply define the target value in the transduced dimension (e.g. the current). But here we see that the problem is more serious. What the control system deals with is not simply a function of the physical dimension. More accurately, transduction is a nonlinear dynamical system influenced by a physical signal. The physical signal can be constant, for example, while the transduced current decays (adaptation) and the sensory neuron outputs spike trains, i.e., a highly variable signal. This poses a much more serious problem than a simple calibration problem. When the controlled physical value is at the target value, the sensory neuron might be spiking, perhaps not even at a regular rate. The control system should react to that particular kind of signal by not acting, while it should act when the signal deviates from it. But how can the control system identify the target state, or even know whether to act in one or the opposite direction?

Adaptation in neurons is often depicted as an optimization of information transmitted, in line with the metaphor of the day (coding). But the relevant question is: how does the receiver of this “information” know how the neuron has adapted? Does it have to de-adapt, to somehow be matched to the adaptive process of the encoding neuron? (This problem has to do with the dualistic structure of the neural coding metaphor).

There are additional layers of difficulty. We have first recognized that transduction is not a simple mapping from a physical dimension to a biological (e.g. electrochemical) dimension, but rather a dynamical system influenced by a physical signal. Now this dynamical system depends on the structure of the sensory neuron. It depends for example on the number of ionic channels and their properties, and we know these are highly plastic and indeed quite variable both across time and across cells. This dynamical system also depends on elements of the body, or let’s say more generally the neuron’s environment. For example, the way acoustical pressure is transduced into current by an inner hair cell depends obviously on the acoustical pressure at the eardrum, but that physical signal depends on the shape of the ear, which filters sounds. Properties of neurons also change with time, through development and aging. Thus, we cannot assume that the dynamical transformation from physical signal to biological signal is a fixed one. Somehow, the control system has to work despite this huge plasticity and the dynamical nature of the sensors.

Let us pause for a moment and outline a number of differences between physical measurements, as with a thermometer, and biological measurements (or “sensing”):

  • The physical meter is calibrated with respect to an external reference, for example 0°C is when water freezes, while 100°C is when it boils. The biological sensor cannot be calibrated with respect to an external reference.
  • The physical meter produces a fixed value for a stationary signal. The biological sensor produces a dynamical signal in response to a stationary signal. More accurately, the biological sensor is a nonlinear dynamical system influenced by the physical signal.
  • The physical meter is meant to be stable, in that the mapping from physical quantity to measurement is fixed. When it is not, this is considered an error. The biological sensor does not have fixed properties. Changes in properties occur in the normal course of life, from birth to death, and some changes in properties are interpreted as adaptations, not errors.

From these differences, we realize that biological sensors do not provide physical measurements in the usual sense. The next question, then, is how can a biological system control a physical dimension with biological sensors that do not act as measurements of that dimension?

What is computational neuroscience? (XXX) Is the brain a computer?

It is sometimes stated as an obvious fact that the brain carries out computations. Computational neuroscientists sometimes see themselves as looking for the algorithms of the brain. Is it true that the brain implements algorithms? My point here is not to answer this question, but rather to show that the answer is not self-evident, and that it can only be true (if at all) at a fairly abstract level.

One line of argumentation is that models of the brain that we find in computational neuroscience (neural network models) are algorithmic in nature, since we simulate them on computers. And wouldn’t it be a sort of vitalistic claim that neural networks cannot be (in principle) simulated on a computer?

There is an important confusion in this argument. At a low level, neural networks are modelled biophysically as dynamical systems, in which the temporality corresponds to the actual temporality of the real world (as opposed to the discrete temporality of algorithms). Mathematically, those are typically differential equations, possibly hybrid systems (i.e. coupled by timed pulses), in which time is a continuous variable. Those models can of course be simulated on a computer using discretization schemes. For example, we choose a time step and compute the state of the network at time t+dt, from the state at time t. This algorithm, however, implements a simulation of the model; it is not the model that implements the algorithm. The discretization is nowhere to be found in the model. The model itself, being a continuous time dynamical system, is not algorithmic in nature. It is not described as a discrete sequence of operations; it is only the simulation of the model that is algorithmic, and different algorithms can simulate the same model.
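The distinction between a model and the algorithms that simulate it can be shown on the simplest possible case. The sketch below takes a single leaky membrane, tau dv/dt = -v + I (my choice of example, with illustrative parameter values), and simulates it with two different discrete-time algorithms: forward Euler and an exponential update. The time step dt appears in both algorithms but nowhere in the model, whose solution v(t) = I + (v0 - I) exp(-t/tau) is defined for continuous t.

```python
import math

def simulate_euler(v0, I, tau, t_end, dt):
    """Forward Euler: v(t+dt) = v(t) + dt * (-v(t) + I) / tau."""
    v = v0
    for _ in range(int(round(t_end / dt))):
        v += dt * (-v + I) / tau
    return v

def simulate_exponential(v0, I, tau, t_end, dt):
    """Exponential update: v(t+dt) = I + (v(t) - I) * exp(-dt/tau)."""
    decay = math.exp(-dt / tau)
    v = v0
    for _ in range(int(round(t_end / dt))):
        v = I + (v - I) * decay
    return v

def model_solution(v0, I, tau, t):
    """Continuous-time solution of the model itself; no time step involved."""
    return I + (v0 - I) * math.exp(-t / tau)

v0, I, tau, t_end = 0.0, 1.0, 0.010, 0.020
v_model = model_solution(v0, I, tau, t_end)
v_euler = simulate_euler(v0, I, tau, t_end, dt=1e-4)
v_expo = simulate_exponential(v0, I, tau, t_end, dt=1e-4)

print(f"model:             {v_model:.5f}")
print(f"forward Euler:     {v_euler:.5f}")
print(f"exponential update: {v_expo:.5f}")
```

The two procedures are different sequences of operations, with different errors, yet both are simulations of the same model; neither of them is the model.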

If we put this confusion aside, then the claim that neural networks implement algorithms is no longer obvious. It means that trajectories of the dynamical system can be mapped to the discrete flow of an algorithm. This requires: 1) to identify states with representations of some variables (for example stimulus properties, symbols); 2) to identify trajectories from one state to another as specific operations. In addition to that, for the algorithmic view to be of any use, there should be a sequence of operations, not just one operation (i.e., describing the output as a function of the input is not an algorithmic description).

A key difficulty in this identification is temporality: the state of the dynamical system changes continuously, so how can this be mapped to discrete operations? A typical approach in neuroscience is to consider not states but properties of trajectories. For example, one would consider the average firing rate in a population of neurons in a given time window, and the rate of another population in another time window. The relation between these two rates in the context of an experiment would define an operation. As stated above, a sequence of such relations should be identified in order to qualify as an algorithm. But this mapping seems only possible within a feedforward flow; coupling poses a greater challenge for an algorithmic description. No known nervous system, however, has a feedforward connectome.

I am not claiming here that the function of the brain (or mind) cannot possibly be described algorithmically. Probably some of it can be. My point is rather that a dynamical system is not generically algorithmic. A control system, for example, is typically not algorithmic (see the detailed example of Tim van Gelder, What might cognition be if not computation?). Thus a neural dynamical system can only be seen as an algorithm at a fairly abstract level, which can probably address only a restricted subset of its function. It could be that control, which also attaches function to dynamical systems, is a more adequate metaphor of brain function than computation. Is the brain a computer? Given the rather narrow application of the algorithmic view, the reasonable answer should be: quite clearly not (maybe part of cognition could be seen as computation, but not brain function generally).

What is computational neuroscience? (XXIX) The free energy principle

The free energy principle is the theory that the brain manipulates a probabilistic generative model of its sensory inputs, which it tries to optimize by either changing the model (learning) or changing the inputs (action) (Friston 2009; Friston 2010). The “free energy” is related to the error between predictions and actual inputs, or “surprise”, which the organism wants to minimize. It has a more precise mathematical formulation, but the conceptual issues I want to discuss here do not depend on it.

Thus, it can be seen as an extension of the Bayesian brain hypothesis that accounts for action in addition to perception. It shares the conceptual problems of the Bayesian brain hypothesis, namely that it focuses on statistical uncertainty, inferring variables of a model (called “causes”) when the challenge is to build and manipulate the structure of the model. It also shares issues with the predictive coding concept, namely that there is a conflation between a technical sense of “prediction” (expectation of the future signal) and a broader sense that is more ecologically relevant (if I do X, then Y will happen). In my view, these are the main issues with the free energy principle. Here I will focus on an additional issue that is specific to the free energy principle.

The specific interest of the free energy principle lies in its formulation of action. It resonates with a very important psychological theory called cognitive dissonance theory. That theory says that you try to avoid dissonance between facts and your system of beliefs, by either changing the beliefs in a small way or avoiding the facts. When there is a dissonant fact, you generally don’t throw away your entire system of beliefs: rather, you alter the interpretation of the fact (think of political discourse or in fact, scientific discourse). Another strategy is to avoid the dissonant facts: for example, to read newspapers that tend to have the same opinions as yours. So there is some support in psychology for the idea that you act so as to minimize surprise.

Thus, the free energy principle acknowledges the circularity of action and perception. However, it is quite difficult to make it account for a large part of behavior. A large part of behavior is directed towards goals; for example, to get food and sex. The theory anticipates this criticism and proposes that goals are ingrained in priors. For example, you expect to have food. So, for your state to match your expectations, you need to seek food. This is the theory’s solution to the so-called “dark room problem” (Friston et al., 2012): if you want to minimize surprise, why not shut off stimulation altogether and go to the closest dark room? Solution: you are not expecting a dark room, so you are not going there in the first place.

Let us consider a concrete example to show that this solution does not work. There are two kinds of stimuli: food, and no food. I have two possible actions: to seek food, or to sit and do nothing. If I do nothing, then with 100% probability, I will see no food. If I seek food, then with, say, 20% probability, I will see food.

Let’s say this is the world in which I live. What does the free energy principle tell us? To minimize surprise, it seems clear that I should sit: I am certain to not see food. No surprise at all. The proposed solution is that you have a prior expectation to see food. So to minimize the surprise, you should put yourself into a situation where you might see food, i.e., seek food. This seems to work. However, if there is any learning at all, then you will quickly observe that the probability of seeing food is actually 20%, and your expectations should be adjusted accordingly. I will also observe that between two food expeditions, the probability of seeing food is 0%. Once this has been observed, surprise is minimal when I do not seek food. So, I die of hunger. It follows that the free energy principle does not survive Darwinian competition.
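The arithmetic of this example can be written out in a few lines of Python. This is only a sketch of the loose reading used here (surprise as the negative log probability of the observation under my expectations), not the full free energy formalism. The 90% prior expectation of food is a number I made up for the illustration; the 0% and 20% probabilities are those of the example.

```python
import math

def expected_surprise(p_food_world, p_food_model):
    """Average surprise (in nats) of what I observe, given what I expect.

    p_food_world: actual probability of seeing food after the action.
    p_food_model: probability of seeing food according to my expectations.
    """
    total = 0.0
    for p_world, p_model in ((p_food_world, p_food_model),
                             (1.0 - p_food_world, 1.0 - p_food_model)):
        if p_world > 0.0:  # outcomes that never occur contribute nothing
            total += -p_world * math.log(p_model)
    return total

world = {"sit": 0.0, "seek": 0.2}   # probabilities from the example
prior = 0.9                          # hypothetical prior: "I expect food"

# Before learning: expectations are fixed by the prior, whatever I do.
before = {action: expected_surprise(p, prior) for action, p in world.items()}
# After learning: expectations match the statistics of the world.
after = {action: expected_surprise(p, p) for action, p in world.items()}

for action in world:
    print(f"{action:5s} before learning: {before[action]:.2f} nats, "
          f"after learning: {after[action]:.2f} nats")
```

Under the fixed prior, seeking food is the less surprising action (about 1.86 nats against 2.30 for sitting). Once expectations have been adjusted to the world, sitting yields zero surprise and seeking yields about 0.50 nats, so the surprise-minimizing agent stops seeking food.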

Thus, either there is no learning at all and the free energy principle is just a way of calling predefined actions “priors”; or there is learning, but then it doesn’t account for goal-directed behavior.

The idea of acting so as to minimize surprise resonates with some aspects of psychology, like cognitive dissonance theory, but that does not constitute a complete theory of mind, except possibly of the depressed mind. See for example the experience of flow (as in surfing): you seek a situation that is controllable but sufficiently challenging that it engages your entire attention; in other words, you voluntarily expose yourself to a (moderate amount of) surprise; in any case certainly not a minimum amount of surprise.

Draft of chapter 6, Spike initiation with an initial segment

I have just uploaded an incomplete draft of chapter 6, "Spike initiation with an initial segment". This chapter deals with how spikes are initiated in most vertebrate neurons (and also some invertebrate neurons), where there is a hotspot of excitability close to a large soma. This situation has a number of interesting implications which make spike initiation quite different from the situation investigated by Hodgkin and Huxley, that of stimulating the middle of an axon. Most of the chapter describes the theory that I have developed to analyze this situation, called "resistive coupling theory" because the axonal hotspot is resistively coupled to the soma.

The chapter is currently unfinished, because a few points require a little more research, which we have not finished. The presentation is also a bit more technical than I would like, so this is really a draft. I wanted nonetheless to release it now, as I have not uploaded a chapter for a while and it could be some time before the chapter is finished.