What is computational neuroscience? (XXXV) Metaphors as morphisms

What is a metaphor? Essentially, a metaphor is an analogy that doesn’t say its name. We use metaphors all the time without even noticing it, as was beautifully demonstrated by Lakoff & Johnson (1980). When I say for example, “let me cast some light on this issue”, I am using a fairly sophisticated metaphor in which I make an analogy between understanding and seeing. In that analogy, an explanation allows you to understand, in the same way as light allows you to see. You might then reply: I see what you mean, it is clearer! Chances are that, in normal conversation, we would not have noticed that we both used a metaphor.

Metaphors are everywhere in neuroscience, and in biology more generally (see these posts). For example: evolution optimizes traits (see the excellent article of Gould & Lewontin (1979) for a counterpoint); the genome is a code for the organism (see Denis Noble (2011a; 2011b)); the brain runs algorithms, or is a computer (see also Paul Cisek (1999) or Francisco Varela); neural activity is a code.

These metaphors are so ingrained in neuroscientific thinking that many object to the very idea that they are metaphorical. The objection is that “evolution is optimization” or “brain runs algorithms” is not a metaphor, it is a theory. Or, for the more dogmatic, these are not metaphors, these are facts.

Indisputable truths belong to theology, not science, so any claim that a general proposition is a fact should be seen as suspect – it is an expression of dogmatism. But there is a case that we are actually talking about theories. In the case of neural codes or brains as computers, one might insist that the terms “code” or “computer” refer to abstract properties, not to concrete objects like a desktop computer. But this is a misunderstanding of what a metaphor, or more generally an analogy, is. When I am “casting light on this issue”, I am not referring to any particular lamp, but to an abstract concept of light which does not actually involve photons. The question is not whether words are actually some sort of photons, but whether the functional relation between light and seeing is similar to the functional relation between explanation and understanding. There is no doubt that these concepts are abstracted from actual properties of concrete situations (of light and perception), but so are the concepts of code and computer. In the metaphor, it is the abstract properties that are at stake, so the objection “it is not a metaphor, it is a theory” either misunderstands what metaphor is (a metaphor is a theory), or perhaps really means “the theory is correct” – again dogmatism.

For the mathematically minded, a mathematical concept that captures this idea is “morphism”. A morphism is a map that preserves structure. For example, a group homomorphism f from X to Y is such that f(a*b) = f(a) x f(b): the operation * defined on X is mapped to the operation x defined on Y (of course “metaphors are morphisms” is a metaphor!).
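To make the distinction concrete, here is a minimal sketch in Python. It is a toy example of my own choosing, not taken from any reference: X is the integers modulo 6 under addition, Y is the integers modulo 3 under addition, and the names `f`, `g` and `is_morphism` are purely illustrative. The map `f` preserves the operation, while `g` is a perfectly valid map from X to Y that ignores it.

```python
from itertools import product

# Hypothetical toy domains: X = integers mod 6, Y = integers mod 3,
# both with addition as the group operation.
X = range(6)

def op_x(a, b):
    """The operation * on X: addition modulo 6."""
    return (a + b) % 6

def op_y(a, b):
    """The operation x on Y: addition modulo 3."""
    return (a + b) % 3

def f(a):
    """A structure-preserving map: reduce modulo 3."""
    return a % 3

def g(a):
    """A map from X to Y that ignores the structure."""
    return 1 if a == 0 else 0

def is_morphism(h):
    """Check h(a * b) == h(a) x h(b) for every pair in X."""
    return all(h(op_x(a, b)) == op_y(h(a), h(b))
               for a, b in product(X, repeat=2))

print(is_morphism(f))  # f preserves the operation
print(is_morphism(g))  # g fails already at a = b = 0
```

Both `f` and `g` assign an element of Y to every element of X, but only `f` lets you reason in Y about what happens in X.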

For example, in the “let me cast light on this issue” metaphor, I am mapping the domain of visual perception to the domain of linguistic discourse: light -> words; visual object -> issue; seeing -> understanding. What makes the metaphor interesting is that some relations within the first domain are mapped to relations in the other domain: use of light on an object causes seeing; use of words on an issue causes understanding.

Another example in science is the analogy between the heart and a pump. Each element of the pump (e.g. valve, liquid) is mapped to an element of the heart, and the analogy is relevant because functional relations between elements of the pump are mapped to corresponding relations between elements of the heart. Thus, the analogy has explanatory power. What makes a metaphor or an analogy interesting is not the fact that the two domains are similar (they are generally not), but the richness of the structure preserved by the implied morphism.

In other words, a metaphor or an analogy is a theory that takes inspiration from another domain (e.g. computer science), by mapping some structure from one domain to the other. There is nothing intrinsically wrong with this, on the contrary. Why then is the term “metaphor” so vehemently opposed in science? Because the term implies that the theory is questionable (hence, again, dogmatism). There are ways in which understanding is like seeing, but there are also ways in which it is different.

Let us consider the metaphor “the brain implements algorithms”, which I previously discussed. Some are irritated by the very suggestion that this might even be a metaphor. The rhetorical strategy is generally two-fold: 1) by “algorithm”, we mean some abstract property, not programs written in C++; 2) the definition of “algorithm” is made general enough that it is trivially true, in which case it is not a metaphor since it is literally true. As argued, (1) is a misunderstanding of linguistics because metaphor is about abstract properties. And if we follow (2), then nothing can be inferred from the statement. Thus, it is only to the extent that “the brain implements algorithms” is metaphorical that it is insightful (and it is to some extent, but in my view to a limited extent).

The key question, thus, is what we mean by “algorithm”. A natural starting point would be to take the definition from a computer science textbook. The most used textbook on the subject is probably Cormen et al., Introduction to Algorithms. It proposes the following definition: “a sequence of computational steps that transform the input into the output”. One would need to define what “computational” means in this context, but it is not key for this discussion. With this definition, to say that the brain implements an algorithm means that there exists a morphism between brain activity and a sequence of computational steps. That is, intermediate values of the algorithm are mapped to properties of brain activity (e.g. firing rates measured over some time window) – this is the “encoding”. Then we claim that this mapping has the property that a computational step linking two values is mapped to the operation of the dynamics of the brain linking the two corresponding neural measurements. I explain in the third part of my essay on neural coding why this claim cannot be correct in general, or can be so only approximately (one reason is that a measurement of neural activity must be done on some time window, and thus cannot be considered as an initial state of a dynamical system, from which you could deduce the future dynamics). But this is not the point of this discussion. The point is that this claim, that there is a morphism between an algorithm and brain activity, is not trivial and it has explanatory value. In other words, it is interesting. This stems from the rich structure that is being mapped between the two domains.
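The claim can be written as a condition to check: encoding the result of a computational step should give the same thing as letting the dynamics act on the encoded value. The sketch below is a deliberately artificial illustration of that condition, not a model of any real circuit. The counter algorithm, the 10 Hz-per-unit encoding and the two “dynamics” are all assumptions made up for the example.

```python
# Hypothetical toy setup; none of these functions model a real neural circuit.

def step(x):
    """One computational step of the abstract algorithm: increment a counter."""
    return x + 1

def encode(x):
    """Assumed encoding: the value x is read out as a firing rate of 10*x Hz."""
    return 10.0 * x

def dynamics_a(rate):
    """Hypothetical neural dynamics: the rate rises by 10 Hz per step."""
    return rate + 10.0

def dynamics_b(rate):
    """Another hypothetical dynamics: the rate grows by 50% per step."""
    return rate * 1.5

def preserves_steps(dynamics, values):
    """Check encode(step(x)) == dynamics(encode(x)) for all tested values."""
    return all(abs(encode(step(x)) - dynamics(encode(x))) < 1e-9
               for x in values)

print(preserves_steps(dynamics_a, range(5)))  # the steps are mirrored by the dynamics
print(preserves_steps(dynamics_b, range(5)))  # same encoding, but the steps are not preserved
```

The encoding is the same in both cases; what differs is whether the relation between successive values survives the mapping. That is the non-trivial part of the claim, and it is the part that can be argued about empirically.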

Since it is not trivial (as in fact any metaphor), a discussion will necessarily arise about whether and to what extent the implied mapping does in fact preserve structure between the two domains. You could accept this state of affairs and provide empirical or theoretical arguments. Or you could dismiss the metaphorical nature entirely. But by doing so, you are also dismissing what is interesting about the metaphor, that is, the fact that there might be a morphism between two domains. We could for example redefine “algorithm” in a more general way as a computable function, even if it is not what is usually meant by that (as the Cormen textbook shows). But in that case, the claim loses all explanatory value because no structure at all is transported between the two domains. We are just calling sensory signals “input” and motor commands “output” and whatever happens in between “algorithm”. In mathematical terms, this is a mapping but not a morphism.

Thus, metaphors are interesting because they are morphisms between domains, which is what gives them scientific value (they are models). The problem, however, is that metaphors are typically covert, and failure to recognize them as such leads to dogmatism. When one objects to the use of some words like “code”, “algorithm”, “representation”, “optimization”, a common reaction is that the issue “is just semantic”. What this means is that it is just about arbitrary labels, and the labels themselves do not really matter. As if scientific discourse were essentially uninteresting and trivial (we just observe things and give them names). This reaction reveals a naïve view of language where words are mappings (between objects and arbitrary labels), when what matters is the structured concepts that words refer to through morphisms, not just mappings. This is what metaphor is about.

A criticism of homo economicus (Or: people are neither rational nor irrational)

The mainstream theory of economics, neoclassical economics, is based on a very peculiar model of human behavior and social interactions. The core assumption is that people’s behavior consists in maximizing “utility”, which is a measure of personal preferences. That is, each situation is assigned some utility and people choose the situation with maximal utility, by making the best possible use of available information. This is called “rational behavior” (this is somewhat related to the view in psychology that perceptual behavior is optimal, which I have criticized on similar grounds).

This model has been criticized repeatedly on empirical grounds, in particular on the grounds that humans are actually not that rational: psychology has documented numerous cognitive biases, and so on. This line of criticism forms an entire field, behavioral economics. Epistemologically, economics is quite a particular field because lack of empirical evidence for its core models or even direct empirical contradiction does not seem to be a problem at all. One reason is that the ambition of economic theory is not just empirical but also normative, i.e., it also has a political dimension. In other words, if reality does not fit the model, then reality should be changed so as to fit it (hence the prescription of free markets). It is of course questionable that theories can be called scientific if they constitutively offer no possibility of empirical grounding.

Thus, although the assumptions of neoclassical economics have been pretty much demolished on empirical grounds by psychology (actual behavior of people) and anthropology (actual social interactions; see for example David Graeber’s “Debt: The First 5000 Years”), it still remains the dominant mode of economic thinking because it is intellectually appealing. Of course, political interference certainly has a role in this state of affairs, but here I want to focus on the intellectual aspects.

When the field of behavioral economics points out that humans actually do not behave “rationally”, those deviations are depicted as flaws or “bounds on rationality”. If you are not rational, then you are irrational. This is really not a radical criticism. We are bound to conclude that the rational agent is an approximation of real behavior, and everybody knows that a model cannot be exact. Perhaps the model could be amended, made more sophisticated, or perhaps we should educate people so that they are more rational (this seems to be Daniel Kahneman’s view). But fundamentally, “rational behavior” is a sensible, if imperfect, model of human behavior and social interactions.

What these criticisms miss is the fact that both “rational behavior” and “irrational behavior” have in common several implicit assumptions, which are not only empirically questionable but also absurd foundations for an economic theory – and which therefore cannot ground a normative approach any more than an empirical approach.

1) The first problem is with the idea of “rationality”. Rationality is something that belongs to the domain of logic, and which can therefore only be exerted on a particular model. Thus, to describe human behavior as “rational”, we must first assume that there exists a fixed model of the world and personal preferences, and that this model is not a subject of inquiry. In particular, personal preferences are given and cannot be changed. If, however, the advertisement business is not totally foolish, then this is wrong. Not only do personal preferences change, but one way of satisfying your own desires is by manipulating the desires of others, and this appears to be a large part of the activity of modern multinational companies. The fact that personal preferences are actually not fixed has two problematic consequences: (a) you cannot frame behavior as optimization if the optimization criterion is also a free parameter; (b) it becomes totally unclear how satisfying people’s preferences is supposed to be a good thing, if that means making them want what you sell, rather than selling them what they need; what preexisting economic problem is being solved in this way? Personal preferences can also be changed by the individual herself: for example she can decide, after reflection, that buying expensive branded clothes is futile (see e.g. cognitive dissonance theory about how people change their preferences and beliefs). But again, if that possibility is on the table, then how can we even define “rational behavior”? Is it to buy the expensive clothes or is it to change the “utility function”? Assuming preferences are fixed properties of people is the move that allows economic theory to avoid philosophical and in particular ethical questions (what is “good”? see e.g. Stoicism and Buddhism), as well as the possibility that people influence each other for various reasons (manipulation, but also conversation and education).
Unfortunately those questions do not disappear just by ignoring them.

2) The assumption of “rationality” also assumes that people have a fixed model of the world over which that rationality is exerted. They do not learn, for example, and they do not need to be taught either. They just happen to know everything useful there is to know about the world. Building an adequate model of the world, of the consequences of one’s actions, is considered outside the realm of economic theory. But in a normative perspective, this is really paradoxical. One aim of economic theory is to devise efficient organizations of work, in particular ones which ensure the distribution of accurate information to the relevant people. But by postulating that people are “rational agents”, economic theory considers as already solved the problem it is supposed to address in the first place. In other words, the problem of designing rational organizations of production is dismissed by postulating that people are rational. No wonder that this view leads to the bureaucratization of the economy (see David Graeber’s Bullshit Jobs and Béatrice Hibou’s The Bureaucratization of the World in the Neoliberal Era).

3) Finally, implicit in the idea of “rational behavior” is a caricatural reductionism. That is, the presumption that the optimization of individual preferences is realized at the individual level. This, in fact, amounts to neglecting the possibility that there are social interactions – quite problematic for a social science. A well-known example in game theory is the prisoner’s dilemma: two criminals are arrested; if they both remain silent, they will do one year in prison; if one betrays the other, he is set free and the other goes to jail for three years; if both betray each other, they both go to jail for two years. Whatever the other decides to do, it is always in your best interest to betray him: this would be the “rational behavior”. The paradox is that two “rational” criminals would end up in jail for two years, while two “irrational” criminals who did not betray each other would do just one year. Thus, “rationality” is not necessarily the most advantageous way to organize social interactions. Or to rephrase, individual rationality is not the same as collective rationality. This is of course a well-known problem in economics, in particular in the “tragedy of the commons” version. But again, this tends to be depicted as an amendment to be made to the core assumption of rationality (cf. the concept of “externalities”), when it actually demonstrates the fallacy of the concept of “individual rationality”. Accordingly, neoclassical economists propose to solve the problem by incentives (e.g. carbon tax). But first of all, this is not the same as building collective infrastructures. And second, what this means is that anything that cannot be modeled as independent individual actions is not addressed by the economic theory, but instead must be tailored in the form of an “incentive structure”.
Each collective problem now requires its own complex “incentive structure” designed in such a way that the “free” play of individual rationalities ensures the collective good, which is to say that each collective problem must be solved in an ad hoc way outside of the conceptual framework of the theory. In other words, with its focus on “rational behavior”, neoclassical economics sets out to solve exclusively problems that do not involve social interactions. It is not clear, then, what the theory is meant to solve in the first place (how do omniscient independent agents manage to organize themselves?), or to demonstrate (selfishness entails collective good, except when it doesn’t?).
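The prisoner’s dilemma described under point 3 is small enough to check by hand, but it can also be written out as a short sketch. The prison terms are the ones stated above; the names `YEARS`, `best_response` and `total_years` are illustrative choices of mine.

```python
SILENT, BETRAY = "silent", "betray"

# YEARS[(mine, other)] = years in prison for me, using the terms stated above.
YEARS = {
    (SILENT, SILENT): 1,
    (SILENT, BETRAY): 3,
    (BETRAY, SILENT): 0,
    (BETRAY, BETRAY): 2,
}

def best_response(other):
    """The individually 'rational' choice: minimize my own prison term."""
    return min((SILENT, BETRAY), key=lambda mine: YEARS[(mine, other)])

def total_years(a, b):
    """Combined prison term of both prisoners."""
    return YEARS[(a, b)] + YEARS[(b, a)]

# Betraying is the best response whatever the other prisoner does...
print(best_response(SILENT), best_response(BETRAY))
# ...yet mutual betrayal costs more in total than mutual silence.
print(total_years(BETRAY, BETRAY), total_years(SILENT, SILENT))
```

The individual optimization is carried out correctly in both cases; it is the framing of the problem as two independent optimizations that produces the worse collective outcome.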

This issue is actually an important theme of evolutionary theory. Namely, how can social species exist at all, if individualist behavior is rewarded by increased survival and reproduction rate? The answer that evolutionary theory has come to, as well as anthropology and ethology of social animals including primates (see e.g. Frans de Waal’s books), is that social animals display a variety of non-individualist behaviors based on altruism, reciprocity and authority, which ensure successful social interactions and therefore are beneficial for the species. In other words, studies in all those non-economic fields have converged to demonstrate that efficient collective organizations are not based on individual rationality. This conclusion is not immensely surprising, yet it is essentially the opposite of mainstream economic theory.

In summary, the problem with the “rational behavior” model of human behavior that underlies neoclassical economics is not that people are “irrational”. The problem is that framing human behavior in terms of individual rationality already assumes from the outset that 1) people already have an accurate model of the world, and so no social organization is required to ensure that people’s actions have their intended consequences: this is already solved by people’s “rationality”; 2) people have preexisting fixed “preferences”, and so we don’t need to care about what a “good society” might mean: this is already taken care of by the “preferences”; 3) there is no collective rationality beyond individual rationality, and so there is in fact no society at all, just a group of independent people. Thus, the epistemological implications of the “rational behavior” model are in fact tremendous: essentially, the model amounts to putting aside all the problems that economic theory is supposed to solve in the first place. In other words, the “rational behavior” model of neoclassical economics is not just empirically wrong, it is also theoretically absurd.


p.s.: This is partially related to a recent discussion in perceptual psychology on the presumed optimality of human behavior. Rahnev & Denison (2018) review an extensive literature to show that in perceptual tasks, people are actually not optimal. These findings are referred to in the title as “suboptimality”, but in my view this is an unfortunate terminology. My objection to this terminology is that it implicitly accepts the framework of optimization, in which there already is a fixed model of the world for which we only need to tune the parameters. But this means ignoring what perception is largely about, namely modeling the world (object formation, scene analysis, etc).

How belief systems handle contradiction - (I) Empirical contradiction

In this essay, I will discuss the different ways in which a theory can be contradicted, and how theories react. The scope of this discussion is broader than science, so I will be discussing belief systems, of which scientific theories are a particular kind (although, according to Feyerabend, not that particular). Another kind of belief system is political theories, for example. What is a belief system? Roughly speaking (and it will get more precise in the discussion), it is a set of propositions about the world that have a universal character. In science, this could be the law of gravitation, for example. Those propositions have relations with each other, and thus they form a system. For example, some propositions might logically imply others. In a belief system, there are generally core concepts on which other propositions build (examples: the atom; the rational agent of economic theory).

How do we evaluate belief systems? In philosophy of science, it is generally considered that scientific theories are evaluated empirically, by testing the empirical validity of propositions. That is, we evaluate the extent to which propositions are contradicted by facts. This has been the core target of much of modern philosophy of science, and thus I will start by recapitulating arguments about empirical contradiction, and add a few remarks. Less discussed are two other types of contradiction, social and theoretical. By social contradiction, I refer to the fact that at any given time, different people hold contradictory beliefs, even when they are aware of the same empirical body of observations. How is this possible, and do such contradictions get resolved? By theoretical contradiction, I refer to the possibility that a system is in fact not logically coherent. It seems that in philosophy of knowledge, belief systems are generally seen as a set of logically consistent propositions, but I will argue that this view is not tenable or rather is a normative view, and that belief systems actually are in some sense “archipelagos of knowledge”.

Empirical contradiction

Science is largely dominated by empiricism. One version of it is the logical empiricism of the Vienna Circle (or logical positivism), dating from the early 20th century. In a nutshell, it claims that scientific statements are of two types: elementary propositions whose truth can be verified empirically, in other words observations, and propositions that can be logically deduced from those elements. This leads to a bottom-up view of science, where experimental scientists establish facts, and then theoreticians build consistent theories from these facts. As far as I can see in my own field, this view is still held by a large portion of scientists today, even though it has been pretty much demolished by philosophy of science in the course of the 20th century. To give an example, logical empiricism is the philosophical doctrine that underlies the logic of the Human Brain Project, whose core idea is to collect measurements of the brain and then build a model from those.

Karl Popper objected that, on logical grounds, propositions can in fact never be verified if they have a universal nature. For example, to verify the law of gravitation, you would have to make all apples in the world fall, and you would still be unsure of whether another apple in the future might not fall in the way you expect. Universal propositions can only be contradicted by observations. This leads to falsificationism, the idea that scientific theories can only be falsified, not verified. On this view, at any given time, there are different theories that are consistent with the current body of experimental observations, and science progresses by elimination, by coming up with critical tests of theories. For example, one of the motivations advanced for collecting “big data” in neuroscience, such as the connectome, is that theories are presumed to be insufficiently constrained by current data (Denk et al., 2012). This view is extremely popular in biology today, even though, again, later work in philosophy of science has also pretty much demolished it.

Paraphrasing Quine (see Two Dogmas of Empiricism and the Duhem-Quine thesis), we can object that a theory never gets tested directly, only models specific to a particular situation do. For example, if you wanted to test Newton’s laws, you could let an apple fall and measure its trajectory. But to do this, you would first need to come up with a model of the situation, where for example you would consider the apple as a point of a given mass subject to only the force of gravity. In this case, you would find a discrepancy with the measured trajectory and conclude that Newton’s laws are false. But you would have concluded differently if you had added an auxiliary assumption, that air also exerts a friction force on the apple, for which you would have to come up with a particular model.

Kuhn and Lakatos have pointed out that in fact, the way empirical contradictions are resolved is almost never by abandoning a theory. The process is rather one of interpretation, that is, of coming up with ways of making the observation congruent with the theory. This could be seen as a rhetorical maneuver, or as a fruitful scientific process. In this example, if you think that Newton’s laws are valid, then you would actually deduce the laws of friction from the empirical contradiction. Laws of friction are in fact very complicated in general and still an active field of research in physics, which draws on various domains of physics, and to make progress one has to accept the underlying theories.

The key point is to recognize that interpretation is not a flaw of the scientific process, but a logical necessity to confront a theory with reality. A theory is framed in the discrete structure of language, that is, in formal terms. For it to apply to anything in the world, things in the world must be mapped to the formal structure of the theory. This, in essence, is the process of modeling. In contrast with theory, a model does not have a universal character; it applies to a specific situation. In the example above, we would have to introduce the assumption that the apple is a rigid body, and that friction follows a particular law, for example that the friction force is proportional to speed. This implies that it is actually not possible to either verify or falsify a theory on the basis of an empirical observation.

This argument does not lead to a relativist view (all theories have the same epistemic value and it is a question of taste); in this, I would temper some of the conclusions of Feyerabend. Interpretation is in fact not only a logical necessity, but also a key element in scientific progress. Lakatos proposed that it is not theories that compete, but research programs. Some research programs are “degenerating”: they evolve by adding disparate ad hoc hypotheses to account for each new observation. Others are “progressive”: they evolve by extending their theoretical core, which then applies to new situations. In scientific practice, this is obtained by dissolving the specific character of interpretations into the universal character of theories. To come back to the apple example, initially we would interpret the empirical contradiction by coming up with an empirical model of friction, which essentially amounts to calling the empirical error “friction”. More precisely, it is an auxiliary hypothesis that makes the observation compatible with the theory. But this could then be turned into theoretical progress: from an analysis of a number of cases of falls of objects, we could then postulate that there is a friction force that is proportional to the speed of the apple and to its size (Stokes’ law). By doing so, we make parts of the previous interpretations instances of a new theoretical proposition. Note that this proposition only makes sense in the context of Newton’s laws, and thus we are indeed describing a system and not just a set of independent laws. The evaluative situation of the apple fall has now changed: we are evaluating a broader theoretical body (Newton’s laws + Stokes’ law) by using a narrower interpretative model (the apple is a rigid sphere).
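The apple example can be sketched numerically. The code below integrates m dv/dt = m g − k v for a point mass dropped from rest, so the same Newtonian core yields two different predictions depending on the auxiliary hypothesis about friction (k = 0, or a friction force proportional to speed). All numerical values (height, mass, drag coefficient) are hypothetical and chosen only to make the difference visible; they are not measurements of a real apple, for which drag in air is not well described by a linear law.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2

def fall_time(height, mass, drag_coeff, dt=1e-4):
    """Time (s) for a point mass to fall `height` metres from rest.

    Integrates m dv/dt = m g - k v with a simple Euler scheme.
    drag_coeff = 0 is the 'gravity only' model of the situation.
    """
    t, y, v = 0.0, 0.0, 0.0
    while y < height:
        a = G - (drag_coeff / mass) * v  # acceleration under gravity and linear drag
        v += a * dt
        y += v * dt
        t += dt
    return t

# Hypothetical values: 3 m drop, 0.1 kg apple, exaggerated drag of 0.05 kg/s.
t_no_friction = fall_time(3.0, 0.1, 0.0)
t_friction = fall_time(3.0, 0.1, 0.05)

print(t_no_friction, math.sqrt(2 * 3.0 / G))  # numerical vs closed-form free fall
print(t_friction)                             # longer fall time with friction
```

A measured fall time that disagrees with the first number does not by itself refute Newton’s laws: it refutes the conjunction of the laws and the frictionless model, which is why the discrepancy can be reinterpreted as a friction term.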

Thus, interpretation is a key feature of belief systems, both logically necessary and progressive, and it appears to be neglected by the two flavors of empiricism that are broadly observed in science (verificationism and falsificationism). Yet without it, it is impossible to understand why people can disagree, other than by postulating that some must be either idiots or liars. I will address this issue in the next part on social contradiction.

So far, I have argued that scientists face empirical contradiction by interpretation. Theoretical progress is then made by dissolving interpretations into new theory. What this really means is that the very notion of “empirical contradiction” is in fact quite misleading, because for the person doing the interpretative work, there is no real contradiction, only a more complex situation than expected. I will end this part by drawing on developments of psychology, specifically cognitive dissonance theory, and extending to non-scientific situations.

Resolving empirical contradiction by interpretation is not at all specific to science, but is a general feature of how people confront their beliefs with facts. In When Prophecy Fails, Leon Festinger and colleagues infiltrated a UFO sect that believed in an imminent apocalypse, and they examined what happened when the predicted end of the world did not happen. Believers did not at all abandon their belief. Instead, the leader claimed that they had managed to postpone the end of the world thanks to their prayers, and she proposed a new date. This is a case of interpretation of the observation within the belief system. But importantly, as discussed above, interpretation is not a flaw of human nature, but a necessary feature of belief systems. In this case, the believers appear to have arbitrarily made up an ad hoc justification, and we are tempted to dismiss it as a hallmark of irrational thinking. But when we observe the anomalous trajectory of Jupiter and, to make up for this anomaly, we postulate that there must be an unobserved satellite orbiting around the planet, we are making an interpretative move of the same nature, except in this case it turns out to be correct. Our initial reaction in the former case is that any reasonable person should reject the theory if the prediction is directly contradicted, yet in the latter case we find the same attitude reasonable. In reality, the main difference between the two cases is not in the way empirical contradictions are handled, but in the perceived plausibility of both the prediction and the interpretation. Specifically, we do not believe that prayers can have any visible effect at all and thus the interpretative move appears irrational. But of course, the situation is quite different if the powers of prayer have a prominent role in our belief system. Thus, it is an error to describe the ad hoc interpretation as irrational.
It is actually totally rational, in that it follows from logical reasoning (A, the end of the world should have occurred if we had not intervened; B, we have prayed; C, prayers have an impact; conclusion: the end of the world has probably been prevented by our prayers). Only, rationality is applied within a highly questionable theoretical framework. In the end, we realize that it is not really the non-occurrence of the end of the world that should lead us to abandon the belief system, but rather the empirical contradictions of the belief system as a whole, for example the fact that prayers actually do not work.

Thus, it should not be surprising that in their field study, the authors find that it takes a number of failed end-of-the-world predictions before the beliefs are finally abandoned. This is what Imre Lakatos called a “degenerating research program”: the theory survives repeated contradictions only by making up an incoherent series of ad hoc assumptions. It ends up being overthrown, but the process can take a very long time (this process of change between scientific theories is also well documented in Thomas Kuhn’s The Structure of Scientific Revolutions).

This phenomenon is particularly visible in political discourse. Any significant political event or observation will be given a different interpretation depending on the political preferences of the person. It never happens that a right wing person turns left wing upon noticing that wealthy countries have many homeless people. Rather, an interpretation is made that revolves around the notion of personal responsibility.

To give a contemporary example, recent demonstrations in France have been met by an extraordinarily repressive response, with hundreds of serious injuries caused by police, some inflicted on journalists and children, which are documented by hundreds of videos circulating on social media (see @davduf and a recent article in Le Monde). A recent case is that of a 47-year-old volunteer firefighter, father of three, who was demonstrating with his wife until they were dispersed by tear gas. He was later found lying alone in an empty street with a head injury, and an amateur video shows that police shot him in the head from behind with a flash-ball gun and launched a grenade in the street. The man is currently in a coma. In any of those cases, some of the comments on social media invariably suggest that the man or woman must have done something bad (this is indeed the official doctrine, as the Minister of the Interior has recently claimed that he is not aware of any case of police violence). This is not in itself irrational: the commenter simply presumes that police do not hurt innocent citizens, deduces that the citizen involved is not innocent, and concludes that the critics are irrational conspiracy seekers.

There are indeed conspiracy theorists, for example those who claim that the landing on the moon was in fact filmed in Hollywood studios. The fact that it is a conspiracy theory is not itself a reason to discredit it, since there have been conspiracies in History. The theory itself is also not irrational, in that it has logical coherence as well as empirical elements of support. For example, in the video the American flag appears to float in the wind, whereas there can be no wind on the moon and in fact the flag should hang folded. Indeed, the flag was made rigid precisely for that reason. But most people do not know this fact, and thus the reason why the ordinary citizen believes that man actually did land on the moon is that she trusts the source and finds the information plausible. Which attitude is irrational?

These examples illustrate several points. First, isolated empirical contradictions almost never shake a belief system. In fact, this is precisely what we mean when we say that “extraordinary claims require extraordinary evidence”. This proposition, however, is quite misleading since the notion of what is extraordinary is specific to a particular belief system. There is no objective definition of “extraordinary”. Therefore, rather than being a normative feature of scientific method, it simply expresses the inherent conservativeness of belief systems. As Festinger’s study shows, it can take a large number of empirical contradictions to impact a belief system. As explained previously, this is not necessarily a flaw, as those contradictions can (sometimes) be turned into theoretical progress within the belief system.

But there are other ways in which empirical contradictions are handled by belief systems, which are documented in psychology by cognitive dissonance theory. A major one is simply to avoid being confronted with those contradictions, for example by reading newspapers with the same political views, or by discrediting the source of information without actually examining the information (e.g. social media propagate fake news, therefore the video cannot be trusted). Another is proselytism, that is, trying to convert other people to your belief system.

These mechanisms explain why, at any given moment, mutually contradictory belief systems are held by people who live in the same world and are in contact with each other, and who can even discuss the same empirical observations. The conclusion of our discussion is that the main problematic issue in belief systems is not so much irrationality as dogmatism (but we will come back to irrationality in the third part on theoretical contradiction). Dogmatism arises from two different attitudes: blindness and deafness. Dogmatism is blind in that it actively refuses to see empirical evidence as potentially contradictory: it is not just that it is contradicted by empirical observations (this very notion is questionable), but rather that it dismisses empirical contradiction without seriously trying to accommodate it in a progressive way (i.e., by strengthening its theoretical core). Dogmatism is deaf in that it refuses to acknowledge the possibility that other rational belief systems can exist, and may have diverging interpretations of the empirical body of observations. Dogmatism denies the theoretical possibility of disagreement: the opponent is always either an idiot (irrational) or a liar (has ulterior motives). In the next part, I will turn to social contradiction: how different belief systems can co-exist and influence each other.

Political epistemology (I) Introduction

In this series, I am interested in the epistemology of political thought. By epistemology I mean here the theory of knowledge in general, that is, not only scientific knowledge but knowledge of the world more broadly. In political thought, there is of course a normative aspect (how society should be organized and to what end), but this normativity always rests on a theory of the world: how society, the economy and power structures work. It also always includes a number of presuppositions about human psychology. For example, according to a theory typically associated with the conservative right, man is a wolf to man (Hobbes); it follows that institutions must be created to protect man from his fellow men. A variant (liberal right) is that man seeks above all to maximize his personal interest; it follows that social institutions must be organized so that personal interest coincides with collective interest. According to another theory, rooted in the left, man is naturally altruistic (Rousseau); it follows that society must be organized to facilitate cooperation between men (with of course many variants: anarchisms, communisms, etc.).

Political doctrines are thus largely determined by underlying theories of man and of the world. Consequently, political disagreements are very often tied to disagreements about these theories, and therefore play out on the epistemological level. For example, right-wing discourses tend to present themselves as “realistic”, and the liberal right as “rational”, in contrast to a left that is supposedly “utopian”. Using these words places the debate not on the level of political ends but on that of knowledge: one concedes that the ends of the criticized political systems are laudable, but claims that they rest on a false view of how the world works. The judgment thus really bears on knowledge, on the empirical validity of the underlying theories.

The neoliberal political system, for example (usually classified as center-right), generally rejects the label “neoliberal”. Why? Because its proponents do not think they are following a particular doctrine, but simply expressing what is rational, “logical”. From an epistemological point of view, this stance is open to criticism, since logic operates within a formal framework, and therefore within a particular model. In other words, rationality is exercised within a particular theory of the world, so that two contradictory discourses can both be rational, but relative to different theories. For example, Keynesian and neoclassical discourses are two contradictory rational discourses, because they rest on different models. In keeping with this rationalist stance, neoliberal or neoclassical thought relies largely on mathematized knowledge, that is, knowledge whose questions bear on formal aspects rather than on the empirical validity of the underlying model (such as the concept of the rational agent). The theory consequently tends to ignore the fields of knowledge that make it possible to question socio-economic models empirically, such as history, sociology and anthropology. One can therefore formulate an epistemological critique of this political thought.

Symmetrically, left-wing discourse tends to portray right-wing political thought as the devious manifestation of bad intentions. For example, right-wing economic discourse tends to promote tax cuts, in particular for the richest. Left-wing critics see this as the defense of the interests of a dominant class. Why? Once again, the question can be analyzed from an epistemological angle. According to left-wing theories, taxation is what allows wealth to be distributed fairly. Consequently, a measure that tends to reduce taxes favors the wealthy classes of the population. It follows that a political system that promotes this measure must aim to favor these classes. Once again, this is simply the expression of rationality within a theoretical framework. We thus have, in neoliberal discourse and in its critique, two rational discourses within different theoretical frameworks.

This last example raises another interesting epistemological point, namely the way proponents of one theory judge those of another. In this example we see two forms of contempt facing each other: the neoliberal considers his critic an idiot (irrational); the critic considers the neoliberal selfish and in bad faith. In both cases, the proponent judges his opponent using his own theoretical framework, that is, as if the opponent were using the same theoretical framework. Indeed, the neoliberal judges his critic to be an idiot because this critic would indeed be an idiot if he adopted the neoliberal theoretical framework but failed to draw its logical conclusions (hence the recurrent claim by those in government, when facing their opposition, that they must “do some pedagogy”, which is rightly perceived as contempt). Likewise, the opponent judges the neoliberal to be in bad faith, that is, he considers that the neoliberal is fully aware that his policy proposals favor the dominant class. But this presupposes that the neoliberal has adopted the opponent’s alternative theoretical framework. In both cases, then, each seems to overlook the possibility that his own way of thinking is a theory he is convinced of, and not an obvious and universal truth. The result is a rather sterile critique, in that it bears not on the foundations (notably empirical) of the competing theories but on the supposed competence of the interlocutors on one side (arguments from authority) and on personal or class interests on the other (invective).

We touch here on two distinct points. On the one hand, the epistemological status of theories (are they all equally valid, mere points of view? or can they be judged empirically or theoretically?). On this point, we can draw on a rich literature in philosophy of science. On the other hand, the psychology of beliefs: what makes us believe in certain theories rather than others, and possibly change our minds? On this second point, we can also draw on a rich literature in social psychology, such as cognitive dissonance theory, which I will discuss in a future text. This theory proposes that we seek to make our actions and beliefs consistent not only by acting in accordance with our beliefs, but also in many cases by adapting our beliefs to our actions (among other things). For example, a person who earns a lot of money may convince themselves that a political system that favors inequality is more efficient. This explains the alignment between sociological categories and political beliefs, and it is a more satisfactory explanation than the theory of bad faith. It is indeed no coincidence that the upper classes tend to adopt a theory that justifies their social position (e.g. neoliberal theory), but this does not mean that this adoption is cynical. On the contrary, these beliefs are sincere. Simply, different sociological and cultural categories are more or less likely to adopt different beliefs.

In this series, I therefore intend to develop an epistemological critique of political discourse.

What is computational neuroscience? (XXXIV) Is the brain a computer (2)

In a previous post, I argued that the way the brain works is not algorithmic, and therefore it is not a computer in the common sense of the term. This contradicts a popular view in computational neuroscience that the brain is a kind of computer that implements algorithms. That view comes from formal neural network theory, and the argument goes as follows. Formal neural networks can implement any computable function, which is a function that can be implemented by an algorithm. Thus the brain can implement algorithms for computable functions, and therefore is by definition a computer. There are multiple errors in this reasoning. The most salient error is a semantic drift on the concept of algorithm; the second major error is a confusion about what a computer is.


A computable function is a function that can be implemented by an algorithm. But the converse “if a function is computable, then whatever implements this function runs an algorithm” is not true. To see this, we need to be a bit more specific about what is meant by “algorithm” and “computable function”.

Loosely speaking, an algorithm is simply a set of explicit instructions to solve a problem. A cooking recipe is an algorithm in this sense. For example, to cook pasta: put water in a pan; heat up; when water boils, put pasta; wait for 10 minutes. The execution of this algorithm occurs in continuous time in a real environment. But what is algorithmic about this description is the discrete sequential flow of instructions. Water boiling itself is not algorithmic; the high-level instructions are: “when condition A is true (water boils), then do B (put pasta)”. Thus, when we speak of algorithms, we must define what is considered as elementary instructions, that is, what is beneath the algorithmic level (water boils, put pasta).

The textbook definition of algorithm in computer science is: "a sequence of computational steps that transform the input into the output." (Cormen et al., Introduction to algorithms; possibly the most used textbook on the subject). Computability is a way to formalize the notion of algorithm for functions of integers (in particular logical functions). To formalize it, one needs to specify what is considered an elementary instruction. Thus, computability does not formalize the loose notion of algorithm above, i.e., any recipe to calculate something, for otherwise any function would be computable and the concept would be empty (to calculate f(x), apply f to x). A computable function is a function that can be calculated by a Turing machine, or equivalently, which can be generated by a small set of elementary functions on integers (with composition and recursion). Thus, an algorithm in the sense of computability theory is a discrete-time sequence of arithmetic and logical operations (and recursion). Note that this readily extends to any countable alphabet instead of integers, and of course you can replace arithmetic and logical operations with higher-order instructions, as long as they are themselves computable (i.e., a high-level programming language). But it is not any kind of specification of how to solve a problem. For example, there are various algorithms to calculate pi. But we could also calculate pi by drawing a circle, measuring both the diameter and the perimeter, then dividing perimeter by diameter. This is not an algorithm in the sense of computability theory. It could be called an algorithm in the broader sense, but again note that what is algorithmic about it is the discrete structure of the instructions.
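To make the strict sense concrete, here is a minimal sketch of one of the "various algorithms to calculate pi": a discrete sequence of arithmetic operations on integers. The choice of the Leibniz series, the function name `leibniz_pi` and the use of exact fractions are mine, for illustration only.

```python
from fractions import Fraction

def leibniz_pi(n_terms):
    """Approximate pi as 4 * sum over k < n_terms of (-1)^k / (2k + 1)."""
    total = Fraction(0)
    for k in range(n_terms):              # discrete steps, one per term
        sign = 1 if k % 2 == 0 else -1    # logical operation on an integer
        total += Fraction(sign, 2 * k + 1)  # exact rational arithmetic
    return 4 * total

approx = leibniz_pi(1000)
print(float(approx))  # close to pi, with an error of about 1/1000
```

Every step here is an elementary operation on integers, which is what makes it an algorithm in the sense of computability theory. Drawing and measuring a circle yields the same number without any such step.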

Thus, a device could calculate a computable function using an algorithm in the strict sense of computability, or in the broader sense (cooking recipe), or in a non-algorithmic way (i.e., without any discrete structure of instructions). In any case, what the brain or any device manages to do bears no relation to how it does it.

As pointed out above, what is algorithmic about a description of how something works is the discrete structure (first do A; if B is true, then do C, etc). If we removed this condition, then we would be left with the more general concept of model, not algorithm: a description of how something works. Thus, if we want to say anything specific by claiming that the brain implements algorithms, then we must insist on the discrete-time structure (steps). Otherwise, we are just saying that the brain has a model.

Now that we have more precisely defined what an algorithm is, let us examine whether the brain might implement algorithms. Clearly, it does not literally implement algorithms in the narrow sense of computability theory, i.e., with elementary operations on integers and recursion. But could it be that it implements algorithms in the broader sense? To get some perspective, consider the following two physical systems:

(A) are dominoes, (B) is a tent (illustration taken from my essay “Is coding a relevant metaphor for the brain?”). Both are physical systems that interact with an environment, in particular which can be perturbed by mechanical stimuli. The response of dominoes to mechanical stimuli might be likened to an algorithm, but that of the tent cannot. The fact that we can describe unambiguously (with physics) how the tent reacts to mechanical stimuli does not make the dynamics of the tent algorithmic, and the same is true of the brain. Formal neural networks (e.g. perceptrons or deep learning networks) are algorithmic, but the brain is a priori more like the tent: a set of coupled neurons that interact in continuous time, together and with the environment, with no evident discrete structure similar to an algorithm. As argued above, a specification of how these real neural networks work and solve problems is not an algorithm: it’s a model – unless we manage to map the brain’s dynamics to the discrete flow of an algorithm.


Thus, if a computer is something that solves problems by running algorithms, then the brain is not a computer. We may however consider a broader definition: the computer is something that computes, i.e., which is able to calculate computable functions. As pointed out above, this does not require the computer to run algorithms. For example, consider a box with some gas, a heater (input = temperature T) and a pressure sensor (output = P). The device computes the function P = nRT/V by virtue of physical laws, and not by an algorithm.
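As an illustration, the same function can also be calculated by an explicit algorithm. The sketch below is only that: the function name, the default amount of gas (1 mol) and the default volume (0.022414 m^3) are assumptions chosen for the example.

```python
R = 8.314462618  # molar gas constant, in J/(mol K)

def pressure(temperature_k, n_mol=1.0, volume_m3=0.022414):
    """Ideal gas law P = nRT/V, calculated by explicit arithmetic steps."""
    return n_mol * R * temperature_k / volume_m3

# With the assumed defaults, 273.15 K gives roughly atmospheric pressure (in Pa)
print(pressure(273.15))
```

The program and the box realize the same input-output function, but only the program does so through a discrete sequence of operations. The box gets there by physical law, with no steps to speak of.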

This box, however, is not a computer. Otherwise, any physical system would be called a computer. To be called a computer, the device should be able to implement any computable function. But what does it mean exactly? To run an arbitrary computable function, some parameters of the device need to be appropriately adjusted. Who adjusts these parameters and how? If we do not specify how this adjustment is being made, then the claim that the brain is a computer is essentially empty. It just says that for each function, there is a way to arrange the structure of the brain so that this function is achieved. It is essentially equivalent to the claim that atoms can calculate any computable function, depending on how we arrange them.

To call such a device a computer, we must additionally include a mechanism to adjust the parameters so that it does actually perform a particular computable function. This leads us to the conventional definition of a computer: something that can be instructed via computer programming. The notion of program is central to the definition of computers, whatever form this program takes. A crucial implication is that a computer is a device that is dependent on an external operator for its function. The external operator brings the software to the computer; without the ability to receive software, the device is not a computer.

In this sense, the brain cannot be a computer. We may then consider the following metaphorical extension: the brain is a self-programmed computer. But the circularity in this assertion is problematic. If the program is a result of the program itself, then the “computer” cannot actually implement any computable function, but only those that result from its autonomous functioning. A cat, a mouse, an ant and a human do not actually do the same things, and cannot even in principle do the same tasks.

Finally, is computability theory the right framework to describe the activity of the brain in the first place? It is certainly not the right framework to describe the interaction of a tent with its environment, so why would it be appropriate for the brain, an embodied dynamical system in circular relation with the environment? Computability theory is a theory about functions. But a dynamical system is not a function. You can of course define functions on dynamical systems, even though they do not fully characterize the system. For example, you can define the function that maps the current state to the state at some future time. In the case of the brain, we might want to define a function that maps an external perturbation of the system (i.e. a stimulus) to the state of the system at some future time. However, this is not well defined, because it depends on the state of the system at the time of the perturbation. This problem does not occur with formal neural networks precisely because these are not dynamical systems but mappings. The brain is spontaneously active, whether there is a “stimulus” or not. The very notion of the organism as something that responds to stimuli is the most naïve version of behaviorism. The organism has an endogenous activity and a circular relation to its environment. Consider for example central pattern generators: these are rhythmic patterns produced in the absence of any input. Not all dynamical systems can be framed into computability theory, and in fact most of them, including the brain, cannot because they are not mappings.
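A toy example may help to see why the stimulus-to-state function is not well defined. The sketch below uses a frictionless harmonic oscillator (x'' = -x) as a stand-in for a spontaneously active system. This choice, the function name and the numerical values are assumptions made for illustration; it is not a model of the brain.

```python
import math

def future_state(x, v, kick, t):
    """State (x, v) of the oscillator x'' = -x, a time t after a velocity kick."""
    v = v + kick  # the "stimulus": an instantaneous change of velocity
    # Exact solution of x'' = -x: the state rotates in the (x, v) plane
    x_t = x * math.cos(t) + v * math.sin(t)
    v_t = -x * math.sin(t) + v * math.cos(t)
    return x_t, v_t

# Same stimulus and same delay, but different states at the time of the stimulus
a = future_state(1.0, 0.0, kick=0.5, t=math.pi / 2)
b = future_state(-1.0, 0.0, kick=0.5, t=math.pi / 2)
# No stimulus at all: the state still changes (endogenous activity)
c = future_state(1.0, 0.0, kick=0.0, t=math.pi / 2)
print(a, b, c)
```

The future state is a function of the pair (state at stimulus time, stimulus), not of the stimulus alone, so "the response to the stimulus" is only defined once the ongoing activity is specified.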


As I have argued in my essay on neural coding, there are two core problems with the computer metaphor of the brain (it should be clear by now that this is a metaphor and not a property). One is that it tries to match two causal structures that are totally incongruent, just like dominoes and a tent. The other is that the computer metaphor, just as the coding metaphor, implicitly assumes an external operator – who programs it / interprets the code. Thus, what these two metaphors fundamentally miss is the epistemic autonomy of the organism.

Is the coding metaphor relevant for the genome?

I have argued that the neural coding metaphor is highly misleading (see also similar arguments by Mark Bickhard in cognitive science). The coding metaphor is very popular in neuroscience, but there is another domain of science where it is also very popular: genetics. Is there a genetic code? Many scientists have criticized the idea of a genetic code (and of a genetic program). A detailed criticism can be found in Denis Noble’s book “The music of life” (see also Noble 2011 for a short review).

Many of the arguments I have made in my essay on neural coding readily apply to the “genetic code”. Let us start with the technical use of the metaphor. The genome is a sequence of DNA base triplets called “codons” (ACG, TGA, etc). Each codon specifies a particular amino-acid, and proteins are made of amino-acids. So there is a correspondence between DNA and amino-acids. This seems an appropriate use of the term “code”. But even in this limited sense, it should be used with caution. The fact that a base triplet encodes an amino-acid is conditional on this triplet being effectively translated into an amino-acid (note that there are two stages, transcription into RNA, then translation into a protein). But in fact only a small fraction of a genome is actually translated, about 10% (depending on species); the rest is called “non-coding DNA”. So the same triplets can result in the production of an amino-acid, or they can influence the translation-transcription system in various ways, for example by interacting with various molecules involved in the production of RNA and proteins, thereby regulating transcription and translation (and this is just one example).
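The limited, technical sense of “code” can be sketched as a lookup table. The excerpt below contains only a few entries of the standard genetic code, written as DNA triplets. The function name, the example sequence and the "?" placeholder for codons missing from the excerpt are hypothetical choices for illustration.

```python
# A small excerpt of the standard genetic code (DNA triplets)
CODON_TABLE = {
    "ATG": "Met",  # also the usual start codon
    "ACG": "Thr",
    "GCT": "Ala",
    "AAA": "Lys",
    "TGG": "Trp",
    "TTT": "Phe",
    "TGA": "Stop",
    "TAA": "Stop",
    "TAG": "Stop",
}

def translate(dna):
    """Read triplets from the first base, stopping at the first stop codon."""
    peptide = []
    for i in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE.get(dna[i:i + 3], "?")  # "?": not in this excerpt
        if residue == "Stop":
            break
        peptide.append(residue)
    return peptide

print(translate("ATGACGGCTTGA"))  # ['Met', 'Thr', 'Ala']
```

Note what the table leaves out: whether a given stretch of DNA is transcribed and translated at all, and in which reading frame. The lookup presupposes the cellular machinery that performs it, which is precisely the caveat above.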

Even when DNA does encode amino-acids, it does not follow that a gene encodes a protein. What might be said is that a gene encodes the primary structure of proteins, that is, the sequence of amino-acids; but it does not specify by itself the shape that the protein will take (which determines its chemical properties), the various modifications that occur after translation, the position that the protein will take in the cellular system. All of those crucial properties depend on the interaction of the product of transcription with the cellular system. In fact, even the primary structure of proteins is not fully determined by the gene, because of splicing.

Thus, the genome is not just a book, as suggested by the coding metaphor (some have called the genome the “book of life”); it is a chemically active substance that interacts with its chemical environment, a part of a larger cellular system.

At the other end of the genetic code metaphor, genes encode phenotypes, traits of the organism. For example, the gene for blue eyes. A concept that often appears in the media is the idea of genes responsible for diseases. One hope behind the human genome project was that by scrutinizing the human genome, we might be able to identify the genes responsible for every disease (at least for every genetic disease). Some diseases are monogenic, i.e., due to a single gene defect, but the most common diseases are polygenic, i.e., are due to a combination of genetic factors (and generally environmental factors).

But even the idea of monogenic traits is misleading. There is no single gene that encodes a given trait. What has been demonstrated in some cases is that mutations in a single gene can impact a given trait. But this does not mean that the gene is responsible by itself for that trait (surprisingly, this fallacy is quite common in the scientific literature, as pointed out by Yoshihara & Yoshihara 2018). A gene by itself does nothing. It needs to be embedded into a system, namely a cell, in order to produce any phenotype. Consequently, the expressed phenotype depends on the system in which the gene is embedded, in particular the rest of the genome. There cannot be a gene for blue eyes if there are no eyes. So no gene can encode the color of eyes; this encoding is at best contextual (in the same way as “neural codes” are always contextual, as discussed in my neural coding essay).

So the concept of a “genetic code” can only be correct in a trivial sense: that the genome, as a whole, specifies the organism. This clearly limits the usefulness of the concept, however. Unfortunately, even this trivial claim is also incorrect. An obvious objection is that the genome specifies the organism only in conjunction with the environment. The deeper objection is that the immediate environment of the genome is the cell itself. No entity smaller than the cell can live or reproduce. The genome is not a viable system, and as such it cannot produce an organism, nor can it reproduce. An interesting experiment is the following: the nucleus (and thus the DNA) from an animal cell is transferred to the egg of an animal of another species (where the nucleus has been removed) (Sun et al., 2005). The “genetic code” theory would predict that the egg would develop into an animal of the donor species. What actually happens (this was done in related fish species) is that the egg develops into some kind of hybrid, with the development process closer to that of the recipient species. Thus, even in the most trivial sense, the genome does not encode the organism. Finally, since no entity smaller than the cell can reproduce, it follows that the genome is not the unique basis of heritability – the entire cell is (see Fields & Levin, 2018).

In summary, the genome does not encode much except for amino-acids (for about 10% of it). It should be conceptualized as a component that interacts with the cellular system, not as a “book” that would be read by some cellular machinery.

What is computational neuroscience? (XXXIII) The interactivist model of cognition

The interactivist model of cognition has been developed by Mark Bickhard over the last 40 years or so. It is related to the viewpoints of Gibson and O’Regan, among others. The model is described in a book (Bickhard and Terveen, 1996) and a more recent review (Bickhard 2008).

It starts with a criticism of what Bickhard calls “encodingism”, the idea that mental representations are constituted by encodings, correspondences between things in the world and symbols (this is very similar to my criticism of the neural coding metaphor, except Bickhard’s angle is cognitive science while mine was neuroscience). The basic argument is that the encoding “crosses the boundary of the epistemic agent”: the perceptual system stands on only one side of the correspondence, so there is no way it can interpret symbols in terms of things in the world since it never has access to things in the world at any point. The interpretation of the symbols in terms of things in the world would require an interpreter, some entity that makes sense of a priori arbitrary symbols. But this was precisely the epistemic problem to be solved, so the interpreter is a homunculus and this is an incoherent view. This is related to the skeptic argument about knowledge: there cannot be valid knowledge since we acquire knowledge by our senses and we cannot step outside of ourselves to check that it is valid. Encodingism fails the skeptic objection. Note that Bickhard refutes neither the possibility of representations nor even the possibility of encodings, but rather the fact that encodings can be foundational of representations. There can be derivative encodings, based on existing representations (for example Morse is a derivative encoding, which presupposes that we know about both letters and dots and dashes).

A key feature that a representational system must have is what Bickhard calls “system-detectable errors”. A representational system must be able to test whether its representations are correct or not. This is not possible in encodingism because the system does not have access to what is being represented (knowledge that cannot be checked is what I called “metaphysical knowledge” in my Subjective physics paper). No learning is possible if there are no system-detectable errors. This is the problem of normativity.

The interactivist model proposes the following solution: representations are anticipations of potential interactions and their expected impact on future states of the system, or on the future course of processes of the system (this is close to Gibson’s “affordances”). I give an example taken from Subjective physics. Consider a sound source located somewhere in space. What does it mean to know where the sound came from? In the encoding view, we would say that the system has a mapping between the angle of the source and properties of the sounds, and so it infers the source’s angle from the captured sounds. But what can this mean? Is the inferred angle in radians or degrees? Surely radians and degrees cannot make sense for the perceiver and cannot have been learned (this is what I called “metaphysical knowledge”), so in fact the representation cannot actually be in the form of the physical angle of the source. Rather, what it means that the source is at a given position is that (for example) you would expect that moving your eyes in a particular way would make the source appear in your fovea (see more detail about the Euclidean structure of space and related topics in Subjective physics). Thus, the notion of space is a representation of the expected consequences of certain types of actions.

The interactivist model of representations has the desirable property that it has system-detectable errors: a representation can be correct or not, depending on whether the anticipation turns out to be correct or not. Importantly, what is anticipated is internal states, and therefore the representation does not cross the boundary of the epistemic agent. Contrary to standard models of representation, the interactivist model successfully addresses the skeptic argument.

The interactivist model is described at a rather abstract level, often referring to abstract machine theory (states of automata). Thus, it leaves aside the problem of its naturalization: how is it instantiated by the brain? Important questions to address are: what is a ‘state’ of the brain? (in particular given that the brain is a continuously active dynamical system where no “end state” can be identified); how do we cope with its distributed nature, that is, that the epistemic agent is itself constituted of a web of interacting elementary epistemic agents? how are representations built and instantiated?

Better than the grant lottery

Funding rates for most research grant systems are currently very low, typically around 10%. This means that 90% of the time spent on writing and evaluating grant applications is wasted. It means that if each grant spans 5 years, then a PI has to write about 2 grants per year to be continuously funded; in practice, to reduce risk it should be more than 2 per year. It is an enormous waste, and in addition to that, it is accepted that below a certain funding rate, grant selection is essentially random (Fang et al., 2016). Such competition also introduces conservative biases (since only those applications that are consensual can make it to the top 10%), for example against interdisciplinary studies. Thus, low funding rates are a problem not only because of waste but also because they introduce distortions.
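The figure of about 2 applications per year follows from simple arithmetic, which can be sketched as below. The function name and its parameters are illustrative choices, and the sketch assumes that every application has the same, independent chance of success.

```python
def applications_per_year(success_rate, grant_duration_years):
    """Average number of applications a PI must write per year
    to win, on average, one grant per grant period."""
    grants_needed_per_year = 1.0 / grant_duration_years
    return grants_needed_per_year / success_rate

# 10% funding rate, 5-year grants (the figures used in the text).
needed = applications_per_year(success_rate=0.10, grant_duration_years=5)
print(f"applications per year: {needed:.1f}")
```

This is only the break-even average: to keep the risk of a funding gap low, a PI would have to submit more often than this.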

For these reasons, a number of scientists have proposed to introduce a lottery system (Fang 2016; see also Mark Humphries’ post): after a first selection of, say, the top 20-30%, the winners are picked at random. This would reduce bias without impacting quality. Thus, it would certainly be progress. However, it does not address the problem of waste. 90% of applications would still be written in vain.

First, there is a very elementary enhancement to be implemented: pick at random before you evaluate the grants, i.e., directly reject every other grant, then select the best 20%. This gives exactly the same result, except the cost of evaluation is divided by two.

Now I am sure it would feel quite frustrating for an applicant to write a full grant only to get immediately rejected by the flip of a coin. So there is again a very simple enhancement: decide who will get rejected before they write the application. Pick at random 50% of scientists and invite them to submit a grant. Again, the result is the same, but in addition you divide the time spent on grant writing by two.

At this point we might wonder: why do this initial selection at random? This introduces variance for no good reason. You never know in advance whether you will be allowed to get funding next year and this seems arbitrary. Thus, there is an obvious enhancement: replace the lottery with a rotation. Every PI is allowed to submit a grant only every two years. Again, this is equivalent on average to the initial lottery system, except there is less variance and less waste.
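The successive enhancements can be compared with a small accounting sketch. It is only a bookkeeping illustration under the assumptions used above (a 50% random pre-selection, and a final funding rate of 10% of all scientists); the scheme names and the function `costs` are made up for the example.

```python
def costs(n_scientists, scheme, final_rate=0.10, pre_selection=0.5):
    """Return (applications written, applications evaluated, grants funded)."""
    funded = final_rate * n_scientists
    allowed = pre_selection * n_scientists
    if scheme == "lottery_after_evaluation":
        # Everyone writes, everything is evaluated, then a coin is flipped.
        return n_scientists, n_scientists, funded
    if scheme == "lottery_before_evaluation":
        # Everyone writes, but half of the applications are never evaluated.
        return n_scientists, allowed, funded
    if scheme in ("lottery_before_writing", "rotation"):
        # Only the allowed half writes; rotation just removes the randomness.
        return allowed, allowed, funded
    raise ValueError(f"unknown scheme: {scheme}")

for scheme in ("lottery_after_evaluation", "lottery_before_evaluation",
               "lottery_before_writing", "rotation"):
    written, evaluated, funded = costs(1000, scheme)
    print(f"{scheme:28s} written={written:6.0f} "
          f"evaluated={evaluated:6.0f} funded={funded:5.0f}")
```

The number of funded grants is the same in every row; only the amount of writing and evaluation changes. The difference between the last two schemes is not visible in these averages: it lies in the year-to-year variance for an individual PI.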

This reasoning leads me to a more general point. There is a simple way to increase the success rate of a grant system, which is to reduce the number of applications. The average funding rate of labs does not depend on the number of applications; it depends on the budget and only on the budget. If you bar 50% of scientists from applying, then you don’t divide by two the average budget of every lab. The average budget allocated to each lab is the same, but the success rate is doubled.
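A minimal sketch of this point, with made-up numbers (1000 labs, a budget of 100 grants of equal size) and assuming that each allowed lab submits exactly one application:

```python
def grant_system(n_labs, budget, grant_size, fraction_allowed):
    """Return (success rate per application, average budget per lab)."""
    n_grants = budget / grant_size
    n_applications = fraction_allowed * n_labs
    success_rate = n_grants / n_applications
    # The average over all labs does not depend on who was allowed to apply.
    mean_budget_per_lab = budget / n_labs
    return success_rate, mean_budget_per_lab

everyone = grant_system(n_labs=1000, budget=100.0, grant_size=1.0,
                        fraction_allowed=1.0)
half = grant_system(n_labs=1000, budget=100.0, grant_size=1.0,
                    fraction_allowed=0.5)
print("everyone applies:", everyone)
print("half may apply:  ", half)
```

Halving the number of applicants doubles the success rate while leaving the average budget per lab unchanged.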

The counter-intuitive part is that individually, you increase your personal success rate if you apply to more calls. But collectively it is exactly the opposite: the global success rate decreases if there are more calls (for the same overall budget), since there are more applications. This is because the success rate is low because of other people submitting, not because you are submitting. This is a tragedy of the commons phenomenon.

There is a simple way to solve it, which is to add constraints. There are different ways to do it: 1) reduce the frequency of calls, and merge redundant calls, 2) introduce a rotation (e.g. those born on even years submit on even years), 3) do not allow submission if you are already funded (or say, in the first years). Any of these constraints mechanically increases the success rate, thus reduces both waste and bias, with no impact on average funding. It is better than a lottery.


p.s.: There is also an obvious and efficient way to reduce the problem, which is to increase base funding, so that scientists do not need grants in order to survive (see this and other ideas in a previous post).

Predatory journals: what is the problem?

A recent article in Le Monde warns about a growing phenomenon in scientific publishing: predatory journals (see also the editorial). These are commercial publishers that publish scientific articles online, for a fee, without any scientific ethics, in particular by accepting all articles without peer review. Similarly, fake conferences are multiplying; companies organize scientific conferences for purely commercial purposes, with no concern for scientific quality.

In response, some institutions are starting to set up “blacklists” of journals to avoid. This is understandable, since the phenomenon has a significant cost. But the response neglects the fundamental problem. We must face the facts: commercial ethics (the pursuit of profit) is not compatible with scientific ethics (the pursuit of truth). The companies in question are not illegal, as far as I know. They organize conferences that are real; they publish journals that are real. They simply do not care about scientific quality, only about their profit. We consider this immoral; but a commercial company has no moral dimension, it is simply an organization whose goal is to generate profit. We cannot expect commercial interests to coincide exactly, as if by magic, with scientific interests.

  1. The problem of commercial publishing

This is true at both ends of the spectrum of academic publishing: for predatory journals as well as for prestigious journals. The article speaks of “fake science”; but most cases of scientific fraud have been uncovered in prestigious journals, not in predatory journals, which in any case are not read by the scientific community (see for example Brembs (2018) on the link between methodological quality and journal prestige). For prestigious commercial journals, the publishers’ commercial strategy is not to maximize the number of published articles, but to maximize the perceived prestige of these journals, which then serve as bait to sell the publisher’s journal collections. In other words, it is a brand strategy. This involves, in particular, a drastic selection of submitted articles, carried out by professional editors, that is, not by professional scientists, on the basis of the perceived importance of the results, thereby pushing a generation of scientists to inflate the claims of their articles. It involves promoting dubious metrics such as the impact factor to public institutions, and more generally promoting a mythology of prestigious publication, namely the false and dangerous idea that an article should be judged by the prestige of the journal in which it is published rather than by its intrinsic scientific value, which is assessed by the scientific community, not by a commercial publisher, nor even by two anonymous scientists. Proposing to publish lists of bad journals does not solve the problem, because it implicitly subscribes to this perverse logic.

One only has to look at the margins of the large multinational scientific publishers to understand that the commercial model is not suitable. For Elsevier, for example, margins are on the order of 40%. Simply reading this figure should immediately convince us that scientific publishing should be run by public institutions, or at least non-commercial ones (for example learned societies, as is the case for a number of journals). What is the justification for using a commercial operator to run a public service, or any service? The motivation is that competition lowers costs and improves quality. But if margins are 40%, then competition is visibly not operating. Why? Simply because when a scientist submits an article, they do not choose the journal according to price or even the service provided (which is in reality essentially provided by volunteer scientists), but according to the journal’s visibility and prestige. So there is no competition on prices. The worst thing that could happen to a commercial publisher is for scientific articles to be judged on their intrinsic value rather than by the journal in which they are published, because this unique business model would then collapse and journals would compete on the prices and services they must provide, like any other commercial company. It is the worst that could happen to commercial publishers, and the best that could happen to the scientific community. This is why commercial and scientific interests diverge.

In any case, we must face the facts: such enormous margins mean that the commercial model is inefficient. We should therefore immediately stop using commercial journals. This is not very difficult: public institutions are perfectly capable of running scientific journals; such journals exist and have for a long time. A recent example is eLife, currently one of the most innovative journals in biology. This should not be very surprising: the core activity of journals, namely the reviewing of articles, is already done by scientists, including at commercial publishers, which use their services for free. This does not mean that private companies cannot be used to provide services, for example hosting servers, managing websites, or providing infrastructure. But journals must no longer belong to commercial companies, whose interest is to manage these journals as brands. Scientific ethics is not compatible with commercial ethics.

How? In fact it is fairly obvious. Public authorities should cancel all subscriptions to commercial publishers and stop paying publication fees to these publishers. Nowadays, it is not difficult to access the scientific literature without going through journals (through preprints, or simply by writing to the authors, who are generally delighted that someone is interested in their work). The money saved can be partly reinvested in non-commercial scientific publishing.

  2. The myth of peer review

I now want to turn to a more subtle but fundamental epistemological question. What, ultimately, is the problem with predatory journals? Clearly, there is the waste of public money. But the Le Monde article also points to scientific problems, namely that false information is propagated without having been checked. The editorial indeed speaks of ‘the sacrosanct “peer review”’, which these journals do not carry out. But is this really the fundamental problem?

The idea that what gives a scientific article its value is that it has been validated by peer review before publication is a persistent but nonetheless mistaken myth. It is wrong both empirically and theoretically.

Empirically, at any given time the literature contains contradictory conclusions on a large number of subjects, published in traditional journals. Recent cases of fraud concern articles that had nevertheless undergone peer review. But the same is true of a much larger number of non-fraudulent articles whose conclusions were later contested. The history of science is full of contradictory, coexisting scientific theories and of bitter debates between scientists. These debates take place precisely after publication, and scientific consensus generally forms rather slowly, practically never on the basis of a single article (see for example Imre Lakatos in philosophy of science, or Thomas Kuhn). Moreover, scientific results are also often circulated in the scientific community before formal publication; this is the case today with preprints, but it was already partly the case before with conferences. The published article remains the reference because it provides precise details, in particular methodological ones, but the contribution of the reviewers solicited by journals is in most cases not essential, all the more so as it is generally not made public.

Theoretically, there is no reason why peer review should “validate” a scientific result. There is nothing magical about peer review: two, sometimes three scientists simply give their informed opinion on the manuscript. These scientists are no more expert than those who will read the article once it is published (I am of course talking about the scientific community, not the general public). The fact that an article is published in a journal does not in itself say much about how the results are received by the community; when an article is rejected by a journal, it is resubmitted elsewhere. Final publication in no way attests to a scientific consensus. Moreover, for empirical studies, reviewers are not actually in a position to verify the results, and in particular to check that there has been no fraud. All they can do is check that the methods used seem appropriate and that the interpretations seem sensible (two points that are often open to debate). To validate the results (but not the interpretations), one would at the very least need to be able to redo the experiments in question, which requires the necessary time and equipment. This indispensable work is done (or attempted), but it is not done at the time of publication, nor commissioned by the journal. It is done after publication by the scientific community. The work of “verification” (an inappropriate word, since there is no absolute truth in science, which is precisely what distinguishes it from religion) is the long-term work of the scientific community, not the one-off work of the journal.

This is the preconception that must be deconstructed: that the journal’s internal review process somehow “validates” scientific results. This is not the case, it has never been the case, and it cannot be the case. Scientific validation is the very nature of the scientific enterprise, which is a collective, long-term endeavor. One cannot read an article and conclude “it is true”; for that, it has to be integrated into a body of scientific knowledge, and its interpretation confronted with different points of view (since any interpretation requires a theoretical framework).

It is precisely this preconception that prestigious journals, on the contrary, try to reinforce. We must resist it. The antidote is to make scientific debate public and transparent; at present it often remains confined to the corridors of laboratories and conferences. It is claimed that peer review validates scientific results, but the reports are most of the time not published; and what about the unpublished reports when an article is rejected by a journal? How can we then know what the community thinks? Scientific debate must instead be made public. This is, for example, the ambition of sites such as PubPeer, which has uncovered a number of frauds but can also be used simply for scientific debate in general. Rather than making publication conditional on the confidential agreement of anonymous scientists, this system should be inverted: publish the article (which is in fact already done through preprints), then solicit the opinions of the community, which will also be published, argued, and discussed by the authors and the rest of the community. This is how scientists, and also the wider public, will be able to get a more accurate view of the scientific value of published articles. Peer review is a fundamental principle of science, yes, but not the review carried out in confidence by journals; rather, the one carried out in the open and without time limit by the scientific community.

What is computational neuroscience? (XXXII) The problem of biological measurement (2)

In the previous post, I pointed out differences between biological sensing and physical measurement. A direct consequence is that it is not so straightforward to apply the framework of control theory to biological systems. At the level of behavior, it seems clear that animal behavior involves control; it is well documented in the case of motor control. But this is the perspective of an external observer: the target value, the actual value and the error criterion are identified with physical measurements by an external observer. But how does the organism achieve this control, from its own perspective?

What the organism does not do, at least not directly, is measure the physical dimension and compare it to a target value. Rather, the biological system is influenced by the physical signal and reacts in a way that makes the physical dimension closer to a target value. How? I do not have a definite answer to this question, but I will explore a few possibilities.

Let us first explore a conventional possibility. The sensory neuron encodes the sensory input (e.g. muscle stretch) in some way; the control system decodes it, and then compares it to a target value. So for example, let us say that the sensory neuron is an integrate-and-fire neuron. If the input is constant, then the interspike interval can be mapped back to the input value. If the input is not constant, it is more complicated but estimates are possible. There are various studies relevant to this problem (for example Lazar (2004); see also the work of Sophie Denève, e.g. 2013). But all these solutions require knowing quite precisely how the input has been encoded. Suppose for example that the sensory neuron adapts with some time constant. Then the decoder needs somehow to de-adapt. But to do it correctly, one needs to know the time constant accurately enough, otherwise biases are introduced. If we consider that the encoder itself learns, e.g. by adapting to signal statistics (as in the efficient coding hypothesis), then the properties of the encoder must be considered unknown by the decoder.
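To make the constant-input case concrete, here is a minimal sketch for a leaky integrate-and-fire neuron, assuming the dimensionless model $\tau\, dV/dt = -V + I$ with threshold $\theta$ and reset to 0 (a choice made for illustration; the text does not specify a model). For constant $I > \theta$ the interspike interval is $T = \tau \ln\big(I/(I-\theta)\big)$, which inverts to $I = \theta / (1 - e^{-T/\tau})$. The sketch uses a misestimated membrane time constant, rather than an adaptation time constant, to illustrate how imperfect knowledge of the encoder biases the decoder.

```python
import math

def isi_from_input(I, tau, theta=1.0):
    """Interspike interval of a leaky integrate-and-fire neuron
    driven by a constant input I > theta (reset to 0)."""
    return tau * math.log(I / (I - theta))

def input_from_isi(T, tau, theta=1.0):
    """Decode the constant input from the interspike interval,
    given the decoder's belief about tau and theta."""
    return theta / (1.0 - math.exp(-T / tau))

I_true = 1.5
tau_true = 0.020                                  # 20 ms (illustrative value)
T = isi_from_input(I_true, tau_true)

decoded_ok = input_from_isi(T, tau_true)          # decoder knows the encoder
decoded_biased = input_from_isi(T, 0.030)         # decoder assumes the wrong tau

print(f"ISI: {1000 * T:.2f} ms")
print(f"decoded with correct tau: {decoded_ok:.3f}")
print(f"decoded with wrong tau:   {decoded_biased:.3f}")
```

With the correct time constant the input is recovered; with the wrong one the estimate is systematically off, even though the spike train itself is noiseless.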

Can the decoder learn to decode the sensory spikes? The problem is it does not have access to the original signal. The key question then is: what could the error criterion be? If the system has no access to the original signal but only streams of spikes, then how could it evaluate an error? One idea is to make an assumption about some properties of the original signal. One could for example assume that the original signal varies slowly, in contrast with the spike train, which is a highly fluctuating signal. Thus we may look for a slow reconstruction of the signal from the spike train; this is in essence the idea of slow feature analysis. But the original signal might not be slowly fluctuating, as it is influenced by the actions of the controller, so it is not clear that this criterion will work.

Thus it is not so easy to think of a control system which would decode the sensory neuron activity into the original signal so as to compare it to a target value. But beyond this technical issue (how to learn the decoder), there is a more fundamental question: why split the work into two units (encoder/decoder), if the function of the second one is essentially to undo the work of the first one?

An alternative is to examine the system as a whole. We consider the physical system (environment), the sensory neuron, the actuator, and the interneurons (corresponding to the control system). Instead of seeing the sensory neuron as involved in an act of measurement and communication and the interneurons as involved in an act of interpretation and command, we see the entire system as a distributed dynamical system with a number of structural parameters. In terms of dynamical systems (rather than control), the question becomes: is the target value for the physical dimension an attractive fixed point of this system, or more generally, is there such a fixed point (as opposed to fluctuations)? We can then derive complementary questions:

  • robustness: is the fixed point robust to perturbations, for example changes in properties of the sensor, actuator or environment?
  • optimality: are there ways to adjust the structure of the system so that the firing rate is minimized (for example)?
  • control: can we change the fixed point by an intervention on this system? (e.g. on the interneurons)

Thus, the problem becomes one of designing a spiking system that has an attractive fixed point in the physical dimension, with some desirable properties. Framing the problem in this way does not necessarily require that the physical dimension is explicitly extracted (“decoded”) from the activity of the sensory neuron. If we look at such a system, we might not be able to identify in any of the neurons a quantity that corresponds to the physical signal, or to the target value. Rather, physical signal and target value are to be found in the physical environment, and it is a property of the coupled dynamical system (neurons-environment) that the physical signal tends to approach the target value.
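As a toy illustration of this framing (not a model of any real sensorimotor loop), consider the following sketch. All of its ingredients are assumptions made for the example: the physical variable `x` drifts upward at a constant rate, a non-leaky integrate-and-fire sensory neuron integrates `gain * x`, and each spike directly drives an actuator that lowers `x` by a fixed `kick`. No variable in the code stores a target value or a decoded estimate of `x`, yet on average `x` settles near `drift / (gain * kick)`, whatever its initial value.

```python
def simulate(x0, drift=1.0, gain=100.0, kick=0.02, dt=1e-4, duration=5.0):
    """Simulate the coupled neuron-environment loop and return the
    time average of x over the last second of the simulation."""
    x, v = x0, 0.0
    n_steps = round(duration / dt)
    tail_steps = round(1.0 / dt)
    tail_sum = 0.0
    for step in range(n_steps):
        x += drift * dt          # environment: x drifts upward
        v += gain * x * dt       # sensory neuron integrates the signal
        if v >= 1.0:             # spike
            v -= 1.0             # reset by subtraction
            x -= kick            # actuator pushes x down
        if step >= n_steps - tail_steps:
            tail_sum += x
    return tail_sum / tail_steps

fixed_point = 1.0 / (100.0 * 0.02)   # drift / (gain * kick)
from_below = simulate(x0=0.0)
from_above = simulate(x0=2.0)
print(f"predicted fixed point: {fixed_point:.3f}")
print(f"average x starting from 0: {from_below:.3f}")
print(f"average x starting from 2: {from_above:.3f}")
```

In this sketch the fixed point depends on structural parameters of the loop (`gain`, `kick`, `drift`), so it is not robust to changes in the sensor or actuator; the robustness and control questions listed above amount to asking how the structure could be adjusted so that it is.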