The magic of Popper vs. the magic of Darwin

In a previous post, I pointed out that Darwin's theory of evolution is incomplete, because it does not explain why random variations are not arbitrary. The emergence of complex adapted living beings does not follow from the logical structure of the explanation: introduce random variations and select the best ones. I gave the example of binary programs: introduce random variations in the code and select the best ones. It doesn't work. Yet the Darwinian explanation has a self-evident ring to it: if you select the best variations around something, then you should end up with something better. The fallacy is that a variation does not necessarily result in better and worse outcomes with comparable probabilities. For a program, changing one bit generally results in a faulty program. This is why no one in software engineering uses the Darwinian method to write programs.

For the record, and even though it should be self-evident, I am not advocating for “intelligent design” of any sort. I find the debate between educated creationists (i.e., intelligent design advocates) and neo-Darwinians generally quite disappointing. Creationists would point out some mysterious aspect of life and evolution, and assume that anything mysterious must be divine. Neo-Darwinians would respond that there is nothing mysterious. I would think that a scientific attitude is rather to point out that mysterious does not imply divine, and to try to understand that mysterious thing.

Later it occurred to me that the same fallacy occurs in epistemology, namely in Karl Popper's view of science, probably the most influential epistemological theory among scientists. Popper proposed that a scientific statement, as opposed to a metaphysical statement, is one that can be falsified by an observation. A scientific statement is nothing other than a logical proposition, and you can make an infinite number of such propositions. To distinguish between them, you need to do experiments that falsify some of them. Accordingly, many scientists seem to think that the scientific process consists in designing experiments that can distinguish between different theories. This explains the focus on tools that I have talked about before.

There is a rather direct analogy with Darwin's theory of evolution. The focus is on selection, but this neglects a critical aspect of the process, the creative one: how do you come up with candidate theories in the first place? I discussed this problem in a previous post. For any given set of observations, there is an infinite number of logical statements that are consistent with it. Therefore, you cannot deduce theories from observations; this is the problem of induction. How then do scientists propose theories, and why do we test some theories and not others that would have the same degree of logical validity (e.g., theory = existing set of observations + random prediction of a new observation)? This is what we might call the hard problem of epistemology, in reference to the hard problem of consciousness. Popper doesn't address that problem, yet it is critical to the scientific process. How about this epistemological process:

  • Consider a set of theories, which are logical propositions.
  • Select the best ones using Popper's falsificationism.
  • Allow those theories to reproduce; kill the other ones.
  • Introduce random variations, e.g., randomly add/remove quantifiers or variables.
  • Repeat.
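
As a thought experiment, here is what that process might look like in code (a toy construction of mine, not anything Popper proposed): the world is a hidden boolean law over 8-bit experiments, a theory is a 256-entry truth table, falsified theories are killed, and survivors reproduce with random variations.

```python
import random

# Toy Popperian evolution (my construction): the world is a hidden boolean law,
# a theory is a 256-entry truth table predicting the outcome of each possible
# experiment, and only the 20 experiments in `observed` have been performed.
random.seed(0)
N = 256

def world(x):
    return bin(x).count("1") % 2              # the unknown law: parity of bits

observed = random.sample(range(N), 20)        # observations made so far

def falsified(theory):
    return any(theory[x] != world(x) for x in observed)

def mutate(theory):
    child = theory[:]
    for _ in range(3):                        # random variation of the proposition
        child[random.randrange(N)] ^= 1
    return child

population = [[random.randint(0, 1) for _ in range(N)] for _ in range(200)]
for generation in range(50):
    # selection by falsification; survivors reproduce with variations
    survivors = [t for t in population if not falsified(t)] or population
    population = [mutate(random.choice(survivors)) for _ in range(200)]

# Surviving theories fit all past observations; how do they predict new ones?
unseen = [x for x in range(N) if x not in observed]
errors = min(sum(t[x] != world(x) for x in unseen) for t in population)
print(f"best theory: {1 - errors / len(unseen):.0%} correct on unseen experiments")
```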

How well would that work? Not very well, as the sketch above suggests: the surviving theories fit past observations by construction, but their new predictions are arbitrary. Science does seem to follow a sort of evolutionary process, but selection itself is not sufficient to explain it; one also needs to explain the creative process.

It is true that Popper has correctly identified experiment as a central aspect of science, just as Darwin has correctly identified selection as a central aspect of evolution. But Popper does not address the hard problem of epistemology, just as Darwin does not address the hard problem of evolution, and Tononi does not address the hard problem of consciousness.

What is computational neuroscience? (XXIV) - The magic of Darwin

Darwin’s theory of evolution is possibly the most important and influential theory in biology. I am not going to argue against that claim, as I do believe it is a fine piece of theoretical work, and a great conceptual advance in biology. However, I also find that the explanatory power of Darwin’s theory is often overstated. I recently visited a public exhibition in a museum about Darwin. Nice exhibition overall, but I was a bit bothered by the claim that Darwin’s theory explains the origin, diversity and adaptedness of species, case solved. I have the same feeling when I read in many articles, or hear in conversations with many scientists, that such and such observed feature of living organisms is “explained by evolution”. The reasoning generally goes like this: such biological structure is apparently beneficial to the organism, and therefore the existence of that structure is explained by evolution. As if the emergence of that structure directly followed from Darwin’s account of evolution.

To me, the Darwinian argument is often used as magic, and is mostly void of content. Replace “evolution” by “God” and you will notice no difference in the logical structure of the argument. Indeed, what the argument actually contains is 1) the empirical observation that the biological organism is apparently adapted to its environment, thanks to the biological feature under scrutiny; 2) the theoretical claim that organisms are adapted to their environment. Note that there is nothing in the argument that actually involves evolution, i.e., the change of biological organisms through some particular process. Darwin is only invoked to back up the theoretical claim that organisms are adapted, but nothing specific to Darwinian evolution is involved in the argument. It could just as well be replaced by God, Lamarck or aliens.

What makes me uneasy is that many people seem to think that Darwin’s theory fully explains how biological organisms get to be adapted to their environment. But even in its modern DNA form, it doesn’t. It describes some of the important mechanisms of adaptation, but there is an obvious gap. I am not saying that Darwin’s theory is wrong, but simply that it only addresses part of the problem.

What is Darwin’s theory of evolution? It is based on three simple steps: variation, heredity and selection. 1) Individuals of a given species vary in different respects. 2) Those differences are inherited. In the modern version, new variations occur randomly at this step, and so variations are introduced gradually over generations. 3) Individuals with adapted features survive and reproduce more than others (by definition of “adapted feature”), and therefore spread those features in the population. There is ample empirical evidence for these three claims, and that was the great achievement of Darwin.
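
For concreteness, here is the three-step recipe as a toy simulation (a minimal sketch of my own, with an assumed fitness function: the number of 1s in a bit string, so that most single-bit variations are viable and improvements are common). Under those conditions, the recipe works remarkably well.

```python
import random

# Variation, heredity and selection on a benign fitness landscape (a sketch of
# mine): individuals are bit strings and fitness is simply the number of 1s.
random.seed(0)
GENOME_LEN, POP_SIZE = 64, 100
fitness = sum                                  # count the 1s

population = [[random.randint(0, 1) for _ in range(GENOME_LEN)]
              for _ in range(POP_SIZE)]

for generation in range(80):
    # 3) selection: the fitter half survives and reproduces more
    survivors = sorted(population, key=fitness)[POP_SIZE // 2:]
    population = []
    for parent in survivors * 2:               # 2) heredity: children copy parents
        child = parent[:]
        child[random.randrange(GENOME_LEN)] ^= 1   # 1) variation: flip one bit
        population.append(child)

print(max(map(fitness, population)), "out of", GENOME_LEN)   # climbs close to 64
```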

The gap in the theory is the nature and distribution of variations. Of all the possible small variations in structure that one might imagine, which ones do we actually see in a biological population? Well, for one thing, a substantial number of individuals actually survive for a certain time, so a large number of those variations are not destructive. Since the metaphor of the day is to see the genome as the code of a program, let us consider computer programs. Take a functional program and randomly change 1% of all the bits. What is the probability that 1) the program doesn’t crash, and 2) it produces something remotely useful? I would guess that the probability is vanishingly small. You will note that this is not a very popular technique in software engineering. Another way to put it: consider the species of programs that calculate combinatorial functions (say, factorials, binomial coefficients and the like). Surely one might argue that individuals vary by small changes, but conversely, would a small random change in the code typically produce a new combinatorial function?
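
That guess is easy to test, at least in a source-level analogue (a rough sketch of mine, mutating a few characters of Python source rather than 1% of the bits of a binary):

```python
import random

# The thought experiment at the source level (a rough sketch of mine): randomly
# replace a few characters of a working program and check whether the mutant
# still runs, and whether it still computes what it used to.
SOURCE = """
def factorial(n):
    result = 1
    for i in range(2, n + 1):
        result *= i
    return result
"""

def mutate(src, n_changes=3):
    chars = list(src)
    for _ in range(n_changes):
        chars[random.randrange(len(chars))] = chr(random.randrange(32, 127))
    return "".join(chars)

random.seed(0)
survives = correct = 0
for trial in range(10_000):
    scope = {}
    try:
        exec(mutate(SOURCE), scope)            # most mutants die here
        result = scope["factorial"](6)         # some run; is the output right?
        survives += 1
        correct += (result == 720)
    except Exception:
        pass                                   # crashed: the typical outcome

print(f"{survives / 100:.1f}% run, {correct / 100:.2f}% still compute factorials")
```

The point is not the exact numbers but the order of magnitude: destructive variations overwhelmingly dominate.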

So it doesn’t follow logically from the three steps of Darwin’s theory that biological organisms should be adapted to, and survive in, changing environments. There is a critical ingredient missing: an explanation of how, in sharp contrast with programs, a substantial fraction of new variations are constructive rather than destructive. In more modern terms, how is it that completely random genetic mutations result in variations in phenotypes that are not arbitrary?

Again, I am not saying that Darwin is wrong, but simply that his theory only addresses part of the problem, and that it is not correct to claim that Darwin’s theory fully explains how biological organisms come to be adapted to their environment (i.e., perpetuate themselves). A key point, and a very important research question, is to understand how new variations can be constructive. This can be addressed within the Darwinian framework, as I outlined in a previous post. It leads to a view that departs quite substantially from the program metaphor. A simple remark: the physical elements that are subject to random variation cannot map directly onto the physical elements of structure (e.g., molecules) that define the phenotype, for otherwise those random variations would lead to random (i.e., mostly destructive) phenotypes. Rather, the structure of the organism must be the result of a self-regulatory process that can be steered by the elements subject to random variation. This is consistent with the modern view of the genome as a self-regulated network of genes, and with Darwin’s theory. But it departs quite substantially from the magic view of evolution theory that is widespread in the biological literature (at least in neuroscience), and instead points to self-regulation and optimization processes operating at the scale of the individual (not of generations).
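
To illustrate the kind of organization this points to, here is a hypothetical sketch (my own toy construction, not a model from the literature) in which the genome does not directly encode the phenotype: the phenotype is the stable expression pattern that a self-regulating gene network settles into, and random variations of the genome steer that process rather than specify the structure element by element.

```python
import numpy as np

# Hypothetical indirect encoding: the "genome" is an interaction matrix W, and
# the phenotype is the expression pattern that the self-regulating network
# settles into, not a direct readout of the genome.
rng = np.random.default_rng(0)

def phenotype(W, steps=500):
    x = np.full(W.shape[0], 0.5)
    for _ in range(steps):                     # relax towards a stable pattern
        x += 0.2 * (1 / (1 + np.exp(-W @ x)) - x)
    return x

genome = rng.normal(0.0, 1.0, size=(8, 8))
parent = phenotype(genome)

# Small random variations of the genome typically deform the phenotype
# gradually: the mutant still settles into a well-formed expression pattern.
for _ in range(5):
    mutant = genome + rng.normal(0.0, 0.1, size=genome.shape)
    print(np.round(np.abs(phenotype(mutant) - parent).max(), 3))
```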

Is peer review a good thing?

Everyone knows peer review is a good thing for science. It's almost what makes it science: papers are critically examined by scientists, so only (or mostly) the good stuff gets published, or at least errors are generally corrected in the process. Of course no system is perfect, but it certainly raises the quality of science.

Well, for someone interested in epistemology like myself, this is quite interesting. Here are a bunch of scientists, possibly most scientists, claiming that the claims of their fellow scientists should be scrutinized for adequate empirical evidence (especially p-values!) and filtered so as to improve the quality of science. And what about that claim? Is it backed up by any empirical evidence?

So the situation is interesting, because the widespread belief in peer review is purely theoretical and based on no evidence at all. Sounds a bit like the liar paradox! Some people have looked for evidence, though, but as far as I know what they found is mostly negative (see for example Smith (2010), Classical peer review: an empty gun).

What is peer review meant to be good for?

  1. to remove errors (verification);
  2. to identify the most significant work and increase its visibility (publicity);
  3. to give proper credit to good work (a label of quality used for grant applications and careers);
  4. to scare scientists into producing better papers (prevention).

1) We all know that any study can be published, whether it's flawed or not, significant or not. There are just so many journals that it is bound to get published somewhere. We also know that errors often make it into the most visible journals. Why wouldn't they? Whether prestigious or not, the journal still relies on a couple of scientists giving an opinion on the paper. Private peer review is constitutively incapable of spotting errors: if the paper is rejected by the peers, no one will ever hear about those errors, and the paper will simply be published in another journal. A more useful alternative would be for scientists who spot errors or disagree with the interpretations to publicly write their comments under their name, with the authors able to respond, all of this linked to the original publication. If this were constitutive of the publication process, it could perhaps be done in a civilized way (as opposed to the occasional angry letters to the editor).

2) In the most prestigious journals, the selection is not done by peers anyway. So the publicity argument doesn't work. More importantly, there appears to be no reason to apply this editorial filter before peer review. Since the paper is going to be peer reviewed anyway, why not do an editorial selection afterwards? This would then take the form of a reader's digest. Editorial boards could invite selected authors to write a shortened version of their work, for example. One could imagine many methods of increasing the visibility of selected papers, which do not need to be centralized by journals. For example, each scientist could make a selection of papers he liked most; someone who values the choices of that scientist could then have a look. An organized online social network might do the job.

3) My feeling is that this is what scientists actually expect of the peer-review system: credit, that is, identifying the best scientists out there. Careers are made and grants are distributed based on the outcomes of the publication system. Note that those are not based on peer review itself (in particular on the criticisms that may have been raised in the process), but on which peer-reviewed journal the paper lands in. So it's actually not exactly based on peer review, since as noted above the most stringent filters are applied not by peers but by editors. But in any case, if what we expect of this process is a “label of quality”, then there appears to be no reason why it should be applied before publication rather than after, and that could be done as outlined above.

4) I think the only vaguely valuable argument in favor of pre-publication peer review is that it influences the way scientists write their papers. In particular, one might argue that even though peer review does not do a great job of spotting errors, it may nevertheless reduce the flaws in publications by its preventive action: scientists are more careful because they want their paper published. This might be true, but why would it be different if peers commented on the paper after publication? I would actually think the opposite: having flaws spotted in the private circle of journals and peer reviewers is not as bad for one's reputation as having them spotted in the public sphere.

So what is the conclusion? My feeling is that the current peer review system is based on an outdated publication system, with printed journals having to make a selection of papers for economic reasons. Now that publication itself is no longer the limiting factor (e.g., you can publish on arXiv), there appears to be no reason to apply filters before publication. Papers could be reviewed, commented on and selected by scientific communities after online publication. This would save a lot of effort and money, produce a richer scientific debate and possibly reduce some biases.

This week's paper selection (23 Dec – 6 Jan 2016)

This week I decided to make a subselection because my lists tend to be quite long!