The psychology of academic hiring committees (IV) Attribute substitution: how much does the candidate/lab/field need the position?

Even though experience is only a substitute for the attribute that is supposed to be evaluated in hiring decisions, there is at least some correlation between the two attributes, at least in some cases (such as the number of publications). More troubling are the following criteria, which have little to do with the target attribute:

- The number of times the candidate has previously applied.

- Whether another candidate is applying for the same lab (which would be bad).

- Whether the lab had a successful candidate the previous year.

- Whether the sub-discipline has not had a successful candidate for some time (which would be good).

- In more senior positions, whether the candidate already has a (junior) position in France (bad).

These criteria, which are used in actual rankings, answer an entirely different question: how much does the candidate or discipline or lab need the position? This has nothing to do with any of the official criteria.

The “queue”

In some disciplines, committee members will easily tell you that there is in effect a “queue”, because there are so many good candidates. You should not expect to get the position the first time, even if you are very good (although it certainly helps!).

First remark, in defence of the committees: the second time you apply, you certainly have a better chance of getting the position, since a number of people who were ranked above you got a position and will therefore no longer be competing. This is obvious, but it is not exactly what is meant by “there is a queue”. As I understand it, what is meant is the following. As I wrote before, selecting young scientists is a very difficult task, especially for a committee with heterogeneous expertise. Committee members are therefore happy to use an easier, substituted attribute. When two good applications are discussed, and one candidate is applying for the first time while the other is applying for the third time, it is tempting for the committee to reason as follows: let us give the position to the older applicant, and the younger applicant will get it next time; the implication being that this way we actually pick both applicants, who are both good. This is of course a fallacy, since no position is created in the process. Whichever applicant is chosen, another applicant will not get a position. So the reasoning is illogical. The applicant that should be selected is the best one, not the one who has applied the greatest number of times. Since the number of positions granted each year is not changed by the decision process, the result of such fallacious reasoning is only to artificially create a “queue” and increase the average hiring age.

Quotas

Criteria based on discipline or lab quotas are not necessarily irrational, although they have nothing to do with the individual's scientific quality. But there is a similarly irrational criterion, this time generally for more senior positions, 1st class junior scientists (CR1 in administrative slang). In France, there are two types of junior positions: full-time researcher (e.g. in CNRS; 2nd class or 1st class) and assistant professor in a university, which involves both research and teaching. In principle, you could apply to a full-time research position if you are a postdoc, or if you are an assistant professor and want to do more research. Officially no distinction is made. However, in practice, any committee member will tell you that it is next to impossible for an assistant professor to get such a position. Why is that? Again this is a case of substitution: failing to clearly distinguish between good scientists on the basis of their expected scientific career, committees answer a different question: who needs the position most? So the reasoning (which I have heard explicitly many times) is as follows: 1) if the candidate already has a permanent position, then he/she needs the advertised position less than a candidate who is currently a postdoc; 2) if the candidate has a permanent position but abroad, then he/she should be favoured over the candidate who has a permanent position in France, because it increases the number of faculty positions in France by one.

Again this is a fallacy, because no committee decision whatsoever can create or destroy a position, or has any effect on the public budget. The only impact is on who gets the position. If an assistant professor in France gets the research position, then the budget corresponding to the former position is freed and another assistant professor is hired instead. Whoever is selected by the committee, the decision will not increase or decrease the amount of public money allocated to permanent academic hiring, which is an independent political decision.

The consequence of basing decisions on substituted attributes, or simply taking these irrelevant attributes into account, is logically obvious: it reduces the weight given to the target attribute in the final ranking, i.e., worse candidates are selected.

The psychology of academic hiring committees (III) Attribute substitution: what is the experience of the candidate?

First I will comment on the criteria that correspond to the question “what is the experience of the candidate?”, rather than the original question that the committee is supposed to answer: “how likely is the candidate to have a brilliant scientific career over the next 30-40 years?”. There is no doubt that the candidate's experience is one factor that should be taken into account to make the hiring decision. Attribute substitution is when that factor is not simply taken into account, but mistaken for the target attribute that constitutes the object of the judgment.

Here are a few criteria that are unofficially used to assess the experience of the candidate:

- The number of publications; in particular, there is often an unofficial threshold for that number.

- The age of the candidate (older is better, but not too old).

- In teaching positions (not CNRS), whether the candidate has already taught quite a bit.

- Whether the candidate did a postdoc abroad (rather than in the same country, which would be bad).

The key here is to note that these criteria are about the substituted attribute, the experience of the candidate at the time of the application. They are not directly about the target attribute, the future career of the scientist. But they are assessed as if they were actually directly about that target attribute.

The number of publications

Consider the number of publications. Quite obviously, a candidate who has not published, at least not as a main author, should not be hired. Even if the candidate is brilliant, there is simply no information to know it. But using the number of publications as a proxy for “scientific excellence” (one of the official criteria) is another story. “Scientific excellence” is about productivity. All other things being equal (e.g. quality of the publications), more is better: it is better to hire a scientist who will publish 100 high-quality papers over his/her entire life than 10 papers of the same quality. The problem is that “number of publications at the time of application” is a rather poor substitute for future productivity. Imagine you have all the information you need, that is, the number of future publications of the candidate if he/she is hired. For a given productivity, the number of publications at the time of application obviously correlates with the time for which the candidate has been publishing. Every year the substituted attribute (number of publications) increases with no change in the target attribute (productivity). This leads to paradoxical decisions: a candidate who has published 10 papers in 15 years will be ranked higher than a candidate who has published 8 papers in 4 years (again, all other things being equal - I am only considering the number of publications). The substituted attribute has no direct relationship with the target attribute. Yet it seems to be used as such (at least in biology sections).
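To make the paradox explicit with the numbers above (a back-of-envelope illustration, not data): if the publication count at the time of application is roughly the product of productivity and publishing time, then ranking by raw count rewards time rather than productivity.

```latex
N = r \cdot T
\quad\Rightarrow\quad
r_1 = \frac{10 \text{ papers}}{15 \text{ years}} \approx 0.7 \text{ papers/year},
\qquad
r_2 = \frac{8 \text{ papers}}{4 \text{ years}} = 2 \text{ papers/year}.
```

On the substituted attribute (the count N) the first candidate wins; on the target attribute (the rate r, the better proxy for future output) the second candidate is roughly three times more productive.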

The age of the candidate

Over the last few decades, the average hiring age for CNRS junior positions has increased very substantially. There was a time when scientists were hired right after they obtained their PhD. This would seem almost crazy now, yet a number of the older committee members got their position at that time, and would now argue (cognitive dissonance! I will come back to that) that it would not be reasonable to hire scientists so young. Why did hiring age increase so much? A major fact to consider is that the number of applications has considerably increased without a corresponding increase in the number of positions, but let us consider a few hypotheses.

Hypothesis 1: age is a reasonable criterion, and committee members simply failed to recognize it before. Given that the way committees are composed has not changed and that, as I noted above, committees have no feedback to learn from errors, this seems unlikely.

Hypothesis 2: young scientists were much better in the old times, and so informed decisions could be taken at an earlier age. This would require that young scientists published more in the past (otherwise there can be no informed decision), but the trend is in fact the opposite.

Hypothesis 3: the way science is done has completely changed, and so people now need much more experience. I have heard this hypothesis, in particular to explain why young biologists must now do several postdocs before finding a position. Note that the argument is about experience, a substituted attribute, and not about the future scientific career. I observe that: 1) one can learn things before or after being permanently hired; 2) during the time when they learn the things that are now supposedly required for being a good permanent scientist, candidates are non-permanent scientists, i.e. scientists on a different type of contract; so it would seem that the requirement is for having a permanent contract, not for doing science. In my field, which is interdisciplinary, I know of many examples of older renowned scientists who made a career in one field (e.g. physics) before changing fields (e.g. to biology/neuroscience) while holding permanent positions. So empirically, having a permanent position does not seem to prevent one from learning new things and even changing fields. Take Gerald Edelman: Nobel prize for his immunology work, then he changed fields to work in neuroscience. He was not a postdoc when he got the Nobel prize. Therefore, this hypothesis does not seem to have a clear rational or empirical basis.

Hypothesis 4: finally, the explanation I have heard most often from committee members (so, people who have a role in this age increase) is the following: the number of candidates has increased, and so “mechanically” hiring age increases. More recently, there has been another increase in mean hiring age after the legal limit on application age was raised, which was explained to me as follows: now there is a competition with older candidates who necessarily have a better application, and since we take the best application, we have to take the oldest candidates. “Better application” means more publications here. So quite explicitly, it appears that the committee substitutes experience for excellence. I note again that the older scientist once was a younger scientist, with fewer publications (= substituted attribute), and yet both have the same scientific career (= target attribute).

Teaching experience

Another typical case of substituting experience arises in hiring assistant professors in universities. Having sat on a few such committees, I can tell you that one strong criterion (a threshold, i.e., a pass/fail criterion) is the number of hours that the candidate has taught, which has to be sufficient. In general, lectures are taught only by permanent faculty (assistant professors and professors), so candidates have taught tutorials, which is considered fine. At first sight, it seems to make sense to consider teaching experience for a teaching position, and this is why this attribute is used as a substitute, without the substitution even necessarily being noticed. However, the target attribute is not how much one has taught, but how good the candidate will be as a teacher. The number of hours that someone has taught is essentially irrelevant, since it gives no indication as to whether the candidate is a good or a bad teacher. In general, students teach during their PhD because either it is mandatory or they get a substantial amount of money. Students who do not teach during their PhD have a contract that does not require them to teach, and these students want to spend more time on their research. It doesn't necessarily make them bad teachers.

Now one could (and people do) raise the following counter-argument: candidates who have more teaching experience will be better teachers. First of all, since we are talking about a permanent position, the few hours one has taught at the time of application will have little impact on the timescale of a 30-40 year career. But more importantly, this argument is illogical. Candidates who taught before being hired, and who are supposedly better teachers now, were worse teachers when they started (which apparently was not an issue then). The decision to hire someone permanently before or after they start teaching has no impact whatsoever on the teaching experience they accumulate over the course of their teaching career. Teaching experience at the time of application should therefore not be relevant to the hiring decision if there is no information on the quality of that teaching. This illustrates a clear case of attribute substitution.

Unfortunately, this substituted criterion is in broad use and prevents a number of good young scientists from applying. Indeed, in France, to be allowed to apply locally for a faculty position in a university, one must first pass a national screening stage (“qualification”) in which a national committee decides, with its own criteria, whether the candidate has the required credentials to apply to faculty positions. In general, committees require candidates to have taught a certain number of hours. The committees have no information on the quality of teaching, only on what was taught and for how many hours. So let me be clear: an explicit requirement for being allowed to teach on a permanent contract is to have taught on a fixed-term contract (e.g. during the PhD or as an ATER, a temporary assistant professor), a type of position for which there is no such requirement.

It gets better: there is one committee for each discipline, but once this screening stage is passed, the candidate can apply to any faculty position in any other discipline. I have a personal example. I did a PhD in computational neuroscience, an interdisciplinary field, during which I gave math tutorials. After I got my diploma, I applied to that screening stage in different committees. At the time, I already had a few published papers and had taught about 200 hours. But the computer science committee decided not to grant me the authorization, because (I assume) I had not taught computer science, or perhaps they considered the research not relevant to their discipline, even though it was quite obvious that I had the required credentials. This happened again this year to a former PhD student of mine: he had taught the right number of hours, but in both maths and computer science instead of only in computer science, so they rejected the application. Fortunately for me, the neuroscience committee did grant me the authorization. The next year, I got a position in the computer science department of Ecole Normale Supérieure, thanks to the authorization from the neuroscience committee. It is clear here that the question that the computer science committee answered was not, as it should have been, “should this guy be authorized to apply for a faculty position in some discipline?” but a substituted question, “did this guy teach 200 hours of computer science?”. This is particularly problematic for interdisciplinary science.

Postdoc experience

In many committees, in particular in biology but also in many other fields, it is taken for granted that one must have done a postdoc abroad to be seriously considered. It acts as a first screen: no postdoc abroad = fail. In some committees, the candidate must also have published during that postdoc, so as to show that the PhD publications were not just due to the supervisor, who is generally a co-author – although the postdoc is generally not an independent position either.

I also have a personal example. When I applied after the PhD, I failed. I asked one committee member for feedback. He told me: your application was considered excellent, so you should do a postdoc abroad and then next year you will get the position. For personal reasons, I did not want to move far, and I could not understand the logic behind that requirement. I already had two publications as a single author, and in fact my supervisor did not sign any of my publications, so the argument I previously mentioned did not apply. But in any case, as it was phrased, what the committee wanted was not that I publish in another lab; they just wanted me to check the box “postdoc abroad” (there actually is a box in the CNRS application forms). Also, the requirement was not simply a postdoc in another lab, but in a different country. I also did not understand that geographic requirement: I had spent an extended period in England during my studies, which the committee knew, so the linguistic argument did not apply either. It turned out that what I wanted to do was best done in another lab in France, but that did not help me check the required box. So apparently, the single fact that a postdoc was done abroad, without knowing how the postdoc actually went, was a decisive criterion in the ranking, independently of any other consideration. Imagine I had actually planned to do a postdoc abroad, and had already made arrangements to do it at the time of application. Then, given that information, the committee would have known with certainty that by the next year I would have done a postdoc abroad, and therefore that I would definitely pass that criterion. So actually doing the postdoc abroad was apparently irrelevant to their decision. So there was no rational basis for that requirement.

The committee member had given me that piece of information without blushing, and he did not seem to be embarrassed by the fact that the committee insisted on such an irrational criterion. It surprised me at first that people involved in irrational decision making seem at the same time to be very confident about the correctness of their decisions. This occurs even though it is clear that the decision to be taken is difficult and there is considerable uncertainty in the choice. But in fact this is a very well established psychological phenomenon, which is explained for example in Daniel Kahneman's book. The degree of confidence one has in a decision or judgment is essentially determined not by the rationality of that decision or by objective facts, but by the coherence of the causal story that one constructs to explain the decision. So one would say: it is clear that Peter should be ranked before Paul, since he has two more publications; we could not do otherwise. But the story neglects the fact that the number of publications is actually a substituted attribute, not the target attribute of the judgment. The same goes for other substituted attributes. People involved in the decision can hear the objective elements that contradict the decision, but if these cannot be fitted into the story, they are essentially ignored. This is related to cognitive dissonance theory, certainly one of the most brilliant theories in psychology, developed by Festinger in the 1950s. I will talk about it later.

The psychology of academic hiring committees (II) Attribute substitution

Substitution occurs in difficult judgments for which there is no direct access to the target attribute to be evaluated. This is typically the case when permanently hiring a young scientist: one wants to know whether the scientist will have a successful career, but the outcome is uncertain and especially difficult to assess if you do not know the scientist's field. In such situations, the target attribute is replaced by some other, more available attribute. An obvious one would be the number of publications of the candidate.

To some extent, committees know that the decision is difficult and that they have to use indirect criteria. So they agree in advance on a list of criteria that are made public. An example from the neurophysiology section: scientific excellence, quality and originality of the scientific production, mobility (i.e., whether the candidate did a postdoc abroad), a good scientific project, a good oral presentation and interview.

Establishing a list of indirect criteria is not irrational in such a difficult situation, since the target attribute (future career) is not directly accessible. But what you might notice in this list is that, with the exception of one criterion (mobility), all criteria are still fairly vague and difficult to evaluate. What is “scientific excellence” and how can someone outside the field evaluate it? How can someone who is not in the field of the candidate know if the research is original? This is where attribute substitution occurs. From discussions with many committee members from different disciplines, here are some of the criteria that turn out to be actually used in their decisions, which I have categorized according to the question that is substituted for the actual question:

Substituted question: what is the experience of the candidate?

- The number of publications; in particular, there is often an unofficial threshold for that number.

- The age of the candidate (older is better, but not too old).

- Whether the candidate did a postdoc abroad (rather than in the same country, which would be bad).

- In teaching positions (not CNRS), whether the candidate has already taught quite a bit.

Substituted question: how much does the candidate/lab/field need the position?

- The number of times the candidate has previously applied.

- Whether another candidate is applying for the same lab (which would be bad).

- Whether the lab had a successful candidate the previous year.

- Whether the sub-discipline has not had a successful candidate for some time (which would be good).

- For more senior positions (a separate competition), whether the candidate already has a (junior) position in France (bad).

And a few other criteria I will talk about later, because they are not so much about attribute substitution:

- Whether one committee member knows the candidate (good).

- Whether one committee member knows a person who recommended the candidate (good).

- Whether the candidate “made a good impression” during the oral presentation.

- Publications in sexy journals such as Nature and Science.

There are also other criteria that have more to do with politics than with psychology, such as committee members pushing their former students, or candidates for their own labs. I will not comment on them. I will comment on a few of the unofficial criteria listed above. To fully understand what follows, bear in mind that the application of a candidate is generally read in full detail by a single member of the committee, who acts as the referee.

The psychology of academic hiring committees (I)

When a university or research institution wants to fill a professorship or another permanent academic position (e.g. full-time research scientist at the CNRS in France), a committee is appointed (or elected) to decide whom to hire. From my experience as a candidate and as a member of such committees, the way committees work is very interesting from the point of view of psychology. It seems like the perfect illustration of many known cognitive biases in decision making (see the excellent book by Nobel prize-winning psychologist Daniel Kahneman, “Thinking, fast and slow”). I believe it also applies to grant selection committees.

Scientists tend to see themselves as rational beings. But it is now well established that 1) humans (including “experts”) are not rational at all in many situations, and 2) we tend to think of ourselves as more rational than we actually are (i.e., we are confident in our biased judgments). Selecting the right candidate for a position is a difficult decision-making problem, and most scientists on hiring committees do not know much about either decision making or psychology. Therefore it is likely that they are subject to cognitive biases. In addition, these are collective decisions, which come with additional biases described in social psychology.

I will focus on one particular situation, the selection of junior scientists by national research organizations in France, the largest one being the CNRS. I will also briefly mention a couple of other cases. After reading this text, you will probably feel that I have exposed some serious problems in the hiring process. However, the aim of this text is not to point fingers at individuals. On the contrary, my aim is to provide a psychological perspective on the process, that is, to show that these problems reflect general human cognitive biases, which are well documented. Of course, I also believe that there are ways to reduce these problems, but this means changing the processes, not the individuals (who will still be humans). This text focuses on the psychological side of the problem (explaining what happens), not on its political side (changing the processes).

1. The situation

1.1. The decision to be taken

In France, each year, the CNRS offers a number of permanent positions to junior scientists (“junior” generally meaning in their 30s) in all academic disciplines. The call is national, not tied to a particular university (universities have their own system). There is a general call for each entire discipline (say, computer science) and there are many, many candidates.

Given that the positions are permanent (i.e., about 30-40 years of employment), the goal is to select the most promising young scientists, those who will have the most brilliant careers. It is therefore a judgment on the expected future scientific output of the candidates, which I call the “target attribute”. Note that this is quite different from the decision to hire a postdoc for a specific 2-3 year project, where the target attribute is the correct accomplishment of the project.

The judgment is based on information available at the time of the decision, which is the past scientific career of the candidate, education and any other element available at that time.

1.2. The committee

For each discipline there is a committee of about 20 scientists (2/3 are elected, 1/3 are nominated). A large proportion of these scientists are junior scientists themselves, and therefore have neither long scientific experience nor much experience in hiring people (e.g. postdocs). There is no external review, meaning that all applications are reviewed internally by some of these 20 scientists. Each candidate has a referee in the committee who assesses the application in detail.

1.3. Information and (lack of) feedback

The committee has to select a small number of candidates from a very large and diverse set of applications. For most of these candidates, there is no expert in the committee. The application consists of a CV (in particular, a list of publications), a research project and a report on previous work. There may also be reference letters, although there is no consistent rule across committees. The candidate also gives a short oral presentation and is interviewed briefly by a subset of the committee (for practical reasons).

Part of the committee has no experience in hiring. How much can the committee learn from experience? First of all, the composition of committees changes every 4 years – even though some members can remain. To learn from experience requires feedback on decisions. A decision is: select a candidate, or reject a candidate. The target attribute to be judged is the future scientific output of the candidate during his/her 30-40 years contract.

First option: the candidate is selected, and he or she goes to a lab. The lab is in general not the lab of one of the committee members. As far as I know, there is no follow-up on the decisions, e.g. to see how well the selected scientist does. If there were, it would in any case be very limited in time, since the lifetime of a committee is one tenth of the duration of the scientist's career.

Second option: the candidate is rejected. Candidates that are rejected may 1) quit science, 2) find a position elsewhere, 3) apply again the next year. In the first two cases, the event is generally not known to the committee. But in any case, it is not possible to know how well the candidate would have done if he/she had been selected. So there is no feedback on the decision in these two cases. In the third case, there is still little useful feedback on the decision not to hire the person the previous year, given that the target attribute is the life-long career of an individual.

Finally, discussions and reports of the committee are not public, in fact they are strictly confidential. Therefore there can be no external feedback on the committee, and committees of different disciplines cannot exchange information.

1.4. Summary

In summary, the situation is that of a group of people that must take a very difficult decision, whose good or bad outcome can be assessed only in the long run, with limited information, and who have no opportunity to learn from experience. Therefore, decisions are based not on experience, but on the self-confidence of the committee in their own judgments. Additionally, information is unevenly distributed across the committee, because one member (the referee) examines the application in detail, and a subset of the committee is present for the oral presentation and interview.

The setting is perfect for all sorts of interesting cognitive phenomena. In the next posts, I will discuss in particular attribute substitution, cognitive dissonance, the halo effect, the illusion of validity, and obedience to authority.

What is computational neuroscience? (XVIII) Representational approaches in computational neuroscience

Computational neuroscience is the science of how the brain “computes”: how it recognizes faces or identifies words in speech. In computational neuroscience, standard approaches to perception are representational: they describe how neural networks represent in their firing some aspect of the external world. This means that a particular pattern of activity is associated with a particular face. But who makes this association? In the representational approach, it is the external observer. The approach only describes a mapping between patterns of pixels (say) and patterns of neural activity. The key step, relating the pattern of neural activity to a particular face (which is in the world, not in the brain), is done by the external observer. How then is this about perception?

This is an intrinsic weakness of the concept of a “representation”: a representation is something (a painting, etc.) that has a meaning for some observer; it says nothing about how this meaning is formed. Ultimately, it does not say much about perception, because it simply replaces the problem of how patterns of photoreceptor activity lead to perception with the problem of how patterns of neural activity lead to perception.

A simple example is the neural representation of auditory space. There are neurons in the auditory brainstem whose firing is sensitive to the direction of a sound source. One theory proposes that the sound's direction is signaled by the identity of the most active neuron (the one that is “tuned” to that direction). Another proposes that it is the total firing rate of the population, which covaries with direction, that indicates sound direction. Yet another theory considers that sound direction is computed as a “population vector”: each neuron codes for a direction and is associated with a vector oriented in that direction, with a magnitude equal to its firing rate; the population vector is the sum of all these vectors.
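As a concrete illustration, here is a minimal sketch of the third read-out scheme (the population vector), using made-up preferred directions and firing rates rather than data. Note that the final arctan step is precisely the “decoding” performed by the external observer, which is the issue discussed next.

```python
import numpy as np

# Hypothetical preferred directions (degrees) and firing rates (spikes/s).
preferred_directions = np.deg2rad([0, 45, 90, 135, 180, 225, 270, 315])
firing_rates = np.array([5.0, 12.0, 30.0, 18.0, 6.0, 2.0, 1.0, 3.0])

# Each neuron contributes a vector along its preferred direction,
# scaled by its firing rate; the population vector is the sum.
vx = np.sum(firing_rates * np.cos(preferred_directions))
vy = np.sum(firing_rates * np.sin(preferred_directions))

# The "decoding" step: turning the population vector back into a direction
# is done here by the observer's formula, not by the network itself.
estimated_direction = np.rad2deg(np.arctan2(vy, vx))
print(f"Estimated source direction: {estimated_direction:.1f} degrees")
```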

Implicit in these representational theories is the idea that some other part of the brain “decodes” the neural representation into the sound's direction, which ultimately leads to perception and behavior. However, this part is left unspecified in the model: neural models stop at the representational level, and the decoding is done by the external observer (using some formula). But the postulate of a subsequent neural decoder is problematic. Let us assume there is one. It takes the “neural representation” and transforms it into the target quantity, which is sound direction. But the output of a neuron is not a direction; it is a firing pattern or rate that can perhaps be interpreted as a direction. So how is sound direction represented in the output of the neural decoder? It appears that the decoder faces the same conceptual problem, namely that the relationship between output neural activity and the actual quantity in the world (sound direction) has to be interpreted by the external observer. In other words, the output is still a representation. The representational approach leads to an infinite regress.

Since neurons are in the brain and things (sound sources) are in the world, the only way to avoid an external “decoding” stage that relates the two is to include both the world and the brain in the perceptual model. In the example above, this means that, to understand how neurons estimate the direction of a sound source, one would not look for the “neural representation” of sound sources but for neural mechanisms that, embedded in an environment, lead to some appropriate orienting behavior. In other words, neural models of perception are not complete without an interaction with the world (i.e., without action). In this new framework, “neural representations” become a minor issue, one for the external observer looking at neurons.

What is computational neuroscience? (XVII) What is wrong with computational neuroscience?

Computational neuroscience is the field that aims at explaining the neural mechanisms that underlie cognitive abilities, by developing quantitative models of neural mechanisms that are able to display these cognitive abilities. It can be seen as the “synthetic” approach to neuroscience. On one hand, it is widely believed that a better understanding of “how the brain does it” should allow us to design machines that can outperform the best computer programs we currently have, in tasks such as recognizing visual objects or understanding speech. On the other hand, there is also a broad recognition in the field that the best algorithms for such tasks are always to be found in computer science (e.g. machine learning), because these algorithms are specifically developed for these tasks, without the “burden” of having to explain biology (for example, support vector machines or hidden Markov models). In fact, part of the work done in computational neuroscience aims at connecting biological mechanisms with preexisting computer algorithms (e.g. seeing synaptic plasticity as a biological implementation of ICA). Given this, the belief that better algorithms will somehow arise from a better understanding of biology seems rather magical.

What is wrong here is that, while it is proposed that new generation computers should take their inspiration from brains, the entire field of computational neuroscience seems to invert this proposition and to take the computer as a model of the brain. I believe there are two main flaws with the computer analogy: 1) the lack of an environment, 2) the idea that there is a preexisting plan of the brain.


1) The lack of an environment

Neural models that address cognitive abilities (e.g. perception) are generally developed under the input-output paradigm: feed data in (an image), get results out (a label). This paradigm, inspired by the computer, is also the basis of many experiments (present stimulus, observe behavior/neural activity). It follows that such models do not interact with an environment. In contrast with this typical setting, in a behaving animal, sensory inputs are determined both by the outside world and by the actions of the animal in the world. The relationship between “inputs” and “outputs” is not causal but circular, and the environment is what links the outputs to the inputs (see the sketch below). In addition, the “environment” of neural models is often only an abstract idealization, typically inspired by a specific controlled lab experiment. As a result, such models may be able to reproduce results of controlled experimental situations, but it is not so clear that they have any explanatory value for ecological situations, or that they can be considered as models of a biological organism.
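A toy sketch of the contrast, with an entirely hypothetical “model” (a simple proportional orienting command) and a one-dimensional world: in the first setting the loop ends at the output, while in the second the action changes the very quantity that determines the next input.

```python
def model(sensory_input):
    # Stand-in for any neural model: here, turn by a fraction of the angular error.
    return 0.1 * sensory_input

# 1) Input-output paradigm: present a stimulus, read out a response, stop.
stimulus = 30.0              # e.g. angular position of a source (degrees)
response = model(stimulus)   # its meaning is assigned by the external observer

# 2) Closed sensorimotor loop: the action changes the animal's relation to the
#    world, which in turn determines the next sensory input.
heading, source = 0.0, 30.0
for t in range(50):
    error = source - heading     # input depends on the world AND on past actions
    action = model(error)        # motor command
    heading += action            # the environment closes the loop
print(f"Final heading: {heading:.1f} degrees (source at {source:.0f} degrees)")
```

Here, orienting towards the source is a property of the loop as a whole, not of the model's output read in isolation.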

A corollary of the absence of environment is the lack of autonomy. Such neural models do not display any cognitive abilities since they cannot “do” anything. Instead, the assessment of model performance must rely on the intervention of the external observer, as in the coding paradigm: models are designed so as to “encode” features in the world, meaning that the external observer, not the organism, decodes the activity of the model. This weakness is an inevitable consequence of the strong separation between perception and action, as the “output” of a sensory system is only meaningful in the context of the actions that it drives. This issue again comes from the computer analogy, in which the output of a program is meaningful only because an external observer gives it a meaning.

These criticisms are in fact very similar to those expressed against traditional artificial intelligence in the 80s, which gave rise in particular to the field of behavior-based robotics. But they do not seem to have made their way into computational neuroscience.


2) The plan of the brain

There is another criticism of the computer analogy, which has to do with the idea that the brain has been engineered by evolution in the same way as a computer is engineered. A computer has a program that has been written so as to fulfill a function, and the brain has a structure that has evolved so as to fulfill a function. So in virtually all neuron models, there are a number of parameters (for example time constants) whose values are either chosen “because it works”, or because of some measurements. It is then assumed that these values are somehow “set by evolution”. But genes do not encode parameter values. They specify proteins that interact with other chemical substances. The “parameter values”, or more generally the structure of the brain, result from all these interactions, in the body and with the environment.

The structure of the brain is highly dynamic, most obviously during development but also in adulthood. Synaptic connections are plastic in strength, structure, and conduction delay. But almost everything else is plastic as well: the density and location of ionic channels, the morphology of dendrites, the properties of the channels themselves. Activity can even determine whether a neuron becomes excitatory or inhibitory. Therefore what the genes specify is not the structure, but an organization of processes that collectively determine the structure. Humberto Maturana pointed out that what characterizes a life form is not its structure, which is highly dynamic, but its self-sustaining organization. This is a fundamental distinction between engineered things and living things.


A different approach to computational neuroscience could take biological organisms, rather than the computer, as models. The first point is that neural models must be embedded in an environment and interact with it, so that the external observer is not part of the cognitive process. This implies in particular that perceptual systems cannot be studied as isolated modules. The second point is to focus on organizational mechanisms that guarantee sustainability in an unknown environment, rather than on a structure that specifies a particular input-output function.

What is computational neuroscience? (XVI) What is an explanation?

An explanation can often be expressed as the answer to a question starting with “why”. For example: why do neurons generate action potentials? There are different kinds of explanations. More than 2000 years ago, Aristotle categorized them as “four causes”: efficient cause, material cause, formal cause and final cause. They correspond respectively to origin, substrate, structure and function.

Efficient cause: what triggers the phenomenon to be explained. Why do neurons generate action potentials? Because their membrane potential exceeds some threshold value. A large part of science focuses on efficient causes. The standard explanation of action potential generation in biology textbooks describes the phenomenon as a chain of efficient causes: the membrane potential exceeds some threshold value, which causes the opening of sodium channels; the opening of sodium channels causes an influx of positively charged ions; the influx causes an increase in the membrane potential.

Material cause: the physical substrate of the phenomenon. For example, a wooden box burns because it is made of wood. Why do neurons generate action potentials? Because they have sodium channels, a specific sort of proteins. This kind of explanation is also very common in neuroscience, for example: Why do we see? Because the visual cortex is activated.

Formal cause: the specific pattern that is responsible for the phenomenon. Why do neurons generate action potentials? Because there is a nonlinear voltage-dependent current that produces a positive feedback loop with a bifurcation. Note how this is different from the material cause: the property could be recreated in a mathematical or computer model that has no protein, or possibly by proteins that are not sodium channels but have the required properties. It is also different from efficient causes: the chain of efficient causes described above only produces the phenomenon in combination with the material cause; for example, if sodium channels did not have nonlinear properties, then there would not be any bifurcation and therefore no action potential. Efficient causes are only efficient in the specific context of the material causes, i.e., the efficient cause describes what happens given sodium channels. The formal cause is what we call a model: an idealized description of the phenomenon that captures its structure.
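To give a flavor of what such a formal cause can look like, here is one minimal formalization of the positive feedback (a sketch in the spirit of reduced persistent-sodium models, not the full textbook description; the parameter names are generic):

```latex
C \frac{dV}{dt} = g_L (E_L - V) + g_{Na}\, m_\infty(V)\, (E_{Na} - V),
\qquad
m_\infty(V) = \frac{1}{1 + e^{(V_{1/2} - V)/k}}
```

Below threshold the leak term dominates and the voltage returns to rest; above it, the voltage-dependent inward current grows faster than the leak can compensate, the stable and unstable equilibria collide (a saddle-node bifurcation), and the voltage runs away, which is the upstroke of the spike. Any substrate with this structure would do, which is precisely what distinguishes the formal from the material cause.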

Final cause: the function of the phenomenon. Why do neurons generate action potentials? So as to communicate quickly with distant neurons. Final causes have a special role in biology because of the theory of evolution, and because of theories of life. According to evolutionary theory, changes in structure that result in increased rates of survival and reproduction are preferentially conserved, and therefore the species that we observe today must be somehow “adapted” to their environment. For example, there is some literature about how the ionic channels involved in action potentials have coordinated properties that ensure maximum energetic efficiency. Theories of life emphasize the circularity of life: the organization of a living organism is such that its structure maintains the conditions for its own existence, and so an important element of biological explanation is how mechanisms (the elements) contribute to the existence of the organism (the whole).

A large part of physics concerns formal cause (mathematical models of physical phenomena) and final cause (e.g. the expression of physical phenomena as the minimization of energy). In the same way, theoretical approaches to neuroscience tend to focus on formal cause and final cause. Experimental approaches to neuroscience tend to focus on material cause and efficient cause. Many epistemological misunderstandings between experimental and theoretical neuroscientists seem to come from not realizing that these are distinct and complementary kinds of explanation. I quote from Killeen (2001), “The Four Causes of Behavior”: “Exclusive focus on final causes is derided as teleological, on material causes as reductionistic, on efficient causes as mechanistic, and on formal causes as ‘theorizing’.” A fully satisfying scientific explanation must come from the articulation between different types of explanation.

In biology, exclusive focus on material and efficient causes is particularly unsatisfying. A good illustration is the case of convergent evolution, in which phylogenetically distant species have evolved similar traits. For example, both insects and mammals have a hearing organ. Note that the term “hearing organ” refers to the final cause: the function of that organ is to allow the animal to hear sounds, and it is understood that evolution has favored the appearance of such an organ because hearing is useful for these animals. However, the ears of insects and mammals are physically very different, so the material cause of hearing is entirely different. It follows that the chain of efficient causes (the “triggers”) is also different. Yet it is known that the structure of these organs, i.e., the formal cause, is very similar. For example, at a formal level, there is a part of the ear that performs air-to-liquid impedance conversion, although with different physical substrates. The presence of this air-to-liquid impedance conversion stage in both groups can be explained by the fact that it is necessary to transmit airborne sounds to biological substrates that are much denser (the final cause). Thus, the similarity between hearing organs across species can only be explained by the articulation between formal cause (a model of the organ) and final cause (the function).

In brief, biological understanding is incomplete if it does not include formal and final explanations, which are not primarily empirical. In the light of this discussion, computational neuroscience is the subfield of neuroscience whose aim is to relate structure (formal cause = model) and function (final cause). If such a link can be found independently of the material cause (which implicitly assumes ontological reductionism), then it should be possible to simulate the model and observe the function.

What is computational neuroscience? (XV) Feynman and birds

“Philosophy of science is about as useful to scientists as ornithology is to birds”. This quote is attributed to Richard Feynman, one of the most influential physicists of the 20th century. Many other famous scientists, including Einstein, held the opposite view, but nonetheless it is true that many excellent scientists have very little esteem for philosophy of science or philosophy in general. So it is worthwhile reflecting on this quote.

This quote has been commented on by a number of philosophers. Some have argued, for example, that ornithology would actually be quite useful for birds, if only they could understand it – maybe they could use it to cure their avian diseases. This is a funny remark, but presumably quite far from what Feynman meant. So why is ornithology useless to birds? Presumably, what Feynman meant is that birds do not need intellectual knowledge about how to fly. They can fly because they are birds. They also do not need ornithology to know how to sing and communicate. So the comparison implies that scientists know how to do science, since they are scientists, and that this knowledge is not intellectual but rather comes from their practice. It might be interesting to observe after the fact how scientists do science, but it is not useful for scientists, because the practice of science comes before its theory, in the same way as birds knew how to fly before there were ornithologists.

So this criticism of philosophy of science relies entirely on the idea that there is a scientific method that scientists master without ever reflecting on it. On the other hand, this method must be ineffable or at least very difficult to describe precisely, in the same way as we can walk but the intellectual knowledge of how to walk is not so easy to convey. Otherwise philosophy of science would not even exist as a discipline. If the scientific method is not something that you learn in an intellectual way, then it must be like a bodily skill, like flying for a bird. It is also implicit that scientists must agree on a single scientific method. Otherwise they would start arguing about the right way to do science, which is doing philosophy of science.

This consensual way of doing science is what Thomas Kuhn called “normal science”. It is the kind of science that is embedded within a widely accepted paradigm, which does not need to be defended because it is consensual. Normal science is what scientists learn in school. It consists of paradigms that are widely accepted at the time, which are presented as “the scientific truth”. But of course such a presentation hides the way these paradigms came to be accepted, and the fact that different paradigms were widely accepted before. For example, a few hundred years ago, the sun revolved around the Earth. From time to time, science shifts from one paradigm to another, a process that Kuhn called “revolutionary science”. Both normal science and revolutionary science are important aspects of science. But revolutionary science requires a critical look at the established ways of doing science.

Perhaps Feynman worked at a time when physics was dominated by firmly established paradigms. Einstein, on the other hand, developed his most influential theories at a time when the foundations of physics were disputed, and he was fully aware of the relevance of philosophy of science, and philosophy in general. Could he have developed the theory of relativity without questioning the philosophical prejudices about the nature of time? Here are a few quotes from Einstein that I took from a paper by Howard (“Albert Einstein as a philosopher of science”):

“It has often been said, and certainly not without justification, that the man of science is a poor philosopher. Why then should it not be the right thing for the physicist to let the philosopher do the philosophizing? Such might indeed be the right thing to do at a time when the physicist believes he has at his disposal a rigid system of fundamental concepts and fundamental laws which are so well established that waves of doubt can’t reach them; but it cannot be right at a time when the very foundations of physics itself have become problematic as they are now. [...] Concepts that have proven useful in ordering things easily achieve such authority over us that we forget their earthly origins and accept them as unalterable givens. Thus they come to be stamped as “necessities of thought,” “a priori givens,” etc. [...] A knowledge of the historic and philosophical background gives that kind of independence from prejudices of his generation from which most scientists are suffering. This independence created by philosophical insight is - in my opinion - the mark of distinction between a mere artisan or specialist and a real seeker after truth.”

In my opinion, these views fully apply to computational and theoretical neuroscience, for at least two reasons. First, computational neuroscience is a strongly interdisciplinary field, with scientists coming from different backgrounds. Physicists come from a field with strongly established paradigms, but these paradigms are often applied to neuroscience as analogies (for example Hopfield’s spin glass theory of associative memory). Mathematicians come from a non-empirical field, to a field that is in its current state not very mathematical. Physics, mathematics and biology have widely different epistemologies. Anyone working in computational neuroscience will notice that there are strong disagreements on the value of theories, the way to make theories and the articulation between experiments and theory. Second, computational neuroscience, and in fact neuroscience in general, is not a field with undisputed paradigms. There are in fact many different paradigms, which are often only analogies coming from other fields, and there is no accepted consensus about the right level of description, for example.

Computational neuroscience is perhaps the perfect example of a scientific field where it is important for scientists to develop a critical look on the methods of scientific enquiry and on the nature of scientific concepts.

What is sound? (XV) Footsteps and head scratching

When one thinks of sounds, the image that comes to mind is a speaker playing back a sound wave, which travels through air to the ears of the listener. But not all sounds are like that. I will give two examples: head scratching and footsteps.

When you scratch your head, a sound is produced that travels through the air to your ears. But there is another pathway: the sound is actually produced by the skull and the skin, and it propagates through the skull directly to the inner ear. This is called “bone conduction”. A lot of the early work on this subject was done by von Békésy (see e.g. Hood, JASA 1962). Normally, bone conduction represents a negligible part of the sounds that we hear. When an acoustic wave reaches our head, the skull is set into vibration and can transmit sounds directly to the inner ear by bone conduction. But because of the difference in acoustic impedance between air and skin, the wave is very strongly attenuated, on the order of 60 dB according to these early works. It is actually the function of the middle ear to match these two impedances.
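To get a rough sense of why the impedance mismatch alone is so costly (a textbook back-of-envelope estimate, not the measurement cited above), the normal-incidence intensity transmission between two media of characteristic impedances Z1 (air, roughly 400 rayl) and Z2 (water-like tissue, roughly 1.5 million rayl) is:

```latex
T = \frac{4 Z_1 Z_2}{(Z_1 + Z_2)^2}
  \approx \frac{4 \times 4\times10^{2} \times 1.5\times10^{6}}{(1.5\times10^{6})^2}
  \approx 1.1\times10^{-3} \approx -30\ \text{dB}
```

This simple estimate already accounts for a substantial attenuation; presumably, additional losses along the skull route to the inner ear account for the rest of the ~60 dB figure reported in the early bone conduction literature.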

But in the case of head scratching, the sound is actually already produced on the skull, so it is likely that a large proportion of the sound is transmitted by bone conduction, if not most of it. This implies that sound localization cues (in particular binaural cues) are completely different from airborne sounds. For example, sound propagates faster (as in water) and there are resonances. Cues might also depend on the position of the jaw. There is a complete set of binaural cues that are specific of the location of scratching on the skull, which are directly associated with tactile cues. To my knowledge, this has not been measured. This also applies to chewing sounds, and also to the sound of one’s own voice. In fact, it is thought that the reason why one’s own voice sounds higher when it is played back is because our perception of our own voice relies on bone conduction, which transmits lower frequencies better than higher frequencies.

Let us now turn to footsteps. A footstep is a very interesting sound – not even mentioning the multisensory information in a footstep. When the ground is impacted, an airborne sound is produced, coming from the location of the impact. However, the ground is not a point source. Therefore when it vibrates, the sound comes from a larger piece of material than just the location of the impact. This produces binaural cues that are unlike those of sounds produced by a speaker. In particular, the interaural correlation is lower for larger sources, and you would expect that the frequency-dependence of this correlation depends on the size of the source (the angular width, from the perspective of the listener).

When you walk in a noisy street, you may notice that you can hear your own footsteps but not those of other people walking next to you, even though the distance to the feet might be similar. Why is that? In addition to the airborne sound, your entire skeleton vibrates. This implies that a large component of the sound you hear should in fact come from bone conduction through your body. Again these sounds should have quite peculiar binaural cues, in addition to having stronger low frequencies. In particular, there should be a different set of cues for the left foot and for the right foot.

You might also hear someone else's footsteps. In this case there is of course the airborne sound, but there is also another pathway, through the ground. Through this other pathway, the sound reaches your feet before the airborne sound, because sound propagates much faster in a solid substance than in air. Depending on the texture of the ground, higher frequencies would also be more attenuated. In principle, this vibration in your feet (particularly if you are barefoot) will then propagate through your body to your inner ear. But it is not so clear how strong this bone-conducted sound might be. Clearly it should be much softer than for your own footstep, since in that case there is a direct impact on your skeleton. But perhaps it is still significant. In this case, there are again different binaural cues, which should depend on the nature of the ground (since this affects the speed of propagation).

In the same way, sounds made by touching or hitting an object might also include a bone conducted component. It will be quite challenging to measure these effects, since ideally one should measure the vibration of the basilar membrane. Indirect methods might include: measurements on the skull (to have an idea of the magnitude), psychoacoustic methods using masking sounds, measuring otoacoustic emissions, electrophysiological methods (cochlear microphonics, ABR).

Sensory modalities and the sense of temperature

Perception is traditionally categorized into five senses: hearing, vision, touch, taste and olfaction. These categories seem to reflect the organs of sense, rather than the sensory modalities themselves. For example, the sense of taste is generally (in the neuroscience literature) associated with the taste receptors in the tongue (sweet, salty, etc.). But what we refer to as taste in daily experience actually involves the tongue, including “taste” receptors (sweet, salty) but also “tactile” receptors (the texture of food), the nose (“olfactory” receptors), and in fact probably also the eyes (color) and the ears (chewing sounds). All these are involved in a unitary experience that seems to be perceptually localized in the mouth, or on the tongue – despite the fact that the most informative stimuli, which are chemical, are actually captured in the nose. One may consider that taste is then a “multimodal” experience, but this is not a very good description. If you eat a crisp, you experience the taste of a crisp. But if you isolate any of the components that make up this unitary experience, you will not experience taste. For example, imagine a crisp without any chemically active component and no salt: you experience touch with your tongue, and the crisp has “no taste”. If you only experience the smell, then you have an experience of smell, not of taste. This is another sensory modality, despite the fact that the same chemical elements are involved. If only the “taste” receptors on your tongue were stimulated, you would have an experience of “salty”, not of a crisp. So the modality of taste involves a variety of receptors, but that does not make it multimodal, any more than vision is multimodal because it involves many photoreceptors.

“Touch” is also very complex. There is touch as in touching something: you make contact with objects and you feel their texture or shape. There is also being touched. There is also the feeling of weight, which involves gravity, and also movement. There is the feeling of pain, which is related to touch but not classically included in the five senses. Finally there is the feeling of temperature, which I will now discuss from an ecological point of view (in the manner of Gibson).

The sense of temperature is not usually listed among the five senses. It is often associated with touch, because by touch you can feel that an object is hot or cold. But you can also feel that “it” (the weather) is cold, in a way that is not well localized. Physically, temperature is a quantity that is not mechanical, and in this sense it is completely different from touch. But like touch, it is a proximal sense that involves the interface between the body and either the medium (air or water) or substances (object surfaces). The sense of temperature is much more interesting than it initially seems. First, there is of course “how hot it is”, the temperature of the medium. The image that comes to mind is that of the thermometer. But temperature can be experienced all over the body, so spatial gradients of temperature can be sensed. When touching an object, parts of the object can be more or less hot. So spatial gradients of temperature can potentially be sensed through an object, in the same way as mechanical texture can be sensed. Are there temperature textures?

The most interesting and, as far as I know, underappreciated aspect of the temperature sense is its sensorimotor structure. The body produces heat. Objects react to heat by warming up. Some materials, like metal, conduct heat well; others, like wood, do not. So both the temporal changes in temperature when an object is touched, and the spatial gradient of temperature that develops, depend on the material and possibly specify it. So it seems that the sense of temperature is rich enough to qualify as a modality in the same way as touch.