Frankenstein models and the double-edged sword of modelling tools

The reason we develop simulation tools is to boost productivity. Coding numerical integration schemes for neuron models can be rather time-consuming, and we want to invest our time in the science rather than in the implementation. It can also be tricky: simulating the cable equation in an efficient and accurate way, for example, is quite a bit of work. This is why we developed the Brian simulator, and presumably why the Neuron simulator was developed. As those tools become successful, scientists start to publish and share their models. There is, for example, the ModelDB database, a great resource for finding model code from published papers. In turn, this can yield a further increase in productivity, because models can be reused. This second productivity benefit of tools is also a motivation for the development of common formats for models and data.

On paper, this looks great. I'm all in favor of encouraging everyone to share everything they can in the most convenient way. However, I think we should be aware that the development of model databases also encourages problematic modeling practices. Making a tool entails hiding what exactly the tool does and how it does it. This can be a good thing; you don't really need to know how the cable equation is numerically integrated to understand it. But when it comes to the models themselves, I think it's a problem. The meaning of a model is specific to the context in which it was conceived. For example, the original Hodgkin-Huxley model is a model of a space-clamped squid giant axon, which actually results from the fusion of hundreds of cells; it is not a neuron model. So taking pieces of published models to build a new model for a different scientific question and context is a delicate exercise. Unfortunately, model sharing encourages the development of what one might call “Frankenstein models”.

Let me be more specific about what I mean by “Frankenstein models”. Most detailed biophysical neuron models that people use today (see e.g. ModelDB) use ion channel models that come from different species, different cell types and brain areas, and different experimental conditions. Those models are then almost always hand-tuned, i.e. parameters are changed relative to the measurements, because a model never works when you just put disparate things together. For example, one might shift the activation curve of a channel so as to obtain the right spike threshold. So if you examine a modern detailed model, you will find not only that its ionic channels are aggregates from various conditions and species, but also that its components carry a history of hand-tuning, done for various unrelated problems. That these detailed models are considered realistic puzzles me (and by these, I mean just about all published models, except perhaps the original Hodgkin-Huxley model).

That these models are not realistic might not be so bad, if one were lucid about the question (i.e., acknowledged the assumptions and limitations of the model). Unfortunately, there seems to be a general lack of education in neuroscience about the epistemological questions related to modeling, and people often seem happy to observe that the model “has sodium channels” with experimentally measured properties. So I will make a few remarks to emphasize that to “have sodium channels” is essentially meaningless. Take the original (isopotential) Hodgkin-Huxley model. It is a model of the space-clamped squid giant axon, i.e., of an axon made isopotential by inserting a metal wire along it. As I mentioned above, that axon is actually an exception to the neuron doctrine: it is a syncytium resulting from the fusion of hundreds of cells, so the model is not, and has never been, a neuron model. Secondly, the exponents in the channel currents (the number of gates; see the current-balance equation below) were obtained by optimizing the spike shape. The potassium current has exponent 4 only because Hodgkin and Huxley could not try higher numbers, given the computers of the time. It was recognized later that the potassium current is actually more delayed, so a higher exponent would be needed. You can still read in the literature today that the squid giant axon's spike is highly energetically inefficient, but that claim actually refers to the HH model; with a better computer, the squid's spike would have turned out to be more efficient!
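For reference, this is the standard current-balance equation of the isopotential Hodgkin-Huxley model; the exponents 3 and 4 on the gating variables $m$ and $n$ are the "numbers of gates" in question, fitted to the recorded spike shape rather than derived from channel structure:

$$C_m \frac{dV}{dt} = \bar{g}_{\mathrm{Na}}\, m^3 h\,(E_{\mathrm{Na}} - V) + \bar{g}_{\mathrm{K}}\, n^4\,(E_{\mathrm{K}} - V) + g_L\,(E_L - V) + I$$

where each gating variable $x \in \{m, h, n\}$ follows first-order kinetics, $dx/dt = \alpha_x(V)(1 - x) - \beta_x(V)\,x$.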

Then, one must realize that there is no such thing as “the sodium channel”. A sodium channel consists of a main unit, called the alpha subunit, of which there are 9 subtypes (corresponding to 9 genes), plus a number of beta subunits. The function of the channel depends on the subtype and on which beta subunits are expressed. But that is not all. The alpha subunits can also take different forms through alternative splicing. More importantly, their function can be modulated in many ways; for example, they can be phosphorylated by enzymes, which can change every aspect of their kinetics (activation and inactivation curves and dynamics). All of this depends not only on cell type (brain area, etc.) but on the individual cell, and it can vary over time. To top it all off, the properties of different channels (say sodium and potassium) are also co-tuned within cells (not surprisingly, since signaling pathways such as PKA modulate many types of channels). So when you measure channel properties in an oocyte, or simply in different cells, and put things together, there is close to zero probability that you will get a model that produces normal function – not to mention the subcellular localization of channels, which is highly heterogeneous. This is why, in the end, all models are the result of some more or less laborious hand-tuning. In general, the more detailed the model, the more laborious the tuning, and the more likely the result is inaccurate. I recently published a review where I argued that the integrate-and-fire model is actually more realistic than isopotential Hodgkin-Huxley models, but there is a general point: the more detailed a model, the more likely it is to be plain wrong. Worse, detailed models are likely to be wrong in unpredictable ways.

Model databases and standards promote the illusion that realistic models can be built simply by reusing bits of previous models, an idea that rests on an overly reductionist view of living systems, where a systemic view should prevail. This leads to the spread of Frankenstein models carrying a dangerous illusion of realism.

What should we do about it? I still think that sharing models is good and should be encouraged. But how do we avoid the spread of Frankenstein models? There are at least two directions I can think of. First, the way models are communicated and shared: it should be about science, not about computers. If all we do is load a file into our software tool and never actually look at the model itself, then we're doomed. There is a trend of developing standards for models, so as to facilitate exchange. I think these standards should not go beyond the mathematical formulation of the models, for otherwise we throw away all the context in which the models are supposed to be meaningful. This is also the reason why, in the Brian simulator, we insist that models are specified by their equations, and not by pre-designed components (e.g. “the sodium channel”); a minimal sketch of what this looks like is given below. In this way it is clear what specific model is used, and we do not introduce modeling biases. The second direction is to educate scientists about the epistemological and scientific questions in modeling. But that won't be easy!
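To illustrate the equation-based approach, here is a minimal sketch using Brian 2. The equations (a leak plus a much simplified, non-inactivating sodium-like current with instantaneous activation) and all parameter values are hypothetical, chosen only to show that every assumption of the model appears explicitly in the equations string rather than being hidden inside a pre-built “sodium channel” component:

```python
from brian2 import *

# Hypothetical parameters, for illustration only (not from any measurement)
C = 200*pF                      # membrane capacitance
gL = 10*nS; EL = -70*mV         # leak conductance and reversal potential
gNa = 0.5*uS; ENa = 50*mV       # maximal "sodium" conductance and reversal
va = -40*mV                     # half-activation voltage (hand-picked)
ka = 6*mV                       # activation slope (hand-picked)

# The model is defined entirely by its equations: a leak current plus a
# simplified, non-inactivating sodium-like current with instantaneous
# activation m_inf(v). Every assumption is visible in this string.
eqs = '''
dv/dt = (gL*(EL - v) + gNa*m_inf**3*(ENa - v) + I)/C : volt
m_inf = 1/(1 + exp((va - v)/ka)) : 1
I : amp
'''

neuron = NeuronGroup(1, eqs, method='rk4')
neuron.v = EL

mon = StateMonitor(neuron, 'v', record=0)
neuron.I = 300*pA               # constant current injection
run(50*ms)

print(mon.v[0][-1])             # final membrane potential
```

The point is not that this toy model is any good (it is not; the activation curve and every parameter are invented), but that a reader can see at a glance exactly which equations and assumptions are being simulated.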

One thought on “Frankenstein models and the double-edged sword of modelling tools”

  1. Great article and points. I'd suggest that each time a modelling paper is published, a standard set of measures of simple model dynamics also be included. For example, what does a single spike under minimally sufficient activation look like numerically? And perhaps a few other setups and regimes as well.
