Monthly archives: December 2025

The philosophical misconception behind the LLM cult (or why LLMs will always bullshit)

There is this idea that if a large language model (LLM) is trained on a large corpus of text, then it knows whatever knowledge is in that corpus. Improving the performance is then essentially a matter of scaling: as you expand the database, you expand knowledge, assuming the statistical model is fine-grained enough (i.e., has enough parameters). You can then ask questions and the LLM will answer according to the knowledge in the corpus.

This might sound like a reasonable view, but it is based on a misconception about the nature of knowledge. Indeed, an implicit assumption is that the corpus is logically consistent. But what if it contains a proposition as well as its contradiction, for example “the Earth is round” and “the Earth is flat”? In that case, the trained LLM cannot produce consistent answers; it will answer differently depending on how it is cued – an annoying experience that many users are familiar with.

The natural answer would be to build a high-quality corpus, e.g. by selecting academic textbooks, rather than conspiracy theories. Unfortunately, this is a naïve view of scientific knowledge, one that has been thoroughly debunked by over a century of philosophy of science. It is the view that science is a linear accumulative process: you add observations, and you add deductions, and if you check those assertions, then you get a consistent, certified, corpus of knowledge. Knowledge, then, is constituted of propositions that derive directly from observations, plus what you can logically deduce from those (this is essentially logical positivism). It follows that, if you add a book to a corpus of books, you necessarily increase knowledge, by exactly one book (assuming there is no redundancy).

As intuitive as it might sound, this view is utterly false. It has been shattered on historical grounds by Thomas Kuhn (see also Hasok Chang for more recent work), and on philosophical grounds by philosophers such as Lakatos and Quine. In science, theories get superseded by other theories that contradict them. At any given time, different theories coexist, along with diverging interpretations of facts. Science is a debate. Human knowledge is contradictory, and science is about trying to resolve those contradictions, not about accumulating true propositions. Any working scientist knows that every field is full of paradoxes, internal contradictions and diverging views. It follows that no scientific corpus is internally consistent.

If you build a statistical model of an inconsistent corpus, you do not resolve those contradictions. Instead, what happens is that, depending on context (the prompt), the model will predict one thing or its contrary, possibly within the same session, with apparent confidence – indeed, if you merge two confident propositions, you get a confident contradiction, not doubt. An LLM will always bullshit. Scaling alone (whether of the corpus or of the model) cannot solve this problem.
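The mechanism can be illustrated with a toy n-gram model – a deliberately minimal sketch trained on a hypothetical two-sentence corpus (a real LLM is vastly more sophisticated, but the statistical point is the same). With a short context, the model simply splits its probability between the contradictory continuations; with a longer context that happens to include a cue, it confidently predicts one claim or the other:

```python
from collections import Counter, defaultdict

# A hypothetical miniature corpus containing a proposition and its contradiction,
# each embedded in a different context.
corpus = [
    "nasa reports the earth is round",
    "the pamphlet claims the earth is flat",
]

def train(corpus, n):
    """Count which word follows each (n-1)-word context in the corpus."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.split()
        for i in range(len(words) - n + 1):
            context, nxt = tuple(words[i:i + n - 1]), words[i + n - 1]
            counts[context][nxt] += 1
    return counts

def predict(counts, context):
    """Probability distribution over the next word, given a context tuple."""
    total = sum(counts[context].values())
    return {w: c / total for w, c in counts[context].items()}

trigrams = train(corpus, 3)
# Short context: the contradiction is unresolved, both continuations equally likely.
print(predict(trigrams, ("earth", "is")))  # {'round': 0.5, 'flat': 0.5}

fivegrams = train(corpus, 5)
# Longer context containing a cue: the model is confident, either way.
print(predict(fivegrams, ("reports", "the", "earth", "is")))  # {'round': 1.0}
print(predict(fivegrams, ("claims", "the", "earth", "is")))   # {'flat': 1.0}
```

Scaling this up – more data, a richer model – changes which cues select which continuation, but it cannot manufacture a consistency that the corpus does not have.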

No, chatbots are not scientists

I can’t believe I need to write this, but apparently, I do. Many journals and preprint sites are now overwhelmed with chatbot-generated submissions. This is bad, but at least it gets characterized as fraud, something that we need to defend against. What I find much more worrying is that respectable scientists don’t seem to see the problem with generating part of their papers with a chatbot, and that journals, as they struggle to find reviewers, are seriously considering using chatbots to review papers. This is usually backed up by anthropomorphic discourse, such as calling a chatbot a “PhD-level AI”. This is to be expected from the CEO of a chatbot company, but unfortunately, it is not unusual to hear colleagues describe chatbots as some sort of “interns”. The rationale is that what the chatbot produces looks like a text that a good intern could produce. Or that the chatbot “writes better than me”. Or that it “knows” much more than the average scientist about most subjects (does an encyclopedia “know” more than you too?).

First of all, what’s a chatbot? Of course, I am referring to large language models (LLMs), which are statistical models of text trained on very large text databases. An LLM is not about truth, but about what is written in the database (whether right, wrong, invented or nonsensical), and it is certainly not about personhood. But the term chatbot emphasizes the deceptive dimension of this technology: it is a statistical model conceived in such a way as to fool the user into believing that an actual person is talking. It is the intersection of advanced statistical technology and bullshit capitalism. We have become familiar with bullshit capitalism: a mode of financing based not on the revenues that can be reasonably anticipated from a well-conceived business plan, but on closed-loop speculation about the short-term explosion of the share value of a company that sells nothing, based on a “pitch”. Thus, funders are apparently perfectly fine with a CEO explaining that his business plan is to build a superhuman AI and ask it to come up with an idea. It sounds like a joke. But he never clearly explained how he would make revenue, so it is not really a joke.

Scientists should not fall for that. Chatbots are essentially statistical models. No one is actually speaking or taking responsibility for what is being written. The argument that chatbot-generated text resembles scientist-generated text is a tragic misunderstanding of the nature of science. Doing science is not producing text that looks sciency. A PhD is not about learning to do calculations, to code, or to develop sophisticated technical skills (although these are obviously an important part of the training). It is about learning to prove (or disprove) claims. At its core, it is the development of an ethical attitude to knowledge.

I know, a number of scientists who have read or heard of a couple of 1970s–1980s classics in philosophy or sociology of science will object that truth doesn’t exist, or even that it is an authoritarian ideal, and that it’s all conventions. Well, at a time when authoritarian politicians are claiming that scientific discourse has no more value than political opinion, we should perhaps pause and reflect a little on that position. First of all, it is self-defeating: if it is true, then maybe it’s not true. This alone should make one suspect that something is wrong with it. Sure, truth doesn’t exist, in the sense that any general statement can always potentially be contradicted by future observations. In science, general claims are provisional. Sure. But consistency with current observations and theories does exist, and wrongness certainly does exist too. And sure, scientific claims are necessarily expressed in a certain theoretical context, and this context can always be challenged. But on what basis do we challenge theories and claims? Obviously, on the basis of whether we think the theories are incorrect, partial or misleading – that is, on the basis of epistemic norms. Don’t call it “truth” if you want to sound philosophically aware, but we are clearly in the same lexical field.

So, “truth doesn’t exist” is a fine provocative slogan, but certainly a misleading one, unless its meaning is carefully unpacked. Science is all about arguing, challenging claims with arguments, backing up theories with reasoning and experiments, looking for loopholes in reasoning, and generally about demonstrating. Therefore, what defines scientific work is not the application of specific methods and procedures (these differ widely between fields), but an ethical commitment: a commitment to an ideal of truth (or “truth”, if you prefer). This is what a PhD student is supposed to learn: to back up each of their claims with arguments; to challenge the claims of others, or to look for loopholes in their own arguments; to try to resolve apparent contradictions; to think of what might support or disprove a position.

It should be obvious, then, that to write science is not to produce text that simply looks like a scientific text. The scientific text must reflect the actual reasoning of the scientist, which reflects their best efforts to demonstrate claims. This is precisely what a statistical model cannot do. Everyone has noticed that a chatbot can be cued to claim one thing and its contrary within a few sentences. Nothing surprising there: it is very implausible that everything written on the internet is internally consistent, so a good statistical model of that text will never produce consistent reasoning.

Let us now look at some concrete use cases of chatbots in science. Given the preceding remarks, the worst possible case I can see is reviewing. No, a statistical model is not a peer, and no, it doesn’t “reason” or “think critically”. Yes, it can generate sentences that look like reasoning or criticism. But it could be right, it could be wrong – who knows? I hear the argument that human reviews are often pretty bad anyway. What kind of argument is that? Since mediocre reviewing is the norm, why not just generalize it? The scientific ecosystem has already been largely sabotaged by managerial ideologies and publishing sharks, so let’s just finish the job? If that is the argument, then let’s just give up on science entirely.

Another use case: generating the text of your paper. It’s not as bad, but it’s bad. Of course, there are degrees. I can imagine that, like myself, one may not be a native English speaker, and some technological help to polish the language could be useful (I personally don’t use it for that, because I find even the writing style of a chatbot awful). But the temptation is great to use it to turn a series of vague statements into nice-sounding prose. The problem is that, by construction, whatever was not in the prompt is made up. The chatbot does not know your study; it does not know the specific context of your vague statements. It can be a struggle to turn “raw results” into scientific text, but that is largely because it takes work to make explicit all the implicit assumptions you make, to turn your intuitions into sound logical reasoning. If you did not explicitly put it in the prompt in the first place, then the statistical model makes it up – there’s no magic. It may sound good, but it’s not science.

Even worse is the use of a chatbot to write the introduction and discussion. Many people find it hard to write those parts. They are right: it is hard. This is because this is where the results get connected to the whole body of knowledge, where you try to resolve contradictions or reinterpret previous results, where you must use scholarship. It is particularly hard for students because it requires experience and broad reading. But it is in making those connections, by careful contextualized argumentation, that the web of knowledge gets extended. Sure, this is not always done as it should be. But scientists should work on that skill, not boost the productivity of mediocre writing.

One might object that there is already much story-telling in scientific papers written by humans, especially in the “prestigious” journals, and especially in biology (not to mention neuroscience, which is even worse). Well, yes, but that is a problem to solve, not something we should amplify by automation!

Let me briefly comment on other uses of this technology. One is to generate code. This can be helpful, say, to quickly generate a user interface, or to find the right commands to make a specific plot. This is fine when you can easily tell whether the code is correct or not – looking for the right syntax for a given command is such a use case. But it starts getting problematic when you use it to perform some analysis, especially when the analysis is not standard. There is no guarantee whatsoever that the analysis is done correctly, other than by checking yourself, which requires understanding it. So, in a scientific context, I anticipate that this will cause some issues. When I review a paper, I rarely check the code (it is usually not available anyway) to make sure it does what the paper claims. I trust the authors (unless of course some oddity catches my attention). Some level of trust is inevitable in peer review. I can see the temptation of a programming-averse biologist to just ask a chatbot to do their analyses, rather than looking for technical help. But the result of that is likely to be a rise in the rate of hard-to-spot technical errors.

Another common use is bibliographic search. Some tools are in fact quite powerful, provided you understand that you are dealing with a sophisticated search engine, not an expert who summarizes the findings of the literature or of individual papers. For example, I could use one to look for a pharmacological blocker of an ion channel, which will generally not be the main topic of the papers that use it. The model will output a series of matching papers. In general, the generated description of those papers is pretty bad and untrustworthy. But if the references are correct, I can just look up the papers and check for myself. It is basically one way to do content-based search and should be treated as such, as a complement to other methods (looking for reviews, following the tree of citations, etc.).

In summary, no, chatbots are not scientists, not even baby scientists: science is (or at least should be) about proving what you claim, not about producing sciency text. Science is an ethical attitude to knowledge, not a bag of methods, and only persons have ethics. Encouraging scientists to write their papers with a chatbot, or worse, automating reviewing with chatbots, is an extremely destructive move and should not be tolerated. It is not the solution to the problems that science currently faces. The solution to those problems is political, not technological, and we know it. And please, please, my dear fellow scientists, stop with the anthropomorphic talk. It’s an algorithm, not a person, and you should know it.

All this comes in addition to the many other ethical issues that so-called AI raises, on which a number of talented scholars have written at length (a few pointers: Emily Bender, Abeba Birhane, Olivia Guest, Iris van Rooij, Melanie Mitchell, Gary Marcus).