Category Archives: AI

No, chatbots are not scientists

I can’t believe I need to write this, but apparently, I do. Many journals and preprint sites are now overwhelmed with chatbot-generated submissions. This is bad, but at least it gets characterized as fraud, something that we need to defend against. What I find much more worrying is that respectable scientists don’t seem to see the problem with generating part of their papers with a chatbot, and journals are seriously considering using chatbots to review papers, as they struggle to find reviewers. This is usually backed up by anthropomorphic discourse, such as calling a chatbot “PhD level AI”. This is to be expected from the CEO of a chatbot company, but unfortunately, it is not unusual to hear colleagues describe chatbots as some sort of “interns”. The rationale is that what the chatbot produces looks like a text that a good intern could produce. Or that the chatbot “writes better than me”. Or that it “knows” much more than the average scientist about most subjects (does an encyclopedia “know” more than you too?).

First of all, what’s a chatbot? Of course, I am referring to large language models (LLMs), which are statistical models of text trained on very large text databases. An LLM is not about truth, but about what is written in the database (whether right, wrong, invented or nonsensical), and it is certainly not about personhood. But the term chatbot emphasizes the deceptive dimension of this technology, namely that it is a statistical model designed to fool the user into believing that an actual person is talking. It is the intersection of advanced statistical technology and bullshit capitalism. We have become familiar with bullshit capitalism: a mode of financing based not on revenues that can be reasonably anticipated from a well-conceived business plan, but on closed-loop speculation about the short-term explosion of the share value of a company that sells nothing, based on a “pitch”. Thus, funders are apparently perfectly fine with a CEO explaining that his plan to generate revenue is to build a superhuman AI and ask it to come up with an idea. It’s a joke, right? Except he never clearly explained how he would actually make money, so not really a joke.

Scientists should not fall for that. Chatbots are essentially statistical models. No one is actually speaking or taking responsibility for what is being written. The argument that chatbot-generated text resembles scientist-generated text is a tragic misunderstanding of the nature of science. Doing science is not producing text that looks sciency. A PhD is not about learning to do calculations, to code, or to develop sophisticated technical skills (although that is obviously an important part of the training). It is about learning to prove (or disprove) claims. At its core, it is the development of an ethical attitude to knowledge.

I know, a number of scientists who have read or heard of a couple of 1970s-1980s classics in philosophy or sociology of science will object that truth doesn’t exist, or even that it is an authoritarian ideal, and that it’s all conventions. Well, at a time when authoritarian politicians are claiming that scientific discourse has no more value than political opinions, we should perhaps pause and reflect a little on that position. First of all, it’s a self-defeating position: if it is true, then maybe it’s not true. This alone should make one realize that there might be something wrong with it. Sure, truth doesn’t exist, in the sense that any general statement can always potentially be contradicted by future observations. In science, general claims are provisional. Sure. But consistency with current observations and theories does exist, and wrongness certainly does exist too. And sure, scientific claims are necessarily expressed in a certain theoretical context, and this context can always be challenged. But on what basis do we challenge theories and claims? Obviously, on the basis of whether we think they are incorrect, partial or misleading, that is, on the basis of epistemic norms. Don’t call it “truth” if you want to sound philosophically aware, but we are clearly in the same lexical field.

So, “truth doesn’t exist” is a fine provocative slogan, but certainly a misleading one, unless its meaning is carefully unpacked. Science is all about arguing, challenging claims with arguments, backing up theories with reasoning and experiments, looking for loopholes in reasoning, and generally about demonstrating. Therefore, what defines scientific work is not the application of specific methods and procedures (these differ widely between fields), but an ethical commitment: a commitment to an ideal of truth (or “truth”, if you prefer). This is what a PhD student is supposed to learn: to back up each of their claims with arguments; to challenge the claims of others, or to look for loopholes in their own arguments; to try to resolve apparent contradictions; to think of what might support or disprove a position.

It should be obvious, then, that to write science is not to produce text that merely looks like a scientific text. The scientific text must reflect the actual reasoning of the scientist, which reflects their best effort to demonstrate claims. This is precisely what a statistical model cannot do. Everyone has noticed that a chatbot can be cued to claim one thing and its contrary within a few sentences. Nothing surprising there: it is very implausible that everything written on the internet is internally consistent, so a good statistical model of that text will never produce consistent reasoning.

Let us now look at some concrete use cases of chatbots in science. Given the preceding remarks, the worst possible case I can see is reviewing. No, a statistical model is not a peer, and no, it does not “reason” or “think critically”. Yes, it can generate sentences that look like reasoning or criticism. But it could be right, it could be wrong, who knows. I hear the argument that human reviews are often pretty bad anyway. What kind of argument is that? Since mediocre reviewing is the norm, why not just generalize it? The scientific ecosystem has already been largely sabotaged by managerial ideologies and publishing sharks, so let’s just finish the job? If that’s the argument, then let’s just give up science entirely.

Another use case: generating the text of your paper. It’s not as bad, but it’s bad. Of course, there are degrees. I can imagine that, like myself, one may not be a native English speaker, and some technological help to polish the language could be useful (I personally don’t use it for that, because I find even the writing style of a chatbot awful). But the temptation is great to use it to turn a series of vague statements into nice-sounding prose. The problem is that, by construction, whatever was not in the prompt is made up. The chatbot does not know your study; it does not know the specific context of your vague statements. It can be a struggle to turn “raw results” into scientific text, but that is largely because it takes work to make explicit all the implicit assumptions you make, to turn your intuitions into sound logical reasoning. If you did not explicitly put that work into the prompt in the first place, then the statistical model makes it up; there is no magic. It may sound good, but it’s not science.

Even worse is the use of a chatbot to write the introduction and discussion. Many people find it hard to write those parts. They are right: it is hard. That is because this is where the results get connected to the whole body of knowledge, where you try to resolve contradictions or reinterpret previous results, where you must use scholarship. It is particularly hard for students because it requires experience and broad scholarship. But it is by making those connections, through careful, contextualized argumentation, that the web of knowledge gets extended. Sure, this is not always done as it should be. But scientists should work on that skill, not improve the productivity of mediocre writing.

One might object that there is already a lot of story-telling in scientific papers currently written by humans, especially in the “prestigious” journals, and especially in biology (not to mention neuroscience, which is even worse). Well yes, but that is obviously a problem to solve, not something we should amplify by automation!

Let me briefly comment on other uses of this technology. One is generating code. This can be helpful, say, to quickly generate a user interface, or to find the right commands to make a specific plot. It is fine when you can easily tell whether the code is correct or not; looking up the right syntax for a given command is one such case. But it starts getting problematic when you use it to perform an analysis, especially a non-standard one. There is no guarantee whatsoever that the analysis is done correctly, other than checking it yourself, which requires understanding it. So, in a scientific context, I anticipate that this will cause problems. When I review a paper, I rarely check the code (it is usually not available anyway) to make sure it does what the paper claims. I trust the authors (unless, of course, some oddity catches my attention). Some level of trust is inevitable in peer review. I can see the temptation for a programming-averse biologist to just ask a chatbot to do their analyses rather than look for technical help. But the likely result is a rise in the rate of hard-to-spot technical errors.
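
To make the worry concrete, here is a hypothetical sketch of the kind of error that runs without complaint and looks like a standard analysis, yet is statistically wrong (pseudoreplication: trials pooled across subjects and treated as independent samples). The numbers and setup are invented for illustration only; the point is that generated analysis code still has to be understood and checked.

```python
# Hypothetical illustration: a "standard" t-test that silently commits
# pseudoreplication. The code runs fine and looks reasonable at a glance.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Two groups, 5 subjects each, 50 trials per subject.
# Subject means vary (between-subject noise), but the two groups share
# the same population mean: there is no true group difference.
group_a = np.concatenate([rng.normal(m, 1.0, 50) for m in rng.normal(0.0, 1.0, 5)])
group_b = np.concatenate([rng.normal(m, 1.0, 50) for m in rng.normal(0.0, 1.0, 5)])

# Treating all 250 trials per group as independent samples inflates the
# effective sample size, so p-values are often spuriously small.
t, p = stats.ttest_ind(group_a, group_b)
print(f"t = {t:.2f}, p = {p:.3g}")   # the correct unit of analysis is the subject
```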

Another common use is bibliographic search. Some tools are in fact quite powerful, if you understand that you are dealing with a sophisticated search engine, not with an expert who summarizes the findings of the literature or of individual papers. For example, I could use one to look for a pharmacological blocker of an ion channel, which will generally not be the main topic of the papers that use it. The model outputs a series of matching papers. In general, the generated description of those papers is pretty bad and not trustworthy. But if the references are correct, I can just look up the papers and check for myself. It is basically one way to do content-based search, and it should be treated as such, as a complement to other methods (looking for reviews, following the tree of citations, etc.).

In summary, no, chatbots are not scientists, not even baby scientists: science is (or at least should be) about proving what you claim, not about producing sciency text. Science is an ethical attitude to knowledge, not a bag of methods, and only persons have ethics. Encouraging scientists to write their papers with a chatbot, or worse, automating reviewing with chatbots, is an extremely destructive move and should not be tolerated. It is not the solution to the problems that science currently faces. The solution to those problems is political, not technological, and we know it. And please, please, my dear fellow scientists, stop with the anthropomorphic talk. It’s an algorithm, not a person, and you should know it.

All this comes in addition to the many other ethical issues that so-called AI raises, on which a number of talented scholars have written at length (a few pointers: Emily Bender, Abeba Birhane, Olivia Guest, Iris van Rooij, Melanie Mitchell, Gary Marcus).

Notes on consciousness. (XI) Why large language models are not conscious

The current discussions about AI are plagued with anthropomorphism. Of course, the name “artificial intelligence” probably has something to do with it. A conscious artificial intelligence might sound like science fiction, but a sentient statistical model certainly sounds a bit more bizarre. In this post, I want to address the following question: are large language models (LLMs), such as ChatGPT, or more generally deep learning models, conscious? (Perhaps just a little bit?)

The first thing to realize is that the question of what it takes for something to be conscious is a very old one, which has been the subject of considerable work in the past. The opinions and arguments that we read today are not original; very often, they unknowingly repeat common arguments that have been discussed at length before. It might seem to the general public or to AI engineers that the spectacular successes of modern AI programs shed new light on these old questions, but that is not really the case. These spectacular programs already existed 50 years ago, just as thought experiments, which makes no difference at all for logical arguments. For example, in 1980, Searle articulated an argument against the idea that a program that seems to behave like a human would necessarily have a mind. The argument was based on a thought experiment in which an average English-speaking person applies the operations of a sophisticated program that takes a question written in Chinese as input and outputs an answer in Chinese. Searle argues that while that person is able to have a conversation in Chinese by mechanically applying rules, she does not understand Chinese. Therefore, he concludes, understanding cannot be identified with the production of acceptable answers. Now such programs exist, and their existence changes nothing about either the argument or the objections that were raised at the time. All these discussions were precisely based on the premise that such systems would exist.

This is an important preamble: a lot has been said and written about consciousness, what it means, what it takes for something to be conscious, and so on. Answers to these questions are to be found not in current AI research but in the philosophy literature, and whether a program produces convincing answers to questions, or can correctly categorize images, has very little bearing on any of these discussions. A corollary is that AI engineering skills provide no particular authority on the subject of AI consciousness, and indeed, as far as I can tell, many arguments we hear in the AI community on this subject tend to be rather naïve, or at least not original and with known weaknesses.

Another important remark is that, while there is a lot of literature on the subject, there is no consensus whatsoever in the specialized community on these questions. Views on consciousness range from “consciousness does not exist” (eliminativism/illusionism) to “everything is conscious” (panpsychism, in its various flavors). In my view, there is no consensus because there is currently no convincing theory of consciousness (some say that there can be no theory of consciousness, which is another proposition on which there is no consensus). There are good reasons for this state of affairs, which I have scratched at here and there (and others have too, obviously).

This remark is important because I have seen it become an argument for why a particular artefact might be conscious: we don’t know what consciousness is exactly, or what it takes for something to be conscious, therefore we cannot rule out the possibility that something in particular is conscious.

However, this is a fallacy. We do not know what it takes for an entity to be conscious, but we certainly do know that certain entities cannot be conscious, if the meaning we give to this notion is to bear even a vague resemblance to its usual meaning.

Now, I will make a few general points against the idea that LLMs, or more generally formal deep neural networks, are conscious, by discussing the concept of a “conscious state” (i.e., “what it feels like now”).

Once we remove the buzzwords, such as “intelligence”, “learning”, etc., a modern deep learning model is essentially a massively parallel differentiable program. In effect, it is a tensor calculator, the state of which is updated by a series of iterations.
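
To make the “tensor calculator” description concrete, here is a minimal sketch (toy dimensions, no particular model implied) of what a forward pass reduces to once the vocabulary of “intelligence” is stripped away:

```python
# A deep network forward pass, stripped to its essentials:
# a sequence of tensor operations (matrix products and nonlinearities).
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(16)            # an input vector: just numbers
W1 = rng.standard_normal((32, 16))     # "layer 1" weights
W2 = rng.standard_normal((8, 32))      # "layer 2" weights

h = np.maximum(W1 @ x, 0.0)            # matrix product, then ReLU
y = W2 @ h                             # another matrix product
print(y)                               # numbers out; any meaning is ours
```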

The default assumption in the computational view of mind is that a mental state is something like a program state. But a conscious state is a very particular kind of state. First of all, a conscious state has a subject (the entity that is conscious) and an object (what it is about). Both relate to well-known topics in philosophy of mind, namely the unity of consciousness and the intentionality of consciousness.

When we wonder whether something is conscious, that thing is typically an organism, or a machine, or even, if you adopt panpsychism, a rock or an atom. But we could consider many other combinations of molecules and ask whether they are conscious. How about the entity made of a piece of my finger (still attached to my body) plus two coins in a fountain in Rome? This seems absurd, but why? There is a reason: two objects that do not interact at all have the same properties whether we look at them one by one or together. Properties over and above those of the individual elements can only arise if there is some kind of interaction between them. So, if neither my finger nor any of the coins is conscious, then the combination is not conscious either. Thus, we can say that a necessary condition for a set of components to constitute a subject is that there is some causal integration between the components. This is actually the basis of one particular theory of consciousness, Integrated Information Theory (which I have criticized here, there and there, essentially because a necessary condition is not a sufficient condition). If a deep learning network is conscious, then which layer is the subject?

I leave this tricky question hanging there to address the more critical ones (but in case you want to dig, look up these keywords: autonomy of the living, biological organism, biological agency). One is the question of intentionality, the fact that a conscious state is about something: I am conscious of something in particular. A large language model is a symbol-processing system, and the problem is that it is humans who give meaning to the symbols. The program is fed with sequences of bits and outputs sequences of bits. If one asks an LLM “What is the color of a banana?” and the LLM replies “yellow”, does the program understand what a banana is? Clearly, it doesn’t have the visual experience of imagining a banana. It has never seen a banana. It doesn’t experience the smell or touch of a banana. All it does is output 0100, which we have decided stands for the word banana, when we input a particular stream of numbers. But the particular numbers are totally arbitrary: we could choose to assign a different sequence of numbers to each of the words, and we would still interpret the computer’s output in exactly the same way, even though the program would now have different inputs and outputs. So, if the program has a conscious experience, then it is about nothing in particular: therefore, there is no conscious experience at all.
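
Here is a minimal sketch of that arbitrariness (a hypothetical toy encoding, not any real tokenizer): permuting the word-to-number mapping changes every input and output the program handles, yet we would read its behaviour identically after applying the same permutation, because the meaning lives entirely in our mapping.

```python
# Toy illustration: the integers assigned to words are arbitrary labels.
vocab = {"what": 0, "is": 1, "the": 2, "color": 3,
         "of": 4, "a": 5, "banana": 6, "yellow": 7}

def encode(words, mapping):
    return [mapping[w] for w in words]

# An equally valid, permuted assignment of numbers to the same words.
permuted = {w: (i + 3) % len(vocab) for w, i in vocab.items()}

question = "what is the color of a banana".split()
print(encode(question, vocab))      # [0, 1, 2, 3, 4, 5, 6]
print(encode(question, permuted))   # [3, 4, 5, 6, 7, 0, 1] -- just as good
# Neither sequence contains anything about bananas; only the human-chosen
# mapping connects the numbers to the word, and the word to the fruit.
```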

This is known as the “symbol grounding problem”, again a well-known problem (which I have examined in the context of “neural codes”). Many people consider that a necessary ingredient has to do with embodiment, that is, the idea that the machine has to have a body interacting with the world (but again, careful: necessary does not mean sufficient). This seems to be Yann LeCun’s position, for example. Again, it is not clear at all what it takes for something to be conscious, but it is clear that there can be no conscious experience at all unless that experience is somehow about something, and so the symbol grounding problem must be addressed.

These are well-known problems in philosophy of mind. The final point I want to discuss here is more subtle. It relates to the notion of “state”. In computationalism (more generally, functionalism), a conscious state is a sort of program state. But there is a big confusion here. Both terminologies use the word “state”, but the meanings are completely different. A state of mind, an experience, is not at all a “state” in the physical sense of pressure, volume and so on, that is, the configuration of a system. Consider a visual experience. Can it be the case that to see something is to have the brain in a certain state? For example, some neurons in the inferotemporal cortex fire when a picture of Jennifer Aniston is presented: is the visual experience of Jennifer Aniston the same thing as the active state of that neuron? If the answer is yes, then why should you have the experience, rather than that neuron? Otherwise, perhaps the neuron triggers the experience, but then we need to assign the experience to the state of some downstream neurons, and we face a problem of infinite regress.

The issue is that an experience is simply not a physical state; to treat it as such is a category error. To see this, consider this thought experiment, which I called the “Bewitched thought experiment” (self-quote):

In the TV series Bewitched, Samantha the housewife twitches her nose and everyone freezes except her. Then she twitches her nose and everyone unfreezes, without noticing that anything happened. For them, time has effectively stopped. Was anyone experiencing anything during that time? According to the encoding view of conscious experience, yes: one experiences the same percept during the entire time, determined by the unchanging state of the brain. But this seems wrong, and indeed in the TV series the characters behave as if there had been no experience at all during that time.

It would seem bizarre that people experience something while their brain state is held fixed. That is because to experience is an activity, not a thing. Therefore, a system cannot experience something just by virtue of being in some state. The vector of activation states of a neural network is not a conscious “state”. It is not even false: it is a category error. So, when a deep network outputs a symbol that we associate with Jennifer Aniston, it does not actually see Jennifer Aniston. Having the output “neuron” in a particular state is not an experience, let alone an experience of seeing Jennifer Aniston.

All these remarks, which tend to show that the application of tensor calculus does not produce conscious experience, can be perplexing because it is hard to imagine what else could possibly produce conscious experience. Current “artificial intelligence” is our best shot so far at mimicking consciousness, and I am saying that it is not even a little bit conscious. So, what else could consciousness be then?

My honest answer is: I don’t know. Will we know someday? I also don’t know. Would it be possible, hypothetically, to build a conscious artefact? Maybe, but I don’t know (why not? I don’t know; how, then? I also don’t know). I wish I knew, but I don’t find it particularly bizarre that certain things are still unexplained and unclear. It doesn’t mean that there is nothing interesting to be said about consciousness.

But in any case, I want to point out that the “what else” argument is a self-destructive argument. Asking “what else?” is just admitting that you have no logical argument that allows you to prove your proposition. This is not an argument, but rather an admission of failure (which is fine).

Anyway: I don’t know what it takes for something to be conscious, but I think we can be fairly confident that LLMs or deep networks are not conscious.

On the existential risks of artificial intelligence

The impressive progress in machine learning has revived the fear that humans might eventually be wiped out or enslaved by artificial superintelligences. This is hardly a new fear. It is, for example, the basis of most of Isaac Asimov’s robot stories, in which robots are built with three laws designed to protect humans.

My point here is not to demonstrate that such events are impossible. On the contrary, my point is that autonomous human-made entities already exist and pose exactly the risks that AI alarmists are talking about, except that those risks are real. In this context, evil AI fantasies are an anthropomorphic distraction.

Let me quickly dismiss some misconceptions. Does ChatGPT understand language? Of course not. Large language models are (essentially) algorithms tuned to predict the next word. But here we don’t mean “word” in the human sense. In the human sense, a word is a symbol that means something. In the computer sense, a word is just a symbol, to which we humans attribute meaning. When ChatGPT talks about bananas, it has no idea what a banana tastes like (well, it has no idea, period). It has never seen a banana or tasted a banana (well, it has never seen or tasted anything). “Banana” is just a node in a big graph of other nodes, totally disconnected from the outside world, and in particular from whatever “banana” might actually refer to. This is known in cognitive science as the “symbol grounding problem”, and it is a difficult problem that LLMs do not solve. So, maybe LLMs “understand” language, but only if you are willing to define “understand” in such a way that knowing what words mean is not required.
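
As a minimal sketch of what “predicting the next word” means at the level of symbols, here is a toy bigram counter. It is nothing like the architecture of an actual LLM, but the spirit is the same: statistics over symbols, with no meaning anywhere in the loop.

```python
# Toy next-word predictor: pure statistics over symbols, no meaning involved.
from collections import Counter, defaultdict

corpus = "the banana is yellow . the sky is blue . the banana is sweet .".split()

# Count which word follows which in the training text.
counts = defaultdict(Counter)
for word, nxt in zip(corpus, corpus[1:]):
    counts[word][nxt] += 1

def predict_next(word):
    """Return the most frequent continuation seen after `word`."""
    return counts[word].most_common(1)[0][0]

print(predict_next("banana"))   # 'is' -- chosen by frequency, not by knowing
                                # anything about bananas
```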

Machine learning algorithms are not biological organisms: they do not perceive, they are not conscious, they do not have intentions in the human sense. But it doesn’t matter. The broader worry about AI is simply that these algorithms are generally designed to optimize some predefined criterion (e.g., prediction error), and if we give them very powerful means to do so, in particular means that involve real actions in the world, then who knows whether using those means might not be harmful to us? At some point, without necessarily postulating any kind of evil mind, we humans might become mere means in the achievement of some optimization criterion. We build technical goals into the machine, but it is very difficult to ensure that those goals are aligned with human values. This is the so-called “alignment” problem.
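
A minimal sketch of “optimizing a predefined criterion” (a hypothetical toy loss, no particular system implied): the procedure drives the criterion down, and anything not represented in that criterion simply does not exist for it.

```python
# Gradient descent minimizes whatever loss it is given, nothing more.
def loss(theta):
    return (theta - 3.0) ** 2       # the designer's criterion, and only that

def grad(theta):
    return 2.0 * (theta - 3.0)      # derivative of the criterion

theta = 0.0
for _ in range(200):
    theta -= 0.1 * grad(theta)      # follow the gradient, whatever it takes

print(round(theta, 3))              # ~3.0: criterion satisfied; any side effect
                                    # outside the loss is simply not represented
```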

Why not? We are clearly not there, but maybe, in a hypothetical future, or at least as a thought experiment. But what strikes me about the misalignment narrative is that this scenario is not at all hypothetical if you are willing to look beyond anthropomorphic evil robots. Have you really never heard of human-made entities with their own goals, goals that might be misaligned with human values? Entities that are powerful and hard for humans to control?

There is an obvious answer if you look at the social rather than the technological domain: the modern financialized multinational corporation. The modern corporation is a human-made organization designed to maximize profit. It does not have intentions or goals in a human sense but, exactly as in the AI alignment narrative, it is simply designed in such a way that it will use all available means to maximize a predefined criterion, which may or may not be aligned with human values. Let’s call these companies “profit robots”.

To what extent are profit robots autonomous from humans? Today’s large corporations are majority-owned not by individual people but by institutional shareholders, such as mutual funds, i.e., other organizations with the same goals. As is well known, their multinational nature makes them largely immune to the legislation of individual states (hence tax avoidance, social dumping, etc.). As is also well known, a large part of the resources of a profit robot is devoted to marketing and advertising, that is, to manipulating humans into buying its products.

Profit robots also engage in intense lobbying to bend human laws in their favor. But more to the point, the very notion of law is not the same for a profit robot as for humans. For humans, a law sets boundaries on what may or may not be done, morally. But a profit robot is not a person. It has no moral principles. For it, the law is just one particular constraint, in fact a financial cost or risk; a company does not go to prison. A striking example is the “Dieselgate” scandal: Volkswagen (also not owned by humans) intentionally programmed its engines to detect the pollution tests required to sell its cars on the US market and to lower emissions only during those tests. As far as I know, shareholders were not informed, and neither were consumers. The company autonomously decided to break the law for profit. Again, the company is not evil: it is not a person. It behaves in this non-human way because it is a robot, exactly as in the AI misalignment narrative.

We often hear that, ultimately, it is consumers who have the power, by deciding what to buy. This is simply false. Consumers did not know that Volkswagen cheated on pollution tests. Consumers rarely know under what conditions products are made, or even which corporation a product belongs to. This kind of crucial information is deliberately hidden. Profit robots, on the other hand, actively manipulate consumers into buying their products. And what to make of planned obsolescence? Nobody wants products deliberately designed to break down prematurely, yet that is what a profit robot makes. So yes, profit robots are largely autonomous from the human community.

Are profit robots an existential risk for humans? That might be a bit dramatic, but they certainly do create very significant risks. A particularly distressing fact illustrates this. As the Arctic ice melts because of global warming, oil companies get ready to drill for the newly accessible resources. Clearly this is not in the interest of humans, but it is what a company like Shell, of which only about 6% is directly owned by individual humans, needs to do to pursue its goal, which, as for any other profit robot, is to generate profit by whatever means available.

So yes, there is a risk that powerful human-made entities get out of control and that their goals are misaligned with human values. This worry is reasonable because it has already materialized, just not in the technological domain. It is ironic (but not so surprising) that billionaires buy into the AI misalignment narrative yet fail to see that the same narrative already applies, fully realized, to the companies their wealth depends on.

The reasonable worry about AI is not that AI takes control of the world; the worry is that AI provides even more powerful means to the misaligned robots that are already out of control now. In this context, evil AI fantasies are an anthropomorphic distraction from the actual problems we have already created.