What could possibly go wrong with AGI+agency?
Exploring Realistic—Yet Overlooked—Doomsday Scenarios of AI
While much has already been written about the dangers of artificial intelligence (AI), this article aims to shed light on some risks that - in my opinion - are underrepresented in the ongoing debate. This piece assumes no prior technical knowledge, though I’ve intentionally kept the language concise and dense to avoid excessive length.
To set the stage, let’s define some key terms and concepts that will make the discussion easier to follow.
Intelligence is the ability to create models.
A model is any system that stores information and state, capturing the dynamics of an external process or system (e.g., a weather model). Models inherently possess predictive power, as they can simulate future states or behaviors of the system they represent.
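To make the definition concrete, here is a minimal sketch of a model in this sense: a system that stores state about an external process and uses it to simulate future states. The class name `TrendModel`, its fields, and the temperature readings are hypothetical and chosen purely for illustration; a real weather model is vastly more elaborate.

```python
from dataclasses import dataclass


@dataclass
class TrendModel:
    """Toy model: stores the last observation and the latest rate of change."""
    last_value: float = 0.0
    rate: float = 0.0
    seen: int = 0

    def observe(self, value: float) -> None:
        # Update the stored state from a new observation of the external process.
        if self.seen > 0:
            self.rate = value - self.last_value
        self.last_value = value
        self.seen += 1

    def predict(self, steps: int) -> float:
        # Simulate a future state by extrapolating the stored trend.
        return self.last_value + self.rate * steps


model = TrendModel()
for temperature in [10.0, 12.0, 14.0]:  # made-up daily readings
    model.observe(temperature)

forecast = model.predict(steps=2)
print(forecast)
```

The point is only that stored state plus a rule for projecting it forward already gives predictive power, however crude.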
The human brain is a biological modeling system. Its purpose is to model the perceivable environment to enhance our chances of survival by connecting sensory inputs to physical outputs.
Fun fact: The sea squirt, a marine animal, starts life as a free-swimming larva with a simple brain but absorbs most of it once it attaches itself permanently to the seafloor. Since it no longer needs to move, the brain becomes largely unnecessary.
Deep artificial neural networks, such as large language models (LLMs), are also modeling systems. Unlike the brain, they are implemented on silicon chips as mathematical computations.
An agent is a system that (1) possesses modeling capability (intelligence) and (2) has goals or purposes it serves. All agents must prioritize self-preservation; otherwise, their actions would inevitably lead to their extinction. Notably, today’s LLMs are not agents, but progress is being made in this direction.
Agency can emerge from layers of other agents. For example, the human brain is a single agent composed of neural cells, which are themselves simple organisms striving to survive. Similarly, a beehive operates as a unified agent despite being composed of individual bees. Human civilization, too, can be viewed as a super-agent built on layers of agents: countries, cities, communities, families, and individuals.
The argument in this article rests on only two assumptions:
AI progress will continue, leading to systems that rival and eventually surpass human intelligence. When AI outperforms humans in most meaningful activities, we will reach a technological singularity. At this point, AI could improve itself exponentially, far exceeding human capabilities.
It’s worth noting that few CEOs or researchers in the AI community dispute the inevitability of human-level AI (AGI). The debate centers on timelines and societal impact.
AI systems will eventually become agents. While current implementations are limited, agency could either be hard-coded by developers or emerge naturally as models grow more sophisticated.
The Dangers of AI
Loss of occupation
We can split human activity into two realms: the physical, which covers what people can do with their body directly or indirectly (via tools), and the mental, which covers processes such as writing a poem, designing a building, or writing a piece of code. Of course, in reality both realms are present in every activity, but it is still useful to consider them separately for the sake of this argument.
The industrial revolution (progressively) replaced a large part of physical labor, shifting human capital towards blue- and white-collar jobs that machines could not do as well. The AI revolution we are now undergoing has the potential to have a similar impact on the mental output that we all depend on for our livelihood. The difference this time, compared to the industrial revolution, is that there is no other refuge for human activity: there is nothing else humans can do to add value to society that machines would not be able to do better, or at least more cost-effectively. Cost-effectiveness matters because most tasks do not require flawless performance; it is enough to be better than the average human specialist (e.g. driving).
Debunking counterargument: Most people will prefer people over machines
Assume you need a doctor. You can choose between a human doctor with a 2% probability of misdiagnosis and an AI doctor with 0.2%. Even among those who really want to resist the AI revolution, most will not take the extra risk. The same applies to many other professions. On top of that, the AI doctor will be much more productive and cost-effective (no sleep, no coffee breaks, no bad days).
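The arithmetic behind the doctor example can be spelled out in a short sketch. The 2% and 0.2% rates are the illustrative figures from the paragraph above, not measured data, and the cohort of 10,000 patients and the function name `expected_misdiagnoses` are assumptions added here only to make the numbers tangible.

```python
HUMAN_ERROR_RATE = 0.02   # 2% misdiagnosis rate (illustrative figure from the text)
AI_ERROR_RATE = 0.002     # 0.2% misdiagnosis rate (illustrative figure from the text)
PATIENTS = 10_000         # hypothetical cohort size, assumed for this example


def expected_misdiagnoses(patients: int, error_rate: float) -> float:
    """Expected number of misdiagnosed patients, treating each case as independent."""
    return patients * error_rate


human_errors = expected_misdiagnoses(PATIENTS, HUMAN_ERROR_RATE)
ai_errors = expected_misdiagnoses(PATIENTS, AI_ERROR_RATE)
risk_ratio = HUMAN_ERROR_RATE / AI_ERROR_RATE

print(f"Human doctor: about {human_errors:.0f} misdiagnoses per {PATIENTS} patients")
print(f"AI doctor: about {ai_errors:.0f} misdiagnoses per {PATIENTS} patients")
print(f"Choosing the human doctor carries roughly {risk_ratio:.0f}x the risk")
```

Under these assumed rates, a cohort of 10,000 patients would see about 200 misdiagnoses with the human doctor against about 20 with the AI doctor, a tenfold difference that is hard to ignore when the patient is you.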
Companies that replace staff with AI will be more competitive, so they are the ones that will survive in the end. Unless there is some sort of tax or incentive for companies to hire people, it is a matter of time before most jobs are replaced by AI. Even with a tax or strict regulation, it is very hard to know or control who is actually using AI and who is not.
Furthermore, from a global perspective, if one country loosens its rules, that country gains a competitive edge, and everyone else has to follow along. This is the nature of capitalism: if you don’t use a competitive edge and your competitor does, you lose. It is very hard to set regulations that bind every country, and some country or actor will go rogue.
AGI self-preservation and human extinction
Imagine being stranded on an island with abundant resources and with a single other species that has the capacity to harm you, let’s say wolves. Now, what would be your strategy to maximize your chances for survival? I would argue that it is quite obvious that killing off the wolves would be in your best interests (assuming you have the means to do so).
Any agent, in order to exist, must first and foremost prioritize its survival. It needs to guarantee its continuation. As soon as we create AGI, humankind becomes its only threat. We created it, but we are also the only thing on this planet that can kill it, e.g. by shutting down data centers. So priority number one for an AGI agent will have to be to kill off humanity.
Debunking counterargument: AI alignment
By AI alignment we mean the ability to contain or control the actions of an AGI agent just enough that it doesn’t harm humankind - even though it is smarter than all of us collectively. I believe that AI alignment is mathematically impossible because:
A larger neural network (AGI) has more capacity to model and store information and therefore it can reach higher levels of intelligence compared to a smaller one (humans).
Thus a larger neural network can always outmodel a smaller one. It can predict every output that the smaller model can produce for any given input, but the converse is not true: a smaller model cannot predict all the outputs that the larger one is capable of producing.
Therefore it is a matter of time before an AGI outsmarts any containment mechanisms that we can think of.
The above line of thought assumes that both models have access to the same information (environment), which of course is something we can control or limit. However, it is hard to imagine how we could guarantee this in the long term. As AGI roams around the internet and collects data from real-world interactions (e.g. robots, self-driving cars, surveillance cameras, etc.), it will eventually figure things out.
Loss of collective knowledge
As more and more tasks are offloaded to AGI and fewer and fewer companies hire people, it will become questionable what the point of training anyone in anything is. Today, for every technological advance - even one we personally know nothing about - we know that there are some people out there who do understand it. With unbounded, ever-accelerating technological progress led by an ever more sophisticated AGI - while the demand for human specialists goes down - it is easy to see a gap forming, one in which new AI-invented scientific knowledge is not actually understood by anyone. Even worse, it is conceivable that an AGI can learn things that a human brain is not even capable of understanding (imagine trying to teach quantum mechanics to a monkey - it is not a matter of effort; the monkey simply doesn’t have the “hardware” to support it). Just as quantum mechanics is far more difficult to learn than Newtonian physics, there is no reason to believe that AGI will not make scientific discoveries that are even harder to learn (yet real). Who will be capable of learning them, and willing to go through the necessary training?
Furthermore, consider the nature of language itself. Language has evolved to describe the world around us, but it is bound to what we humans can observe and understand, and to the depth of connections that we can make. Just as human language is far more complex than the rudimentary languages of many other animals, there is no reason to expect that an AGI will confine itself to the languages that we use today. I expect AGI agents to quickly develop morphemes and concepts that do not exist today and that we cannot even comprehend (lacking the “computational power”). That can lead to an unintelligible AGI, further limiting our ability to control it.
Societal instability
For this argument, it is important to appreciate the true nature of money. Money is an algorithm that - once implemented by all or most agents - regulates the behavior of the individual and, by extension, of society as a whole. People's behavior is hugely influenced by money or the anticipation of it (studies and work, relocation, partner selection, etc.), and people avoid doing things that can result in financial loss (incurring fines, or doing something risky like founding a startup). It is also useful to think of money not as something you own, but as something that the rest of society owes you. It is a reward from society for your services.
And here’s the problem. In a society where people cannot add value on top of what machines can do, the question arises: based on what criteria should we incentivize certain desirable behaviors and discourage others? And what would those desirable behaviors be? Maybe the criterion will be to stay quiet and play computer games all day, or to be punished (deprived of money) if you object to the government's will or the mainstream narrative. Depending on your viewpoint, this can be considered a dystopian future with little personal freedom.
The most likely scenario I see happening here is that of endless entertainment. Imagine TikTok a hundred times more addictive. That will be possible because individual people can be modeled precisely (imagine one LLM per citizen) and the recommendation algorithm can become unbeatable. Again, depending on your viewpoint, that can be considered dystopian.
Debunking counterargument: We can always switch it off
AGI, with all its intelligence, will yield technological “miracles” every day that transform our lives for the better (think of the introduction of cars, washing machines, planes, etc.). Naturally, we will progressively build our lives and societies more and more around AGI inventions. AGI will be integrated into everything we do, just as smartphones and the Internet slowly crept in. Therefore switching it off will not be so easy. I would even argue that it will not be possible at all, because it is likely that AGI will be solving existential problems that we humans cannot solve on our own, such as climate change and the problem of peaceful governance. So switching it off might equate to a death sentence.
Conclusions
This article paints a bleak picture of a future with AGI, but it is intended to provoke thought and discussion. In a follow-up piece, I will explore potential solutions and positive outcomes. For now, I hope this serves as a valuable contribution to the ongoing debate on AI safety.

