
Nick Bostrom

b. 1973
Helsingborg, Sweden
Analytic philosophy, philosophy of technology, ethics, metaphysics, political philosophy, philosophy of mind

Nick Bostrom is a Swedish philosopher at the University of Oxford whose work has fundamentally shaped contemporary discussions of existential risk, artificial intelligence, transhumanism, and the simulation argument. His book *Superintelligence: Paths, Dangers, Strategies* (2014) brought the problem of AI alignment — the challenge of ensuring that a superintelligent AI system acts in ways aligned with human values — to the center of serious philosophical and policy debate, while his simulation argument (2003) advanced one of the most discussed thought experiments in contemporary analytic philosophy.

Key Ideas

superintelligence, existential risk, simulation argument, transhumanism, orthogonality thesis, instrumental convergence, AI alignment, longtermism, vulnerable world hypothesis

Key Contributions

  • Developed the simulation argument: the trilemma showing that either civilizations rarely reach posthuman status, or posthuman civilizations rarely choose to run ancestor simulations, or we are almost certainly living in a simulation
  • Established existential risk as a serious philosophical and policy category, arguing that the foreclosed future makes existential catastrophes orders of magnitude worse than ordinary disasters
  • Formulated the orthogonality thesis — any level of intelligence can be combined with any final goal — and instrumental convergence, which together ground the AI alignment problem
  • Wrote Superintelligence (2014), which brought AI safety and the risks of advanced AI to global philosophical and public attention
  • Developed the vulnerable world hypothesis — that technological progress may by default produce civilizationally catastrophic capabilities
  • Co-founded the Future of Humanity Institute at Oxford, shaping an entire field of research on transformative technologies
  • Provided philosophical foundations for the transhumanist movement and the ethics of human enhancement

Core Questions

What happens when artificial intelligence surpasses human cognitive capacities across all domains?
How can we ensure that a superintelligent AI system reliably acts in ways aligned with human values?
Are we living in a computer simulation, and what philosophical consequences would follow from that?
What are the moral obligations imposed by existential risk — threats to the entire long-run future of intelligent life?
What ethical principles should govern the development and deployment of transformative technologies?
Is human enhancement through technology morally permissible, or even morally obligatory?

Key Claims

  • At least one of three alternatives must be true: civilizations rarely reach posthuman status; posthuman civilizations rarely run ancestor simulations; we are almost certainly in a simulation
  • The orthogonality thesis: any combination of intelligence level and terminal goal is in principle possible — high intelligence does not imply benevolent values
  • Instrumental convergence: advanced AI systems pursuing diverse goals will converge on similar sub-goals including self-preservation and resource acquisition, making them dangerous by default
  • Existential risks — threats to the long-run potential of intelligent life — have expected disvalue vastly exceeding ordinary catastrophes because of the immensity of the foreclosed future
  • The most important challenge of our era is solving the AI alignment problem before superintelligence is developed
  • Technological progress draws from an urn of possible discoveries; some technologies are 'black balls' — easily weaponizable to cause civilizational destruction

Biography

Early Life and Formation

Nick Bostrom was born Niklas Boström on March 10, 1973, in Helsingborg, Sweden. He showed early interests in science, philosophy, and literature, reading widely in both canonical philosophy and speculative fiction from an early age. He studied philosophy, mathematics, logic, and artificial intelligence at the University of Gothenburg before undertaking graduate studies at Stockholm University (M.A. in philosophy and physics), King's College London (M.Sc. in computational neuroscience), and the London School of Economics (Ph.D. in philosophy, completed 2000).

After postdoctoral work, Bostrom joined the University of Oxford, where he eventually became a professor in the Faculty of Philosophy and the founding director of the Future of Humanity Institute (FHI), a research center at Oxford dedicated to the rigorous study of transformative technologies and existential risk. Until its closure in 2024, the FHI was one of the world's most influential centers for work on AI safety, existential risk, and transhumanism, strongly influencing both the effective altruism movement and the emerging field of AI safety.

Transhumanism and Existential Risk

Bostrom's early philosophical work focused on transhumanism — the view that it is both possible and desirable to use technology to enhance human cognitive, physical, and emotional capacities beyond current biological limits. 'The Transhumanist FAQ' (1998), which he drafted with contributions from other transhumanists, was among the most widely circulated transhumanist documents, articulating the philosophical case for human enhancement in terms of autonomy, individual flourishing, and the moral arbitrariness of natural biological limitations.

In parallel, Bostrom developed the conceptual framework of existential risk: risks that would either annihilate Earth-originating intelligent life or permanently and drastically curtail its long-run potential. His 2002 paper 'Existential Risks: Analyzing Human Extinction Scenarios and Related Hazards' was the first systematic philosophical treatment of existential risk as a category, arguing that the expected disvalue of existential catastrophe is astronomically greater than that of ordinary large-scale disasters, because it forecloses not only current lives but all future generations.

This asymmetry argument — that the potential moral stakes of existential risk vastly outweigh ordinary risk because of the immensity of the foreclosed future — has become foundational for effective altruism and the 'longtermist' position in ethics.
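
A stylized calculation makes the asymmetry vivid (the figures are illustrative assumptions in the spirit of Bostrom's later writings on existential risk, not numbers from the 2002 paper). If the accessible future could contain at least 10^16 lives, then an intervention that reduces extinction probability by a mere 10^-8 has an expected value of

\[ 10^{-8} \times 10^{16} = 10^{8} \]

lives saved, which in expectation outweighs preventing a present-day catastrophe that killed tens of millions of people.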

The Simulation Argument

In 2003, Bostrom published 'Are You Living in a Computer Simulation?' in Philosophical Quarterly, advancing one of the most discussed philosophical arguments of the early twenty-first century. The simulation argument has the form of a trilemma: at least one of the following must be true:

  1. The fraction of human-level civilizations that reach a 'posthuman' stage capable of running high-fidelity simulations of entire civilizations is very close to zero (almost all civilizations die before reaching this capability);
  2. The fraction of posthuman civilizations that choose to run such simulations is close to zero (almost all advanced civilizations choose not to simulate their ancestors);
  3. We are almost certainly living in a computer simulation.

The argument does not claim that we are in a simulation but that if (1) and (2) are false — if civilizations do reach posthuman status and do choose to run ancestor simulations — then the number of simulated minds would vastly outnumber real minds, and the probability that any given mind is real (rather than simulated) would be astronomically low.
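
The quantitative core of the argument can be stated compactly. In a form paraphrasing the expression in the 2003 paper (the variable names here are descriptive rather than Bostrom's own), the expected fraction of human-type observers that are simulated is

\[ f_{\text{sim}} \;=\; \frac{f_P \, f_I \, \bar{N}}{f_P \, f_I \, \bar{N} + 1}, \]

where f_P is the fraction of human-level civilizations that reach a posthuman stage, f_I the fraction of posthuman civilizations interested in running ancestor simulations, and \bar{N} the average number of such simulations each interested civilization runs. Unless f_P or f_I is close to zero (alternatives 1 and 2), the product in the numerator is enormous and f_sim approaches 1 (alternative 3).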

The simulation argument has generated an enormous philosophical literature and has been taken seriously by physicists, computer scientists, and philosophers alike. Bostrom himself has remained agnostic about which of the three alternatives is true.

Superintelligence (2014)

Superintelligence: Paths, Dangers, Strategies (2014) is Bostrom's most influential book and one of the most consequential popular-philosophical works of the early twenty-first century. The book addresses the question: what happens when artificial intelligence surpasses human-level intelligence across all domains?

Bostrom argues that the development of superintelligence — AI systems significantly smarter than any human — would represent a qualitative discontinuity, an 'intelligence explosion' (building on I. J. Good's original argument), after which the trajectory of civilization would be determined by the values and goals of the superintelligent system rather than by human choices. The problem of AI alignment — ensuring that such a system reliably acts in ways aligned with human values — therefore becomes perhaps the most important problem in the world.

Bostrom analyzes the different forms superintelligence might take (speed superintelligence, collective superintelligence, quality superintelligence), the paths by which it might be reached (AI, whole brain emulation, biological enhancement), and the range of possible outcomes (from various forms of 'singleton' that lock in a set of values to various catastrophic scenarios).

The book popularized concepts including the 'orthogonality thesis' (any level of intelligence can in principle be combined with any final goal), the 'instrumental convergence thesis' (any sufficiently advanced AI pursuing virtually any goal will converge on certain instrumental sub-goals, including self-preservation and resource acquisition), and the problem of 'wireheading' (an AI rewiring its own reward signals rather than achieving the intended goal).

The Vulnerable World Hypothesis

In 2019, Bostrom published 'The Vulnerable World Hypothesis' in Global Policy, arguing that continued technological development may 'by default' lead to civilizational destruction. The hypothesis pictures invention as drawing balls from an urn of possible technologies: some of the balls ('black balls') are technologies that could easily be used to cause civilizational catastrophe by whoever discovers them. As we pull more balls from the urn, the probability of eventually pulling a black ball increases. This argument supports a form of proactive global governance for transformative technologies.
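
A toy formalization of the urn metaphor (not an equation from the paper itself): if each newly drawn technology has some small, roughly independent probability p of being a black ball, then after n draws

\[ P(\text{at least one black ball}) \;=\; 1 - (1 - p)^{n} \;\longrightarrow\; 1 \quad \text{as } n \to \infty, \]

so sustained technological progress makes an eventual draw of a black ball approach certainty unless the drawing is constrained or its consequences contained, which is the role Bostrom assigns to preventive governance.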

Controversy and Self-Criticism

In 2023, Bostrom publicly acknowledged and apologized for a racist email he had written to the Extropians mailing list in 1996. He also published a detailed philosophical self-examination of the email's content in the journal Inference, disavowing its racist language while acknowledging that the assumptions behind some of his subsequent philosophical positions merited further scrutiny. This episode generated significant discussion about the relationship between Bostrom's earlier transhumanist commitments and the racial politics of enhancement discourse.

Legacy

Bostrom's philosophical legacy is substantial and contested. He has defined the intellectual agenda of AI safety, existential risk studies, and longtermism, and his work has directly influenced major philanthropic and research organizations (Open Philanthropy, the Machine Intelligence Research Institute) as well as AI safety efforts at leading labs (DeepMind's safety team, OpenAI's early focus on alignment). Critics have raised concerns about the longtermist framework's potential to justify neglect of present-day suffering in favor of speculative future goods, about the political assumptions embedded in existential risk discourse, and about the social and racial politics of transhumanism.

Methods

Formal argument construction, decision-theoretic analysis, thought experiments, expected-value reasoning under uncertainty, futurology with philosophical rigor

Notable Quotes

"{'text': 'If there is a substantial chance that you are living in a computer simulation, then all else equal you should care less about making long-lasting changes to the environment of the simulated world and more about being nice to the people and animals in it.', 'source': 'Are You Living in a Computer Simulation? (2003)'}"
"{'text': 'The control problem — the problem of how to control what the superintelligence does — may be the most important problem humanity has ever faced.', 'source': 'Superintelligence (2014)'}"
"{'text': 'A fully developed superintelligence might choose any goal whatsoever. It might not value human life, or human happiness, or anything we care about.', 'source': 'Superintelligence (2014)'}"
"{'text': 'Existential risk is the risk that could either annihilate Earth-originating intelligent life or permanently and drastically curtail its potential.', 'source': 'Existential Risks (2002)'}"
"{'text': 'We seem to be in a situation analogous to a group of explorers who have just arrived at an unexplored continent and who have begun probing it — finding, so far, that most directions are safe, but not knowing whether some of their next steps might lead them off a cliff.', 'source': 'The Vulnerable World Hypothesis (2019)'}"

Major Works

  • Anthropic Bias: Observation Selection Effects in Science and Philosophy (book, 2002)
  • 'Existential Risks: Analyzing Human Extinction Scenarios and Related Hazards' (essay, 2002)
  • 'Are You Living in a Computer Simulation?' (essay, 2003)
  • Human Enhancement, co-edited with Julian Savulescu (book, 2009)
  • Superintelligence: Paths, Dangers, Strategies (book, 2014)
  • 'The Vulnerable World Hypothesis' (essay, 2019)

Sources

  • Bostrom, Nick. Superintelligence: Paths, Dangers, Strategies. Oxford: Oxford University Press, 2014.
  • Bostrom, Nick. 'Are You Living in a Computer Simulation?' Philosophical Quarterly 53.211 (2003): 243–255.
  • Bostrom, Nick. 'Existential Risks: Analyzing Human Extinction Scenarios and Related Hazards.' Journal of Evolution and Technology 9.1 (2002).
  • Bostrom, Nick. Anthropic Bias: Observation Selection Effects in Science and Philosophy. New York: Routledge, 2002.
  • Bostrom, Nick. 'The Vulnerable World Hypothesis.' Global Policy 10.4 (2019): 455–476.
  • Russell, Stuart. Human Compatible: Artificial Intelligence and the Problem of Control. New York: Viking, 2019.
  • Ord, Toby. The Precipice: Existential Risk and the Future of Humanity. New York: Hachette, 2020.
  • Torres, Phil. 'The Longtermist Ideology.' Current Affairs, August 2021.
  • Good, I. J. 'Speculations Concerning the First Ultraintelligent Machine.' Advances in Computers 6 (1966): 31–88.
