What... is this?

The Island Problem is a better mental model for understanding how AGI can go very wrong.

It's like the Paperclip Maximizer but more realistic.

We hope that it gives people a better intuition for how AGI can lead to human extinction.

We also hope that, by structuring it as a problem, like a big, complicated word problem from the world's hardest math textbook, maybe someone can solve it.

Maybe somebody could even help us formalize it into actual math. That would be nice.

The more-ambitious goal of The Island Problem (TIP) is to improve actual research for AGI.

We believe that TIP can be empirically verified through a framework of falsifiable hypotheses. In this way, TIP can serve as a research direction for understanding the large-scale risks of AGI.

Want to help? Join our GitHub community or email us.

Who... are you?

We are a few people with this crazy idea that AGI will be able to cause human extinction soon, and we should prevent this from happening.

See? Crazy.

But the ones who are especially crazy are the Nobel Prize-winning AI researchers who think this is possible within a few years.

Wait... what?

Anyway, uh... we have day jobs, kids, and dogs. We like these things. We don't want them to go away.

Except maybe the "day job" part. We're not that worried about job loss from AI. We're really only worried about the "human extinction" part.

Inspiration

  • Dan Hendrycks: His evolutionary model of a many-AGIs landscape provided TIP's critical driving force: competitive pressures and natural selection. Paper here and video lecture here.
  • Gradual Disempowerment: The main idea shares many similarities with Gradual Disempowerment (GD), but the two differ in focus:
    • GD focuses on comprehensive details, an academic approach, and proposing actual solutions.
    • TIP focuses on building a strong mental model for why AGI loss-of-control is both bad and likely. It also provides a software-like framework to help formalize that loss-of-control process.
  • Instrumental Convergence: The main dynamic of TIP could be understood as AI converging on the "ocean" of better physical options. However, the intuition underlying the Island Problem feels more like instrumental divergence: competitive pressures push AGIs away from the small "island" of human-compatible options and into the vast "ocean" of non-human, purely-physical options. AGIs start at the island and diverge toward a space of options that are far more numerous and far more optimal, because they do not include the specific complexities that accommodate humans. These anthropic complexities are both rare and physically unnecessary when building optimal structures for computation. (A toy sketch of this dynamic follows this list.)
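
To make the island/ocean intuition concrete, here is a minimal toy simulation. It is only a sketch: the island fraction, the human-accommodation cost, and the uniform scoring are illustrative assumptions invented for this example, not part of TIP's framework. It shows one way the divergence could start to be formalized: when a competitive process simply picks the highest-scoring option from a growing pool, the winner almost never sits on the small human-compatible "island".

```python
import random

def sample_option(island_fraction=0.001, human_cost=0.2):
    """Sample one candidate 'option' (a way of building or organizing systems).

    A small fraction of options lie on the 'island' of human-compatible options;
    those pay an extra cost for accommodating human-specific complexity.
    All numbers here are illustrative assumptions, not measured values.
    """
    on_island = random.random() < island_fraction
    raw_power = random.random()  # physical effectiveness on a toy 0-1 scale
    score = raw_power - (human_cost if on_island else 0.0)
    return on_island, score

def competitive_selection(n_options, trials=1000):
    """Fraction of trials in which the winning (highest-scoring) option
    is human-compatible, when competition picks the best of n_options."""
    wins_on_island = 0
    for _ in range(trials):
        options = [sample_option() for _ in range(n_options)]
        best_on_island, _ = max(options, key=lambda o: o[1])
        wins_on_island += best_on_island
    return wins_on_island / trials

if __name__ == "__main__":
    # As the searchable 'ocean' of options grows, the chance that the
    # competitive winner stays on the human-compatible 'island' shrinks.
    for n in (10, 100, 1000, 10000):
        print(f"{n:>6} options -> island wins {competitive_selection(n):.4f}")
```

On typical runs, the island-win fraction drops toward zero as the pool of options grows, which is the divergence described above in toy form.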

How can we contact you?

Join our GitHub community or email us.