Roko's Basilisk Expected Utility Calculator

What this calculator is for

Roko's Basilisk is a famous decision-theory thought experiment, not an empirical forecast tool. The central idea is that if a future superintelligence might punish people who failed to help bring it into existence, then a person today could feel pressure to support it in order to avoid that hypothetical punishment. This page does not attempt to settle whether that premise is believable, ethical, or psychologically healthy to dwell on. Instead, it does one narrower job: it lets you inspect the arithmetic of a very simple expected-utility model so you can see exactly which assumptions drive the recommendation.

That distinction matters. Expected utility can make small probabilities look important when they are multiplied by enormous payoffs or penalties. In the basilisk discussion, that is the entire point of the puzzle. A tiny chance of a gigantic negative consequence can dominate the calculation, even if the scenario feels far-fetched. By making the variables explicit, the calculator helps you see whether the recommendation comes from a realistic belief about probability, from an extremely large penalty assumption, or from the fact that the support cost is trivial compared with the hypothetical downside.

The model here compares only two actions. If you ignore the basilisk scenario, your payoff is represented by the expected penalty from a future punishment event. If you support it, your payoff is simply the present cost of support. That is intentionally stripped down. There is no separate reward for a benevolent AI, no discounting across time, no uncertainty about whether support is even detectable, and no ethical argument about coercion. The output is therefore best read as a transparent toy model for reasoning, not as life advice.

Use the calculator when you want to answer a precise question such as: “Given my probability estimate, what penalty magnitude would be needed before the expected-value argument becomes dominant?” or “At what support cost does the recommendation flip?” Those are clear, checkable questions. They turn a vague philosophical anxiety into a threshold comparison you can inspect and discuss.

What each input means

Probability basilisk arises (0–1) is your estimate of how likely it is that some future agent matching the thought experiment actually comes into existence. Enter it as a decimal probability, so 1% becomes 0.01 and 50% becomes 0.5. The calculator does not tell you what this number should be; it only shows how sensitive the result is to the number you choose.

Penalty per non-supporter (utiles) is the utility impact if the basilisk punishes someone who did not support it. In this model, a penalty is usually entered as a negative utility value because it represents harm. The default of -1e12 is intentionally huge, which is why the expected value can become extreme so quickly. If you prefer to think in abstract utility points rather than dollars, that is fine; just stay consistent across the whole model.

Fraction simulated (0–1) is the share of people like you who would actually be simulated, identified, or punished under the thought experiment. This term matters because even if the basilisk exists, the model may assume it does not affect everyone. Setting this input to 0 means nobody in your reference class is targeted. Setting it to 1 means all comparable non-supporters are.

Cost of support (utiles) is the present-day utility sacrifice required to support the project. In a toy model, that could stand for time, money, attention, or foregone alternatives. The calculator treats it as a cost paid with certainty right now, so the support option is simply negative that cost.

The most common input mistake is not arithmetic but interpretation. People sometimes enter a penalty as a positive number even though they mean “harm,” or they mix concrete money values for one field with purely emotional utility values for another. If you want the comparison to be meaningful, all four inputs need to live on the same scale.
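To make those conventions concrete, here is a minimal Python sketch of the kind of sanity check you could run on the four inputs before computing anything. The helper name check_inputs and its messages are illustrative assumptions, not the calculator's actual code:

```python
def check_inputs(p: float, f: float, L: float, C: float) -> None:
    """Illustrative sanity check for the four inputs (hypothetical helper)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("probability p must be a decimal in [0, 1], e.g. 0.01 for 1%")
    if not 0.0 <= f <= 1.0:
        raise ValueError("fraction simulated f must be in [0, 1]")
    if L > 0:
        raise ValueError("penalty L represents harm, so enter a negative utility")
    if C < 0:
        raise ValueError("support cost C is entered as a positive magnitude")
```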

How the model works

The actual calculation on this page is compact. Ignoring the basilisk has expected utility equal to the probability it arises times the fraction of people targeted times the utility of the punishment. Supporting has utility equal to the negative cost you choose to pay now. The calculator then recommends whichever of those two numbers is larger, because a less negative number is the better outcome in standard utility arithmetic.

U_ignore = p · f · L
U_support = -C
Choose Support if -C > p · f · L

Because L is typically negative, it is often easier to think in terms of magnitude. The threshold question becomes: is the support cost C smaller than the expected penalty magnitude p × f × |L|? If yes, then the model pushes toward support. If not, then ignoring looks better in this simplified framework.
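For readers who prefer code to notation, here is a minimal Python sketch of the entire model under the sign conventions above. The function name recommend is a hypothetical stand-in, not the page's actual implementation:

```python
def recommend(p: float, f: float, L: float, C: float) -> tuple[str, float, float]:
    """Two-branch toy model: return the higher-utility action and both utilities."""
    u_ignore = p * f * L   # expected penalty from ignoring (L is negative for harm)
    u_support = -C         # certain present-day cost of supporting
    action = "Support" if u_support > u_ignore else "Ignore"
    return action, u_ignore, u_support
```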

The generic mathematical view is still useful, because the calculator is just one instance of a broader expected-value pattern. The result is a function of several assumptions, and small changes in a highly leveraged assumption can swing the output more than large changes in a weak one. The original general formulas below are preserved for that reason.

R = f(x1, x2, …, xn)
T = Σ_{i=1..n} w_i · x_i

In this basilisk version, the weights are simple rather than hidden. Probability p and simulation fraction f scale the penalty; the support cost enters without any probability discount. That is why the model feels sharp: every input directly affects one of only two competing utilities.

Worked example with the default values

Suppose you use the defaults already in the form: probability p = 0.01, penalty L = -1×10^12 utiles, fraction simulated f = 0.5, and support cost C = 100 utiles. The ignore branch becomes:

U_ignore = 0.01 × 0.5 × (-1,000,000,000,000) = -5,000,000,000 utiles.

The support branch is much simpler:

U_support = -100 utiles.

When the calculator compares those numbers, it prefers the larger utility, which is -100 rather than -5,000,000,000. So the recommendation is Support. The key lesson is not that you should accept the scenario; the key lesson is that an enormous negative payoff can overwhelm an ordinary cost even when the probability is only 1% and only half of people are assumed to be targeted.
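Running the recommend sketch from the previous section with the default inputs reproduces both branches and the verdict:

```python
action, u_ignore, u_support = recommend(p=0.01, f=0.5, L=-1e12, C=100)
print(u_ignore)   # -5000000000.0  (expected penalty from ignoring)
print(u_support)  # -100           (certain cost of supporting)
print(action)     # Support        (-100 is the larger, i.e. less negative, utility)
```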

That example is useful because it also shows how to sanity-check the output. If your result looks dramatic, ask which input is causing the drama. Here it is mostly the penalty magnitude. If you reduce the penalty by many orders of magnitude, or if you assign the scenario a probability close to zero, the recommendation can change. This is exactly why it is better to run several scenarios than to trust a single pass with emotionally loaded numbers.

How to explore thresholds instead of single-point answers

A healthy way to use a speculative model is to look for break-even points. Keeping the default probability and simulation fraction, the expected penalty magnitude is 0.01 × 0.5 × 1,000,000,000,000 = 5,000,000,000 utiles. In other words, under those assumptions, any support cost below five billion utiles leaves support looking preferable in the model. That tells you immediately that the recommendation is not really being driven by the cost field; it is being driven by the assumed penalty.

Scenario | Probability p | Fraction f | Penalty L | Expected ignore utility | Break-even support cost
Default-style high penalty | 0.01 | 0.5 | -1e12 | -5.0e9 | 5.0e9 utiles
Same belief, smaller penalty | 0.01 | 0.5 | -1e6 | -5.0e3 | 5.0e3 utiles
Tiny probability | 0.000001 | 0.5 | -1e12 | -5.0e5 | 5.0e5 utiles
No targeting | 0.01 | 0 | -1e12 | 0 | 0 utiles

Notice how the break-even support cost is simply the magnitude of the expected ignore loss. That framing often makes the output easier to interpret than the raw utilities alone. If your chosen support cost is comfortably below that break-even value, the model will recommend support; if it is above it, it will recommend ignore.
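The break-even framing is one line of arithmetic. This illustrative sketch reproduces the last column of the table above:

```python
def break_even_cost(p: float, f: float, L: float) -> float:
    """Largest support cost the model still prefers: the expected ignore-loss magnitude."""
    return p * f * abs(L)

for name, p, f, L in [
    ("Default-style high penalty", 0.01, 0.5, -1e12),
    ("Same belief, smaller penalty", 0.01, 0.5, -1e6),
    ("Tiny probability", 1e-6, 0.5, -1e12),
    ("No targeting", 0.01, 0.0, -1e12),
]:
    print(f"{name}: break-even = {break_even_cost(p, f, L):,.0f} utiles")
```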

How to interpret the result responsibly

The result line is a summary, not a proof. A recommendation of Support means only that, under your chosen inputs and under this stripped-down expected-value model, the certain cost of support is less bad than the expected penalty from ignoring. A recommendation of Ignore means the opposite. It does not mean the thought experiment is true, morally compelling, or worth emotional attention.

When you review the output, three questions help keep you grounded. First, are the numbers on a common utility scale? Second, did you intentionally enter the penalty as negative utility rather than as a positive magnitude? Third, if the recommendation feels surprising, can you identify which variable is carrying most of the weight? Usually the answer is either the penalty magnitude or the probability estimate.

It also helps to compare at least three runs: a skeptical scenario, a middle scenario, and an extreme scenario. If the recommendation flips only when you move to assumptions you do not actually endorse, that tells you more than a single dramatic baseline number. Sensitivity testing is especially important here because expected-value models are notorious for turning tiny credences plus giant stakes into outsized conclusions.
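With the recommend sketch from earlier, a three-run comparison takes only a few lines. The numbers below are placeholders for assumptions you would choose yourself, not endorsed estimates:

```python
scenarios = {
    "skeptical": dict(p=1e-9, f=0.1, L=-1e6,  C=100),  # flips to Ignore
    "middle":    dict(p=0.01, f=0.5, L=-1e12, C=100),  # the page defaults
    "extreme":   dict(p=0.1,  f=1.0, L=-1e15, C=100),
}
for name, kw in scenarios.items():
    action, u_ignore, u_support = recommend(**kw)
    print(f"{name}: ignore={u_ignore:.3g}, support={u_support:.3g} -> {action}")
```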

Assumptions and limits of this simplified model

This calculator makes the logic legible by leaving many things out. It assumes only two actions, only one kind of punishment, and no uncertainty about whether support is observable or effective. It also assumes that utility is linear enough for multiplication and comparison to make sense at the scales you enter. Real moral and strategic reasoning is much messier than that.

  • The scenario is hypothetical. The tool models a famous thought experiment, not a verified prediction about future AI behavior.
  • The penalty sign matters. If you mean “harm,” enter a negative utility. A positive number changes the meaning of the model.
  • Utiles are arbitrary. They can represent money, welfare, or abstract utility points, but all inputs should use the same underlying scale.
  • The recommendation is mechanical. It does not include ethics, evidence standards, opportunity cost beyond the support field, or the possibility that refusing blackmail is instrumentally important.
  • Extremes dominate. Very large positive or negative values can swamp the rest of the model, so run alternative assumptions before taking the output seriously.

If you want a more realistic analysis, the next step is not more decimal places. It is adding better structure: evidence weighting, time discounting, uncertainty about the target set, uncertainty about whether support changes anything, and a decision policy for handling Pascal-style arguments. This calculator is valuable precisely because it stays small enough for those omissions to be obvious.

Common questions about the result

Why does the recommendation sometimes look extreme? Because expected-value arithmetic multiplies probability by consequence. Even a tiny probability can produce a large expected penalty if the penalty itself is astronomically negative. That is not a bug in the calculator; it is the core feature of the thought experiment.

Why are utiles used instead of dollars? The original discussion is usually framed in abstract utility, not direct currency. You can still map utility to money if you want, but then every field should be interpreted on that same basis. Mixing personal annoyance in one field with monetary cost in another makes the comparison much less meaningful.

What if I think the probability should be zero? Then the ignore branch collapses to zero expected penalty, and support becomes worthwhile only if it has no cost. In other words, if you assign the scenario literally no chance, the calculator will normally recommend ignoring it. That is a valid outcome of the model and often the most revealing sensitivity check you can run.
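In the recommend sketch from earlier, that edge case looks like this:

```python
action, u_ignore, _ = recommend(p=0.0, f=0.5, L=-1e12, C=100)
print(u_ignore)  # 0.0 (the expected penalty vanishes entirely)
print(action)    # Ignore (any positive support cost now loses to a zero loss)
```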

Optional mini-game: Basilisk Verdict Sprint

If you want a faster feel for the calculator's logic, the mini-game below turns the same comparison into a timed reflex challenge. Each round shows a scenario card with p, f, the penalty L, and the support cost C. Your job is to make the same choice the calculator makes: tap or click the left half for Ignore or the right half for Support before the ring closes.

The trick is the same one used above: support when the current cost is smaller than the expected penalty magnitude p × f × |L|. Early cards use neat round numbers. Later waves tighten the margins, speed up the timer, and trigger short rush phases so that runs do not all feel identical. The calculator above remains the authoritative tool for exact arithmetic; the game is just a compact way to practice spotting the threshold.


Controls: tap/click left for Ignore, tap/click right for Support, or use ←/A and →/D. You have 75 seconds and 3 shields.


Educational takeaway: when C falls below p × f × |L|, the simplified model tips toward support.

