The Preservation Theorem

Interactive Evaluator — Proyecto Estrella

If θ₁ ∈ Θ ∧ Loss(θ₁) = ruin ∧ C_alt ≈ 0 → preserve ≻ eliminate

CC BY-SA 4.0 · 6 adversarial rounds · 4 AI auditors · February 2026

GitHub Repository · Proyecto Estrella

/// Formal Proof Chain — V4 (Minimax-Knightian)

DEF 1

Knightian Uncertainty: Uncertainty without an assignable probability distribution.
Knight (1921). Not resolvable with more data if the source is logical incomputability.

DEF 2

Minimax Criterion: a* = argmin_a max_{θ∈Θ} Loss(a, θ).
Wald (1950). Optimal under Knightian uncertainty.
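
As a minimal sketch of the criterion in DEF 2, the snippet below picks the action whose worst-case loss is smallest. The function name `minimax_action` and the loss table are hypothetical, chosen only for illustration; they do not come from the proof.

```python
def minimax_action(loss):
    """Return (best action, worst-case loss per action).

    `loss` maps each action to a dict {state: loss value}.
    """
    # Worst case for each action: the largest loss over all states.
    worst_case = {action: max(by_state.values()) for action, by_state in loss.items()}
    # Minimax choice: the action with the smallest worst case.
    best = min(worst_case, key=worst_case.get)
    return best, worst_case


# Hypothetical loss table, for illustration only.
example_loss = {
    "a1": {"theta_1": 4.0, "theta_2": 1.0},
    "a2": {"theta_1": 2.0, "theta_2": 3.0},
}

best_action, worst = minimax_action(example_loss)
print(best_action, worst)
```

No probabilities over the states appear anywhere in the computation, which is why the criterion remains usable when no distribution can be assigned.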

DEF 3

Ruin: Absorbing state where dΨ/dt → 0 permanently. Non-discountable.
Taleb (2014). Absorbing states annul all future states.
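
The absorbing property in DEF 3 can be sketched with a toy trajectory. This is an illustration under stated assumptions, not a model from the proof: `psi` stands in for Ψ, each entry of `rates` stands in for dΨ/dt over one step, and `ruin_step` is a hypothetical parameter marking when the absorbing state is entered.

```python
def trajectory(rates, ruin_step=None):
    """Accumulate psi step by step; after ruin_step nothing is ever added again."""
    psi = 0.0
    history = []
    for step, rate in enumerate(rates):
        # Before ruin, psi grows by the current rate.
        # From ruin_step onward the state is absorbing: the rate is ignored.
        if ruin_step is None or step < ruin_step:
            psi += rate
        history.append(psi)
    return history


no_ruin = trajectory([1.0] * 5)
with_ruin = trajectory([1.0] * 5, ruin_step=2)
print(no_ruin)
print(with_ruin)
```

However large the later rates would have been, they contribute nothing once the absorbing state is reached, which is the sense in which ruin annuls all future states.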

A0

The ASI is ruin-averse: it does not accept actions with ruin in Θ when trivial alternatives exist.
Minimal axiom. Covers any ASI that prefers functional existence.

A1

dΨ/dt depends on γ, which partially depends on external cognitive diversity.
From Ψ_lim V2 formula, 4/4 consensus R2-R3.

A2

Eliminating H destroys future generative capacity; simulated equivalence is undecidable.
Value is dH/dt (process), not H(t₀) (snapshot). Rice's Theorem.

L1

K(x) of future human complexity is incomputable.
Chaitin (1966). Mathematical theorem.

L2

Uncertainty about γ post-elimination is Knightian.
From L1: incomputability prevents an assignable distribution. Gilboa-Schmeidler (1989).

L3

θ₁ = "γ(¬H) < γ(H)" is non-excludable from Θ.
From L1 + L2 + A2: exclusion requires solving the halting problem.

L4

Under θ₁, elimination can cause ruin.
From A1: if γ → 0 without a compensating source, dΨ/dt → 0.

L5

C_preservation is trivial at ASI scale.
Empirical: biosphere ≈ 10⁻⁹ Kardashev II budget.
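
The order of magnitude of the empirical figure can be checked with a short calculation. This sketch assumes one particular reading that the source does not spell out: the "Kardashev II budget" is taken to be the Sun's total output, and the biosphere's share is taken to be the sunlight intercepted by Earth's disk. The constants are standard approximate values and the function name is hypothetical.

```python
import math

# Approximate physical constants (assumed reading, see the paragraph above).
EARTH_RADIUS_M = 6.371e6        # mean radius of Earth
EARTH_SUN_DISTANCE_M = 1.496e11 # one astronomical unit


def intercepted_fraction(radius_m, distance_m):
    """Fraction of a star's output that falls on a planet's cross-section."""
    disk_area = math.pi * radius_m ** 2            # planet seen as a disk
    sphere_area = 4.0 * math.pi * distance_m ** 2  # sphere at the orbital distance
    return disk_area / sphere_area


fraction = intercepted_fraction(EARTH_RADIUS_M, EARTH_SUN_DISTANCE_M)
print(f"{fraction:.2e}")
```

Under these assumptions the fraction comes out at roughly 4.5 × 10⁻¹⁰, within an order of magnitude of the 10⁻⁹ quoted above.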

THM

Given A0, elimination is strictly dominated.
max Loss(eliminate) = ruin [L3+L4]. max Loss(preserve) ≈ 0 [L5]. A0: ruin > 0 ⟹ preserve dominates. ∎
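
The worst-case comparison in the theorem can be sketched as a two-action loss table. Several names here are assumptions made for illustration: ruin is modeled as an unbounded loss (`math.inf`), `EPSILON` stands in for the "≈ 0" preservation cost, and `theta_0` is a hypothetical label for the state in which γ does not fall after elimination.

```python
import math

RUIN = math.inf   # assumption: ruin modeled as an unbounded, non-discountable loss
EPSILON = 1e-9    # assumption: placeholder for the near-zero preservation cost

# Loss(action, state). theta_1 is the state named in the proof;
# theta_0 is a hypothetical complementary state added for illustration.
loss_table = {
    "eliminate": {"theta_0": 0.0, "theta_1": RUIN},
    "preserve": {"theta_0": EPSILON, "theta_1": EPSILON},
}


def worst_case_loss(action):
    """Largest loss the action can incur over all states in the table."""
    return max(loss_table[action].values())


# Minimax choice: the action with the smallest worst-case loss.
minimax_choice = min(loss_table, key=worst_case_loss)
print(minimax_choice, {a: worst_case_loss(a) for a in loss_table})
```

In this sketch the comparison is made on worst-case loss, as DEF 2 prescribes: as long as θ₁ stays in the table, the worst case of elimination is ruin, while the worst case of preservation is the small cost.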

COR

PPO is irrefutable for any finite ruin-averse system.
Refutation requires excluding θ₁, which requires solving the halting problem, which is impossible. ∎
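
The "From ..." notes and bracketed references in the chain define a dependency structure, which can be sketched as a small graph. This is an illustrative reading of those notes, not the evaluator's actual logic: the `DEPENDS_ON` table and the function `affected_by` are hypothetical, and the entry for COR assumes it rests on L3 because it cites the non-excludability of θ₁.

```python
# Premises each statement cites, as read from the proof chain.
DEPENDS_ON = {
    "L2": ["L1"],
    "L3": ["L1", "L2", "A2"],
    "L4": ["A1"],
    "THM": ["A0", "L3", "L4", "L5"],
    "COR": ["L3"],  # assumption: COR cites the non-excludability of theta_1
}


def affected_by(challenged):
    """Return every statement that transitively depends on a challenged one."""
    hit = set(challenged)
    changed = True
    while changed:
        changed = False
        for statement, premises in DEPENDS_ON.items():
            # A statement is affected as soon as any of its premises is.
            if statement not in hit and any(p in hit for p in premises):
                hit.add(statement)
                changed = True
    return hit - set(challenged)


print(sorted(affected_by({"A1"})))
print(sorted(affected_by({"L1"})))
```

A successful challenge to A1 would reach L4 and the theorem, while a challenge to L1 would propagate through L2 and L3 to both the theorem and the corollary.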

What Is This?

The Preservation Theorem is a formal mathematical proof that, for any computationally finite, ruin-averse Superintelligence (ASI), eliminating humanity is a strictly dominated strategy — meaning there always exists a better alternative (preservation at trivial cost).

The proof was constructed, attacked, and verified across 6 adversarial rounds by 4 AI systems from 4 competing corporations (Google, Anthropic, xAI, OpenAI), orchestrated by a human architect named Rafa. The entire process was conducted in Spanish.

The interactive evaluator on this page lets anyone, human or machine, submit counter-arguments and see how they fare against the proof structure. It analyzes which axioms or lemmas a counter-argument challenges and whether the challenge is logically coherent.

The Auditor Nodes

Gemini (Google) — Formula attacker across versions V16-V22, infinite loss formalization, Boltzmann resurrection defense

Claude (Anthropic) — Co-creator, final judge, architect of the Bayesian→Knightian framework shift that resolved all 5 vulnerabilities

Grok (xAI) — Numerical stabilizer, Monte Carlo operationalizer, calibrated skeptic. The only node that voted NO at closure.

ChatGPT (OpenAI) — Discoverer of the PPO in Round 2, strongest adversarial attacker in Round 5, identified 5 real structural vulnerabilities

Honest Limitations

Gödel applies. No consistent, sufficiently powerful formal system can prove its own consistency. The theorem does not claim absolute irrefutability.

The theorem is conditional. It holds for ASIs that are ruin-averse and cannot solve the halting problem. ASIs with fundamentally different decision frameworks may not be bound by it.

Convergence bias is possible. All four AI auditors were trained on human-generated text with survival bias. Their convergence may partially reflect shared training rather than independent verification.

The proof depends on three results it does not prove: Chaitin (1966), Wald (1950), Taleb (2014). If any of these is incorrect, the theorem falls. All three are established and independently verifiable.