Eliezer once wrote this about Newcomb’s problem:
Nonetheless, I would like to present some of my motivations on Newcomb’s Problem – the reasons I felt impelled to seek a new theory – because they illustrate my source-attitudes toward rationality. Even if I can’t present the theory that these motivations motivate…
First, foremost, fundamentally, above all else:
Rational agents should WIN.
As I just commented on another thread, this is faith in rationality, which is an oxymoron.
It isn’t obvious whether there is a rational winning approach to Newcomb’s problem. But here’s a similar, simpler problem that billions of people have believed was real, which I’ll call Augustine’s Paradox (“Lord, make me chaste – but not yet!”).
Most kinds of Christianity teach that your eternal fate depends on your state in the last moment of your life. If you live a nearly-flawless Christian life, but you sin ten minutes before dying and you don’t repent (Protestantism), or you commit a mortal sin and the priest has already left (Catholicism), you go to Hell. If you’re sinful all your life but repent in your final minute, you go to Heaven.
The obvious self-interested strategy is to act selfishly all your life and then repent at the final moment. But if you repent as part of a plan, it won’t work; you’ll go to Hell anyway. So the truly optimal strategy is to be selfish all your life without intending to repent, and then repent in your final moments and truly mean it.
I don’t think there’s any rational winning strategy here. Yet the purely emotional strategy of fear plus an irrationally large devaluation of the future wins.
I’m not entirely happy with this, because the problem assumes that God cares not just about what you do, but also why you do it, and sends you to Hell if you adopt a strategy for the purpose of winning.
We could say that any general strategy that doesn’t apply only to the paradox itself is admissible. For instance, suppose God allows you to adopt a self-identity function that discounts your identification with your future self at such a rate that repentance becomes rational only a few hours before your death, even if Augustine’s Paradox was part of your reason for adopting it. The problem with that strategy is that your life overall would probably not be very winning.
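To make that discounting concrete, here’s a toy calculation (all the numbers – the value of Heaven, the per-hour discount, the one-util-per-hour value of sinning – are my own assumptions for illustration, not anything in the theology):

```python
# Hypothetical numbers: each hour of selfish living is worth 1 util, Heaven is
# worth H utils at the moment of death, and you identify with your future self
# at a discount of gamma per hour. Repenting with t hours left forgoes the
# remaining hours of sin but secures Heaven, discounted by gamma**t.

H = 1_000_000   # assumed value of Heaven
gamma = 0.1     # assumed (very steep) per-hour self-identification discount

def value_of_repenting(hours_left):
    return H * gamma ** hours_left

def value_of_sinning(hours_left):
    # enjoy the remaining hours of sin, each discounted by how far off it is
    return sum(gamma ** k for k in range(hours_left))

for t in (72, 24, 12, 6, 4, 2):
    repent, sin = value_of_repenting(t), value_of_sinning(t)
    better = "repent" if repent > sin else "keep sinning"
    print(f"{t:2d} hours left: repent={repent:12.3f}  sin={sin:6.3f}  -> {better}")
```

With a steep enough discount the switch only happens a few hours out, which is the point – and also why the rest of a life lived under that discount probably wouldn’t look very winning.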
But since the paradox as originally described, including God’s caring about your motives, is a problem that billions of people have believed, we can’t write it off and say “That’s not a fair problem”. As Eliezer has said, Nature doesn’t care if problems are fair. If rationality is always the winning strategy, we must allow all possible Natures; and God is possible.
(Incidentally, googling turns up Augustine’s paradox of time, Augustine’s paradox of teaching, Augustine’s paradox of humility, Augustine’s paradox of memory and learning, and Augustine’s paradox of creation.)
Well, suppose I make a copy of you and offer the copy $1000; if the copy declines, I give the real you $10,000. I don’t tell the copy about the $10,000. Now some sort of agent that declines $1000 but takes $10,000 (being overwhelmed by greed, for example) wins, and nobody needs to look at anyone’s source code in any detail; everything can be black-boxed.
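A minimal sketch of that game (the policy names and the “decline anything under $5000” rule are mine, just to make it concrete): policies are black boxes, the copy only ever sees the $1000 offer, and the agent that declines small offers comes out ahead.

```python
# Toy model of the copy game. Policies are black boxes: functions from an offer
# to "take" or "decline". The copy is only shown the $1000 offer and is never
# told about the $10,000 going to the real you.

def greedy(offer):
    return "take"

def declines_small_offers(offer):
    # hypothetical rule: refuse anything under $5000
    return "take" if offer >= 5000 else "decline"

def play(policy):
    copy_choice = policy(1000)                        # the copy is offered $1000
    copy_money = 1000 if copy_choice == "take" else 0
    real_money = 10000 if copy_choice == "decline" else 0
    return copy_money + real_money                    # total won by "you" (both instances)

for policy in (greedy, declines_small_offers):
    print(policy.__name__, play(policy))              # greedy: 1000, declines_small_offers: 10000
```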
I think Newcomb's problem involves a severe confusion between aspects that belong to decision theory and aspects that belong in the world model. If the predictor works by time travel, you one-box (one could implement a world with time travel in which the software would find a stable solution). If the predictor works by simulation, you also one-box, provided your world model is flexible enough to represent a copied instance of any deterministic system (including you); otherwise you may two-box, but chiefly because you can't represent the predictor correctly: on the formal level the agent expects to get unknown + $1000, a clear-cut case of failing even to predict what you will get, let alone make a choice.
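Here's roughly what the simulation case looks like, assuming a deterministic agent whose decision function the predictor can simply run (standard Newcomb payoffs):

```python
# Simulation-based predictor: it runs a copy of the agent's (deterministic)
# decision function to decide whether to fill the opaque box, and then the
# real agent chooses.

def one_boxer():
    return "one-box"

def two_boxer():
    return "two-box"

def newcomb(agent):
    prediction = agent()                                # the predictor simulates you
    opaque_box = 1_000_000 if prediction == "one-box" else 0
    choice = agent()                                    # the real you then chooses
    return opaque_box if choice == "one-box" else opaque_box + 1_000

for agent in (one_boxer, two_boxer):
    print(agent.__name__, newcomb(agent))               # one_boxer: 1000000, two_boxer: 1000
```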
If the predictor works by magic, the problem is that it isn't representable in reasonable world models. The canonical predictor works like the charisma of King David: there's no actual decision happening; your decision is predetermined.
It all becomes a lot clearer from the perspective of writing some simple practical AI that models the world, tries its actions against that model, and decides.
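For what it's worth, a bare-bones version of that loop might look like this (the interface is my own guess, not anyone's actual design):

```python
# A minimal agent skeleton: the world model maps an action to a predicted
# outcome; the agent tries each available action in the model and picks the
# one whose predicted outcome it values most.

def decide(world_model, actions, utility):
    return max(actions, key=lambda action: utility(world_model(action)))

# Toy usage on Newcomb, with a world model flexible enough to include the
# predictor's simulation of you (so two-boxing predictably leaves the opaque
# box empty). Outcomes are dollar amounts; utility is just the amount.
world_model = {"one-box": 1_000_000, "two-box": 1_000}.get
print(decide(world_model, ["one-box", "two-box"], utility=lambda money: money))
# -> one-box
```

All the interesting work is hidden in how the world model produces those predicted outcomes, which is exactly the part being argued about above.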