• ## Ideal theory and decision theory

The ideal theory debate is actually applied decision theory. The tools and vocabulary of decision theory—at a minimum, the von Neumann-Morgenstern utility theorem, the concept of epistemic risk aversion, and the area of sequential decision theory—are useful in this new domain.

### Ideal theory

If I may editorialize, the ideal theory debate is essentially about how to translate our understanding of justice into actions in the present. Reductively, one side (the idealists) advocates for always moving the world we inhabit closer to the ideally just world while the other side (the non-idealists) advocates for always moving the world we inhabit toward the best adjacent world.

What’s not usually at issue in the ideal theory debate is: our understanding of the status quo, our predictive models of the future, or our notion of justice. That’s not to say that there’s consensus on these issues—far from it. It’s just that discussion of these issues doesn’t fall under the heading of ‘ideal theory’. No one considers themselves to be waging that debate when they talk about currently existing inequality in Germany or what justice recommends with regard to positive and negative rights. By all this I merely mean to emphasize that the scope of the ideal theory debate is rather small—given all the presuppositions above, what algorithm do we employ to choose the next possible world we’ll inhabit?
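To make the contrast between the two choice rules concrete, here is a minimal sketch in Python. Everything in it is a hypothetical toy: the worlds are positions on a line, the `justice` scores are invented, and the names `idealist_step` and `non_idealist_step` are mine rather than terms from the literature. It also assumes the non-idealist stays put when no adjacent world improves on the status quo.

```python
# Hypothetical toy model: worlds sit on a line, each with a made-up justice score.
justice = [3, 5, 4, 2, 6, 9]  # index = world, value = how just that world is

# The ideal is simply the most just world in the toy landscape.
IDEAL = max(range(len(justice)), key=lambda w: justice[w])


def neighbors(world):
    """Worlds adjacent to `world` on the line."""
    return [w for w in (world - 1, world + 1) if 0 <= w < len(justice)]


def idealist_step(world):
    """Move to the adjacent world closest to the ideal, whatever its own score."""
    if world == IDEAL:
        return world
    return min(neighbors(world), key=lambda w: abs(w - IDEAL))


def non_idealist_step(world):
    """Move to the best adjacent world (assumed: stay put if none is better)."""
    best = max(neighbors(world), key=lambda w: justice[w])
    return best if justice[best] > justice[world] else world


status_quo = 2
print(idealist_step(status_quo), non_idealist_step(status_quo))
```

Starting from world 2, the idealist steps to world 3 (a less just world that lies on the way to the ideal) while the non-idealist steps to world 1 (the best neighbor). Both rules take the same inputs and differ only in the algorithm applied to them.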

Hopefully, by framing the ideal theory debate in the foregoing terms, I’ve predisposed you to my point of view: The subject matter of the ideal theory debate is also the subject matter of decision theory. That is, the ideal theory debate is really a debate about applied decision theory.

### Normative decision theory

Webster’s dictionary defines—*cough*—(Hansson 1994) says “decision theory is concerned with goal-directed behaviour in the presence of options”. We’ll try to make this description more comprehensive by appealing to Leonard Savage’s formalization. The hope is that by describing decision theory fully, we can see how the boundaries of the ideal theory debate line up with the boundaries of decision theory.

• ## YAAS Human cognitive architecture and learning

Problem-solving relies on working memory. Working memory is very limited except when it comes to information that’s also in long-term memory. Long-term memory is thus central to expertise. Committing things to long-term memory (i.e. learning) is best accomplished by the careful management of cognitive load.

### Preface

The following post is basically a straightforward regurgitation of (part of) (Sweller 2008). That paper is very readable so there’s really no reason to read the rest of this post. With that out of the way, I liked this paper for two main reasons:

• It fairly radically changed my opinion on the value of long-term memory in ways that are practically important
• It provides a coherent theory which unifies many phenomena. A coherent theory is easier to remember and easier to apply in novel situations than a disparate collection of facts.

### Working memory

Essentially all human problem-solving is about the manipulation of items in working memory. Alas, our working memory is tragically limited: classic research puts the upper limit on the number of ‘chunks’ in working memory at the “magical number seven”. (Interestingly, there’s some evidence that chimpanzees have superior working memory to humans. Video and paper.) Despite this grievous limitation, experience suggests that humans do actually carry out impressive feats of problem-solving. How?

### Long-term memory

The key is exploiting a ‘loophole’—“huge amounts of organized information can be transferred from long-term memory to working memory without overloading working memory” (Sweller 2008). Thus, we arrive at the central importance of long-term memory to human cognition. Contra the denigration of rote memorization, “[task-relevant long-term memory] is the only reliable difference that has been obtained differentiating novices and experts in problem-solving skill and is the only difference required to fully explain why an individual is an expert in solving particular classes of problems” (Sweller 2008). In other words, long-term memory is necessary and sufficient to explain expertise.

#### Chess board recall

We can illustrate these claims with the results of a classic study (De Groot 2014). Look at the next image for a few seconds, close your eyes, and try to recall the positions of the pieces.

If you’re a chess amateur, this should have been quite hard (i.e. you probably misremembered the pieces). On the other hand, if you’re a chess expert, this was probably fairly straightforward.

• ## An example of the lazy approach to AI safety

The lazy approach to AI safety suggests that we explicitly encode our moral uncertainty into artificial agents. Then agents can decide to undertake moral investigation via value of information calculations. We make the description of this approach more concrete by examining its application in a nearly trivial setting.

Examples often clarify. Let’s see an example of the lazy approach to AI safety in action.

### The setting

Suppose the Professor has performed another bamboo miracle and built an AI agent on the island. Sadly, the castaways forgot the agent in their frantic final escape. So it’s just our agent, alone on an island in the Pacific.

As a man of taste and refinement, the Professor has followed the lazy approach to AI safety. As such, the agent’s utility function is quite simple: The utility of any state of affairs is exactly the moral good of that state of affairs according to whatever turns out to be the One True Moral Theory (OTMT)1. In symbols, $$u(x) = g(x)$$ where $$u : X \rightarrow \mathbb{R}$$ and $$g : X \rightarrow \mathbb{R}$$2. Here $$X$$ is the set of possible states of affairs, $$u$$ is the utility function, and $$g$$ evaluates the moral goodness of a state of affairs according to the OTMT.

For simplicity, we’ll suppose there are only two possible interventions the agent can make: Ze can harvest coconuts or harvest bamboo. Furthermore, we’ll fiat that there are only two possible moral theories in all the world: the coconut imperative and bamboocentrism. According to the coconut imperative, the goodness of a state of affairs is defined as $$g_c(b, c) = 0 \cdot b + 3 \cdot c$$ where $$c$$ is the total number of coconuts that have been harvested and $$b$$ is the total number of bamboo shoots that have been harvested. On the bamboocentric view of things, $$g_b(b, c) = 2 \cdot b + 0 \cdot c$$. (The fact that we only have moral theories which express goodness in terms of real numbers permits our earlier simplification of assuming that the OTMT takes this shape.)

### Initial behavior

Before the Professor abandoned his child, he programmed the agent with a uniform prior over all possible ethical theories. That is, the agent thinks there’s a 50% chance bamboocentrism is true and a 50% chance the coconut imperative is the OTMT. Thus, in the absence of better information, the agent spends zir days harvesting coconuts (we assume the resources required to harvest a coconut are identical to the resources required to harvest a bamboo stalk). To be fully explicit:
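One way to spell out that comparison is the following minimal sketch in Python. The goodness functions and the 50/50 prior come straight from the setting; the function and variable names are my own, and the final value-of-information lines go one step beyond the text by assuming the agent could learn the true theory at no cost.

```python
# Goodness of a state (b bamboo shoots, c coconuts harvested) under each theory,
# copied from the definitions in the text.
def g_coconut(b, c):
    return 0 * b + 3 * c


def g_bamboo(b, c):
    return 2 * b + 0 * c


# Uniform prior over the two candidate moral theories.
PRIOR = {g_coconut: 0.5, g_bamboo: 0.5}


def expected_goodness(b, c, credences=PRIOR):
    """Credence-weighted goodness of the state (b, c)."""
    return sum(p * g(b, c) for g, p in credences.items())


# Both goodness functions are linear, so one unit of effort is worth the same
# no matter how much has already been harvested.
per_coconut = expected_goodness(b=0, c=1)  # 0.5 * 3 + 0.5 * 0
per_bamboo = expected_goodness(b=1, c=0)   # 0.5 * 0 + 0.5 * 2

# Assumption beyond the text: if the agent could learn the true theory for free,
# it would harvest whichever good that theory values.
informed = sum(p * max(g(0, 1), g(1, 0)) for g, p in PRIOR.items())
value_of_information = informed - max(per_coconut, per_bamboo)

print(per_coconut, per_bamboo, value_of_information)
```

Under the uniform prior, a harvested coconut is worth more in expectation than a harvested bamboo shoot, which is why coconuts win by default. The gap between `informed` and the best uninformed option is what makes moral investigation potentially worth the agent’s time.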

• ## False dichotomies and the ideal theory debate

The dichotomy of ideal and non-ideal theory is a false one. For each supposedly unique feature of ideal theorizing, there is a scaled-down analogue in non-ideal theory. Furthermore, all the dilemmas in the debate can be fruitfully approached as problems in decision theory.

### Ideal and non-ideal theory

We’ve already described ideal theory in previous posts, but we’ll give a short recap here for the sake of self-sufficiency. Ideal theory suggests that when making decisions about alternative social worlds (that is, about different political and economic institutions), we should have an ideally just society in mind. Non-idealists argue that this information is irrelevant; we only need to be able to perform pairwise comparisons. A popular metaphor in the area is that of mountain climbing. In the language of this metaphor, ideal theorists like John Rawls suggest that mountaineers orient themselves toward Everest while non-idealists like Amartya Sen suggest that knowledge of Everest is irrelevant when comparing the heights of Kilimanjaro and Denali.

### Thesis

I contend this is a debate which can be dissolved. There is no necessary opposition between incrementalism and idealism. Instead, all of these perspectives can be ably unified under the framework of decision theory.

### Dichotomy

Before I can make the argument that it’s a false dichotomy, I need to show that it’s a putative dichotomy. There’s little value in attacking straw men. Since I’ve just read (Gaus 2016), we’ll examine it in detail and assume that it’s representative of the larger discussion.

The boundary that Gaus draws is between worlds in the ‘neighborhood’ of the status quo and those outside it. If we restrict our attention to worlds in the neighborhood, we’re engaging in non-ideal theory, but if we speculate on distant worlds we’re doing ideal theory. What is this key neighborhood concept? In Gaus’s words: “A neighborhood delimits a set of nearby social worlds characterized by relatively similar justice-relevant social structures.”

So we’re already on firm ground for a claim of dichotomous thinking. On this view, the structure of the problem is dichotomous1. But Gaus also demonstrates the dichotomous view when describing the divergent implications of the ideal and non-ideal view:

> [L]ocal optimization often points in a different direction than pursuit of the ideal. We then confront what I have called The Choice: should we turn our back on local optimization and move toward the ideal? [… O]ur judgments within our neighborhood have better warrant than judgments outside of it; if the ideal is outside our current neighborhood, then we are forgoing relatively clear gains in justice for an uncertain prospect that our realistic utopia lies in a different direction. Mill’s revolutionaries2, certain of their own wisdom and judgment, were more than willing to commit society to the pursuit of their vision of the ideal; their hubris had terrible costs for many.

### Similarities

Now, I hope you agree that ideal and non-ideal theory are framed as incompatible. On that assumption, I’ll begin to argue against the dichotomy.

#### Uncertainty all around

I do accept Gaus’s Neighborhood Constraint: our knowledge of distant social worlds is much less reliable than our knowledge of worlds similar to the status quo. Furthermore, I think we have non-trivial uncertainties about the workings and justice of worlds that are nearby. Importantly (though not, I think, crucially), I don’t see any obvious reason for discontinuities in the reliability of our knowledge. My intuition suggests it drops off smoothly with distance from the status quo.

