We present a new argument against many forms of moral and prudential value incomparability. Our argument relies on two central principles: (i) a weak "negative dominance" principle, to the effect that Lottery 1 can be better than Lottery 2 only if some possible outcome of Lottery 1 is better than some possible outcome of Lottery 2, and (ii) a weak form of ex ante Pareto, to the effect that, if Lottery 1 gives an unambiguously better prospect to some individuals than Lottery 2 (satisfying a restricted form of stochastic dominance), and equally good prospects to everyone else, then Lottery 1 is better than Lottery 2. Given modest auxiliary assumptions, these two principles rule out incomparability in the prudential ranking of individual lives, and many forms of incomparability in the moral rankings of outcomes and lotteries. (preprint)
Expected value maximization gives plausible guidance for moral decision-making under uncertainty in many ordinary cases, but has extremely unappetizing implications in "Pascalian" cases involving tiny probabilities of extreme outcomes. In this paper, I show that under sufficient background uncertainty about evaluatively significant features of the world independent of an agent's choice, expected-value-maximizing prospects will typically be stochastically dominant -- except when their expectational superiority depends on very small probabilities. Stochastic dominance therefore lets us draw a principled line between "ordinary" and "Pascalian" choice situations, providing a powerful justification for expected value maximization in the former context while permitting deviations from it in the latter. Drawing this distinction is incompatible with an in-principle commitment to maximizing expected value, but does not require too radical a departure from decision-theoretic orthodoxy: it is compatible, for instance, with the view that moral agents must maximize the expectation of a utility function that is an increasing function of moral value. (online first - open access)
Large language models now possess human-level linguistic abilities in many contexts. This raises the concern that they can be used to deceive and manipulate on a large scale, for instance spreading political misinformation on social media. In the longer term, agential AI systems might also deceive and manipulate humans to achieve their own goals, for instance convincing humans to connect them to the internet. This paper aims, first, to clearly characterize deception and manipulation in AI. I argue that these concepts must encompass more than literal falsehood, must not depend on controversial attributions of mental states to AI systems, and must not rely on third-party judgments of what human users rationally ought to believe or do. This leads to a characterization of deceptive and manipulative behavior as, roughly, any behavior that leads human users away from the beliefs they would form and the choices they would make under "semi-ideal" conditions in which they know all relevant available information and have unlimited time for deliberation. Second, based on this characterization, I suggest some policy and technical measures that might protect against deceptive and manipulative AI systems. These include requiring content creators to disclose the specific model and prompt used to generate content, and training "defensive" systems that detect misleading output and contextualize AI-generated statements with relevant information for users. Finally, I consider to what extent these methods will guard against deceptive behavior in future, agentic AI systems. I argue that non-agentic defensive systems can provide a useful layer of defense even against more powerful agentic systems. (preprint; online first)
Should you be willing to forego any sure good for a tiny probability of a vastly greater good? Fanatics say you should, anti-fanatics say you should not. Anti-fanaticism has great intuitive appeal. But, I argue, these intuitions are untenable, because satisfying them in their full generality is incompatible with three very plausible principles: acyclicity, a minimal dominance principle, and the principle that any outcome can be made better or worse. This argument against anti-fanaticism can be turned into a positive argument for a weak version of fanaticism, but only from significantly more contentious premises. In combination, these facts suggest that those who find fanaticism counterintuitive should favor not anti-fanaticism, but an intermediate position that permits agents to have incomplete preferences that are neither fanatical nor anti-fanatical. (penultimate draft; online first)
Is the overall value of a world just the sum of values contributed by each value-bearing entity in that world? Additively separable axiologies (like total utilitarianism, prioritarianism, and critical level views) say 'yes', but non-additive axiologies (like average utilitarianism, rank-discounted utilitarianism, and variable value views) say 'no'. This distinction is practically important: additive axiologies support 'arguments from astronomical scale' which suggest (among other things) that it is overwhelmingly important for humanity to avoid premature extinction and ensure the existence of a large future population, while non-additive axiologies need not. We show, however, that when there is a large enough 'background population' unaffected by our choices, a wide range of non-additive axiologies converge in their implications with some additive axiology -- for instance, average utilitarianism converges to critical-level utilitarianism and various egalitarian theories converge to prioritarianism. We further argue that real-world background populations may be large enough to make these limit results practically significant. This means that arguments from astronomical scale, and other arguments in practical ethics that seem to presuppose additive separability, may be truth-preserving in practice whether or not we accept additive separability as a basic axiological principle. (published article - open access)
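The convergence of average utilitarianism toward a critical-level view can be seen in a toy calculation (my own illustrative numbers, not the paper's): with a large background population, the change in average welfare from adding one life is approximately (w - a) / N, so the sign of the change tracks whether the new welfare level w exceeds the background average a -- just as a critical-level view with critical level a would say.

```python
# Toy illustration (not from the paper): with a large background population,
# average utilitarianism evaluates adding a life roughly like critical-level
# utilitarianism with the critical level set at the background average welfare.

def change_from_adding(background_avg, background_size, new_welfare):
    """Change in average welfare from adding one person with the given welfare."""
    before = background_avg
    after = (background_avg * background_size + new_welfare) / (background_size + 1)
    return after - before

N = 10**9   # hypothetical background population size
a = 5.0     # hypothetical background average welfare

# A life above the background average raises the average (counts as good);
# a life below it lowers the average (counts as bad).
assert change_from_adding(a, N, 7.0) > 0
assert change_from_adding(a, N, 3.0) < 0

# For large N the change is approximately (w - a) / N:
approx = (7.0 - a) / N
exact = change_from_adding(a, N, 7.0)
assert abs(exact - approx) / approx < 1e-6
```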
How should you decide what to do when you're uncertain about basic normative principles? A natural suggestion is to follow some "second-order" norm: e.g., obey the most probable norm or maximize expected choiceworthiness. But what if you're uncertain about second-order norms too -- must you then invoke some third-order norm? If so, any norm-guided response to normative uncertainty appears doomed to a vicious regress. This paper aims to rescue second-order norms from the threat of regress. I first elaborate and defend a claim some philosophers have made: that the regress problem forces us to accept normative externalism, the view that at least one norm is incumbent on all agents regardless of their normative beliefs. But, I then argue, we need not accept externalism about first-order norms, which would close off any question of how agents should respond to normative uncertainty. Rather, we can head off the threat of regress by ascribing external force to a single second-order norm: the enkratic principle. (penultimate draft; published article)
Grill (2023) defends the Sum of Averages View (SAV), on which the value of a population is found by summing the average welfare of each generation or birth cohort. A major advantage of SAV, according to Grill, is that it escapes the Egyptology objection to average utilitarianism. But, we argue, SAV escapes only the most literal understanding of this objection, since it still allows the value of adding a life to depend on facts about other, intuitively irrelevant lives. Moreover, SAV has a decisive drawback not shared by either average or total utilitarianism: it can evaluate an outcome in which every individual is worse off as better overall, even when exactly the same people exist in both outcomes. These problems, we argue, afflict not only Grill's view but any view that uses a sum of subpopulation averages, apart from the limiting cases of average and total utilitarianism. (penultimate draft; published article)
Longtermism holds that what we ought to do is mainly determined by effects on the far future. A natural objection to longtermism is that these effects may be nearly impossible to predict -- perhaps so close to impossible that, despite the astronomical importance of the far future, differences in expected value between our present options are mainly determined by relatively short-term considerations. This paper aims to precisify and evaluate (a version of) this epistemic objection. To that end, I develop a simple model for comparing "longtermist" and "short-termist" interventions, incorporating the idea that, as we look further into the future, the effects of any present intervention become progressively harder to predict. The model yields mixed conclusions: If we simply aim to maximize expected value, and don't mind premising our choices on minuscule probabilities of astronomical payoffs, the case for longtermism looks robust. But on some prima facie plausible empirical worldviews, the expectational superiority of longtermist interventions depends heavily on these "Pascalian" probabilities. So the case for longtermism depends to some extent either on plausible but non-obvious empirical claims or on a tolerance for Pascalian fanaticism. (penultimate draft; published article)
Average utilitarianism and several related axiologies, when paired with the standard expectational theory of decision-making under risk and with reasonable empirical credences, can find their practical prescriptions overwhelmingly determined by the minuscule probability that the agent assigns to solipsism -- i.e., to the hypothesis that there is only one welfare subject in the world, viz., herself. This either (i) constitutes a reductio of these axiologies, (ii) suggests that they require bespoke decision theories, or (iii) furnishes a novel argument for ethical egoism. (published article - open access)
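The mechanism behind this "solipsistic swamping" can be made vivid with a back-of-the-envelope calculation (illustrative numbers of my own, not taken from the paper): under solipsism the agent's own welfare just is the world average, so even a tiny credence in solipsism gives self-regarding benefits an enormous expected-value advantage.

```python
# Toy numbers (mine, not the paper's): a minuscule credence in solipsism can
# dominate average utilitarianism's expected-value comparisons.

p_solipsism = 1e-6   # agent's credence that she is the only welfare subject
n_people = 10**10    # hypothetical world population if solipsism is false

# Option A: benefit yourself by 1 unit of welfare.
# Under solipsism you are the whole population, so the average rises by 1;
# otherwise the average rises by only 1 / n_people.
ev_self = p_solipsism * 1.0 + (1 - p_solipsism) * (1.0 / n_people)

# Option B: benefit a stranger by 1 unit.
# Under solipsism the stranger does not exist, so the average is unchanged.
ev_stranger = p_solipsism * 0.0 + (1 - p_solipsism) * (1.0 / n_people)

# The solipsism term swamps the comparison: helping yourself looks
# thousands of times better in expectation.
assert ev_self / ev_stranger > 1000
```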
Empirical work has lately confirmed what many philosophers have taken to be true: people are 'biased toward the future'. All else being equal, we usually prefer to have positive experiences in the future, and negative experiences in the past. According to one family of hypotheses, future-bias is explained either by our (tacit) beliefs about temporal metaphysics -- the temporal belief hypothesis -- or by our temporal phenomenology -- the temporal phenomenology hypothesis. We empirically investigate a particular version of the temporal belief hypothesis according to which future-bias is explained by the belief that time robustly passes. Our results do not match the apparent predictions of this hypothesis, and so provide evidence against it. But we also find that people give more future-biased responses when asked to simulate a belief in robust passage. We take this to suggest that the phenomenology that attends simulation of that belief may be partially responsible for future-bias, and we examine the implications of these results for debates about the rationality of future-bias. (penultimate draft; published article)
People are "biased toward the future": all else being equal, we typically prefer to have positive experiences in the future, and negative experiences in the past. Several explanations have been suggested for this pattern of preferences. Adjudicating among these explanations can, among other things, shed light on the rationality of future-bias: For instance, if our preferences are explained by unjustified beliefs or an illusory phenomenology, we might conclude that they are irrational. This paper investigates one hypothesis, according to which future-bias is (at least partially) explained by our having a phenomenology that we describe, or conceive of, as being as of time robustly passing. We empirically tested this hypothesis and found no evidence in its favour. Our results present a puzzle, however, when compared with the results of an earlier study. We conclude that although robust passage phenomenology on its own probably does not explain future-bias, having this phenomenology and taking it to be veridical may contribute to future-bias. (published article - open access)
Decision-making under normative uncertainty requires an agent to aggregate the assessments of options given by rival normative theories into a single assessment that tells her what to do in light of her uncertainty. But what if the assessments of rival theories differ not just in their content but in their structure -- e.g., some are merely ordinal while others are cardinal? This paper describes and evaluates three general approaches to this "problem of structural diversity": structural enrichment, structural depletion, and multi-stage aggregation. All three approaches have notable drawbacks, but I tentatively defend multi-stage aggregation as the least bad of the three. (penultimate draft; published article)
Philosophers have long noted, and empirical psychology has lately confirmed, that most people are “biased toward the future”: we prefer to have positive experiences in the future, and negative experiences in the past. At least two explanations have been offered for this bias: (1) belief in temporal passage (or related theses in temporal metaphysics) and (2) the practical irrelevance of the past resulting from our inability to influence past events. We set out to test the latter explanation. In a large survey (n = 1462), we find that participants exhibit significantly less future bias when asked to consider scenarios where they can affect their own past experiences. This supports the “practical irrelevance” explanation of future bias. It also suggests that future bias is not an inflexible preference hardwired by evolution, but results from a more general disposition to “accept the things we cannot change”. However, participants still exhibited substantial future bias in scenarios in which they could affect the past, leaving room for complementary explanations. Beyond the main finding, our results also indicate that future bias is stake-sensitive (i.e., that at least some people discount past experience rather than disregarding it entirely) and that participants endorse the normative correctness of their future-biased preferences and choices. In combination, these results shed light on philosophical debates over the rationality of future bias, suggesting that it may be a rational (reasons-responsive) response to empirical realities rather than a brute, arational disposition. (penultimate draft; published article)
In "Normative Uncertainty as a Voting Problem", William MacAskill argues that positive credence in ordinal-structured or intertheoretically incomparable normative theories does not prevent an agent from rationally accounting for her normative uncertainties in practical deliberation. Rather, such an agent can aggregate the theories in which she has positive credence by methods borrowed from voting theory -- specifically, MacAskill suggests, by a kind of weighted Borda count. The appeal to voting methods opens up a promising new avenue for theories of rational choice under normative uncertainty. The Borda rule, however, is open to at least two serious objections: First, it seems to implicitly "cardinalize" ordinal theories, and so does not fully face up to the problem of merely ordinal theories. Second, the Borda rule faces a problem of option individuation. MacAskill attempts to solve this problem by invoking a measure on the set of practical options. But it is unclear that there is any natural way of defining such a measure that will not make the output of the Borda rule implausibly sensitive to irrelevant empirical features of decision-situations. After developing these objections, I suggest an alternative: the McKelvey uncovered set, a Condorcet method that selects all and only the maximal options under a strong pairwise defeat relation. This decision rule has several advantages over Borda and mostly avoids the force of MacAskill's objection to Condorcet methods in general. (published article)
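The voting-theoretic machinery can be sketched in a few lines of code. This is my own toy implementation of credence-weighted pairwise majorities and one common definition of the uncovered set, not the paper's formal construction: it treats each theory as a voter whose weight is the agent's credence in it.

```python
# Rough sketch (toy implementation, not the paper's formal definitions):
# credence-weighted pairwise majorities over options, and the uncovered set.
# Each theory is a pair (credence, ranking), where ranking is a best-to-worst list.

def beats(x, y, theories):
    """x beats y iff theories ranking x above y carry more credence."""
    support = sum(c for c, ranking in theories if ranking.index(x) < ranking.index(y))
    against = sum(c for c, ranking in theories if ranking.index(y) < ranking.index(x))
    return support > against

def uncovered_set(options, theories):
    """y covers x iff y beats x and everything that beats y also beats x."""
    def covers(y, x):
        return beats(y, x, theories) and all(
            beats(z, x, theories) for z in options if beats(z, y, theories))
    return [x for x in options if not any(covers(y, x) for y in options if y != x)]

options = ['a', 'b', 'c']

# Profile with a Condorcet winner 'a': the uncovered set is {'a'}.
winner_profile = [(0.5, ['a', 'b', 'c']), (0.3, ['a', 'c', 'b']), (0.2, ['b', 'c', 'a'])]
assert uncovered_set(options, winner_profile) == ['a']

# Majority cycle a > b > c > a: nothing is covered, so all options survive.
cycle_profile = [(0.4, ['a', 'b', 'c']), (0.35, ['b', 'c', 'a']), (0.25, ['c', 'a', 'b'])]
assert uncovered_set(options, cycle_profile) == ['a', 'b', 'c']
```

The second profile illustrates why uncovered-set methods tolerate majority cycles gracefully: rather than forcing a winner, they return every maximal option.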
In "Rejecting Ethical Deflationism," Jacob Ross argues that any positive degree of belief in an ordinary moral theory like Kantianism or utilitarianism is sufficient grounds to reject moral nihilism and other "absolutely deflationary" theories as irrelevant to practical deliberation, since dominance reasoning over moral theories will always support the course of action that one's non-nihilistic beliefs recommend. I argue, for analogous reasons, that whatever one's degree of belief in supererogationist moral theories, those beliefs are generally irrelevant to practical deliberation, so long as one believes to some positive degree that the actions the theory designates as supererogatory might be morally obligatory. This argument faces two important objections that parallel objections to Ross's argument and illustrate general worries for dominance-based approaches to decision-making under moral uncertainty. But I argue that these objections can be overcome, and that they threaten the dominance argument for rejecting supererogationism less than they threaten the parallel argument for rejecting nihilism. I conclude by drawing out some practical implications for the ethics of philanthropy: In particular, that if an agent judges that she is certainly at least permitted to commit a given sum of money to philanthropic ends (rather than, say, being under special obligations to friends or family), then she is rationally required to donate in a way that maximizes the expected value of her donation. (published article - open access)
Deontological moral theories face a challenge: how should an agent decide what to do when she is uncertain whether some course of action would violate a deontological constraint? Several philosophers have advocated a threshold approach on which it is subjectively permissible to act iff the agent's credence that her action would be constraint-violating is below some threshold t. But the threshold approach has struck critics as arbitrary and unmotivated, and appears to violate the highly intuitive principle of "ought" agglomeration. In this paper, I argue that stochastic dominance reasoning can vindicate and lend rigor to the threshold approach: given characteristically deontological assumptions about the moral value of acts, it turns out that morally safe options will stochastically dominate morally risky alternatives when and only when the likelihood that the risky action violates a moral constraint is greater than some precisely definable threshold (in the simplest case, .5). I also show that this approach offers a principled means of preserving "ought" agglomeration. (published article - open access)
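The core dominance comparison can be illustrated with a small checker (my own toy numbers and setup, not the paper's formal model): suppose, in a characteristically deontological spirit, that any constraint-violating outcome is worse than any permissible outcome, and that two acts share the same conditional value spread but differ in their probability of violation. Then the act with the lower violation probability first-order stochastically dominates the other.

```python
# Illustrative sketch (toy numbers, not the paper's formal setup): checking
# first-order stochastic dominance between two acts that share the same
# conditional outcome values but differ in violation probability.

def cdf(dist, t):
    """P(value <= t) for a discrete distribution {value: probability}."""
    return sum(p for v, p in dist.items() if v <= t)

def stochastically_dominates(d1, d2):
    """d1 weakly dominates d2: d1's CDF is nowhere higher than d2's."""
    points = set(d1) | set(d2)
    return all(cdf(d1, t) <= cdf(d2, t) for t in points)

def act(p_violation):
    # Hypothetical deontological value assumption: any violating outcome
    # (-100 or -90) is worse than any permissible outcome (0 or 10), with
    # the same conditional value spread for both acts.
    return {-100: 0.5 * p_violation, -90: 0.5 * p_violation,
            0: 0.5 * (1 - p_violation), 10: 0.5 * (1 - p_violation)}

safe, risky = act(0.2), act(0.6)
assert stochastically_dominates(safe, risky)      # lower violation risk dominates
assert not stochastically_dominates(risky, safe)
```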
In the growing literature on decision-making under moral uncertainty, a number of skeptics have argued that there is an insuperable barrier to rational "hedging" for the risk of moral error, namely the apparent incomparability of moral reasons given by rival theories like Kantianism and utilitarianism. Various general theories of intertheoretic value comparison have been proposed to meet this objection, but each suffers from apparently fatal flaws. In this paper, I propose a more modest approach that aims to identify classes of moral theories that share common principles strong enough to establish bases for intertheoretic comparison. I show that, contra the claims of skeptics, there are often rationally perspicuous grounds for precise, quantitative value comparisons within such classes. In light of this fact, I argue, the existence of some apparent incomparabilities between widely divergent moral theories cannot serve as a general argument against hedging for one's moral uncertainties. (penultimate draft; published article)
I describe a thought experiment in which an agent must choose between suffering a greater pain in the past or a lesser pain in the future. This case demonstrates that the "temporal value asymmetry" -- our disposition to attribute greater significance to future pleasures and pains than to past ones -- can have consequences for the rationality of actions as well as the rationality of attitudes. This fact, I argue, blocks attempts to vindicate the temporal value asymmetry as a useful heuristic tied to the asymmetry of causation. Since the two prominent arguments that have been offered for the rationality of the temporal value asymmetry appeal to causal asymmetry and the passage of time respectively, the failure of the causal asymmetry explanation suggests that the B-theory, which rejects temporal passage, has substantial revisionary implications concerning our attitudes toward past and future experience. (penultimate draft; published article)
I argue that the use of a social discount rate to assess the costs and benefits of policy responses to climate change is unhelpful and misleading. I consider two lines of justification for discounting, one ethical and the other economic, connected to the two terms of the standard formula for the discount rate. Concerning the former, I examine some arguments recently put forward by Joseph Heath and others for a "pure rate of time preference" and conclude that they fail to overcome standard ethical arguments for temporal neutrality. Concerning the latter, I consider whether the standard economic rationale for discounting, based on the diminishing marginal utility of consumption, is relevant to the specific costs and benefits at stake in climate policy. I argue that it is not, since the unusually long time horizons and nature of the costs and benefits in the climate context mean that a great many of the idealizing assumptions required by this economic rationale do not adequately approximate the underlying reality. The unifying theme of my objections is that all extant rationales for time discounting, both ethical and economic, justify it only as a proxy for normative concerns that have no intrinsic connection to the passage of time. In consequence, for any proposed application of a discount rate to ethical or public policy questions, it must be asked whether that approximation is useful to the case at hand. Where it is not, other means must be found to represent the concerns that motivate discounting, and in the concluding section I sketch such an alternative for the case of climate change. (penultimate draft; published article)
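The "standard formula" in question is the Ramsey rule, whose two terms correspond to the two lines of justification: r = delta + eta * g, where delta is the pure rate of time preference (the ethical term) and eta * g reflects diminishing marginal utility of consumption under growth g (the economic term). A quick sketch with illustrative parameter values of my own:

```python
# The Ramsey social discount rate: r = delta + eta * g.
# delta: pure rate of time preference (the ethical term)
# eta:   elasticity of marginal utility of consumption
# g:     expected consumption growth rate
# (Parameter values below are illustrative, not taken from the paper.)

def ramsey_discount_rate(delta, eta, g):
    """Pure time preference plus elasticity times growth."""
    return delta + eta * g

# Temporal neutrality (delta = 0) with eta = 1.5 and 2% growth gives 3%:
assert abs(ramsey_discount_rate(0.0, 1.5, 0.02) - 0.03) < 1e-12
# Adding a 1% pure rate of time preference raises the rate to 4%:
assert abs(ramsey_discount_rate(0.01, 1.5, 0.02) - 0.04) < 1e-12
```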
The standard case for longtermism focuses on a small set of risks to the far future, and argues that in a small set of choice situations, the present marginal value of mitigating those risks is very great. But many longtermists are attracted to, and many critics of longtermism worried by, a farther-reaching form of longtermism. According to this farther-reaching form, there are many ways of improving the far future, which determine the value of our options in all or nearly all choice situations, and will continue to do so over the coming decades even if we make substantial investments in longtermist priorities. This chapter highlights the gap between the minimal form of longtermism established by standard arguments and this more expansive view, and considers (without reaching any firm conclusions) which form of longtermism is more plausible. (working paper)
The case for longtermism depends on the vast potential scale of the future. But that same vastness also threatens to undermine the case for longtermism: If the future contains infinite value, then many theories of value that support longtermism (e.g., risk-neutral total utilitarianism) seem to imply that no available action is better than any other. And some strategies for avoiding this conclusion (e.g., exponential time discounting) yield views that are much less supportive of longtermism. This chapter explores how the potential infinitude of the future affects the case for longtermism. We argue that (i) there are reasonable prospects for extending risk-neutral totalism and similar views to infinite contexts and (ii) many such extension strategies still support standard arguments for longtermism, since they imply that when we can only affect (or only predictably affect) a finite part of an infinite universe, we can reason as if only that finite part existed. On the other hand, (iii) there are improbable but not impossible physical scenarios in which our actions can have infinite predictable effects on the far future, and these scenarios create substantial unresolved problems for both infinite ethics and the case for longtermism. (working paper)
A survey of central philosophical topics and questions related to moral decision-making under uncertainty, including objective vs subjective oughts, the role of expected utility theory in ethics, aggregation theorems, the problem of cluelessness, agent-centered constraints under risk, uncertainty about basic moral principles, and small probabilities of extreme outcomes. (published article - open access)
All else being equal, most of us typically prefer to have positive experiences in the future rather than the past and negative experiences in the past rather than the future. Recent empirical evidence tends not only to support the idea that people have these preferences, but further, that people tend to prefer more painful experiences in their past rather than fewer in their future (and mutatis mutandis for pleasant experiences). Are such preferences rationally permissible, or are they, as time-neutralists contend, rationally impermissible? And what is it that grounds their having the normative status that they do have? We consider two sorts of arguments regarding the normative status of future-biased preferences. The first appeals to the supposed arbitrariness of these preferences, and the second appeals to their upshot. We evaluate these arguments in light of the recent empirical research on future-bias. (published article - open access)
A review of Brian Weatherson’s book Normative Externalism, with particular focus on his discussion of moral uncertainty. (published version - open access)
Famous problems in variable-population welfare economics have led some to suggest that social welfare comparisons over such populations may be incomplete. In the theory of rational choice with incomplete preferences, attention has recently centered on the Expected Multi-Utility framework, which permits incompleteness but preserves vNM independence and can be derived from weak, attractive axioms. Here, we apply this framework to variable-population welfare economics. We show that Expected Multi-Utility for social preferences, combined with a stochastic ex-ante-Pareto-type axiom, characterizes Expected Critical-Set Generalized Utilitarianism, in the presence of basic axioms. The further addition of Negative Dominance, an axiom recently introduced to the philosophy literature, yields a characterization of Expected Critical-Level Generalized Utilitarianism. (latest draft)
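The Expected Multi-Utility idea can be sketched concretely (a minimal toy example of my own construction, not the paper's formal framework): a lottery is weakly preferred to another just in case its expected utility is at least as high under every utility function in a given set, so some pairs of lotteries come out incomparable.

```python
# Minimal sketch (toy example, my construction) of the Expected Multi-Utility
# representation of incomplete preferences: L1 is weakly preferred to L2 iff
# its expected utility is at least as high under every function in the set.

def expected_utility(lottery, u):
    """lottery: list of (probability, outcome) pairs; u: utility function."""
    return sum(p * u(x) for p, x in lottery)

def multi_utility_weakly_prefers(l1, l2, utility_set):
    return all(expected_utility(l1, u) >= expected_utility(l2, u)
               for u in utility_set)

# Two hypothetical utility functions that disagree about outcome 'b':
us = [lambda x: {'a': 0, 'b': 1, 'c': 4}[x],
      lambda x: {'a': 0, 'b': 3, 'c': 2}[x]]

sure_b = [(1.0, 'b')]                 # EU = 1 under the first, 3 under the second
coin_ac = [(0.5, 'a'), (0.5, 'c')]    # EU = 2 under the first, 1 under the second

# Neither lottery is weakly preferred to the other: they are incomparable.
assert not multi_utility_weakly_prefers(sure_b, coin_ac, us)
assert not multi_utility_weakly_prefers(coin_ac, sure_b, us)
```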
Researchers worried about catastrophic risks from advanced AI have argued that we should expect sufficiently capable AI agents to pursue power over humanity because power is a convergent instrumental goal, something that is useful for a wide range of final goals. Others have recently expressed skepticism of these claims. This paper aims to formalize the concepts of instrumental convergence and power-seeking in an abstract, decision-theoretic setting, and assess the claim that power is a convergent instrumental goal. I conclude that this claim contains at least an element of truth, but might turn out to have limited predictive utility, since an agent’s options cannot always be ranked in terms of power in the absence of substantive information about the agent’s final goals. However, the fact of instrumental convergence is more predictive for agents who have a good shot at attaining absolute or near-absolute power. (latest draft)