Wittgenstein, Rule-Following, and Moral Inconsistency
I’ve been thinking a lot lately about a famous argument in philosophy of language tracing back to Wittgenstein in his Philosophical Investigations, and how it might relate to some other topics like decision-theory/meta-ethics. I think it’s changed my outlook a fair bit on things like the limits of Utilitarianism and the importance of moral consistency, although I’m definitely still not settled on this. My thoughts on this are still a bit fuzzy, but I want to write them down both to help clarify them for myself and to hear what others think.
To start with, I think it’s worth detouring through the original argument, because it’s (a) an easier setting in which to “grok” the underlying “schema”, and (b) just a genuinely interesting argument in its own right. Also, a caveat: a lot of people have a lot of different thoughts about what Wittgenstein was actually trying to say in Philosophical Investigations - the argument I’m interested in here is really Saul Kripke’s interpretation of Wittgenstein (creatively named Kripkenstein), which he sets out in Wittgenstein on Rules and Private Language. Whether this interpretation is what Wittgenstein originally meant is unclear but not really relevant here, so we’ll just go ahead and talk about what Kripke’s “version” of Wittgenstein is saying. If you’re familiar with the argument, go ahead and skip to Part 2.
Part 1: The OG Argument
So Kripkenstein is thinking a lot about language. More generally, he’s thinking about what it means to be following a rule/embodying a particular function, with language being a particularly salient example. Intuitively, a major part of what determines the language we are speaking is the set of rules governing its use. The example given is an individual’s use of the word “plus”. When I’ve been asked “3 plus 2” in the past, I’ve said “5”. When I’ve been asked “6 plus 5”, I’ve said “11”. In fact, all of my past usage of the word “plus” seems to be consistent with the idea that the rule I’m following is to implement the addition function (barring mistakes, we’ll come back to those). And we have some intuitive notion that by “plus”, I mean the addition function. Let’s suppose for a second that I’ve never actually used “plus” with numbers greater than 100 (not to brag, but I actually have added up numbers bigger than 100 a couple of times before). We have some intuition that, were I to be asked “107 plus 7” and to reply “5”, I would have used the word wrong, or in some sense not followed the rule I associate with the word correctly. You could think of this as a kind of inner alignment failure - my behaviour is conflicting with a rule I (imperfectly) embody. This is in contrast to a kind of outer alignment failure where e.g. you say I’m using “plus” wrong because you would use it differently. Kripkenstein is trying to make sense of this “inner alignment” thing, and his question is basically: what facts about me pick out which rule I am (imperfectly) embodying?
As an extreme example to probe the issue, Kripkenstein asks how I can tell that the rule I associate with “plus” is in fact addition and not quaddition, Q(x, y), defined as:
Q(x, y) = x + y, if x, y < 100
Q(x, y) = 5, otherwise
This is obviously a bit silly and implausible, but the point is whether it’s coherent at all, not whether it’s plausible. So what do we say to the charge that quaddition and addition are both rules consistent with my behaviour, and that there’s therefore nothing about me that determines what answer I “should” give to the question?
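To make the under-determination concrete, here’s a minimal sketch (my illustration, not anything from Kripke), assuming purely for the sake of the example that every “plus” question I’ve actually answered involved arguments below 100:

```python
# Two candidate rules that agree on every "plus" question I've actually been asked
# (hypothetically assuming all past arguments were below 100).

def addition(x, y):
    return x + y

def quaddition(x, y):
    # Kripkenstein's deviant rule: behaves like addition when both arguments
    # are below 100, and returns 5 everywhere else.
    return x + y if x < 100 and y < 100 else 5

# A made-up record of my past "plus" answers (all arguments below 100).
past_usage = [(3, 2, 5), (6, 5, 11), (47, 21, 68)]

# Both rules fit the historical data perfectly...
assert all(addition(x, y) == ans for x, y, ans in past_usage)
assert all(quaddition(x, y) == ans for x, y, ans in past_usage)

# ...but they diverge on the question I've never been asked.
print(addition(107, 7), quaddition(107, 7))  # 114 vs 5
```

The finite record - and, as we’ll see, even the full set of my error-prone dispositions - can’t by itself privilege one of these functions over the other.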
The obvious answer is that, although I haven’t used “plus” with numbers bigger than 100 before (again, not to sound insecure or anything but I really have), if you were to ask me what “107 plus 7” was, I would say “114” and not “5”. In other words, we can tell what rule is governing my behaviour, not just extensionally from my actual behaviour, but intensionally from all my unrealised behaviour in counterfactual “plus” questions I haven’t encountered yet. The issue with this is that my disposition is not to always implement addition for “plus”. We make mistakes in applying a rule, for lots of mundane reasons. I have on many occasions earnestly given answers in the past to “x plus y” questions which weren’t x+y, and there are infinitely many more possible scenarios in which this would happen. If we just look at the intension of my use of “plus”, it doesn’t correspond to addition at all but to some garbled function which behaves as addition most of the time but sometimes not. So in order to pick out the rule which I associate with “plus”, we need to look at my intension only in those scenarios in which I’m actually implementing the rule properly, and not making mistakes etc. Kripkenstein’s insight is that this is entirely circular - there’s no way to pick out all of the “mistaken” uses of a word without appealing to the rule which the usage is supposed to be defining. There are infinitely many ways to take the totality of my disposition and then idealise away some mistakes, each of which arrives at a different consistent rule. How do we uniquely pick out which idealisation is “correct” without circularity?
Similarly, and maybe more concerningly, on the disposition approach there are many inputs x, y for which my response to “What is x plus y?” seems in some sense undefined! If you ask me “What is BusyBeaver(Graham’s Number) plus TREE(BusyBeaver(TREE(5)))?”, what would my answer be? Well, in any remotely close possible worlds there probably wouldn’t be one, since I’m not going to get anywhere close to making a dent on computing it (not to mention I’ll definitely be making “mistakes” if I do). So ok, let’s idealise me a bit, give me hardware that won’t break down, put me in a universe that won’t die of heat death etc. - surely then we get an answer? The issue, again, is that there are infinitely many ways to “idealise” me into something that can answer these kinds of questions, and different idealisations will give different answers. When we imagine this idealisation naively, we abstract away the things which would cause me to occasionally not answer in line with addition, while keeping other factors fixed. But this is begging the question. We again run into the issue that there is no non-circular way to pick out which infinite idealisation of me corresponds to my “correct” answers. This is also the reason why we can’t rely on something like “the answer I arrive at upon reflection in the limit” to get around the error issue - although it’s true that I usually seem to change my answers to “plus” questions to be more in line with the addition function when I think about things more, this doesn’t always happen - and it certainly doesn’t always happen in finite time.
Another obvious rebuttal is that I just clearly mean addition rather than quaddition because, if you ask me “Do you mean addition by “plus”?”, I’ll say “Duh”, and if you ask me “Do you mean quaddition by “plus”?”, I’ll give you a slightly concerned look and ask if you’re feeling ok. But the issue here is that this response merely pushes the problem back a step. Setting aside the point that I wouldn’t always give those answers to those questions in all possible scenarios, all I’m really saying with this statement is “The rule I follow when you ask me questions about “plus” is the same as the rule I follow when you ask me questions about “addition”.” But then we get the same old issue about whether the rule I follow for correct usage of “addition” is in fact addition! If we could somehow pin down some “bedrock” set of rules that are unambiguously associated with a set of words, then we could build up from there and pin down which rule exactly is correct for “plus” given the correct rules for its constituent parts. But this is the very problem we’re trying to solve in the first place.1
Kripkenstein’s solution to this puzzle - the question of which facts determine which rule I intrinsically associate with a word - is that there aren’t any: it’s an ill-posed question. Nothing about you intrinsically determines that some of your usages of “plus” were faulty and others weren’t. And as a corollary, nothing about you intrinsically determines what you “should” answer to “plus” questions in the future. The concept of making a mistake or misusing a word is an inherently “communal” one - the normative aspect of using words correctly arises from using them around other members of the “linguistic community” and trying to coordinate with them. This is the “outer alignment” type of error we alluded to earlier. Because of course some usages of “plus” are wrong, and of course I “should” be using “plus” in certain ways, but only because I’m interacting with other people using the same words with whom I’m trying to coordinate. In other words, at a really reductive level, when you say someone isn’t following a rule correctly, all you’re doing is censuring their behaviour according to what you believe you would do in their position. There’s no such thing as “internal” alignment failure between an agent and their rule, just an external one between an agent and the agents they interact with. This is why it’s often referred to as the “private language argument” - Wittgenstein is saying that the idea of language doesn’t make sense if there aren’t other entities to be practising it with.
Moreover (and this is the Kripke spin on the argument): nothing is really lost when we realise this - we don’t suddenly find ourselves speaking nonsense or unable to communicate, we just jettison a concept that was doing no work. This is why Kripke calls this solution a “Humean” one. In A Treatise of Human Nature, Hume famously “dissolves” the concept of an irreducible, essential “self” that is the subject of our internal experiences, and finds that we’re no worse off without it - we can still explain everything we could explain beforehand, now just with less metaphysical baggage. Similarly, Kripkenstein argues that we can jettison the concept that there’s some objective fact about what rule I follow/what function a computer instantiates etc., without being any worse off.
Part 2: Who Cares?
So what’s the point of this whole digression anyway? The reason for going through this argument in a bit of detail is that I think the “schema” of it is actually more widely applicable and generalisable than it’s often taken to be. At a high level it looks something like this: we have the idea that there is some kind of abstract thing - a rule or function or program or whatever - which an agent is in some sense fallibly instantiating. We sometimes don’t act according to the rule properly, so the rule is not pinned down just by the set of our dispositions/choices/preferences etc., but by some idealisation of that set, corrected for a set of “mistakes” and extended to a bunch of cases we couldn’t actually apply the rule to in reality. But there are infinitely many different possible idealisations of that set that make it consistent, each of which results in a different consistent rule. And we can’t find anywhere in this “raw data” - this set of all dispositions - facts which determine which idealised consistent rule is the “correct” one. I actually think this schema shows up in a surprising number of places. Decision-theory/meta-ethics seems to me to be an interesting one - in particular, how it relates to the way we think about agents with utility functions:
Utility Functions
We often think of ourselves as in some sense instantiating a utility function. Even if you’re not morally a utilitarian, there are plenty of results in the annals of decision theory (von Neumann & Morgenstern, Savage etc.) which establish that if you satisfy various reasonable-ish axioms, then you must at least be acting as if you’re maximising the expectation of some utility function. So we can kind of already see how the above schema might apply here: we have some intuition that, faced with moral decisions which you’ve never encountered before, there is a “correct” choice to make given your utility function. It seems like, were you to behave differently, you would have in some sense been “misaligned with yourself”. So, analogously to the language argument above: what about you picks out the utility function that you embody?
Well once again, past behaviour isn’t enough to pin things down, and counterfactual/dispositional behaviour seems insufficient because we can and do fail to take the choice which is better given our putative utility function. Part of why ethics is such a fascinating and important discipline is that we seem to have such conflicting and murky intuitions and preferences, meaning picking out a utility function requires some idealisation of our dispositions - which, as we saw above, lets in the whole Kripkenstein schema.
But hold on, you say - what about all those decision-theoretic results? Isn’t the whole appeal of them that we can uniquely pin down a utility function given an agent’s behaviour? There doesn’t seem to be any ambiguity in the utility function spat out by VNM (up to affine transformations at least). The issue is that while these results are unambiguous in their returned utility function, that’s only because the input data is already idealised - we’ve already done the sleight of hand. A prerequisite to being able to back out a utility function from our behaviour with these results is that we are consistent and complete with respect to our preferences. But we’re not - as mentioned above, we have all sorts of incomplete preferences and conflicting dispositions, especially the further out we get from quotidian moral choices. When deriving a utility function from something like VNM, we back it out from the behaviour of some idealised version of the agent - one whose inconsistencies have been abstracted away, one whose dispositions have been extended to cover all possible lotteries, etc. And again, since there are all sorts of counterfactuals in which we would respond to the same lottery differently, the idealisation has chosen a “canonical” correct disposition as the choice you represent - this is where the Kripkenstein circularity sneaks in again. There are infinitely many idealised agents whose choices can be specified by (our actual dispositions + some idealisations and corrections), each of which will yield a different utility function. But which idealisation is the right one? The Kripkenstein response is that there isn’t an answer to this - properly understood, the question doesn’t make sense.
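Here’s a toy sketch of that under-determination (my illustration, not anything from VNM or Savage themselves): suppose my observed pairwise choices contain a cycle. Before anything like a utility function can be read off, the data has to be “repaired” into a consistent ranking, and there are several equally good repairs, each implying a different utility assignment.

```python
# Toy illustration: inconsistent pairwise choices admit multiple equally good
# "idealisations", each of which implies a different (ordinal) utility function.
from itertools import permutations

outcomes = ["A", "B", "C"]

# Hypothetical observed choices, containing a cycle: A over B, B over C, C over A.
observed = [("A", "B"), ("B", "C"), ("C", "A")]

def choices_respected(ranking, choices):
    """How many observed choices a strict best-to-worst ranking agrees with."""
    position = {x: i for i, x in enumerate(ranking)}
    return sum(position[winner] < position[loser] for winner, loser in choices)

# The best any consistent ranking can do is respect two of the three observations...
best = max(choices_respected(r, observed) for r in permutations(outcomes))
idealisations = [r for r in permutations(outcomes)
                 if choices_respected(r, observed) == best]

# ...and there are three distinct ways to do it, each yielding different utilities.
for ranking in idealisations:
    utility = {x: len(ranking) - i for i, x in enumerate(ranking)}
    print(ranking, utility)
```

Nothing in the observed choices themselves says which of these repairs is the “real” me - that’s exactly the gap the Kripkenstein schema points at.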
Maybe this is another “so what”. The Humean solution seems fine in the case of language above: we didn’t seem to lose anything by dropping the idea of having a rule which we objectively associate with a word - certainly nothing seems to change about how we actually go about using language. Does the above similarly not really make a difference to our actual moral decision-making? I think to a large extent yes - I certainly don’t think this is the kind of thing that can or should percolate down to how you actually make moral choices in the real world. But it does maybe change how you think about agents embodying utility functions. Just as there are no facts about whether an agent is using a word “correctly” independent of observer judgement, so too maybe there are no facts about whether I am acting in accordance with my “true” utility function, independent of whether those actions seem coherent to an observer. I also think that these kinds of ideas may have implications on a meta-ethical level, in terms of how we think about our limitations when making moral choices. One way I think the Kripkenstein stuff might actually matter, albeit fairly fuzzily at this stage, is that it might suggest:
We should put less weight on having consistent preferences
One way of looking at Utilitarianism is that there is a utility function of the Good which you are imperfectly implementing. On this view, things like inconsistent/incomplete preferences are Bad - even if you’re not at risk of something like Dutch-booking - because they represent you failing to consistently maximise the Good. There is some ground-truth of what is Good according to the utility function, and you’re messing it up! There’s therefore an imperative to be as consistent as possible, even setting aside the usual pitfalls that things like intransitive preferences can get you into. The Kripkenstein schema kind of flips this on its head: your actual, messy preferences aren’t an imperfect attempt at implementing a utility function; utility functions are just nice idealisations of your actual preferences. This, I think, has interesting implications for the imperative to be morally consistent, which I haven’t really figured out yet. On one hand, there are obviously still bad things that can happen to you for having incomplete/inconsistent preferences, but it’s less clear to me now that there’s an imperative to fix this. Real-life inconsistent preferences are pretty complicated and dynamic - for example, you might not actually be able to be money-pumped indefinitely with intransitive preferences, because your preferences around A, B, C change conditional on having already made trades around them. Or maybe more simply, you also have a conflicting preference not to spend money for nothing, and can foresee that you’re going to get pumped, so decide not to take any of the bets.
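As a toy illustration of that last point (mine, with made-up numbers), here’s a cyclically-preferring agent being money-pumped, and the same agent with a countervailing preference about cumulative losses that cuts the pump short:

```python
# Toy money-pump sketch: cyclic preferences get exploited indefinitely unless
# some other preference (here, a cap on fees paid for going in circles) kicks in.

# Cyclic pairwise preferences: prefers B to A, C to B, and A to C.
prefers = {("B", "A"), ("C", "B"), ("A", "C")}

def run_money_pump(fee_limit, swap_fee=1.0, max_rounds=100):
    holding, fees_paid = "A", 0.0
    offers = ["B", "C", "A"] * max_rounds  # the pump cycles the same three trades
    for offered in offers:
        if (offered, holding) not in prefers:
            continue  # decline swaps the agent doesn't prefer
        if fees_paid + swap_fee > fee_limit:
            break     # countervailing preference: stop paying money for nothing
        holding, fees_paid = offered, fees_paid + swap_fee
    return fees_paid

print(run_money_pump(fee_limit=float("inf"), max_rounds=10))  # pumped every round: 30.0
print(run_money_pump(fee_limit=3.0))                          # stops early: 3.0
```

Obviously real preferences are far messier than this, but it shows how a second, stronger preference - rather than some uniquely correct idealisation - can be what stops the damage.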
This is not to say you shouldn’t try to make your choices more consistent, only that the way you do so is maybe actually pretty under-determined, and a function of how strongly you weigh up various preferences rather than of a utility function determining which choices are out of line. One interesting angle on this might be that although there’s no imperative to address inconsistencies, we will prefer to do so for inconsistencies that are likely to be surfaced in damaging ways, because we have other, stronger preferences to prevent that damage. These kinds of pragmatic, empirical concerns are - I think - this topic’s analogue of the social/linguistic-community coordination points above for language: the external forces that make our behaviour look more or less “rule-following”. An upshot would be that the fact that we have conflicting and ill-defined preferences over abstruse ethical thought experiments is not as big a deal as it might seem - there are no intrinsic facts about me that pick out a utility function which adjudicates one way or another.
I think pretty salient examples of where this might be relevant are things like infinite ethics and population ethics. Both are really murky fields with lots of plausible-sounding principles that end up being mutually inconsistent, and with very conflicting intuitions and preferences about thought experiments. I think the Kripkenstein-y response here is to kind of just bite the bullet and accept that there’s no normative fact-of-the-matter about how you should resolve these inconsistencies. You have a set of dispositions and preferences and intuitions; that’s the ground-truth - which consistent rule/function this should be idealised/abstracted to is underdetermined. Given how “far away” these kinds of hypotheticals are from impacting your actual decisions2, it seems like the pressure to bring your preferences here “in line” is a lot weaker, and so I think Kripkenstein says there’s not really much to say about these questions until the “cost” of being inconsistent about them sufficiently incentivises you to overrule some preferences. If true, I think this has interesting implications in general for how much weight you should put on armchair thought-experiments when it comes to revealing/crafting moral principles. There’s an interesting tension here, though, with the fact that thought experiments can definitely elucidate your own preferences for you. It certainly feels like hearing a Singer-esque thought experiment changes your real-world preferences in a way that doesn’t “feel” arbitrary to you, but maybe this is a special case of you having two preferences which are very different in strength, and thus easy to adjudicate between.
What else?
I have a sense that this whole Kripkenstein “schema” shows up in quite a few other places too - most relevantly to the above, I think the subjective-credence side of a lot of the decision-theoretic results is also susceptible. Our dispositions qua probabilistic judgements aren’t consistent and complete, and so we need some idealised set of our choices to plug into e.g. Savage’s theorem. I think we can run a schematically similar argument on this in the obvious way. This is a case whose consequences (if any) I feel a bit shakier on, though, especially given that the notion of subjective probability might be a bit under-defined to start with. I would like to explore this more in another post once I’ve cleared it up in my head, but in the meantime I’d be very interested to hear people’s thoughts.
1. It’s worth pointing out here that this is a much subtler point than the superficially similar-sounding and more common claim that words don’t have “intrinsic meaning”. That’s a much more well-trodden argument that goes back to at least Locke in An Essay Concerning Human Understanding. Kripkenstein’s argument here assumes a much less mysterious account where “meaning” is just the concept/rule/function you personally associate with a word - his point is that even this is radically under-determined!
2. There’s an interesting sense in which none of these conflicts are particularly “far away” from your decision-making, because thinking about them as hypotheticals is itself a path for them to influence your decisions! I suppose what we really want here is some “impact-weighted” metric: the expected amount by which a set of inconsistent preferences will negatively impact you.
