In Memory of Anthony Brueckner

I’m deeply saddened by the passing of my friend and mentor, Anthony Brueckner. He inspired me to do philosophy. He inspired me to explore my interests. He came alongside me in that exploration, and he guided me to a clearer and more streamlined expression of my own thoughts and ideas. I’m going to miss his anecdotes, relating a philosophical point to something from pop culture, often a quote from a movie or a song. I’m going to miss hearing him coming down the hall of the department, continually clearing his throat, knowing that Tony B. was near. I’m going to miss seeing him wearing that worn brown leather jacket, carrying that worn leather briefcase. But, most of all, I’m going to miss talking to him about philosophy. If I have a philosophical hero, it is Tony Brueckner. He provided me, and so many others, with a model of intellectual humility, intellectual honesty, generosity with one’s time, and rigorous attention to detail. As he says in his monograph (2010: 5), “As one often finds in philosophy, the devil is in the details.” I’m going to miss you, Tony B.

The Principle of Indifference and Epistemic Reasons

The Principle of Indifference (PoI) is plausibly defined as follows:

  • (PoI): Each member of a set of propositions should be assigned the same probability (of truth) in the absence of any reason to assign them different probabilities. (Castell 1998: 387)

(PoI) offers a principled way to assign probabilities in situations of epistemic ignorance. When you have no reason to assign probabilities to a set of propositions in one way rather than another, (PoI) instructs you to assign a uniform distribution of probabilities across the propositions in the partition. Despite being a principled way to assign probabilities under ignorance, (PoI) faces problems.
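In code, (PoI)’s recommendation is just a uniform assignment over the partition. Here is a minimal Python sketch; the function name and the die-roll partition are my own illustration, not from Castell:

```python
from fractions import Fraction

def indifference_distribution(partition):
    """Assign each proposition in the partition the same probability,
    as (PoI) recommends in a state of epistemic ignorance."""
    return {proposition: Fraction(1, len(partition)) for proposition in partition}

# Six mutually exclusive, jointly exhaustive propositions about a die roll,
# with no reason to favor any one of them:
dist = indifference_distribution(["one", "two", "three", "four", "five", "six"])
print(dist["three"])  # 1/6
```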

A well-known problem with (PoI) is Bertrand’s paradox. The upshot of Bertrand’s paradox is that unless (PoI) is somehow restricted, it results in inconsistent assignments of probabilities to the same event. Equally valid ways of carving up the outcome space (i.e., the propositions in the partition) result in (PoI) assigning different uniform distributions to the same event. How the outcome space is described changes the probability value (PoI) recommends. (PoI) is description-dependent and inconsistent as a result. A lesser-known problem with (PoI) involves the reliance on the notion of a ‘reason’ in the definition of (PoI) (i.e., “in the absence of any reason…”). This is the issue that I want to explore.
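The description-dependence can be made vivid with a toy discrete analogue (this card example is my own illustration, not Bertrand’s original chord case): the same proposition receives different uniform-distribution values under two equally valid carvings of the outcome space.

```python
from fractions import Fraction

def uniform(partition):
    """(PoI): spread probability evenly over the propositions in a partition."""
    return {p: Fraction(1, len(partition)) for p in partition}

# One and the same event -- "the drawn card is an ace" -- under two carvings:
coarse = uniform(["ace", "not an ace"])
fine = uniform(["ace", "king", "queen", "jack", "ten", "nine", "eight",
                "seven", "six", "five", "four", "three", "two"])

print(coarse["ace"])  # 1/2
print(fine["ace"])    # 1/13
```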

As Paul Castell (1998: 388) points out, you can generate different iterations of (PoI) based on the strength that you assign to the notion of an epistemic reason. How strong does a consideration in favor of believing that p have to be in order to count as a reason to believe that p?

Epistemic reasons can vary in strength. I might believe that Sheriff Chance is corrupt because I saw her take a bribe from an ex-convict named Stumpy, or I might believe that Sheriff Chance is corrupt because I suspect that she is corrupt. Assuming my suspicion is merely a suspicion, and not based on solid evidence, the former reason to believe that the Sheriff is corrupt is stronger than the latter reason.

In general, adopting a weaker understanding of ‘reason’ generates a stronger (i.e., more stringent) version of (PoI). Such a version of (PoI) is more stringent because it is a more demanding principle. To qualify as being in a state of epistemic ignorance requires not having any reason to assign different probabilities to the propositions. If, for instance, a mere suspicion qualifies as a ‘reason’, then you cannot possess any suspicion that one proposition is more likely than the others. If you have such a suspicion, then you have a reason to assign them different probabilities and (PoI) does not apply. This puts a high bar on what it takes for (PoI) to apply to a situation of uncertainty. The converse also holds: a stronger understanding of ‘reason’ generates a weaker (i.e., less stringent) version of (PoI). If, for instance, only knowledge qualifies as a ‘reason’, then you cannot possess any knowledge that one proposition is more likely than the others. Because knowledge is a stronger epistemic concept than mere suspicion, it will be easier to qualify as being in a state of epistemic ignorance (i.e., possessing no reasons). This places a low bar on what it takes for (PoI) to apply to a situation of uncertainty. These generalizations are as follows:

  • High Bar: A weak interpretation of ‘reason’ yields a strong interpretation of (PoI);
  • Low Bar: A strong interpretation of ‘reason’ yields a weak interpretation of (PoI).

Before trying to set the bounds of interpretations of (PoI) by finding iterations of (PoI) that satisfy High Bar and Low Bar, it would be good to get a grip on how these considerations create problems for (PoI). One aspect of the worry is that (PoI) generates probability distributions (i.e., quantities) contingent upon the interpretation of a qualitative notion (i.e., a reason). If you think traditional epistemology should inform formal epistemology, then this may not be much of a worry. However, if you think things should work in the opposite direction or not at all (i.e., formal epistemology should inform traditional epistemology, or they should be regarded as domains that don’t meaningfully interact), then this may be regarded as a worry. However, the deeper worry is akin to the problem Bertrand’s paradox causes (PoI).

What I call the ‘Many Interpretations’ problem for (PoI) results in inconsistent assignments of probabilities to the same event. The problem is not generated based on a redescription of the propositions in the partition. Rather, the problem is generated based on a reinterpretation of (PoI). For instance, this can occur when a Low Bar interpretation of (PoI) is applicable to an epistemic situation, so it recommends a uniform distribution of probabilities, but a High Bar interpretation of (PoI) is not applicable to the same epistemic situation, and a non-uniform distribution of probabilities results. How (PoI) is interpreted determines whether or not it is applicable to one and the same epistemic situation, which results in inconsistent application of the principle, and inconsistent assignment of probability values. Is the Many Interpretations problem something that Bayesians need to worry about?
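The structure of the problem can be modeled schematically. In the sketch below (the function, the suspicion example, and the predicates are my own toy model, not anything from Castell), one and the same epistemic situation receives a uniform distribution under a Low Bar interpretation but falls outside (PoI)’s scope under a High Bar interpretation:

```python
from fractions import Fraction

def poi(partition, considerations, counts_as_reason):
    """Apply (PoI) only if no consideration counts as a 'reason' under the
    chosen interpretation; otherwise the principle is inapplicable (None)."""
    if any(counts_as_reason(c) for c in considerations):
        return None
    return {p: Fraction(1, len(partition)) for p in partition}

# One epistemic situation: a mere suspicion that one outcome is likelier.
considerations = [("suspicion", "heads is likelier than tails")]

# High Bar: even a suspicion counts as a 'reason' (weak notion of reason).
high_bar = lambda c: c[0] in ("suspicion", "belief", "knowledge")
# Low Bar: only knowledge counts as a 'reason' (strong notion of reason).
low_bar = lambda c: c[0] == "knowledge"

print(poi(["heads", "tails"], considerations, high_bar))  # None: (PoI) inapplicable
print(poi(["heads", "tails"], considerations, low_bar))   # uniform: 1/2 each
```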

At first glance it appears that Subjective Bayesians aren’t impacted by the Many Interpretations problem. Subjective Bayesians do not require a ‘reason’ to assign probabilities to propositions. Such Bayesians regard (PoI) as unnecessary, no matter how it is interpreted. As Castell (1998) explains about this Bayesian position:

The thought is that where you do not feel in a position to make a (warranted) probabilistic judgement given the available evidence, the proper thing to do is simply to abstain from judgement. According to this view, it is unreasonable to wheel in a principle that provides probabilities where you judge none to be warranted: there is no role for (PoI) within Bayesianism. (p. 388)

Castell argues that Bayesians must (and in fact do) rely on (PoI) in assigning probabilities. In the interest of space I will not rehearse the details of his argument. However, there is a good case to be made that Bayesians rely, even if only implicitly, on (PoI), and in certain situations Bayesians must rely on (PoI) because no frequency data is available in the facts of the case or in the background information. So, I think that, despite professions otherwise, this objection to (PoI) is pressing for Bayesians as well.

Let’s explore a few iterations of (PoI). Castell (1998) articulates two iterations of (PoI) in an attempt to find a High Bar version of (PoI). The first iteration of (PoI) involves judgment (J):

  • (PoI-J): Each member of a set of propositions should be assigned the same probability in the absence of a subjective judgement to the contrary. (p. 388 n.4)

Castell recognizes that (PoI-J) is problematic. We don’t make probabilistic judgments about many things. This is often because we have not thought about the matter. Do we really assign a uniform distribution to things that we have never thought about? Castell thinks that we need to introspect on such things. This generates the second iteration of (PoI), which involves introspection and judgment (IJ):

  • (PoI-IJ): Each member of a set of propositions should be assigned the same probability if due consideration (introspection) yields no subjective judgement to the contrary. (p. 388 n.4)

Though I wouldn’t put (PoI-IJ) at the upper limit of High Bar, (PoI-IJ) is a viable option for a High Bar version of (PoI). This is because the notion of a ‘reason’ is relatively weak. As Castell says about (PoI-IJ), “where an agent feels unable to make any judgements (however weakly based on evidence), it directs him to adopt the uniform distribution” (p. 388 n.4). (PoI-IJ) is a more stringent principle because it requires you to not have any subjective judgment “however weakly based on evidence” in order to assign a uniform distribution of probabilities. By contrast, a viable option for a Low Bar version of (PoI) involves knowledge (K):

  • (PoI-K): Each member of a set of propositions should be assigned the same probability if due consideration (introspection) yields no knowledge to the contrary.

(PoI-K) has a strong interpretation of ‘reason’ and generates a weak version of (PoI). It requires you to not have any knowledge that one (or more) of the propositions should be assigned a different probability. Such knowledge is harder to come by, so (PoI-K) is easier to satisfy. There is a Low Bar regarding the applicability of (PoI-K) to a situation of uncertainty.

Along the High Bar/Low Bar spectrum there is a range of interpretations of (PoI). Such interpretations include having no intuition, no reasonable doubt, and no belief to the contrary.

The Many Interpretations problem calls for a restriction of (PoI) to prevent inconsistent assignments of probabilities to the same event. Is a Low Bar or a High Bar interpretation of (PoI) more likely to be correct? If it’s possible to argue that one interpretation is the correct interpretation, then other interpretations can be ruled out. This delimits the permissiveness of (PoI) and prevents multiple interpretations from generating different probabilities. However, such a move greatly reduces the scope and power of (PoI). It will not be possible to apply (PoI) to many situations of epistemic uncertainty, situations that are not ruled out under a permissive understanding of (PoI).

Reference

Castell, Paul. 1998. A Consistent Restriction of the Principle of Indifference. The British Journal for the Philosophy of Science 49: 387-95.

How Rawls Might View Occupy Wall Street

This discussion was brought to my attention on Leiter Reports. It’s an interesting discussion with Joshua Cohen about Rawls’ theory of justice and how it relates to the Occupy Wall Street movement. A colleague of mine (Quentin Gee) notes the following quote by Rawls on his UCSB profile page.

“When politicians are beholden to their constituents for essential campaign funds, and a very unequal distribution of income and wealth obtains in the background culture, with the great wealth being in the control of corporate economic power, is it any wonder that congressional legislation is, in effect, written by lobbyists, and Congress becomes a bargaining chamber in which laws are bought and sold?” – John Rawls, The Law of Peoples

I find this quote apt in light of the Occupy Wall Street movement. I think the Occupy Wall Street movement could better theoretically frame its dialogue in light of Rawls’ political philosophy.

Evidence of Evidence is Evidence, or is it?

Branden Fitelson (forthcoming) provides counterexamples to Richard Feldman’s principle that Evidence of Evidence is Evidence (EEE). Here’s the principle in its initial (naïve) form:

(EEE1) If E (non-conclusively) supports the claim that (some subject) S possesses evidence which supports p, then E supports p. (Fitelson forthcoming: 1).

Fitelson’s counterexamples to (EEE) work by presupposing the “positive relevance” (i.e., increase-in-probability) notion of evidential support. In footnote 6 he indicates a more substantive principle of evidential support might be wielded in defending (EEE). In this post I want to explore this possibility, specifically in relation to the notion of propositional justification. Consider the following principle of propositional justification:

S is justified in believing that p iff S’s total evidence sufficiently supports p (Neta 2007: 197).

Though there are many issues that could be raised with this formulation of propositional justification, let’s see if a less demanding iteration of the principle could be used to resist Fitelson’s counterexamples to (EEE). Neta’s principle suggests the following notion of evidential support:

(1) E (evidentially) supports p iff S’s total evidence includes E and S’s total evidence (necessarily) supports p.

The counterexample to (EEE1) involves drawing a card c at random from a deck. All the evidence we are given regarding c is as follows:

(E1) c is a black card.

(E2) c is the ace of spades.

(p) c is an ace.

Imagine a guy named John who knows what card c is, and the evidence above constitutes all the facts about the case. This means the following is the case:

(2) E1 supports the claim that John possesses evidence (E2) which supports p.

Positive relevance creates a problem for (EEE1) because (E1) doesn’t raise the probability of (p). (E1) alone is probabilistically irrelevant to (p); so, even though (E1) supports (E2), the consequent of (EEE1) is false (i.e., E1 doesn’t support p).
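The probabilistic irrelevance of (E1) to (p) can be checked by counting over an explicit 52-card deck. A quick sketch (the helper `pr` is my own):

```python
from fractions import Fraction
from itertools import product

ranks = ["ace", "king", "queen", "jack", "ten", "nine", "eight",
         "seven", "six", "five", "four", "three", "two"]
suits = ["spades", "clubs", "hearts", "diamonds"]
deck = list(product(ranks, suits))  # 52 equiprobable cards

def pr(event, given=lambda card: True):
    """Pr(event | given), computed by counting cards in the deck."""
    pool = [card for card in deck if given(card)]
    return Fraction(sum(1 for card in pool if event(card)), len(pool))

is_ace = lambda card: card[0] == "ace"
is_black = lambda card: card[1] in ("spades", "clubs")

print(pr(is_ace))            # 1/13 -- prior probability of (p)
print(pr(is_ace, is_black))  # 1/13 -- conditioning on (E1) leaves it unchanged
```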

How does the counterexample fare under principle (1) instead of positive relevance? John’s total evidence includes (E1), and John’s total evidence (E1 and E2) necessarily supports (p). (E1) alone doesn’t necessarily support (p), but it also doesn’t support (not-p), and when coupled with (E2) it does necessarily support (p). In fact, (E2) entails (p). John’s total evidence might not sufficiently support (p), but it does necessarily support (p). The next iteration of (EEE) runs as follows:

(EEE2) If E1 supports the claim that S possesses evidence E2 which supports p, then the conjunction of E1 and E2 supports p (Fitelson forthcoming: 2).

This seems like the defense I just gave for (EEE1), assuming (1). Didn’t I just claim the conjunction of (E1) and (E2) supports (p)? If so, then, assuming evidential support principle (1), it looks like the next counterexample will sink (EEE2). However, I think (EEE2) and (1) escape unscathed. Fitelson’s counterexample to (EEE2) is about a guy named Joe:

(E1) Joe has a full head of white hair.

(E2) Joe is over 35 years of age.

(p) Joe is bald.

The example works given the positive relevance principle because the conjunction of (E1 and E2) fails to raise the probability of (p). Being over 35 years of age might raise the probability that one is bald, but having a full head of white hair doesn’t raise the probability one is bald. (E1) supports (not-p), so the conjunction of (E1) and (E2) refutes (p). What about in relation to principle (1) instead of positive relevance?

(E1) is part of the total evidence. But, does the total evidence support (p)? More specifically, is the notion of total evidence equivalent to the conjunction of all the (relevant) evidence? I would argue it is possible to equate the total evidence with the conjunction of all relevant evidence, as I did in defending (EEE1) against a counterexample, but it is not necessarily the case that the total evidence must be regarded as all of the evidence conjoined. There is a probabilistic consideration in favor of this point.

The probabilistic consideration is that the total evidential support for (p) is not determined by simply conjoining the individual probabilities that (E1) and (E2) afford (p). Returning to the card counterexample, the degree to which the card’s being black (E1) supports the claim that the card is an ace (p) is the conditional probability of (p) given (E1). Of the 26 black cards, only the ace of clubs and the ace of spades are aces, so Pr(p|E1) = 2/26 = 1/13 (approx. 7.7%).

The probability that (E2: the card is the ace of spades) confers on the claim that the card is an ace (p) is 1, as the entailment takes on the maximum value. The fact that (E2) entails (p) means Pr(p|E2) = 1. The amount that the total evidence supports (p) is not simply the product of the probabilities of (p) given (E2) and (p) given (E1). That product would yield Pr(p|E1) × Pr(p|E2) = 1/13 × 1 = 1/13, or about 7.7%.

A better estimate is attained through subtraction of the two probabilities. This measures the degree to which one bit of evidence lessens the impact of another bit of evidence on the target proposition, and it better approximates the total impact of (E1 and E2) on (p). Subtraction yields Pr(p|E2) − Pr(p|E1) = 1 − 1/13 = 12/13, or about 92%. The actual probability, though, would be 1 (100%) because the probability the card is black (E1) is swamped by the probability the card is the ace of spades (E2) in relation to the probability the card is an ace (p). As such, (E1) can be disregarded. Again, the correct probabilistic impact of the total evidence on the target proposition is not determined by a conjunction of all the evidence. Let’s apply this back to the second counterexample about Joe’s hair or lack thereof:
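The card arithmetic can be double-checked by counting over an explicit deck (a sketch; the helper `pr` is my own). Note that conditioning on both pieces of evidence at once yields the value of 1 that the text identifies as the actual probability, while multiplying the two separate conditional probabilities does not:

```python
from fractions import Fraction
from itertools import product

ranks = ["ace", "king", "queen", "jack", "ten", "nine", "eight",
         "seven", "six", "five", "four", "three", "two"]
suits = ["spades", "clubs", "hearts", "diamonds"]
deck = list(product(ranks, suits))  # 52 equiprobable cards

def pr(event, given=lambda card: True):
    """Pr(event | given), computed by counting cards in the deck."""
    pool = [card for card in deck if given(card)]
    return Fraction(sum(1 for card in pool if event(card)), len(pool))

p = lambda card: card[0] == "ace"                 # the card is an ace
e1 = lambda card: card[1] in ("spades", "clubs")  # the card is black
e2 = lambda card: card == ("ace", "spades")       # the card is the ace of spades

print(pr(p, e1))                         # 1/13
print(pr(p, e2))                         # 1
print(pr(p, e1) * pr(p, e2))             # 1/13  -- the product falls short
print(pr(p, e2) - pr(p, e1))             # 12/13 -- subtraction gets closer
print(pr(p, lambda c: e1(c) and e2(c)))  # 1     -- conditioning on the conjunction
```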

Pr(p|E1) = 0

Pr(p|E2) = .20 (an estimate of the proportion of men over 35 who are bald)

Conjoining the total evidence, again, doesn’t refute the target proposition. Suppose John knows the age of Joe and (2) is true. That is, (E1) supports the claim that John possesses evidence (E2) that supports (p). Does the conjunction of (E1 and E2) refute (p), as Fitelson urges? Multiplying the probabilities, as given above, would yield a probability of 0. It would indeed refute (p). However, this case is dissimilar from the card-drawing case because it uses vague terms. Being “bald” is not defined as having “no hair”. Most men who are “bald” still have some hair on their heads. Being bald is defined in relation to “male pattern baldness.” This is a progressive condition and, much like the term “a heap”, is fraught with vagueness. The fact that Joe has a full head of white hair (E1) and Joe is over 35 years of age (E2) doesn’t make it the case that there is zero probability that Joe is bald. Joe may have hair loss as a result of male pattern baldness, yet to a casual observer (like John) he may appear to have a full head of hair. This is especially the case for someone who has all white hair because the threshold for counting as having a full head of hair is plausibly lower when all of one’s hair is white. Due to vagueness in the terms in (E1) and (p), the fact that Joe is over 35 years of age (E2) is not swamped by (E1).

A better probability estimate is attained by (i) allowing Pr(p|E1) = .05 as a correction for boundary vagueness in the terms, then (ii) subtracting the probabilities to yield the impact of the total evidence on the target proposition: Pr(p|E2) − Pr(p|E1) = .20 − .05 = .15, or 15%. The total evidence still supports (i.e., does not outright refute) (p) even though the total evidence makes it more likely that (not-p) is the case than (p).

References

Fitelson, Branden (forthcoming). “Evidence of Evidence is not (Necessarily) Evidence.” Analysis.

Neta, Ram (2007). “Propositional Justification, Evidence, and the Cost of Error.” Philosophical Issues.

Conference Videos: BLED – Knowledge, Understanding and Wisdom

Videos from the BLED conference on “Knowledge, Understanding and Wisdom” are now online. Click HERE to access the videos. In side-by-side format you can view both the video of the talk and the presentation slides from the talk. This avoids awkward transitions when one window pane tries to show both the video lecture and the accompanying slides.

It looks like the website videolectures.net provides a great venue for posting recorded talks. Hopefully, where feasible, using this site to post conference videos will become a profession-wide standard.