
Introduction to the philosophy of science 

by Lyle Zynda 

Lecture 1 

2/1/94 

Introduction 

Philosophy of science is part of a range of sub-disciplines known as "philosophy of X" 
(where X may be filled in with art, history, law, literature, or the various special sciences 
such as physics). Each of the activities for which there is a "philosophy of X" is an 
investigation into a certain part of the world or a particular type of human activity. What 
we will talk about today, in very general terms, is what distinguishes philosophy of X 
from sociology, history, or psychology of X. These approaches to science cannot be 
sharply demarcated, though many people have thought that they can be. However, there 
are clear differences of emphasis and methods of investigation between these approaches 
that we can outline here in a preliminary way. 

Let us attempt to characterize the particular emphases of several approaches to studying 
science: 

•  Sociology of Science - Sociology of science studies how scientists interact as 

social groups to resolve differences in opinion, how they engage in research, and 
how they determine which of many theories and research programs are worth 
pursuing, among other things.  

•  Psychology of Science - Psychology of science studies how scientists reason, i.e., 

the thought processes that scientists follow when they are judging the merits of 
certain kinds of research or theoretical speculations; how they reason about data, 
experiment, theories, and the relations between these; and how they come up with 
new theories or experimental procedures.  

•  History of Science - History of science studies how scientists have engaged in the 

various activities noted above in the past, i.e., how social interactions among 
scientists (and between scientists and the greater society) and styles of scientific 
reasoning have changed over the centuries; and how particular scientific 
achievements came to be accepted, both by individual scientists and by the 
scientific community as a whole.  

In each of these cases, the data on which the approach to studying science is based is 
empirical or observational. What is at issue is how scientists in fact interact as social 
groups, reason, or how scientific reasoning styles and scientific theories have in fact 
changed over time. To adjudicate disputes within these approaches thus requires 
engaging in something much like scientific activity itself: one must gather evidence for 
one's views (or, in the case of certain approaches to history, for one's interpretation of the 
nature of scientific activity or the causes of scientific revolutions). 

What is philosophy of science? How does it differ from these other approaches to 
studying science? Well, that's not an easy question to answer. The divisions among 
philosophers of science are quite striking, even about fundamentals, as will become 
apparent as the course proceeds. One reason for this is that philosophers of science will, 
on occasion, find many of the things that sociologists, psychologists, and historians of 
science study to be relevant to their own studies of science. Of course, the degree to 
which philosophers of science are interested in and draw upon the achievements of these 
other disciplines varies greatly among individuals--e.g., some philosophers of science 
have been far more interested in the history of science, and have thought it more relevant 
to their own endeavors, than others. However, there are some tendencies, none of them 
completely universal, that would serve to mark a difference between philosophers of 
science on the one hand and sociologists, historians, and psychologists of science on the 
other.  

The first difference is that philosophy of science is not primarily an empirical study of 
science, although empirical studies of science are of relevance to the philosopher of 
science. (Like everything else you might cite as a peculiarity of philosophy of science, 
this point is a matter of dispute; some philosophers of science, for example, claim that 
philosophy of science ought to be considered a special branch of epistemology, and 
epistemology ought to be considered a special branch of empirical psychology.) 
Philosophers of science do not generally engage in empirical research beyond learning 
something about a few branches of science and their history. This type of study, however, 
is simply a prerequisite for talking knowledgeably about science at all. Philosophers 
primarily engage in an activity they call "conceptual clarification," a type of critical, 
analytical "armchair" investigation of science. For example, a philosopher of science may 
try to answer questions of the following sort. 

What is scientific methodology, and how does it differ (if it does) from the 
procedures we use for acquiring knowledge in everyday life? 

How should we interpret the pronouncements of scientists that they have gained 
knowledge about the invisible, underlying structure of the world through their 
investigations? 

Part of what is open to philosophy of science, insofar as it is critical, is to question the 
methods that scientists use to guide their investigations. In other words, philosophers of 
science often seek to answer the following question. 

What reason is there to think that the procedures followed by the scientist are 
good ones? 

In a sense, philosophy of science is normative in that it asks whether the methods that 
scientists use, and the conclusions that they draw using those methods, are proper or 
justified. Normally, it is assumed that the methods and conclusions are proper or justified, 
with it being the task of the philosopher of science to explain precisely how they can be 
proper or justified. (In other words, the philosopher of science seeks to understand the 
practice of science in such a way as to vindicate that practice.) This opens up the 
possibility of revision: that is, if a philosopher of science concludes that it is impossible 
to justify a certain feature of scientific practice or methodology, he or she might conclude 
that that feature must be abandoned. (This would be rare: most philosophers would react 
to such a situation by rejecting the view that did not vindicate that feature of scientific 
practice.) 

Let us take another approach to the question of what distinguishes philosophy of science 
from other approaches to studying science. Philosophy has, since Plato, been concerned 
with the question of what a particular kind of thing essentially is. In other words, 
philosophers seek a certain sort of answer to questions of the following form. 

What is X? 

In asking a question of this form, philosophers seek to understand the nature of X, where 
by "nature" they mean something like X's essence or meaning. 

We will start the course by considering the question, "What is scientific explanation?" 
We will also seek to answer the question, "What makes a scientific explanation a good 
one?" Most people take the notion of explanation for granted; but as you will soon find 
out, philosophers take a special interest in the concepts others take for granted. 
Philosophers emphasize the difference between being able to identify something as an 
explanation and being able to state in precise terms what an explanation is, i.e., what 
makes something an explanation. Philosophers seek to do the latter, assuming that they 
are able (like everyone else) to do the former. 

None of this, of course, will mean very much until we have examined the philosophy of 
science itself, i.e., until we start doing philosophy of science. To a large degree, each of 
you will have to become a philosopher of science to understand what philosophy of 
science is. 

Lecture 2 

2/3/94 

The Inferential View Of Scientific Explanation  

Last time we discussed philosophy of science very abstractly. I said that it was difficult to 
separate completely from other studies of science, since it draws on these other 
disciplines (sociology, history, psychology, as well as the sciences themselves) for its 
"data." What distinguishes philosophy of science from other studies of science is that it 
(1) takes a critical, evaluative approach--e.g., it aims at explaining why certain methods 
of analyzing data, or as we shall see today, the notion of explanation--are good ones. 
There is also an emphasis on conceptual analysis--e.g., explaining what explanation is, 
or, in other words, what it means when we say that one thing "explains" another. 
(Philosophers often discuss the meaning of many terms whose meaning other people take 
for granted.) I also noted that the best way to see how philosophy of science is 
differentiated from other studies of science is by examining special cases, i.e., by doing 
philosophy of science. We will start our investigation by examining the notion of 
explanation. 

The Difference Between Explanation And Description 

It is commonplace, a truism, to say that science aims at not only describing regularities in 
the things that we observe around us--what is often called observable or empirical 
phenomena--but also at explaining those phenomena. For example, there is the 
phenomenon of redshift in the spectra of stars and distant galaxies. The physical 
principles behind redshift are sometimes explained by analogy with the Doppler effect, 
which pertains to sound: observed wavelengths are lengthened if the object is moving 
away from us and shortened if it is moving towards us. (Also, there is a derivation in 
general relativity theory of the redshift phenomenon, due to gravitation.) In 1917, de 
Sitter predicted that there would be a relationship between distance and redshift, though 
this prediction was not generally recognized until Hubble (1929), who was inspired by de 
Sitter's analysis. Another example: There's a periodic reversal in the apparent paths of the 
outer planets in the sky. This can be predicted purely mathematically, based on past 
observations, but prediction does not explain why the reversal occurs. What is needed is a 
theory of the solar system, which details how the planets' real motions produce the 
apparent motions that we observe. Final example: Hempel's thermometer example (an 
initial dip in the mercury level precedes a larger increase when a thermometer is placed 
into a hot liquid). 

Three Approaches To Explanation 

A philosopher of science asks: What is the difference between describing a phenomenon 
and explaining it? In addition, what makes something an adequate explanation? 
Philosophers have defended three basic answers to this question. 

Inferential View (Hempel, Oppenheim) - An explanation is a type of argument, with 

sentences expressing laws of nature occurring essentially in the premises, and the 
phenomenon to be explained as the conclusion. Also included in the premises can 
be sentences describing antecedent conditions.  

Causal View (Salmon, Lewis) - An explanation is a description of the various causes 

of the phenomenon: to explain is to give information about the causal history that 
led to the phenomenon.  

Pragmatic View (van Fraassen) - An explanation is a body of information that 

implies that the phenomenon is more likely than its alternatives, where the 
information is of the sort deemed "relevant" in that context, and the class of 
alternatives to the phenomenon are also fixed by the context.  

In the next few weeks, we will examine each of these accounts in turn, detailing their 
strengths and weaknesses. Today we will start with Hempel's inferential view, which 
goes by several other names that you might encounter. 

The "received" view of explanation (reflecting the fact that philosophers 
generally agreed with the inferential view until the early 1960s) 

The "deductive-nomological" model of explanation (along with its probabilistic 
siblings, the "inductive-statistical" and "deductive-statistical" models of 

background image

explanation) 

The Inferential Theory of Explanation 

The original Hempel-Oppenheim paper, which appeared in 1948, analyzed what has 
come to be known as "deductive-nomological" (D-N) explanation. Hempel and 
Oppenheim consider patterns of explanation of a particular sort and try to generalize that 
pattern. As already noted, they view explanation as a certain sort of argument, i.e., a set 
of premises (sentences) which collectively imply a conclusion. The arguments they 
considered were deductive, i.e., arguments such that if the premises are true the 
conclusion has to be true as well.  

All human beings are mortal. 

Socrates is a human being. 

Socrates is mortal. 

Not every argument that is deductive is an explanation, however. How can we separate 
those that are from those that are not? 

To accomplish this task, Hempel and Oppenheim describe their "General Conditions of 
Adequacy," which define when a deductive argument counts as an adequate explanation. 

An explanation must: 

(a) be a valid deductive argument (hence "deductive")  

(b) contain essentially at least one general law of nature as a premise (hence 
"nomological") 

(c) have empirical content (i.e., it must be logically possible for an observation-
statement to contradict it) 

The first three conditions are "logical" conditions, i.e., formal, structural features that a 
deductive argument must have to count as an explanation. To complete the conditions of 
adequacy, Hempel and Oppenheim add a fourth, "empirical" condition. 

The premises (statements in the explanans) must all be true. 

On the inferential view, explanations all have the following structure (where the 
antecedent conditions and laws of nature make up the "explanans"). 

C1, ... , Cn [antecedent conditions (optional)]  

L1, ... , Ln [laws of nature] 

Therefore, E [explanandum] 

Later, we will be looking at a statistical variant of this pattern, which allows the laws of 
nature to be statistical, and the resulting inference to be inductive. 

Laws Of Nature 

Since Hempel and Oppenheim's analysis is stated in terms of laws of nature, it is 
important to state what a law of nature is on their view. Hempel and Oppenheim take a 
law to be a true "law-like" sentence. This means that a law is a linguistic entity, to be 
distinguished by its peculiar linguistic features. According to Hempel and Oppenheim, 
laws are to be distinguished from other sentences in a language in that they are (1) 
universal, (2) have unlimited scope, (3) contain no designation of particular objects, and 
(4) contain only "purely qualitative" predicates. 

The problem Hempel and Oppenheim face is distinguishing laws from accidental 
generalizations, i.e., general truths that happen to be true, though they are not true as a 
matter of physical law. For example, suppose that all the apples that ever get into my 
refrigerator are yellow. Then the following is a true generalization: "All the apples in my 
refrigerator are yellow." However, we do not deem this sentence to be a law of nature. 
One reason that this might be so is that this statement only applies to one object in the 
universe, i.e., my refrigerator. Laws of nature, by contrast, refer to whole classes of 
objects (or phenomena). (Consider the sentence "All gases that are heated under constant 
pressure expand.") It is for this reason that Hempel and Oppenheim include the 
requirement that to be a law of nature a sentence must not designate any particular 
objects. 

Discuss: Can we get around this requirement by a bit of linguistic trickery? Consider if 
we replace the designation "in my refrigerator" by "in any Lyle Zynda-owned 
refrigerator," or, even better, by some physical description in terms of "purely 
qualitative" predicates that single out my refrigerator from the class of all refrigerators in 
the universe. Would the existence of such a sentence convince you that you have 
discovered a new law of nature? Moreover, consider the following two sentences. 

G: No gold sphere has a mass greater than 100,000 kilograms. 

U: No enriched uranium sphere has a mass greater than 100,000 kilograms. 

The former is not a law of nature, though the latter is (though it is of a relatively low-
level variety of law). 

One reason that the statements about apples and gold might not be laws of nature, a 
reason not adequately captured by the Hempel-Oppenheim analysis of laws, is that they 
do not support inferences to counterfactual statements. For example, it cannot be inferred 
from the fact that all the apples that ever get into my refrigerator are yellow that if a red 
apple were to be placed into my refrigerator, it would turn yellow. Laws of nature, by 
contrast, support counterfactual inferences. From the fact that all gases that are heated 
under constant pressure expand we can infer that if a sample of gas in a particular 
container were heated under constant pressure, it would expand. (We can reasonably infer 
this even if we never heat that sample of gas under constant pressure.) Similarly, we can 
infer that if we were to successfully gather 100,000 kilograms of uranium and try to fashion 
it into a sphere, we would fail. (We could not infer a similar statement about gold from 
the true generalization that no gold sphere has a mass greater than 100,000 kilograms.) 

The difference between statements G and U has never been captured purely syntactically. 
Thus, Hempel and Oppenheim's view that laws of nature are sentences of a certain sort 
must be fundamentally mistaken. What laws of nature are is still a matter of dispute. 

Counterexamples To The Inferential View Of Scientific Explanation: Asymmetry and Irrelevance 

The Hempel-Oppenheim analysis of scientific explanation has the following overlapping 
features. (Review and explain each.) 

(a) Inferential - Explanations are arguments: to explain why E occurred is to 
provide information that would have been sufficient to predict beforehand that E 
would occur. 

(b) Covering Law - Explanations explain by showing that E could have been 
predicted from the laws of nature, along with a complete specification of the 
initial conditions. 

(c) Explanation-Prediction Symmetry - The information (i.e., laws, antecedent 
conditions) appearing in an adequate explanation of E could have been used to 
predict E; conversely, any information that can be used to predict E can be used 
after the fact to explain why E occurred. 

(d) No Essential Role For Causality - Laws of nature do not have to describe 
causal processes to be used legitimately in scientific explanations. 

Many counterexamples have been given to the Hempel-Oppenheim analysis of scientific 
explanation that target one or more of these features. The first group of counterexamples 
reveals that the Hempel-Oppenheim analysis faces the Problem of Asymmetry: Hempel-
Oppenheim assert that explanation and prediction are symmetric, whereas that does not 
seem to be the case, as the following examples show. 

(1) Eclipse - You can predict when and where a solar eclipse will occur using the 
laws governing the orbit of the Earth around the Sun, and the orbit of the Moon 
around the Earth, as well as the initial configuration these three bodies were in at 
an earlier time. You can also make the same prediction by extrapolating 
backwards in time from the subsequent positions of these three bodies. However, 
only the first would count as an explanation of why the eclipse occurred when and 
where it did. 

(2) Flagpole - Using the laws of trigonometry and the law that light travels in 
straight lines, you can predict the length of the shadow that a flagpole of a certain 
height will cast when the Sun is at a certain elevation. You can also predict what 
the height of the flagpole is by measuring the length of its shadow and the 
elevation of the Sun. However, only the first derivation would count as an 
explanation. (A small numerical sketch of both derivations appears after this list.) 

(3) Barometer - Using the laws governing weather patterns, storm formation, and 
the effect of air pressure on the behavior of barometers, you can predict that when 
a barometer falls, a storm will soon follow. You can also predict that when a 
storm is approaching, the barometer will fall. However, neither of these is 
explanatory, since both events are explained by antecedent atmospheric conditions. 
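
The flagpole case (2) can be made concrete with a small sketch (the figures here are invented for illustration and are not from the lecture). Both derivations use the same law (light travels in straight lines) plus trigonometry, so by Hempel and Oppenheim's conditions both would count as explanations; yet only the first seems explanatory.

```python
import math

# Hypothetical figures for illustration: a 10-metre flagpole with the Sun
# 30 degrees above the horizon.
height = 10.0                    # flagpole height (m)
elevation = math.radians(30.0)   # solar elevation angle

# "Explanatory" derivation: from the pole's height to the shadow's length.
shadow = height / math.tan(elevation)

# Formally parallel derivation: from the shadow's length back to the height.
recovered_height = shadow * math.tan(elevation)

print(f"shadow length  = {shadow:.2f} m")
print(f"derived height = {recovered_height:.2f} m")
```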

The second group of counterexamples reveals that the Hempel-Oppenheim analysis of 
explanation faces the Problem of Irrelevance: the Hempel-Oppenheim analysis 
sometimes endorses information as explanatory when it is irrelevant to the explanandum.  

(4) Birth Control Pills - All men who take birth control pills never get pregnant. 
Thus, from the fact that John is taking birth control pills we can infer logically 
that he won't get pregnant. However, this would hardly be an explanation of 
John's failing to get pregnant since he couldn't have gotten pregnant whether or 
not he took birth control pills. 

(5) Hexed Salt - All salt that has had a hex placed on it by a witch will dissolve in water. 
Hence, we can logically infer from the fact that a sample of salt had a hex placed on it by 
a witch that it will dissolve in water. However, this wouldn't give us an explanation of the 
dissolution since the salt would have dissolved even if it hadn't been hexed. 

 

Lecture 3 

2/8/94 

The Causal Theory Of Explanation, Part I  

Last time, we saw that the inferential view of explanation faced the asymmetry and 
irrelevance problems. There is another problem, however, that comes out most clearly 
when we consider the inductive-statistical (I-S) component of the inferential view. This 
problem strikes most clearly at the thesis at the heart of the inferential view, namely, that 
to explain a phenomenon is to provide information sufficient to predict that it will occur. 

I-S explanation differs from D-N explanation only in that the laws that are cited in the 
explanation can be statistical. For example, it is a law of nature that 90% of electrons in a 
90-10 superposition of spin-up and spin-down will go up if passed through a vertically 
oriented Stern-Gerlach magnet. This information provides us with the materials for 
stating an argument that mirrors a D-N explanation.  

Ninety percent of electrons in a 90-10 superposition of spin-up and spin-down 
will go up if passed through a vertically oriented Stern-Gerlach magnet. (Law of 
Nature) 

This electron is in a 90-10 superposition of spin-up and spin-down and is passed 
through a vertically oriented Stern-Gerlach magnet. (Statement of Initial 
Conditions) 

Therefore, this electron goes up. (Explanandum) [90%] 

This argument pattern is obviously similar to that exhibited by D-N explanation, the only 
difference being that the law in the inductive argument stated above is statistical rather 
than a universal generalization. On the inferential view, this argument constitutes an 
explanation since the initial conditions and laws confer a high probability on the 
explanandum. If you knew that these laws and initial conditions held of a particular 
electron, you could predict with high confidence that the electron would go up. 

The problem with the inferential view is that you can't always use explanatory 
information as the basis for a prediction. That is because we frequently offer explanations 
of events with low probability. (Numbers in the examples below are for purposes of 
illustration only.) 

Atomic Blasts & Leukemia. We can explain why a person contracted leukemia 
by pointing out the person was once only two miles away from an atomic blast, 
and that exposure to an atomic blast from that distance increases one's chances of 
contracting leukemia in later life. Only 1 in 1,000 persons exposed to an atomic 
blast eventually contract leukemia. Nevertheless, exposure to an atomic blast 
explains the leukemia since people who haven't been exposed to an atomic blast 
have a much lower probability (say, 1 in 10,000) of contracting leukemia. 

Smoking & Lung Cancer. We can explain why someone contracted lung cancer 
by pointing out that the person had smoked two packs of cigarettes a day for forty 
years. This is an explanation since people who smoke that much have a much 
higher probability (say, 1 in 100) of contracting lung cancer than non-smokers 
(say, 1 in 10,000). Still, the vast majority of smokers (99 percent) will never 
contract lung cancer. 

Syphilis & Paresis. We can explain why someone contracted paresis by pointing 
out that the person had untreated latent syphilis. This is an explanation since the 
probability of getting paresis is much higher (e.g., 1 in 100) if a person has 
untreated latent syphilis than if he does not (e.g., 0). Still, the vast majority of 
people with untreated latent syphilis will never contract paresis. 

In each of these cases, you can't predict that the result will occur since the information 
does not confer a high probability on the result. Nevertheless, the information offered 
constitutes an explanation of that result, since it increases the probability that that result 
will occur. 
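
A quick check using the lecture's illustrative figures (which, as noted above, are for purposes of illustration only) shows the pattern the three cases share: the cited factor raises the probability of the outcome even though that probability stays low. This is a sketch, not real epidemiology.

```python
# Illustrative probabilities from the examples above; the paresis base rate
# ("0") is approximated by a tiny number so the comparison still runs.
cases = {
    "atomic blast -> leukemia":      (1 / 1_000, 1 / 10_000),
    "heavy smoking -> lung cancer":  (1 / 100,   1 / 10_000),
    "untreated syphilis -> paresis": (1 / 100,   1e-9),
}

for name, (p_with, p_without) in cases.items():
    print(f"{name}:")
    print(f"  pr(E|C) = {p_with:.4f}, pr(E|not-C) = {p_without:.4f}")
    print(f"  probability raised: {p_with > p_without}; "
          f"still improbable: {p_with < 0.5}")
```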

In the 1960s and 1970s, Wesley Salmon developed a view of statistical explanation that 
postulated that, contrary to what Hempel claimed earlier, high probability was not 
necessary for an explanation, but only positive statistical relevance. 

Definition. A hypothesis h is positively relevant (correlated) to e if h makes e 
more likely, i.e., pr(e|h) > pr(e). 

The problem Salmon faced was distinguishing cases where the information could provide 
a substantive explanation from cases where the information reported a mere correlation 
and so could not. (For example, having nicotine stains on one's fingers is positively 
correlated with lung cancer, but you could not explain why a person contracted lung 
cancer by pointing out that the person had nicotine stains on their fingers.) Distinguishing 
these cases proved to be impossible using purely formal (statistical) relations. Obviously, 
some other type of information was needed to make the distinction. Rejecting the 
received view of explanation, Salmon came to believe that to explain a phenomenon is 
not to offer information sufficient for a person to predict that the phenomenon will occur, 
but to give information about the causes of that phenomenon. On this view, an 
explanation is not a type of argument containing laws of nature as premises but an 
assembly of statistically relevant information about an event's causal history. 

Salmon points out two reasons for thinking that causal information is what is needed to 
mark off explanations. First, the initial conditions given in the explanatory information 
have to precede the explanandum temporally to constitute an explanation of the 
explanandum. Hempel's theory has no restriction of this sort. The eclipse example 
illustrates this fact: you can just as well use information about the subsequent positions of 
the Sun and Moon to derive that an eclipse occurred at an earlier time as use information 
about the current positions of the Sun and Moon to derive that an eclipse will occur later. 
The former is a case of retrodiction, whereas the latter is a (familiar) case of prediction. 
This is an example of the prediction-explanation symmetry postulated by Hempel. 
However, as we saw earlier when discussing the problem of asymmetry, only the 
forward-looking derivation counts as an explanation. Interestingly, Salmon points out that 
the temporal direction of explanation matches the temporal direction of causation, which 
is forward-looking (i.e., causes must precede their effects in time).  

Second, not all derivations from laws count as explanations. Salmon argues that some D-
N "explanations" (e.g., a derivation from the ideal gas law PV = nRT and a description of 
initial conditions) are not explanations at all. The ideal gas law simply describes a set of 
constraints on how various parameters (pressure, volume, and temperature) are related; it 
does not explain why these parameters are related in that way. Why these constraints 
exist is a substantive question that is answered by the kinetic theory of gases. (Another 
example: People knew for centuries how the phases of the Moon were related to the 
height of tides, but simply describing how these two things are related did not constitute 
an explanation. An explanation was not provided until Newton developed his theory of 
gravitation.) Salmon argues that the difference between explanatory and non-explanatory 
laws is that the former describe causal processes, whereas non-explanatory laws (such as 
the ideal gas law) only describe empirical regularities. 

 
 

Lecture 4  

(2/10/94) 

The Causal Theory Of Explanation, Part II  

As we discussed last time, Salmon sought to replace the inferential view of explanation, 
which faces the asymmetry and irrelevance problems, with a causal theory, which 
postulates that an explanation is a body of information about the causes of a particular 
event. Today we will discuss Salmon's view in detail, as well as the related view of David 
Lewis. 

Salmon's theory of causal explanation has three elements. 

(1) Statistical Relevance - the explanans (C) increases the probability of the 
explanandum (E), i.e., pr(E|C) > pr(E). 

(2) Causal Processes - the explanans and the explanandum are both parts of 
different causal processes 

(3) Causal Interaction - these causal processes interact in such a way as to bring 
about the event (E) in question 

This leaves us with the task of saying what a causal process is. Basically, Salmon's view 
is that causal processes are characterized by two features. First, a causal process is a 
sequence of events in a continuous region of spacetime. Second, a causal process can 
transmit information (a "mark"). 

Let us discuss each of these in turn. There are various sequences of events that are 
continuous in the required sense--e.g., a light beam, a projectile flying through space, a 
shadow, or a moving spot of light projected on a wall. An object that is sitting still, e.g., a 
billiard ball, is also deemed a causal process. Each of these is a continuous process in 
some sense but not all of them are causal processes--e.g., the shadow and light spot. Let's 
look at an example that makes this clearer. As some of you may know, relativity theory 
says that nothing can travel faster than light. But what is the "thing" in nothing? Consider 
a large circular room with a radius of 1 light year. If we have an incredibly focused laser 
beam mounted on a swivel in the center of the room, we can rotate the laser beam so that 
it rotates completely once per second. If the laser beam is on, it will project a spot on the 
wall. This spot too will rotate around the wall completely once per second, which means 
that it will travel at 2π light-years per second! Strangely enough, this is not prohibited by 
relativity theory, since a spot of this sort cannot "transmit information." Only things that 
can transmit information are limited in speed. 
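
The 2π figure is just the arithmetic of the setup; a rough check (one light-year per second is about 3.2 x 10^7 times the speed of light, since a year contains roughly that many seconds):

```python
import math

radius_ly = 1.0    # radius of the circular room, in light-years
period_s = 1.0     # the beam sweeps a full circle once per second

spot_speed = 2 * math.pi * radius_ly / period_s      # light-years per second
seconds_per_year = 3.156e7                           # approximate
speed_in_c = spot_speed * seconds_per_year           # multiples of light speed

print(f"spot speed ~ {spot_speed:.2f} light-years per second")
print(f"           ~ {speed_in_c:.2e} times the speed of light")
```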

Salmon gives this notion an informal explication in "Why ask 'Why'?" He argues that the 
difference between the two cases is that a process like a light beam is a causal process: 
interfering with it at one point alters the process not only for that moment: the change 
brought about by the interference is "transmitted" to the later parts of the process. If a 
light beam consists of white light (or a suitably restricted set of frequencies), we can put a 
filter in the beam's path, e.g., separating out only the red frequencies. The light beam 
after it passes through the filter will bear the "mark" of having done so: it will now be red 
in color. Contrast this with the case of the light spot on the wall: if we put a red filter at 
one point in the process, the spot will turn red for just that moment and then carry on as if 
nothing had happened. Interfering with the process will leave no "mark." (For another 
example, consider an arrow on which a spot of paint is placed, as opposed to the shadow 
of the arrow. The first mark is transmitted, but the second is not.) 

Thus, Salmon concludes that a causal process is a spatiotemporally continuous process that 
can transmit information (a "mark"). He emphasizes that the "transmission" referred to 
here is not an extra, mysterious event that connects two parts of the process. In this 
regard, he propounds his "at-at" theory of mark transmission: all that transmission of a 
mark consists in is that the mark occurs "at" one point in the process and then remains in 
place "at" all subsequent points unless another causal interaction occurs that erases the 
mark. (Here he compares the theory with Zeno's famous arrow paradox. Explain. The 
motion consists entirely of the arrow being at a certain point at a certain time; in other 
words, the motion is a function from times to spatial points. This is necessary when we 
are considering a continuous process. To treat the motion as the conjunction of the discrete 
events of moving from A halfway to C, moving halfway from there, and so on, leads to the 
paradoxical conclusion that the arrow will never reach its destination.) Transmission is not 
a "link" between discrete stages of a process, but a feature of that process itself, which is 
continuous. 

Now we turn to explanation. According to Salmon, a powerful explanatory principle is 
that whenever there is a coincidence (correlation) between the features of two processes, 
the explanation is an event common to the two processes that accounts for the correlation. 
This is a "common cause." To cite an example discussed earlier, there is a correlation 
between lung cancer (C) and nicotine stains on a person's fingers (N). That is, 

pr(C|N) > pr(C). 

The common cause of these two events is a lifetime habit of smoking two packs of 
cigarettes each day (S). Relative to S, C and N are independent, i.e., 

pr(C|N&S) = pr(C|S). 

You'll sometimes see the phrase that S "screens C off from N" (i.e., once S is brought into 
the picture N becomes irrelevant). This is part of a precise definition of "common cause," 
which is given by the following formal probabilistic conditions. We start out with pr(A|B) > 
pr(A). C is a common cause of A and B if the following hold. 

pr(A&B|C) = pr(A|C)pr(B|C) 

pr(A&B|¬C) = pr(A|¬C)pr(B|¬C)  

pr(A|C) > pr(A|¬C) 

pr(B|C) > pr(B|¬C) 

(The first condition is equivalent to the screening off condition given earlier.) These 
conditions are also constrained: A, B, and C have to be linked suitably as parts of a causal 
process known as a "conjunctive" fork. 

(Consider the relation between this and the smoking-lung cancer, and leukemia-atomic 
blast cases given earlier.) 
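
A minimal numerical sketch of a conjunctive fork, with invented probabilities for the smoking case (S = two-pack-a-day smoking, C = lung cancer, N = nicotine stains). The joint distribution is built by assuming that C and N are independent given S and given not-S; the check then recovers both the raw correlation and the screening-off relation.

```python
from itertools import product

# Invented probabilities, for illustration only.
p_S = 0.3
p_C_given_S = {True: 0.01, False: 0.0001}   # pr(C | S), pr(C | not-S)
p_N_given_S = {True: 0.80, False: 0.01}     # pr(N | S), pr(N | not-S)

# Conjunctive-fork assumption: C and N are independent conditional on S.
joint = {}
for s, c, n in product([True, False], repeat=3):
    ps = p_S if s else 1 - p_S
    pc = p_C_given_S[s] if c else 1 - p_C_given_S[s]
    pn = p_N_given_S[s] if n else 1 - p_N_given_S[s]
    joint[(s, c, n)] = ps * pc * pn

def pr(event):
    """Probability that the predicate event(s, c, n) holds."""
    return sum(p for outcome, p in joint.items() if event(*outcome))

p_C = pr(lambda s, c, n: c)
p_C_given_N = pr(lambda s, c, n: c and n) / pr(lambda s, c, n: n)
p_C_given_N_and_S = pr(lambda s, c, n: c and n and s) / pr(lambda s, c, n: n and s)
p_C_given_S_only = pr(lambda s, c, n: c and s) / pr(lambda s, c, n: s)

print(f"pr(C)     = {p_C:.5f}")
print(f"pr(C|N)   = {p_C_given_N:.5f}   # higher than pr(C): the raw correlation")
print(f"pr(C|N&S) = {p_C_given_N_and_S:.5f}")
print(f"pr(C|S)   = {p_C_given_S_only:.5f}   # equals pr(C|N&S): S screens C off from N")
```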

This does not complete the concept of causal explanation, however, since some common 
causes do not "screen off" the correlated events from one another. Salmon gives 
Compton scattering as an example. Given that an electron e- absorbs a photon of a certain 
energy E and is given a bit of kinetic energy E* in a certain direction as a result, a second 
photon will be emitted with E** = E - E*. The energy levels of the emitted photon and of 
the electron will be correlated, even given that the absorption occurred. That is, 

pr(A&B|C) > pr(A|C)pr(B|C). 

This is a causal interaction of a certain sort, between two processes (the electron and the 
photon). We can use the probabilistic conditions here to analyze the concept: "when two 
processes intersect, and both are modified in such ways that the changes in one are 
correlated with changes in the other--in the manner of an interactive fork--we have causal 
interaction." Thus, a second type of "common cause" is provided by the C in the 
interactive fork. 
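
A toy numerical version of the Compton case (energies and probabilities invented for illustration): given the interaction C, the electron's share E* of the energy can take a few equally likely values, and the second photon always carries E** = E - E*, so the two outcomes stay correlated even conditional on C.

```python
# Toy model of an interactive fork: conditional on C, the electron's kinetic
# energy E* takes one of these equally likely values, and the photon gets
# E** = E - E* (perfect correlation given C).
E = 10.0
e_star_values = [2.0, 4.0, 6.0, 8.0]
p_each = 1 / len(e_star_values)

# A: the electron got more than half the energy; B: the photon got less than half.
p_A = sum(p_each for e_star in e_star_values if e_star > E / 2)
p_B = sum(p_each for e_star in e_star_values if E - e_star < E / 2)
p_A_and_B = sum(p_each for e_star in e_star_values
                if e_star > E / 2 and E - e_star < E / 2)

print(f"pr(A|C)         = {p_A:.2f}")
print(f"pr(B|C)         = {p_B:.2f}")
print(f"pr(A&B|C)       = {p_A_and_B:.2f}")
print(f"pr(A|C)pr(B|C)  = {p_A * p_B:.2f}   # strictly smaller: C does not screen A off from B")
```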

Salmon's attempt here is to analyze a type of explanation that is commonly used in 
science, but the notion of causal explanation can be considered more broadly than he 
does. For example, Lewis points out that the notion of a causal explanation is quite fluid. 
In his essay on causal explanation, he points out that there is an extremely rich causal 
history behind every event. (Consider the drunken driving accident case.) Like Salmon, 
Lewis too argues that to explain an event is to provide some information about its causal 
history. The question arises, what kind of information? Well, one might be to describe in 
detail a common cause of the type discussed by Salmon. However, there might be many 
situations in which we might only want a partial description of the causal history (e.g., we 
are trying to assign blame according to the law, or we already know a fair chunk of the 
causal history and are trying to find out something new about it, or we just want to know 
something about the type of causal history that leads to events of that sort, and so on). 
Lewis allows negative information about the causal history to count as an explanation 
(there was nothing to prevent it from happening, there was no state for the collapsing star 
to get into, there was no connection between the CIA agent being in the room and the 
Shah's death, it just being a coincidence, and so on). To explain is to give information 
about a causal history, but giving information about a causal history is not limited to 
citing one or more causes of the event in question. 

(Mention here that Lewis has his own analysis of causation, in terms of non-backtracking 
counterfactuals.) 

Now given this general picture of explanation, there should be no explanations that do 
not cite information about the causal history of a particular event. Let us consider whether 
this is so. Remember the pattern of D-N explanation that we looked at earlier, such as a 
deduction of the volume of a gas that has been heated from the description of its initial 
state, how certain things such as temperature changed (and others, such as pressure, did 
not), and an application of the ideal gas law PV = nRT. On Hempel's view, this could 
count as an explanation, even though it is non-causal. Salmon argues that (1) non-causal 
laws allow for "backwards" explanations, and (2) cry out to be explained themselves. 
Regarding the latter point, he says that non-causal laws of this sort are simply 
descriptions of empirical regularities that need to be explained. The same might occur in 
the redshift case, if the law connecting the redshift with the velocity was simply an 
empirical generalization. (Also, consider Newton's explanation of the tides.) 

Let's consider another, harder example. A star collapses, and then stops. Why did it stop? 
Well, we might cite the Pauli Exclusion Principle (PEP), and say that if it had collapsed 
further, there would have been electrons sharing the same overall state, which can't be the 
case according to the PEP. Here PEP is not causing the collapse to stop; it just predicts that it 
will stop. Lewis claims that the reason this is explanatory is that it falls into the "negative 
information" category. The reason that the star stopped collapsing is that there was no 
physically permissible state for it to get into. This is information about its causal history, 
in that it describes the terminal point of that history. 

There are other examples, from quantum physics, especially, that seem to give the most 
serious problems for the causal view of explanation, especially Salmon's view that 
explanations in science are typically "common cause" explanations. One basic problem is 
that Salmon's view relies on spatiotemporal continuity, which we cannot assume at the 
particulate level. (Consider tunneling.) Also, consider the Bell-type phenomenon, where 
we have correlated spin-states, and one particle is measured later than the other. Why did 
the one have spin-up when it was measured? Because they started out in the correlated 
state and the other one that was measured had spin-down. You can't always postulate a 
conjunctive fork. This should be an interactive fork of the type cited by Salmon, since 
given the set-up C there is a correlation between the two events. However, it is odd to say 
the setup "caused" the two events when they are space-like separated from each other! We will 
consider these problems in greater detail next week. 

 

Lecture 5 

2/15/94 

The Causal Theory Of Explanation, Part III  

Last time, we discussed Salmon's analysis of causal explanation. To review, Salmon said 
that explanation involved (1) statistical relevance, (2) connection via a causal process, 
and (3) change after a causal interaction. The change is the event to be explained. The 
notion of a causal process was filled out in terms of (a) spatiotemporal continuity and (b) 
the ability to transmit information (a "mark"). While we can sometimes speak simply of 
cause and effect being parts of a single causal process, the final analysis will typically be 
given in terms of more complex (and sometimes indirect) causal connections, of which 
Salmon identifies two basic types: conjunctive and interactive forks. 

Today, we are going to look at these notions a bit more critically. To anticipate the 
discussion, we will discuss in turn (1) problems with formulating the relevant notion of 
statistical relevance (as well as the problems associated with a purely S-R view of 
explanation), (2) the range of causal information that can be offered in an explanation, 
and (3) whether some explanations might be non-causal. If we have time, we will also 
discuss whether (probabilistic or non-probabilistic) causal laws have to be true to play a 
role in good explanations (Cartwright). 

Statistical Relevance 

The basic problem with statistical relevance is specifying what is relevant to what. 
Hempel first noticed in his analysis of statistical explanation that I-S explanations, unlike 
D-N explanations, could be weakened by adding further information. For example, given 
that John has received penicillin, he is likely to recover from pneumonia; however, given 
the further information that he has a penicillin-resistant strain of pneumonia, he is 
unlikely to recover. Thus, we can use true information and statistical laws to explain 
mutually contradictory things (recovery, non-recovery). On the other hand, note that we 
can also strengthen a statistical explanation by adding more information (in the sense that 
the amount of inductive support the explanans gives the explanandum can be increased). 
This "ambiguity" of I-S explanation--relative to one thing, c explains e, relative to 
another, it does not--distinguishes it in a fundamental way from D-N explanation. 
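
The penicillin case can be put in numbers (invented here purely for illustration) to show how one true body of statistical information supports a high-probability I-S argument while a narrower reference class supports the opposite conclusion.

```python
# Invented figures: among pneumonia patients who receive penicillin, a small
# fraction carry a penicillin-resistant strain, with a much lower recovery rate.
p_resistant = 0.05
p_recover_if_not_resistant = 0.95
p_recover_if_resistant = 0.10

# Recovery probability relative to the broad class "received penicillin".
p_recover_given_penicillin = ((1 - p_resistant) * p_recover_if_not_resistant
                              + p_resistant * p_recover_if_resistant)

print(f"pr(recover | penicillin)                    = {p_recover_given_penicillin:.3f}")
print(f"pr(recover | penicillin & resistant strain) = {p_recover_if_resistant:.3f}")
```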

As you know, the inferential view said that the explanans must confer a high probability 
on the explanandum for the explanation to work; Salmon and other causal theorists 
relaxed that requirement and only required that the explanans increase the probability of 
the explanandum, i.e., that it be statistically relevant to the explanandum. Still, the 
ambiguity remains. The probability that Jones will get leukemia is higher given that he 
was located two miles away from an atomic blast when it occurred; but it is lowered 
again when it is added that he had on lead shielding that completely blocked the effects of 
any radiation that might be in the area. This led Hempel, and Salmon, too, to add that the 
explanation in question must refer to statistical laws stated in terms of a "maximally" 
specific reference class (i.e., the class named in the "given" clause) to be an explanation. 
In other words, it is required that dividing the class C further into C1, C2, and so on 
would not affect the statistics, in that pr(E|Ci) = pr(E|Cj). This can be understood in two 
ways, either "epistemically," in terms of the information we have at our disposal, of 
"objectively," in terms of the actual "objective" probabilities in the world. (Hempel only 
recognized the former.) If our reference class can't be divided ("partitioned") into cells 
that give different statistics for E, then we say that the class is "homogeneous" with 
respect to E. The homogeneity in question can be either epistemic or objective: it must be 
the latter if we are really talking about causes rather than what we know about causes. 

The problem with this is that dividing up the class can result in trivialization. For 
example, a subclass of the class of people who receive penicillin to treat their pneumonia 
(P) is the class of those people who recover (R). Obviously, it is always the case that 
pr(R|P&R) = 1. However, this type of statistical law would not be very illuminating to use 
in a statistical explanation of why the person recovered from pneumonia. 

There are various ways around this problem. For example, you can require that the 
statistical law in question not be true simply because it is a theorem of the probability 
calculus (which was the case with pr(R|P&R) = 1). Hempel used this clause in his 
analysis of I-S explanation. Salmon adds that we should further restrict the clause by 
noting that the statistical law in question not refer to a class of events that either (1) 
follow the explanandum temporally, or (2) cannot be ascertained as true or false in 
principle independently of ascertaining the truth or falsity of the explanandum. The first 
requirement is used to block explanations of John's recovery that refer to the class of 
people who are reported on the 6:00 news to have recovered from pneumonia (supposing 
John is famous enough to merit such a report). This is the requirement of maximal 
specificity (Hempel) or that the reference class be statistically homogeneous (Salmon). 

Of course, as we mentioned earlier, there might be many correlations that exist in the 
world between accidental events, such as that someone in Laos sneezes (S) whenever a 
person here recovers from pneumonia (R), so that we have pr(R|P&S) > pr(R|P). (Here 
the probabilities might simply be a matter of the actual, empirical frequencies.) If this 
were the case, however, we would not want to allow the statistical law just cited to occur 
in a causal explanation, since it may be true simply by "accident." We might also demand 
that there be causal processes linking the two events. That's why Salmon was concerned 
to add that the causal processes that link the two events must be specified in a causal 
explanation. The moral of this story is two-fold. Statistical relevance is not enough, even 
when you divide up the world in every way possible. Also, some ways of dividing up the 
world to get statistical relevance are not permissible. 

What Kinds Of Causal Information Can Be Cited In An Explanation? 

Salmon said that there were two types of causal information that could be cited in a 
causal explanation, which we described as conjunctive and interactive forks. Salmon's 
purpose here is to analyze a type of explanation that is commonly used in science, but the 
notion of causal explanation can be considered more broadly than he does. For example, 
Lewis points out that the notion of a causal explanation is quite fluid. In his essay on 
causal explanation, he points out that there is an extremely rich causal history behind 
every event. (Consider the drunken driving accident case.) Like Salmon, Lewis too 
argues that to explain an event is to provide some information about its causal history. 
The question arises, what kind of information? Well, one might be to describe in detail a 
common cause of the type discussed by Salmon. However, there might be many 
situations in which we might only want a partial description of the causal history (e.g., we 
are trying to assign blame according to the law, or we already know a fair chunk of the 
causal history and are trying to find out something new about it, or we just want to know 
something about the type of causal history that leads to events of that sort, and so on). 

Question: How far back in time can we go to find statistically relevant events? Consider the 
probability that the accident will occur. Relevant to this is whether gas was available for 
him to drive his car, whether he received a driver's license when he was young, or even 
whether he lived to the age that he did. All of these are part of the "causal history" 
leading up to the person having an accident while drunk, but we would not want to cite 
any of these as causing the accident. (See section, "Explaining Well vs. Badly.")  

Something to avoid is trying to make the distinction by saying that the availability of gas 
was not "the" cause of the person's accident. We can't really single out a given chunk of 
the causal history leading up to the event to separate "the cause." Lewis separates the 
causal history--any portion of which can in principle be cited in a given explanation--
from the portion of that causal history that we are interested in or find most salient at a 
given time. We might not be interested in information about just any portion of the causal 
history, Lewis says, but it remains the case that to explain an event is to give 
information about the causal history leading up to that event. 

In addition, Lewis points out that the range of ways of giving information about causal 
histories is quite broad. For example, Lewis allows negative information about the causal 
history to count as an explanation (there was nothing to prevent it from happening, there 
was no state for the collapsing star to get into, there was no connection between the CIA 
agent being in the room and the Shah's death, it just being a coincidence, and so on). To 
explain is to give information about a causal history, but giving information about a 
causal history is not limited to citing one or more causes of the event in question. 

 

Lecture 6  

2/17/94 

Problems With The Causal Theory Of Explanation  

Last time we finished our exploration of the content of causal theories of explanation, 
including the kinds of caveats that have to be added to make the theory workable. Today, 
we will examine whether the causal approach as a whole is plausible, and examine an 
alternative view, namely van Fraassen's pragmatic theory of explanation. (Note 
Kourany's use of the term "erotetic.") As I stated at the end of the period last time, there 
are two basic challenges that can be given to the causal view, namely that sometimes 
non-causal generalizations can explain, and that laws can be explained by other laws (a 
relationship that does not seem to be causal, since laws don't cause other laws--neither 
law being an event). 

(1) Non-causal generalizations - Suppose that someone was ignorant of the various gas 
laws, or once learned the laws and has now forgotten them. 

General Law: PV = nRT 

Boyle's Law: At constant temperature, pressure is inversely proportional to 
volume, i.e., PV = constant. 

Charles' Law: At constant pressure, volume is directly proportional to 
temperature, i.e., V/T = constant. 

Pressure Law: At constant volume, pressure is directly proportional to 
temperature, i.e., P/T = constant. 

They wonder why a certain container of gas expands when heated. You 
could then give various answers, such as that the pressure was constant, and then cite 
Charles's law to finish the explanation. Alternately, you might say that there was a more 
complex relationship, where the pressure increased along with the temperature, but that 
the increase in pressure was not enough to compensate for the increase in temperature, so 
that the volume had to rise too, according to the general gas law. Q: Is this an 
explanation? We have to distinguish whether it's the "ultimate" or "fundamental" 
explanation of the phenomenon in question from whether it's an explanation of the 
phenomenon at all. 
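
The "more complex relationship" just described can be checked with the general gas law itself (numbers invented for illustration): if the temperature rises proportionally more than the pressure, V = nRT/P must still increase.

```python
R = 8.314      # gas constant, J/(mol*K)
n = 1.0        # amount of gas, mol (invented for illustration)

# Heating raises the temperature by ~33% but the pressure by only 20%.
T1, P1 = 300.0, 100_000.0    # initial temperature (K) and pressure (Pa)
T2, P2 = 400.0, 120_000.0    # final temperature (K) and pressure (Pa)

V1 = n * R * T1 / P1
V2 = n * R * T2 / P2

print(f"V1 = {V1 * 1000:.1f} L, V2 = {V2 * 1000:.1f} L  # the volume still rises")
```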

Example from statics: an awkwardly posed body is held in place by a transparent rod. 
Why is it suspended in that odd way? Well, there's a rod here, and that compensates for 
the force of gravity. 

Example from Lewis, the collapsing star: Why did it stop? Well, there is no causal story 
that we can tell, other than by giving "negative" information: there was no state for it to 
get into if it collapsed further, because of the Pauli Exclusion Principle. (Here identical 
fermions cannot have the same quantum numbers n, l, m, and m_s.) Here PEP is not 
causing the collapse to stop; it just predicts that it will stop. Lewis claims that the reason 
this is explanatory is that it falls into the "negative information" category. The reason that 
the star stopped collapsing is that there was no physically permissible state for it to get 
into. This is information about its causal history, in that it describes the terminal point of 
that history. 

(2) Explanation of Laws by Laws. Newton explained Kepler's laws (ellipses, equal areas 
in equal times, p^2 = d^3) by deriving them from his laws of motion and gravitation (inertia, 
F = ma, action-reaction, and F = Gm1m2/r^2). This is the kind of explanation to which the 

inferential view is especially well suited (as well as the pragmatic view that we will 
consider next time), but it does not fit immediately into the causal view of explanation 
(Newton's laws don't cause Kepler's laws to be true.) That is because the causal view of 
explanation seems best-suited for explaining particular events, rather than general 
regularities. Lewis responds to several objections to his theory by saying that his theory 
does not intend to cover anything more than explanations of events. (Not a good answer 
as is, since then we would not have a general theory of explanation, but only a description 
of certain kinds of explanation.) However, he does have an answer at his disposal: one 
way to give information about causal histories is to consider what is common to all causal 
histories for events of a given type (e.g., a planet orbiting around a star according to 
Kepler's laws). Q: Is this enough? 
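
The derivation Newton gave is the kind of thing the inferential view handles naturally. As a rough sketch (illustration only, for the special case of a circular orbit of radius r about a central mass M; Newton's own treatment covers ellipses), the "p^2 = d^3" regularity falls out of the dynamical laws as follows:

```latex
% Circular-orbit sketch: gravitation supplies the centripetal force,
% and the orbital speed is v = 2*pi*r / T.
\frac{G M m}{r^{2}} \;=\; \frac{m v^{2}}{r},
\qquad v \;=\; \frac{2\pi r}{T}
\quad\Longrightarrow\quad
T^{2} \;=\; \frac{4\pi^{2}}{G M}\, r^{3}.
```

That is, the square of the period is proportional to the cube of the orbital radius, which is the "p^2 = d^3" relation above restricted to circular orbits.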

There are of course the other problems mentioned earlier: Salmon's view relies on the 
notion of spatiotemporally continuous processes to explain the notion of causal processes. 
(Lewis is not subject to this problem, since he has an alternative theory of causation: 
linkage by chains of non-backtracking counterfactual dependence.) 

 

Lecture 7 

2/22/94 

Van Fraassen's Pragmatic View Of Explanation  

Van Fraassen's pragmatic view of explanation is that an explanation is a particular type of 
answer to a why-question, i.e., an answer that provides relevant information that "favors" 
the event to be explained over its alternatives. For van Fraassen, these features are 
determined by the context in which the why-question is asked. 

The Basic Elements Of The Pragmatic View Of Explanation  

According to van Fraassen, a why-question consists of (1) a presupposition (Why X), (2) 
a contrast class (Why X rather than Y, Z, and so on), and (3) an implicitly understood 
criterion of relevance. Information given in response to a particular why-question 
constitutes an explanation of the presupposition if the information is relevant and 
"favors" the presupposition over the alternatives in its contrast class. (Explain and give 
examples.) 

Both the contrast class and the criterion of relevance are contextually determined, based 
on interests of those involved. Subjective interests define what would count as an 
explanation in that context, but then it's an objective matter whether that information 
really favors the presupposition over the alternatives in its contrast class. (Explain and 
give examples.) 

Contrasts Between The Pragmatic And Causal Views Of Explanation 

Any type of information can be counted as relevant (of course, it's a scientific 

explanation if only information provided by science counts; however, there might 
be different kinds of scientific explanation; not any old information will do).  

Context (interests) determines when something counts as an explanation, vs. when we 

would find an explanation interesting or salient. (According to Lewis, what makes 
it an explanation is that it gives information about the causal history leading up to 
a given event; whether we find that explanatory information interesting or salient 
is another matter.)  

Distinction: Pragmatic theory of explanation vs. theory of the pragmatics of 

explanation. On the pragmatic view, God could never have a "complete" 
explanation of an event, unless he had interests. (A mere description of the causal 
history leading up to an event--even a complete one--is not an explanation of any 
sort according to the pragmatic view.)  

On the pragmatic view, asymmetries only exist because of the context; thus, they can be 
reversed with a change in context. That is what van Fraassen's Tower example is 
supposed to illustrate. (Recount the Tower example.) Lewis' Objection to the Tower 
example: What is really doing the explaining is the intention of the Prince, and that's a 
cause of the flagpole being that particular height. Discuss: Can you think of a story in 
which the redshift would explain the galaxies moving away? Where human intention is 
not possible, it seems difficult; this would seem to confirm Lewis' diagnosis of the Tower 
story. 

 

Lecture 8 

2/24/94 

Carnap vs. Popper 

The idea is to examine what we do when we "test" scientific theories, and how we then 
form opinions regarding those theories based on such tests. What does scientific investigation consist in, and what 
are the rules governing it? 

Induction - Scientific investigation is a type of inductive process, where we increase the 
evidential basis for or against a particular theory without the evidence conclusively 
establishing the theory. "Induction" has had various meanings in the past: in the 
Renaissance, it was thought that the way to develop scientific theories was to examine all 
the evidence you could and then extrapolate from that to form theories. (This was a 
method of developing theories as well as a method of justifying them.) This was in 
contrast to positing "hypotheses" about unobservable entities to explain the phenomena. 
(Indeed, Newton was criticized for formulating such "hypotheses.") This view did not 
survive, however, since it became apparent that you can't form theories in this way. Thus, 
we have to understand "induction" differently (supposing that it is a useful concept at all). 

Carnap is an inductivist, and in this respect he differs from Popper. However, both agree 
(taking inspiration from Hume) that there is a serious problem with the justification of 
"inductive inference." Carnap discusses it in terms of a puzzle about how we arrive at and 
form opinions regarding laws. (Note that notion of a law that Carnap assumes is similar 
to Hempel's.) Laws are universal statements (at least), hence apply to an at least 
potentially infinite domain. However, our empirical data is always finite. (Consider ideal 
gas law.) What does deductive logic give us to evaluate theories? 

Suppose that h -> e1, e2, e3, .... If we show the truth of any finite number of the ei, we 
haven't thereby established h; however, if we show that even one of them is false, h must 
also be false. 
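
The asymmetry can be illustrated with a toy example of my own (nothing here is Carnap's or Popper's formalism): finitely many verified consequences leave a universal hypothesis unproven, while one false consequence refutes it.

```python
# Toy illustration of the asymmetry between verification and falsification.
# The "hypothesis" below is arbitrary and assumed only for this sketch.

def h(x: int) -> bool:
    """A universal hypothesis: every case we examine satisfies x < 1000."""
    return x < 1000

# Verifying finitely many consequences e_i does not establish h ...
print(all(h(x) for x in range(100)))   # True, yet h is not thereby proven

# ... but a single false consequence deductively refutes h.
print(h(2000))                         # False: h is falsified
```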

Thus deductive logic cannot give us the tools to establish a scientific theory; we cannot 
"infer" from evidence to theory. There are different ways in which this can be so. Carnap 
distinguishes between empirical and theoretical laws. The latter refer to unobservable 
entities or properties; theoretical laws are absolutely necessary in science, but they cannot 
simply be derived from large bodies of research. Science postulates new categories of 
things, not just generalizations about regularities among things we can observe. (Consider the 
kinetic theory of gases.) The upshot is that theories can always be wrong, no matter how 
much evidence we've found that is consistent with those theories. Scientific theories are 
not "proven" in the sense that, given a certain body of empirical data, they are immune 
from all refutation. Q: How can we form an opinion regarding theories? Moreover, what 
kind of opinion should we form?  

Both of these authors assume that the answer to this question will be in the form of a 
"logic of scientific discovery." (Logic = formal system of implication, concerned with 
relations of implication between statements, but not relations of statements to the world, 
i.e., whether they are true or false). Indeed, this is the title of Popper's famous book. The 
point on which they differ is the following: Is deductive logic all that we're limited to in 
science? (Popper - yes, no inductive logic; Carnap - No, there's an "inductive logic" too, 
which is relevant to scientific investigation.) 

Popper and Carnap also agree that there is a distinction between the contexts of 
justification and discovery. This is a contentious view, as we'll see when we start looking 
at Kuhn and Laudan. The traditional approach to induction assumed that gathering 
evidence would enable one to formulate and justify a theory. (Consider Popper's example 
of a person who gave all his "observations" to the Royal Society. Also, consider the 
injunction to "Observe!") Both Carnap and Popper distinguish the two: what they were 
concerned with is not how to come up with scientific hypotheses--this is a creative 
process (e.g., the formulation of a theoretical framework or language), not bound by rules 
of logic but only (perhaps) laws of psychology--but how to justify our hypotheses once 
we come up with them. Carnap's answer is that we can't "prove" our hypotheses but we 
can increase (or decrease) their probability by gathering evidence. (This is inductive since 
you proceed to partial belief in a theory by examining evidence.) 

Let's make this a bit more precise: is induction a kind of argument? That is, is there such 
a thing as "inductive inference?" This is one view of inductive logic: a formal theory 
about relations of (partial) implication between statements. It can be thought of in two 
ways. (1) You accept a conclusion (all-or-nothing) based on evidence that confirms a 
theory to a certain degree (e.g., if sufficiently high). (2) You accept a conclusion to a 
certain degree (i.e., as more or less probable) based on certain evidence. The latter is 
what Carnap's approach involves. The basic notion there is degree of confirmation. What 
science does when it "justifies" a theory (or tests it) is to provide evidence that confirms 
or disconfirms it to a certain degree, i.e., makes it more or less probable than it was 
before the evidence was considered. 

We've already made informal use of the notion of probability. However, the precise sense 
of the term turns out to be of importance when thinking about induction in Carnap's 
sense. 

•  Frequentist (Statistical) 

•  Inductive (Logical) 

•  Objective (Physical Propensity) 

•  Subjective (Personal Degree of Belief) 

Carnap thought that the Logical notion was the one operative in scientific reasoning. 
Analogy with deductive logic: formal, relation between statements because of their 
formal properties (i.e., irrespective of what facts obtain). Disanalogy: No acceptance or 
belief: pr(h|e) = x means that e partially implies h, to degree x. Only partial belief, guided 
by the logical probabilities. The latter is a matter of the logical relationship between the 
two statements; it should guide our opinion (degrees of belief), but it does not on 
Carnap's view reduce to it. Scientific reasoning is then the formulation of a broad 
framework (language) in which theories can be expressed; inherent in that framework 
will be relations of partial implication (conditional logical probabilities) between 
evidence and hypotheses, i.e., pr(h|e). Evidence confirms the theory if pr(h|e) > pr(h) (and 
if pr(h|e) is sufficiently high). Then making an observation involves determining whether 
e is true; we seek to develop and confirm theories in this way. 
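
Since Carnap's logical probabilities are defined over the sentences of a formal language, the following is only a numerical toy with assumed numbers; it illustrates nothing beyond the bare condition pr(h|e) > pr(h), computed by ordinary probability arithmetic (Bayes' theorem):

```python
# Toy numbers (assumed purely for illustration) showing the confirmation
# condition pr(h|e) > pr(h). This is ordinary probability arithmetic via
# Bayes' theorem, not Carnap's actual system of logical probability.

pr_h = 0.2              # prior probability of hypothesis h
pr_e_given_h = 0.9      # probability of evidence e if h is true
pr_e_given_not_h = 0.3  # probability of e if h is false

pr_e = pr_e_given_h * pr_h + pr_e_given_not_h * (1 - pr_h)   # total probability
pr_h_given_e = pr_e_given_h * pr_h / pr_e                    # Bayes' theorem

print(round(pr_h_given_e, 3))   # 0.429
print(pr_h_given_e > pr_h)      # True: e confirms h to some degree
```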

Popper rejected this framework altogether. Popper thought that Carnap's inductive logic--
indeed, the very idea of inductive logic--was fundamentally mistaken. (How do we 
establish our starting point in Carnap's theory, i.e., the logical probabilities? It didn't work 
out very well.) There is no such thing: the only logic available to science is deductive 
logic. (Note that this doesn't mean that you can't use statistics. Indeed, you'll use it 
frequently, though the relationship between statistical laws and statistical data will be 
deductive.) What is available then? Well, we can't verify a theory (in the sense of 
justifying belief or partial belief in a hypothesis by verifying its predictions), but we can 
certainly falsify a theory using purely deductive logic. If h -> e1, e2, e3, ..., then if even 
one of e1, e2, e3, ... turns out to be false, the theory as a whole is falsified, and must be 
rejected (unless it can be amended to account for the falsity of the falsifying evidence). 
Thus, Popper argues that science proceeds ("advances") by a process of conjecture and 
refutation. He summarizes several important features of this process as follows (page 
141). 

•  It's too easy to get "verifying" evidence; thus, verifying evidence is of no intrinsic 
value.  

•  To be of any use, predictions should be risky.  

•  Theories are better (have more content) the more they restrict what can happen.  

•  Theories that are not refutable by some possible observation are not scientific 
(criterion of demarcation).  

•  To "test" a theory in a serious sense is to attempt to falsify it.  

•  Evidence only "corroborates" a theory if it is the result of a serious ("genuine") test.  

The process is then to start with a conjecture and try to falsify it; if that succeeds, move 
on to the next conjecture, and so on, until you find a conjecture that you do not falsify. 
Keep on trying, though. If you have trouble falsifying it, you say that it has been 
"corroborated." This does not mean that it has a high degree of probability, however. It 
still may be improbable, given the evidence at hand. (Indeed, it should be improbable if it 
says anything of interest.) We only tentatively accept scientific theories, while continuing 
to try to refute them. (Here "tentative acceptance" does not mean to believe that they are 
true, or even to be highly confident in their truth.) 
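
Purely as a schematic summary (my own sketch, not Popper's; the names are placeholders), the conjecture-and-refutation cycle just described can be pictured as a loop that discards falsified conjectures and tentatively retains whatever survives serious testing:

```python
# Schematic sketch of the conjecture-and-refutation cycle described above.
# 'conjectures' and 'run_severe_test' are placeholder names assumed here.
from typing import Callable, Iterable, Optional


def conjecture_and_refutation(
    conjectures: Iterable[str],
    run_severe_test: Callable[[str], bool],  # True = the conjecture survives the test
    tests_per_conjecture: int = 100,
) -> Optional[str]:
    for c in conjectures:
        if all(run_severe_test(c) for _ in range(tests_per_conjecture)):
            # Surviving serious attempts at refutation "corroborates" c; it is
            # tentatively accepted, not proven and not thereby made probable.
            return c
        # Otherwise c has been falsified: discard it and move to the next conjecture.
    return None
```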

 

Lecture 9 

3/1/94 

An Overview Of Kuhn's The Structure Of Scientific Revolutions 

Today we will be looking at Kuhn's The Structure of Scientific Revolutions (SSR) very 
broadly, with the aim of understanding its essentials. As you can gather from the title of 
Kuhn's book, he is concerned primarily with those episodes in history known as 
"scientific revolutions." During periods of this sort, our scientific understanding of the 
way the universe works is overthrown and replaced by another, quite different 
understanding. 

According to Kuhn, after a scientific discipline matures, its history consists of long 
periods of stasis punctuated by occasional revolutions of this sort. Thus, a scientific 
discipline goes through several distinct types of stages as it develops. 

I. The Pre-Paradigmatic Stage 

Before a scientific discipline develops, there is normally a long period of somewhat 
inchoate, directionless research into a given subject matter (e.g., the physical world). 
There are various competing schools, each of which has a fundamentally different 
conception of what the basic problems of the discipline are and what criteria should be 
used to evaluate theories about that subject matter. 

II. The Emergence Of Normal Science 

Out of the many competing schools that clutter the scientific landscape during a 
discipline's pre-paradigmatic period, one may emerge that subsequently dominates the 
discipline. The practitioners of the scientific discipline rally around a school that proves 
itself able to solve many of the problems it poses for itself and that holds great promise 
for future research. There is typically a particular outstanding achievement that causes the 
discipline to rally around the approach of one school. Kuhn calls such an achievement a 
"paradigm." 

A. Two Different Senses Of "Paradigm"--Exemplar And Disciplinary Matrix 

Normal science is characterized by unanimous assent by the members of a scientific 
discipline to a particular paradigm. In SSR, Kuhn uses the term paradigm to refer to two 
very different kinds of things. 

1. Paradigms As Exemplars 

Kuhn at first uses the term "paradigm" to refer to the particular, concrete achievement 
that defines by example the course of all subsequent research in a scientific discipline. In 
his 1969 postscript to SSR, Kuhn refers to an achievement of this sort as an "exemplar." 
Among the numerous examples of paradigms Kuhn gives are Newton's mechanics and 
theory of gravitation, Franklin's theory of electricity, and Copernicus' treatise on his 
heliocentric theory of the solar system. These works outlined a unified and 
comprehensive approach to a wide-ranging set of problems in their respective disciplines. 
As such, they were definitive in those disciplines. The problems, methods, theoretical 
principles, metaphysical assumptions, concepts, and evaluative standards that appear in 
such works constitute a set of examples after which all subsequent research was 
patterned. (Note, however, that Kuhn's use of the term "paradigm" is somewhat 
inconsistent. For example, sometimes Kuhn will refer to particular parts of a concrete 
scientific achievement as paradigms.)  

2. Paradigms As Disciplinary Matrices 

Later in SSR, Kuhn begins to use the term "paradigm" to refer not only to the concrete 
scientific achievement as described above, but to the entire cluster of problems, methods, 
theoretical principles, metaphysical assumptions, concepts, and evaluative standards that 
are present to some degree or other in an exemplar (i.e., the concrete, definitive scientific 
achievement). In his 1969 postscript to SSR, Kuhn refers to such a cluster as a 
"disciplinary matrix." A disciplinary matrix is an entire theoretical, methodological, and 
evaluative framework within which scientists conduct their research. This framework 
constitutes the basic assumptions of the discipline about how research in that discipline 
should be conducted as well as what constitutes a good scientific explanation. According 
to Kuhn, the sense of "paradigm" as a disciplinary matrix is less fundamental than the 
sense of "paradigm" as an exemplar. The reason for this is that the exemplar essentially 

background image

defines by example the elements in the framework that constitutes the disciplinary 
matrix. 

B. Remarks On The Nature Of Normal Science 

1. The Scientific Community 

According to Kuhn, a scientific discipline is defined sociologically: it is a particular 
scientific community, united by education (e.g., texts, methods of accreditation), 
professional interaction and communication (e.g., journals, conventions), as well as 
similar interests in problems of a certain sort and acceptance of a particular range of 
possible solutions to such problems. The scientific community, like other communities, 
defines what is required for membership in the group. (Kuhn never completed his 
sociological definition of a scientific community, instead leaving the task to others.) 

2. The Role Of Exemplars 

Exemplars are solutions to problems that serve as the basis for generalization and 
development. The goal of studying an exemplar during one's scientific education is to 
learn to see new problems as similar to the exemplar, and to apply the principles 
applicable to the exemplar to the new problems. A beginning scientist learns to abstract 
from the many features of a problem to determine which features must be known to 
derive a solution within the theoretical framework of the exemplar. Thus, textbooks often 
contain a standard set of problems (e.g., pendulums, harmonic oscillators, inclined plane 
problems). You can't learn a theory by merely memorizing mathematical formulas and 
definitions; you must also learn to apply these formulas and definitions properly to solve 
the standard problems. This means that learning a theory involves acquiring a new way of 
seeing, i.e., acquiring the ability to group problems according to the theoretical principles 
that are relevant to those problems. The "similarity groupings" of the mature scientist 
distinguish her from the scientific neophyte. 

3. Normal Science As "Puzzle-Solving" 

According to Kuhn, once a paradigm has been accepted by a scientific community, 
subsequent research consists of applying the shared methods of the disciplinary matrix to 
solve the types of problems defined by the exemplar. Since the type of solution that must 
be found is well defined and the paradigm "guarantees" that such a solution exists 
(though the precise nature of the solution and the path that will get you to a solution is 
often not known in advance), Kuhn characterizes scientific research during normal or 
paradigmatic science as "puzzle-solving." 

III. The Emergence Of Anomaly And Crisis 

Though the paradigm "guarantees" that a solution exists for every problem that it poses, it 
occasionally happens that a solution is not found. If the problem continues to persist after 
repeated attempts to solve it within the framework defined by the paradigm, scientists 
may become acutely distressed and a sense of crisis may develop within the scientific 
community. This sense of desperation may lead some scientists to question some of the 
fundamental assumptions of the disciplinary matrix. Typically, competing groups will 
develop strategies for solving the problem, which at this point has become an "anomaly," 
that congeal into differing conceptual "schools" of thought much like the competing 
schools that characterize pre-paradigmatic science. The fundamental assumptions of the 
paradigm will become subject to widespread doubt, and there may be general agreement 
that a replacement must be found (though often many scientists continue to persist in 
their view that the old paradigm will eventually produce a solution to the apparent 
anomaly). 

IV. The Birth And Assimilation Of A New Paradigm 

Eventually, one of the competing approaches for solving the anomaly will produce a 
solution that, because of its generality and promise for future research, gains a large and 
loyal following in the scientific community. This solution comes to be regarded by its 
proponents as a concrete, definitive scientific achievement that defines by example how 
research in that discipline should subsequently be conducted. In short, this solution plays 
the role of an exemplar for the group--thus, a new paradigm is born. Not all members of 
the scientific community immediately rally to the new paradigm, however. Some resist 
adopting the new problems, methods, theoretical principles, metaphysical assumptions, 
concepts, and evaluative standards implicit in the solution, confident in their belief that a 
solution to the anomaly will eventually emerge that preserves the theoretical, 
methodological, and evaluative framework of the old paradigm. Eventually, however, 
most scientists are persuaded by the new paradigm's growing success to switch their 
loyalties to the new paradigm. Those who do not may find themselves ignored by 
members of the scientific community or even forced out of that community's power 
structure (e.g., journals, university positions). Those who hold out eventually die. The 
transition to the new paradigm is complete. 

Lecture 10 

3/3/94 

Paradigms and Normal Science 

Last time we examined in a very general way Kuhn's account of the historical 
development of particular scientific disciplines. To review, Kuhn argued that a scientific 
discipline goes through various stages: the pre-paradigmatic, paradigmatic ("normal"), 
and revolutionary (transitional, from one paradigm to another). Each stage is 
characterized in terms of the notion of a paradigm, so it is highly important that we 
discuss this notion in detail. Today, we will limit ourselves primarily to the context of the 
transition from pre-paradigmatic science to "normal" (paradigm-governed) science.  

Paradigms 

Let's look a bit at Kuhn's characterization of the notion of paradigm. He introduces 
paradigms first as "universally recognized scientific achievements that for a time provide 
model problems and solutions to a community of practitioners" (page x). A paradigm is 
"at the start largely a promise of success discoverable in selected and still incomplete 
examples" (pages 23-24), and it is "an object for further articulation and specification 
under new or more stringent conditions" (page 23); hence from paradigms "spring 
particular coherent traditions of scientific research" (page 10) that Kuhn calls "normal 
science." Normal science consists primarily of developing the initial paradigm "by 
extending the knowledge of those facts that the paradigm displays as particularly 
revealing, by increasing the extent of the match between those facts and the paradigm's 
predictions, and by further articulation of the paradigm itself" (page 24). The paradigm 
provides "a criterion for choosing problems that, while the paradigm is taken for granted, 
can be assumed to have solutions" (page 27). Those phenomena "that will not fit the box 
are often not seen at all" (page 24). Normal science "suppresses fundamental novelties 
because they are necessarily subversive of its basic commitments." Nevertheless, not all 
problems will receive solutions within the paradigm, even after repeated attempts, and so 
anomalies develop that produce "the tradition-shattering complements to the tradition-
bound activity of normal science" (page 6). 

At the outset of SSR, we are told that a paradigm is a concrete achievement that provides 
a model for subsequent research. Such achievements are referred to in textbooks, 
lectures, and laboratory exercises; typically, they are the standard problems that a student 
is required to solve in learning the discipline. The task given the student provides some of 
the content of the mathematical equations (or more generally, theoretical descriptions) 
that comprise the main body of the text. In other parts of SSR, however, we are told that a 
paradigm is much more, e.g., that it includes law, theory, application, and instrumentation 
together (page 10); or that it is a set of commitments of various sorts, including 
conceptual, theoretical, instrumental, methodological, and quasi-metaphysical 
commitments (pages 41-42). Paradigms sometimes are characterized as definitive, 
concrete patterns or models for subsequent research, but at other times seem to be 
characterized as vague theories or theory schemas to be subsequently articulated. In its 
broadest sense, the paradigm is taken to include theories, laws, models, concrete 
applications (exemplars--"paradigms" in the narrower sense), explicit or implicit 
metaphysical beliefs, standards for judging theories, and particular sets of theoretical 
values. In short, anything that is accepted or presupposed by a particular scientific 
community can seemingly be part of a "paradigm." 

There is no doubt that all these elements are present in science. The question is whether it 
is informative to characterize science in this way. A basic problem is as follows: if 
"paradigm" is defined in its broadest sense, where anything that is accepted or 
presupposed by a scientific community is part of the "paradigm" that defines that 
community, then it is a relatively trivial matter to say that there are paradigms. (That is, 
it's not really a substantive historical thesis to say that scientific eras are defined by the 
universal acceptance of a paradigm if a paradigm is simply defined as whatever is 
universally accepted.) 

In his 1969 Postscript to SSR, Kuhn recognizes these problems and distinguishes 
between two senses of "paradigm" as used in SSR: a "disciplinary matrix" (framework) 
and an "exemplar." The former (disciplinary matrix) is the entire framework--conceptual, 
methodological, metaphysical, theoretical, and instrumental--assumed by a scientific 
tradition. The latter (exemplars) are the concrete, definitive achievements upon which all 
subsequent research is patterned. Kuhn's thesis about paradigms is not empty, since he 
argues that the definitive, concrete achievement ("paradigm" in the narrow sense) 
provides the foundation of the disciplinary matrix ("paradigm" in the broader sense). In 
other words, a scientific tradition is defined not by explicitly stated theories derived by 
explicit methodological rules, but by intuitive abstraction from a particular, concrete 
achievement. Let's look at this in more detail.  

The function of textbook examples - Textbook examples provide much of the content of 
the mathematical or theoretical principles that precede them. Here's a common 
phenomenon: you read the chapter, think you understand, but can't do the problems. This 
doesn't evince any failure on your part to understand the text; it simply shows that 
understanding the theory cannot be achieved simply by reading a set of propositions. 
Learning the theory consists in applying it (it's not that you learn the theory first and then 
learn to apply it). In other words, knowledge of a theory is not always propositional 
knowledge (knowledge that--accumulation of facts); it is sometimes procedural or 
judgmental knowledge (knowledge how--acquiring judgmental skills). Scientists agree on 
the identification of a paradigm (exemplar), but not necessarily on the full interpretation 
or rationalization of the paradigm (page 44).  

Scientific training is a process of incorporation into a particular community - The goal of 
training is to get the student to see new problems as like the standard problems in a 
certain respect: to group the problems that he will be faced with into certain similarity 
classes, based on the extent to which they resemble the concrete, standard exemplars. 
Being able to group things in the right way, and to attack similar problems using methods 
appropriate for solving that particular type of problem evinces understanding and so 
incorporation into the community. Kuhn's thesis is that it does not evince the 
internalization of methodological rules explicit or implicit in scientific procedure. (Here a 
"rule" is understood as a kind of statement about how to proceed in certain 
circumstances.) That is not to say that scientific practice cannot be codified into rules, in 
whole or in large part; it is just that rules of how to proceed are not required, nor does 
scientific competence (in the sense of judgmental skills) consist in the acquisition of rules 
(statements about how to proceed) and facts (statements about the way the world is). This 
distinguishes him from Carnap and Popper, who see at least the "justification" of 
scientific theories as a rule-governed procedure. 

Thus, to adopt a paradigm is to adopt a concrete achievement as definitive for the 
discipline. (Example: Newton's Principia, with its application to particular problems such 
as the tides, the orbits of the planets, terrestrial motion such as occurs with projectiles, 
pendulums, and springs, and so on.) It's definitive in the sense that the methods that were 
used, the result that was obtained, and the assumptions behind the methods (mathematical 
generalizations, techniques for relating the formalism to concrete situations, etc.) become 
the pattern on which subsequent research is modeled. The 
discipline then grows by extending these procedures to new areas--the growth is not 
simply one of "mopping up" however (despite Kuhn's own characterization of it as such), 
but rather an extension of the disciplinary matrix (framework) by organic growth. (No 
sufficient & necessary conditions; instead, a family resemblance.) 

Normal Science 

Pre-Paradigmatic Science - Kuhn's model is the physical sciences; almost all of his 
examples are taken from Astronomy, Physics, or Chemistry. This affects his view of 
science. Consider, on the other hand, Psychology, Sociology, and Anthropology: are they 
"pre-paradigmatic"? Is there any universally shared framework? If they are pre-paradigmatic, 
are they sciences at all? 

•  Some properties - each person has to start anew from the foundations (or at least, each 
subgroup); disagreement over fundamentals (what counts as an interesting problem, what 
methods should be used to solve the problem, what problems have been solved); in this 
context, all facts seem equally relevant: 

In the absence of a paradigm or some candidate for a paradigm, all of the facts that could 
possibly pertain to the development of a given science are likely to seem equally relevant. 
As a result, early fact-gathering is a far more nearly random activity than the one that 
subsequent scientific research makes familiar ... [and] is usually restricted to the wealth 
of data that lie ready to hand (page 15).  

Thus, some important facts are missed (e.g., electrical repulsion). 

Why accept a paradigm?  

•  It solves a class of important though unsolved problems.  

•  The solution has great scope and so holds much promise for generating further 
research. That is, it must be both broad enough and incomplete enough to provide 
the basis for further research: "it is an object for further articulation and 
specification under new or more stringent conditions" (page 23).  

•  It gives direction and focus to research, i.e., selects out a feasible subset of facts and 
possible experiments as promising or relevant. It says what kinds of articulations 
are permissible, what kinds of problems have solutions, what kind of research is 
likely to lead to a solution, and the form an adequate solution can take. A puzzle: 
the outcome is known, the path that one takes to it is not (also, it requires 
ingenuity to get there).  

Important to recognize that the form of the solution is not exactly specified, but only the 
categories in which it will be framed and the set of allowable paths that will get one there. 
However, this again is not a rule-bound activity, but imitation (and extension) of a 
pattern. "Normal science can proceed without rules only so long as the relevant scientific 
community accepts without question the particular problem-solutions already achieved. 
Rules should therefore become important and the characteristic unconcern about them 
should vanish whenever paradigms or models are felt to be insecure" (page 47). 

Discuss: What level of unanimity exists in present sciences? Aren't there disagreements 
even in physics? 

 

Lecture 11 

3/8/94 

Anomaly, Crisis, and the Non-Cumulativity of Paradigm Shifts 


Last time we discussed the transition from pre-paradigmatic science to paradigmatic or 
"normal" science, and what that involves. Specifically, we discussed the nature and role 
of a paradigm, distinguishing between the primary, narrow sense of the term (an 
exemplar, i.e., a definitive, concrete achievement) and a broader sense of the term (a 
disciplinary matrix or framework, which includes conceptual, methodological, 
metaphysical, theoretical, and instrumental components). We also noted that though 
Kuhn usually talks about each scientific discipline (as distinguished roughly by academic 
department) as having its own paradigm, the notion is more flexible than that and can 
apply to sub-disciplines (such as Freudian psychoanalysis) within a broader discipline 
(psychology). The question of individuating paradigms is a difficult one (though Kuhn 
sometimes speaks as if it's a trivial matter to identify a paradigm), but we will not 
investigate it any further. Now we want to turn to how paradigms lead to their own 
destruction, by providing the framework necessary to discover anomalies, some of which 
lead to the paradigm's downfall. 

As you recall, normal science is an enterprise of puzzle-solving according to Kuhn. 
Though the paradigm "guarantees" that the puzzles it defines have solutions, this is not 
always the case. Sometimes puzzles do not admit of solution within the framework 
(disciplinary matrix) provided by the paradigm. For example, the phlogiston theory of 
combustion found it difficult to explain the weight gain observed when certain substances 
were burned or heated. Since combustion was the loss of a substance on that view, there 
should be weight loss. This phenomenon was not taken to falsify the theory, however 
(contrary to what Popper might claim); instead, phlogiston theorists attempted to account 
for the difference by postulating that phlogiston had "negative mass," or that "fire 
particles" sometimes entered an object upon burning. The paradigm eventually collapsed 
for several reasons: 

•  None of the proposed solutions found general acceptance; i.e., there was a 
multiplicity of competing solutions to the puzzle presented by the anomaly 
(weight gain).  

•  The proposed solutions tended to create as many problems as they solved; one 
method of explaining weight gain tended to imply that there would be weight gain 
in cases in which there was none.  

•  This led to a sense of crisis among many practitioners in the field; the field begins 
to resemble the pre-paradigmatic stage (i.e., nearly random attempts at 
observation, experimentation, theory formation), signaling the demise of the old 
paradigm. Nevertheless, it is not abandoned: to abandon it would be to abandon 
science itself, on Kuhn's view.  

•  Eventually, a competitor arose that seemed more promising than any of the other 
alternatives, but which involved a substantial conceptual shift. This was the 
oxygen theory of combustion. 

Paradigm Change as Non-Cumulative 

There is a common but oversimplified picture of science that sees it as a strictly 
cumulative enterprise (science progresses by finding out more and more about the way 
the world works). The "more and more" suggests that nothing is lost. Kuhn argues that on 
the contrary there are substantial losses as well as gains when paradigm shifts occur. Let's 
look at some examples. 

•  Some problems no longer require solution, either because they make no sense in 
the new paradigm, or they are simply rejected.  

•  Standards for evaluating scientific theories alter along with the problems that a 
theory must, according to the paradigm, solve.  

Example: Newtonian physics introduced an "occult" element (forces), against the 
prevailing corpuscular view that all physical explanation had to be in terms of 
collisions and other physical interactions between particles. Newton's theory did not 
accord with that standard, but it solved many outstanding problems. (The corpuscular view 
could not explain, other than in a rough qualitative way, why planets would move in 
orbits; Kepler's Laws were separate descriptions of how they did so. Thus it was a great 
achievement when postulating forces led to the derivation of Kepler's Laws.) "Forces" 
were perceived by many as odd, "magical" entities. Newton himself tried to develop a 
corpuscular theory of gravitation, without success, as did many Newtonian scientists who 
followed him. Eventually, when it became apparent that the effort was futile, the standard 
that corpuscular-mechanistic explanation was required was simply disregarded, and 
gravitational attraction was accepted as an intrinsic, unexplainable property of matter. 

Another example: Chemistry before Dalton aimed at explaining the sensory qualities of 
compounds (colors, smells, sounds, etc.). Dalton's atomic paradigm was only suited for 
explaining why compounds went together in certain proportions; so when it was 
generally accepted the demand that a theory explain sensory qualities was dropped. 

Other examples: (a) Phlogiston. The problems of explaining how phlogiston combined 
with calxes to form particular metals were abandoned when phlogiston itself was 
abandoned. (b) The ether (the supposed medium for electromagnetic radiation). The problem of 
explaining why movement through the ether wasn't detectable vanished along with the ether. 
There simply was no problem to be solved. (c) The Michelson-Morley experiment. This experiment 
was first "explained" by Lorentz based on his theory of the electron, which implied that since 
forces holding together matter were electromagnetic and hence influenced by movement 
through the ether, parts of bodies contract in the direction of motion when moving 
through the ether. Relativity explains the contraction, but in a new conceptual framework 
that does not include the ether. 

•  Some new entities are introduced along with the paradigm; indeed, they only make 
sense (i.e., can be conceptualized) when introduced as such. (Oxygen wasn't 
"discovered" until the oxygen theory of combustion was developed.)  

•  Elements that are preserved may have an entirely different status. (The constant 
speed of light is a postulate in special relativity theory, whereas it had been a 
consequence of Maxwell's theory.) Substantive conceptual theses might be treated as 
"tautologies" (e.g., Newton's Second Law).  


The fact that new standards, concepts, and metaphysical pictures are introduced makes 
the paradigms not only incompatible, but also "incommensurable." Paradigm shift is a 
shift in worldview. 

Lecture 12 

3/10/94 

Incommensurability 

At the end of the lecture last time, I mentioned Kuhn's view that different paradigms are 
incommensurable, i.e., that there is no neutral standpoint from which to evaluate two 
different paradigms in a given discipline. To put the matter succinctly, Kuhn argues that 
different paradigms are incommensurable (1) because they involve different scientific 
language, which express quite different sorts of conceptual frameworks (even when the 
words used are the same), (2) because they do not acknowledge, address, or perceive the 
same observational data, (3) because they are not concerned to answer the same 
questions, or resolve the same problems, and (4) because they do not agree on what counts as an 
adequate, or even legitimate, explanation.  

Many authors took the first sense of incommensurability (linguistic, conceptual) to be the 
primary one: the reason scientists differed with regard to the paradigms, and there was no 
neutral standpoint to decide the issues between the two paradigms, is that there is no 
language (conceptual scheme) in which the two paradigms can be stated. That is why the 
two sides "inevitably talk past each other" during revolutionary periods. Kuhn seems to 
assume that because two theories differ in what they say mass is like (it is conserved vs. it 
is not, and exchangeable with energy) that the term "mass" means something different to 
the two sides. Thus, there is an assumption implicit in his argument about how theoretical 
terms acquire meaning, something like the following. 

The two sides make very different, incompatible claims about mass. 

The theoretical context as a whole (the word and its role within a paradigm) 
determines the meaning of a theoretical or observational term.  

Therefore, the two sides mean something different by "mass." 

On this interpretation of Kuhn, the two sides of the debate during a revolution talk past 
each other because they're simply speaking different (but very similar sounding) 
languages. This includes not only the abstract terms like "planet" or "electron," but 
observational terms like "mass," "weight," "volume" and so on. This contrasts with the 
older ("positivist") view espoused by Carnap, for example, which held that there was a 
neutral observational language by which experimental results could be stated when 
debating the merits (or deficiencies) of different candidates for paradigm status. The two 
scientists might disagree on whether mass is conserved, but agree on whether a pointer on 
a measuring apparatus is in a certain position. If one theory predicts that the pointer will 
be in one location, whereas the other predicts it will be in a different location, the two cannot 
both be correct and so we then need only check to see which is right. On Kuhn's view (as 
we are interpreting him here), this is a naive description, since it assumes a sharp 
dichotomy between theoretical and observational language. (On the other hand, if his 
view is simply holistic, so that any change in belief counts as a conceptual change, it is 
implausible, possibly vacuous.) 

This interpretation of Kuhn makes him out to be highly problematic: if the two groups are 
talking about different things, how can they really conflict or disagree with one another? 
If one group talks about "mass" and another about "mass*" it's as if the two are simply 
discussing apples and oranges. Theories are on this interpretation strictly incomparable. 
Newtonian and relativistic mechanics could not be rivals. This is important since Kuhn 
himself speaks as though the anomalies are the deciding point between the two theories: 
one paradigm cannot solve them, the other can. If they are dealing with different empirical 
data, then they're not even trying to solve the same thing. 

The problem becomes more acute when you consider remarks that Kuhn makes that seem 
to mean that the conceptual scheme (paradigm) is self-justifying, so that any debate 
expressed in the different languages of the two groups will necessarily be circular at some 
point. That is, there is no compelling reason to accept one paradigm unless you already 
accept that paradigm: reasons to accept a paradigm are conclusive only within that 
paradigm itself. If that is so, however, what reason could anyone have to give up the old 
paradigm? (In addition, it cannot be literally correct that the paradigm is self-justifying, 
since otherwise there would be no anomalies.) 

An alternative view would reject the thesis that the incommensurability of scientific 
concepts or language is the primary one; rather, the primary incommensurability is that between 
scientific problems. That is, if the two paradigms view different problems as demanding 
quite different solutions, and accept different standards for evaluating proposed solutions 
to those problems, they may overlap conceptually to a large degree, enough to disagree 
and be rivals, but still reach a point at which the disagreement cannot be settled by appeal 
to experimental data or logic. "When paradigms change," he says, "there are usually 
significant shifts in the criteria determining both the legitimacy of problems and of 
proposed solutions...." "To the extent ... that two scientific schools disagree about what is 
a problem and what is a solution, they will inevitably talk through each other when 
debating the relative merits of their respective paradigms." The resulting arguments will 
be "partially circular." "Since no paradigm ever solves all the problems it defines and 
since no two paradigms leave all the same problems unsolved, paradigm debates always 
involve the question: which problems is it more significant to have solved?" 

On this view, what makes theories "incommensurable" with each other is that they differ 
on their standards of evaluation; this difference is the result of their accepting different 
exemplars as definitive of how work in that discipline should proceed. They are, indeed, 
making different value judgments about research in their discipline. 

How are different value judgments resolved? One focus of many critics has been Kuhn's 
insistence on comparing scientific revolutions with political or religious revolutions, and 
on describing paradigm change as a kind of "conversion." Since conversion is not a rational 
process, it is argued, then this comparison suggests that neither is scientific revolution, 
and so science is an irrational enterprise, where persuasion--by force, if necessary--is the 
only way for proponents of the new paradigm to gain ascendancy. Reasoned debate has 
no place during scientific revolutions. Whether this is an apt characterization of Kuhn's 
point depends on whether conversion to a religious or political viewpoint is an irrational 
enterprise. 

Kuhn himself does not endorse the radical conclusions just outlined; he does not view 
science as irrational. In deciding between different paradigms, people can give good 
reasons for favoring one paradigm over another, he says; it is just that those reasons 
cannot be codified into an algorithmic "scientific method," that would decide the point 
"objectively" and conclusively. There are different standards of evaluation for what 
counts as the important problems to solve, and what counts as an admissible solution. For 
pre-Daltonian chemistry (with the phlogiston theory of combustion), explaining weight 
gains and losses was not viewed to be as important as explaining why metals resembled 
each other more than they did their ores. Quantitative comparisons were secondary to 
qualitative ones. Thus the weight gain problem was not viewed as a central difficulty for 
the theory, though an anomalous one. Things were different in the new theory, in which 
how elements combined and in what proportions became the primary topic of research. 
Here there was a common phenomenon on which they could agree--weight gain during 
combustion--but it was not accorded the same importance by both schools. 

Under this interpretation, much of what Kuhn says is misleading, e.g., his highly 
metaphorical discussion about scientists who accept different paradigms living in 
different worlds. Kuhn sometimes seems to be making an argument, based on Gestalt 
psychology, of the following form. 

Scientists who accept different paradigms experience the world in different ways; 
they notice some things the others do not, and vice versa. 

The world consists of the sum of your experiences.  

Therefore, scientists who accept different paradigms experience different worlds. 

Some of his arguments depend on the assumption that to re-conceptualize something, to 
view it in a different way, is to see a different thing. Thus, he speaks as if one scientist 
sees a planet where another saw a moving star, or that Lavoisier saw oxygen whereas 
Priestley saw "dephlogisticated air." 

This is an incommensurability of experience; it is dubious, but this does not detract from 
the very real incommensurability of standards that Kuhn brought to the attention of the 
philosophical, historical, and scientific communities. 

Lecture 13 

3/22/94 

Laudan on Kuhn's Theory of Incommensurable Theories  

Before the break, we finished our discussion of Kuhn's theory of scientific revolutions by 
examining his notion of incommensurability between scientific theories. To review, Kuhn 
claimed that rival paradigms are always incommensurable. Roughly, that means that 
there is no completely neutral standpoint from which one can judge the relative worth of 
the two paradigms. As we discussed, incommensurability comes in three basic varieties 
in Kuhn's SSR: 

•  Incommensurability of Standards or Cognitive Values - Scientists espousing 
different paradigms may agree on certain broadly defined desiderata for theories 
(that a theory should be simple, explanatory, consistent with the empirical data, of 
large scope and generality, and so on), but they typically disagree on their 
application. For example, they might disagree about what needs to be explained 
(consider the transition to Daltonian chemistry) or what constitutes an acceptable 
explanation (consider the transition to Newtonian mechanics).  

•  Incommensurability of Language - Scientists typically speak different languages 
before and after the change; the same words may take on new meanings (consider 
"mass," "simultaneous," "length," "spacetime," "planet," etc.). The two sides inevitably 
"talk past one another."  

•  Incommensurability of Experience - Scientists see the world in different, 
incompatible ways before and after a paradigm shift. Kuhn describes paradigm 
change as involving a kind of Gestalt shift in scientists' perceptions of the world. 
Sometimes Kuhn speaks as though the world itself has changed (as opposed to the 
conceptual framework in which the world is perceived, which is quite different), 
though such talk is best construed as metaphorical. (Since otherwise you're 
committed to the view that the world is constituted by how we conceive of it.)  

As I noted then, the first sense of incommensurability is the fundamental one for Kuhn. 
That is because the paradigm (in the sense of exemplar, a concrete, definitive 
achievement) defines by example what problems are worth solving and how one should 
go about solving them. Since this defines the particular sense in which theories are 
deemed "simple," "explanatory," "accurate," and so one, the paradigm one adopts as 
definitive determines which standards one uses to judge theories as adequate. 

Thus, according to Kuhn one's particular standards or cognitive values are determined by 
the paradigm one accepts. There is on his view no higher authority to which a scientist 
can appeal. There are no "deeper" standards to which one can appeal to adjudicate 
between two paradigms that say that different problems are important; thus, there is no 
neutral standpoint from which one can decide between the two theories. It is primarily in 
this sense that theories are "incommensurable" according to Kuhn. 

In the next two weeks, we will examine the notion of cognitive values in science in detail, 
particularly as discussed in Laudan's book Science and Values. (Note that Laudan's book 
does not deal with ethical values, but with cognitive ones, and particularly with the notion 
that there is no paradigm-neutral algorithm for adjudicating between different sets of 
cognitive values.) Laudan's aim: to find a middle ground between the rule-bound view of 
Carnap and Popper, and the apparent relativism of Kuhn (i.e., his view that standards of 
theoretical worth are paradigm-relative, and that there is no higher authority to adjudicate 
between these standards). 


Laudan's claim is that both approaches fail to explain some aspect or another of science: 
each predicts either more consensus or more disagreement than science actually displays. 

On the rule-bound view, there should normally be consensus as long as scientists are 
rational. To decide between two competing hypotheses, one has only to examine the 
evidence. Evidence can be inconclusive, but it can never be conclusive in one way for 
one person and conclusive in another way for another person. (It is this way according to 
Kuhn, since to a large extent paradigms are "self-justifying.") In any case, it is always 
apparent how one could proceed in principle to adjudicate between the two hypotheses, 
even if it is impossible or impractical for us to do so. If there is a neutral algorithm or 
decision procedure inherent in "the scientific method," then you can see how the degree 
of consensus that typically exists in science would be easy to explain. On the other hand, 
it is difficult to explain how there could be disagreement over fundamentals when people 
have the same body of evidence before them. Historical study does seem to suggest that 
indeed science is non-cumulative in important respects, i.e., that in scientific revolutions 
some standards and achievements are lost at the same time that new ones are gained. 
(Examples: Dalton, Newton). 

On the other hand, it is hard to see how Kuhn could explain how consensus arises as 
quickly as it does in science, given his incommensurability thesis. Indeed, it is difficult to 
see how Kuhn can explain why consensus arises at all. Some of his explanations leave 
one cold, e.g., that all the young people adopt the new theory, and that the older people, 
who remain loyal to the older framework, simply die off. Why shouldn't the younger 
scientists be as divided as the older scientists? Similarly, arguing that some groups get 
control of the universities and journals does not explain why the others don't go off and 
found their own journals. To see this, consider Kuhn's own analogies between scientific 
revolutions and political revolutions or religious conversions. In these areas of human 
discourse, there is little consensus and little prospect for consensus. (Indeed, in the fields 
of philosophy and sociology of science there is little consensus or prospect for consensus, 
either.) Here we suspect that the groups differ with regard to their basic (political or 
religious) values, and that since there is no way of adjudicating between these values, the 
rifts end up persisting. If science is like that, and there is no "proof" but only 
"conversion" or "persuasion," then why should there ever be the unanimity that arises 
during periods of normal science, as Kuhn describes it? 

To complicate matters, Kuhn often notes that it usually becomes clear to the vast majority 
of the scientific community that one paradigm is "better" than another. One important 
achievement leading to the adoption of the new theory is that it solves the anomaly that 
created a sense of crisis within the old paradigm. Additionally, the new paradigm may 
include very precise, quantitative methods that yield more accurate predictions, or it 
may simply be easier to apply or conceptualize. Often, Kuhn claims not that these are 
bad considerations in favor of adopting the new paradigm, but that they are 
"insufficient" to force the choice among scientists. In any case, they are often not 
present in actual historical cases (e.g., he notes that at first Copernican astronomy was not 
sufficiently more accurate than Ptolemaic astronomy). 

When Kuhn talks like this, he sounds very little like the person who propounds radical 
incommensurability between theories. Instead, the issue at hand seems to be that the 
empirical evidence simply does not determine logically which theory is correct. This 
thesis is often called the "underdetermination" thesis. This is much less radical than 
saying that scientists "live in different worlds" (incommensurability of experience). 
Instead, we simply have it that empirical evidence does not determine which theory is 
correct, and that to fill in the gap scientists have to import their own cognitive values, 
about which they differ. (These may not be supplied by the paradigm, but rather the 
paradigm is favored because cognitive values differ.) 

Laudan thinks that Kuhn shares many assumptions with Popper and Carnap, in particular 
the view that science (and pursuit of knowledge in general) is hierarchically structured 
when it comes to justification. That is, we have the following levels of disagreement and 
resolution. 

Level of Disagreement          Level of Resolution 

Factual                        Methodological 
Methodological                 Axiological 
Axiological                    None 

According to Laudan, Kuhn disagrees with Carnap and Popper about whether scientists 
share the same cognitive values insofar as they are acting professionally (i.e., as scientists 
rather than individuals). If they did share such values, this would provide a way of resolving 
any dispute. Kuhn says No; Carnap and Popper (in different ways) say Yes. Both sides seem to agree that 
differences in cognitive values cannot be resolved. Thus, the reason Kuhn sees paradigms 
as incommensurable is simply that on his view there is no higher level to appeal to in 
deciding between the different values inherent in competing paradigms. Next time, we 
will examine the implications of the hierarchical picture in detail. 

Lecture 14 

3/24/94 

Laudan on the Hierarchical Model of Justification  

At the end of the last lecture, we briefly discussed Laudan's view that Popper, Carnap, 
and Kuhn all shared an assumption, i.e., that scientific justification is hierarchically 
structured. To review, Laudan thinks that Kuhn shares many assumptions with Popper 
and Carnap, in particular the view that science (and pursuit of knowledge in general) is 
hierarchically structured when it comes to justification. That is, we have the following 
levels of disagreement and resolution. 

Level of Disagreement          Level of Resolution 

Factual                        Methodological 
Methodological                 Axiological 
Axiological                    None 

According to Laudan, Kuhn disagrees with Carnap and Popper about whether scientists 
share the same cognitive values insofar as they are acting professionally (i.e., as scientists 
rather than individuals). If they did share such values, this would provide a way of resolving 
any dispute. Kuhn says No; Carnap and Popper (in different ways) say Yes. Both sides seem to agree that 
differences in cognitive values cannot be resolved. Thus, the reason Kuhn sees paradigms 
as incommensurable is simply that on his view there is no higher level to appeal to in 
deciding between the different values inherent in competing paradigms. Let us look at 
Laudan's description of the hierarchical model in detail. 

•  Factual Disputes - disagreement about "matters of fact," i.e., any claim about 
what is true of the world, including both what we observe and the unobservable 
structure of the world.  

Factual disputes, on the hierarchical view, are to be adjudicated by appealing to the 
methodological rules governing scientific inquiry. 

•  Methodological Disputes - disagreement about "methodology," i.e., both high- and low-level rules about how scientific inquiry should be conducted. These range from very specific, relatively low-level rules such as "always prefer double-blind to single-blind tests when testing a new drug" to high-level rules such as "avoid ad hocness," "only formulate independently testable theories," "assign subjects to the test and control groups randomly," and so on. These rules would also include instructions regarding statistical analysis (when to perform a t-test or chi-squared test, when to "reject" or "accept" hypotheses at a given level of significance, and so on; see the illustrative sketch just below). As Laudan remarks, settling a factual dispute on the hierarchical model is somewhat similar to deciding a case in court: the rules for deciding cases are fairly well established; evidence is presented on either side of the case; and the rules, when properly applied, result in an impartial, fair, justified resolution of the dispute.
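To make the flavor of such a low-level statistical rule concrete, here is a minimal sketch (my own illustration, not anything drawn from Laudan or the course readings) of the rule "reject the null hypothesis of no treatment effect if a two-sample t-test gives p < 0.05"; the data and the 0.05 threshold are assumptions made purely for the example.

# Minimal sketch of a low-level methodological rule (illustrative only):
# "Reject the hypothesis of no treatment effect if a two-sample t-test
# gives p < 0.05." The data below are invented for the example.
from scipy import stats

treatment = [5.1, 4.8, 5.6, 5.0, 5.4, 5.2]
control = [4.6, 4.9, 4.5, 4.7, 4.4, 4.8]

t_stat, p_value = stats.ttest_ind(treatment, control)

alpha = 0.05  # conventional significance level, assumed here
decision = "reject" if p_value < alpha else "fail to reject"
print(f"t = {t_stat:.2f}, p = {p_value:.3f}: {decision} the null at alpha = {alpha}")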

Note that it is not a part of the hierarchical view that any factual dispute can be 
immediately adjudicated by application of methodological rules. For starters, the 
evidence may simply be inconclusive, or of poor quality. In that case, the rules would 
simply tell you to go find more evidence; they would also tell you what type of evidence 
would be needed to resolve the dispute. This does not mean, of course, that the evidence 
will be found. It may be impractical, or maybe even immoral, to go out and get evidence 
of the required sort. That, however, only means that some factual disputes cannot be 
settled as a practical matter; all disputes can be settled "in principle." 

Methodological disputes, on the hierarchical view, are to be adjudicated by appealing to 
the goals and aims of scientific inquiry. The assumption is that rules of testing, 
experiment, statistical analysis, and so on, are not ends in themselves but means to 
achieving a higher goal. 

•  Axiological Disputes - disagreements about the aims and goals of scientific 

inquiry. For example, do scientists seek truth, or simply empirical adequacy? Must theories be "explanatory" in a particular sense? As remarked last time, Carnap
and Popper seem to assume that scientists, insofar as they are acting rationally and 
as scientists, do not disagree about the fundamental aims and goals of scientific 
inquiry. (Carnap and Popper might differ on what those aims and goals are that all 
scientists share; but that is another story.) Kuhn, on the other hand, assumes that 
scientists who commit themselves to different paradigms will also differ about 
what they consider to be the aims and goals of scientific inquiry (in a particular 
discipline).  

On the hierarchical view of justification, axiological disputes cannot be adjudicated; there 
is no higher level to appeal to. 
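Purely as a compact restatement of the model just described (my own illustration, not Laudan's), the hierarchical picture amounts to a simple lookup: disputes at each level are adjudicated at the next level up, and the top level has no court of appeal.

# The hierarchical model's "resolution ladder" (an illustrative restatement only):
# disputes at each level are adjudicated by appeal to the next level up;
# the axiological level has no higher level of appeal.
RESOLUTION_LEVEL = {
    "factual": "methodological",
    "methodological": "axiological",
    "axiological": None,  # on the hierarchical view, rationally irresolvable
}

def resolved_at(level_of_disagreement):
    higher = RESOLUTION_LEVEL[level_of_disagreement]
    return higher if higher is not None else "no higher level: irresolvable"

print(resolved_at("factual"))      # -> methodological
print(resolved_at("axiological"))  # -> no higher level: irresolvable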

[Note: The goal in the subsequent discussion of the hierarchical view is to see what it involves and what it would take to refute it. Contrast with the "Leibnizian Ideal."]

Factual Disagreement (Consensus) 

Can all factual disputes be settled by appeal to methodological rules? There is a basic 
problem with supposing that they can. Although the rules and available evidence will 
exclude some hypotheses from consideration, they will never single out one hypothesis 
out of all possible hypotheses as the "correct" one given that evidence. In other words, 
methodological rules plus the available evidence always underdetermine factual claims. 
This may occur if the two hypotheses are different but "empirically equivalent," i.e., they 
have the same observational consequences. In this case, it is questionable whether the 
theories are even different theories at all (e.g., wave vs. matrix mechanics). In many cases 
we might think that the theories really are different, but observation could never settle the 
issue between them (e.g., Bohmian mechanics vs. orthodox quantum mechanics). 

As noted earlier, this fact does not undercut the hierarchical model. The reason for this is 
that the hierarchical model only says that when factual disputes can be adjudicated, they 
are adjudicated at the methodological level (by applying the rules of good scientific 
inquiry). However, it is not committed to conceiving of methodology as singling out one 
hypothesis out of all possible hypotheses, but simply as capable, often enough, of settling 
a dispute between the hypotheses that we happen to have thought of, and giving us 
direction on how to gather additional evidence should available evidence be insufficient. 
That is, on the hierarchical model the rules simply answer the question:

Which hypothesis out of those available to us is 

best supported by the available evidence? 

The rules, then, do not tell us what hypothesis to believe ("the truth is h"), but simply 
which of two hypotheses to prefer. In other words, they provide criteria that partition or 
divide the class of hypotheses into those that are permissible, given the evidence, and 
those that are not. Thus it may turn out in particular cases that given the available 
evidence more than one hypothesis is acceptable.  

Consider the following question: would it be rational for a scientist now, given our 
present state of empirical knowledge, to believe in Cartesian physics, or phlogiston 
theory, and so on? The point here is that though there are periods (perhaps long ones) during which the available rules underdetermine the choice, so that it is rationally
permissible for scientists to disagree, there comes a time when rational disagreement 
becomes impermissible. That is, though whether a person is justified in holding a position 
is relative to the paradigm in the short term, in the long term it isn't true that just 
"anything goes." (Feyerabend, some sociologists conclude from the fact that "reasonable" 
scientists can and do differ, sometimes violently so, when it comes to revolutionary 
periods that there is no rational justification of one paradigm over another, that there is no 
reason to think that science is progressing towards the truth, or so on.) 

•  Moral (Laudan): Underdetermination does not imply epistemic relativism with 

regard to scientific theories.  

How is this relevant to Kuhn? Well, Laudan claims that Kuhn implicitly starts from the fact that there is no neutral algorithm (methodological rule) that always tells us "This hypothesis is correct," and concludes from this that the choice between competing hypotheses must be, at least in part, non-rational. Then he concludes at the end of the book that this shows that
science is not "progressing" towards the truth, considered as a whole. Progress can only 
be determined when the relevant and admissible problems (goals and aims of the 
discipline) are fixed by a paradigm. (Analogy with Darwinian evolution: no "goal" 
towards which evolution is aiming.) However, this may be true only in the short term. 
Laudan says (pages 31-32) that though "observational accuracy" might be vaguely 
defined, there comes a point at which it's apparent that one theoretical framework is more 
accurate observationally than another. Kuhn's problem is emphasizing too strongly the 
disagreement that can occur because of the ambiguity of shared standards. 

Methodological Disagreement (Consensus) 

What are some examples of methodological disagreement? One example: whether predictions must be surprising, or merely of wide variety. Another example: disagreement over applications of statistics; statistical inference is not a monolithic area of investigation, free of disagreement.

Laudan says that goals and aims cannot completely resolve disputes over many 
methodological claims; there is underdetermination between goals and justification of 
methods just as there is between methods and justification of factual claims. For example, 
simply accepting that our goal is that scientific theories be true, explanatory, coherent, 
and of wide generality does not by itself determine which methodological principles we should adopt.

This no more implies that methodological disputes cannot be resolved by appeals to 
shared goals than that factual claims can never be settled by appeals to shared methods. 
We can often show that a certain rule is one way of reaching our goal, or that it is better 
than other rules under consideration. Consider the rule that double-blind tests are better 
than single-blind tests. NOTE: This example already shows that the hierarchical model is 
too simple, since the lower-level facts influence what methods we think will most likely 
reach our goal of finding the truth about a subject matter.

Lecture 15 

3/29/94 

Laudan's Reticulated Theory of Scientific Justification 

Last time we examined Laudan's criticisms of the hierarchical model of scientific 
justification. As you recall, his point was not that scientific debate is never resolved as the hierarchical model would predict; it is just that it is not always resolved that way. Laudan
thinks that the hierarchical model is often plausible, so long as it is loosened up a bit. In 
particular, the hierarchical model has to allow that not all disputes can be resolved by 
moving up to a higher level; also, as I mentioned at the end of the last session, it has to 
allow for elements from a lower level to affect what goes on at a higher level. (For 
example, it was mentioned that the methodological rule that double-blind tests be 
preferred to single-blind tests was based on the factual discovery that researchers 
sometimes unintentionally cause patients who have received a medication, but don't 
know whether they have, to be more optimistic about their prospects for recovery and 
thereby affect how quickly people recover from an illness.) If these adjustments are 
made, the hierarchical model becomes less and less "hierarchical." 

In the end, however, what makes the hierarchical view essentially "hierarchical" is
that there is a "top" level (the axiological) for which no higher authority is possible. To 
put it less abstractly, a theory of scientific justification is hierarchical if it says that there 
is no way to adjudicate between disputes at the (axiological) level of cognitive values, 
aims, or goals; disagreement at this level is always rationally irresolvable. 

Laudan wants to dispute the claim that disagreement at the axiological level is always 
irresolvable. Instead, he argues that there are several mechanisms that can be and are used to
resolve disagreements at the axiological level. To see that these mechanisms exist, 
however, we have to drop all vestiges of the view that scientific justification is "top-
down." Instead, scientific justification is a matter of coherence between the various 
levels; scientific disputes can be rationally resolved so long as one or more of the levels is 
held fixed. 

Central to this model of scientific justification is the view that the different levels 
constrain each other, so that holding some of the levels fixed, there are limits to how far 
you can go in modifying the other level(s). This means that it must be possible for some of the levels to change without there being change at all the other levels. Before Laudan describes this model, which he calls the "reticulated" model of scientific justification, in detail,
he first discusses a common but flawed pattern of reasoning that leads many people to 
think that there must be "covariation" between all three levels. 

The Covariation Fallacy 

•  Disagreement at one level (e.g., theory) is always accompanied by disagreement

at all higher levels (e.g., method and goals).  

•  Agreement at one level (e.g., aims) is always accompanied by agreement at all 

lower levels (e.g., method and theory).  

If these theses are correct, then theoretical disagreements between scientists would indeed 
have to be accompanied by disagreements over aims, e.g., what counts as an acceptable 
scientific explanation. Laudan, relying on the fact that there is underdetermination between every level, argues that this is not necessarily so. People can agree over what
counts as a good scientific explanation while differing over whether a specific theory 
meets whatever criteria are necessary for a good scientific explanation, or over what 
methods would best help promote the acquisition of good scientific explanations. (Kuhn's 
view, of course, is that there must be a difference at the higher level if disagreement 
occurs at the lower level; thus, he argues that the scientists agree only at a shallow level--
theories ought to be "explanatory"--while disagreeing about what those criteria 
specifically amount to.) What is perhaps more important, people can disagree about the 
aims of their discipline (e.g., truth vs. empirical adequacy, or consistency with the 
evidence vs. conceptual elegance, simplicity, and beauty) while agreeing about 
methodology and theory. (TEST: Get a group of scientists who agree on theory and 
method and then ask them what the ultimate aims of their discipline are; you might find 
surprising differences.) This would occur if the methods in question would promote both 
sets of aims. Similarly, the same theory can be deemed preferable by two different and 
deeply conflicting methodologies. That is, a theory can win out if it looks superior no 
matter what perspective you take. This is how things have often occurred in the history of 
science, according to Laudan. 

Kuhn commits the covariation fallacy, Laudan argues, in his arguments for the view that
theory, methods, and values, which together make up a paradigm, form an inseparable 
whole. As Laudan construes him, Kuhn thinks that a paradigm is a package deal: you 
can't modify the theory without affecting the methodological rules, or how the aims of 
that discipline are conceived. On the contrary, Laudan argues, change can occur 
piecemeal, one or more levels at a time (with adjustments to the other levels coming later).

How Can Goals Be Rationally Evaluated? 

Laudan describes two mechanisms that can be used to adjudicate between axiological 
disputes: (1) you can show that, if our best theories are true, the goals could not be 
realized (the goal is "utopian"); and (2) you can show that the explicitly espoused goals of a discipline are
not (or even cannot be) reflected in the actual practice of that discipline (as evinced in its 
methods). Mechanism (1) tries to show a lack of fit between theories and aims, keeping 
the former fixed; mechanism (2) tries to show a lack of fit between methods and aims, 
keeping the former fixed. 

Method (1) - Different Kinds of "Utopian" Strategies:  

(a) Demonstrable utopianism (goals demonstrably cannot be achieved, e.g., absolute 
proof of general theories by finite observational evidence); 

(b) Semantic utopianism (goals are so vaguely specified that it is unclear what would count as
achieving them, e.g., beauty or elegance); 

(c) Epistemic utopianism (it's impossible to provide a criterion that would enable us to 
determine if we've reached our goal, e.g., truth). 

Method (2) - Reconciling Aims and Practice 

(a) Actual theories and methods cannot achieve those aims. Examples: Theories must be capable of proof by Baconian induction from the observable evidence; Explanatory
theories must not speculate about unobservables. Both were rejected because the practice 
of science necessitated the postulation of unobservables; so Baconian induction was 
rejected as an ideal and replaced with the Method of Hypotheses (hypothetico-deductivism). Here agreement over theories and methods--which didn't make sense if the
explicitly espoused aims were really the aims of science--provided the rational basis for 
adjusting the explicitly espoused aims of science. 

(b) All attempts at achieving those aims have failed (e.g., certainty; an explanatory, predictive theory that appeals only to kinematic properties of matter).

Three Important Things to Note about Laudan's Reticulated Theory of Scientific Rationality:

On Laudan's view, (1) because the levels constrain but do not determine one another, it is sometimes the case that disagreements over aims are rationally irresolvable--but this is not generally the case; (2) the levels are to a large degree independent of one another, allowing for paradigm change to be piecemeal or gradual rather than a sudden "conversion" or "Gestalt shift"; and (3) scientific "progress" can only be judged relative to a particular set of goals. Thus, Laudan's view, like Kuhn's, is relativistic.

(Important: Like Kuhn, Laudan denies the radical relativistic view that progress does 
not exist in science; he simply thinks that whether progress occurs in science can only be 
judged relative to certain shared goals, just like whether certain aims are reasonable can 
only be judged if either theory or method is held fixed. Judgments that progress has 
occurred in science, relative to fixed goals, can therefore be rationally assessed as true or 
false.)  

Lecture 16 

3/31/94 

Dissecting the Holist Picture of Scientific Change  

Last time, we discussed Laudan's reticulationist model of scientific justification. In this 
session, we will examine Laudan's arguments for thinking that his model does better than 
Kuhn's quasi-hierarchical, "holist" model at explaining both how agreement and disagreement emerge during scientific revolutions. As you recall, Laudan argues that change in the aims or goals of a scientific discipline can result from reasoned argument if there is agreement at the methodological and/or theoretical (factual) level. In
other words, if any of the three elements in the triad of theories, methods, and aims is 
held fixed, this is sufficient to provide reasonable grounds for criticizing the other 
elements. Laudan's view is "coherentist," in that he claims that scientific rationality 
consists in maintaining coherence or harmony between the elements of the triad. 

This picture of reasoned argument in science requires that the aims-methods-theories 
triad be separable, i.e., that these elements do not combine to form an "inextricable" 
whole, or Gestalt, as Kuhn sometimes claimed. If some of these elements can change 
while the others are held fixed, and reasoned debate is possible as long as one or more of the elements are held fixed, then this leaves open the possibility that scientific debates
during what Kuhn calls "paradigm change" can be rational, allowing (at least sometimes) 
for relatively quick consensus in the scientific community. This could occur if scientific 
change were "piecemeal," i.e., if change occurred in only some elements of the aims-
methods-theories triad at a time. Indeed, Laudan wants to argue that when examined 
closely scientific revolutions are typically piecemeal and gradual rather than sudden, all-
or-nothing Gestalt switches. The fact that it often looks sudden in retrospect is an illusion 
accounted for by the fact that looking back often telescopes the fine-grained structure of 
the changes. 

Kuhn on the Units of Scientific Change 

For Kuhn, the elements of the aims-methods-theories triad typically change 
simultaneously rather than sequentially during scientific revolutions. For example, he 
says: "In learning a paradigm the scientist acquires theory, methods, and standards 
together, usually in an inextricable mix." In later chapters of SSR, Kuhn likens paradigm 
change to all-or-nothing Gestalt switches and religious conversion. If this were so, it 
would not be surprising if paradigm debate were always inconclusive and could never 
completely be brought to closure by rational means. Closure must then always be 
ultimately explained by non-rational factors, such as the (contingent) power dynamics 
within a scientific community. As Laudan argued in Chapter 1 of SV, factors such as 
these cannot fully explain why it is that closure is usually achieved in science whereas it 
is typically not in religions or other ideologies, or why it is that closure is normally 
achieved relatively quickly. 

Laudan's solution to the problem of explaining both agreement and disagreement during 
scientific revolutions is not to reject Kuhn's view entirely, but to modify it in two ways. 

•  by dropping the hierarchical picture of scientific rationality, and replacing it with 

the "reticulated" picture, in which scientific aims as well as methods and theories 
are rationally negotiable  

•  by dropping the notion that all elements in the aims-methods-theories triad change 

simultaneously; change is typically "piecemeal" during scientific revolutions  

Question: How could piecemeal change occur? Laudan first sketches an idealized 
account of such change, and then attempts to argue that this idealized account 
approximates what often happens historically. (Go through some examples of the former 
in terms of a hypothetical "unitraditional" paradigm shift.) The fact that it does not look 
like that in retrospect is normally due to the fact that history "telescopes" change, so that 
a decade-long period of piecemeal change is characterized only in terms of its beginning 
and end-points, which exhibit a complete replacement of one triad by another. "...a
sequence of belief changes which, described at the microlevel, appears to be a perfectly 
reasonable and rational sequence of events may appear, when represented in broad 
brushstrokes that drastically compress the temporal dimension, as a fundamental and 
unintelligible change of world view" (78). 

(Now go through what might happen if there are different, competing paradigms.) When 
there is more than one paradigm, agreement can also occur in the following kinds of cases.

•  The theory in one complex looks better according to the divergent methodologies 

in both paradigms.  

•  The theory in one complex looks like it better meets the aims of both paradigms than its rival (e.g., predictive accuracy, simplicity).

Because the criteria (aims, methods) are different in the two theories, there may be no 
neutral, algorithmic proof; nevertheless, it often turns out that as the theory develops it 
begins to look better from both perspectives. (Again, this only makes sense if we deny 
three theses espoused by Kuhn--namely, that paradigms are self-justifying, that the aims-
methods-theories mix that comprises a paradigm is an "inextricable" whole, and that 
paradigm change cannot be piecemeal.) Thus, adherents of the old paradigm might adopt the new methods and theories because they enable them to do things that they
recognize as valuable even from their own perspective; then they might modify their aims 
as they find out that those aims don't cohere with the new theory. (Examples of 
Piecemeal Change: Transition from Cartesian to Newtonian mechanics, where theoretical change led later to axiological change; Transition from Ptolemaic to Copernican astronomy, where methodological change--i.e., it eventually became easier to calculate using the methods of Copernican astronomy, though this wasn't true at first--led to eventual adoption of the Copernican theory itself.)

Prediction of the Holist Approach (committed to covariance): 

•  Change at one level (factual, methodological, axiological) will always be 

simultaneous with change at the other levels (follows from the fact that theory, 
methods, and standards form an "inextricable" whole).  

Counterexamples: piecemeal change given above; cross-discipline changes not tied to 
any particular paradigm (e.g., the acceptance of unobservables in theories, the rejection of 
certainty or provability as a standard for acceptance of theories)  

Because of the counterexamples, it is possible for there to be "fixed points" from which 
to rationally assess the other levels. "Since theories, methodologies, and axiologies stand 
together in a kind of justificatory triad, we can use those doctrines about which there is 
agreement to resolve the remaining areas about which we disagree" (84). 

Can Kuhn respond? 

(1) The "ambiguity of shared standards" argument - those standards that scientists agree 
on (simplicity, scope, accuracy, explanatory value), they often interpret or "apply" 
differently. Laudan's criticism: not all standards are ambiguous (e.g., logical consistency) 
- A response on behalf of Kuhn: it's enough that some are, and that they play a crucial 
role in scientists' decisions 

(2) The "collective inconsistency of rules" argument - rules can be differently weighted, 
so that they lead to inconsistent conclusions. Laudan's criticism: only a handful of cases, 
not obviously normal. No one has ever shown that Mill's Logic, Newton's Principia, or Bacon's or Descartes' methodologies were internally inconsistent. A response on behalf of
Kuhn: Again, it's enough if it happens, and it often happens when it matters most, i.e., 
during scientific revolutions. 

(3) The shifting standards argument - different standards applied, so theoretical 
disagreements cannot be conclusively adjudicated. Laudan's criticism: It doesn't follow 
(see earlier discussion of underdetermination and the reticulationist model of scientific 
change). 

(4) The problem-weighting argument - scientists may weight the importance of shared problems differently. Laudan's response: one can give reasons why some problems are more important than others, and these reasons can be (and usually are)
rationally critiqued. "...the rational assignment of any particular degree of probative 
significance to a problem must rest on one's being able to show that there are viable 
methodological and epistemic grounds for assigning that degree of importance rather than 
another" (99). Also, Laudan notes that the most "important" problems are not the most 
probative ones (the ones that most stringently test the theory). For example, explaining 
the anomalous advance in Mercury's perihelion, Brownian motion, diffraction around a 
circular disk. These problems did not become probative because they were important, but 
became important because they were probative. 

Lecture 17 

4/5/94 

Scientific Realism Vs. Constructive Empiricism  

Today we begin to discuss a brand new topic, i.e., the debate between scientific realism 
and constructive empiricism. This debate was provoked primarily by the work of Bas van 
Fraassen, whose critique of scientific realism and defense of a viable alternative, which 
he called constructive empiricism, first reached a wide audience among philosophers with 
the publication of his 1980 book The Scientific Image. Today we will discuss (1) what 
scientific realism is, (2) what alternatives are available to scientific realism, specifically 
van Fraassen's constructive empiricism. 

What Is Scientific Realism? 

Scientific realism offers a certain characterization of what a scientific theory is, and what 
it means to "accept" a scientific theory. A scientific realist holds that (1) science aims to 
give us, in its theories, a literally true story of what the world is like, and that (2) 
acceptance of a scientific theory involves the belief that it is true. 

Let us clarify these two points. With regard to the first point, the "aims of science" are to 
be distinguished from the motives that individual scientists have for developing scientific 
theories. Individual scientists are motivated by many diverse things when they develop 
theories, such as fame or respect, getting a government grant, and so on. The aims of the 
scientific enterprise are determined by what counts as success among members of the 
scientific community, taken as a whole. (Van Fraassen's analogy: The motives an 
individual may have for playing chess can differ from what counts as success in the 
game, i.e., putting your opponent's king in checkmate.) In other words, to count as fully successful a scientific theory must provide us with a literally true description of what the
world is like. 

Turning to the second point, realists are not so naive as to think that scientists' attitudes 
towards even the best of the current crop of scientific theories should be characterized as 
simple belief in their truth. After all, even the most cursory examination of the history of 
science would reveal that scientific theories come and go; moreover, scientists often have 
positive reason to think that current theories will be superseded, since they themselves are 
actively working towards that end. (Example: The current pursuit of a unified field 
theory, or "theory of everything.") Since acceptance of our current theories is tentative, 
realists, who identify acceptance of a theory with belief in its truth, would readily admit 
that scientists at most tentatively believe that our best theories are true. To say that a 
scientist's belief in a theory is "tentative" is of course ambiguous: it could mean either 
that the scientist is somewhat confident, but not fully confident, that the theory is true; or 
it could mean that the scientist is fully confident that the theory is approximately true. To 
make things definite, we will understand "tentative" belief in the former way, as less-
than-full confidence in the truth of the theory. 

Constructive Empiricism: An Alternative To Scientific Realism 

There are two basic alternatives to scientific realism, i.e., two different types of scientific 
anti-realism. That is because scientific realism as just described asserts two things, that 
scientific theories (1) should be understood as literal descriptions of what the world is
like, and (2) so construed, a successful scientific theory is one that is true. Thus, a 
scientific anti-realist could deny either that theories ought to be construed literally, or that 
theories construed literally have to be true to be successful. A "literal" understanding of a 
scientific theory is to be contrasted with understanding it as a metaphor, or as having a 
different meaning from what its surface appearance would indicate. (For example, some 
people have held that statements about unobservable entities can be understood as 
nothing more than veiled references to what we would observe under various conditions: 
e.g., the meaning of a theoretical term such as "electron" is exhausted by its "operational 
definition.") Van Fraassen is an anti-realist of the second sort: he agrees with the realist 
that scientific theories ought to be construed literally, but disagrees with the realist in asserting that a scientific theory does not have to be true to be successful.

Van Fraassen espouses a version of anti-realism that he calls "constructive empiricism." 
This view holds that (1) science aims to give us theories that are empirically adequate, 
and (2) acceptance of a scientific theory involves the belief that it is empirically adequate. 
(As was the case above, one can tentatively accept a scientific theory by tentatively 
believing that the theory is empirically adequate.) A scientific theory is "empirically 
adequate" if it gets things right about the observable phenomena in nature. Phenomena 
are "observable" if they could be observed by appropriately placed beings with sensory 
abilities similar to those characteristic of human beings. On this construal, many things 
that human beings never have observed or ever will observe count as "observable." On 
this understanding of "observable," to accept a scientific theory is to believe that it gets 
things right not only about the empirical observations that scientists have already made, 
but also about any observations that human scientists could possibly make (past, present, 
and future) and any observations that could be made by appropriately placed beings with sensory abilities similar to those characteristic of human scientists.

The Notion Of Observability 

Constructive empiricism requires a notion of "observability." Thus, it is important that we 
be as clear as possible about what this notion involves for van Fraassen. Van Fraassen 
holds two things about the notion of observability: 

(1) Entities that exist in the world are the kinds of things that are observable or 
unobservable. There is no reason to think that language can be divided into theoretical 
and observational vocabularies, however. We may describe observable entities using 
highly theoretical language (e.g., "VHF receiver," "mass," "element," and so on); this does
not, however, mean that whether the things themselves (as opposed to how we describe 
or conceptualize them) are unobservable or not depends on what theories we accept. 
Thus, we must carefully distinguish observing an entity from observing that an
entity exists meeting such-and-such a description. The latter can be dependent upon 
theory, since descriptions of observable phenomena are often "theory-laden." However, it 
would be a confusion to conclude from this that the entity observed is a theoretical 
construct. 

(2) The boundary between observable and unobservable entities is vague. There is a 
continuum from viewing something with glasses, to viewing it with a magnifying lens, 
with a low-power optical microscope, with a high-power optical microscope, to viewing 
it with an electron microscope. At what point should the smallest things visible using a 
particular instrument count as "observable?" Van Fraassen's answer is that "observable" 
is a vague predicate like "bald" or "tall." There are clear cases when a person is bald or 
not bald, tall or not tall, but there are also many cases in between where it is not clear on 
which side of the line the person falls. Similarly, though we are not able to draw a precise 
line that separates the observable from the unobservable, this doesn't mean that the notion 
has no content, since there are entities that clearly fall on one side or the other of the 
distinction (consider sub-atomic particles vs. chairs, elephants, planets, and galaxies). 
The content of the predicate "observable" is to be fixed relative to certain sensory 
abilities. What counts as "observable" for us is what could be observed by a suitably 
placed being with sensory abilities similar to those characteristic of human beings (or 
rather, the epistemic community to which we consider ourselves to belong). Thus, beings
with electron microscopes in place of eyes do not count. 

Arguments In Favor Of Scientific Realism: Inference To The Best Explanation 

Now that we have set out in a preliminary way the two rival positions that we will 
consider during the next few weeks, let us examine the arguments that could be given in 
favor of scientific realism. An important argument that can be given for scientific realism 
is that we ought rationally to infer that the best explanation of what we observe is true. 
This is called "inference to the best explanation." The argument for this view is that in 
everyday life we reason according to the principle of inference to the best explanation, 
and so we should also reason this way in science. The best explanation, for example, for 
the fact that measuring Avogadro's number (a constant specifying the number of 
molecules in a mole of any given substance) using such diverse phenomena as Brownian motion, alpha decay, x-ray diffraction, electrolysis, and blackbody radiation gives the
same result is that matter really is composed of the unobservable entities we call 
molecules. If it were not, wouldn't it be an utterly surprising coincidence that things 
behaved in very different circumstances exactly as if they were composed of molecules? 
This is the same kind of reasoning that justifies belief that an apartment has mice. If all 
the phenomena that have been observed are just as would be expected if a mouse were 
inhabiting the apartment, isn't it then reasonable to believe that there's a mouse, even 
though you've never actually seen it? If so, why should it be any different when you are 
reasoning about unobservable entities such as molecules? 

Van Fraassen's response is that the scientific realist is assuming that we follow a rule that 
says we should infer the truth of the best explanation of what we have observed. This is 
what makes it look inconsistent for a person to insist that we ought not to infer that 
molecules exist while at the same time insisting that we ought to infer that there is a 
mouse in the apartment. Why not characterize the rule we are following differently--i.e., 
that we infer that the best explanation of what we observe is empirically adequate? If that 
were the case, we should believe in the existence of the mouse, but we should not believe 
anything more of the theory that matter is composed of molecules than that it adequately 
accounts for all observable phenomena. In other words, van Fraassen is arguing that, 
unless you already assume that scientists follow the rule of inference to the (truth of the) 
best explanation, you cannot provide any evidence that they follow that rule as opposed 
to following the rule of inference to the empirical adequacy of the best explanation.  

We will continue the discussion of inference to the best explanation, as well as other 
arguments for scientific realism, next time. 

Lecture 18 

4/7/94 

Inference To The Best Explanation As An Argument For Scientific Realism 

Last time, I ended the lecture by alluding to an argument for scientific realism that 
proceeded from the premise that the reasoning rule of inference to the best explanation 
exists. Today, we will examine in greater detail how this argument would go, and we'll 
also discuss the notion of inference to the best explanation in greater detail. 

The Reality Of Molecules: Converging Evidence  

As I noted last time, what convinced many scientists at the beginning of this century of 
the atomic thesis (i.e., that matter is composed of atoms that combine into molecules, and 
so on) was that there are many independent experimental procedures all of which lead to 
the same determination of Avogadro's number. Let me mention some of the ways in 
which the number was determined. 

(1) Brownian Motion. Jean Perrin studied the Brownian motion of small, microscopic 
particles known as colloids. (Brownian motion was first noted by Robert Brown, in the 
early 19th century.) Though visible only through a microscope, the particles were much 
larger than molecules. Perrin determined Avogadro's number by looking at how particles were distributed vertically when placed in colloidal suspensions. He prepared
tiny spheres of gamboge, a resin, all of uniform size and density. He measured how the 
particles were distributed vertically when placed in water; by calculating what forces would have to be in place to keep the particles suspended in that distribution, he could calculate their average kinetic energy. If we know the masses and velocities, we can then determine
the mass of a molecule of the fluid, and hence Avogadro's number, which is the 
molecular weight divided by the mass of a single molecule. 

(2) Alpha Decay. Rutherford recognized that alpha particles were helium nuclei. Alpha 
particles can be detected by scintillation techniques. By counting the number of helium 
atoms that were required to make up a certain mass of helium, Rutherford calculated 
Avogadro's number. 

(3) X-ray diffraction - A crystal will diffract x-rays, the matrix of atoms acting like a 
diffraction grating. From the wavelength of the x-rays and the diffraction pattern, you can 
calculate the spacing of the atoms. Since that is regular in a crystal, you could then 
determine how many atoms it takes to make up the crystal, and so Avogadro's number. 
(Friedrich & Knipping)  

(4) Blackbody Radiation - Planck derived a formula for the law of blackbody radiation, 
which involves Planck's constant (obtainable using Einstein's theory of the photoelectric effect) and macroscopically measurable variables such as the speed of light; from these one can derive Boltzmann's constant. Then you use the ideal gas law PV = nRT; n is the
number of moles of an ideal gas, and R (the universal gas constant) is the gas constant per 
mole. Boltzmann's constant k is the gas constant per molecule. Hence R/k = Avogadro's 
number (number of molecules per mole). 

(5) Electrochemistry - A Faraday F is the charge required to deposit a mole of 
monovalent metal during electrolysis. This means that you can calculate the number of 
molecules per mole if you know the charge of the electron, F/e = N. Millikan's 
experimental measurements of the charge of the electron can then be used to derive 
Avogadro's number. 
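As a quick arithmetic check on two of the routes just described (a sketch of my own, using modern values of the constants that were of course not available with this precision to the original experimenters), Avogadro's number can be computed both as R/k, per method (4), and as F/e, per method (5); the agreement of the two figures is exactly the sort of convergence the realist appeals to below.

# Two routes to Avogadro's number, using modern values of the constants.
R = 8.314462618       # universal gas constant, J/(mol K) -- gas constant per mole
k = 1.380649e-23      # Boltzmann's constant, J/K -- gas constant per molecule
F = 96485.33212       # Faraday constant, C/mol -- charge per mole of electrons
e = 1.602176634e-19   # elementary charge, C

N_from_gas_constants = R / k   # method (4): R divided by k
N_from_electrolysis = F / e    # method (5): F divided by e

print(f"R/k = {N_from_gas_constants:.5e} per mole")
print(f"F/e = {N_from_electrolysis:.5e} per mole")
# Both come out to about 6.022e23 per mole.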

Now, the scientific realist wants to point to the fact that all of these different measurement techniques (along with many others not mentioned here) lead to the same value for Avogadro's number. The argument, then, is: how can you explain this remarkable convergence on the same result if it were not for the fact that there are atoms and molecules that are behaving as the theories say they do? Otherwise, it would be a miracle.

Inference To The Best Explanation (Revisited) 

We now have to state carefully what is being claimed here. The scientific realist argues that the reality of molecules explains the convergence on Avogadro's number better than its rival hypothesis, that the world is not really molecular but that everything behaves as
if it were. That is because given the molecular hypothesis, convergence would be 
strongly favored, whereas if the underlying world were not molecular we wouldn't expect 
any stability in such a result. Now it is claimed that being the best explanation of something is a mark of truth. Thus, we have an inference pattern.

A explains X better than its rivals, B, C, and so on.  

The ability of a hypothesis to explain something better 
than all its rivals is a mark of its truth.  

Therefore, A is true.

Now why should we think that this is a good reasoning pattern? (That is, why should we 
think that the second premise is true?) The scientific realist argues that it is a reasoning 
pattern that we depend on in everyday life; we must assume the truth of the second 
premise if we are to act reasonably in everyday life. To refer to the example we discussed 
the last session, the scientific realist argues that if this reasoning pattern is good enough for the detective work that infers the presence of an unseen mouse, it is good enough for the
detective work that infers the presence of unseen constituents of matter. 

Van Fraassen has argued that the hypothesis that we infer to the truth of our best 
explanation can be replaced by the hypothesis that we infer to the empirical adequacy of 
our best explanation without loss in the case of the mouse, since it is observable. How 
then, can you determine whether we ought to follow the former rule rather than the latter? 
The only reason we have for thinking that we follow IBE is everyday examples such as 
the one about mice; but the revised rule that van Fraassen suggests could account for this 
inferential behavior just as well as the IBE hypothesis. Thus, there is no real reason to 
think that we follow the rule of IBE in our reasoning. 

Van Fraassen's Criticisms Of Inference To The Best Explanation 

Van Fraassen also has positive criticisms of IBE. First, he argues that it is not what it 
claims to be. In science, you don't really choose the best overall explanation of the 
observable phenomena, but the best overall explanation that you have available to you. 
However, why should we think that the kinds of explanations that we happen to have 
thought of are the best hypotheses that could possibly be thought up by any intelligent 
being? Thus, the IBE rule has to be understood as inferring the truth of the best 
explanation that we have thought of. However, if that's the case our "best" explanation 
might very well be the best of a bad lot. To be committed to IBE, you have to hold that 
the hypotheses that we think of are more likely to be true than those that we do not think of, simply because we thought of them. This seems implausible.

Reactions On Behalf Of The Scientific Realist 

(1) Privilege - Human beings are more likely to think up hypotheses that are true than 
those that are false. For otherwise, evolution would have weeded us out. False beliefs 
about the world make you less fit, in that you cannot predict and control your 
environment, and are therefore more likely to die.

Objection: The selection pressures that shaped our inferential capacities during evolution do not depend on whether the beliefs we form are true. Our inferences just have to not kill us and enable us to have children. In addition, for that purpose, we only have to be able to infer what is empirically adequate.

(2) Forced Choice - We cannot do without inferring what goes beyond our evidence; thus 
the choice between competing hypotheses is forced. To guide our choices, we need rules 
of reasoning, and IBE fits the bill. 

Objection - The situation may force us to choose the best we have; but it cannot force us 
to believe that the best we have is true. Having to choose a research program, for 
example, only means that you think that it is the best one available to you, and that you 
can best contribute to the advancement of science by choosing that one. It does not 
thereby commit you to belief in its truth. 

The problem that is evinced here, and will crop up again, is that any rule of this sort has 
to make assumptions about the way the world is. The things that seem simple to us are 
most likely to be true; the things that seem to explain better than any of the alternatives 
that we've come up with are more likely to be true than those that do not; and so on.

 

Lecture 19 

4/12/94 

Entity Realism (Hacking & Cartwright)  

Last time we looked at an argument for scientific realism based on an appeal to a putative 
rule of reasoning, called Inference to the Best Explanation (IBE). We examined a case 
study where the scientific realist argues that convergent but independent determinations 
of Avogadro's number were better explained by the truth of the molecular hypothesis than 
its empirical adequacy (Salmon 1984, Scientific Explanation and the Causal Structure of 
the World
, pages 213-227). (Note that Salmon's treatment of these cases does not 
explicitly appeal to IBE, but to the Common Cause principle.) The primary problems 
with that argument were (1) the realist has given no reason to think that we as a rule infer 
the truth of the best explanation, rather than its empirical adequacy, and (2) it is 
impossible to argue for IBE as a justified rule of inference unless one assumes that 
human beings are, by nature, more likely to think up true explanations rather than ones 
that are merely empirically adequate. There are, of course, responses a realist can make to 
these objections, and we examined two responses to problem (2), one that argues that (a) 
evolution selected humans based on our ability to generate true rather than false 
hypotheses about the world, and the other that (b) accepts the objection but argues that 
we are somehow forced to believe the best available explanation. Neither of these 
responses seems very convincing (van Fraassen 1989, Laws and Symmetry, pages 142-
150). 

Today we will look at a moderate form of realism, which I will call "entity realism," and 
arguments for that view that do not depend on IBE. Entity realists hold that what one is 
rationally compelled to believe the existence of some of the unobservable entities 
postulated by our best scientific theories, but one is not obligated to believe that 
everything that our best theories say about those entities is true. Nancy Cartwright, for 
example, argues that we are compelled to believe in those entities that figure essentially in causal explanations of the observable phenomena, but not in the theoretical
explanations that accompany them. The primary reason she gives is that causal 
explanations, e.g., that a change in pressure is caused by molecules impinging on the 
surface of a container with greater force after heat energy introduced into the container 
increases the mean kinetic energy of the molecules, make no sense unless you really 
think that molecules exist and behave roughly as described. Cartwright claims that you 
have offered no explanation at all if you give the preceding story and then add, "For all 
we know molecules might not really exist, and the world simply behaves as if they exist." 
Theoretical explanations, on the other hand, which merely derive the laws governing the 
behavior of those entities from more fundamental laws, are not necessary to believe, 
since a multiplicity of theoretical laws can account for the phenomenological laws that 
we derive from experiment. Cartwright argues that scientists often use different and
incompatible theoretical models based on how useful those models are in particular 
experimental situations; if this is so, scientists cannot be committed to the truth of all 
their theoretical models. However, scientists do not admit incompatible causal 
explanations of the same phenomenon; according to Cartwright, that is because a causal 
explanation cannot explain at all unless the entities that play the causal roles in the 
explanation exist. 

Cartwright's argument depends on a certain thesis about explanation (explanations can 
either cite causes or can be derivations from fundamental laws) and an associated 
inference rule (one cannot endorse a causal explanation of a phenomenon without 
believing in the existence of the entities that, according to the explanation, play a role in 
causing the phenomenon). As Cartwright sometimes puts it, she rejects the rule of 
Inference to the Best Explanation but accepts a rule of Inference to the Most Probable 
Cause. Van Fraassen, of course, is unlikely to acquiesce in such reasoning, since he 
rejects the notion that a causal explanation cannot be acceptable unless the entities it 
postulates exist; on the contrary, if what is requested in the circumstances is information 
about causal processes according to a particular scientific theory, it will be no less 
explanatory if we merely accept the theory (believe it to be empirically adequate) rather 
than believe the theory to be true. Thus, the constructive empiricist can reject Cartwright's
argument since he holds a different view of what scientific explanation consists in. 

Hacking takes a different route in arguing for an entity realist position. Hacking argues
that the mistake that Cartwright and van Fraassen both make is concentrating on 
scientific theory rather than experimental practice. His approach can be summed up in the 
slogans "Don't Just Peer, Interfere" (with regard to microscopes), and "If you can 
manipulate them, they must be real" (with regard to experimental devices that use 
microscopic particles such as electrons as tools). Let's look at his arguments for these two 
cases. 

In his article, "Do We See Through a Microscope?" (Churchland and Hooker, eds., 1985, 
Images of Science), Hacking argues that what convinces experimentalists that they are 
seeing microscopic particles has nothing to do with the theory of those particles or of 
how a microscope behaves, but that they can manipulate those particles in very direct and 
tangible ways to achieve certain results.

•  The ability to see through a microscope is acquired through manipulation (what is

an artifact of the instrument and what is reality are learned through practice)  

•  We believe what we see because by manipulation we have found the preparation 

process that produces these sights to give stable and reliable results, and to be 
related to what we see macroscopically in certain regular ways (the Grid 
Argument, the electron beam arguments of "Experimentation and Scientific 
Realism," in Kourany)  

•  We believe what we see through a particular instrument is reliable because we can 

invent new and better ways of seeing it (optical microscopes, ultraviolet 
microscopes, electron microscopes, etc.)  

Hacking's argument contains three elements, that (a) manipulation causes cognitive 
changes that give us new perceptual abilities, (b) we can manipulate the world in such a 
way as to create microstructures that have the same properties as macrostructures we can 
observe, and (c) that, combined with these facts, the convergence of the various instruments
on the same visual results gives us additional reason to believe that what we are seeing is 
real, not an artifact of any particular instrument. 

The final element (c) seems similar to the convergence argument we looked at last time, 
when we were discussing IBE. There is a difference, however, since what is at issue is 
not whether a single scientific theory implies things that are verified under many 
independent circumstances, but whether we are convinced that we are seeing something 
based on the fact that stable features appear across different viewing techniques. Nevertheless, it is
an argument from coincidence--wouldn't it be a miracle if all these independent viewing 
techniques shared stable structural features and those features weren't really present in the 
microscopic specimen?--and stands or falls on the same grounds as were discussed last 
time. 

However, that is not all that Hacking has at his disposal. His greatest strength is 
discussing how we acquire new modes of perception by using instruments to manipulate 
a world we cannot see. In his words, we don't see through a microscope, we see with a 
microscope. That is something that must be learned by interacting with the microscopic 
world, just as ordinary vision is acquired by interacting with the macroscopic world 
around us. 

In addition, Hacking wants to argue that we come to manipulate things in ways that do 
not involve direct perception. This is where the example of using electrons to check for 
parity violation in weak neutral currents comes in. In this case, Hacking argues that it
might have once been the case that the explanatory virtues of atomic theory led one to 
believe in their existence; but now we have more direct evidence. We can now use 
electrons to achieve other results, and thus we are convinced of the existence of entities 
with well-defined, stable causal properties. That does not mean that we know everything 
there is to know about those particles (thus, we may disbelieve any of the particular 
theories of the electron that are in existence); however, that there are entities with certain 
causal properties is shown by experience, by manipulating electrons to achieve definite, 
predictable results. (Hence the slogan, "If you can manipulate them, they must be real.") This is why Hacking, like Cartwright, is an entity realist, but not a realist about scientific
theories. 

Next time, we will examine various responses that a constructive empiricist such as van 
Fraassen might give to such arguments. 

 

Lecture 20 

4/14/94 

Entity Realism And The "Non-Empirical" Virtues  

Last time we discussed arguments due to Cartwright and Hacking for the entity realist 
position. Entity realism, as you recall, is the view that belief in certain microphysical 
entities can be (and is) rationally compelling. Cartwright argues that we are rationally 
required to believe in the existence of those entities that figure essentially in causal 
explanations that we endorse. (Van Fraassen's response to her argument is that since the 
endorsement of the explanation only amounts to accepting it--i.e., believing it to be 
empirically adequate--belief in the unobservable entities postulated by the explanation is 
not rationally required.) By contrast, Hacking argued that we are rationally required to 
believe in the existence of those entities that we can reliably and stably manipulate. He 
argues that once we start using entities such as electrons then we have compelling 
evidence of their existence (and not merely the empirical adequacy of the theory that 
postulates their existence). To make his case, he gives detailed descriptions of how we 
stably and reliably interact with things shown by optical microscopes, and with electron 
guns in the PEGGY series. The microscope case is especially interesting since it indicates 
that a person can acquire new perceptual abilities by using new instruments and that 
"observability" is a flexible notion. 

The question that we will examine in the first part of the lecture today is whether 
Hacking's arguments do not simply beg the question against van Fraassen's constructive 
empiricism. Let us begin by discussing Hacking's argument that stability of certain 
features of something observed using different instruments is a compelling sign of their 
reality. In response, van Fraassen asks us to consider the process of developing such 
instruments. When we are building various types of microscopes, we not only use theory 
to guide the design, but also learn to correct for various artifacts (e.g., chromatic 
aberration). As van Fraassen puts it, "I discard similarities that do not persist and also 
build machines to process the visual output in a way that emphasizes and brings out the 
noticed persistent similarities. Eventually the refined products of these processes are 
strikingly similar when initiated in similar circumstances ... Since I have carefully 
selected against non-persistent similarities in what I allow to survive the visual output 
processing, it is not all that surprising that I have persistent similarities to display to you" 
(Images of Science, page 298). In other words, van Fraassen argues that we design our 
observational instruments (such as microscopes) to emphasize those features that we 
regard as real, and de-emphasize those we regard as artifacts. If that is so, however, we 
cannot point to convergence as evidence for the reality of those features, since we have designed the instruments (optical, ultraviolet, electron microscopes) so that they all converge on those features we have antecedently decided are real, not artifacts.

The principle that Hacking is using in his argument from stability across different 
observational techniques is that if there is a certain kind of stable input-output match in 
our instruments, we can be certain that the output is a reliable indicator of what is there at 
the microphysical level. Van Fraassen notes that, given the constraints that we place on 
design and that the input is the same (say, a certain type of prepared blood sample), it is 
not surprising that the output would be the same, even if the microstructure that we "see" 
through the microscope has no basis in reality. 

Thus, Hacking is correct in seeking a more striking example, which he attempts to 
provide with his Grid Argument. (Here, as you recall, a grid is drawn, photographically 
reduced, and that reduction is used to manufacture a tiny metal grid. The match between 
the pattern we see through the microscope at the end of the process and the pattern 
according to which the grid was drawn at the beginning indicates both that the grid-
manufacturing process is a reliable one and that the microscope is a reliable instrument 
for viewing microphysical structure. Thus, by analogy, Hacking argues that we should believe what the microscope reveals about the microphysical structure of things that we have not manufactured.) Van Fraassen objects on two levels. First, he argues that
argument by analogy is not strong enough to support scientific realism. For analogy 
requires an assumption that if one class of things resembles another in one respect, they 
must resemble it in another. (To be concrete, if the microscopic grid and a blood cell 
resemble one another in being microscopic, and we can be sure that the former is real 
because its image in the microscope matches the pattern that we photographically 
reduced, then we can by analogy infer that what the microscope shows us about the blood 
cell must be accurate as well.) To this van Fraassen replies, "Inspiration is hard to find, 
and any mental device that can help us concoct more complex and sophisticated novel 
hypotheses is to be welcomed. Hence, analogical thinking is welcome. But it belongs to 
the context of discovery, and drawing ingenious analogies may help to find, but does not 
support, novel conjectures" (Images of Science, page 299). 

Discuss. What could Hacking reply? Consider the following: to show that an analogical 
inference is unjustified, you have to argue that there is an important disanalogy between 
the two that blocks the inference. Here the obvious candidate is that the grid is a 
manufactured object, whereas the blood cell, if it exists, is not. Since we would not 
accept any process as reliable that did not reproduce the macroscopic grid pattern we 
drew, it is no surprise that what we see through the microscope matches the pattern we 
drew--for we developed the process so that things would work out that way. However, we 
cannot say the same thing about the blood cell. Is this convincing? 

Second, van Fraassen argues that to make Hacking's argument work we have to assume 
that we have successfully produced a microscopic grid. How do we know that? Well, 
because there is a coincidence between the pattern we observe through the microscope 
and the pattern we drew at the macroscopic level. Hacking's argument is that we would 
have to assume some sort of cosmic conspiracy to explain the coincidence if the 
microscopic pattern did not reflect the real pattern that was there, which would be 
unreasonable. Van Fraassen's response is that not all observable regularities require 
explanation.  

Discuss. Might Hacking fairly object that, while not all coincidences require explanation, coincidences do require explanation unless there is positive reason to think they cannot be explained? For this reason, he might argue that van Fraassen's
counterexamples to a generalized demand for explanation, e.g., those based on the 
existence of verified coincidences predicted by quantum physics (the EPR type), are not 
telling, since we have proofs (based on the physical theories themselves) that there can be 
no explanations of those kinds of events. (Also, discuss van Fraassen's argument that to 
explain a coincidence by postulating a deeper theory will not remove all coincidences; 
eventually, "why" questions terminate.) 

Now we turn to Hacking's claim that the fact that we manipulate certain sorts of microscopic objects speaks to their reality, not just to the empirical adequacy of the theory that postulates those entities. Van Fraassen would want to ask the following
question: How do you know that your description of what you are manipulating is a 
correct one? All you know is that if you build a machine of a certain sort, you get certain 
regular effects at the observable level, and that your theory tells you that this is because 
the machine produces electrons in a particular way to produce that effect. The 
constructive empiricist would accept that theory, too, and so would accept the same 
description of what was going on; but for him that would just mean that the description is 
licensed by a theory he believes to be empirically adequate. To infer that the observable phenomena described theoretically as "manipulating electrons in such-and-such a way to produce an effect" compel belief in something more than empirical adequacy requires assuming that the theoretical description of what he is doing is not merely empirically adequate, but also true. However, that is precisely what the constructive empiricist would deny.

Non-Empirical Virtues As A Guide To Truth: A Defense? 

We have already examined in some detail van Fraassen's reasons for thinking that non-
empirical features of theories that we regard as desirable (simplicity, explanation, 
fruitfulness) are not marks of truth, nor even of empirical adequacy, though they do provide pragmatic reasons for preferring one theory over another (i.e., for using that theory rather than another). Let us look briefly at a defense of the view that "non-empirical
virtues" of this sort are indeed guides to truth, due to Paul Churchland. 

Churchland's view is basically that van Fraassen's argument against the non-empirical 
virtues as guides to truth can be used against him. Since we are unable to assess all the 
empirical evidence for or against a particular theory, we have in the end to decide what to 
believe based on what is simplest, most coherent or explanatory. This would sound much 
like the "forced choice" response that we looked at last time, except that Churchland is 
arguing something subtly different. He argues that van Fraassen cannot decide which of 
two theories is more likely to be empirically adequate without basing his decision on 
which of the two theories is simplest, most fruitful and explanatory, and so on. That is 
because the arguments that apply to the theory's truth with regard to its subject matter 
(available evidence in principle can never logically force the issue one way or another 
when it comes to unobservable structure), apply just as well to empirical adequacy. We'll 
never have complete knowledge of all observable evidence either, and so nothing 
compels us one way or another to accept one theory as empirically adequate rather than 
another (when both agree on what we have observed so far, or ever will observe). 

However, we cannot suspend belief altogether. Since that is the only choice that van Fraassen's reasoning gives us in the end, it ought to be rejected, and so we can conclude that the non-empirical virtues are just as much a guide to truth as is what we have observed.

Churchland is a philosopher who works in cognitive psychology (with an emphasis on 
neurophysiology). He argues that we know that "values such as ontological simplicity, 
coherence, and explanatory power are some of the brain's most basic criteria for 
recognizing information, for distinguishing information from noise ... Indeed, they even 
dictate how such a framework is constructed by the questing infant in the first place" 
("The Ontological Status of Observables," Images of Science, page 42). Thus, he 
concludes that since even our beliefs about what is observable are grounded in what van Fraassen calls the merely "pragmatic" virtues (so-called because they give us reason to use a theory without giving us reason to believe that it is true), it cannot be
unreasonable to use criteria such as simplicity, explanatory power, and coherence to form 
beliefs about what is unobservable to us. Van Fraassen's distinction between 'empirical, 
and therefore truth-relevant' and 'pragmatic, and therefore not truth-relevant' is therefore 
not a tenable one. 

Churchland concludes with a consideration that I will leave you as food for thought. 
Suppose that a humanoid race of beings were born with electron microscopes for eyes. 
Van Fraassen would then argue that since for them the microstructure of the world would be seen directly, they would, unlike us, have different bounds on what counts as "observable." Churchland regards this distinction as wholly unmotivated. He points out
that van Fraassen's view leads to the absurd conclusion that they can believe in what their 
eyes tell them but we cannot, even though if we put our eyes up to an electron microscope we will have the same visual experiences as the humanoids. There is no difference between the causal chain leading from the perceived objects to the experience of perception in the two cases, but van Fraassen's view leads to the conclusion that nevertheless we and the humanoids must embrace radically different attitudes towards the
same scientific theories. This he regards as implausible. 

Lecture 21 

4/19/94 

Laudan on Convergent Realism 

Last time, we discussed van Fraassen's criticism of Hacking's arguments for entity 
realism based on the Principle of Consistent Observation (as in the microscope case, most 
notably the grid argument). We also talked about Churchland's criticisms of van 
Fraassen's distinguishing empirical adequacy from the "pragmatic" or non-empirical 
virtues. Churchland argued that, based on what we know about perceptual learning, 
there's no principled reason to think that empirical adequacy is relevant to truth but 
simplicity is not. 

Today we're going to look at an argument for scientific anti-realism that proceeds not by 
arguing a priori that observability has a special epistemological status, but by arguing that, based on what we know about the history of science, scientific realism is
indefensible. Laudan discusses a form of realism he calls "convergent" realism, and seeks 
to refute it. Central to this position is the view that the increasing success of science 
makes it reasonable to believe (a) that theoretical terms that remain across changes in 
theories refer to real entities and (b) that scientific theories are increasing approximations 
to the truth. The convergent realist's argument is a dynamic one: it argues from the 
increasing success of science to the thesis that scientific theories are converging on the 
truth about the basic structure of the world. The position thus involves the concepts of 
reference, increasing approximations to the truth, and success, which we have to 
explicate first before discussing arguments for the position. 

Sense vs. Reference. Philosophers generally distinguish the sense of a designating term 
from its reference. For example, "the current President of the U.S." and "the Governor of 
Arkansas in 1990" are different descriptive phrases with distinct senses (meanings), but 
they refer to the same object, namely Bill Clinton. Some descriptive phrases are 
meaningful, but have no referent, e.g., "the current King of France." We can refer to 
something by a description if the description uniquely designates some object by virtue of 
a certain class of descriptive qualities. For example, supposing that Moses was a real 
rather than a mythical figure, descriptions such as "the author of Genesis," "the leader of 
the Israelites out of Egypt" and "the author of Exodus" might serve to designate Moses 
(as would the name "Moses" itself). Suppose now that we accept these descriptions as 
designations of Moses but subsequently discover that Moses really didn't write the books 
of Genesis and Exodus. (Some biblical scholars believe the stories in these books were 
transmitted orally from many sources and weren't written down until long after Moses 
had died.) This would amount to discovering that descriptions such as "the author of 
Genesis" do not refer to Moses, but this wouldn't mean that Moses didn't exist, simply 
that we had a false belief about him. While we may pick out Moses using various 
descriptions, it doesn't mean that all the descriptions we associate with Moses have to be 
correct for us to do so. 

Realists hold that reference to unobservable entities works in the same way across changes in scientific theories. Though different theories of the electron have come and gone, all these theories
refer to the same class of objects, i.e., electrons. Realists argue that terms in our best 
scientific theories (such as "electron") typically refer to the same classes of unobservable 
entities across scientific change, even though the set of descriptions (properties) that we associate with those classes of objects changes as the theories develop.

Approximate Truth. The notion of approximate truth has never been clearly and 
generally defined, but the intuitive idea is clear enough. There are many qualities we 
attribute to a class of objects, and the set of qualities that we truly attribute to those 
objects increases over time. If that is so, then we say that we are moving "closer to the 
truth" about those objects. (Perhaps in mathematical cases we can make clearer sense of 
the notion of increasing closeness to the truth; i.e., if we say a physical process evolves 
according to a certain mathematical equation or "law," we can sometimes think of 
improvements of that law converging to the "true" law in a well-defined mathematical 
sense.) 

The "Success" of Science. This notion means different things to different people, but is 

background image

generally taken to refer to the increasing ability science gives us to manipulate the world, 
predict natural phenomena, and build more sophisticated technology. 

Convergent realists often argue for their position by pointing to the increasing success of 
science. This requires that there be a reasonable inference from a scientific theory's 
"success" to its approximate truth (or to the thesis that its terms refer to actual entities). 
However, can we make such an inference? As Laudan presents the convergent realist's 
argument in "A Confutation of Convergent Realism," the realist is arguing that the best 
explanation of a scientific theory's success is that it is true (and its terms refer). Thus, the 
convergent realist uses "abductive" arguments of the following form. 

If a scientific theory is approximately true, it will (normally) be successful.
[If a scientific theory is not approximately true, it will (normally) not be successful.]
Scientific theories are empirically successful.
Therefore, scientific theories are approximately true.

If the terms in a scientific theory refer to real objects, the theory will (normally) be successful.
[If the terms in a scientific theory do not refer to real objects, the theory will (normally) not be successful.]
Scientific theories are empirically successful.
Therefore, the terms in scientific theories refer to real objects.

Laudan does not present the argument quite this way; in particular, he does not include 
the premises in brackets. These are needed to make the argument even prima facie 
plausible. (Note, however, that the premises in brackets are doing almost all the work. 
That's OK since, as Laudan points out, the convergent realist often defends the first 
premise in each argument by appealing to the truth of the second premise in that 
argument.) 

Question: Could the convergent realist use weaker premises, e.g., that a scientific theory is more likely to be successful if it is true than if it is false (with a similar premise serving the same role in the case of reference)? Unfortunately, this would not give the convergent realist what he wants, since he could only infer from this that success makes it more likely that the theory is true--which is not to say that success makes it likely that the theory is true. (This can be seen by examining the probabilistic argument below, where pr(T is successful) < 1 and pr(T is true) < 1, and PR(-) = pr(- | T is successful).)

(1) pr(T is successful | T is true) > pr(T is successful | T is false) 
(2) pr(T is true | T is successful) > pr(T is true) 
(3) PR(T is successful) = 1 
Therefore, PR(T is true) > pr(T is true)
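
To make the gap concrete, here is a small numerical sketch (the numbers are hypothetical, chosen purely for illustration and not drawn from Laudan): success can raise the probability of truth while leaving that probability low.

```python
# Hypothetical numbers illustrating the gap between "success raises the
# probability of truth" and "success makes truth likely".
prior_true = 0.05                # pr(T is true), assumed low
p_success_if_true = 0.9          # pr(T is successful | T is true)
p_success_if_false = 0.3         # pr(T is successful | T is false)

# Total probability of success.
p_success = (p_success_if_true * prior_true
             + p_success_if_false * (1 - prior_true))

# Bayes' theorem: pr(T is true | T is successful).
posterior_true = p_success_if_true * prior_true / p_success

print(f"pr(T is true)                = {prior_true:.3f}")
print(f"pr(T is true | T successful) = {posterior_true:.3f}")
# The posterior (about 0.136) exceeds the prior, so success confirms T,
# yet T remains far more likely to be false than true.
```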

That said, let's go with the stronger argument that uses the conditionals stated above. 
Without the premises in brackets, these arguments are based on inference to the best 
explanation of the success of science. As such, it would be suspect, for the reasons given 
before. Leaving that issue aside, Laudan argues that any premises needed for the argument to go through are false. Let us consider the case of reference first. Laudan
argues that there is no logical connection between having terms that refer and success. 
Referring theories often are not successful for long periods of time (e.g., atomic theory, the Proutian hydrogen theory of matter, and the Wegenerian theory of continental drift), and successful
theories can be non-referring (e.g., phlogiston theory, caloric theory, and nineteenth 
century ether theories). 

Next, we consider the premise that if a scientific theory is approximately true, it is 
(normally) successful. However, we have no guarantee that theories that are approximately true overall will yield more true than false consequences in the realm of what we have happened to observe. (That's because a false but approximately true scientific theory is
likely to make many false predictions, which could for all we know "cluster" in the 
phenomena we've happened to observe.) In addition, since theories that don't refer can't 
be "approximately true," the examples given above to show that there are successful but 
non-referring theories also show that there can be successful theories that aren't 
approximately true (e.g., phlogiston theory, caloric theory, and nineteenth century ether 
theories). 

Retention Arguments. Because the simple arguments given above are not plausible, 
convergent realists often try to specify more precisely which terms in successful scientific 
theories we can reasonably infer refer to real things in the world. The basic idea is that we 
can infer that those features of a theory that remain stable as the theory develops over 
time are the ones that "latch onto" some aspect of reality. In particular, we have the 
following two theses.  

Thesis 1 (Approximate Truth). If a certain claim appears as an initial member of a
succession of increasingly successful scientific theories, and either that claim or a claim 
of which the original claim is a special case appears in subsequent members of that 
succession, it is reasonable to infer that the original claim was approximately true, and 
that the claims that replace it as the succession progresses are increasing approximations 
to the truth.  

Thesis 2 (Reference). If a certain term putatively referring to a certain type of entity
occurs in a succession of increasingly successful scientific theories, and there is an 
increasingly large number of properties that are stably attributed to that type of entity as 
the succession progresses, then it is reasonable to infer that the term refers to something 
real that possesses those properties. 

There are many ways a theory can retain claims (or terms) as it develops and becomes 
more successful. (We will allow those claims and terms that remain stable across radical 
change in theories, such as "paradigm shifts" of the sort described by Kuhn.) For 
example, the old theory could be a "limiting case" of the new, in the formal sense of 
being derivable from it (perhaps only with auxiliary empirical assumptions that according 
to the new theory are false); or, the new theory may reproduce those empirical 
consequences of the old theory that are known to be true (and maybe also explain why 
things would behave just as if the old theory were true in the domain that was known at the
time); finally, the new theory may preserve some explanatory features of the old theory. 
Convergent realists argue from retention of some structure across theoretical change that leads to greater success that whatever is retained must be either approximately true (in the
case of theoretical claims such as Newton's Laws of Motion, which are "limiting cases" 
of laws that appear in the new, more successful theory) or must refer to something real 
(in the case of theoretical terms such as "electron," which have occurred in an 
increasingly successful succession of theories as described by Thesis 2). 
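
One standard illustration of the "limiting case" sense of retention (my example, not one taken from Laudan or McMullin) is that the Newtonian expression for kinetic energy is recovered from the relativistic expression as v/c goes to zero; a quick numerical check:

```python
import math

# Illustration: Newtonian kinetic energy as a limiting case of the
# relativistic formula (gamma - 1) * m * c**2 for v << c.
c = 3.0e8        # speed of light, m/s (rounded)
m = 1.0          # mass, kg

def ke_relativistic(v):
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return (gamma - 1.0) * m * c ** 2

def ke_newtonian(v):
    return 0.5 * m * v ** 2

for v in (3.0e3, 3.0e6, 3.0e7, 1.5e8):   # from slow to relativistic speeds
    ratio = ke_relativistic(v) / ke_newtonian(v)
    print(f"v/c = {v / c:.5f}   relativistic/Newtonian ratio = {ratio:.6f}")
# As v/c -> 0 the ratio -> 1: the old law reappears as a special case of the new.
```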

Next time, we will examine an objection to retention arguments, as well as several replies 
that could be made on behalf of the convergent realist. 

 

Lecture 22 

4/21/94 

Convergent Realism and the History of Science  

Last time, we ended by discussing a certain type of argument for convergent realism 
known as retention arguments. These arguments rely on the following two theses. 

Thesis 1 (Approximate Truth). If a certain claim appears as an initial member of a
succession of increasingly successful scientific theories, and either that claim or a claim 
of which the original claim is a special case appears in subsequent members of that 
succession, it is reasonable to infer that the original claim was approximately true, and 
that the claims that replace it as the succession progresses are increasing approximations 
to the truth.  

Thesis 2 (Reference). If a certain term putatively referring to a certain type of entity
occurs in a succession of increasingly successful scientific theories, and there is an 
increasingly large number of properties that are stably attributed to that type of entity as 
the succession progresses, then it is reasonable to infer that the term refers to something 
real that possesses those properties. 

As I noted last time, there are many ways a theory can retain claims (or terms) as it 
develops and becomes more successful. (We will allow those claims and terms that 
remain stable across radical change in theories, such as "paradigm shifts" of the sort 
described by Kuhn.) For example, the old theory could be a "limiting case" of the new, in 
the formal sense of being derivable from it (perhaps only with auxiliary empirical 
assumptions that according to the new theory are false); or, the new theory may 
reproduce those empirical consequences of the old theory that are known to be true (and 
maybe also explain why things would behave just as if the old theory were true in the
domain that was known at the time); finally, the new theory may preserve some 
explanatory features of the old theory. Convergent realists argue from retention of some 
structure across theoretical change that leads to greater success that whatever is retained 
must be either approximately true (in the case of theoretical claims such as Newton's 
Laws of Motion, which are "limiting cases" of laws that appear in the new, more 
successful theory) or must refer to something real (in the case of theoretical terms such as 
"electron," which have occurred in an increasingly successful succession of theories as 
described by Thesis 2). 

Warning: Now I should note that I am presenting the convergent realist's position, and 
retention arguments in general, somewhat differently than Laudan does in "A Confutation of Convergent Realism." In that article, Laudan examines the "retentionist" thesis that new
theories should retain the central explanatory apparatus of their predecessors, or that the 
central laws of the old theory should provably be special cases of the central laws of the 
new theory. This is a prescriptive account of how science should proceed. According to 
Laudan, convergent realists also hold that scientists follow this strategy, and that the fact 
that scientists are able to do so proves that the successive theories as a whole are 
increasing approximations to the truth. He objects, rightly, that while there are certain 
cases where retention like this occurs (e.g., Newton-Einstein), there are many cases in 
which retention of this sort does not occur (Lamarck-Darwin; catastrophist-
uniformitarian geology; corpuscular-wave theory of light; also, any of the examples that 
involve an ontological "loss" of the sort emphasized by Kuhn, e.g., phlogiston, ether, 
caloric). Moreover, Laudan points out that when retention occurs (as in the transition 
from Newtonian to relativistic physics), it only occurs with regard to a few select 
elements of the older theory. The lesson he draws from this is that the "retentionist" 
strategy is generally not followed by scientists, so the premise in the retentionist 
argument that says that scientists successfully follow this strategy is simply false. I take it 
that Laudan is correct in his argument, and refer you to his article for his refutation of that 
type of "global" retentionist strategy. What I'm doing here is slightly different, and more 
akin to the retentionist argument given by Ernan McMullin in his article "A Case for 
Scientific Realism." A retentionist argument that uses Theses 1 and 2 above and the fact 
that scientific theories are increasingly successful to argue for a realist position is not 
committed to the claim that everything in the old theory must be preserved in the new 
(perhaps only as a special case); it is enough that some things are preserved. (McMullin, 
in particular, uses a variation on Thesis 2, where the properties in question are structural 
properties, to argue for scientific realism. See the section of his article entitled "The 
Convergences of Structural Explanation.") That is because the convergent realist whom I 
am considering claims only that it is reasonable to infer the reality (or approximate truth) 
of those things that are retained (in one of the senses described above) across theoretical 
change. Thus, it is not an objection to the more reasonable convergent realist position that 
I'm examining here (of which McMullin's convergent realism is an example) to claim that 
not everything is retained when the new theory replaces the old, and that losses in overall 
ontology occur with theoretical change along with gains. 

That said, there are still grounds for challenging the more sensible, selective retentionist 
arguments that are based on Theses 1 and 2. I will concentrate on the issue of successful 
reference (Thesis 2); similar arguments can be given for the case of approximate truth 
(Thesis 1). (Follow each objection and reply with discussion.) 

Objection: The fact that a theoretical term has occurred in a succession of increasingly 
successful scientific theories as described by Thesis 2 does not guarantee that it will 
continue to appear in all future theories. After all, there have been many theoretical terms 
that appeared in increasingly successful research programs (phlogiston, caloric, ether) in 
the way described by Thesis 2, but those research programs subsequently degenerated 
and went the way of the dinosaurs. What the convergent realist needs to show to fully 
defend his view is that there is reason to think that those terms and concepts that have been retained in recent theories (electron, quark, DNA, genes, fitness) will continue to be
retained in all future scientific theories. However, there is no reason to think that: indeed, 
if we examine the history of science we should infer that any term or concept that appears 
in our theories today is likely to be replaced at some time in the future. 

Reply 1: The fact that many terms have been stably retained as described by Thesis 2 in 
recent theories is the best reason one could have to believe that they will continue to be 
retained in all future theories. Of course, there are no guarantees that this will be the case: 
quarks may eventually go the way of phlogiston and caloric. Nevertheless, it is 
reasonable to believe that those terms will be retained, and, what is more important, it 
becomes more reasonable to believe this the longer such terms are retained and the more 
there is a steady accumulation of properties that are stably attributed to the type of 
entities those terms putatively designate. Anti-realists such as Laudan are right when they 
point out that one cannot infer that the unobservable entities postulated by a scientific 
theory are real based solely on the empirical success of that theory, but that is not what 
scientists do. The inference to the reality of the entities is also a function of the degree of 
stability in the properties attributed to those entities, the steadiness of growth in that class 
of properties over time, and how fruitful the postulation of entities of that sort has been in 
generating accumulation of this sort. (See McMullin, "The Convergences of Structural 
Explanation" and "Fertility and Metaphor," for a similar point.) 

Reply 2: It's not very convincing to point to past cases in the history of science where 
theoretical entities, such as caloric and phlogiston, were postulated by scientists but later 
rejected as unreal. Science is done much more rigorously, rationally, and successfully 
nowadays than it was in the past, so we can have more confidence that the terms that 
occur stably in recent theories (e.g., electron, molecule, gene, and DNA) refer to 
something real, and that they will continue to play a role in future scientific theories. 
Moreover, though our conception of these entities has changed over time, there is a 
steady (and often rapid) accumulation of properties attributed to entities postulated by 
modern scientific theories, much more so than in the past. This shows that the terms that 
persist across theoretical change in modern science should carry more weight than those 
terms that persisted across theoretical change in earlier periods. 

Reply 3: The objection overemphasizes the discontinuity and losses that occur in the 
history of science. Laudan, like Kuhn, wants to argue there are losses as well as gains in 
the ontology and explanatory apparatus of science. However, in doing so he 
overemphasizes the importance of those losses for how we should view the progression 
of science. The fact that scientific progress is not strictly cumulative does not mean that 
there is not a steady accumulation of entities in scientific ontology and knowledge about 
the underlying structural properties that those entities possess. Indeed, one can counter 
Laudan's lists of entities that have been dropped from the ontology of science with 
equally long lists of entities that were introduced and were retained across theoretical 
change. Our conception of these entities has changed, but rather than focusing on the 
conceptual losses, one should focus on the remarkable fact that there is a steadily 
growing class of structural properties that we attribute to the entities that have been 
retained, even across radical theoretical change. (See McMullin, "Sources of Antirealism: 
History of Science," for a similar point.) 

Lecture 23 

4/26/94 

The Measurement Problem, Part I 

For the next two days we will be examining the philosophical problems that arise in 
quantum physics, specifically those connected with the interpretation of measurement. 
Today's lecture will provide the necessary background and describe how measurement is 
understood according to the "orthodox" interpretation of quantum mechanics, and several 
puzzling features of that view. The second lecture will describe various "non-orthodox" 
interpretations of quantum mechanics that attempt to overcome these puzzles by 
interpreting measurement in a quite different way. 

A Little Historical Background 

As many of you know, the wave theory of light won out over the corpuscular theory during the first half of the nineteenth century. By the end of that century, however, problems
had emerged with the standard wave theory. The first problem came from attempts to 
understand the interactions of matter and light. A body at a stable temperature will radiate 
and absorb light. The body will be at equilibrium with the light, which has its energy 
distributed among various frequencies. The distribution of frequencies of light emitted by the body varies with its temperature, as is familiar from experience. However, by the end of the
nineteenth century physicists found that they could not account theoretically for that 
frequency distribution. Two attempts to do so within the framework of Maxwell's theory 
of electromagnetic radiation and statistical mechanics led to two laws--Wien's law and 
the Rayleigh-Jeans law--that failed at lower and higher frequencies, respectively. 

In 1900, Max Planck postulated a compromise law that accounted for the observed 
frequency distribution, but it had the odd feature of postulating that light transmitted 
energy in discrete packets. The packets at each frequency had energy E = hν, where h is Planck's constant and ν is the frequency of the light. Planck regarded his law as merely a
phenomenological law, and thought of the underlying distribution of energy as 
continuous. In 1905, Einstein proposed that all electromagnetic radiation consists of 
discrete particle-like packets of energy, each with an energy hν. He referred to these
packets, which we now call photons, as quanta of light. Einstein used this proposal to 
explain the photoelectric effect (where light shining on an electrode causes the emission 
of electrons). 
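
A minimal numerical sketch of these two points, assuming standard SI constants and frequencies chosen only for illustration: the photon energy is E = hν, and the classical Rayleigh-Jeans formula increasingly overshoots Planck's law as the frequency grows (the high-frequency failure mentioned above).

```python
import math

# Photon energy E = h*nu, and Rayleigh-Jeans versus Planck spectral energy
# density at a fixed temperature (constants and frequencies are illustrative).
h = 6.626e-34    # Planck's constant, J*s
k = 1.381e-23    # Boltzmann's constant, J/K
c = 3.0e8        # speed of light, m/s
T = 5000.0       # temperature of the radiating body, K

def rayleigh_jeans(nu):
    # Classical formula; grows without bound as nu increases.
    return 8 * math.pi * nu ** 2 * k * T / c ** 3

def planck(nu):
    # Planck's law; agrees with Rayleigh-Jeans at low nu, falls off at high nu.
    return (8 * math.pi * h * nu ** 3 / c ** 3) / (math.exp(h * nu / (k * T)) - 1)

for nu in (1e12, 1e14, 1e15):            # low to high frequency, in Hz
    print(f"nu = {nu:.0e} Hz   E = h*nu = {h * nu:.2e} J   "
          f"Rayleigh-Jeans/Planck = {rayleigh_jeans(nu) / planck(nu):.2f}")
```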

After it became apparent that light, a wave phenomenon, had particle-like aspects, 
physicists began to wonder whether particles such as electrons might also have wave-like 
aspects. L. de Broglie proposed that they did and predicted that under certain conditions 
electrons would exhibit wave-like behavior such as interference and diffraction. His 
prediction turned out to be correct: a diffraction pattern can be observed when electrons 
are reflected off a suitable crystal lattice (see Figure 1, below). 

Figure 1: Electron Diffraction By A Crystal Lattice

Schrödinger soon developed a formula for describing the wave associated with electrons, 
both in conditions where they are free and when they are bound by forces. Famously, his 
formula was able to account for the various possible energy levels that had already been 
postulated for electrons orbiting a nucleus in an atom. At the same time, Heisenberg was 
working on a formalism to account for the various patterns of emission and absorption of 
light by atoms. It soon became apparent that the discrete energy levels in Heisenberg's 
abstract mathematical formalism corresponded to the energy levels of the various 
standing waves that were possible in an atom according to Schrödinger's equation. 

Thus, Schrödinger proposed that electrons could be understood as waves in physical 
space, governed by his equation. Unfortunately, this turned out to be a plausible 
interpretation only in the case of a single particle. With multiple-particle systems, the 
wave associated with Schrödinger's equation was represented in a multi-dimensional 
space (3n dimensions, where n is the number of particles in the system). In addition, the 
amplitudes of the waves were often complex (as opposed to real) numbers. Thus, the 
space in which Schrödinger's waves exist is an abstract mathematical space, not a 
physical space. Thus, the waves associated with electrons in Schrödinger's wave 
mechanics could not be interpreted generally as "electron waves" in physical space. In 
any case, the literal interpretation of the wave function did not cohere with the observed 
particle-like aspects of the electron, e.g., that electrons are always detected at particular 
points, never as "spread-out" waves. 

The Probability Interpretation Of Schrödinger's Waves; The Projection Postulate 

M. Born contributed to a solution to the problem by suggesting that the squared magnitude of the amplitude of Schrödinger's wave at a certain point in the abstract mathematical space of the wave represented the probability of finding the particular value (of position or momentum) associated with that point upon measurement. This only partially solves the
problem, however, since the interference effects are physically real in cases such as 
electron diffraction and the famous two-slit experiment (see Figure 2). 

Figure 2: The Two-Slit Experiment

If both slits are open, the probability of a photon impinging on a certain spot on the photographic plate is not the sum of the probability that it would impinge on that spot if it passed through slit 1 and the probability that it would impinge on that spot if it passed through slit 2, as is illustrated by pattern (a). Instead, if both slits are open you get an interference pattern of the sort illustrated by pattern (b). This is the case even if the light source is so weak that you are in effect sending only one photon at a time through the slits.
Eventually, an interference pattern emerges, one point at a time. 

Interestingly, the interference pattern disappears if you place a detector just behind one of the slits to determine whether the photon passed through that slit or not, and you get a simple sum of the two single-slit probabilities, as illustrated by pattern (a). Thus, you can't explain the
interference pattern as resulting from interaction among the many photons in the light 
source. 
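
The arithmetic behind this can be sketched with made-up amplitudes (the numbers are purely illustrative): on Born's suggestion, the probability with both slits open is the squared magnitude of the summed amplitudes, and the cross term between the two amplitudes is what produces the fringes.

```python
import cmath

# Illustrative amplitudes at one spot on the plate (not from any real setup).
psi1 = 0.5 * cmath.exp(1j * 0.0)         # amplitude for passage via slit 1
psi2 = 0.5 * cmath.exp(1j * 2.5)         # amplitude for passage via slit 2

p1 = abs(psi1) ** 2                      # probability via slit 1 alone
p2 = abs(psi2) ** 2                      # probability via slit 2 alone
p_coherent = abs(psi1 + psi2) ** 2       # both slits open, no detector

print(f"p1 + p2 (detector at a slit):  {p1 + p2:.3f}")
print(f"|psi1 + psi2|^2 (no detector): {p_coherent:.3f}")
# The two numbers differ by the interference (cross) term 2*Re(psi1*conj(psi2));
# as the relative phase varies from spot to spot, the fringe pattern (b) appears.
```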

A similar sort of pattern is exhibited by electrons when you measure their spin (a two-
valued quantity) in a Stern-Gerlach device. (This device creates a magnetic field that is 
homogeneous in all but one direction, e.g., the vertical direction; in that case, electrons 
are deflected either up or down when they pass through the field.) You get interference 
effects here just as in the two-slit experiment described above. (See Figure 3) 

Figure 3: Interference Effects Using Stern-Gerlach Devices

When a detector is placed after the second, left-right oriented Stern-Gerlach device as in 
(d), the final, up-down oriented Stern-Gerlach device shows that half of the electrons are 
spin "up" and half of the electrons are spin "down." In this case, the electron beam that 
impinges on the final device is a definite "mixture" of spin "left" and spin "right" 
electrons. On the other hand, when no detector is placed after the second, left-right 
oriented Stern-Gerlach device as in (e), all the electrons are measured "up" by the final 
Stern-Gerlach device. In that case, we say that the beam of electrons impinging on the 
final device is a "superposition" of spin "left" and spin "right" electrons, which happens 
to be equivalent to a beam consisting of all spin "up" electrons. Thus, placing a detector before the final device destroys some information present in the interference of the
electron wave, just as placing a detector after a slit in the two-slit experiment destroyed 
the interference pattern there. 
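
The difference between the superposition in (e) and the mixture in (d) can be made concrete with a small linear-algebra sketch (the basis conventions are my own choice, made only for illustration):

```python
import numpy as np

# z-basis column vectors for a spin-1/2 system (conventions chosen for illustration).
up    = np.array([1.0, 0.0])
down  = np.array([0.0, 1.0])
right = (up + down) / np.sqrt(2)         # spin "right" (an x-basis state)
left  = (up - down) / np.sqrt(2)         # spin "left"  (an x-basis state)

def prob_up(state):
    """Born rule: probability of getting spin 'up' when measuring a pure state."""
    return abs(np.dot(up, state)) ** 2

# (e) No detector: the beam is the coherent superposition (left + right)/sqrt(2).
superposition = (left + right) / np.linalg.norm(left + right)
print("superposition, probability of 'up':", prob_up(superposition))   # 1.0

# (d) Detector present: the beam is a 50/50 mixture of left and right.
mixture = 0.5 * prob_up(left) + 0.5 * prob_up(right)
print("mixture, probability of 'up':      ", mixture)                  # 0.5
```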

J. von Neumann generalized the formalisms of Schrödinger and Heisenberg. In von 
Neumann's formalism, the state of a system is represented by a vector in a complex, 
multi-dimensional vector space. Observable features of the world are represented by 
operators on this vector space, which encode the various possible values that that 
observable quantity can have upon measurement. Von Neumann postulated that the "state 
vector" evolves deterministically in a manner consistent with Schrödinger's equation, 
until there is a measurement, in which case there is a "collapse," which 
indeterministically alters the physical state of the system. This is von Neumann's famous 
"Projection Postulate." 

Thus, von Neumann postulated that there were two kinds of change that could occur in a 
state of a physical system, one deterministic (Schrödinger evolution), which occurs when 
the system is not being measured, and one indeterministic (projection or collapse), which 
occurs as a result of measuring the system. The main argument that von Neumann gave 
for the Projection Postulate is that whatever value is observed upon measuring a system will be found again with certainty upon immediate subsequent measurement (so long as the measurement does not destroy the system, of course). Thus, von Neumann argued that the
fact that the value has a stable result upon repeated measurement indicates that the system 
really has that value after measurement.  
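
A toy sketch of the Projection Postulate (the two-level observable is my own choice, not any particular experimental setup): measurement selects an eigenvalue with Born-rule probability, the state collapses onto the corresponding eigenvector, and an immediate repetition returns the same value.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy observable: spin along x, with eigenvalues -1 and +1.
Sx = np.array([[0.0, 1.0],
               [1.0, 0.0]])
eigvals, eigvecs = np.linalg.eigh(Sx)    # eigenvectors are the columns

def measure(state):
    """Pick an eigenvalue with Born-rule probability, then collapse the state."""
    probs = [abs(np.vdot(eigvecs[:, i], state)) ** 2 for i in range(len(eigvals))]
    i = rng.choice(len(eigvals), p=np.array(probs) / sum(probs))
    return eigvals[i], eigvecs[:, i]

state = np.array([1.0, 0.0])             # start in spin "up" (z-basis)
value, state = measure(state)            # first measurement: random -1 or +1
repeat, state = measure(state)           # immediate repetition
print("first result:", value, "  repeated result:", repeat)   # always the same
```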

The Orthodox Copenhagen Interpretation (Bohr) 

What about the state of the system in between measurements? Do observable quantities 
really have particular values that measurement reveals? Or are there real values that exist 
before measurement that the measurement process itself alters indeterministically? In either case, the system as described by von Neumann's state vectors (and Schrödinger's
wave equation) would have to be incomplete, since the state vector is not always an 
"eigenvector," i.e., it does not always lie along an axis that represents a particular one of 
the possible values that a particular observable quantity (such as spin "up"). Heisenberg 
at first took the view that measurement indeterministically alters the state of the system 
that existed before measurement. This implies that the system was in a definite state 
before measurement, and that the quantum mechanical formalism gives an incomplete 
description of physical systems. Famously, N. Bohr proposed an interpretation of the 
quantum mechanical formalism that denies that the description given by the quantum 
mechanical formalism is incomplete. According to Bohr, it only makes sense to attribute values to observable quantities of a physical system when the system is being measured in a
particular way. Descriptions of physical systems therefore only make sense relative to 
particular contexts of measurement. (This is Bohr's solution to the puzzling wave-particle 
duality exhibited by entities such as photons and electrons: the "wave" and "particle" 
aspects of these entities are "complementary," in the sense that it is physically impossible 
to construct a measuring device that will measure both aspects simultaneously. Bohr 
concluded that from a physical standpoint it only makes sense to speak about the "wave" 
or "particle" aspects of quantum entities as existing relative to particular measurement 
procedures.) One consequence of Bohr's view is that one cannot even ask what a physical 
system is like between measurements, since any answer to this question would 
necessarily have to describe what the physical system is like, independent of any 
particular context of measurement. 

It is important to distinguish the two views just described, and to distinguish Bohr's view 
from a different view that is sometimes attributed to him: 

A physical system's observable properties always have definite values between measurements, but we can never know what those values are since the values can only be
determined by measurement, which indeterministically disturbs the system. (Heisenberg) 

It does not make sense to attribute definite values to a physical system's observable 
properties except relative to a particular kind of measurement procedure, and then it only 
makes sense when that measurement is actually being performed. (Bohr) 

A physical system's observable properties have definite values between measurements, but these values are not precise, as they are when the system's observable properties are being measured; rather, the values of the system's observable quantities before
measurement are "smeared out" between the particular values that the observable quantity 
could have upon measurement. (Pseudo-Bohr) 

Each of these views interprets a superposition differently, as a representation of our 
ignorance about the true state of the system (Heisenberg), as a representation of the values that the various observable quantities of the system could have upon measurement
(Bohr), or as a representation of the indefinite, imprecise values that the observable 
quantities have between measurements (Pseudo-Bohr). Accordingly, each of these views 
interprets projection or collapse differently, as a reduction in our state of ignorance 
(Heisenberg), as the determination of a definite result obtained when a particular 
measurement procedure is performed (Bohr), or as an instantaneous localization of the "smeared out," imprecise values of a particular observable quantity (Pseudo-Bohr).

Next time we will discuss several problematic aspects of the Copenhagen understanding 
of measurement, along with several alternative views. 

 

Lecture 24 

4/28/94 

The Measurement Problem, Part II 

Last time, we saw that Bohr proposed an interpretation of the quantum mechanical 
formalism, the Copenhagen interpretation, that has become the received view among 
physicists. Today we will look more closely at the picture that view gives us of physical 
reality and the measurement process in particular. Then we will examine several 
alternative interpretations of measurement, proceeding from the least plausible to the 
most plausible alternatives. 

The Copenhagen Understanding Of Measurement 

According to Bohr, it only makes sense to attribute values to observable quantities of a physical system when the system is being measured in a particular way. Descriptions of
physical systems therefore only make sense relative to particular contexts of 
measurement. (This is Bohr's solution to the puzzling wave-particle duality exhibited by 
entities such as photons and electrons: the "wave" and "particle" aspects of these entities 
are "complementary," in the sense that it is physically impossible to construct a 
measuring device that will measure both aspects simultaneously. Bohr concluded that 
from a physical standpoint it only makes sense to speak about the "wave" or "particle" 
aspects of quantum entities as existing relative to particular measurement procedures.) 
One consequence of Bohr's view is that one cannot even ask what a physical system is 
like between measurements, since any answer to this question would necessarily have to 
describe what the physical system is like, independent of any particular context of 
measurement.  

On Bohr's view (which is generally accepted as the "orthodox" interpretation of quantum 
mechanics even though it is often misunderstood), the world is divided into two realms of 
existence, that of quantum systems, which behave according to the formalism of quantum 
mechanics and do not have definite observable values outside the context of 
measurement, and of "classical" measuring devices, which always have definite values 
but are not described within quantum mechanics itself. The line between the two realms 
is arbitrary: at any given time, one can consider a part of the world that serves as a 
"measuring instrument" (such as the Stern-Gerlach device) either as a quantum system 
that interacts with other quantum systems according to the deterministic laws governing 
the state vector (Schrödinger's equation) or as a measuring device that behaves 
"classically" (i.e., always has definite observable properties) though indeterministically. 

There are several difficulties with this view, which together constitute the "measurement 
problem." To begin with, the orthodox interpretation gives no principled reason why 

background image

physics should not be able to give a complete description of the measurement process. 
After all, a measuring device (such as a Stern-Gerlach magnet) is a physical system, and 
in performing a measurement it simply interacts with another physical system such as a 
photon or an electron. However, (a) there seems to be no principled reason why one 
particular kind of physical interaction is either indescribable within quantum physics 
itself (as Bohr suggests), or is not subject to the same laws (e.g., Schrödinger's equation) that govern all other physical interactions, and (b) the orthodox view offers no precise
characterization that would mark off those physical interactions that are measurements 
from those physical interactions that are not. Indeed, the orthodox interpretation claims 
that whether a certain physical interaction is a "measurement" is arbitrary, i.e., a matter of 
choice on the part of the theorist modeling the interaction. However, this hardly seems 
satisfactory from a physical standpoint. It does not seem to be a matter of mere 
convenience where we are to draw the line between the "classical" realm of measuring devices and the realm of those physical systems that obey the deterministic laws of the quantum formalism (in particular, Schrödinger's equation). For example, Schrödinger pointed out
that the orthodox interpretation allows for inconsistent descriptions of the state of 
macroscopic systems, depending on whether we consider them measuring devices. For 
example, suppose that you placed a cat in an enclosed box along with a device that will 
release poisonous gas if (and only if) a Geiger counter measures that a certain radium 
atom has decayed. According to the quantum mechanical formalism, the radium atom is 
in a superposition of decaying and not decaying, and since there is a correlation between 
the state of the radium atom and the Geiger counter, and between the state of the Geiger counter and the state of the cat, the cat should also be in a superposition, specifically, a
superposition of being alive and dead, if we do not consider the cat to be a measuring 
device. If the orthodox interpretation is correct, however, this would mean that there is no 
fact of the matter about whether the cat is alive or dead, if we consider the cat not to be a 
measuring device. On the other hand, if we consider the cat to be a measuring device, 
then according to the orthodox interpretation, the cat will either be definitely alive or 
definitely dead. However, it certainly does not seem to be a matter of "arbitrary" choice 
whether (a) there is no fact of the matter about whether the cat is alive or dead before we 
look into the box, or (b) the cat is definitely alive or dead before we look into the box. 

Wigner's Idealism: Consciousness As The Cause Of Collapse  

Physicist E. Wigner argued against Bohr's view that the distinction between measurement 
and "mere" physical interaction could be made arbitrarily by trying to emphasize its 
counterintuitive consequences. Suppose that you put one of Wigner's friends in the box 
with the cat. The "measurement" you make at a given time is to ask Wigner's friend if the 
cat is dead or alive. If we consider Wigner's friend as part of the experimental setup, quantum mechanics predicts that before you ask Wigner's friend whether the cat is dead or alive,
he is in a superposition of definitely believing the cat is dead and definitely believing that 
the cat is alive. Wigner argued that this was an absurd consequence of Bohr's view. 
People simply do not exist in superposed belief-states. Wigner's solution was that, 
contrary to what Bohr claimed, there is a natural division between what constitutes a 
measurement and what does not--the presence of a conscious observer. Wigner's friend is 
conscious; thus, he can by an act of observation cause a collapse of the wave function. 
(Alternatively, if we consider the cat to be a conscious being, the cat would be definitely alive or definitely dead even before Wigner's friend looked to see how the cat was doing.)

Few physicists have accepted Wigner's explanation of what constitutes a measurement, 
though his view has "trickled down" to some of the (mainly poor quality) popular 
literature on quantum mechanics (especially the type of literature that sees a direct 
connection between quantum mechanics and Eastern religious mysticism). The basic 
source of resistance to Wigner's idealism is that it requires that physicists solve many 
philosophical problems they'd like to avoid, such as whether cats are conscious beings. 
More seriously, Wigner's view requires a division of the world into two realms, one 
occupied by conscious beings who are not subject to the laws of physics but who can 
somehow miraculously disrupt the ordinary deterministic evolution of the physical 
systems, and the other by the physical systems themselves, which evolve 
deterministically until a conscious being takes a look at what's going on. This is hardly 
the type of conceptual foundation needed for a rigorous discipline such as physics. 

The Many-Worlds Interpretation 

Another view, first developed by H. Everett (a graduate student at Princeton) in his Ph.D. 
thesis, is called the many-worlds interpretation. It was later developed further by another physicist, B. DeWitt. This view states that there is no collapse when a measurement
occurs; instead, at each such point where a measurement occurs the universe "branches" 
into separate, complete worlds, a separate world for every possible outcome that 
measurement could have. In each branch, it looks as if the measuring devices indeterministically take on a definite value, and the empirical frequencies that occur upon
repeated measurement in almost every branch converge on the probabilities predicted by 
the Projection Postulate. The deterministically evolving wave function correctly describes 
the quantum mechanical state of the universe as a whole, including all of its branches. 

Everett's view has several advantages, despite its conceptual peculiarity. First, it can be 
developed with mathematical precision. Second, it postulates that the universe as a whole 
evolves deterministically according to Schrödinger's equation, so that you need only one type of evolution, not Schrödinger's equation plus collapse during a measurement. Third,
the notion of a measurement (which is needed to specify where the branching occurs) can 
be spelled out precisely and non-arbitrarily as a certain type of physical interaction. 
Fourth, it makes sense to speak of the quantum state of the universe as a whole on 
Everett's view, which is essential if you want to use quantum mechanical laws to develop 
cosmological theories. By contrast, on the orthodox, Copenhagen view, quantum 
mechanics can never describe the universe as a whole since quantum mechanical 
description is always relativized to an arbitrarily specified measuring device, which is not 
itself given a quantum mechanical description. 

On the other hand, the many worlds interpretation has several serious shortcomings. First, 
it's simply weird conceptually. Second, it does not account for the fact that we never 
experience anything like a branching of the world. How is it that our experience is unified 
(i.e., how is it that my experience follows one particular path in the branching universe 
and not others)? Third, as noted above, the many worlds interpretation predicts that there 
will be worlds where the observed empirical frequencies of repeated measurements will 
not fit the predictions of the quantum mechanical formalism. In other words, if the theory
is true, there will be worlds at which it looks as if the theory is false! Finally, the theory
does not seem to give a clear sense to locutions such as "the probability of this electron 
going up after passing through this Stern-Gerlach device is 1/2." What exactly does that 
number 1/2 mean, if the universe simply branches into two worlds, in one of which the 
electron goes up and in the other of which it goes down? The same branching would 
occur, after all, if the electron were in a state where the probability of its going up were 
3/4 and the probability of its going down were 1/4. 
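
A toy calculation may make this worry vivid. The sketch below (in Python; everything in it is
illustrative: the "one branch per possible outcome" rule is just the naive reading described
above, and the number of measurements and the tolerance are arbitrary choices) counts branches
after repeated spin measurements and shows that plain branch counting delivers the same
verdict no matter what the amplitudes were:

    from itertools import product

    n = 10  # number of repeated spin measurements (arbitrary)

    # Naive branching rule: each measurement splits every existing branch into
    # exactly two, one "up" branch and one "down" branch, regardless of the
    # amplitudes.  A branch is then just a sequence of n outcomes.
    branches = list(product(["up", "down"], repeat=n))

    # In what fraction of branches does the observed frequency of "up" come
    # out close to 1/2?
    near_half = sum(1 for b in branches
                    if abs(b.count("up") / n - 0.5) <= 0.1) / len(branches)

    print(len(branches), "branches")
    print("fraction with up-frequency within 0.1 of 1/2:", round(near_half, 2))

    # The calculation would be exactly the same if the electron's state had
    # given "up" a Born probability of 3/4 rather than 1/2: the amplitudes
    # never enter, which is why the number 3/4 has no obvious meaning here.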

Because of problems like these, the many-worlds interpretation has gained considerable
notoriety among physicists, but almost no one accepts it as a plausible interpretation.

The Ghirardi-Rimini-Weber (GRW) Interpretation 

In 1986, three physicists (G.C. Ghirardi, A. Rimini, and T. Weber) proposed a way of 
accounting for the fact that macroscopic objects (such as Stern-Gerlach devices, cats, and 
human beings) are never observed in superpositions, whereas microscopic systems (such 
as photons and electrons) are. (Their views were developed further in 1987 by physicist 
John Bell.) According to the Ghirardi-Rimini-Weber (GRW) interpretation, there is a 
very small probability (one in a trillion) that the wave functions for the positions of 
isolated, individual particles will collapse spontaneously at any given moment. When 
particles couple together to form an object, the tiny probabilities of spontaneous collapse 
quickly add up for the system as a whole, since when one particle collapses so does every 
particle to which that particle is coupled. Moreover, since even a microscopic bacterium 
is composed of trillions and trillions of subatomic particles, the probability that a 
macroscopic object will have a definite spatial configuration at any given moment is 
virtually indistinguishable from 100 percent. Thus, GRW can explain in a mathematically precise
way why we never observe macroscopic objects in superpositions, but often observe 
interference effects due to superposition when we're looking at isolated subatomic 
particles. Moreover, they too can give a precise and non-arbitrary characterization of the 
measurement process as a type of physical interaction. 
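
A back-of-the-envelope calculation shows how the "adding up" works. The figures below are
assumptions used purely for illustration: the lecture's one-in-a-trillion chance per particle
per moment, and roughly 10^23 particles for a macroscopic object.

    import math

    p = 1e-12   # spontaneous-collapse chance for one isolated particle (lecture's figure)
    N = 1e23    # assumed number of coupled particles in a macroscopic object

    # Probability that at least one of the N particles collapses in a given
    # moment; since a collapse of one particle localizes every particle coupled
    # to it, this is the chance the whole object acquires a (nearly) definite
    # spatial configuration.
    p_object = 1 - math.exp(-p * N)   # close approximation to 1 - (1 - p)**N

    print("single isolated particle:", p)
    print("macroscopic object:", p_object)   # prints 1.0 to machine precision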

There are two problems with the GRW interpretation. First, though subatomic particles 
are localized to some degree (in that the collapse turns the wave function representing the 
position of those particles into a narrowly focused bell curve), they never do have precise 
positions. The tails of the bell curve never vanish, extending to infinity. (Draw diagram 
on board.) Thus, GRW never gives us particles with definite positions, or even with small
but spatially extended positions: instead, what we get are particles whose positions are
"mostly" confined to a small, spatially extended region. Importantly, this is also true of
macroscopic objects
(such as Stern-Gerlach devices, cats, and human beings) that are composed of subatomic 
particles. In other words, according to GRW you are "mostly" in this room, but there's a 
vanishingly small part of you at every other point in the universe, no matter how distant! 
Second, GRW predicts that energy is not conserved when spontaneous collapses occur. 
While the total violation of conservation of energy predicted by GRW is too small to be 
observed, even over the lifetime of the universe, the theory does give up a principle that
many physicists consider to be conceptually essential.
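
To see why the non-vanishing tails matter, here is a small illustrative calculation (the
standard Gaussian and the particular cutoffs are assumptions chosen for illustration) of how
much probability a bell curve leaves outside a central region:

    import math

    def prob_outside(k):
        """Probability mass of a standard Gaussian lying beyond k standard deviations."""
        return math.erfc(k / math.sqrt(2))

    for k in (3, 10, 20):
        print("beyond", k, "standard deviations:", prob_outside(k))

    # Each value is minute, but none is exactly zero: a collapsed GRW wave
    # packet still assigns some probability to arbitrarily distant regions,
    # which is why objects end up only "mostly" localized.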

Bohm's Interpretation 

In 1952, physicist David Bohm formulated a complete alternative to standard (non-
relativistic) quantum mechanics. Bohm's theory was put into a relatively simple and 
elegant mathematical form by John Bell in 1982. Bohm's theory makes the same 
predictions as does standard (non-relativistic) quantum mechanics but describes a 
classical, deterministic world that consists of particles with definite positions. The way 
that Bohm does this is by postulating that besides particles, there is a quantum force that 
moves the particles around. The physically real quantum force, which is determined by the
wave function evolving according to Schrödinger's equation, pushes the particles around so
that they
behave exactly as standard quantum mechanics predicts. Bohm's theory is deterministic: 
if you knew the initial configuration of every particle in the universe, applying Bohm's 
theory would allow you to predict with certainty every subsequent position of every 
particle in the universe. However, there's a catch: the universe is set up so that it is a 
physical (rather than a merely practical) impossibility for us to know the configuration of 
particles in the universe. Thus, from our point of view the world behaves just as if it's 
indeterministic (though the apparent indeterminism is simply a matter of our ignorance); 
also, though the wave function governing a particle's motion never collapses, the particle
moves around so that it looks as if measurement had caused the wave function to collapse.
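
For those who want to see roughly what the mathematics looks like, the standard textbook way
of presenting Bohm's 1952 dynamics (the formulas below follow that presentation and are not
taken from the lecture) writes the wave function in polar form and reads off a deterministic
guidance law together with a "quantum potential":

    \psi = R\,e^{iS/\hbar}, \qquad
    m\,\frac{dQ}{dt} = \nabla S(Q, t), \qquad
    U_Q = -\frac{\hbar^2}{2m}\,\frac{\nabla^2 R}{R}

The particle position Q evolves deterministically under the gradient of the phase S, and the
quantum potential U_Q is the mathematical counterpart of the lecture's "quantum force";
because it depends on the wave function of the whole configuration, it is also the source of
the instantaneous interconnection between particles discussed below.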

Despite its elegance and conceptual clarity, Bohm's theory has not been generally 
accepted by physicists, for two reasons. (1) Physicists can't use Bohm's theory, since it's 
impossible to know the configuration of all particles in the universe at any given time. 
Because of our ignorance of this configuration, the physical theory we have to use to 
generate predictions is standard quantum mechanics. Why then bother with Bohm's 
theory at all? (2) More importantly, in Bohm's theory all the particles in the
universe are intimately connected so that every particle instantaneously affects the 
quantum force governing the motions of the other particles. Many physicists (Einstein 
included) saw this feature of Bohm's theory as a throwback to the type of "action at a 
distance" expunged by relativity theory. Moreover, Bohm's theory assumes that 
simultaneity is absolute and so is inconsistent with relativity theory in its present form. 
(There is no Bohmian counterpart to relativistic quantum mechanics, though people are 
currently working on producing one.) 

Conclusion 

Each of the various interpretations of quantum mechanics we have examined, from the
orthodox Copenhagen interpretation to the four non-orthodox interpretations (idealistic,
many-worlds, GRW, and Bohm), has some sort of conceptual shortcoming. This is what makes
the philosophy of quantum mechanics so interesting: whichever interpretation we adopt, the
fact that experiments agree with the predictions of quantum mechanics indicates one thing
with certainty--the world we live in is a very weird place!

 

 

 
