Strong Inference: rationale or inspiration?

John Platt's article "Strong Inference" (1964) suggested a general and effective method of scientific investigation. It describes a disciplined strategy of falsification of multiple, clearly formulated hypotheses that is used more regularly in some scientific fields than in others. Platt urged that strong inference be more widely and more systematically applied, particularly in slower-moving fields of science. The article has influenced integrative biological fields since its publication, ranging from ecology to psychology, and has had a substantial following in some of the social sciences. It has also evoked severe criticism for its idealization of certain fields as exemplars and for its imperfections in historiography and philosophy of science. I argue here that the article was more an inspirational tract than the development of a formal scientific methodology. Although both Platt's critics and his adherents appeared to take the article far too seriously, its influence has transcended its limitations.

In 1964, John Platt published a lead article in the journal Science entitled "Strong Inference." He called attention to the rapid progress being made in some scientific fields, such as nuclear physics and molecular biology, and he explored the reasons why these areas moved so rapidly, while others languished as unexciting fields of investigation. The article captured the attention of many academics dissatisfied with the pace and standards of their disciplines.
"Strong Inference" describes a systematic use of inductive reasoning that prom-

Perspectives in Biology and Medicine
of integrative fields, such as medicine, psychology, and ecology, but citations from the social sciences (anthropology, education, linguistics, marketing, economics, management, and the like) have grown more numerous as these fields have strived for scientific legitimacy. Along the way, however, a significant number of authors have severely criticized Platt's premises and his argument.
Here I will argue that both Platt's critics and those who embraced his views took his article far too seriously. The article was an inspirational tract, the strengths of which were largely rhetorical. Its undoubted success in its aim and influence in biologically oriented fields overshadowed its shortcomings in its historical perspective, its inappropriate examples, and its idealization of scientific procedure.

Philosophical Context
Platt enhanced the messages of Bacon and Chamberlin with colorful descriptions and amplified them by references to the then-recent publications of Karl Popper and Thomas Kuhn. In his Logic of Scientific Discovery, Popper (1959) argued that falsification, not verification, defines true science. This surprised few, inasmuch as many scientists, from Bacon's time onward, had discovered this for themselves. However, Popper's approach was abstract and uncompromising, articulated with a purity that excluded as unscientific any investigation that relied on exploratory inductive methods and "creative intuitions." Instead, he confined himself to testing the validity of theories in light of agreed-upon ("infallible") empirical information. Experiments must specify in advance a result that will compel abandonment of the theory. Physicists and chemists welcomed Popper's stress on falsification as a component of their standard methodology. Kuhn (1962) presented a more radical notion: that "discovery" was not a logical but a psychological process. As anomalies accumulate in the context of one accepted "paradigm," a crisis develops and is relieved by the introduction of a rival view, accepted as much through persuasion as by evidence. While Kuhn cited the familiar examples of Copernicus supplanting Ptolemy and Einstein supplanting Newton, biologists could see, if inappropriately, the discovery of the genetic role of DNA as a revolutionary paradigm shift of their own. The incompatibility of Popper's and Kuhn's accounts of scientific programs bothered no one. The situation actually called attention to a liberating contest of paradigms in the philosophy of science.
The philosophy of science soon underwent a tempest of argument that remains far more interesting for its lack of consensus than for its contribution to the natural sciences. Lakatos (1970) presented a more realistic account of the "methodology of scientific research programmes," integrating elements of both Popper's and Kuhn's views. Falsification remained an indispensable feature of Lakatos' methodology, but progress was slower, measured in incremental "problemshifts." A belt of "auxiliary hypotheses," the targets of potentially falsifying experiments, surrounded the central theory. Thus the fundamental elements of research programs were not so vulnerable to failure as counterevidence and anomalies accumulated. At about the same time, Feyerabend (1975) presented a much more pluralistic ("anarchic") account of the methods of science, usually referred to as "anything goes." This stance devalued itself as time went on with its wicked, snide rhetoric ("Popper . . . is just a tiny puff of hot air in the positivistic teacup") and its extension into provocative social criticism (Feyerabend 1988). These developments in the philosophy of science engaged integrative biologists after 1975, as they were induced to consider their methods more closely.

Historical Context
In 1964, many experimental biologists were innocent or dismissive of the philosophy of investigation. Platt's article was a fresh view of hard science, emphasizing falsification as the engine of progress. This was a time when science became infused with considerable government support. Universities were expanding, and industry increasingly sought the expertise of academia. The preliminary steps toward molecular biology turned into enormous strides with the advent of microbial genetics on the one hand and the physical study of protein and nucleic acid structure on the other (Davis 2003; Judson 1979). The biological territory between the molecular and cellular dimensions fascinated an admiring public (Kay 1993). The money flowing into new scientific enterprises, especially after the launch of Sputnik in 1957, gave biologists many new experimental tools. These included radioactive tracers, ultracentrifuges, amino acid analyzers, X-ray crystallographic instruments, and fluorescence and electron microscopes. The exciting questions of the day and new instrumentation arose in a reciprocal relationship: an idea generated an instrument to test it; the instrument suggested other ideas that might be tested. We must ask whether strong inference, simpler biological systems, or instrumental opportunity played the greater role in the rapid progress Platt refers to in some fields. This question has been debated for decades and cannot possibly have a single, general answer (Fruton 1999). One of Platt's too understated points is that systematic formulation of multiple hypotheses and decisive falsifying tests contribute to, rather than assure, rapid progress. The entry of physicists into molecular biology in the 1940s led, as Gunther Stent (1968) put it, to "the introduction of previously unknown standards of experimental design." These standards, coupled with the traditions of genetics, included a taste for universality, model building, and clear-cut exclusionary experiments.
Among the latter was the Meselson-Stahl experiment (1958), which excluded conservative and dispersive models of DNA replication in favor of the semi-conservative mode. Another was the winnowing of the many early models of the genetic code that left the triplet code standing (Crick et al. 1961). Later developments bear out the pattern as chemiosmotic mechanisms of energy transduction, self-assembly of macromolecular aggregates, compartmental regulation of metabolism, and the genetic heterogeneity of natural populations supplanted older views. As a result, many biologists had adopted Platt's and Popper's admonitions without having heard of either writer.

Some Critiques
Platt's critics fall into two groups. One faults Platt, as they would Popper, for too heavy a reliance on falsification or on the requirement for multiple hypotheses for effective scientific practice. The other group faults him for poor historiography and for ignoring a host of factors that foster or impede research programs.
The first major critique of Platt's thesis, written by a nuclear physicist and a historian of science, appeared the year after Platt's article was published (Hafner and Presswood 1965). Entitled "Strong Inference and Weak Interactions," it tells in detail of the development of the present V-A theory of weak interactions in nuclear physics (which include nuclear beta decay) in the period 1949 to 1957. Fermi had proposed an early theory accounting for nuclear emission of an alpha particle and a beta emission (electron) with vector and axial vector properties (the V and A of the V-A designation). The "tortuous path" to the V-A theory of 1965 described by Hafner and Presswood includes the assumed conservation of parity in beta decay, later disproven; a persistent belief in certain prior experimental measurements recognized later as inaccurate; the designation of anomalous findings as inaccurate, recognized later as strong support for a newer theory; and at the end, an insistence on theoretical considerations that compelled repetition of anomalous experiments and getting different and more dependable results. The process looks more like a badminton game than climbing a logical tree, but it led to the weak interaction theory in a more precise form. As this example illustrates, Platt ignored the assessment of prior efforts and information in a field. Moreover, this oversight greatly complicated the choice of crucial experiments to test a model, a choice he left to investigators' imaginations.
Hafner and Presswood give strong inference its due. Some of the early work on the V-A theory can be seen as a process of exclusion of two or more contending hypotheses. In this as in many case studies, however, individual scientists gather data and formulate hypotheses consistent with them, hypotheses they test, yes-or-no, one at a time. This is simple inductive science, proven effective over the centuries. In such cases, strong inference comes into play only when opposing theories arise in a field, reaching a point at which, as Kuhn says, their proponents agree on what would constitute a crucial experiment. Thus strong inference is more often a social than an individual enterprise. It only appears to be an individual enterprise in fields in which data emerge so quickly that all members of the community have the competing views in mind at all times. In such cases, mature, competing theories are distinguished from one another in classical, tie-breaking tests, but strong inference may have had little role in the development of the clashing theories.
In another critique, the experimental psychologist McDonald (1992) reports experiments in which subjects were presented with problems of identifying, in two steps, the true category (e.g., "animal," known only to the experimenter) when presented with an example (e.g., "bird"). They were to use strategies of simple inference (two hypotheses in sequence, to be confirmed or disconfirmed) versus strong inference (two pairs of hypotheses, each pair's members to be distinguished from one another). The initial tests confirmed the superiority of strong inference, since a simple test of "bird" gave subjects increased confidence that "bird" was the true category. Disconfirmatory tests (e.g., "dog") tended to be used more regularly if two hypotheses were in play. Other experiments, however, showed that simple inference may be more effective than strong inference under other circumstances. This work was extended by Sanbonmatsu et al. (2005), who judged the effectiveness of confirmatory or disconfirmatory searches when choices were "graded" in categories of all, most, some, few, none. No exclusive, real-world rule emerges from these studies, largely because it is impossible to gauge the probability of the effectiveness of a given strategy. McDonald also points to a common experience in science: often falsification of a single ruling theory occurs with no alternative available to assimilate the result. Only at this point, a Kuhnian moment that is both social and scientific, does another hypothesis emerge, with many adherents of the previous theory still resisting change.
Finally, O'Donohue and Buchanan (2001) have published a thorough critique of Platt's article. They contend that strong inference is no more frequently used in fast-moving than in slower fields; that Platt's historiography misrepresents his examples; and that numerous other methods than strong inference are used successfully in many sciences. For instance, Platt downplayed the use of prior information in formulating hypotheses; failed to acknowledge the logical impossibility of enumerating all relevant hypotheses in a given case; and failed to show how the steps of strong inference are to be carried out. These criticisms also can be leveled against Popper, since he deliberately confined himself to the deductive process of falsification and little else. O'Donohue and Buchanan conclude: "Overall, there does not appear to be any evidence to suggest that SI [strong inference] accounts for what progress has occurred in science, what is currently being done in science, or is a method that will inevitably lead to scientific progress in the future. These problems may explain why despite SI being frequently touted as a regulative scientific method, it is infrequently used."
These critiques have an air of excess about them, given the brevity and colloquial tone of their target. Nevertheless, they demonstrate the complexity of scientific investigation, never easy to characterize and rarely confined to single procedural formulas. These points are made by Lakatos and by Feyerabend, and tacitly by Kuhn, some of whom these modern critiques duly cite, along with Popper. Platt, taken as seriously as the critics above take him, is dead wrong on many points. Indeed, Platt was not a serious philosopher of science: he simply brought home the Baconian lesson to well-domesticated scientists and introduced them to Popper. Platt's name appears nowhere in the later books of Kuhn, Lakatos, or Feyerabend. In fact, he is not even cited in Peter Medawar's The Art of the Soluble (1967), a book popular with scientists and nonscientists alike on the nature and philosophy of science. So why does Platt have such a following, especially outside the more rigorous fields of the natural sciences? Is "strong inference" more than a buzzword used to legitimize vaporous investigations as scientific? I argue that Platt imparted to many natural and social scientists an ambition to test hypotheses rather than to prove them.

Who Cares, and Why?
Despite the flaws of his argument, Platt addressed scientists in an engaging prose style rarely found in serious philosophical writing. Up to his time, few biologists had any grounding in the formalities of scientific rationales. They simply emulated their elders and, if they had the knack, improved on their work. Platt's article arguably induced many scientists to become interested in (or at least bemused by) the history and philosophy of science. More important, Platt energized many academics, particularly in some tired or intractable fields, by making them more self-conscious about their procedures. In some integrative fields, investigators believed Platt's methods included how to "think outside the box" through practice. This belief conflates insight (even genius) with rationale. Nevertheless, this rhetorical device, suggesting that "genius" might be learned, is a part of the inspirational character of Platt's article.
Physicists, chemists, geneticists, and molecular biologists were smug in light of Platt's use of their programs as models, and they had much to be smug about. Accordingly, their actual practices were least affected by Platt's arguments. In biology, however, the fields of embryology, ecology, taxonomy, neurobiology, cancer biology, pharmacology, epidemiology, and population biology had reached impasses or plateaus. The same could certainly be said of the social sciences. In many areas, ruling hypotheses prevailed. Platt stressed the need, if progress were to be made, to keep the "questions" in mind, and to formulate falsifying tests to clarify them. Some fields that later benefited from the strong-inference approach seemed least likely to be successful with it, while others, seemingly ripe for strong inference, could not use it. Let us look at the latter first.
At the middle of the 20th century, experimental embryology was a slow-moving field, despite the formulation and testing of multiple hypotheses. The field demonstrates how the nature of its problems and its technical limitations impeded progress. We have seen that genetics and molecular biology had the advantages of particulate phenomena, all-or-nothing mutations, discreteness of phenotypic characteristics defined by mutation, and finally chemical information about DNA, proteins, and enzymes that underlay the abstractions of early genetics. By contrast, embryology was, for want of a better word, theoretically "unitless." It had instead continuous changes with time, continuous fields and gradients, inducers and polarities, and gross dissections of early embryos having few morphological landmarks. Spemann pioneered work between 1900 and 1925 on embryonic induction, research characterized by strong-inference formulation and test of multiple hypotheses. As Spemann (1938) mused about the formation of the frog eye:
How is it that the lens begins to grow just at that spot in the epidermis where it is touched by the optic cup and exactly at that moment when the rudiment of the retina invaginates? Do these processes mutually influence each other, either in that the growing lens presses the retina inward or at least causes it to invaginate, or that the retina, while drawing in, starts the growth of the lens? Or do both processes go on independently of each other in self-differentiation of their respective rudiments, and does their exact fitting together depend upon previous and accurate tuning of the parts to a perfect harmony between them? (p. 44)
This was a good start in posing the questions, but even Spemann felt that theory would emerge only from further empirical research.
A huge literature on polarity, gradients, determination, potency, and the like eventually grew out of descriptive experiments, and these properties became the realities of embryology despite the lack of corresponding molecular information. Multiple hypotheses accumulated in the literature, simply because they were impossible to falsify (see especially Waddington 1962). A distaste for "idle speculation" prevailed in many quarters. Jane Oppenheimer observed in 1955 that "The greatest progressive minds of embryology have not searched for hypotheses; they have looked at embryos," and she actually chided another author for "wishing to concentrate on a few key data in order to derive the key hypotheses we require to proceed," rather than sticking to observations (p. 168). Progress remained slow until more precise mutational approaches to early embryogenesis, subcellular probes for RNA expression, and an influential, testable model of positional information (proposed by Wolpert in 1968) merged in the 1980s (Nüsslein-Volhard and Wieschaus 1980).
On the other hand, Platt has had his greatest impact in biology in the field of ecology. Ecologists were inspired to wrestle more productively with complex phenomena involving many causes and their possible interactions. In 1983, the American Naturalist devoted an entire issue to "A Round Table on Research in Ecology and Evolutionary Biology," largely focused on the persistent question of whether and how competition shaped biological communities. The round table pitted "commonsense" approaches against more disciplined, hypothesis-testing approaches, with a valuable discussion of the limits of a strong-inference rationale. As I will indicate, ecologists did not adopt Platt's rationale unthinkingly, but were moved by him to judge their methods much more critically.
As might be expected, the round-table participants reached no full agreement, and their discussion illuminates the difficulty of applying abstract methods to real, complex problems. In addition to citing incompatible philosophical views, they assessed the value of different ecological approaches. Roughgarden (1983) objected that theory simplified problems to the point that mathematical models took on an irrelevant life of their own. He objected to a study of Simberloff's group that claimed that competition and coevolution of competitors did not occur, based on a Popperian falsificationist "experiment." Simberloff (1983) denied that he had made such a claim, maintaining only that a null hypothesis (no influence of competition) was a good starting point; the negative outcome "narrowed the universe." Others, including Quinn and Dunham (1983), objected that a null hypothesis has little content and is therefore trivial, not a legitimate substrate for Popper-Platt methods. Citing Platt as their point of reference, Quinn and Dunham asked further questions: is it legitimate to test one of many factors in a complex system for its possible role? Is strong inference of value in multivariate tests, which involve non-exclusive factors that may or may not interact? In fact, how does an ecologist do clear-cut, hypothesis-testing experiments or make clear, falsifying observations? Roughgarden (1983) brushed such questions aside, pointing to the down-to-earth validity of observational "facts" that speak for themselves. The influence of Platt, often cited in the discussion, was to provoke a useful contest of ideas, to promote the self-consciousness I allude to above. Indeed, the next few years saw a number of articles invoking the history and philosophy of science as a guide in ecology and evolutionary biology (Atkinson 1985; Loehle 1987; Wenner 1989). Most of these (of which Loehle's article is the most thorough) cite Platt as the origin of the authors' interest in new approaches.
The ecologist A. M. Wenner (1989) diagrammed the positions of the truth-seeking realists Popper and Carnap, the relativists Chamberlin and Platt, who simply seek best-fit hypotheses, and Kuhn, who promoted the psychological aspects of theory change. Wenner claimed that multiple rationales (empirical exploration, falsification, verification, strong inference, modeling, and others) all have their place in real science. Exclusive use of one approach, he said, limits inquiry. In short, science needs the corrective of competing rationales, a polite echo of Feyerabend's anarchic position. While Wenner's theme might be described as a metarationale, he certainly saw a place for Platt's ideas in ecological studies.
An even later ecological study by Huey et al. (1999) extended the ecological conundrum posed above. These authors transformed the spirit of multiple hypotheses into a search for the relative impact of several different influences. Specifically, they sought to identify physical factors (called "hypotheses" in the paper) operating during juvenile or larval development on the heat sensitivity of the adults of three different organisms (Drosophila, Volvox, and Trichogramma). This approach, which Chamberlin certainly would have endorsed, contrasts both with pitting one hypothesis against another and with seeking data contradicting a null hypothesis. The approach is hard to distinguish from simple multivariate analysis, but it clearly draws on Chamberlin and Platt by imagining and assessing multiple factors. It also escapes the limitations of the Popper-Platt strategy of excluding all but one surviving hypothesis. Thus Platt's article, well after its publication, inspired ecologists and evolutionary biologists to explore the philosophy of science and thereby to improve the sophistication of their fields. They take from Platt and Chamberlin the idea that multiple hypotheses must be considered together in complex areas. Their problems are different in many ways from the better defined questions in chemistry, molecular genetics, or nuclear physics, and ecological studies point up the major difference: in the latter group of "hard" sciences, the clearest tests of a hypothesis are qualitative, while in ecology and psychology, decisions are quantitative and thus less exclusive or secure.
spring 2006 • volume 49, number 2

Beyond Biology
Platt has also influenced the field of psychology, another integrative field. One of many studies influenced by Platt was Dixon and Moore's (2000) examination of "developmental ordering" in infants. Methodologically similar to the study of Huey et al. (1999), it focused on such questions as whether skill A appears before skill B or vice versa; whether skills develop continuously or in saltatory fashion; whether the onsets of separate skills overlap; and whether measurement of continuous phenomena using categorical scales is legitimate. Dixon and Moore credited Platt as a guiding light in suggesting disconfirmatory tests and multiple hypotheses. Similarly, Carpenter et al. (1993) strongly advocated Platt's recommendations in resolving schizophrenia into its subcategories and correlating them with anatomical derangements of the brain with a strong-inference rationale.
Platt's article excited interest in other social sciences as well. From the outset, citations from journals of social psychology, education, social work, marketing, management, and economics proliferated in parallel with those from the natural sciences. Several recent examples are those of Sparks and Ganschow (1995) on barriers to foreign language learning, Kinraide and Denison (2003) on teaching students about hypothesis testing, Karson and Fisher (2005) on attitudes toward advertisements and the intention to buy, Staw and Cohen-Charash (2005) on personal dispositions and job satisfaction, and Fischhoff (2000) on means of estimating the future outcome of resource allocations by government agencies. Significantly, Platt is usually the sole authority for the social scientists' discussion of falsificationist strategies.

Conclusion
Platt purported to illustrate the use of strong inference in physics and molecular biology. In doing so, he cleverly idealized both his examples and the rationale of strong inference. He stressed that the rationale could be learned through systematic application. He inspired his readers to rethink their scientific goals and to use strong inference in their research programs. While Platt has had more admirers than critics, the strongest critiques of his recommendations were entirely justified. Indeed, some of the most dutiful admirers found themselves at sea as long as they tried to apply strong inference alone to intractable problems. However, neither his critics nor his admirers acknowledge fully the frank inspirational intent of his 1964 article. Platt achieved his goal of encouraging serious attacks on major problems and did so with colorful rhetoric, not nuanced philosophic rigor and historiography. A social message was that if investigators test multiple hypotheses prevailing in their field with disconfirmatory tests rather than simply defending their own views, science becomes more a game than a war.
While Platt left to his readers the job of imagining good experiments, he offered promise to those in slower fields that they might change their world by adopting new and simple ways of thought. It was not necessarily the subject matter of the traditional fields that made them so dull, he felt; it was a want of challenging alternatives and of an effective scientific rationale. Clever people might unclog the backwaters and impart to them the status of modern science. The article, I believe, encouraged better ideas, better choices of research problems, better model systems, and thus better science overall, even in the fields relatively resistant to the rigors of strong inference.