Scientific method


The scientific method is how scientists investigate phenomena and acquire new knowledge. It is based on observable, empirical, measurable evidence. Scientists propose hypotheses to explain phenomena, and test those hypotheses by examining the evidence from experimental studies. Scientists also formulate theories that encompass whole domains of inquiry and bind hypotheses together into logically coherent wholes.

"Science is a way of thinking much more than it is a body of knowledge." (Carl Sagan[1]).

Elements of scientific method

According to Charles Darwin,

". . .science consists in grouping facts so that general laws or conclusions may be drawn from them." [2]

Darwin's simple account raises many questions. What do we mean by ‘facts’? How much can we trust our senses to enable us to believe that what we see is true? How do scientists ‘group’ facts? How do they select which facts to pay attention to, and is it possible to do this in an objective way? And having done this, how do they draw any broader conclusions from the facts that they assemble? How can we know more than we observe directly?

The English philosopher Francis Bacon is often credited as the pioneer of the modern scientific method. He proposed that scientists should "empty their minds" of self-evident truths and, by 'observation and experimentation', should generate hypotheses by a process of induction. He nevertheless recognised that interpreting nature needs something more than observation and reason:

...the universe to the eye of the human understanding is framed like a labyrinth, presenting as it does on every side so many ambiguities of way, such deceitful resemblances of objects and signs, natures so irregular in their lines and so knotted and entangled. And then the way is still to be made by the uncertain light of the sense, sometimes shining out, sometimes clouded over, through the woods of experience and particulars; while those who offer themselves for guides are (as was said) themselves also puzzled, and increase the number of errors and wanderers. In circumstances so difficult neither the natural force of man's judgement nor even any accidental felicity offers any chance of success. No excellence of wit, no repetition of chance experiments, can overcome such difficulties as these. Our steps must be guided by a clue... [3]

We live in a world that is not directly understandable. We sometimes disagree about the ‘facts’ we see around us, and some things in the world are at odds with our understanding. What we call the “scientific method” is an account of how scientists attempt to reach agreement and understanding, how they gather and report observations in ways that will be understood by others and accepted as valid evidence, how they construct explanations that will be consistent with the world, that will withstand critical logical and experimental scrutiny, and that will provide the foundations for further increases in understanding.

The success of science, as measured by the technological achievements that have progressively changed our world, has led many to conclude that this must reflect the success of some methodological rules that scientists follow in their research. However, not all philosophers accept this conclusion; notably, the philosopher Paul Feyerabend denied that science is genuinely a methodological process. In his book Against Method he argued that scientific progress is not the result of applying any particular rules [4]. Instead, he concluded, in effect, that "anything goes", in that for any particular ‘rule’ there are abundant examples of successful science that have proceeded in a way that seems to contradict it. [5] To Feyerabend, there is no fundamental difference between science and other areas of human activity characterised by reasoned thought. A similar sentiment was expressed by T.H. Huxley in 1863: "The method of scientific investigation is nothing but the expression of the necessary mode or working of the human mind. It is simply the mode at which all phenomena are reasoned about, rendered precise and exact." [6]

Nevertheless, in the Daubert v. Merrell Dow Pharmaceuticals [509 U.S. 579 (1993)] decision, the U.S. Supreme Court accorded a special status to ‘the scientific method’, in ruling that "… to qualify as ‘scientific knowledge’ an inference or assertion must be derived by the scientific method. Proposed testimony must be supported by appropriate validation - i.e., ‘good grounds,’ based on what is known." The Court also stated that "A new theory or explanation must generally survive a period of testing, review, and refinement before achieving scientific acceptance. This process does not merely reflect the scientific method, it is the scientific method."[7]

Hypotheses and theories

Hypotheses and theories play a central role in science; the idea that an observer can study the world other than through the spectacles of his or her preconceptions and expectations is not sustainable. As preconceptions change with a progressively changing understanding of the world, the nature of science itself changes, and what was once considered conventionally scientific might no longer seem so.

A hypothesis is a proposed explanation of a phenomenon. It is an “inspired guess”, a “bold speculation”, embedded in current understanding yet going beyond it to assert something that we do not know for sure, as a way of explaining something not otherwise accounted for. Scientists use many different means to generate hypotheses, including their own creative imagination, ideas from other fields, induction and Bayesian inference. Charles Sanders Peirce described the incipient stage of inquiry, instigated by the "irritation of doubt" to venture a plausible guess, as abductive reasoning. The history of science is filled with stories of scientists claiming a "flash of inspiration", or a hunch, which then motivated them to look for evidence to support or refute their idea. Michael Polanyi made such creativity the centrepiece of his discussion of methodology.

If a hypothesis has any scientific content, then it will lead to predictions, and the hypothesis can be tested by experiments to see whether these predictions are fulfilled or not. If the predictions prove wrong, the hypothesis is discarded; otherwise the hypothesis is put to further test, and if it resists determined attempts to disprove it, then it might come to be accepted, at least for the moment, as plausibly true.

The philosopher Karl Popper, in The Logic of Scientific Discovery, a book that Sir Peter Medawar called "one of the most important documents of the 20th century", argued that this 'hypothetico-deductive' method was the only sound way by which science makes progress. He argued that the alternative process of induction - of gathering facts, considering them, and inferring general laws - is logically unsound, as many mutually inconsistent hypotheses might be consistent with any given set of facts. Popper concluded that for a proposition to be considered scientific, it must, at least in principle, be possible to make an observation that would show it to be false. Otherwise the proposition has, as Popper put it, no connection with the real world. Popper argued that the explanations of Freudian psychoanalysis, those of Marxism, and those of astrology, were all ‘empty’ and unscientific in this sense. [8]

For Popper, a theory has a profound importance; it encompasses the preconceptions by which the world is viewed, and defines what we choose to study, how we study it, and how we understand it. He recognised that theories are not discarded lightly, and that a theory might be retained long after it has been shown to be inconsistent with many known facts (anomalies). However, the recognition of anomalies drives scientists to elaborate or adjust the theory, and, if the anomalies continue to accumulate, will drive them to develop alternative theories. Theories also always contain many elements that are not falsifiable, but Popper argued that these should be kept to a minimum, and that the content of a theory should be judged by the extent to which it inspired testable hypotheses. This was not the only criterion in choosing a theory; scientists also seek theories that are "elegant": a theory should yield clear, simple explanations of complex phenomena that are intellectually satisfying in being logically coherent, rich in content, and involving no miracles or other supernatural devices.

Popper's views were in marked contrast to those of his contemporary, the historian of science Thomas Kuhn. Kuhn's own book, "The Structure of Scientific Revolutions", was no less influential than Popper's, but its message was very different. Kuhn analysed times in the history of science when one dominant theory was replaced by another - such as the replacement of Ptolemy's geocentric model of the Universe with the Copernican heliocentric model, and the replacement of Newtonian laws of motion with Einstein's theory of Relativity. In many respects Popper was asserting what he held to be the rules for "good science"; Kuhn, on the other hand, considered himself to be reporting what scientists actually did, although he believed that, as what they did was undeniably successful, there was probably merit in it. Kuhn concluded that falsifiability had in fact played almost no role in these "scientific revolutions" in which one paradigm was replaced by another. He argued that scientists working in a field form a closed group. The group resists attempts from outside to offer alternative interpretations, and tenaciously defends its world view by continually elaborating the shared theory. This mode, when one theory or paradigm is dominant, is when "normal science" is done and, according to Kuhn, when most progress is made, by what he called "puzzle solving" in a way that extends the scope and explanatory power of the dominant theory. He argued that one theory is replaced by another not because scientists are "converted" to a different world view; rather, a new theory starts as an unfashionable alternative that gains adherents as its advantages become apparent to new scientists entering the field. Seldom are experiments decisive in refuting one theory and imposing a new one; Kuhn argued that theories are "incommensurable" - one theory cannot be tested by the assumptions of another, and for the adherents of a theory, "once it has been adopted by a profession ... no theory is recognized to be testable by any quantitative tests that it has not already passed". [9]

Experiments and observations

Werner Heisenberg, in a remark that he attributed to Albert Einstein, stated [Heisenberg 1971]:

The phenomenon under observation produces certain events in our measuring apparatus. As a result, further processes take place in the apparatus, which eventually and by complicated paths produce sense impressions and help us to fix the effects in our consciousness. Along this whole path—from the phenomenon to its fixation in our consciousness—we must be able to tell how nature functions, must know the natural laws at least in practical terms, before we can claim to have observed anything at all. Only theory, that is, knowledge of natural laws, enables us to deduce the underlying phenomena from our sense impressions.

For much of the 20th century, the dominant approach to science has been reductionism – the attempt to explain all phenomena in terms of basic laws of physics and chemistry. This driving principle has ancient roots - Francis Bacon (1561-1626) quotes Aristotle favourably as declaring "That the nature of everything is best seen in his smallest portions." [10] In many fields, however, reductionist explanations are impractical, and all explanations involve 'high level' concepts. Nevertheless, the reductionist belief has been that these high level concepts are all ultimately reducible, and that the role of science is to explain high level concepts progressively by concepts closer and closer to the basic physics and chemistry.

For example, to explain the behaviour of individuals we might refer to motivational states such as hunger or anxiety. We believe that these reflect features of the activity of the brain that are still poorly understood, but we can investigate the brain areas that house these drives, calling them, for example, “hunger centres”. These centres each involve many neural networks – interconnected nerve cells – and the functions of each network we can again probe in detail. These networks in turn are composed of specialised neurons, whose behaviour can be analysed individually. These nerve cells have properties that are the product of a genetic program that is activated in development – and so reducible to molecular biology. However, while behaviour is thus reducible to basic elements, explaining the behaviour of an individual in terms of these elements has little predictive value, because the uncertainties in our understanding are too great; explanations of behaviour therefore still largely depend upon the high level constructs.

Historically, the converse philosophical position to reductionism has taken many names, but the clearest debate was between “vitalism” and reductionism. Vitalism held essentially that some features of living organisms, including life itself, were not amenable to a physico-chemical explanation, and so asserted that high level constructs were essential to understanding and explanation.

The reductionist approach assigned a particular importance to the measurement of observable quantities. Measurements may be tabulated, graphed, or mapped, and statistically analysed; often these representations of the data use tools and conventions that are, at a given time, accepted and understood by scientists within a given field. Measurements may need specialized instruments such as thermometers, microscopes, or voltmeters, whose properties and limitations are familiar within the field, and scientific progress is usually intimately tied to their development. Measurements also need operational definitions: a scientific quantity is defined precisely by how it is measured, in terms that enable other scientists to reproduce the measurements, and which may ultimately involve internationally agreed ‘standards’. For example, electrical current, in amperes, can be defined in terms of the mass of silver deposited in a certain time on an electrode in an electrochemical device that is described in some detail. The scientific definition of a term sometimes differs substantially from its natural language use. For example, mass and weight overlap in meaning in common use, but have different meanings in physics. Scientific quantities are often characterized by units of measure which can be described in terms of conventional physical units.

All measurements carry the possibility of error, and so may be accompanied by estimates of their uncertainty. This is often estimated by making repeated measurements and seeing by how much these differ. Counts of things, such as the number of people in a nation at a particular time, may also have an uncertainty: counts may represent only a sample of the desired quantities, with an uncertainty that depends upon the sampling method used and the number of samples taken.
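
For instance, the uncertainty in a repeatedly measured quantity is commonly summarised by the mean of the readings together with the standard error of that mean. A minimal Python sketch, using invented voltage readings purely for illustration, shows the calculation:

    import math
    import statistics

    # Five repeated readings of the same quantity (invented values,
    # here treated as a voltage in volts).
    readings = [5.01, 4.98, 5.03, 4.97, 5.02]

    mean = statistics.mean(readings)
    # The sample standard deviation describes the scatter of individual
    # readings; the standard error of the mean estimates the uncertainty
    # of the averaged result and shrinks as more readings are taken.
    std_dev = statistics.stdev(readings)
    std_error = std_dev / math.sqrt(len(readings))

    print(f"measured value: {mean:.3f} ± {std_error:.3f}")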

The scientific method in practice

The UK research charity Cancer Research UK gives an outline of the scientific method as practised by its scientists [6]. The quotes that follow are all from this outline.

[Scientists] start by making an educated guess about what they think the answer might be, based on all the available evidence they have. This is known as forming an hypothesis. They then try to prove if their hypothesis is right or wrong. Researchers carry out carefully designed studies, often known as experiments, to test their hypothesis. They collect and record detailed information from the studies. They look carefully at the results to work out if their hypothesis is right or wrong…

Once predictions are made, they can be tested by experiments. If the outcome contradicts the predictions, then explanations may be sought before the hypothesis is discarded as false; sometimes there is a flaw in the experimental design, recognised only in retrospect. If the results confirm the predictions, the hypothesis might still be wrong, and, if it is important, it will be subjected to further testing. Scientists keep detailed records, both to provide evidence of the effectiveness and integrity of the procedure and to ensure that the experiments can be reproduced reliably. This tradition can be seen in the work of Hipparchus (190 BCE - 120 BCE), who determined a value for the precession of the equinoxes over 2,100 years ago, some 1,000 years before Al-Batani.

Peer review

…Once they have completed their study, the researchers write up their results and conclusions. And they try to publish them as a paper in a scientific journal. Before the work can be published, it must be checked by a number of independent researchers who are experts in a relevant field. This process is called ‘peer review’, and involves scrutinising the research to see if there are any flaws that invalidate the results…

Manuscripts submitted for publication in scientific journals are normally sent by the editor to a small number (usually one to three) of fellow scientists who are familiar with the field, usually anonymously, for evaluation. The referees advise the editor on the suitability of the paper for publication in the journal and report on its strengths and weaknesses, pointing out any errors or omissions that they have noticed and offering suggestions for how the paper might be improved by revision or by additional experiments. On the basis of these reports, the editor will then either reject the paper or advise that it might be acceptable if appropriately revised. The peer review process has been criticised, but is very widely adopted by the scientific community. Nevertheless, there are weaknesses: it is very much easier to publish data that are consistent with generally accepted theory than to publish data that contradict it; the 'bar' for acceptance of work is higher the more remarkable the claim. This helps to ensure the stability of the body of accepted theory, but it also means that the apparent extent to which a conventionally accepted theory is supported by evidence might be misleading - boosted by poor-quality supportive work and protected against higher-quality opposing work.

On the other hand, originality, importance and interest are particularly valued in 'high impact' general journals of science [11]; thus, if controversial work appears to be very convincing, it stands a good chance of being published in such journals. Critics of journal publication priorities argue that these criteria are so vague, subjective and open to ideological or political manipulation that they can seem to impede rather than promote scientific discovery. Apparent censorship, through refusal to publish ideas unpopular with mainstream scientists, has soured the popular perception of scientists by apparently contradicting their claim to be objective seekers of truth.

The scientific literature

…If the study is found to be good enough, the findings are published and acknowledged by the wider scientific community…

However Thomas Kuhn argued that scientists are

Sir Peter Medawar, Nobel laureate in Physiology or Medicine, in his article “Is the scientific paper a fraud?”, answered yes: "The scientific paper in its orthodox form does embody a totally mistaken conception, even a travesty, of the nature of scientific thought." In scientific papers, the results of an experiment are interpreted only at the end, in the discussion section, giving the impression that the conclusions are drawn by induction or deduction from the reported evidence. Instead, explains Medawar, the expectations that a scientist begins with provide the incentive for the experiments and determine their nature, and they determine which observations are relevant and which are not. Only in the light of these initial expectations do the activities described in a paper have any meaning at all. The expectation, the original hypothesis, according to Medawar, is not the product of inductive reasoning but of inspiration, of educated guesswork. Medawar was echoing Karl Popper, who proclaimed that

Confirmation

…But, it isn’t enough to prove a hypothesis once. Other researchers must also be able to repeat the study and produce the same results, if the hypothesis is to remain valid…

Sometimes experimenters make systematic errors during their experiments. Consequently, it is common for other scientists to attempt to repeat experiments, especially those that have yielded unexpected results [12]. Accordingly, scientists keep detailed records of their experiments, to provide evidence of their effectiveness and integrity and to assist in reproduction. However, a scientist cannot record everything that took place in an experiment; he must select the facts that he believes are relevant. This may lead to problems if some supposedly irrelevant feature is questioned. For example, Heinrich Hertz did not report the size of the room that he used to test Maxwell's equations, and this turned out to account for a deviation in the results. The problem is that parts of the theory must be assumed in order to select and report the experimental conditions. Observations are thus sometimes described as being 'theory-laden'.

It seems to be only very rarely that scientists falsify their results; anyone who does so takes an enormous risk, because if the claim is important it is likely to be subjected to very detailed scrutiny, and the reputation of a scientist depends upon the credibility of his or her work. Nevertheless, there have been many well publicised examples of scientific fraud, and some have blamed these on the insecurity of scientists' employment and the extreme pressure to win grant funding. Under Federal regulations [13], "A finding of research misconduct requires that: There be a significant departure from accepted practices of the relevant research community; and The misconduct be committed intentionally, or knowingly, or recklessly; and The allegation be proven by a preponderance of evidence."

Honor in Science, published by Sigma Xi, quotes C.P. Snow (The Search, 1959): "The only ethical principle which has made science possible is that the truth shall be told all the time. If we do not penalise false statements made in error, we open up the way, don’t you see, for false statements by intention. And of course a false statement of fact, made deliberately, is the most serious crime a scientist can commit." It goes on to say: "It is not sufficient for the scientist to admit that all human activity, including research, is liable to involve errors; he or she has a moral obligation to minimize the possibility of error by checking and rechecking the validity of the data and the conclusions that are drawn from the data."

Statistics

…If the initial study was carried out using a small number of samples or people, larger studies are also needed. This is to make sure the hypothesis remains valid for bigger group and isn't due to chance variation…

Statistical analysis is a standard part of hypothesis testing in many areas of science. This use of statistics formalises the criteria for disproof by allowing statements of the following form: "If a given hypothesis is true, the chance of getting the results that we observed is (say) only 1 in 20 or less (P < 0.05), so it is very likely that the hypothesis is wrong, and accordingly we reject it."

This notion of a hypothesis is quite different to Popper's. For instance, we might predict that a certain chemical will produce a certain surprising effect. However, what we test is often not this but the complementary null hypothesis - that the chemical will have no effect. The reason for this shift is that, if our original hypothesis tells us that there will be an effect but is vague about its expected magnitude, we can still logically disprove the null hypothesis (by showing an effect), even though we cannot disprove the hypothesis that the chemical is effective, because we cannot exclude the possibility that the effect is smaller than we can measure reliably.

The best answer might be to choose hypotheses that give quantitatively precise predictions, but in many areas of science this is unrealistic. In medicine, for example, we might expect a new drug to be effective in a particular condition from our understanding of its mechanism of action, but we might be very uncertain how big the effect will be, because of many uncertainties: how many individuals in a genetically variable population will be resistant to the drug, for example, and how quickly will tolerance to the drug develop in individuals that respond well?

To make this clearer: the scientist starts with an original hypothesis, a bold speculation, consistent with existing theory but extending it in some way. The scientist then tests the hypothesis by deriving a prediction - a proposition that will be true if the hypothesis is true but would not be expected to be true otherwise. The scientist may then design an experiment to test the null hypothesis - the assertion that the prediction is false. If the null hypothesis is disproved, the original hypothesis survives.
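
A minimal Python sketch of such a test, using invented measurements of a response with and without a hypothetical chemical and an off-the-shelf two-sample t-test, illustrates the logic: a small P value counts against the null hypothesis of 'no effect':

    from scipy import stats

    control   = [10.1, 9.8, 10.3, 9.9, 10.0, 10.2]   # response without the chemical
    treatment = [11.0, 10.7, 11.3, 10.9, 11.1, 10.8]  # response with the chemical

    # Two-sample t-test of the null hypothesis that the two means are equal.
    # A small P value means results this extreme would be unlikely if the
    # chemical really had no effect.
    t_stat, p_value = stats.ttest_ind(control, treatment)

    if p_value < 0.05:
        print(f"P = {p_value:.4f}: reject the null hypothesis of no effect")
    else:
        print(f"P = {p_value:.4f}: the data do not rule out 'no effect'")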

In fact, this is not hypothesis testing in Popper's sense at all, because this type of design does not put the original hypothesis itself at any hazard of disproof. Verification of this type is something that Popper considered to be, at best, weak corroborative evidence. Part of that weakness comes because it is impossible to put any sensible measure on the degree of support that such evidence gives to a hypothesis. [14]

An important school of statistics, Bayesian statistics, seeks to provide a formal basis for support by induction, and some areas of science use these approaches; but often the approach is not tenable, because of the difficulty of attaching a priori probabilities in any meaningful way to the alternative predicted outcomes of an experiment.
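
A toy Python example, with invented probabilities, illustrates the Bayesian form of support by induction and, equally, where the difficulty lies: the posterior depends on numbers that must be assigned before the experiment.

    # Bayesian updating of belief in a hypothesis H given evidence E.
    # All probabilities below are invented for illustration.
    prior_h = 0.2          # P(H): prior probability assigned to the hypothesis
    p_e_given_h = 0.9      # P(E|H): chance of the evidence if H is true
    p_e_given_not_h = 0.3  # P(E|not H): chance of the evidence if H is false

    # Bayes' theorem: P(H|E) = P(E|H) * P(H) / P(E)
    p_e = p_e_given_h * prior_h + p_e_given_not_h * (1 - prior_h)
    posterior_h = p_e_given_h * prior_h / p_e

    print(f"prior P(H) = {prior_h:.2f}, posterior P(H|E) = {posterior_h:.2f}")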

Progress in science

…Over time, scientific opinion can change. This is because new technologies can allow us to re-examine old questions in greater detail.


Einstein's theory of General Relativity makes several specific predictions about the observable structure of space-time, such as a prediction that light bends in a gravitational field and that the amount of bending depends in a precise way on the strength of that gravitational field. Arthur Eddington's observations made during a 1919 solar eclipse supported General Relativity rather than Newtonian gravitation.
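
The contrast between the two predictions can be made quantitative: general relativity gives a deflection of 4GM/(c²R) for light grazing the Sun, roughly twice the value obtained from a Newtonian corpuscular calculation. The short Python sketch below, using standard textbook values, yields the figures (about 1.75 and 0.87 arcseconds) that the 1919 observations set out to distinguish:

    import math

    G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
    M_SUN = 1.989e30   # solar mass, kg
    C = 2.998e8        # speed of light, m/s
    R_SUN = 6.957e8    # solar radius, m (impact parameter of a grazing ray)

    RAD_TO_ARCSEC = 180 / math.pi * 3600

    einstein = 4 * G * M_SUN / (C**2 * R_SUN) * RAD_TO_ARCSEC
    newtonian = einstein / 2

    print(f"Einstein's prediction:  {einstein:.2f} arcseconds")
    print(f"Newtonian prediction:   {newtonian:.2f} arcseconds")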

See Also

  • Models of scientific inquiry
  • Pseudoscience


Notes and references

  1. Sagan C. The fine art of baloney detection. Parade Magazine, pp 12-13, Feb 1, 1987.
  2. From the autobiography of Charles Darwin [1]
  3. From the Preface to The Great Instauration, 4.18; quoted in Pesic P (2000) The clue to the labyrinth: Francis Bacon and the decryption of nature. Cryptologia [2]
  4. Feyerabend PK (1975) Against Method: Outline of an Anarchistic Theory of Knowledge. Reprinted, Verso, London, UK, 1978
  5. Feyerabend's "anything goes" argument explained at the Galilean Library
  6. Huxley TH (1863) From an 1863 lecture series aimed at making science understandable to non-specialists
  7. Text of the opinion, LII, Cornell University; Daubert-The Most Influential Supreme Court Decision You've Never Heard of
  8. As expressed by Peter Singer, [3] "That an interpretation of history is capable of ordering a good many facts does not demonstrate its truth, any more than do similar claims about past conjunctions between the positions of the planets and events on earth demonstrate the truth of astrology. The historicist, like the astrologer, must say what future developments, or new discoveries about the past, would refute his theory; and if he cannot or will not do so, his claim to scientific knowledge need not be taken seriously."
  9. Kuhn TS (1961) The Function of Measurement in Modern Physical Science ISIS 52:161–193
    • Kuhn TS (1962) The Structure of Scientific Revolutions, University of Chicago Press, Chicago, IL. 2nd edition 1970; 3rd edition 1996
    • Kuhn TS (1977) The Essential Tension, Selected Studies in Scientific Tradition and Change, University of Chicago Press, Chicago, IL
  10. Francis Bacon 'The Advancement of Learning' [4]
  11. See, for example, the author guidelines for Nature
  12. Georg Wilhelm Richmann was killed by lightning (1753) when attempting to replicate the 1752 kite experiment of Benjamin Franklin. See, e.g., Physics Today, Vol. 59, #1, p42. [5]
  13. the Federal Register, vol 65, no 235, December 6, 2000
  14. In appendix ix to The Logic Popper states: "As to degree of corroboration, it is nothing but a measure of the degree to which hypothesis h has been tested...it must not be interpreted therefore as a degree of the rationality of our belief in the truth of h...rather it is a measure of the rationality of accepting, tentatively, a problematic guess."


Further reading

  • The Keystones of Science project, sponsored by the journal Science, has selected a number of scientific articles from that journal and annotated them, illustrating how different parts embody the scientific method. Here is an annotated example.
  • Bacon, Francis Novum Organum (The New Organon), 1620. Bacon's work described many of the accepted principles, underscoring the importance of Theory, empirical results, data gathering, experiment, and independent corroboration.
  • Dewey, John (1991) How We Think, D.C. Heath, Lexington, MA, 1910. Reprinted, Prometheus Books, Buffalo, NY
  • Heisenberg, Werner (1971) Physics and Beyond, Encounters and Conversations, A.J. Pomerans (trans.), Harper and Row, New York, NY pp.63–64
  • Latour, Bruno, Science in Action, How to Follow Scientists and Engineers through Society, Harvard University Press, Cambridge, MA, 1987.
  • McComas WF, ed. The Principal Elements of the Nature of Science: Dispelling the Myths, from The Nature of Science in Science Education, pp 53-70, Kluwer Academic Publishers, Netherlands, 1998.
  • Poincaré H (1905) Science and Hypothesis Eprint

External links