Given my ambitions for the content of this blog, a quick brush-up on the essentials of the **scientific method** seems to be in order. In this blog post I will provide an **overview** and some illuminating **examples**. My focus is primarily on the **scientific hypothesis** and the **testable predictions** derived from it. For more details on all the elements of the scientific method, including its history and plenty of references for further reading, Wikipedia is a good place to start.

I have been reading quite a few scientific articles recently, and it has occurred to me how poorly many of them **present** the hypothesis and the tested predictions to their readers. The information *is* there somewhere, hidden between descriptions of the theoretical framework and accounts of previous scientific efforts, but as a reader one often has to start over several times in order to get a clear picture. As a consequence, it can be difficult to separate the **actual testing results** from **exploratory investigations**.

After reading this blog post I hope you will find it just a little bit easier to identify the key elements in scientific articles.

## The hypothesis as part of the scientific method

The following illustration is presented to many a **university freshman** in one version or another. It depicts the human quest for knowledge as a never-ending process of observations leading to questions and investigations, resulting in the development of theory, which in turn gives rise to new observations, questions and so forth.

When observing an interesting phenomenon, the researcher will call upon established scientific theories for an explanation. If no satisfactory relationship between theory and observation can be found, the time has come to either **extend** existing theories or **replace** them with new and better ones. The scientific process is set in motion.

One of the core concepts of the scientific method is the **hypothesis**. The hypothesis is a statement that provides an answer to the questions raised by the observation. As scientific tradition has it, a hypothesis has to be sufficiently concrete to enable the derivation of **testable predictions**. Through the investigation of these predictions, the hypothesis can be **corroborated** or **falsified**. The distinction between the hypothesis and the testable prediction is not always obvious, and in statistical hypothesis testing (see below) the hypothesis and the prediction are often considered to be one and the same thing.

## Hypothesis testing: examples

The process of formulating meaningful predictions and testing them in order to corroborate or falsify a hypothesis is called **hypothesis testing**. How exactly this testing takes place depends on the type of predictions at hand, as the following examples serve to illustrate.

- *It is possible to sail from South America to Polynesia on a primitive raft.* Thor Heyerdahl (1914-2002) derived this prediction from the hypothesis that the first people to settle on the islands of the Pacific originated from Peru. This prediction stood the ultimate test in 1947, when Heyerdahl and his crew sailed from Peru to the Tuamotu Islands on a balsa raft in his famous **Kon-Tiki expedition**.
- *Venus has new and crescent phases, but no gibbous and full phases.* This is predicted by the **geocentric**^{1)} model for the movements of the planets (and the sun), which was the official view at the time of Galileo Galilei (1564-1642). He demonstrated this prediction to be false by observing Venus’ appearance in the night sky through a **telescope** (a brand new invention), registering new, crescent, gibbous and full phases, just like those of the moon. His discovery eventually led to a complete rejection of the geocentric hypothesis and its replacement by **heliocentric**^{2)} models.
- *The incidence of fatal childbed fever can be reduced by washing hands with chlorine.* Ignaz Semmelweis (1818-1865) introduced this practice in 1847 at the obstetric ward to which he had been appointed. Semmelweis adhered to the hypothesis that “cadaveric material” on the hands of the medical staff (who performed autopsies as well as assisting women during childbirth) was a cause of disease. His hand washing policy immediately caused the mortality rate among new mothers admitted to the clinic to drop from 10% to 3%, in accordance with his prediction.

In the last example, the prediction involves a comparison of two **numerical quantities**: the proportion of women contracting fatal childbed fever before and after a change in medical procedure. In order to be able to make a meaningful comparison of these two quantities, Semmelweis had to invoke methods of **statistical inference**: is the difference between the rates before and after large enough to be considered significant?^{3)}

## Statistical hypothesis testing

In the vast majority of cases, scientific hypothesis testing involves predictions regarding counts, amounts, or other kinds of numerical quantities. Data in numerical form is collected for evaluation, typically involving **repeated measurements** of some sort. For instance, the data collected by Ignaz Semmelweis in the third example above consisted of monthly counts of the number of births at his clinic and the number of deaths due to childbed fever.

Obviously, the number of women giving birth at the clinic varied from month to month, and so did the number of deaths. In order to test his prediction, Semmelweis had to compare the data from before the introduction of the hand washing protocol with the data afterwards: two *time series* of fluctuating counts (the number of deaths) and weights (the number of births).
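To make the role of the births as weights concrete, here is a minimal Python sketch. The function names and the monthly figures are made up for illustration; they are not Semmelweis’ records. It contrasts a pooled rate, where each month counts in proportion to its number of births, with a plain average of monthly rates, where every month counts equally.

```python
from typing import Sequence


def pooled_rate(births: Sequence[int], deaths: Sequence[int]) -> float:
    """Overall death rate, with each month weighted by its number of births."""
    if len(births) != len(deaths):
        raise ValueError("births and deaths must cover the same months")
    total_births = sum(births)
    if total_births == 0:
        raise ValueError("no births recorded")
    return sum(deaths) / total_births


def mean_monthly_rate(births: Sequence[int], deaths: Sequence[int]) -> float:
    """Unweighted average of the monthly rates (every month counts equally).

    Assumes every month has at least one birth.
    """
    rates = [d / b for b, d in zip(births, deaths)]
    return sum(rates) / len(rates)


# Hypothetical monthly counts, NOT historical data.
births = [100, 300, 200]
deaths = [20, 15, 25]

print(pooled_rate(births, deaths))        # 60 deaths / 600 births
print(mean_monthly_rate(births, deaths))  # average of 0.2, 0.05 and 0.125
```

With these invented numbers the two summaries disagree, because the quiet month with few births carries the highest rate. Whichever summary is chosen, the "before" and "after" series still have to be compared with each other.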

This raises the issue of deciding whether the data are in concordance with the predictions. In Semmelweis’ investigation the difference between the data “before” and “after” was huge, but in general, **intuition** is not considered to be sufficient as a decision making tool and the researcher has to invoke **statistical techniques**.

It is customary in statistical data analysis that the prediction to be tested is formulated as a *pair of hypotheses* (this is where statistical hypothesis testing basically treats the hypothesis and the prediction as one and the same thing). The prediction is labeled the **alternative hypothesis**. Typically, the alternative hypothesis postulates that “there is a difference”; in the case of Semmelweis, a difference in the observed incidence rates of fatal childbed fever. Its counterpart is the **null hypothesis**, which states that “there is no difference” or “any observed differences are random”.

In fact, hypothesis testing always implies a null hypothesis and an alternative hypothesis, even if statistical testing is not appropriate. In the latter case, the null hypothesis represents the **common understanding** of a phenomenon, whereas the alternative hypothesis is a **competing view**. In the case of Galilei, the contemporary viewpoint was the geocentric model of the universe, so this model was his null hypothesis. The alternative hypothesis was Galilei’s heliocentric model.^{4)}

Once the null hypothesis and the alternative hypothesis have been formulated, the testing proceeds by planting itself solidly in the soil of the null hypothesis. The statistical analysis of the data yields the **probability of obtaining data at least as extreme as those observed, under the assumption that the null hypothesis is true** (the p-value). Only if this probability is very small (the upper limit is typically set to 5%, in some applications even 1%) is the null hypothesis considered to be falsified and the **alternative hypothesis accepted**. Otherwise it has to be concluded that **no evidence against the null hypothesis was found**, resulting in its corroboration.

Falsification is considered to be the stronger result – the absence of evidence against the null hypothesis guarantees very little with regard to future investigations where evidence against the null hypothesis may actually be found.
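The decision rule described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses a standard two-proportion z-test with a normal approximation, treats the births as independent, and ignores the month-to-month structure of the data. The function name is hypothetical, and so are the sample sizes (1000 births per period); only the 10% and 3% rates are taken from the Semmelweis example above.

```python
import math


def two_proportion_z_test(deaths_before, births_before, deaths_after, births_after):
    """Two-sided z-test of the null hypothesis 'both periods share one death rate'.

    Returns the z statistic and the p-value (normal approximation).
    """
    rate_before = deaths_before / births_before
    rate_after = deaths_after / births_after
    # Under the null hypothesis, the common rate is estimated by pooling both periods.
    pooled = (deaths_before + deaths_after) / (births_before + births_after)
    std_error = math.sqrt(pooled * (1 - pooled) * (1 / births_before + 1 / births_after))
    if std_error == 0:
        raise ValueError("pooled rate is 0 or 1; the z-test is undefined")
    z = (rate_before - rate_after) / std_error
    # Probability, under the null hypothesis, of a difference at least
    # this large in either direction.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: 1000 births per period is an assumption;
# only the 10% and 3% rates come from the text.
z, p_value = two_proportion_z_test(100, 1000, 30, 1000)
alpha = 0.05  # the customary 5% threshold

print(f"z = {z:.2f}, p-value = {p_value:.2e}")
if p_value < alpha:
    print("Null hypothesis rejected in favour of the alternative hypothesis.")
else:
    print("No evidence against the null hypothesis was found.")
```

Under these assumed sample sizes the p-value falls far below 5%, which matches the remark that the difference in Semmelweis’ data was huge. With much smaller samples the same rates could easily fail to reach the threshold.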

## From hypothesis to theory

In most cases the hypothesis can be interpreted as a modification or extension of an existing theory, but in some cases the hypothesis represents an opposing view with far-reaching implications for “the world as we know it”. Especially if the hypothesis cannot be reconciled with the current understanding, the onus is on the scientist to formulate a theoretical framework which explains the **underlying mechanisms** for the hypothesis.

A lack of theoretical groundwork reduces the hypothesis to a “black box” where the questions preceding the hypothesis are not really answered at all. As a consequence, those to whom the new insights bear relevance will be reluctant to accept the validity of the results.

Semmelweis failed to provide a mechanism for the spread of disease from corpses to women in childbirth. The contemporary view of what caused disease was very different from today’s, and microorganisms such as bacteria were yet to be discovered. The hands of a gentleman were considered to be clean no matter what, and Semmelweis’ colleagues were indignant at the request to wash their hands. Like many other pioneers of science, Ignaz Semmelweis did not live to receive due credit for his discovery.

### Footnotes

^{1)}meaning all bodies circle around the Earth.

^{2)}meaning all bodies circle around the Sun.

^{3)}Further reading: Studies of the history of probability and statistics: Semmelweis and childbed fever. A statistical analysis 147 years later.

^{4)}See Explorable.com under “Development of the Null”.