From the Shadow of the Tetons: Null hypothesis

Monday, February 27, 2012

On Science 4: The Scientific Method

If you are like most people, you once heard or learned that the scientific method is a series of predefined, more or less rigid steps that, if followed like the recipe shown below, will result in "science." This view of how science is done is far too simple.

All the images in this posting are from a fantastic web site on the process of science put together by the good folks at UC Berkeley. You can access it at: http://undsci.berkeley.edu/article/howscienceworks_01.

To get back to the process of science...it is true that the general steps outlined above do matter and are part of what happens, but there are MANY ways that scientists work. A more accurate representation of how science is really done is shown below:

Rather than a static set of steps, the scientific approach to asking questions, obtaining empirical evidence, analyzing those data, and developing conclusions is a dynamic process. Most scientists start the process in the top sphere "Exploration and Discovery" where they either ask questions about new observations or see things differently through new technology, or identify a meaningful problem through curiosity and inspiration, but ideas and motivation for carrying out research can happen at any stage of the process and in any of the spheres in the figure above.

Regardless of where a scientist starts, the process of inquiry almost always includes the generation of a scientific argument. This step is shown in the upper half of the center sphere, "Testing Ideas." A scientific argument includes three parts - hypotheses (Null and Research Hypotheses - see the earlier posting on the Power and Strength of Science), expected observations, and actual observations. The Research Hypothesis is the researcher's best explanation of what they think the eventual answer will be, and it is used to generate a set of expected or predicted results. The researcher then uses one of any number of approaches to collect empirical observations. This can be done through experimentation, observation, etc. Once the empirical observations are in hand the researcher moves to the lower part of the central sphere and carries out an analysis of the results...that is, they usually use statistical methods or other mathematical tools to see how well the observed data and the predicted data match up. This can lead to a variety of outcomes.
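One common way to put a number on how well observed counts match predicted counts is Pearson's chi-square statistic. The sketch below is only an illustration: the 3:1 prediction and the counts are made up, and the cutoff of 3.841 is the standard 5% critical value for one degree of freedom. Note that in this kind of test the explanation being tested is "the data match the prediction."

```python
def chi_square_statistic(observed, expected):
    """Pearson's chi-square: sum of (observed - expected)^2 / expected."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical example: a prediction of a 3:1 ratio among 400 observations.
expected = [300, 100]
observed = [290, 110]  # made-up counts, for illustration only

# Standard 5% critical value of the chi-square distribution, 1 degree of freedom.
CRITICAL_5_PERCENT_DF1 = 3.841

stat = chi_square_statistic(observed, expected)
if stat > CRITICAL_5_PERCENT_DF1:
    print(f"chi-square = {stat:.3f}: observed counts differ from the prediction")
else:
    print(f"chi-square = {stat:.3f}: observed counts are consistent with the prediction")
```

A small statistic means the observed and predicted counts are close; a statistic above the critical value means the mismatch is larger than chance alone would usually produce.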

The actual observations may not support the expected observations, so the expected set and the hypothesis that led to it have to be upgraded, improved, changed, or even tossed out in favor of a better hypothesis and set of expected observations. This leads the researcher back to the top sphere or over to the sphere labeled "Community Analysis and Feedback." No matter what path a project follows through this overall process, the research must pass through the "Community Analysis" sphere at least once. This is where the observations and explanations are subjected to critical peer-review. Peer-review is a rather ruthless process where other scientists who are experts in the field look over the hypotheses, expected data, observed data, methods of analysis used to examine the data, the conclusions reached by the researcher, etc. The basic outline of what happens during peer review is shown in the figure below:

Major benefits of subjecting the outcome of a study to peer review include: minimizing or eliminating researcher bias in the results of the paper; rejecting papers that have poor research designs or inappropriate or internally inconsistent conclusions; and helping the researchers address weaknesses in writing, analysis, etc.

If the results and conclusions of a study make it through peer-review (usually with a significant amount of additional review and revision work) the study can be accepted for publication in a professional journal. Once this happens, the conclusions can inform new research in the top sphere or inform policy or be recognized as new knowledge, etc., in the sphere labeled "Benefits and Outcomes." This is also when the larger scientific community can access the work, assess its validity and value and, if desired, replicate the study or carry out studies similar to it that will help determine whether the conclusions of the paper hold up under additional scrutiny.

Remember that the Null Hypothesis is always the preliminary explanation that is tested. By so doing the scientific community eliminates false explanations, and thereby moves progressively closer to the true explanation about the question of interest.

Keep in mind that the outcome of any sphere can lead to any other sphere in the chart, and this dynamic, cyclic process is a much better representation of how science is actually done than the over-simplified "recipe" shown at the top of this posting.

Once a preliminary explanation has been tested using as many available approaches and sets of data as possible, and it still holds up, the hypothesis can become a theory - the most powerful kind of explanatory statement in science.

I heartily refer you to the website indicated above for the full treatment of how science works. It's well worth the time, and I have all of my general education students review this site as part of the course.

Saturday, February 25, 2012

On Science 3: The Power and Strength of Science

In earlier postings I shared some thoughts on the assumptions and limitations of science. This posting focuses on the power and strength of science as a way of learning and knowing about the natural world.

In my earlier postings you read that one limitation of science is that it can address only questions that are objective AND empirical, and another limitation is that there is no way we can be absolutely sure that a scientific explanation is correct. These are limitations, but they are also the basis for some of the great power and strength of scientific inquiry.

Strength #1: Independent confirmation or refutation via empirical evidence.

The scientific approach to answering objective questions about the natural world always includes the collection and analysis of empirical evidence. (Objective questions are those that have a definitive answer, and something that is empirical can be investigated through observations via our physical senses or technology that extends those senses.)

Because scientific explanations are based on empirical observations, anyone with access to the right kind of equipment (if needed) can replicate an experiment or collect their own observations independently, and independently test their own evidence to see whether a scientific conclusion is confirmed or refuted. If claimed evidence cannot be replicated, then the conclusion is put into significant doubt and others will carry out their own studies to add to the body of observations which eventually becomes so compelling that the original explanation is either accepted, modified to explain all available evidence, or rejected in favor of a different, but stronger explanation. Plus, this process of discovery and dissemination of scientific explanations includes independent, critical peer-review before it can be accepted for publication in a professional journal. Then, once published, the information is read and assessed by the larger scientific community that in most cases carries out independent tests that allow for independent confirmation or refutation before it is accepted by the larger scientific community as a viable explanation.

This approach minimizes researcher bias and the chances that poor methodology or faulty or poorly supported conclusions will make their way into the accepted body of scientific knowledge.

Strength #2: The self-correcting nature of science.

Statistical analysis of empirical data and consideration of new evidence as it becomes available are a routine part of most scientific studies. The results of these tests provide a statistical level of confidence in the hypotheses we test. These levels of confidence rest on mutually accepted thresholds, which in turn are based on critical values that statisticians have calculated for data sets of given sizes and for each kind of statistical test.

These critical values allow us to identify the level of confidence we have in a decision about a scientific explanation. The scientific community typically requires a researcher to be at least 95% confident that the Null Hypothesis should be rejected before the competing explanation (hypothesis) can be considered a viable possibility.
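To make "critical value" concrete, here is a minimal sketch that computes the cutoffs for three common confidence levels. It assumes the simplest textbook case, a two-sided test whose test statistic follows a standard normal distribution when the null hypothesis is true. Other tests use other distributions and therefore other cutoffs, and the function name is just an illustrative choice.

```python
from statistics import NormalDist

def two_sided_critical_z(confidence: float) -> float:
    """Return the value that |z| must exceed to reject the null hypothesis.

    Assumes the test statistic is standard normal when the null is true.
    """
    alpha = 1.0 - confidence  # allowed error rate, e.g. 0.05 for 95%
    # Split alpha evenly between the two tails of the distribution.
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)

for confidence in (0.95, 0.99, 0.999):
    cutoff = two_sided_critical_z(confidence)
    print(f"{confidence:.1%} confidence -> |z| must exceed {cutoff:.3f}")
```

Under these assumptions the cutoffs come out to about 1.960, 2.576, and 3.291. The more confidence we demand, the more extreme the result has to be before we reject the null.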

When a scientist carries out a research project they most often employ two preliminary explanations - hypotheses. One represents the researcher's best prediction of what the outcome or eventual explanation will be. This is called the Alternative or Research Hypothesis. The other hypothesis is called the Null Hypothesis. This hypothesis is a statement that says that the correct explanation is anything other than the Research Hypothesis. The Null Hypothesis is the one that a researcher tests and must decide whether to reject or fail to reject based on the analysis of empirical evidence. The decision about what to do about the Null is determined by the amount of possible error that is associated with the outcome of the statistical tests. What this means in practice is that a researcher must be more than 95% confident that the Null Hypothesis is NOT correct before they reject it. The other 5% represents the amount of error that exists in relation to that decision.

Actually, there are two types of error associated with this kind of decision-making. Applied to the Null Hypothesis, one type is the possibility of accepting it when it should have been rejected (a Type II error), and the other type is the chance of wrongly rejecting it when it should have been accepted (a Type I error).

So if the outcome of a statistical test shows a p-value (the probability of obtaining results at least as extreme as those observed if the null hypothesis were correct, which serves as the level of error in decision-making) that is greater than 0.05 or 5%, the researcher is compelled to fail to reject (i.e., accept) the null hypothesis. If the p-value is smaller than 0.05 (<5%), the researcher is compelled to reject the null hypothesis. Only if this happens can the researcher consider the research hypothesis as a viable possible explanation. It does not, however, mean that the research hypothesis is correct. It means only that it has not been rejected as a possible explanation.
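Here is a minimal sketch of that decision rule, using a permutation test as one simple way to estimate a p-value. Everything in it is an assumption made for illustration: the commute times are invented, the function names are mine, and a permutation test is only one of many possible statistical tests. The p-value here is the share of random relabelings that produce a difference at least as large as the one actually observed.

```python
import random
from statistics import mean

def permutation_p_value(group_a, group_b, n_shuffles=10_000, seed=0):
    """Estimate a two-sided p-value for the difference in group means.

    Null hypothesis: the group labels do not matter, so shuffling them
    should produce differences as large as the observed one fairly often.
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_extreme += 1
    # Add 1 to numerator and denominator so the estimate is never exactly 0.
    return (at_least_as_extreme + 1) / (n_shuffles + 1)

def decide(p_value, alpha=0.05):
    """Apply the conventional 5% decision rule to the null hypothesis."""
    return "reject the null" if p_value < alpha else "fail to reject the null"

# Made-up commute times in minutes, purely for illustration.
bike = [31, 29, 34, 30, 32, 28, 33, 31]
bus = [38, 41, 36, 40, 39, 42, 37, 40]

p = permutation_p_value(bike, bus)
print(f"p = {p:.4f}: {decide(p)}")
```

With these invented numbers the two groups barely overlap, so almost no shuffle matches the observed gap and the null is rejected. Two identical groups would give a p-value of 1 and the null would stand.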

Since this process of eliminating possible explanations has been going on systematically for around 300 years now, many, many weak or incorrect scientific explanations have been corrected or rejected in favor of better ones. What this also means is that whenever this process is applied, there is ALWAYS a margin of error, slim though it may be, associated with every decision. And it means that as hypotheses are tested and rejected or not, we get progressively closer to describing truths about the natural world and how it works.

About now I hope you are asking yourself "Is it possible to discover absolute truth through science?"

I strongly contend that not only can science discover absolute truth, but that it does so on a regular basis. The problem though is that while absolute truth can be discovered, it is impossible to be absolutely confident that what science has discovered is the absolute truth.

Because science cannot be absolutely confident in its discovery of truth, scientists continually test explanations with new sets of data collected in new ways or with new types of technology. Hypotheses that bear up under repeated testing become theories. Theories that have withstood the test of time and many repeated tests for validity are considered strong theories. Those that do not hold up under this type of scrutiny are either modified to explain all previously existing pertinent data and new data, or they are rejected in favor of new explanations that are able to explain all pertinent data.

This approach to testing, re-testing, and improving, or revising and replacing explanations is referred to as "The Self Correcting Nature of Science," which is one of the greatest strengths of the scientific approach to discovery of truth. Dr. John Moore expressed the power of this aspect of science when he wrote, "Great art is eternal; great science tends to be replaced by greater science" in his book, Science as a Way of Knowing: The Foundations of Modern Biology.

In conclusion, science relies on independent review in order to minimize the effects of personal bias and to maximize the quality of scientific explanations. It also includes a systematic process for eliminating weak or incorrect scientific explanations in favor of more complete or better-supported explanations. These two strengths make the scientific approach an extremely powerful way to discover truth about the natural world and how it works.

Tuesday, February 21, 2012

On Science 2: The Limitations of Science

The scientific approach is a powerful way to discover natural laws and describe how the universe works, but it is not all-powerful. Science has limitations that any student of the sciences needs to be aware of.

Limitation #1: Science is objective and empirical

The scientific approach can be used to address only questions that are objective and empirical. An objective question is one for which a definitive answer actually exists. And a question is empirical only if the answer can be discovered through the collection and analysis of empirical evidence, that is, observations made via the physical senses or technology that extends the senses.

An example of an objective, empirical question is: "What is the fastest way to get to work?" This question has an actual answer, and it can be discovered by collecting empirical observations on the amount of time it takes to walk, bike, drive, take a train, bus, helicopter, plane, etc., from point A to point B under a wide range of conditions.

An example of a question that is neither objective nor empirical is "What is the best way to get to work?" This question can be answered any number of different ways - fastest, most economical, most comfortable, most prestigious, most environmentally-minded, etc., depending on the opinion or perspective of the question asker. So the answer is based completely on the subjective measure a person may want to apply to it, and is therefore not scientific, because someone else is likely to apply a different subjective measure, etc.

An example of a question that is objective but not empirical is "Does God exist?" This question is objective because there are only two possible answers to the question, "Yes" and "No". But it is a non-empirical question since we have no way to collect empirical, demonstrable, repeatable observations that can be shown to anyone who wants to see them. This means that science simply cannot answer questions about things outside of our ability to observe and measure the physical/natural world. Consequently, science is God-neutral and religion-neutral. So if anyone ever tries to contend one way or another about the existence of God using only scientific evidence, you should realize that they don't understand the limitations of science, and a little warning flag should pop up in your head.

Limitation #2: Scientific explanations always contain an unavoidable element of uncertainty

One of the first things a scientist does when s/he prepares to do some research is to state what they think they will observe and explain what those observations might mean. This preliminary explanation is called a hypothesis, a Research Hypothesis to be more precise. It would be bad process, though, for a scientist to set out to prove their Research Hypothesis correct, so in order to reduce personal bias the scientist develops a second option or explanation that is the opposite or negation of their Research Hypothesis. This second explanation is called the Null Hypothesis.

After the researcher has collected as many empirical observations as possible, given time and other constraints, s/he analyzes those data. This usually involves the application of statistical methods. The interesting thing about this overall process is that the Research Hypothesis should never be the one that is tested. Rather, the Null Hypothesis is the possible explanation that the researcher has to decide to reject or fail to reject (i.e., accept). The really interesting thing about this process is that the researcher is forced to accept the Null Hypothesis unless the data and analysis of the data provide overwhelming evidence that the Null Hypothesis is not correct. How overwhelming? The scientific standard for most disciplines is that the researcher has to be at least 95% confident that the Null Hypothesis is not correct before it can be rejected. Of course, some data sets allow the researcher to be more than 95% confident. Sometimes they can be 99% confident, 99.9% confident, or even more, but no matter how confident a scientist is in their decision, they can never be 100% confident. The 5%, 1%, or 0.1% represents the chance of making a wrong decision regarding the Null Hypothesis - that it should have been rejected when it was accepted, or that it should have been accepted when it was rejected. This unavoidable element of uncertainty means that the researcher can never be absolutely 100% confident that their decision about the Null Hypothesis is the right one, though levels of confidence typically exceed reasonable doubt. I mean, I would LOVE it if I could be at least 95% confident that I was making the right choice whenever I make any kind of decision, wouldn't you?

You should be aware that there are people out there who use this unavoidable, but usually minuscule amount of uncertainty to say that, because there is some uncertainty in the data, we should feel free to reject any scientific explanation or conclusion that we don't happen to like. This is faulty logic, so don't be misled by the kind of misguided decision-making it promotes.

Limitation #3: It is possible to make the wrong decision regarding the Null Hypothesis, even if the scientist applies good process

As mentioned above in Limitation #2, it is possible that the decision to reject the Null rather than accept it is not the right one, even though the data suggest that this is the correct thing to do. It is possible, for instance, that the set of observations you collected is not really representative of a population or a real cause and effect relationship, and this can result in faulty decision-making.

Does this happen? Yes, occasionally, and more often when a scientist has only a small set of data than when they have a large one.
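A small simulation can show how sample size affects wrong decisions. This is a sketch under stated assumptions, not a model of any real study: each simulated "study" draws values from a normal population with standard deviation 1, the real effect (when there is one) is a hypothetical shift of 0.5, and the null hypothesis "the population mean is 0" is tested with a two-sided z-test at the 5% level.

```python
import random
from statistics import NormalDist

def wrong_decision_rate(sample_size, true_shift, trials=2000, alpha=0.05, seed=1):
    """Fraction of simulated studies that reach the wrong decision.

    Each study tests the null hypothesis "the population mean is 0" on a
    sample drawn from a normal population with mean `true_shift` and SD 1.
    """
    rng = random.Random(seed)
    critical_z = NormalDist().inv_cdf(1 - alpha / 2)
    null_is_true = (true_shift == 0)
    wrong = 0
    for _ in range(trials):
        sample = [rng.gauss(true_shift, 1.0) for _ in range(sample_size)]
        sample_mean = sum(sample) / sample_size
        z = sample_mean * sample_size ** 0.5  # standard error is 1/sqrt(n)
        rejected = abs(z) > critical_z
        # Wrong if we reject a true null or fail to reject a false one.
        if rejected == null_is_true:
            wrong += 1
    return wrong / trials

for n in (5, 20, 80):
    miss = wrong_decision_rate(n, true_shift=0.5)
    false_alarm = wrong_decision_rate(n, true_shift=0.0)
    print(f"n={n:3d}  missed real effect: {miss:.2f}  false alarm: {false_alarm:.2f}")
```

In this setup the false-alarm rate stays near 5% at every sample size, because the 5% cutoff fixes it by design. What small samples change is the chance of missing a real effect, which drops sharply as the sample grows.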

Is there a way to safeguard against or reduce the risk of making the wrong decision? Yes. The first line of defense is called the peer-review process. Before a scientist's work can be accepted for publication in a professional journal, their work is sent out to a handful of experts in the field. These experts provide a no-holds-barred critical review of the research that was done. They critique the overall research design, amount and type of data collected, the statistical and other analytical methods employed, and the conclusions reached by the researcher. This peer-review process catches most errors that exist in research papers before they are published. Then, once some research is published, other researchers will read it and some of them will carry out independent tests of their own to see if they obtain similar data and conclusions. If they do, then the conclusions of the original research are supported. If not, they are refuted, and the other researchers publish their work after peer-review. This process of independent replication catches most of the other errors, but even so we will never be 100% confident of our decisions, though our confidence in the decision typically exceeds any reasonable person's criteria for surpassing reasonable doubt.

Conclusion

So to sum up, there are three primary limitations of science:

Science can address only objective, empirical questions.
Scientific conclusions and explanations always contain at least a tiny amount of uncertainty.
Scientists can make the wrong decision about what to accept when they do their research.

I hope that this overview has been helpful and will help you understand what the main limitations of science are, and how risks of reaching faulty decisions and explanations can be minimized.