Hypothesis testing is the process of evaluating a proposed hypothesis or claim about a population parameter against evidence from sample data.
This article covers the underlying concept of hypothesis testing in statistics, its types and steps, the significance level, the p-value, and common errors in hypothesis testing.
What is Hypothesis testing?
Hypothesis testing is a core part of statistical and scientific research. Using a sample of data, it draws inferences about a population by assessing how likely the observed data would be if the null hypothesis were true.
Hypothesis testing is used in different domains like business, healthcare, and engineering to make informed decisions.
Types of Hypothesis Testing
 Null Hypothesis (H0) and Alternative Hypothesis (Ha) –
The null hypothesis states that there is no difference or association between variables, so any observed difference is merely due to chance. It is the hypothesis that is put to the test. The alternative hypothesis states that there is a difference or association between variables. It is the hypothesis accepted when the null hypothesis is rejected, that is, when the observed difference is unlikely to be due to chance alone.
 Simple and Composite Hypothesis Testing –
In hypothesis testing, the null and alternative hypotheses can each be either simple or composite. A simple hypothesis specifies a single value for a population parameter. For example, the null hypothesis that the mean height of a population is 173 cm is a simple hypothesis. A composite hypothesis specifies a range of values for a population parameter. For example, the alternative hypothesis that the mean height of a population is not 173 cm is a composite hypothesis.
 One-Tailed and Two-Tailed Hypothesis Testing –
In hypothesis testing, alternative hypotheses can be one-tailed or two-tailed. A one-tailed alternative hypothesis specifies the direction of the difference: the parameter is either greater than or less than a specific value, so the rejection region lies in one tail of the test statistic's distribution. A two-tailed alternative hypothesis does not specify a direction: the parameter is simply different from the specified value, so the rejection region is split between both tails.
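To make the one-tailed versus two-tailed distinction concrete, here is a minimal Python sketch. It assumes the test statistic follows a standard normal distribution (a z statistic); the function name `p_values` and the value -1.8 are illustrative choices, not part of the article.

```python
from statistics import NormalDist

def p_values(z: float) -> dict[str, float]:
    """Return left-, right-, and two-tailed p-values for a z statistic."""
    std_normal = NormalDist()            # mean 0, standard deviation 1
    left = std_normal.cdf(z)             # Ha: parameter is less than the null value
    right = 1.0 - left                   # Ha: parameter is greater than the null value
    two_sided = 2.0 * min(left, right)   # Ha: parameter differs in either direction
    return {"left": left, "right": right, "two_sided": two_sided}

# Illustrative (made-up) z statistic
result = p_values(-1.8)
print(result)
```

With this statistic, the left-tailed p-value falls below 0.05 while the two-tailed p-value does not, which shows why the choice of alternative hypothesis should be made before looking at the data.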
Steps in Hypothesis Testing
Step 1: Stating the hypotheses
Begin by stating the null and alternative hypotheses on the basis of the research question. Typically, the null hypothesis is the default assumption and the alternative hypothesis is the claim the researcher seeks evidence for.
Step 2: Setting the significance level
The significance level, denoted by alpha (α), is the probability of rejecting the null hypothesis when it is true. It is usually set to 0.05 or 0.01, meaning that a 5% or 1% risk of making a Type 1 error (rejecting a true null hypothesis) is accepted.
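One way to see what a significance level implies is to convert it into a critical value, the cutoff the test statistic must exceed for the null hypothesis to be rejected. The sketch below assumes a z-test (standard normal test statistic); the helper name `critical_z` is hypothetical.

```python
from statistics import NormalDist

def critical_z(alpha: float, two_sided: bool = True) -> float:
    """Positive z cutoff for a given significance level (z-test assumed)."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must be strictly between 0 and 1")
    # A two-sided test splits alpha equally between the two tails.
    tail_area = alpha / 2 if two_sided else alpha
    return NormalDist().inv_cdf(1.0 - tail_area)

for alpha in (0.05, 0.01):
    print(alpha, round(critical_z(alpha), 3), round(critical_z(alpha, two_sided=False), 3))
```

A smaller α pushes the cutoff further into the tail, so stronger evidence is needed before the null hypothesis is rejected.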
Step 3: Collecting the data
The data is collected by conducting a study or experiment to test the hypothesis. The sample should be drawn at random so that the results are unbiased.
Step 4: Calculating the test statistic
The test statistic measures how far the sample data deviates from what the null hypothesis predicts. Frequently used tests are the chi-square test, t-test, z-test, and F-test. The choice of test statistic depends on the type of hypothesis being tested and the level of data measurement.
Step 5: Calculating the p-value
The p-value is the probability of observing a test statistic at least as extreme as the one calculated, assuming the null hypothesis is true. It is not the probability that the null hypothesis is true. A small p-value (less than the significance level) leads to rejection of the null hypothesis, and a large p-value (greater than the significance level) means the null hypothesis cannot be rejected.
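As an illustration of computing a test statistic from raw data, the following sketch calculates a one-sample t statistic. The function name `one_sample_t` and the data values are made up for illustration and are not taken from the article.

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(sample: list[float], mu0: float) -> float:
    """t = (sample mean - hypothesized mean) / (sample sd / sqrt(n))."""
    n = len(sample)
    if n < 2:
        raise ValueError("at least two observations are needed")
    standard_error = stdev(sample) / sqrt(n)  # stdev uses the n - 1 denominator
    return (mean(sample) - mu0) / standard_error

# Hypothetical measurements, tested against a hypothesized mean of 1000
lifespans = [985.0, 1010.0, 960.0, 995.0, 970.0, 1005.0, 950.0, 990.0]
t_stat = one_sample_t(lifespans, mu0=1000.0)
print(round(t_stat, 3))
```

A negative value means the sample mean lies below the hypothesized mean; the magnitude expresses that gap in units of the standard error.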
Step 6: Making a decision and interpreting the results
Finally, a decision is made on the basis of the p-value and the significance level. If the p-value is less than the significance level, the null hypothesis is rejected in favor of the alternative hypothesis and the results are said to be statistically significant. If the p-value is greater than the significance level, the null hypothesis is not rejected and the results are not statistically significant.
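The decision rule in this step can be written as a small function. This is a sketch only: the name `decide` is hypothetical, and the strict "less than" comparison follows the wording used here (some texts reject when the p-value is less than or equal to the significance level).

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Apply the decision rule: reject H0 only when p_value < alpha."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie between 0 and 1")
    if p_value < alpha:
        return "reject H0 (statistically significant)"
    return "fail to reject H0 (not statistically significant)"

print(decide(0.03))               # below the default alpha of 0.05
print(decide(0.03, alpha=0.01))   # same p-value, stricter alpha
```

The same p-value can lead to different decisions under different significance levels, which is why α is fixed before the data is analyzed.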
Hypothesis Testing in Statistics Example:
Example: Suppose a factory produces light bulbs, and the manufacturer claims that the average lifespan of a bulb is 1000 hours. However, you suspect that the bulbs may not actually last that long. To test your hypothesis, you randomly select a sample of 50 bulbs and test their lifespans. You find that the average lifespan of the sample is 980 hours with a standard deviation of 50 hours.
To perform a hypothesis test on this problem, follow these steps:
 Define the null and alternative hypotheses: the null hypothesis (H0) is that the average lifespan of the bulbs is 1000 hours, and the alternative hypothesis (Ha) is that it is less than 1000 hours.
 Choose a significance level of 0.05, i.e., accept a 5% risk of rejecting the null hypothesis when it is true.
 Calculate the test statistic using a t-test, as the population standard deviation is not known. The formula for the t-test is:
t = (x̄ – μ) / (s / √n)
where x̄ is the sample mean, μ is the population mean under the null hypothesis, s is the sample standard deviation, and n is the sample size.
Putting in the values,
t = (980 – 1000) / (50 / √50) ≈ -2.83
 To calculate the p-value, we use a one-tailed test (as we only want to find whether the lifespan is less than 1000 hours), so we find the area under the t-distribution with n – 1 = 49 degrees of freedom to the left of our test statistic. Using a t-table or a calculator, the p-value is found to be approximately 0.003.
 Comparing the p-value (about 0.003) to the significance level (0.05), the p-value is less than the significance level, so we reject the null hypothesis: the sample provides statistically significant evidence that the average lifespan of the bulbs is less than 1000 hours.
Note: This is a simple illustrative example; hypothesis testing in statistics can be more complex, involving more variables, different sample sizes, and additional assumptions.
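The light bulb example can be recomputed directly from the sample figures given above (n = 50, sample mean 980, sample standard deviation 50, hypothesized mean 1000). The sketch below uses only the Python standard library; since that library has no t-distribution, the helpers `t_pdf` and `t_cdf` are hypothetical and approximate the left-tail area by numerically integrating the Student's t density with Simpson's rule. A statistics package would normally be used instead.

```python
from math import exp, lgamma, log, pi, sqrt

def t_pdf(x: float, df: int) -> float:
    """Density of Student's t-distribution with df degrees of freedom."""
    log_norm = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_norm) * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x: float, df: int, steps: int = 20000) -> float:
    """Approximate P(T <= x) with Simpson's rule; steps must be even."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # integral of the density from 0 to |x|
    # The density is symmetric around 0, so half the mass lies on each side.
    return 0.5 + area if x >= 0 else 0.5 - area

# Figures from the light bulb example
n, sample_mean, sample_sd, mu0 = 50, 980.0, 50.0, 1000.0
alpha = 0.05

t_stat = (sample_mean - mu0) / (sample_sd / sqrt(n))
p_value = t_cdf(t_stat, df=n - 1)  # left tail, because Ha is "less than"

print(f"t = {t_stat:.2f}, p-value = {p_value:.4f}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```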
Common Errors in Hypothesis Testing:
There are two types of error in hypothesis testing:
 Type 1 Error – This is the error of rejecting the null hypothesis when it is true. The probability of occurrence of Type 1 error is denoted as alpha (α).
 Type 2 Error – This is the error of failing to reject the null hypothesis when it is false. The probability of occurrence of Type 2 error is denoted as beta (β).
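Both error rates can be estimated by simulation. The sketch below is illustrative only and rests on several assumptions that are not in the article: bulb lifespans are normally distributed, the population standard deviation (50 hours) is known so a left-tailed z-test applies, and the "true" mean in the Type 2 scenario is 980 hours. The function name `rejection_rate` is hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist, mean

def rejection_rate(true_mean: float, mu0: float = 1000.0, sigma: float = 50.0,
                   n: int = 50, alpha: float = 0.05,
                   trials: int = 2000, seed: int = 0) -> float:
    """Fraction of simulated samples in which a left-tailed z-test rejects H0."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(alpha)  # negative cutoff for the left tail
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        z = (mean(sample) - mu0) / (sigma / sqrt(n))
        if z < z_crit:
            rejections += 1
    return rejections / trials

# H0 is true (mean really is 1000): every rejection is a Type 1 error.
type1_rate = rejection_rate(true_mean=1000.0)
# H0 is false (mean is really 980): every non-rejection is a Type 2 error.
type2_rate = 1.0 - rejection_rate(true_mean=980.0)

print(f"estimated Type 1 error rate: {type1_rate:.3f}")
print(f"estimated Type 2 error rate: {type2_rate:.3f}")
```

The estimated Type 1 error rate should land close to the chosen α, while the Type 2 error rate depends on how far the true mean is from the hypothesized one and on the sample size.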
Conclusion
Hypothesis testing is a statistical tool for deciding whether sample data provides enough evidence to reject a hypothesis about a population. Because it relies on assumptions and probability-based interpretation, its conclusions can sometimes be wrong. Hypothesis testing is widely used in fields that work with sample data, such as psychology, biology, advertising, and many more.