Hypothesis Testing
Introduction
Faced with an overwhelming amount of data, analysts must wrangle those data into something that provides a clearer picture of what is going on.
We use the concepts and tools of hypothesis testing to address these issues. Hypothesis testing is part of statistical inference, the process of making judgments about a larger group (a population) based on a smaller group of observations (that is, a sample).
The concepts and tools of hypothesis testing provide an objective means to gauge whether the available evidence supports the hypothesis. After applying a statistical test of a hypothesis, we should have a clearer idea of how strongly the sample evidence supports or contradicts the hypothesis, although our conclusion always stops short of certainty.
The main focus of this reading is on the framework of hypothesis testing and tests concerning mean, variance, and correlation, three quantities frequently used in investments.
Learning Outcomes
- define a hypothesis, describe the steps of hypothesis testing, and describe and interpret the choice of the null and alternative hypotheses;
- compare and contrast one-tailed and two-tailed tests of hypotheses;
- explain a test statistic, Type I and Type II errors, a significance level, how significance levels are used in hypothesis testing, and the power of a test;
- explain a decision rule and the relation between confidence intervals and hypothesis tests, and determine whether a statistically significant result is also economically meaningful;
- explain and interpret the p-value as it relates to hypothesis testing;
- describe how to interpret the significance of a test in the context of multiple tests;
- identify the appropriate test statistic and interpret the results for a hypothesis test concerning the population mean of both large and small samples when the population is normally or approximately normally distributed and the variance is (1) known or (2) unknown;
- identify the appropriate test statistic and interpret the results for a hypothesis test concerning the equality of the population means of two at least approximately normally distributed populations based on independent random samples with equal assumed variances;
- identify the appropriate test statistic and interpret the results for a hypothesis test concerning the mean difference of two normally distributed populations;
- identify the appropriate test statistic and interpret the results for a hypothesis test concerning (1) the variance of a normally distributed population and (2) the equality of the variances of two normally distributed populations based on two independent random samples;
- compare and contrast parametric and nonparametric tests, and describe situations where each is the more appropriate type of test;
- explain parametric and nonparametric tests of the hypothesis that the population correlation coefficient equals zero, and determine whether the hypothesis is rejected at a given level of significance;
- explain tests of independence based on contingency table data.
Summary
In this reading, we have presented the concepts and methods of statistical inference and hypothesis testing.
- A hypothesis is a statement about one or more populations.
- The steps in testing a hypothesis are as follows:
  - State the hypotheses.
  - Identify the appropriate test statistic and its probability distribution.
  - Specify the significance level.
  - State the decision rule.
  - Collect the data and calculate the test statistic.
  - Make a decision.
- We state two hypotheses: The null hypothesis is the hypothesis to be tested; the alternative hypothesis is the hypothesis accepted if the null hypothesis is rejected.
- There are three ways to formulate hypotheses. Let θ indicate the population parameter:
  - Two-sided alternative: H0: θ = θ0 versus Ha: θ ≠ θ0
  - One-sided alternative (right side): H0: θ ≤ θ0 versus Ha: θ > θ0
  - One-sided alternative (left side): H0: θ ≥ θ0 versus Ha: θ < θ0
  where θ0 is a hypothesized value of the population parameter and θ is the true value of the population parameter.
- When we have a “suspected” or “hoped for” condition for which we want to find supportive evidence, we frequently set up that condition as the alternative hypothesis and use a one-sided test. However, the researcher may select a “not equal to” alternative hypothesis and conduct a two-sided test to emphasize a neutral attitude.
- A test statistic is a quantity, calculated using a sample, whose value is the basis for deciding whether to reject or not reject the null hypothesis. We compare the computed value of the test statistic to a critical value for the same test statistic to decide whether to reject or not reject the null hypothesis.
- In reaching a statistical decision, we can make two possible errors: We may reject a true null hypothesis (a Type I error, or false positive), or we may fail to reject a false null hypothesis (a Type II error, or false negative).
- The level of significance of a test is the probability of a Type I error that we accept in conducting a hypothesis test. The standard approach to hypothesis testing involves specifying only a level of significance (that is, the probability of a Type I error). The complement of the level of significance is the confidence level.
- The power of a test is the probability of correctly rejecting the null (rejecting the null when it is false). The complement of the power of the test is the probability of a Type II error.
- A decision rule consists of determining the critical values with which to compare the test statistic to decide whether to reject or not reject the null hypothesis. When we reject the null hypothesis, the result is said to be statistically significant.
- The (1 − α) confidence interval represents the range of values of the test statistic for which the null hypothesis is not rejected.
- The statistical decision consists of rejecting or not rejecting the null hypothesis. The economic decision takes into consideration all economic issues pertinent to the decision.
- The p-value is the smallest level of significance at which the null hypothesis can be rejected. The smaller the p-value, the stronger the evidence against the null hypothesis and in favor of the alternative hypothesis. The p-value approach to hypothesis testing involves computing a p-value for the test statistic and allowing the user of the research to interpret the implications for the null hypothesis.
- For hypothesis tests concerning the population mean of a normally distributed population with unknown variance, the theoretically correct test statistic is the t-statistic.
- When we want to test whether the observed difference between two means is statistically significant, we must first decide whether the samples are independent or dependent (related). If the samples are independent, we conduct a test concerning differences between means. If the samples are dependent, we conduct a test of mean differences (paired comparisons test).
- When we conduct a test of the difference between two population means from normally distributed populations with unknown but equal variances, we use a t-test based on pooling the observations of the two samples to estimate the common but unknown variance. This test is based on an assumption of independent samples.
- In tests concerning two means based on two samples that are not independent, we often can arrange the data in paired observations and conduct a test of mean differences (a paired comparisons test). When the samples are from normally distributed populations with unknown variances, the appropriate test statistic is t-distributed.
- In tests concerning the variance of a single normally distributed population, the test statistic is chi-square with n − 1 degrees of freedom, where n is sample size.
- For tests concerning differences between the variances of two normally distributed populations based on two random, independent samples, the appropriate test statistic is based on an F-test (the ratio of the sample variances). The degrees of freedom for this F-test are n1 − 1 and n2 − 1, where n1 corresponds to the number of observations in the calculation of the numerator and n2 is the number of observations in the calculation of the denominator of the F-statistic.
- A parametric test is a hypothesis test concerning a population parameter or a hypothesis test based on specific distributional assumptions. In contrast, a nonparametric test either is not concerned with a parameter or makes minimal assumptions about the population from which the sample comes.
- A nonparametric test is primarily used when data do not meet distributional assumptions, when there are outliers, when data are given in ranks, or when the hypothesis we are addressing does not concern a parameter.
- In tests concerning correlation, we use a t-statistic to test whether a population correlation coefficient is different from zero. If we have n observations for two variables, this test statistic has a t-distribution with n − 2 degrees of freedom.
- The Spearman rank correlation coefficient is calculated on the ranks of two variables within their respective samples.
- A chi-square distributed test statistic is used to test for independence of two categorical variables. This nonparametric test compares actual frequencies with those expected on the basis of independence. This test statistic has degrees of freedom of (r − 1)(c − 1), where r is the number of categories for the first variable and c is the number of categories of the second variable.
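As a minimal sketch of the testing steps, the p-value, and the link between confidence intervals and hypothesis tests summarized above, the following Python example runs a two-sided z-test of a population mean when the population variance is known. The function name `z_test_mean` and all input values are hypothetical and chosen only for illustration; the standard normal distribution comes from the standard-library `statistics.NormalDist`.

```python
from math import sqrt
from statistics import NormalDist


def z_test_mean(sample_mean, mu0, sigma, n, alpha=0.05):
    """Two-sided z-test of H0: mu = mu0 versus Ha: mu != mu0.

    Assumes a normally distributed population with known standard
    deviation sigma (an illustrative assumption, not a general rule).
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    std_error = sigma / sqrt(n)

    # Test statistic: distance of the sample mean from mu0 in standard errors.
    z = (sample_mean - mu0) / std_error

    # Decision rule: reject H0 if |z| exceeds the two-sided critical value.
    critical = std_normal.inv_cdf(1 - alpha / 2)

    # p-value: smallest significance level at which H0 can be rejected.
    p_value = 2 * (1 - std_normal.cdf(abs(z)))

    # (1 - alpha) confidence interval for the population mean.
    ci = (sample_mean - critical * std_error,
          sample_mean + critical * std_error)

    return {"z": z, "critical": critical, "p_value": p_value,
            "ci": ci, "reject": abs(z) > critical}


# Hypothetical inputs: sample mean 1.5, hypothesized mean 1.0,
# known sigma 2.0, and 100 observations.
result = z_test_mean(sample_mean=1.5, mu0=1.0, sigma=2.0, n=100)
print(result)
```

With these illustrative inputs the z-statistic is 2.5, which exceeds the 5% two-sided critical value of about 1.96, so the null hypothesis is rejected. Equivalently, the p-value is below 0.05 and the hypothesized mean of 1.0 lies outside the 95% confidence interval.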
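The distinction between independent and dependent samples in tests of two means can be sketched as follows. This is an illustrative example only: the function names `pooled_t` and `paired_t` and the small data sets are hypothetical, both functions test a hypothesized difference of zero, and the pooled version assumes equal but unknown population variances.

```python
from math import sqrt
from statistics import mean, stdev, variance


def pooled_t(sample1, sample2):
    """t-statistic for H0: mu1 - mu2 = 0 with independent samples.

    Assumes normally distributed populations with equal, unknown variances.
    Returns the statistic and its degrees of freedom (n1 + n2 - 2).
    """
    n1, n2 = len(sample1), len(sample2)
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = ((n1 - 1) * variance(sample1)
                  + (n2 - 1) * variance(sample2)) / (n1 + n2 - 2)
    t = (mean(sample1) - mean(sample2)) / sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2


def paired_t(first, second):
    """t-statistic for H0: mean difference = 0 with paired observations.

    Returns the statistic and its degrees of freedom (n - 1).
    """
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


# Hypothetical independent samples.
t_ind, df_ind = pooled_t([2, 4, 6, 8], [1, 2, 3, 4])

# Hypothetical paired observations (same four units measured twice).
t_pair, df_pair = paired_t([10, 12, 9, 11], [12, 13, 12, 13])

print(t_ind, df_ind)    # compare with a t critical value for 6 df
print(t_pair, df_pair)  # compare with a t critical value for 3 df
```

Each statistic would then be compared with a critical value from a t-table for the reported degrees of freedom at the chosen significance level.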
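The two variance tests in the summary can be illustrated with a short sketch. The function names and data below are hypothetical, and both tests assume normally distributed populations (and, for the F-test, independent random samples).

```python
from statistics import variance


def chi_square_variance_stat(sample, sigma0_sq):
    """Chi-square statistic for a test of a single population variance.

    sigma0_sq is the hypothesized variance. Returns the statistic
    (n - 1) * s^2 / sigma0^2 and its degrees of freedom, n - 1.
    """
    n = len(sample)
    return (n - 1) * variance(sample) / sigma0_sq, n - 1


def f_stat(numerator_sample, denominator_sample):
    """F-statistic for a test of the equality of two population variances.

    Returns the ratio of sample variances with numerator and denominator
    degrees of freedom, n1 - 1 and n2 - 1.
    """
    f = variance(numerator_sample) / variance(denominator_sample)
    return f, len(numerator_sample) - 1, len(denominator_sample) - 1


# Hypothetical data.
sample_a = [2, 4, 6, 8]
sample_b = [1, 2, 3, 4]

chi2, chi2_df = chi_square_variance_stat(sample_a, sigma0_sq=4.0)
f, df_num, df_den = f_stat(sample_a, sample_b)

print(chi2, chi2_df)      # compare with chi-square critical values, 3 df
print(f, df_num, df_den)  # compare with F critical values, (3, 3) df
```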
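For the correlation tests, a minimal sketch follows. The t-statistic uses the standard form t = r√(n − 2) / √(1 − r²) with n − 2 degrees of freedom. The Spearman calculation uses the common shortcut formula 1 − 6Σd² / (n(n² − 1)), which assumes there are no tied values; that assumption, the function names, and the data are all illustrative choices rather than part of the summary above.

```python
from math import sqrt


def correlation_t(r, n):
    """t-statistic for H0: population correlation = 0, with n - 2 df."""
    return r * sqrt(n - 2) / sqrt(1 - r ** 2), n - 2


def ranks(values):
    """Rank values from 1 (smallest) to n (largest). Assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    rank = [0] * len(values)
    for position, index in enumerate(order, start=1):
        rank[index] = position
    return rank


def spearman(x, y):
    """Spearman rank correlation via the no-ties shortcut formula."""
    n = len(x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Hypothetical sample correlation of 0.6 from 18 paired observations.
t_corr, df_corr = correlation_t(r=0.6, n=18)

# Hypothetical paired data with no tied values.
r_s = spearman([10, 20, 30, 40, 50], [1, 3, 2, 5, 4])

print(t_corr, df_corr)
print(r_s)
```

The rank correlation can be passed to the same t-statistic formula when a test of its significance is needed, subject to the sample-size conditions of that test.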
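The test of independence can be sketched as below: expected frequencies under independence are computed from the row and column totals, and the statistic sums the squared differences between observed and expected frequencies, each scaled by the expected frequency. The function name and the 2 × 2 table are hypothetical, and the degrees of freedom are computed with the standard formula (r − 1)(c − 1).

```python
def chi_square_independence(table):
    """Chi-square statistic for independence in an r x c contingency table.

    table is a list of rows of observed frequencies.
    Returns the statistic and its degrees of freedom, (r - 1)(c - 1).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Frequency expected in cell (i, j) if the variables are independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected

    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


# Hypothetical observed frequencies for two categorical variables.
observed = [[30, 20],
            [20, 30]]

stat, df = chi_square_independence(observed)
print(stat, df)
```

For this illustrative table every expected frequency is 25, so the statistic is 4.0 with 1 degree of freedom. That exceeds the tabulated 5% chi-square critical value of 3.841 for 1 degree of freedom, so independence would be rejected at that level.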