Question

How can you determine if a difference between the means of two samples is significant?

Explanation

Solution

We are asked how to determine whether a difference between the means of two samples is significant. In other words, we want to know whether the observed difference between the sample means reflects a real difference between the population means, or whether it could plausibly arise from sampling variation alone.

Complete step-by-step answer:
The appropriate significance test is the two-sample t-test. It is suitable when the two samples are independent and the sampling distribution of the difference between means is approximately normal.
This approach consists of four steps:
(1) State the hypotheses
Every hypothesis test requires the analyst (that is us in this case) to state a “null hypothesis” and an “alternative hypothesis”. The hypotheses are stated in such a way that they are mutually exclusive. That is, if one is true, the other must be false; and vice versa.
The table below shows three sets of null and alternative hypotheses. Each makes a statement about the difference $d$ between the mean of one population, $\mu_1$, and the mean of another population, $\mu_2$.

| Set | Null hypothesis | Alternative hypothesis | Number of tails |
|-----|-----------------|------------------------|-----------------|
| 1 | $\mu_1 - \mu_2 = d$ | $\mu_1 - \mu_2 \ne d$ | 2 |
| 2 | $\mu_1 - \mu_2 \ge d$ | $\mu_1 - \mu_2 < d$ | 1 |
| 3 | $\mu_1 - \mu_2 \le d$ | $\mu_1 - \mu_2 > d$ | 1 |

The first set of hypotheses (Set 1) is an example of a two-tailed test, since an extreme value on either side of the sampling distribution would cause a researcher to reject the null hypothesis. The other two sets of hypotheses (Sets 2 and 3) are one-tailed tests, since an extreme value on only one side of the sampling distribution would cause a researcher to reject the null hypothesis.
When the null hypothesis states that there is no difference between the two population means (i.e., d = 0), the null and alternative hypotheses are often stated in the following form:
$$\begin{aligned} H_0 &: \mu_1 = \mu_2 \\ H_a &: \mu_1 \ne \mu_2 \end{aligned}$$
(2) Formulate the analysis plan
The analysis plan describes how to use sample data to decide whether to reject the null hypothesis. It should specify the following elements:
(i) Significance level:
Often, researchers choose significance levels equal to 0.01, 0.05, or 0.10, but any value between 0 and 1 can be used.
(ii) Test method:
We will use the two-sample t-test to determine whether the difference between means found in the samples differs significantly from the hypothesized difference between the population means.
(3) Analyze sample data
Using sample data, we will find the standard error, degrees of freedom, test statistic, and the P-value associated with the test statistic.
(i) Standard error:
We need to compute the standard error (SE) of the sampling distribution using the formula $SE = \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$, where $s_1$ is the standard deviation of sample 1, $s_2$ is the standard deviation of sample 2, $n_1$ is the size of sample 1 and $n_2$ is the size of sample 2.
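As a minimal sketch of this step, the standard error $\sqrt{s_1^2/n_1 + s_2^2/n_2}$ can be computed with Python's standard library. The function name `standard_error` and the two data lists are hypothetical and chosen only for illustration.

```python
import math
import statistics

def standard_error(sample1, sample2):
    """Unpooled standard error of the difference between two sample means."""
    s1 = statistics.stdev(sample1)  # sample standard deviation (n - 1 in the denominator)
    s2 = statistics.stdev(sample2)
    n1, n2 = len(sample1), len(sample2)
    return math.sqrt(s1**2 / n1 + s2**2 / n2)

# Hypothetical data, for illustration only
group_a = [12.0, 14.0, 16.0, 18.0, 20.0]  # mean 16, sample variance 10
group_b = [10.0, 11.0, 12.0, 13.0, 14.0]  # mean 12, sample variance 2.5

se = standard_error(group_a, group_b)  # sqrt(10/5 + 2.5/5) = sqrt(2.5), about 1.581
```

Each sample must contain at least two values, since `statistics.stdev` cannot estimate a standard deviation from a single observation.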
(ii) Degrees of freedom:
The degrees of freedom (DF) is given as $DF = \dfrac{\left( \dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2} \right)^2}{\dfrac{\left( s_1^2 / n_1 \right)^2}{n_1 - 1} + \dfrac{\left( s_2^2 / n_2 \right)^2}{n_2 - 1}}$. If DF is not an integer, we round it to the nearest whole number. Some texts suggest that the degrees of freedom can be approximated by the smaller of $n_1 - 1$ and $n_2 - 1$, but the formula above gives better results.
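The degrees-of-freedom calculation can be sketched as follows, using the form in which the numerator $\left(s_1^2/n_1 + s_2^2/n_2\right)$ is squared. The function name `degrees_of_freedom` and the summary statistics passed to it are hypothetical.

```python
import math

def degrees_of_freedom(s1, n1, s2, n2):
    """Approximate degrees of freedom for a two-sample t-test with unpooled variances."""
    v1 = s1**2 / n1  # variance of the first sample mean
    v2 = s2**2 / n2  # variance of the second sample mean
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

# Hypothetical summary statistics: s1^2 = 10, s2^2 = 2.5, five observations each
df = degrees_of_freedom(math.sqrt(10.0), 5, math.sqrt(2.5), 5)  # 100/17, about 5.88
df_rounded = round(df)  # 6, rounded to the nearest whole number as described above
```

When both samples have the same size and the same standard deviation, this expression reduces to $n_1 + n_2 - 2$, which is a quick sanity check on an implementation.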
(iii) Test statistic:
The test statistic is a $t$ statistic defined by $t = \dfrac{\left( \bar{x}_1 - \bar{x}_2 \right) - d}{SE}$, where $\bar{x}_1$ is the mean of sample 1, $\bar{x}_2$ is the mean of sample 2, $d$ is the hypothesized difference between population means and $SE$ is the standard error.
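A short sketch of the test statistic, assuming the sample means and the standard error have already been computed. The function name `t_statistic` and the input values are hypothetical.

```python
import math

def t_statistic(mean1, mean2, se, d=0.0):
    """t statistic for the difference between two sample means.

    d is the hypothesized difference between the population means.
    """
    if se <= 0:
        raise ValueError("standard error must be positive")
    return ((mean1 - mean2) - d) / se

# Hypothetical values: sample means 16 and 12, SE = sqrt(2.5), hypothesized difference 0
t = t_statistic(16.0, 12.0, math.sqrt(2.5))  # 4 / sqrt(2.5), about 2.53
```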
(iv) P-value:
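The P-value is the probability, assuming the null hypothesis is true, of observing a test statistic at least as extreme as the one computed from the samples. Since the test statistic is a $t$ statistic, this probability is obtained from the $t$ distribution with the degrees of freedom computed above.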
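To illustrate how a P-value follows from the $t$ statistic and the degrees of freedom, here is a sketch that uses only the standard library. It integrates the Student's $t$ density numerically with Simpson's rule, which is an assumption made for self-containment and not something the solution prescribes; in practice a statistics library or a $t$ table would normally be used. The names `t_pdf`, `upper_tail`, `p_value` and the `alternative` labels are hypothetical.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def upper_tail(t, df, steps=2000):
    """P(T >= t), using Simpson's rule on [0, |t|] (steps must be even)."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3   # P(0 <= T <= |t|)
    tail = 0.5 - area      # P(T >= |t|), by symmetry of the density
    return tail if t >= 0 else 1.0 - tail

def p_value(t, df, alternative="two-sided"):
    """P-value for the alternatives in the hypothesis table (Set 1, 3 and 2)."""
    if alternative == "two-sided":   # Set 1: difference != d
        return 2 * upper_tail(abs(t), df)
    if alternative == "greater":     # Set 3: difference > d
        return upper_tail(t, df)
    if alternative == "less":        # Set 2: difference < d
        return 1.0 - upper_tail(t, df)
    raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")

# Check against a known closed form: with 1 degree of freedom, P(|T| >= 1) = 0.5
check = p_value(1.0, 1)  # approximately 0.5
```

The one-tailed and two-tailed cases differ only in which tail area is counted, which is why the number of tails has to be fixed when the hypotheses are stated.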
(4) Interpret results.
If the sample findings are unlikely, given the null hypothesis, the researcher rejects the null hypothesis. Typically, this involves comparing the P-value to the significance level, and rejecting the null hypothesis when the P-value is less than the significance level.
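The decision rule in this step can be written as a small helper. This is only a sketch: the function name `decide` and the returned strings are hypothetical, and the default significance level of 0.05 is one of the common choices mentioned earlier, not a required value.

```python
def decide(p_value, alpha=0.05):
    """Compare the P-value with the significance level alpha."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    return "reject H0" if p_value < alpha else "fail to reject H0"

# Hypothetical P-values, for illustration only
print(decide(0.03))              # reject H0
print(decide(0.03, alpha=0.01))  # fail to reject H0
```

The same P-value can lead to different conclusions at different significance levels, so the level should be fixed in the analysis plan before the data are examined.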
Hence, by following these four steps, we can determine whether the difference between the means of two samples is significant.

Note: In questions of this type, we should be sure of the concepts being applied and check the calculations at each intermediate step, since an error in the standard error or the degrees of freedom carries through to the test statistic and the P-value.