REVIEW
Significance Tests: Test on a Population Claim (a Two-Tail Test)

UK data for male median earnings indicates that this figure is £24,000. What can you say about the validity of this figure if a simple random sample of 200 families in Britain showed an average earnings level of £23,500, with a standard deviation of £4,000? Use a .05 level of significance. Would your conclusions be any different at a .01 level of significance?

This is our suggested response

Self-Test on Testing a Specification Claim

A manufacturer of plasticised line used in home-assembly mobiles advertises that their product has an average tensile strength of 30 kilograms. You took a sample of 100 sections of the line and tested them. The average tensile strength of this sample was 28 kilos, with a standard deviation of 12 kilos. Does this enable you to dismiss the manufacturer's claims? Answer this question for significance levels of .05 and .01. Think about whether the more appropriate test will be one-tailed or two-tailed.

This is our suggested response

Comparing Two Sample Means: a z-Test

Another important aspect of hypothesis testing in statistics is examining a hypothesis of the Group B type (refer back to the two hypothesis types if you're not sure what this is), which makes a claim about two apparently different populations. Public opinion is often shaped by these claims, and they can in turn shape social or economic policy. How can we evaluate these assertions to ascertain whether or not there is any truth to them?

There are a number of ways of evaluating such claims. We're going to use the z-test, which involves calculating a z-score and comparing it with the critical regions of the normal distribution.

Let's imagine that a new soft drink has been developed and its manufacturers claim that it boosts memory recall. We need to test whether or not the drink is effective. We start by collecting two random samples, each of 100 students.
We give all students a soft drink, but one group receives the memory drink (Total-Recall) and the other a carbonated sugar-water drink (this is known as a placebo). All 200 students think they have received the memory drink. The students all take a memory recall test, with the following results:

Group 1 (Total-Recall): Mean Score: 55; Standard Deviation: 12 marks
Group 2 (placebo): Mean Score: 51.8; Standard Deviation: 9 marks

The difference in the Mean Scores between the two groups is 3.2 marks, in favour of the Total-Recall drink. Is this result significant or is it just a chance difference arising from the two samples? To test this, we have to formulate a Null and an Alternative Hypothesis:

Null H0: There is no difference. Both groups are from the same population. The calculated difference between the sample means is insignificant. Therefore the manufacturer's claims for the memory drink are misplaced.

Alternative H1: There is a real difference between the two groups. They belong to two different populations. Therefore, the Total-Recall drink is effective.

The first step is to calculate the standard error for each group. The standard error is the standard deviation of the sample divided by the square root of the number in the sample. Since there are 100 in each sample, the square root in each calculation of the Standard Error will be 10. So the Standard Error for Group 1 is 12 / 10 = 1.2 marks and the Standard Error for Group 2 is 9 / 10 = 0.9 marks.

If we repeated the paired test a number of times, each time with one memory drink group and one placebo group, we would always have two means to compare (one for each group). By subtracting one from the other, every pair of samples would give us a figure for the difference between the two sample means. If the Null Hypothesis is true, then we would expect the average difference between sample means to be 0. Sometimes the memory drink group would have the higher mean score, and sometimes the placebo group would.
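The repeated-sampling idea described above can be sketched as a small simulation. This is only an illustration under stated assumptions: both groups are drawn from one normally distributed population whose mean (52) and standard deviation (10) are hypothetical values chosen for the sketch, not figures from the study, and the function name `simulate_mean_differences` is made up for this example.

```python
import random
import statistics


def simulate_mean_differences(n_pairs=2000, sample_size=100,
                              pop_mean=52.0, pop_sd=10.0, seed=1):
    """Draw pairs of samples from ONE population (the Null Hypothesis)
    and return the difference between the two sample means for each pair.

    pop_mean and pop_sd are hypothetical values used only for illustration.
    """
    rng = random.Random(seed)
    differences = []
    for _ in range(n_pairs):
        drink = [rng.gauss(pop_mean, pop_sd) for _ in range(sample_size)]
        placebo = [rng.gauss(pop_mean, pop_sd) for _ in range(sample_size)]
        # Memory drink mean minus placebo mean, as in the text
        differences.append(statistics.fmean(drink) - statistics.fmean(placebo))
    return differences


differences = simulate_mean_differences()
average_difference = statistics.fmean(differences)
share_positive = sum(d > 0 for d in differences) / len(differences)

print(f"average difference between sample means: {average_difference:.3f}")
print(f"share of pairs where the drink group scored higher: {share_positive:.3f}")
```

With both groups drawn from the same population, the average difference should come out close to 0 and roughly half of the pairs should favour each group.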
If we were to calculate the difference between the means by subtracting the mean for the placebo group from the mean for the memory drink group, we would end up with a collection of positive and negative numbers. But if these groups really do come from the same population, the average of those numbers should be 0 (no difference).

If we imagine all the possible numbers for this difference between the sample means, those figures would be normally distributed. If the Null Hypothesis is true then the most frequent result should be 0. We would see decreasing frequencies in either direction, one side showing the cases where the placebo group's mean was higher and the other the cases where the memory drink group's mean was higher. This normal curve is a theoretical representation of the distribution if the two populations are the same (if there is no real difference between Group 1 and Group 2).

Now, if we could calculate the standard deviation of this normal curve we would know the relationship between the size of the difference between the two sample means and its probability (as we do with any normal distribution). Well, we can do this. It is the standard error of the difference between the sample means. This is a combination of their separate standard errors, based on the following formula:

Standard Error (differences) = Square Root {(SE Group 1)² + (SE Group 2)²}

Here that gives Square Root {(1.2)² + (0.9)²} = Square Root {1.44 + 0.81} = Square Root {2.25} = 1.5 marks.

This Standard Error (differences) figure of 1.5 marks tells us that, in the normal distribution curve for the differences between the sample means, there is approximately a .68 probability that the difference will fall within 1.5 marks on either side of 0 (the mean difference), and approximately a .95 probability that it will fall within 2 standard errors of 0, that is, between -3 and +3.

Now the difference we observed between the two sample means is 3.2. The z-score for this difference is 3.2 divided by 1.5, or 2.13.
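The arithmetic above can be checked with a short sketch. It uses only the figures given in the text (means of 55 and 51.8, standard deviations of 12 and 9, 100 students per group); the function names are illustrative, and the two-sided probability at the end is an extra step not worked out in the text, obtained from Python's standard-library `NormalDist`.

```python
import math
from statistics import NormalDist


def standard_error(sd, n):
    """Standard error of a sample mean: sd divided by the square root of n."""
    return sd / math.sqrt(n)


def z_for_difference(mean1, sd1, n1, mean2, sd2, n2):
    """Return (standard error of the difference, z-score of the difference)."""
    se1 = standard_error(sd1, n1)
    se2 = standard_error(sd2, n2)
    se_diff = math.sqrt(se1 ** 2 + se2 ** 2)
    z = (mean1 - mean2) / se_diff
    return se_diff, z


# Group 1 (Total-Recall) versus Group 2 (placebo)
se_diff, z = z_for_difference(55, 12, 100, 51.8, 9, 100)

standard_normal = NormalDist()  # mean 0, standard deviation 1
# Probability of falling within 1 and within 2 standard errors of 0
within_one_se = standard_normal.cdf(1) - standard_normal.cdf(-1)
within_two_se = standard_normal.cdf(2) - standard_normal.cdf(-2)
# Probability of a difference at least this large, in either direction,
# if the Null Hypothesis is true
p_two_sided = 2 * (1 - standard_normal.cdf(abs(z)))

print(f"standard error of the difference: {se_diff:.2f}")
print(f"z-score: {z:.2f}")
print(f"within 1 SE: {within_one_se:.3f}, within 2 SE: {within_two_se:.3f}")
print(f"two-sided probability: {p_two_sided:.3f}")
```

The two-sided probability lands between .01 and .05, which is why the verdict on this result depends on the significance level chosen.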
In other words, this value falls between 2 and 3 standard deviations from the mean. Whether a z-score of 2.13 is significant, therefore, will depend on the level of significance we set. We know that a z-score of 2 includes approximately 95 per cent of all possible scores, or that there is a .95 probability that a result in a normal distribution will fall between z-scores of -2 and +2, or, alternatively, that there is a .05 probability that any result in a normal distribution will fall beyond a z-score of +2 or -2.

In this case, the figure we obtained for the difference between the means is a z-score of 2.13. At the 95 per cent level, then, the evidence suggests that the two samples we have been studying come from different populations.

Should we then accept or reject the Null Hypothesis? Our decision, as before, depends upon how much risk we are prepared to take, in other words, on the significance level we set. If we decide that a significance level of .05 is acceptable, then we shall reject the Null Hypothesis, conclude that these two samples do indeed come from two different populations, and that the difference between the two is significant. In other words, we would decide that the memory drink does have a significant effect.

On the other hand, if we set a more stringent significance level of .01, then we cannot reject the Null Hypothesis. The z-score of 2.13 falls well inside the critical value for the .01 level (2.58). If we want to be 99 per cent sure of our conclusions, we will have to accept that we have not shown a difference between the memory drink group and the placebo group.

Test on a z-Test

Follow the same steps to work out your response to the following:

In your university or college, there have been loud complaints recently about the different standards set in coursework marking by two of the lecturers in the Economics Faculty. To test the claim, you have collected 30 sample grades from the First Year Economics undergraduate course, with the following results:
On this basis, would you uphold or reject the complaint?

This is our suggested response

In practice, conducting a z-test is easier than the procedure outlined above. We can get Excel to carry out all the stages for us. There is a section in the TimeWeb Excel Guide in the Reference section on how to use Excel to conduct a z-test: two sample means.

The significance test covered here is known as a z-test. This is because we've used the standard deviation as a unit (z-unit) for measuring the point where the critical region starts and then related it to the proportions of the normal curve. But as was pointed out above, this is accurate only with large samples. When samples contain fewer than 30 observations, the standard deviation of the sample cannot be relied upon as an estimate of the standard deviation in the population. We have to use the t-test.

The t-test, sometimes called Student's t-test (see under W. Gosset in the section on statisticians in history), uses the same statistic as the z-test, the difference between the means divided by the standard error of the difference, calculated exactly as above but renamed the t-value. The respective proportions either side of the given value are not drawn from the normal curve but from what is called the t-distribution.

The t-distribution is similar to the normal distribution in that it is symmetrical about a mean of zero and is bell-shaped. But what differentiates the t-distribution from the normal curve is that it is flatter, which means it is more dispersed, and its dispersion varies according to the size of the sample. The bigger the sample, the more closely the t-distribution approaches the normal distribution.

The t-test is conducted by comparing the calculated value of the statistic with the critical value of t in a t-table, at the required significance level and with the appropriate number of degrees of freedom: n - 1, or one less than the sample size. When the number of degrees of freedom is small, Student's t-curve is much broader and flatter than the normal curve.
As the number of degrees of freedom grows, Student's t-curve gets closer and closer to the normal curve. For numbers of degrees of freedom over 200, the two curves are essentially indistinguishable. There is more information and an explanation of degrees of freedom available if you are not sure about what this means.
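The way Student's t-curve flattens for small samples and approaches the normal curve for large ones can be seen by comparing the heights of the two curves directly. The sketch below writes out the standard density formulas using only Python's `math` module; the function names `t_pdf` and `normal_pdf` and the choice of degrees of freedom (2, 10, 30 and 200) are illustrative assumptions, not part of the course material.

```python
import math


def normal_pdf(x):
    """Height of the standard normal curve at x."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)


def t_pdf(x, df):
    """Height of Student's t-curve at x for df degrees of freedom.

    Uses log-gamma so that large df values do not overflow.
    """
    log_coeff = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_coeff - (df + 1) / 2 * math.log1p(x * x / df))


# Compare the centre (x = 0) and a tail (x = 3) of each curve
for df in (2, 10, 30, 200):
    print(f"t, df={df:>3}: height at 0 = {t_pdf(0, df):.4f}, "
          f"height at 3 = {t_pdf(3, df):.4f}")
print(f"normal   : height at 0 = {normal_pdf(0):.4f}, "
      f"height at 3 = {normal_pdf(3):.4f}")
```

For small degrees of freedom the t-curve is lower in the middle and higher in the tails than the normal curve, which is why its critical values are larger; by 200 degrees of freedom the heights agree closely.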