12. Inference About Two Populations




Introduction
• A variety of techniques are presented whose objective is to compare two populations.
• We are interested in:
– The difference between two means.
– The difference between two proportions.

INFERENCE ABOUT THE DIFFERENCE BETWEEN TWO MEANS: INDEPENDENT SAMPLES
Population 1: parameters $\mu_1$, $\sigma_1^2$; sample size $n_1$; statistics $\bar{x}_1$, $s_1^2$
Population 2: parameters $\mu_2$, $\sigma_2^2$; sample size $n_2$; statistics $\bar{x}_2$, $s_2^2$

Inference about the Difference between Two Means: Independent Samples
• Two random samples are drawn from the two populations of interest.
• Because we compare two population means, we use the statistic $\bar{X}_1 - \bar{X}_2$.

The Sampling Distribution of $\bar{X}_1 - \bar{X}_2$
1. $\bar{X}_1 - \bar{X}_2$ is normally distributed if the (original) population distributions are normal.
2. $\bar{X}_1 - \bar{X}_2$ is approximately normally distributed if the (original) populations are not normal but the sample sizes are sufficiently large (greater than 30).
3. The expected value of $\bar{X}_1 - \bar{X}_2$ is $\mu_1 - \mu_2$.
4. The variance of $\bar{X}_1 - \bar{X}_2$ is $\sigma_1^2/n_1 + \sigma_2^2/n_2$.

Making an inference about $\mu_1 - \mu_2$
• If the sampling distribution of $\bar{X}_1 - \bar{X}_2$ is normal or approximately normal, we can write:
$Z = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}$
• Z can be used to build a test statistic or a confidence interval for $\mu_1 - \mu_2$.
• In practice, the Z statistic is rarely used, because the population variances $\sigma_1^2$ and $\sigma_2^2$ are not known.
• Instead, we construct a t statistic using the sample variances ($s_1^2$ and $s_2^2$).
• Two cases are considered when producing the t-statistic:
– The two unknown population variances are equal.
– The two unknown population variances are not equal.
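The sampling-distribution facts listed above (expected value $\mu_1 - \mu_2$, variance $\sigma_1^2/n_1 + \sigma_2^2/n_2$) can be checked numerically. The following is a minimal simulation sketch, not part of the original slides; the population parameters, the function name `simulate_mean_differences`, and the number of replications are illustrative assumptions.

```python
import random
import statistics


def simulate_mean_differences(mu1, sigma1, n1, mu2, sigma2, n2, reps, seed=0):
    """Draw `reps` pairs of independent normal samples and return the
    observed differences between the two sample means."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(reps):
        sample1 = [rng.gauss(mu1, sigma1) for _ in range(n1)]
        sample2 = [rng.gauss(mu2, sigma2) for _ in range(n2)]
        diffs.append(statistics.fmean(sample1) - statistics.fmean(sample2))
    return diffs


# Hypothetical populations (values chosen only for illustration).
mu1, sigma1, n1 = 100.0, 5.0, 10
mu2, sigma2, n2 = 95.0, 6.0, 15

diffs = simulate_mean_differences(mu1, sigma1, n1, mu2, sigma2, n2, reps=20000)

theoretical_mean = mu1 - mu2                          # mu1 - mu2
theoretical_var = sigma1**2 / n1 + sigma2**2 / n2     # sigma1^2/n1 + sigma2^2/n2

simulated_mean = statistics.fmean(diffs)
simulated_var = statistics.variance(diffs)

print("mean:     theory", theoretical_mean, "simulation", round(simulated_mean, 3))
print("variance: theory", theoretical_var, "simulation", round(simulated_var, 3))
```

With enough replications the simulated mean and variance should sit close to the theoretical values; the agreement is approximate, since it depends on the random draws.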
Making an inference about $\mu_1 - \mu_2$: $\sigma_1$ and $\sigma_2$ unknown

Inference about $\mu_1 - \mu_2$: Equal variances
The pooled variance estimator
• Calculate the pooled variance estimate by:
$s_p^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$
Example: $s_1^2 = 25$; $s_2^2 = 30$; $n_1 = 10$; $n_2 = 15$. Then,
$s_p^2 = \dfrac{(10 - 1)(25) + (15 - 1)(30)}{10 + 15 - 2} = 28.04347$
• Construct the t-statistic as follows:
$t = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$, d.f. $= n_1 + n_2 - 2$
• Perform a hypothesis test:
$H_0: \mu_1 - \mu_2 = 0$
$H_1: \mu_1 - \mu_2 > 0$, or $< 0$, or $\neq 0$
• Or build a confidence interval:
$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\, n_1+n_2-2}\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$, where $1 - \alpha$ is the confidence level.

EXAMPLE
• The statistics obtained from random sampling are:
$n_1 = 8$, $\bar{x}_1 = 93$, $s_1 = 20$
$n_2 = 9$, $\bar{x}_2 = 129$, $s_2 = 24$
• It is thought that $\mu_1 < \mu_2$. Test the appropriate hypothesis assuming normality, with $\alpha = 0.01$.

SOLUTION
• $\sigma_1$ and $\sigma_2$ are unknown ⇒ t-test.
• Because $s_1$ and $s_2$ are not much different from each other, use the equal-variance t-test.
$H_0: \mu_1 = \mu_2$
$H_A: \mu_1 < \mu_2$ (or $\mu_1 - \mu_2 < 0$)
$s_p^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \dfrac{(7)20^2 + (8)24^2}{8 + 9 - 2} \approx 494$
$t = \dfrac{(\bar{x}_1 - \bar{x}_2) - 0}{\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \dfrac{(93 - 129) - 0}{\sqrt{494\left(\frac{1}{8} + \frac{1}{9}\right)}} = -3.33$
• Decision rule: Reject $H_0$ if $t < -t_{0.01,\,8+9-2} = -2.602$.
• Conclusion: Since $t = -3.33 < -t_{0.01,\,8+9-2} = -2.602$, reject $H_0$ at $\alpha = 0.01$.

Test Statistic for $\mu_1 - \mu_2$ when $\sigma_1 \neq \sigma_2$ and unknown
• Test statistic:
$t = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$
with degrees of freedom
$\nu = \dfrac{\left(s_1^2/n_1 + s_2^2/n_2\right)^2}{\frac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \frac{\left(s_2^2/n_2\right)^2}{n_2 - 1}}$

Inference about $\mu_1 - \mu_2$: Unequal variances
Conduct a hypothesis test as needed, or build a confidence interval:
$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,\nu}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$, where $1 - \alpha$ is the confidence level.

Which case to use: Equal variance or unequal variance?
• Whenever there is insufficient evidence that the variances are unequal, it is preferable to perform the equal-variances t-test.
• This is because, for any two given samples, the number of degrees of freedom for the equal-variances case is at least as large as the number of degrees of freedom for the unequal-variances case.

Example: Making an inference about $\mu_1 - \mu_2$
– Do people who eat high-fiber cereal for breakfast consume, on average, fewer calories for lunch than people who do not eat high-fiber cereal for breakfast?
– A sample of 30 people was randomly drawn. Each person was identified as a consumer or a non-consumer of high-fiber cereal.
– For each person the number of calories consumed at lunch was recorded.

Consumers   Non-cmrs
568         705
498         819
589         706
681         509
540         613
646         582
636         601
739         608
539         787
596         573
607         428
529         754
637         741
617         628
633         537
555         748
. . .       . . .

Solution:
• The data are interval.
• The parameter to be tested is the difference between two means.
• The claim to be tested is: the mean caloric intake of consumers ($\mu_1$) is less than that of non-consumers ($\mu_2$).
• The hypotheses are:
$H_0: (\mu_1 - \mu_2) = 0$
$H_1: (\mu_1 - \mu_2) < 0$
– To check whether the population variances are equal, we use computer output to find the sample variances. We have $s_1^2 = 1274.49$ and $s_2^2 = 13{,}386.49$.
– It appears that the variances are unequal.
Example: Making an inference about $\mu_1 - \mu_2$ (continued)
• Compute: Manually
– From the data we have: $\bar{x}_1 = 595.8$, $\bar{x}_2 = 661.1$, $s_1 = 35.7$, $s_2 = 115.7$, with $n_1 = 10$ and $n_2 = 20$.
$\nu = \dfrac{\left(35.7^2/10 + 115.7^2/20\right)^2}{\frac{\left(35.7^2/10\right)^2}{10 - 1} + \frac{\left(115.7^2/20\right)^2}{20 - 1}} = 25.01$
– The rejection region is $t < -t_{\alpha,\nu} = -t_{.05,25} \approx -1.708$.
$t = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \dfrac{(595.8 - 661.1) - 0}{\sqrt{\frac{35.7^2}{10} + \frac{115.7^2}{20}}} = -2.31$

MINITAB OUTPUT
Two Sample T-Test and Confidence Interval
Two sample T for Consumers vs Non-cmrs
             N    Mean    StDev   SE Mean
Consumers    10   595.8   35.7    11
Non-cmrs     20   661     116     26
95% C.I. for mu Consumers - mu Non-cmrs: (-123, -7)
T-Test mu Consumers = mu Non-cmrs (vs <): T = -2.31, p-value = 0.015, DF = 25

• Compute: Manually
The confidence interval estimator for the difference between two means is
$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,\nu}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = (604.02 - 633.239) \pm 1.9796\sqrt{\frac{4103}{43} + \frac{10670}{107}} = -29.21 \pm 27.65 = [-56.86,\ -1.56]$
(These figures correspond to samples of 43 consumers and 107 non-consumers, not to the $n_1 = 10$, $n_2 = 20$ summary used above.)

Example
• An ergonomic chair can be assembled using two different sets of operations (Method A and Method B).
– The operations manager would like to know whether the assembly times under the two methods differ.
– Two samples are randomly and independently selected:
• A sample of 25 workers assembled the chair using Method A.
• A sample of 25 workers assembled the chair using Method B.
• The assembly times were recorded.
– Do the assembly times of the two methods differ?

Assembly times in minutes
Method A   Method B
6.8        5.2
5.0        6.7
7.9        5.7
5.2        6.6
7.6        8.5
5.0        6.5
5.9        5.9
5.2        6.7
6.5        6.6
. . .      . . .

Solution
• The data are interval.
• The parameter of interest is the difference between two population means.
• The claim to be tested is whether a difference between the two methods exists.

Solution: Making an inference about $\mu_1 - \mu_2$
• Compute: Manually
– The hypotheses are:
$H_0: (\mu_1 - \mu_2) = 0$
$H_1: (\mu_1 - \mu_2) \neq 0$
– To check whether the two unknown population variances are equal we calculate $s_1^2$ and $s_2^2$.
– We have $s_1^2 = 0.8478$ and $s_2^2 = 1.3031$.
– The two population variances appear to be equal.
– To calculate the t-statistic we have: $\bar{x}_1 = 6.288$, $\bar{x}_2 = 6.016$, $s_1^2 = 0.8478$, $s_2^2 = 1.3031$
$s_p^2 = \dfrac{(25 - 1)(0.848) + (25 - 1)(1.303)}{25 + 25 - 2} = 1.076$
$t = \dfrac{(6.288 - 6.016) - 0}{\sqrt{1.076\left(\frac{1}{25} + \frac{1}{25}\right)}} = 0.93$, d.f. $= 25 + 25 - 2 = 48$
• For $\alpha = 0.05$, the rejection region is $t < -t_{\alpha/2,\nu} = -t_{.025,48} \approx -2.009$ or $t > t_{\alpha/2,\nu} = t_{.025,48} \approx 2.009$.
• CONCLUSION: Since $-2.009 < t = 0.93 < 2.009$, there is insufficient evidence to reject the null hypothesis.
[Figure: t distribution with rejection regions below -2.009 and above 2.009; the observed t = 0.93 lies between them.]

Excel output
t-Test: Two-Sample Assuming Equal Variances
                               Method A   Method B
Mean                           6.29       6.02
Variance                       0.8478     1.3031
Observations                   25         25
Pooled Variance                1.08
Hypothesized Mean Difference   0
df                             48
t Stat                         0.93
P(T<=t) one-tail               0.1792
t Critical one-tail            1.6772
P(T<=t) two-tail               0.3584
t Critical two-tail            2.0106
The two-tail p-value .3584 > .05, and -2.0106 < .93 < +2.0106.
• Conclusion: There is insufficient evidence to infer at the 5% significance level that the two assembly methods differ in terms of assembly time.

A 95% confidence interval for $\mu_1 - \mu_2$ is calculated as follows:
$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\, n_1+n_2-2}\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = (6.288 - 6.016) \pm 2.0106\sqrt{1.075\left(\frac{1}{25} + \frac{1}{25}\right)} = 0.272 \pm 0.5896 = [-0.3176,\ 0.8616]$
Thus, at the 95% confidence level, $-0.3176 < \mu_1 - \mu_2 < 0.8616$.
Notice: zero is included in the confidence interval.

Checking the required conditions for the equal variances case
The data appear to be approximately normal.
[Figure: histograms of assembly times for Design A and Design B.]

ANALYSIS OF PAIRED DATA
• What is a matched pair experiment?
• Why are matched pairs experiments needed?
• How do we deal with data produced in this way?
The following example demonstrates a situation where a matched pair experiment is the correct approach to test the difference between two population means.

Example
– To investigate the job offers obtained by MBA graduates, a study focusing on salaries was conducted.
– In particular, the salaries offered to finance majors were compared to those offered to marketing majors.
– Two random samples of 25 graduates in each discipline were selected, and the highest salary offer was recorded for each one.
– Can we infer that finance majors obtain higher salary offers than do marketing majors among MBAs?

• Solution
– Compare two populations of interval data.
– The parameter tested is $\mu_1 - \mu_2$, where $\mu_1$ is the mean of the highest salary offered to Finance MBAs and $\mu_2$ is the mean of the highest salary offered to Marketing MBAs.
– $H_0: (\mu_1 - \mu_2) = 0$
  $H_1: (\mu_1 - \mu_2) > 0$

Finance   Marketing
61,228    73,361
51,836    36,956
20,620    63,627
73,356    71,069
84,186    40,203
. . .     . . .

• Solution (continued)
From the data we have: $\bar{x}_1 = 65{,}624$, $\bar{x}_2 = 60{,}423$, $s_1^2 = 360{,}433{,}294$, $s_2^2 = 262{,}228{,}559$.
• Let us assume equal variances.

t-Test: Two-Sample Assuming Equal Variances
                               Finance     Marketing
Mean                           65624       60423
Variance                       360433294   262228559
Observations                   25          25
Pooled Variance                311330926
Hypothesized Mean Difference   0
df                             48
t Stat                         1.04
P(T<=t) one-tail               0.1513
t Critical one-tail            1.6772
P(T<=t) two-tail               0.3026
t Critical two-tail            2.0106
There is insufficient evidence to conclude that Finance MBAs are offered higher salaries than Marketing MBAs.
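The two-sample t computations worked through above (pooled variance for the equal-variances case, and the approximate degrees of freedom for the unequal-variances case) can be sketched from summary statistics alone. This is a minimal illustration using only the standard library; the function names `pooled_t` and `welch_t` are hypothetical helpers, not something the slides define, and critical values still have to be read from a t table as in the text.

```python
import math


def pooled_t(xbar1, s1_sq, n1, xbar2, s2_sq, n2, delta0=0.0):
    """Equal-variances case: return (t statistic, degrees of freedom, pooled variance)."""
    df = n1 + n2 - 2
    sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / df
    se = math.sqrt(sp_sq * (1 / n1 + 1 / n2))
    return (xbar1 - xbar2 - delta0) / se, df, sp_sq


def welch_t(xbar1, s1_sq, n1, xbar2, s2_sq, n2, delta0=0.0):
    """Unequal-variances case: return (t statistic, approximate degrees of freedom)."""
    v1, v2 = s1_sq / n1, s2_sq / n2
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return (xbar1 - xbar2 - delta0) / math.sqrt(v1 + v2), df


# Equal-variances example: n1 = 8, x1 = 93, s1 = 20; n2 = 9, x2 = 129, s2 = 24.
t_a, df_a, sp_a = pooled_t(93, 20**2, 8, 129, 24**2, 9)
print(round(t_a, 2), df_a, round(sp_a, 1))      # t about -3.33 with 15 d.f.

# High-fiber cereal example (unequal variances): n1 = 10, n2 = 20.
t_b, df_b = welch_t(595.8, 35.7**2, 10, 661.1, 115.7**2, 20)
print(round(t_b, 2), round(df_b, 2))            # t about -2.31, d.f. about 25.01

# Assembly-time example (equal variances): n1 = n2 = 25.
t_c, df_c, sp_c = pooled_t(6.288, 0.8478, 25, 6.016, 1.3031, 25)
print(round(t_c, 2), df_c, round(sp_c, 3))      # t about 0.93 with 48 d.f.
```

Working from summary statistics keeps the sketch independent of the raw data, which the slides list only in part.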
The effect of a large sample variability
• Question
– The difference between the sample means is 65,624 - 60,423 = 5,201.
– So why could we not reject $H_0$ in favor of $H_1$ ($\mu_1 - \mu_2 > 0$)?
• Answer:
– $s_p^2$ is large (because the sample variances are large): $s_p^2 = 311{,}330{,}926$.
– A large variance reduces the value of the t statistic, and it becomes more difficult to reject $H_0$.
$t = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$

Reducing the variability
The values each sample consists of might markedly vary...
[Figure: the range of observations in sample A and in sample B.]
...but the differences between pairs of observations might be quite close to one another, resulting in a small variability of the differences.
[Figure: the range of the differences, centered near 0.]

Analysis of Paired Data
• Since the difference of the means is equal to the mean of the differences, we can rewrite the hypotheses in terms of $\mu_D$ (the mean of the differences) rather than in terms of $\mu_1 - \mu_2$.
• This formulation has the benefit of a smaller variability.

Group 1        Group 2        Difference
10             12             -2
15             11             +4
Mean1 = 12.5   Mean2 = 11.5   Mean1 - Mean2 = 1; mean of differences = 1

Analysis of Paired Data
• Data are generated from matched pairs, not independent samples.
• Let $X_i$ and $Y_i$ denote the measurements for the i-th subject. Thus, $(X_i, Y_i)$ is a matched pair of observations.
• Denote $D_i = Y_i - X_i$ or $X_i - Y_i$.
• If there are n subjects studied, we have $D_1, D_2, \ldots, D_n$. Then, with $\bar{D}$ (also written $\bar{x}_D$) the sample mean of the differences,
$\bar{D} = \dfrac{\sum_{i=1}^{n} D_i}{n}$ and $s_D^2 = \dfrac{\sum_{i=1}^{n} D_i^2 - n\bar{D}^2}{n - 1} \;\Rightarrow\; s_{\bar{D}}^2 = \dfrac{s_D^2}{n}$

CONFIDENCE INTERVAL FOR $\mu_D = \mu_1 - \mu_2$
• A $100(1 - \alpha)\%$ C.I. for $\mu_D = \mu_1 - \mu_2$ is given by:
$\bar{x}_D \pm t_{\alpha/2,\, n-1}\dfrac{s_D}{\sqrt{n}}$
• For n > 30, we can use z instead of t.

HYPOTHESIS TESTS FOR $\mu_D = \mu_1 - \mu_2$
• The test statistic for testing a hypothesis about $\mu_D$ is given by
$t = \dfrac{\bar{x}_D - \mu_D}{s_D/\sqrt{n}}$
with n - 1 degrees of freedom.

EXAMPLE
• Sample data on attitudes before and after viewing an informational film.

Subject i   Before $X_i$   After $Y_i$   Difference $D_i = Y_i - X_i$
1           41             46.9          5.9
2           60.3           64.5          4.2
3           23.9           33.3          9.4
4           36.2           36            -0.2
5           52.7           43.5          -9.2
6           22.5           56.8          34.3
7           67.5           60.7          -6.8
8           50.3           57.3          7
9           50.9           65.4          14.5
10          24.6           41.9          17.3

• 90% CI for $\mu_D$ (here the mean of the after-minus-before differences, since $D_i = Y_i - X_i$):
$\bar{D} = 7.64$, $s_D = 12.57$
$\bar{D} \pm t_{\alpha/2,\, n-1}\dfrac{s_D}{\sqrt{n}} = 7.64 \pm 1.833\,\dfrac{12.57}{\sqrt{10}}$, where $t_{0.05,\,9} = 1.833$
$0.36 \le \mu_D \le 14.92$
• With 90% confidence, the mean attitude measurement after viewing the film exceeds the mean attitude measurement before viewing by between 0.36 and 14.92 units.

EXAMPLE
• How can we design an experiment to show which of two types of tires is better? Install one type of tire on one (front) wheel and the other type on the other (front) wheel. The average difference in tire lifetime distance (in 1000s of miles) is $\bar{x}_D = 4.55$, with a sample standard deviation of the differences of $s_D = 7.22$.
• There are a total of n = 20 observations.

SOLUTION
$H_0: \mu_D = 0$
$H_A: \mu_D > 0$
• Test statistic:
$t = \dfrac{\bar{x}_D - \mu_D}{s_D/\sqrt{n}} = \dfrac{4.55 - 0}{7.22/\sqrt{20}} = 2.82$
• Reject $H_0$ if $t > t_{.05,19} = 1.729$.
• Conclusion: Reject $H_0$ at $\alpha = 0.05$.

EXAMPLE
• It is claimed that an industrial safety program is effective in reducing the loss of working hours due to factory accidents. The following data were collected on the weekly loss of working hours due to accidents in six plants, both before and after the safety program was instituted.

Loss of working hours
Plant    1    2    3    4    5    6
Before   12   30   15   37   29   15
After    10   29   16   35   26   16
Do the data substantiate the claim? Use $\alpha = 0.05$.

ANSWER
• This is a matched pair experiment because the samples from the two populations are not independent.
Loss of working hours
Difference   2   1   -1   2   3   -1
$\bar{x}_D = 1$, $s_D = 1.67$, $n = 6$
• Let $\mu_1$ denote the average loss of working hours due to factory accidents before the safety program.
• Let $\mu_2$ denote the average loss of working hours due to factory accidents after the safety program.
• Also let $\mu_D = \mu_1 - \mu_2$. Then,
$H_0: \mu_D = 0$
$H_A: \mu_D > 0$
• Test statistic:
$t = \dfrac{\bar{x}_D}{s_D/\sqrt{n}} = \dfrac{1}{1.67/\sqrt{6}} = 1.47$
• Rejection region: $t > t_{\alpha,\, n-1} = t_{0.05,\,5} = 2.015$
• Conclusion: Do not reject $H_0$ at $\alpha = 0.05$ because $t = 1.47 < t_{0.05,\,5} = 2.015$. There is not sufficient evidence to conclude that the mean loss of working hours due to factory accidents is reduced after the safety program.

PAIRED DATA AND TWO SAMPLE t PROCEDURE
• The two-sample t test is based on the assumption of independence.
• In many paired experiments, there is a strong dependence between variables.

Inference About the Difference of Two Population Proportions
Population 1: parameter $p_1$; sample size $n_1$; statistic $\hat{p}_1$
Population 2: parameter $p_2$; sample size $n_2$; statistic $\hat{p}_2$

Inference about the difference between two population proportions
• In this section we deal with two populations whose data are nominal.
• For nominal data we compare the population proportions of the occurrence of a certain event.
• Examples
– Comparing the effectiveness of a new drug versus an older one
– Comparing market share before and after an advertising campaign
– Comparing defective rates between two machines

Parameter and Statistic
• Parameter
– When the data are nominal, we can only count the occurrences of a certain event in the two populations and calculate proportions.
– The parameter is therefore $p_1 - p_2$.
• Statistic
– An unbiased estimator of $p_1 - p_2$ is $\hat{p}_1 - \hat{p}_2$ (the difference between the sample proportions).

Sampling Distribution of $\hat{p}_1 - \hat{p}_2$
• Two random samples are drawn from two populations.
• The number of successes in each sample is recorded.
• The sample proportions are computed.
Sample 1: sample size $n_1$; number of successes $x_1$; sample proportion $\hat{p}_1 = x_1/n_1$
Sample 2: sample size $n_2$; number of successes $x_2$; sample proportion $\hat{p}_2 = x_2/n_2$

SAMPLING DISTRIBUTION OF $\hat{p}_1 - \hat{p}_2$
• A point estimator of $p_1 - p_2$ is $\hat{p}_1 - \hat{p}_2$.
• The sampling distribution of $\hat{p}_1 - \hat{p}_2$ is approximately normal if $n_i p_i > 5$ and $n_i(1 - p_i) > 5$, i = 1, 2.
With $\hat{p}_1 - \hat{p}_2 = \dfrac{x_1}{n_1} - \dfrac{x_2}{n_2}$, the approximate sampling distribution is
$\hat{p}_1 - \hat{p}_2 \sim N\!\left(p_1 - p_2,\ \dfrac{p_1(1 - p_1)}{n_1} + \dfrac{p_2(1 - p_2)}{n_2}\right)$

The z-statistic
$Z = \dfrac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}}}$
Because $p_1$ and $p_2$ are unknown, the standard error must be estimated using the sample proportions. The method depends on the null hypothesis.

Testing $p_1 - p_2$
• There are two cases to consider:
Case 1: $H_0: p_1 - p_2 = 0$
Calculate the pooled proportion $\hat{p} = \dfrac{x_1 + x_2}{n_1 + n_2}$. Then
$Z = \dfrac{(\hat{p}_1 - \hat{p}_2) - 0}{\sqrt{\hat{p}(1 - \hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$
Case 2: $H_0: p_1 - p_2 = D$ (D is not equal to 0)
Do not pool the data: use $\hat{p}_1 = \dfrac{x_1}{n_1}$ and $\hat{p}_2 = \dfrac{x_2}{n_2}$. Then
$Z = \dfrac{(\hat{p}_1 - \hat{p}_2) - D}{\sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}}$

EXAMPLE (CASE 1)
• A manufacturer claims that, compared with his closest competitor, fewer of his employees are union members. Out of 318 of his employees, 117 are unionists. From a sample of 255 of the competitor's labor force, 109 are union members. Perform a test at $\alpha = 0.05$.
• $p_1$: the proportion of the manufacturer's employees that are union members.
• $p_2$: the proportion of his closest competitor's employees that are union members.

SOLUTION
$H_0: p_1 - p_2 = 0$
$H_A: p_1 - p_2 < 0$
$\hat{p}_1 = \dfrac{x_1}{n_1} = \dfrac{117}{318}$ and $\hat{p}_2 = \dfrac{x_2}{n_2} = \dfrac{109}{255}$, so the pooled sample proportion is
$\hat{p} = \dfrac{x_1 + x_2}{n_1 + n_2} = \dfrac{117 + 109}{318 + 255} = 0.39$
Test statistic:
$z = \dfrac{(117/318 - 109/255) - 0}{\sqrt{(0.39)(1 - 0.39)\left(\frac{1}{318} + \frac{1}{255}\right)}} = -1.4518$
• Decision rule: Reject $H_0$ if $z < -z_{0.05} = -1.645$.
• Conclusion: Because $z = -1.4518 > -z_{0.05} = -1.645$, do not reject $H_0$ at $\alpha = 0.05$. The data do not provide sufficient evidence to support the manufacturer's claim.

Testing $p_1 - p_2$ (Case 1)
• The marketing manager needs to decide which of two new packaging designs to adopt, to help improve sales of his company's soap.
– A study is performed in two supermarkets:
• Brightly-colored packaging is distributed in supermarket 1.
• Simple packaging is distributed in supermarket 2.
– The first design is more expensive; therefore, to be financially viable it has to outsell the second design.
• Summary of the experiment results
– Supermarket 1: 180 purchasers of Johnson Brothers soap out of a total of 904
– Supermarket 2: 155 purchasers of Johnson Brothers soap out of a total of 1,038
– Use a 5% significance level and perform a test to find which type of packaging to use.
• Solution
– The problem objective is to compare the population of sales of the two packaging designs.
– The data are nominal (Johnson Brothers or other soap).
– Population 1: purchases at supermarket 1. Population 2: purchases at supermarket 2.
– The hypotheses are
$H_0: p_1 - p_2 = 0$
$H_1: p_1 - p_2 > 0$
– We identify this application as case 1.
• Compute: Manually
– For a 5% significance level the rejection region is $z > z_\alpha = z_{.05} = 1.645$.
– The sample proportions are $\hat{p}_1 = 180/904 = .1991$ and $\hat{p}_2 = 155/1{,}038 = .1493$.
– The pooled proportion is $\hat{p} = (x_1 + x_2)/(n_1 + n_2) = (180 + 155)/(904 + 1{,}038) = .1725$.
– The z statistic becomes
$Z = \dfrac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\hat{p}(1 - \hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \dfrac{.1991 - .1493}{\sqrt{.1725(1 - .1725)\left(\frac{1}{904} + \frac{1}{1{,}038}\right)}} = 2.90$
• Excel (Data Analysis Plus)
z-Test: Two Proportions
                          Supermarket 1   Supermarket 2
Sample Proportions        0.1991          0.1493
Observations              904             1038
Hypothesized Difference   0
z Stat                    2.90
P(Z<=z) one tail          0.0019
z Critical one-tail       1.6449
P(Z<=z) two-tail          0.0038
z Critical two-tail       1.96
Conclusion: There is sufficient evidence to conclude, at the 5% significance level, that the brightly-colored design will outsell the simple design.

Testing $p_1 - p_2$ (Case 2)
– The bath soap of the Johnson Brothers Company is not selling well. Hoping to improve sales, the company's advertising agency developed two new designs. The first design features several bright colors, and the second design is light green with the company's logo on it. Management needs to decide which of the two new packaging designs to adopt.
– A study is performed in two supermarkets.
– For the brightly-colored design to be financially viable it has to outsell the simple design by at least 3%.
• Summary of the experiment results
– Supermarket 1: 180 purchasers of Johnson Brothers' soap out of a total of 904
– Supermarket 2: 155 purchasers of Johnson Brothers' soap out of a total of 1,038
– Use a 5% significance level and perform a test to find which type of packaging to use.
• Solution
– The hypotheses to test are
$H_0: p_1 - p_2 = .03$
$H_1: p_1 - p_2 > .03$
– We identify this application as case 2 (the hypothesized difference is not equal to zero).
• Compute: Manually
$Z = \dfrac{(\hat{p}_1 - \hat{p}_2) - D}{\sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}} = \dfrac{\left(\frac{180}{904} - \frac{155}{1{,}038}\right) - .03}{\sqrt{\frac{.1991(1 - .1991)}{904} + \frac{.1493(1 - .1493)}{1{,}038}}} = 1.15$
The rejection region is $z > z_\alpha = z_{.05} = 1.645$.
Conclusion: Since 1.15 < 1.645, do not reject the null hypothesis. There is insufficient evidence to infer that the brightly-colored design will outsell the simple design by 3% or more.
• Using Excel (Data Analysis Plus)
z-Test: Two Proportions
                          Supermarket 1   Supermarket 2
Sample Proportions        0.1991          0.1493
Observations              904             1038
Hypothesized Difference   0.03
z Stat                    1.14
P(Z<=z) one tail          0.1261
z Critical one-tail       1.6449
P(Z<=z) two-tail          0.2522
z Critical two-tail       1.96

ESTIMATING $p_1 - p_2$
$100(1 - \alpha)\%$ confidence interval for $p_1 - p_2$:
$(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\sqrt{\dfrac{\hat{p}_1\hat{q}_1}{n_1} + \dfrac{\hat{p}_2\hat{q}_2}{n_2}}$, where $\hat{q}_i = 1 - \hat{p}_i$.

EXAMPLE
• An antibiotic for pneumonia was injected into 100 patients with kidney malfunctions (called uremic patients) and 100 patients with no kidney malfunctions (called normal patients). Some allergic reaction developed in 38 of the uremic patients and 21 of the normal patients.
a) Do the data provide strong evidence that the rate of incidence of allergic reaction to the antibiotic is higher in uremic patients than in normal patients?
• Let $p_1$ be the rate of incidence of allergic reaction to the antibiotic in uremic patients and $p_2$ the rate of incidence of allergic reaction to the antibiotic in normal patients.
b) Construct a 95% confidence interval for the difference between the population proportions and interpret the result.
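The two-proportion procedures above (pooled standard error when the hypothesized difference is 0, unpooled otherwise, and the confidence interval for $p_1 - p_2$) can be sketched as follows. This is a minimal illustration using only the standard library; the function names `two_prop_z` and `two_prop_ci` are hypothetical, and the setup of the antibiotic example is one possible reading of parts (a) and (b), not a solution given in the slides. The critical value 1.96 is the two-tail 5% value quoted in the Excel output.

```python
import math


def two_prop_z(x1, n1, x2, n2, d0=0.0):
    """z statistic for H0: p1 - p2 = d0.

    Case 1 (d0 == 0): pool the two samples to estimate the standard error.
    Case 2 (d0 != 0): use the unpooled standard error.
    """
    p1, p2 = x1 / n1, x2 / n2
    if d0 == 0:
        p = (x1 + x2) / (n1 + n2)
        se = math.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    else:
        se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2 - d0) / se


def two_prop_ci(x1, n1, x2, n2, z_crit):
    """Confidence interval for p1 - p2 using the unpooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    half_width = z_crit * math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return p1 - p2 - half_width, p1 - p2 + half_width


# Union membership example (case 1). Unrounded arithmetic gives about -1.45;
# the slide's -1.4518 uses the pooled proportion rounded to 0.39.
z_union = two_prop_z(117, 318, 109, 255)

# Soap packaging example: case 1 (d0 = 0) and case 2 (d0 = 0.03).
z_soap_case1 = two_prop_z(180, 904, 155, 1038)
z_soap_case2 = two_prop_z(180, 904, 155, 1038, d0=0.03)

# Antibiotic example: (a) one-sided test of p1 > p2, (b) 95% interval.
z_antibiotic = two_prop_z(38, 100, 21, 100)
ci_antibiotic = two_prop_ci(38, 100, 21, 100, z_crit=1.96)

print(round(z_union, 2), round(z_soap_case1, 2), round(z_soap_case2, 2))
print(round(z_antibiotic, 2), tuple(round(v, 3) for v in ci_antibiotic))
```

For part (a), the computed z would be compared with the one-tail critical value (1.645 at the 5% level) in the same way as in the soap example.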