Correlation and regression problems

1. The following data represent the years of experience X and the salary Y (in thousands of dollars) of a random sample of professional engineers (n = 27).

   X    1   2   2   3   3   4   4   6   6
   Y   25  28  37  31  40  36  43  40  43

   X    6   7   7   9   9   9  12  12  13
   Y   51  42  55  41  51  60  50  64  52

   X   14  18  18  20  21  22  22  25  25
   Y   66  53  70  70  60  61  71  58  74

   a. Find the coefficient of correlation r.
   b. Is there sufficient sample evidence to indicate a linear correlation between years of experience and salary?
   c. Test the hypothesis that ρ = 0.6 against the alternative ρ > 0.6. Use α = 0.01.

2. A chemical engineer is studying the effect of temperature on the yield of a certain product in a chemical process. The process is run 10 times, and the following data are observed for the temperature X of each run and the corresponding yield Y.

   Temperature X (in °C)   95  110  118  124  145  140  185  190  205  222
   Yield Y (in kg)        108  126  102  121  118  155  158  178  159  184

   The following information is available:
   ∑x = 1534, ∑x² = 252,684, ∑y = 1409, ∑y² = 206,319, ∑xy = 226,463

   a. Find the coefficient of correlation r.
   b. Is there sufficient sample evidence to indicate a linear correlation between temperature and the yield of the chemical product?
   c. Test the hypothesis that ρ = 0.8 against the alternative ρ ≠ 0.8. Use α = 0.05.

3. In the following data, X represents the number of years of formal education and Y represents the salary (in thousands of dollars) of a random sample of adult males.

   X   13    17    9     18    16    18    13    16
   Y   21.6  25.8  15.9  48.3  38.2  56.4  28.4  43.3

   a. Plot the scatter diagram.
   b. Compute the coefficient of correlation and test the hypothesis H₀: ρ = 0 against H₁: ρ ≠ 0.
   c. Find the coefficients of the least-squares line and write the equation of the estimated regression line.
   d. Compute the mean square error.
   e. Find 95% confidence intervals for β₁ and β₀.
   f. Test the hypothesis that β₀ = 0 against the alternative β₀ ≠ 0.
   g. Test the hypothesis that β₁ = 0 against the alternative β₁ ≠ 0.
   h. Estimate the true mean response μY|x at x = 16 years of education.
   i. Predict the salary Y corresponding to 16 years of education.
   j. Find the coefficient of determination and explain its meaning.
   k. Construct the ANOVA table for the regression.

4. The table below displays the mathematics achievement test scores for a random sample of n = 10 students, selected from the population of 12th graders, together with their final calculus grades.

   Math achievement test score   39  43  21  64  57  47  28  75  34  52
   Final calculus grade          65  78  52  82  92  89  73  98  56  75

   a. Find the coefficients of the least-squares line and write the equation of the estimated regression line.
   b. Determine whether there is a significant relationship between the calculus grades and the test scores. Use α = 0.05.
   c. Find a 95% confidence interval for the slope of the regression line.
   d. Estimate the average calculus grade for students whose achievement score is 50, with a 95% confidence interval.
   e. A student took the achievement test, but has not yet taken the calculus test. Predict the calculus grade for this student with a 95% prediction interval.
   f. Find a 95% confidence interval for the intercept of the regression line.
   g. Calculate the coefficient of determination and explain its meaning.
   h. Construct the ANOVA table for the regression.

5. You are given a data set with 5 pairs of x-values and y-values. We assume that x is the independent variable and y is the dependent variable.

   x   -2  -1   0   1   2
   y    1   1   3   5   5

   a. Find the least-squares line for the data.
   b. Do the data present sufficient evidence to indicate that y and x are linearly related?
   c. Construct the ANOVA table for the linear regression and use it to calculate F = MSR/MSE. Verify that the square of the test statistic used in part b equals F, and that the square of its critical value coincides with the critical value for F at α = 0.05.
   d. Find a 90% confidence interval for the slope of the line.
   e. Estimate the average value of y when x = 1, using a 90% confidence interval.
   f. Find a 90% prediction interval for a future value of y when x = 1.

6. The data, together with a portion of a Minitab printout, are given below. Some entries in the printout are missing.

   x   1    2    3    4    5    6
   y   5.6  4.6  4.5  3.7  3.2  2.7

   a. Find the least-squares line for the data.
   b. Fill in the missing entries in the Minitab analysis of variance table.
   c. Do the data present sufficient evidence to indicate that y and x are linearly related? Use the information in the Minitab printout to answer this question at the 1% level of significance.
   d. Find the coefficient of determination.
   e. Find a 90% confidence interval for the slope of the line.
   f. Estimate the average value of y when x = 2, using a 90% confidence interval.
   g. Find a 95% prediction interval for a future value of y when x = 2.

   Predictor       Coef    St. dev        t      P
   Constant      6.0000     0.1759    34.10  0.000
   X           -0.55714    0.04518   -12.33  0.000

   S = 0.1890    R-sq = 97.4%

   Analysis of variance
   Source            DF       SS       MS
   Regression         *        *   5.4321
   Residual error     *   0.1429        *
   Total              *   5.5750
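The printed entries in Problem 6 can be cross-checked numerically. The following is a minimal sketch, not a prescribed solution method: the helper name `simple_regression` is hypothetical, and the y-values are assumed to be 5.6, 4.6, 4.5, 3.7, 3.2, 2.7, the set that reproduces the printed coefficients (6.0000 and -0.55714) and the total sum of squares (5.5750).

```python
import math

# Problem 6 data. ASSUMPTION: the last y-value is taken as 2.7, the value
# that reproduces the printed coefficients and the total sum of squares.
x = [1, 2, 3, 4, 5, 6]
y = [5.6, 4.6, 4.5, 3.7, 3.2, 2.7]


def simple_regression(x, y):
    """Return least-squares estimates and ANOVA quantities (hypothetical helper)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Corrected sums of squares and cross-products.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx               # slope
    b0 = y_bar - b1 * x_bar      # intercept
    ssr = b1 * sxy               # regression sum of squares (1 df)
    sse = syy - ssr              # residual sum of squares (n - 2 df)
    mse = sse / (n - 2)
    return {
        "b0": b0,
        "b1": b1,
        "SSR": ssr,
        "SSE": sse,
        "SST": syy,
        "MSE": mse,
        "s": math.sqrt(mse),
        "se_b1": math.sqrt(mse / sxx),
        "r_sq": ssr / syy,
        "F": ssr / mse,
    }


fit = simple_regression(x, y)
for name, value in fit.items():
    print(f"{name:6s} {value:10.4f}")
```

Because the regression has one degree of freedom, MS(Regression) equals SS(Regression), and F is the square of the t statistic for the slope, which gives a quick consistency check on any filled-in table.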
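For Problem 2, the correlation coefficient can be obtained directly from the five sums quoted in the problem. The sketch below illustrates that computation, together with the usual t statistic for H₀: ρ = 0. The Fisher z-transformation used for the ρ = 0.8 test is an assumption about the intended method, since the problem does not name one, and all variable names are illustrative.

```python
import math

# Summary statistics quoted in Problem 2 (n = 10 runs).
n = 10
sum_x, sum_x2 = 1534, 252_684
sum_y, sum_y2 = 1409, 206_319
sum_xy = 226_463

# Corrected sums of squares and cross-products.
sxx = sum_x2 - sum_x ** 2 / n
syy = sum_y2 - sum_y ** 2 / n
sxy = sum_xy - sum_x * sum_y / n

# Sample correlation coefficient.
r = sxy / math.sqrt(sxx * syy)

# t statistic for H0: rho = 0, compared against t with n - 2 degrees of freedom.
t_stat = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# ASSUMPTION: Fisher z-transformation for H0: rho = 0.8; the statistic is
# compared against the standard normal distribution.
rho_0 = 0.8
z_stat = (math.atanh(r) - math.atanh(rho_0)) * math.sqrt(n - 3)

print(f"r = {r:.4f}")
print(f"t = {t_stat:.3f}  (df = {n - 2})")
print(f"z = {z_stat:.3f}")
```

Working from the sums avoids re-entering the raw data; the same three corrected quantities (sxx, syy, sxy) also give the least-squares slope as sxy / sxx if a regression line is needed.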