STAT 2610 Spring 2007 Make-up Exam 1          Name___________________________________

Answer all 30 questions on the exam by selecting the best alternative for the multiple-choice questions or writing your answer in the space provided for the open-ended ones. Show working wherever appropriate.

1. A survey of autos parked in student and staff lots at a large college recorded the make, country of origin, type of vehicle (car, SUV, etc.) and age. Classify the make.
a. Categorical
b. Quantitative

2. [question text missing]
a. Quantitative
b. Categorical

3. [question text missing]
a. number of homeruns in a professional baseball player's career
b. type of fish caught
c. number of pars in a round of golf
d. daily high temperature in New York City
e. brand of tennis shoe

A nurse measured the blood pressure of each person who visited her clinic. Following is a relative-frequency histogram for the systolic blood pressure readings for those people aged between 25 and 40. Use the histogram to answer the question. The blood pressure readings were given to the nearest whole number.
[histogram not reproduced]

4. Given that 200 people were aged between 25 and 40, approximately how many had a systolic blood pressure reading less than 130?

5. Identify the overall shape of the distribution of systolic blood pressure in the histogram below.
Shape__________________

6. Fill in the blanks. The two primary graphical displays for summarizing a categorical variable are the ____________________ and the ____________________.

7. For the distribution drawn here, identify the mean, median, and mode.
[figure not reproduced]
a. A = mode, B = median, C = mean
b. A = mode, B = mean, C = median
c. A = median, B = mode, C = mean
d. A = mean, B = mode, C = median
e. A = median, B = mean, C = mode

8. Which of the following numerical summary measures cannot be negative?
a. mode
b. mean
c. Q3
d. z-score
e. standard deviation

9. The average IQ of students in a particular calculus class is 110, with a standard deviation of 5. The distribution is roughly bell-shaped. Use the Empirical Rule to find the percentage of students with an IQ above 120.
Percentage = ______________

10. SAT verbal scores are normally distributed with a mean of 433 and a standard deviation of 90. Use the Empirical Rule to determine what percent of the scores lie between 433 and 523.
Percentage = ______________

11. In a random sample, 10 students were asked to compute the distance they travel one way to school to the nearest tenth of a mile. The data are listed below. Compute the sample standard deviation of the data given that $\sum (x - \bar{x})^2 = 29.916$.
[data not reproduced]
Standard deviation = __________________

12. The test scores of 15 students are listed below. Find the first quartile, Q1.
[data not reproduced]
Q1 = ___________________

13. Fill in the blank: The _________________________ represents the typical distance of a data value from the mean.

14. Hi-Tech Agi Inc wants to determine if the rainfall in inches can be used to predict the yield per acre on a corn farm. The response variable and explanatory variable are respectively
Response Variable __________________________
Explanatory Variable ________________________

15. The partially filled conditional proportion table gives the relative frequencies of the data on age, in years, and sex from the residents of a retirement home.

Age (yrs)   60-69   70-79   Over 79   Total
Male        0.19    0.1     0.11
Female      0.2     0.1     0.3
Total                                 1

What percentage of residents are males and over 79?
Ans = _________________

16. The partially filled conditional proportion table gives the relative frequencies of the data on age, in years, and sex from the residents of a retirement home.

Age (yrs)   60-69   70-79   Over 79   Total
Male        0.19    0.1     0.11
Female      0.2     0.1     0.3
Total                                 1

Of the females, what percentage is in the over 79 age group?
Ans = _________________

17. The relationship between the number of games won by a minor league baseball team and the average attendance at their home games is analyzed. A regression to predict the average attendance from the number of games won has an [value missing]. Interpret this statistic.
a. No association
b. Positive, weak linear relationship. 7.29% of the variation in average attendance is explained by the number of games won.
c. Negative, fairly strong linear relationship. 53.29% of the variation in average attendance is explained by the number of games won.
d. Positive, fairly strong linear relationship. 53.29% of the variation in average attendance is explained by the number of games won.
e. Positive, fairly strong linear relationship. 73% of the variation in average attendance is explained by the number of games won.

Determine whether the scatterplot shows little or no association, a negative association, a linear association, a moderately strong association, or a very strong association, then decide on the most plausible correlation value.

18. [scatterplot not reproduced]
a. 0.7
b. -0.7
c. 1.0
d. -1.0
e. 0

19. Name a display [remainder of question missing]
Display = __________________

20. Name a display that can be used to describe the association between two categorical variables.
Display = __________________

21. The [remainder of question stem missing]
a. depends on the units of measurement of x.
b. depends on the units of measurement of y.
c. does not depend on the units of measurement of y or x.
d. is always a positive number.

22. Nine pairs of data yield r = 0.867 and the regression equation [equation missing]. Also, [value missing]. What is the best predicted value of y for [value missing]?
Predicted Value = ________________

23. A random sample of records of electricity usage of homes in the month of July gives the amount of electricity used and size (in square feet) of 135 homes. A regression was done to predict the amount of electricity used (in kilowatt-hours) from size. Assume that a linear model is appropriate. The model is $\hat{y} = 1229 + 0.02\,\text{size}$. What would a negative residual mean for people living in a house that is 2290 square feet?
a. They are using more electricity than expected from the regression equation.
b. Their house is smaller than expected from the regression equation.
c. They are using less electricity based on the number of houses in the sample.
d. They are using less electricity than expected from the regression equation.
e. Their house is bigger than expected from the regression equation.

24. A random sample of records of electricity usage of homes gives the amount of electricity used and size (in square feet) of 135 homes. A regression to predict the amount of electricity used (in kilowatt-hours) from size has an $r^2 = 0.71$. Assume that a linear model is appropriate. Write a sentence summarizing what $r^2$ says about this regression.
The prediction error __________________________________________________________________________
___________________________________________________________________________________________

25. A random sample of records of electricity usage of homes gives the amount of electricity used in July and size (in square feet) of 135 homes. A regression was done to predict the amount of electricity used (in kilowatt-hours) from size. Assume that a linear model is appropriate. What units does the slope have?
Slope is measured in _______________________________

26. Among the possible lines that can go through data points in a scatterplot, the regression line results from the least squares method and has the smallest value for the sum of the ____________________.

Fill in the missing information.

27. [expression missing] = ______________________

28. The fact that the direction of an association between two variables can change after we include a third variable and analyze the data at separate levels of that variable is known as ___________________________

29. A study shows that the amount of chocolate consumed in Canada and the number of automobile accidents are positively related. Find a lurking variable, if there is one.
a. Population growth
b. Speed
c. Vacation
d. Children
e. No lurking variable

30. True or false. In a recent television survey, participants were asked to answer "yes" or "no" to the question "Are you in favor of the death penalty?". Six thousand five hundred responded "yes" while 3700 responded "no". There was a fifty-cent charge for the call. This technique produces a random sample?
a. True
b. False

1. a
2. a
3. d
4. 150
5. Right skewed
6. pie chart; bar graph
7. a
8. e
9. 2.5%
10. 34%
11. 1.8
12. 57.0
13. standard deviation
14. yield per acre; rainfall in inches
15. 11%
16. 50%
17. d
18. b
19. scatterplot
20. contingency table
21. c
22. 56.6
23. d
24. The prediction error using the regression line to predict electricity use is 71% smaller than the prediction error using $\bar{y}$ to predict it.
25. Slope is measured in kilowatt-hours per square foot.
26. residual sum of squares
27. -168 + 4.4x
28. Simpson's paradox
29. a
30. b
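
The arithmetic behind several of the keyed answers (questions 9, 10, 11, 15, 16, 17 and 23) can be checked with a short script. This is only a sketch under stated assumptions, not part of the exam: it assumes the value 29.916 in question 11 is the sum of squared deviations from the mean, and that the statistic in question 17 is a correlation of r = 0.73 (the exam text as extracted does not show it; 0.73 is inferred from the 53.29% in the keyed option). All variable and function names are illustrative.

```python
import math

# Empirical Rule proportions for a roughly bell-shaped distribution.
WITHIN_1_SD = 0.68
WITHIN_2_SD = 0.95


def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd


# Q9: an IQ of 120 is 2 sd above the mean of 110 (sd = 5).
# About 5% of values fall outside 2 sd, and half of those lie above.
q9_z = z_score(120, 110, 5)
q9_pct = round((1 - WITHIN_2_SD) / 2 * 100, 1)

# Q10: 523 is 1 sd above the mean of 433 (sd = 90).
# About 68% of values fall within 1 sd, and half of those lie above the mean.
q10_z = z_score(523, 433, 90)
q10_pct = round(WITHIN_1_SD / 2 * 100, 1)

# Q11: sample standard deviation s = sqrt(sum of squared deviations / (n - 1)).
# ASSUMPTION: 29.916 is the sum of squared deviations for the n = 10 students.
q11_sd = round(math.sqrt(29.916 / (10 - 1)), 1)

# Q15 and Q16: proportions from the table in the exam.
table = {
    "Male": {"60-69": 0.19, "70-79": 0.10, "Over 79": 0.11},
    "Female": {"60-69": 0.20, "70-79": 0.10, "Over 79": 0.30},
}
# Q15: joint proportion of residents who are male and over 79.
q15_pct = round(table["Male"]["Over 79"] * 100, 1)
# Q16: conditional proportion, over 79 among females only.
female_total = sum(table["Female"].values())
q16_pct = round(table["Female"]["Over 79"] / female_total * 100, 1)

# Q17: squaring the correlation gives the share of variation explained.
# ASSUMPTION: r = 0.73 (not shown in the extracted exam text).
q17_pct = round(0.73 ** 2 * 100, 2)

# Q23: predicted July usage (kWh) for a 2290 square foot home.
# A negative residual (observed - predicted) means observed usage is below this.
q23_predicted = round(1229 + 0.02 * 2290, 1)

print(q9_z, q9_pct)
print(q10_z, q10_pct)
print(q11_sd)
print(q15_pct, q16_pct)
print(q17_pct)
print(q23_predicted)
```

Rounding is applied to each result because the inputs are decimal fractions that binary floating point cannot represent exactly.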