10/2/2017

Introduction
Each day the results of some new study appear in the media with such headlines as
◦ Lung cancer deaths surge among women
◦ Passive smoking causes heart disease
◦ Hard workout raises heart attack risk
◦ New soldering process produces reliable hardware without harming the ozone.

Experimental & Survey Design
Sometimes the study is based on simple observations, but the results of such observational studies are often disputed.
For the results of a study to be credible, the data used must be obtained following established scientific procedures (including the proper use of statistical methods).
Then the study gains credibility, because other researchers can repeat the experiment and reproduce the results (subject only to sampling variability).

Experimental Design
To obtain objective and credible results from the study, the researchers must design an experiment that is sensitive to the purpose of their particular study.
An experimental design is a set of plans and instructions by which the data in an experiment are collected, which should include:
◦ 1. A clear statement of the problem to be addressed
◦ 2. A determination of the experimental factors to be studied
◦ 3. An identification of the levels of each experimental factor
◦ 4. An identification of the response (outcome) of the experiment
◦ 5. A statement of how the experiment will be conducted
◦ 6. Appropriate methods of analysis
When should we do Experimental Design?

Example 1
A dietician wants to conduct an experiment to determine if either a diet of oat bran or a diet of oatmeal is effective in reducing serum cholesterol levels. The dietician plans to randomly select 30 people whose cholesterol exceeds the desirable level of 200 mg/dL. These individuals will be randomly assigned to one of the two diets (15 on each diet) for 12 weeks.
The dietician also wants to determine whether cholesterol levels are affected by the amount of the daily intake of either diet and considers the amounts of 30 oz, 60 oz and 90 oz to be tried in the experiment. In addition, the interaction between the diet and the amount of intake is also of interest to the dietician.

Terms and Concepts
Experimental units (subjects) – The basic units for which response measurements are collected.
Factors – Distinct types of conditions that are manipulated on the experimental units.
Levels – The different modes of presence of a factor.
Treatment – A specific combination of the levels of different factors.
Replicates – The experimental units to which a particular treatment is applied.
Interaction – The phenomenon that the variation in the response caused by one factor is affected by the level of another factor.

Analysis of variance (ANOVA)
Main statistical method used in the analysis of the data from experimental designs.
Technique used to test for the significance of the difference between more than two sample means and to make inferences about whether our samples are drawn from populations having the same mean.
ANOVA is one of the most powerful statistical methods developed by R. A. Fisher.

Analysis of variance (ANOVA)
The basic idea of ANOVA is to find out how the variation in the response is influenced by the factors of interest.
Generally, if a factor makes a large contribution to the response variation, then it is a significant factor that influences the response; otherwise it may be ignored or dropped from the experiment.
For example, if the cholesterol level of the people on the oat bran diet differs greatly from that of the people on the oatmeal diet, then diet is a significant factor influencing the cholesterol level. Otherwise, if the cholesterol level differs little between the two diets, then the effect of diet is negligible, since it does not really matter much which diet is used.

Assumptions in ANOVA
• Each sample is drawn from a normal population and the sample statistics tend to reflect the characteristics of the population.
• Within each group/population, the response variable is normally distributed.
• The populations from which the samples are drawn have identical variances; identical means is the null hypothesis being tested rather than an assumption, i.e.
• μ1 = μ2 = μ3 = … = μn (under H0)
• σ1 = σ2 = σ3 = … = σn

Applications of experimental designs:
Experimental design has found broad applications in many areas. For example, in the engineering world, experimental design is a critically important tool for improving the performance of a manufacturing process and for developing new processes. It can result in
◦ improved process yields;
◦ reduced variability and closer conformance to nominal or target requirements;
◦ reduced development time and overall costs; and so on.

Hypotheses of ANOVA
H0: The (population) means of all groups under consideration are equal, i.e. the (population) means are NOT significantly different.
H1: The (population) means are not all equal, i.e. the (population) means are significantly different.
H0 = Null Hypothesis
H1 = Alternative Hypothesis

Decision Rule of ANOVA
Decision: Whether or not to reject the null hypothesis.
If the significance value (which is usually labelled sig. or p-value in the ANOVA table) is less than alpha (α), reject H0; if it's greater than alpha, do not reject H0.

ANOVA - Steps
Step 1: Check the mean column from the Descriptives table (Are the group means different?)
Step 2: From the ANOVA table, check the sig. value/p-value (if the sig. value is less than 0.05, we reject H0)
Step 3: Check the F-value (ratio) (if the F-value > F critical value, there is a statistically significant difference)

Example: ANOVA
Suppose we have 60 athletes in three groups of 20: marathon runners, swimmers and weight lifters.
Objective: To test if the average BMD* (Bone Mineral Density) is different for the three groups.
Therefore we conduct an ANOVA test.
*A Bone Mineral Density (BMD) test uses X-rays to measure how many grams of calcium and other bone minerals are packed into a segment of bone.

ANOVA Table
Descriptives (BMD = BONE MINERAL DENSITY)

Group          N    Mean     Std. Deviation   Std. Error   95% CI Lower Bound   95% CI Upper Bound   Minimum   Maximum
Marathon       20   .9235    .19535           .04368       .8321                1.0149               .62       1.25
Swimmer        20   .9740    .29605           .06620       .8354                1.1126               .22       1.50
Weight Lifter  20   1.2100   .30253           .06765       1.0684               1.3516               .62       1.79
Total          60   1.0358   .29299           .03783       .9601                1.1115               .22       1.79

ANOVA (BMD)

Source           Sum of Squares   df   Mean Square   F       Sig.
Between Groups   .936             2    .468          6.457   .003
Within Groups    4.129            57   .072
Total            5.065            59

Interpretation of ANOVA
Descriptives Table: These are group statistics, which provide the means and standard deviations of the groups.
What is the sample size?
What can we conclude about the results?
ANOVA Table: Note that you do not need to look up a critical value for F to decide whether to reject the null hypothesis. Instead, just compare the "Sig." value to alpha (which is usually .05).
Decision Rule: If the significance value is less than alpha, reject H0; if it's greater than alpha, do not reject H0. So, in this case, because the significance value of 0.003 is less than alpha = 0.05, we reject the null hypothesis (i.e., we conclude in favour of H1). We would report the results of this ANOVA by saying something like, "The average BMD differed significantly between the groups, F(2, 57) = 6.457, p = 0.003."
If the sig. value were 0.097, what would your decision be?

Exercise
EcoPaper Company Ltd makes grocery bags. They are interested in increasing the tensile strength of their product. It is thought that strength is a function of the hardwood concentration in the pulp.
An investigation is carried out to compare four levels of hardwood concentration: 5%, 10%, 15% and 20%. Six test specimens are made at each level and all 24 specimens are then tested in random order.
We conduct the ANOVA test and the results are as follows.

Exercise I

Interpretation of ANOVA
Descriptives Table: These are group statistics, which provide the means and standard deviations of the groups. It indicates that the group means are different. On average, tensile strength is highest at 20% hardwood concentration (21.17) and lowest at 5% hardwood concentration (10.0). It can be said that the strength increases with concentration.
ANOVA Table: Note that you do not need to look up a critical value for F to decide whether to reject the null hypothesis. Instead, just compare the "Sig." value to alpha (which is usually .05).
Decision Rule: If the significance value is less than alpha, reject H0; if it's greater than alpha, do not reject H0. So, in this case, because the significance value of 0.001 is less than alpha = 0.05, we reject the null hypothesis (i.e., we conclude in favour of H1). We would report the results of this ANOVA by saying something like, "The average tensile strength differed significantly between the hardwood concentration levels, F(3, 20) = 19.605, p = 0.001."
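
The F ratio in the BMD ANOVA table can be reproduced from the Descriptives table alone. The sketch below assumes a one-way ANOVA and uses only the group sizes, means and standard deviations reported for the three athlete groups. The function name `one_way_anova_from_summary` and the dictionary layout are illustrative choices, not part of the original slides.

```python
# Summary statistics copied from the BMD Descriptives table:
# group name -> (n, mean, standard deviation)
groups = {
    "marathon": (20, 0.9235, 0.19535),
    "swimmer": (20, 0.9740, 0.29605),
    "weight lifter": (20, 1.2100, 0.30253),
}


def one_way_anova_from_summary(groups):
    """Compute the one-way ANOVA table entries from per-group (n, mean, sd)."""
    total_n = sum(n for n, _, _ in groups.values())
    # Grand mean is the size-weighted average of the group means.
    grand_mean = sum(n * mean for n, mean, _ in groups.values()) / total_n

    # Between-groups variation: how far each group mean is from the grand mean.
    ss_between = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups.values())
    # Within-groups variation: pooled from each group's sample variance.
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups.values())

    df_between = len(groups) - 1
    df_within = total_n - len(groups)
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within

    return {
        "ss_between": ss_between,
        "ss_within": ss_within,
        "df_between": df_between,
        "df_within": df_within,
        "ms_between": ms_between,
        "ms_within": ms_within,
        "F": ms_between / ms_within,
    }


result = one_way_anova_from_summary(groups)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

Up to rounding, the printed sums of squares, degrees of freedom and F should agree with the ANOVA table for BMD (.936, 4.129, 2, 57 and 6.457). The Sig. value is not computed here because it requires the F distribution, which the Python standard library does not provide.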
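
The decision rule used in both interpretations can be written as a one-line comparison. This is a minimal sketch: the function name `anova_decision` and the default alpha of 0.05 are assumptions made for illustration, and the three values passed in are the Sig. values mentioned in the slides.

```python
def anova_decision(sig, alpha=0.05):
    """Apply the slide's decision rule: reject H0 only when sig is below alpha."""
    return "reject H0" if sig < alpha else "do not reject H0"


# Sig. values that appear in the slides: the BMD example, the follow-up
# question, and the hardwood concentration exercise.
for sig in (0.003, 0.097, 0.001):
    print(f"sig = {sig}: {anova_decision(sig)}")
```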
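
The terms factor, level and treatment can be made concrete with the dietician's experiment from Example 1. The sketch below lists the treatments as every combination of diet and daily amount, then randomly assigns 30 subjects to them. The slides state only that 15 people go on each diet, so the equal split of 5 subjects per treatment, the subject IDs 1 to 30, the fixed seed and the helper name `assign_subjects` are all assumptions made for illustration.

```python
import itertools
import random

# Two factors and their levels, taken from Example 1.
diets = ["oat bran", "oatmeal"]
amounts_oz = [30, 60, 90]

# A treatment is one combination of levels: 2 diets x 3 amounts = 6 treatments.
treatments = list(itertools.product(diets, amounts_oz))


def assign_subjects(subject_ids, treatments, seed=0):
    """Shuffle the subjects and split them evenly across the treatments.

    Assumes the number of subjects is a multiple of the number of treatments.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    per_treatment = len(shuffled) // len(treatments)
    return {
        treatment: shuffled[i * per_treatment:(i + 1) * per_treatment]
        for i, treatment in enumerate(treatments)
    }


# Hypothetical subject IDs 1..30 stand in for the 30 selected people.
assignment = assign_subjects(range(1, 31), treatments)
for (diet, amount), subjects in assignment.items():
    print(f"{diet}, {amount} oz: subjects {sorted(subjects)}")
```

With this layout each treatment has 5 replicates and each diet still receives 15 subjects, which is what makes it possible to examine the diet-by-amount interaction as well as the two main effects.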