ISDS 361B REVIEW FOR EXAM #1

I. Hypothesis Tests/Confidence Intervals of 1, 2 and More than Two Means

Basic Concepts
• "Can you conclude…HA?" -- Yes if p-value < α = .05
• Requirement (one of the following):
  o Must be sampling from approximately normal distributions or
  o Must take large samples
• Begin by getting z or t-value
  o z or t is the number of standard errors the point estimate is from the hypothesized value, that is:
    z or t = [(Point Estimate) - (Hypothesized Value)] / (Standard Error)
  o σ known (given): use z; σ unknown (not given): use t
• Confidence Interval: x̄ ± (zα/2 or tα/2,n-1) * (Standard Error)

Tests of One Population Mean
• Point estimate = x̄
• Standard Error = σ/√n if σ known or s/√n if σ is unknown
• P-values
  o > Tests
      By hand: Area above z or t value
      Excel z-test: 1-NORMSDIST(z)
      Excel t-test: Assuming t > 0: TDIST(t,df,1)
  o < Tests
      By hand: Area below z or t value
      Excel z-test: NORMSDIST(z)
      Excel t-test: Assuming t < 0: TDIST(-t,df,1)
  o ≠ Tests
      By hand: 2*(Area in the tail)
      Excel z-test: Assuming z > 0: 2*(1-NORMSDIST(z)); Assuming z < 0: 2*NORMSDIST(z)
      Excel t-test: Assuming t > 0: TDIST(t,df,2); Assuming t < 0: TDIST(-t,df,2)
• Confidence Interval: x̄ ± (zα/2 or tα/2,n-1) * (Standard Error)
  o Excel z-interval: x̄ = AVERAGE(..); zα/2 * (standard error) = CONFIDENCE(α, σ, n)
  o Excel t-interval: Go to Descriptive Statistics; x̄ = Mean; tα/2,n-1 * (standard error) = Confidence Level

Tests of the Differences Between 2 Population Means
How to proceed:
1. Determine if data is paired (Paired if problem is set up with something in common between each entry from one sample and a corresponding entry in the second sample)
   If YES - Paired Difference Test/Confidence Interval
2. Do you know the variances? (Variances or standard deviations given)
   If YES - Do z-test/z-interval
3. Can you conclude variances differ? (F-test: low p-value means the σ's differ. p-value for F-test = 2*(one-tail p-value printed by Excel))
   If YES - Do t-test/t-interval with unequal variances
   If NO - Do t-test/t-interval with equal variances

In the Excel confidence intervals below, Mean, Var, Obs, DF and Pooled Var are read from the Excel output.
• Paired Difference
  o Point Estimate: x̄D
  o Standard Error: sD/√nD
  o Excel p-value: t-test Paired 2 Sample for Means
  o Excel Confidence Interval: 1. Create column of differences 2. Descriptive Statistics; (Mean) ± (Confidence Level)
• Known Variances
  o Point Estimate: x̄1 - x̄2
  o Standard Error: √(σ1²/n1 + σ2²/n2)
  o Excel p-value: z-test 2 Sample for Means
  o Excel Confidence Interval: (Mean1 - Mean2) ± NORMSINV(1-α/2)*SQRT(Var1/Obs1+Var2/Obs2)
• Unknown ≠ Variances
  o Point Estimate: x̄1 - x̄2
  o Standard Error: √(s1²/n1 + s2²/n2)
  o Excel p-value: t-test 2 Sample Assuming Unequal Variances
  o Excel Confidence Interval: (Mean1 - Mean2) ± TINV(α,DF)*SQRT(Var1/Obs1+Var2/Obs2)
• Unknown = Variances
  o Point Estimate: x̄1 - x̄2
  o Standard Error: √(sp²*(1/n1 + 1/n2))
  o Excel p-value: t-test 2 Sample Assuming Equal Variances
  o Excel Confidence Interval: (Mean1 - Mean2) ± TINV(α,DF)*SQRT(Pooled Var*(1/Obs1+1/Obs2))

Tests for Differences of More Than 2 Means
• Must Also Assume Variances are Equal
How to proceed: Determine if there is:
• One Factor With No Blocks
  Can we conclude Treatment means vary? - p-value for F-test of MSTr/MSE
  • Excel: Single Factor ANOVA
    Include one row of labels and check Labels
  • Which means differ?
    o Comparing just 2 means
      Two means, μ1 and μ2, differ if |x̄1 - x̄2| > LSD
      LSD = tα/2,DFE * √(MSE*(1/n1 + 1/n2))
      EXCEL LSD = TINV(α,DFE)*SQRT(MSE*(1/n1+1/n2))
    o Confidence interval for μ1 - μ2: (x̄1 - x̄2) ± LSD
    o Comparing all pairs of k different means
      Any pair of means, μi and μj, differ if |x̄i - x̄j| > LSDEW
      LSDEW = tα/2c,DFE * √(MSE*(1/ni + 1/nj)) where c = k(k-1)/2
      EXCEL LSDEW = TINV(α/c,DFE)*SQRT(MSE*(1/ni+1/nj))
• One Factor With Blocks (only one entry for each factor/block pair)
  Can we conclude Treatment Means vary? - p-value for F-test of MSTr/MSE
  Can we conclude Block Means vary? - p-value for F-test of MSB/MSE
  • Excel: Two Factor Without Replication
    Include one row and one column of labels and check Labels
• Two Factor (More than one entry for each combination of the two factors)
  Can we conclude the factors interact to affect changes in mean values? - p-value of F-test for MSI/MSE
    IF p-value low, conclude INTERACTION and STOP; if NOT, proceed to the next steps below
  Can we conclude Factor A alone affects changes in mean values? - p-value of F-test for MSA/MSE
  Can we conclude Factor B alone affects changes in mean values? - p-value of F-test for MSB/MSE
  • Excel: Two Factor With Replication
    Must include one row and one column of labels
    Rows Per Sample = number of entries for each two factor treatment

FORECASTING
• Do a scatterplot to determine if seasonality or cyclical effects exist
• To determine long term trend: Do Regression
  o High p-value for β1: Stationary Model
  o Low p-value for β1: Model with Linear Trend
The following assumes there are t time series values; in the moving average formulas, n is the number of periods averaged.

STATIONARY MODELS
• Last period: F2 = y1 - drag down to Ft+1
• Moving average: Fn+1 = AVERAGE(highlight first n y's) - drag down to Ft+1
• Weighted Moving Average: Fn+1 = wn*yn + wn-1*yn-1 + … + w1*y1 - drag down to Ft+1
• Exponential smoothing: F2 = y1, F3 = α*y2 + (1-α)*F2 - drag down to Ft+1
• All approaches: Ft+k = Ft+1

LINEAR TREND ONLY MODELS
• Regression: F1 = b0+b1*(cell denoting period = 1) - drag down to Ft+1 and beyond
• Holts: L2 = y2, T2 = y2-y1, F3 = L2+T2, L3 = α*y3 + (1-α)*F3, T3 = γ*(L3-L2)+(1-γ)*T2
  o Drag L3, T3, and F3 down to period t
  o Ft+1 = Lt + Tt
  o Ft+2 = (cell with Ft+1) + (absolute reference to cell with Tt)
  o Drag cell containing Ft+2 down to Ft+3 and beyond

LINEAR TREND AND SEASONAL MODELS (with k seasons)
• Regression Approach
  o Add k-1 dummy variables and create columns of 0's and 1's to denote season
  o REGRESSION on period and dummy variables
  o Ft+1 = (Cell containing b0) + (t+1)*(Cell containing b1) + (Cell containing S1)
  o Ft+2 = (Cell containing b0) + (t+2)*(Cell containing b1) + (Cell containing S2)
  o ….
  o Ft+k = (Cell containing b0) + (t+k)*(Cell containing b1)
  o Ft+k+1 = (Cell containing b0) + (t+k+1)*(Cell containing b1) + (Cell containing S1)
  o Ft+k+2 = (Cell containing b0) + (t+k+2)*(Cell containing b1) + (Cell containing S2)
  o ….
  o Ft+2k = (Cell containing b0) + (t+2k)*(Cell containing b1)
  o Etc.
• Decomposition - See PowerPoint

PERFORMANCE MEASURES
• Do for all periods that have both a forecast and a time series value
• MAD: Abs Dev = ABS(y - F) - Then MAD = AVERAGE(these values)
• MSE: Square Error = (y-F)^2 - Then MSE = AVERAGE(these values)
• For either method: Choose the approach with lowest value
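
As a cross-check on the Excel z-test formulas for one population mean (σ known), the same p-values and confidence interval can be sketched in Python. This is only an illustration under assumptions: the function names `z_test` and `z_interval` and the sample numbers are hypothetical, and the standard library's `NormalDist` stands in for NORMSDIST and NORMSINV.

```python
from math import sqrt
from statistics import NormalDist


def z_test(xbar, mu0, sigma, n, tail):
    """Return (z, p_value) for a one-mean z-test.

    tail is '>' , '<' or '!=' to match the >, < and ≠ tests.
    (Hypothetical helper, not part of the course material.)
    """
    standard_error = sigma / sqrt(n)
    # z = number of standard errors the point estimate is from the hypothesized value
    z = (xbar - mu0) / standard_error
    below = NormalDist().cdf(z)          # plays the role of NORMSDIST(z)
    if tail == ">":
        p_value = 1 - below              # area above z
    elif tail == "<":
        p_value = below                  # area below z
    elif tail == "!=":
        p_value = 2 * min(below, 1 - below)  # 2 * (area in the tail)
    else:
        raise ValueError("tail must be '>', '<' or '!='")
    return z, p_value


def z_interval(xbar, sigma, n, alpha=0.05):
    """Return the (lower, upper) confidence interval xbar ± z(alpha/2) * SE."""
    half_width = NormalDist().inv_cdf(1 - alpha / 2) * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width


# Made-up numbers: sample mean 52, hypothesized mean 50, sigma 8, n 64
z, p = z_test(52, 50, 8, 64, ">")
print(round(z, 2), round(p, 4))
print(z_interval(52, 8, 64))
```

The t-test versions are not shown because the Python standard library has no t distribution; a statistics package would be needed for the TDIST and TINV equivalents.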
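
The exponential smoothing recursion (F2 = y1, F3 = α*y2 + (1-α)*F2, dragged down to Ft+1) and the MAD and MSE performance measures can be sketched outside Excel as follows. The function names and the four-value series are hypothetical and chosen only for illustration.

```python
def exponential_smoothing(y, alpha):
    """Return forecasts for periods 2 .. t+1.

    forecasts[0] is F2 = y1; each later entry is
    alpha * (previous actual) + (1 - alpha) * (previous forecast).
    """
    forecasts = [y[0]]                      # F2 = y1
    for actual in y[1:]:                    # y2 .. yt
        forecasts.append(alpha * actual + (1 - alpha) * forecasts[-1])
    return forecasts


def mad_mse(actuals, forecasts):
    """Return (MAD, MSE) over periods that have both an actual and a forecast."""
    errors = [a - f for a, f in zip(actuals, forecasts)]
    mad = sum(abs(e) for e in errors) / len(errors)   # AVERAGE of ABS(y - F)
    mse = sum(e * e for e in errors) / len(errors)    # AVERAGE of (y - F)^2
    return mad, mse


# Made-up series with t = 4 values
y = [10, 12, 11, 13]
forecasts = exponential_smoothing(y, alpha=0.5)   # F2, F3, F4, F5
# Periods 2..4 have both a forecast and an actual value
mad, mse = mad_mse(y[1:], forecasts[:-1])
print(forecasts, mad, mse)
```

Running the same comparison for several values of α, or for a moving average, and keeping the approach with the lowest MAD or MSE mirrors the selection rule in the performance measures section.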
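
Holt's method from the linear trend models can be traced step by step with a short sketch. It follows the initialization and update formulas given in the sheet (L2 = y2, T2 = y2-y1, F3 = L2+T2, and so on); the function name `holt_forecasts`, the `horizon` parameter, and the sample data are assumptions made for this illustration.

```python
def holt_forecasts(y, alpha, gamma, horizon=2):
    """Return {period: forecast} for periods 3 .. t+horizon using Holt's method.

    y holds y1 .. yt (at least 3 values); periods are 1-based as in the sheet.
    """
    if len(y) < 3:
        raise ValueError("need at least 3 time series values")
    level = y[1]            # L2 = y2
    trend = y[1] - y[0]     # T2 = y2 - y1
    forecasts = {}
    for period in range(3, len(y) + 1):
        forecast = level + trend                           # F = previous L + previous T
        forecasts[period] = forecast
        new_level = alpha * y[period - 1] + (1 - alpha) * forecast
        trend = gamma * (new_level - level) + (1 - gamma) * trend
        level = new_level
    t = len(y)
    # F(t+1) = Lt + Tt, and each later forecast adds Tt again
    for step in range(1, horizon + 1):
        forecasts[t + step] = level + step * trend
    return forecasts


# Made-up series: a perfectly linear series is forecast exactly
print(holt_forecasts([10, 12, 14, 16], alpha=0.5, gamma=0.5))
```

Adding `step * trend` to the final level is the same as the sheet's instruction to add an absolute reference to Tt each time the Ft+2 cell is dragged down.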