Mann-Kendall Test (Repaired)

March 20, 2018 | Author: joseph_luis_3 | Category: Variance, P Value, Nonparametric Statistics, Spearman's Rank Correlation Coefficient, Data Analysis





Description

References

Kendall, M. G. (1970) Rank Correlation Methods, 4th ed. London: Griffin
Richard O. Gilbert (1987) Statistical Methods for Environmental Pollution Monitoring http://www.swrcb.ca.gov/water_issues/programs/tmdl/docs/303d_policydocs/205.pdf (Good intro, but lacks look-up table; pdf image cuts off last sentence on each page)
Myles Hollander and Douglas A. Wolfe (1999) Nonparametric Statistical Methods, 2nd Edition, Wiley-Interscience, ISBN-10: 0471190454, ISBN-13: 978-0471190455
A User-Written SAS Program for Estimating Temporal Trends and Their Magnitude http://www.sjrwmd.com/technicalreports/pdfs/TP/SJ2004-4.pdf
D.R. Helsel and R.M. Hirsch, Statistical Methods in Water Resources. Techniques of Water-Resources Investigations of the United States Geological Survey, Book 4, Hydrologic Analysis and Interpretation, Chapter A3 http://pubs.usgs.gov/twri/twri4a3/pdf/twri4a3-new.pdf
Detecting Trends of Annual Values of Atmospheric Pollutants by the Mann-Kendall Test and Sen's Slope Estimates - The Excel Template Application Makesens http://www.fmi.fi/kuvat/MAKESENS_MANUAL.pdf
Douglas H. Johnson (1995) Statistical Sirens: The Allure of Nonparametrics, Ecology 76(6), pp. 1998-2000 http://www.jstor.org/pss/1940733
Why Kendall tau? http://rsscse.org.uk/ts/bts/noether/text.html
Kendall's tau and Spearman's Rho http://www.statisticssolutions.com/methods-chapter/statistical-tests/kendall-spearman-rank-correlation-coefficient/
Non-parametric Measures of Bivariate Relationships http://www.unesco.org/webworld/idams/advguide/Chapt4_2.htm
Kendall's rank correlation http://www.statsdirect.com/help/nonparametric_methods/kend.htm (clearer description of how to handle ties)
Powerpoint on nonparametric time series http://www.webs.uidaho.edu/envs541/Module_08/8_2.pdf

This Excel file has been designed to calculate a Mann-Kendall trend statistic for ten data points (i.e., ten years).
The worksheet will calculate the Mann-Kendall S statistic (FYI, some authors refer to it as the K statistic). The sign of S indicates the slope of the trend (positive=upward, negative=downward, no differences=0).

Instructions

Enter your data values into the green-highlighted cells C5:C14 of the sheet labeled "MannKendall".
Change the slide title (B1), Y-axis title (C4) and the year labels (if necessary).
That's it.

If you have fewer than ten years of data, you must also:
Enter the number of time periods (e.g., years) into cell C18.
Clear the contents of any irrelevant cells from D26 to L34.

Here is what the worksheet is doing:

In the n*n matrix of values, subtract the value in yearK from the value in yearJ in all n(n-1)/2 cases/cells where yearJ > yearK. (Subtract the value on the left from the value on the top for all cells above the diagonal: top value minus left value for each cell. Above the diagonal will be values for which the column value is from a later year than the row value.)
Count the number of n(n-1)/2 cells that yielded a positive value (result > 0) and put the count value in the first column to the right.
Count the number of n(n-1)/2 cells that yielded a negative value (result < 0) and put the count value in the first column to the right.
Count the number of n(n-1)/2 cells that yielded a zero value (tied values) and put the count value in the row at the bottom.
Sum all the plusses and all the minuses and subtract the total of minuses from the total of plusses. S = number of cells with positive values minus the number of cells with negative values.
If n<10, then use the lookup table, below, for the critical value of S for various values of n. Either way, note that n>=5 is required to reach p < .05.
If n>=10, then calculate variance and use the formula for the normal approximation of the probability of S. There are two formulae, one if there are no tied values and another if there are tied values. Tied values may reduce the validity of the normal approximation when the number of data values is close to 10.

Evaluation columns: # Positive diffs, # Negative diffs, S, Variance(S)*, ZS**, Zcrit, Interpretation.

* Note: This variance formula assumes there are no tied values:

VAR(S) = n(n-1)(2n+5)/18

(This formula may be conservative in the presence of tied values.) If there are tied values, then the following formula with the correction factor for the tied values should be used:

VAR(S) = [n(n-1)(2n+5) - SUM(p=1..q) t_p(t_p-1)(2t_p+5)] / 18

where q is the number of tied groups and t_p is the number of data values in the p-th group. So if there are tied values, the correction factor decreases the variance, which increases the Z-score and the likelihood of significance. So if there are tied values and we do not use the variance formula with the correction factor, our test is conservative.

** The direction of Z indicates the direction of the trend. A positive (negative) value of Z indicates an upward (downward) trend. 1.96 (positive or negative) is the critical value for Z, two-tailed, at p < .05. The one-tailed value is 1.65.

Formula for ZS:
if S > 0 then Z = (S-1)/SQRT(VAR(S))
if S = 0 then Z = 0
if S < 0 then Z = (S+1)/SQRT(VAR(S))

Some sources said the calculation for the normal approximation of the probability of S should only be used if n>=10, but others said only when n>=40. But there was some ambiguity about the definition of n (#years versus #values in the matrix). If nyears=10, then the number of values inside the matrix is n(n-1)/2, or 10*9/2 = 45. So I'm thinking nyears>=10 is okay.

Evaluation of Tied Data

In any case, based on Gilbert's example (see last sheet in this file): because we only had 10 years to begin with, and three are tied, we should use the lookup table to gauge the significance of S. According to the table, |S| must be at least 30 for significance at the p < .05 level, and 30 is what we have.

But I will also recalculate the variance with the correction factor. I'm not ENTIRELY sure, but I think in this infant mortality example there is actually ONE tied "group." It looks like Gilbert defines a "group" as a group of years that share the same value. The value 6.1 occurs in three different years. I think (according to how I am interpreting Gilbert's example) that our 6.1 value constitutes one "group." So, q=1 and t1=3.

Var(S) correction factor for tied values: t1(t1-1)(2*t1+5) = 3(3-1)((2*3)+5) = 66
Our variance after correction: (10*9*25 - 66)/18 = (2250 - 66)/18 = 121.33 (compared with 125 without the correction)
ZS using the correction factor for tied data: (-30+1)/SQRT(121.33) = -2.63

Lookup Table for Significance of S: Critical Values of Mann Kendall S Statistic for alpha=.05, two-tailed

n                     5    10   15   20   25   30   35   40
Critical value of S   11   30   40   62   85   111  139  169

[Chart: Critical Values of Mann Kendall S Statistic for alpha=.05 and Varying Values of N; x-axis: Number of Years (data points), 5 to 40; y-axis: Critical Value of S, 0 to 180]

Information on the Power of the Mann-Kendall S Test:

For Mann-Kendall S to yield a significance level of:
p < .10 it requires 4 or more data points (e.g., years).
p < .05 it requires 5 or more data points (e.g., years).
p < .01 it requires 6 or more data points (e.g., years).
p < .001 it requires 7 or more data points (e.g., years).

Sheet: Infant Mortality, New Mexico, 1999-2009

Graph Title: Infant Mortality, New Mexico, 1999-2009
y-axis title: Deaths per 1000 Live Births
n=number of time periods: n= 11
Source: NM Death Certificate and Birth Certificate Data, NMDOH Bureau of Vital Records and Statistics.

[Matrix: subtract each earlier year from each later year (year J minus year K), 1999 through 2009; # ties (diff=0): 0 in every column]

# Positive diffs   # Negative diffs   S
10.00              45.00              -35.00

If n>=10, then use the variance calculation to estimate probability, below. If n<10, then use the table, below. Note: a significant p value is not possible with fewer than 4 time periods. n>=5 is required to reach p<.05.

Evaluation (Normal Approximation, N>=10)

Variance(S)   ZS      Zcrit, p<.05   Interpretation
165           -2.65   1.96           Sig. Decreasing

Variance(S) = n(n-1)(2n+5)/18. This formula may be conservative in the presence of tied values. Zcrit is 1.96 (two-tailed); for a one-tailed test use 1.65.

Evaluation (Lookup Table for Fewer Than 10 Years)

If |S| >= S-crit, then reject H0.

S-crit (p<.05)
# Years   1-tailed   2-tailed
4         6
5         8          10
6         11         13
7         13         15
8         16         18
9         18         20
10        21         23
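The counting procedure behind the "# Positive diffs", "# Negative diffs" and S cells can be sketched in a few lines of Python. This is a minimal illustration, not the workbook's own formulas: the function name `mann_kendall_s` and the short example series are made up for the sketch, and the input is assumed to be in time order.

```python
from itertools import combinations


def mann_kendall_s(values):
    """Count pairwise differences for a time-ordered series.

    Returns (n_positive, n_negative, n_ties, S), where
    S = n_positive - n_negative over all n(n-1)/2 pairs.
    """
    pos = neg = ties = 0
    # combinations() yields every (earlier, later) pair exactly once,
    # which corresponds to the cells above the matrix diagonal.
    for earlier, later in combinations(values, 2):
        diff = later - earlier  # year J minus year K, with J later than K
        if diff > 0:
            pos += 1
        elif diff < 0:
            neg += 1
        else:
            ties += 1
    return pos, neg, ties, pos - neg


# Made-up illustrative series (NOT the workbook's data).
example = [5, 3, 4, 4, 1]
pos, neg, ties, s = mann_kendall_s(example)
print(f"positive={pos} negative={neg} ties={ties} S={s}")
```

The three counts always add up to n(n-1)/2, which is a quick sanity check on any hand-built matrix.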
Source: NM Death Certificate and Birth Certificate Data, NMDOH Bureau of Vital Records and Statistics.

Sheet: New Mexico Infant Mortality Rate from 1999-2009

Graph Title: New Mexico Infant Mortality Rate from 1999-2009
y-axis title: Inf Deaths per 1000 Live Births
n=number of time periods: n= 10 (years 2000 through 2009)

[Chart: New Mexico Infant Mortality Rate from 1999-2009; x-axis: Year, 2000 to 2009; y-axis: Inf Deaths per 1000 Live Births, 0 to 7]

[Matrix: subtract each earlier year from each later year (year J minus year K), 2000 through 2009; three cells are ties (diff=0)]

# Positive diffs   # Negative diffs   S
6.00               36.00              -30.00

If n>=10, then use the variance calculation to estimate probability, below. If n<10, then use the table, below. Note: a significant p value is not possible with fewer than 4 time periods. n>=5 is required to reach p<.05.

Evaluation (Normal Approximation, N>=10)

Variance(S)   ZS      Zcrit, p<.05   Interpretation
125           -2.59   1.96           Sig. Decreasing

Variance(S) = n(n-1)(2n+5)/18. This formula may be conservative in the presence of tied values.

Evaluation (Lookup Table for Fewer Than 10 Years)

If |S| >= S-crit, then reject H0.

S-crit (p<.05)
# Years   1-tailed   2-tailed
4         6
5         8          10
6         11         13
7         13         15
8         16         18
9         18         20
10        21         23
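The normal-approximation evaluation reported for the two sheets can be reproduced with a short Python sketch. The function names `mk_variance` and `mk_z` are hypothetical, and passing tie-group sizes as a tuple is an assumption made for the illustration; the formulas are the Var(S) and ZS formulas stated in the notes.

```python
import math


def mk_variance(n, tie_sizes=()):
    """Var(S) = [n(n-1)(2n+5) - sum of t(t-1)(2t+5) over tie groups] / 18."""
    total = n * (n - 1) * (2 * n + 5)
    for t in tie_sizes:
        total -= t * (t - 1) * (2 * t + 5)  # tie correction term per group
    return total / 18


def mk_z(s, variance):
    """Continuity-corrected Z: (S-1)/sd if S>0, (S+1)/sd if S<0, else 0."""
    if s > 0:
        return (s - 1) / math.sqrt(variance)
    if s < 0:
        return (s + 1) / math.sqrt(variance)
    return 0.0


# n, S and tie-group sizes as reported on the two sheets.
cases = [
    ("1999-2009, no ties", 11, -35, ()),
    ("2000-2009, uncorrected", 10, -30, ()),
    ("2000-2009, one tie group of 3", 10, -30, (3,)),
]
for label, n, s, ties in cases:
    var = mk_variance(n, ties)
    z = mk_z(s, var)
    print(f"{label}: Var(S)={var:.2f} Z={z:.2f} sig(two-tailed .05)={abs(z) > 1.96}")
```

With one tie group of three values the correction only lowers the variance from 125 to about 121.33, so the conclusion for the ten-year series does not change.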
Table A 30. Upper-tail Probabilities for the Null Distribution of the Kendall K Statistic (of Hollander and Wolfe)

One-sided p = Prob [S >= x] = Prob [S <= -x]
N = Number of time periods

The table was adapted from D. R. Helsel and R. M. Hirsch, Statistical Methods in Water Resources. Helsel and Hirsch cited Table A30 in Myles Hollander and Douglas A. Wolfe (1999). The adapted version in the workbook lists every integer x from 1 to 39 and fills the values of x that cannot occur without ties by interpolating between the adjacent tabulated probabilities (for example, 0.3335 for N=3, x=2, halfway between 0.5 and 0.167). It also carries a constant 0.05 column used as the reference line in the chart.

Tot # cells (n(n-1)/2):
N       3    4    5    6    7    8    9    10
cells   3    6    10   15   21   28   36   45

Original Table from Helsel & Hirsch: Table B8 -- Quantiles (p-values) for Kendall's S statistic and tau correlation coefficient
For N>10 use the approximation given in section 8.2

x     N=4      N=5      N=8      N=9
0     0.625    0.592    0.548    0.540
2     0.375    0.408    0.452    0.460
4     0.167    0.242    0.360    0.381
6     0.042    0.117    0.274    0.306
8              0.042    0.199    0.238
10             0.0083   0.138    0.179
12                      0.089    0.130
14                      0.054    0.090
16                      0.031    0.060
18                      0.0156   0.038
20                      0.0071   0.022
22                      0.0028   0.0124
24                      0.0009   0.0063
26                      0.0002   0.0029
28                      <0.0001  0.0012
30                               0.0004
32                               0.0001
34                               <0.0001
36                               <0.0001

x     N=3      N=6      N=7      N=10
1     0.500    0.500    0.500    0.500
3     0.167    0.360    0.386    0.431
5              0.235    0.281    0.364
7              0.136    0.191    0.300
9              0.068    0.119    0.242
11             0.028    0.068    0.190
13             0.0083   0.035    0.146
15             0.0014   0.015    0.108
17                      0.0054   0.078
19                      0.0014   0.054
21                      0.0002   0.036
23                               0.023
25                               0.014
27                               0.0083
29                               0.0046
31                               0.0023
33                               0.0011
35                               0.0005
37                               0.0002
39                               0.0001
41                               <0.0001
43                               <0.0001
45                               <0.0001
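The upper-tail probabilities in this kind of table can be computed exactly when there are no ties, because under the null hypothesis every ordering of the n values is equally likely and S = n(n-1)/2 - 2 * (number of inversions). The following Python sketch is offered under that no-ties assumption; the function names are made up for the illustration and it is not the method used to build the workbook's table.

```python
from math import factorial


def inversion_counts(n):
    """counts[k] = number of permutations of n items with exactly k inversions."""
    counts = [1]  # a single item has one ordering with 0 inversions
    for m in range(2, n + 1):
        # Inserting the m-th item can add 0..m-1 inversions.
        new = [0] * (len(counts) + m - 1)
        for k, c in enumerate(counts):
            for j in range(m):
                new[k + j] += c
        counts = new
    return counts


def upper_tail_p(n, x):
    """Exact one-sided P(S >= x) under the null hypothesis, assuming no ties."""
    max_s = n * (n - 1) // 2
    counts = inversion_counts(n)
    favorable = sum(c for k, c in enumerate(counts) if max_s - 2 * k >= x)
    return favorable / factorial(n)


# Spot-check a few (N, x) cells.
for n, x in [(4, 6), (5, 10), (8, 18), (10, 23)]:
    print(f"N={n}, x={x}: p={upper_tail_p(n, x):.4f}")
```

Because S and n(n-1)/2 always have the same parity when there are no ties, only every other x value is reachable, which is why the textbook table splits into an even-x block and an odd-x block.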
[Chart: The Probability of Mann-Kendall "S" for N-Years 3 through 10; x-axis: Value of S; y-axis: Probability of S, 0.00 to 0.50; series N=3 through N=10; reference lines at p=0.05 (significant at p<.05, one-tailed test) and p=0.025 (significant at p<.05, two-tailed test)]

Values shown in RED do not appear on the table in the textbook (Hollander and Wolfe) because they are impossible values, but they ARE possible if there are tied cells. We still need to figure out how to handle ties.

This is from Helsel & Hirsch. If |S| >= S-crit, then reject H0.

S-crit (p<.05)
# Years   1-tailed   2-tailed
4         6
5         8          10
6         11         13
7         13         15
8         16         18
9         18         20
10        21         23

This includes interpolated values (red text). If |S| >= S-crit, then reject H0.

S-crit (p<.05)
# Years   1-tailed   2-tailed
4         6
5         8          9
6         10         12
7         12         14
8         15         17
9         17         20
10        20         23

Gilbert, 1987, on Tied Values in Mann-Kendall Test

This is from the Gilbert (1987) article.

[Matrix: pairwise differences for Gilbert's example series, which contains repeated values, with the # ties (diff=0) row at the bottom]

Gilbert says: the number of tied groups=3 (!?)*
t1=2 for the tied value 23
t2=3 for the tied value 24
t3=3 for the third tied value

The value 23 occurs twice, the value 24 occurs three times, and a third value also occurs three times.

* Does he mean the number of different/unique values with a tie? There are five columns, above, with tied values. But the number of unique values that happen to have ties in the matrix = 3. How on Earth am I supposed to ask SAS to do that!
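On the question of counting tied groups programmatically: if a "group" is read as a distinct data value that occurs more than once (the interpretation discussed above), then q and the t_p values come from a frequency count of the raw data rather than from the difference matrix. The sketch below is in Python rather than SAS, and both the function name `tie_groups` and the series are hypothetical; the series is arranged only so that it yields groups of sizes 2, 3 and 3.

```python
from collections import Counter


def tie_groups(values):
    """Return sorted (value, count) pairs for every value occurring more than once."""
    counts = Counter(values)
    return sorted((v, t) for v, t in counts.items() if t > 1)


def tie_correction(values):
    """Sum of t(t-1)(2t+5) over tie groups (the term subtracted in Var(S))."""
    return sum(t * (t - 1) * (2 * t + 5) for _, t in tie_groups(values))


# Hypothetical series (NOT Gilbert's data): 10 occurs twice, 12 and 7 three times.
series = [10, 12, 10, 7, 12, 12, 3, 7, 7]
groups = tie_groups(series)
print("q =", len(groups))
print("t_p =", [t for _, t in groups])
print("correction term =", tie_correction(series))
```

Under this reading the number of matrix columns that contain a zero difference is irrelevant; only the frequency of each distinct data value matters.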