An Introduction to the Theory of Statistics

March 26, 2018 | Author: Rahul Biswas | Category: Correlation And Dependence, Standard Error, Normal Distribution, Statistics, Regression Analysis


Comments



Description

H^tt QfoUcge of Agticultarc At Cf^acnell Uniueraity Cornell University Library HA 29.Y82 1922 of statist An introduction to the theory 3 1924 013 993 187 The tlie original of tliis book is in Cornell University Library. There are no known copyright restrictions in the United States on the use of the text. http://www.archive.org/details/cu31924013993187 AN INTEODUCTION TO THE THEOEY OF STATISTICS. — — — — — Charles Griffin In Cloth. & Co., Ltd., net, Publishers Note.— All nrices are postage extra. Pp. i-xv. +518. TARIFFS: A STUDY IN By 25s. METHOD — T. E. G. GEEGOEY, B.Sc. (Boon.), Lond. OASSEL KBADEE IN OOMMBKOE IN THE UNIVEKSITr OF LONDON. Contents— Customs Areas Customs-making Bodies and Customs Law Taiiff-making Bodies— The Tariff as a Whole— The Internal Form of TariffDifferential Tariff Eate Differentiation and Specialisation of Commodities Duties Retaliation, Eeciprocity, and Colonial Preferences The Preferential System of the British Empire Valuation and Allied Problems Some Alleviations of the Protectionist Regime Free Ports and Bonded Warehouses The " Commercial and Drawbacks Improvement Trade Frontier Trade " — — — — — — — — ' Tariff 'Treaties Appendices Index. " Will be of special interest to business men and all interested in current economic problems." Chmnber of Commerce Journal, "It ought to be in the library of every legislature throughout the world." New Statesman, — — — — Second Edition. Eevised. In Crown 8vo, extra, with Diagrams and Folding-Plate. 9s. THE CALCULUS FOR ENGINEERS AND PHYSICISTS, INTEGRATION AND DIFFERENTIATION With Applications to Technical Problems, and Classified Reference List of Integrals. By Prof. ROBERT H. SMITH, A.M.Inst.C.E., M.I.Mech.E., Contents Introductory General Ideas and Principles Algebraic and Graphic Symbolism— Easy and ITaniiliar Examples of Integration and Differentiation Important General Laws Particular Laws— Transformations and Reductions— Successive Differentiation and Multiple Integration Independent Variables— Maxima and Minima Integration of Differential Equations— Appendices. — — Etc. — — — — "Interesting diagrams, with practical illustrations of actual occurrence, are to be in abundance. The very complete classified refekehce table will prove very useful in saving the time of those who want an integral in a hurry." The Engineer. foundhere Thirty-eighth Annual Issue. Cloth. 15s. and postage.) OW THE (To Subscribers, 12s. 6d. THE OFFICIAL YEAR-BOOK SCIENTIFIC AND LEARNED SOCIETIES OF GREAT BRITAIN AND IRELAND Gompiled from Official Sources. Comprising (together with other Official Information^ LISTS of the PAPERS read during the Session 1920-1921 before all the LEADING SOCIETIES throughout tlie Kingdom engaged in the following Departments of Research: § 1. Science Generally : i.e.. Societies occupying themselves with several Branches of Science, or with Science and Literature jointly. § 2. Astronomy, Meteorology, Mathematics, and Physics, g 3. Chemistry and Photography. § i. Geology, Geography, and Mineralogy. § 6. Biology, including Botany, Microscopy, and Anthropology. §6. Economic Science and Statistics. § 7. Mechanical Science, Engineering, and Architecture. § 8. Naval and Military Science. § 9. Agriculture and Horticulture. § 10. Law. § 11. Literature, History, and Music. § 12, Psychology. § 13. Archceology. § 14. Medicine. " This book : is a valuable compendium of indispensable woith."— Nautical Magazine. London CHAS. GRIFFIN & CO., LTD., Exbtee St., Strand, W.C.2 ; AN INTEODUCTION TO THE THEOEY OF STATISTICS BY G. UDNY YULE, C.B.E., M.A., F.R.S., FELLOW TOTITEB3ITT LECTDEER IN STATBTira, OAMBRIDOE; AND SOMETIIIE HOKORART SECRETARY 01 THE ROYAL STATISTICAL SOCIETY OP LONDON; MEMBER OF THE INTERNATIONAL STATISTICAL INSTUDTE FELLOW OF THE EOYAIi ANTHROPOLOGICAL INSTrrUTE. "BHlttb 53 jFlflures anD diagrams. SIXTH EDITION, ENLAEOED. LONDON: CHARLES GRIFFIN AND COMPANY, LIMITED, EXETER STREET, STRAND, 1922. \_All W.C. 2. Bights Seserved.] Nkill Printed in Great Britain by & Co., Ltd., Edinburgh. since the latter course would inevitably have necessitated a heavy increase in the to cover all selling price. Y. June 1922. dealing with the application of this method to Association and Contingency Tables. been enlarged The present has by a considerable addition to the Supplement on testing Goodness of Fit. U. the deduction of the regressions by the use of the differential calculus. That the large Fifth Edition of the Introduction. after ten years of service.PREFACE TO THE SIXTH EDITION. with the additional matter as Supplements. and including also results only recently published. it was thought better to maintain the present form. . the still volume continues to hold edition its own. still In view of the high costs of printing and composition ruling. Opportunity has also been taken to correct some minor errors in the text and the answers to questions. giving . rather than to revise the text. printed just after the close of the War. The Index has been revised new matter G. and the Supplementary List of Eeferences has been extended and brought up to date. A brief. Suppleme'nt has been inserted. has been exhausted in three years is gratifying evidence that. the Introduction of Statistics has continued to be of service. Owing to a serious impairment of vision. The present test of edition has been enlarged by the inclusion of two supplementary notes. drafting of the Lister Institute. R. on the Law of Small Chances and on the Goodness of Fit of an observed to a theoretical distribution respectively. by the end to the of last year. G. U. April 1919. Greenwood.M. and also of an extensive me. the references.A.C. Captain of M. was decided on as any extensive revision of the text. The rapid exhaustion of the Fourth Edition.T. it is will cause little inconvenience. . and list of additional references. reading difficult to and writing are at present I have to express my great indebtedness to my friend and frequent collaborator.PEEFACE TO THE FIFTH EDITION.. in spite of the slight revision which has Theory been possible in the last few years. is evidence that. Y. Their inclusion as instead of the incorporation of the matter in the text. been revised to cover all As the index has hoped that this the new matter. of for list the of these notes and the compilation supplements. owing to post-war conditions. would have been disproportionately slow and costly. been increased to render the book more suitable 'for the use of biologists and others besides those interested in economic and vital statistics. I hope that may prove of some service to the students of the diverse sciences in My which statistical methods are now employed. and some of the more difficult parts of the subject have been treated in greater detail than was possible in a sessional course of some thirty lectures. together with such elements of co-ordinate geometry as are that is now assumed. H. in the sessions 1902-1909. mere especially. statistical —the methods available data — suited : to those who possess only a limited knowledge of mathematics to an acquaintance with algebra up generally included therewith. the chapters follow closely the arrangement of the course. however. Hooker not only . For the rest. as distinct to work out a systematic introductory course on statistical methods from collecting. fairly detailed lists of references to the original : memoirs have been given at the end of each chapter exercises have also been added for the benefit. variety of illustrations the subject. To enable the student to proceed further with at University College. the three parts into which the volume is divided corresponding approximately to the work of the three terms. The volume represents an attempt for discussing. most grateful thanks are due to Mr K. it is all the binomial theorem. London.PREFACE TO THE FIRST EDITION. The following chapters are based on the courses of instruction given during my tenure of the Newmarch Lectureship in Statistics The and examples has. of the student who is working without the assistance of a teacher. to illustrate the frequency distribution on the § my probable error of the median. for and been making many criticisms : indeed greater than can well be expressed in a formal preface. and suggestions which have of tlie greatest service.Vlll PKEFACE. amI all can hardly hope that involved errors in the text or in the of arithmetic in examples biguities. mass and exercises have been eliminated. December 1910. My thanks are also due to for the Mr H. XVII. often delayed and interrupted by the pressure of other work. D. Y. . Vigor for some assistance in checking the arithmetic. and the proofs. or obscurities. U. G. and will feel indebted to any reader who directs my attention to any such mistakes. but also for much friendly help and encouragement without which the preparation of the volume. and Edgeworth example used in the influence of the form of acknowledgments to Professor 5 of Chap. or to any omissions. might never have been completed my debt to Mr Hooker is for reading the greater part of the manuscript. Derivation of complex from simple relations by 5-6. Inclusive and exclusive notations and terminologies — : — — — — — — 7-16 CHAPTER 1-8. Notation for single attributes and for combinations 8. of the positive class-frequencies 18. CONSISTENCE. better. and its specification by symbols 4." in accordance with present usage . INTRODUCTION. 1-3 The introdnction of the terms " statistics. The change in meaning of these terms during the nineteenth centuiy 7-9.— COI^TEKTS. Positive and negative attributes." "statistical methods. The present use of the terms 10. . FAQE3 — — — 1-6 PART I. contraries 10. 1-2. Sufficiency of the tabulation of the ultimate classfrequencies 16-17. Definitions of "statistics. The order of a. The class-frequencies chosen in the census for tabulation of statistics of infirmities 19." "theory of statistics. Conditions of consistence for three attributes .—THE THEORY OF ATTRIBUTES. Consistence specifying the universe 7-10. The aggregate 12. The observation or universe. class 11. field of — — — 17-24 . Conditions of consistence for one and for two attributes 11-14." " statistical. Statistics of attributes and statistics of variables fimdamental character of the former 3-5. The class-frequency 9. II. Or. NOTATION AND TEEMINOLOGY. CHAPTER I. Claasifioation by dichotomy 6-7." into the English language 4-6. The arrangement of classes by order and aggregate 13-14. Necessity for classification of observations frequency-distribution 3. VI. The general principle of a manifold classification 2-4. The table of double entry or contingency table and its treatment by fundamental methods— 5-8. 3-5. : — — — — 42-69 CHAPTER V. . The U-shaped distribution cal or J -shaped distribution The symmetrical — — 75-105 . the Introductory 2. Illustrations 4. Homogeneity of the classifications dealt with in the pre- — — — ceding chapters : heterogeneous classifications . PARTIAL ASSOCIATION. . independence— 5-10. 60-74 PART II.— — — X CONTENTS. CHAPTEE 1-4. Process of classification— 8. Uncertainty in interpretation of an observed association Source of the ambiguity partial associations 6-8. Treatment of Intermediate observations 9. CHAPTER 1. . The asymmetrical distribution 15. ASSOCIATION. Magnitude of class-intervals 6.—THE THEORY OF VARIABLES. Isotropic and anisotropic distributions— 14-15. The conception of and testing for the same by the comparison of PASE3 differences of percentages— 11-12. Position of intervals 7. . Illusory association due to the association of each of two attributes with a third 9. 1. 1-2. dependence values— 13. Numerical equality of the between the four second-order frequencies and their in14. 12.UENCY-DISTRIBUTION. THE FREQ. The exti'emely asymmetri16. analysis of a contingency table by tetrads 11-13. 13-14. The case of complete independence . Tables with 11. MANIFOLD CLASSIFICATION. Tabulation 10. Ideal frequency-distributions moderately distribution 14. attribute A being extended to include non-^'s — . . Estimation of the partial associations from the frequencies of the second order 10-12. 2S-41 CHAPTER IV. Method of forming the table 5. The total number of associations for a given number of attributes . Coefficients of association Necessity for an investigation into the causation of an . The criterion association. The coefficient of contingency 9-10. Graphical representation of the unequal intervals — — — — — : — — — — frequency-distribution 13. III. Measures of position (averages) and of dispersion 3. Illustration ii. The standard deviation its definition. The dimensions of an average the same as those of the variable 4. Measures of asymmetry or skewness 27-30. Illustration i. The line of means 6-7. : Inheritance of fertility — : — . CORRELATION. Desii'able properties for an average to possess 5. and the standard-deviations of arrays Numerical calculations 17. 19-20. . The simpler properties. correlation table and its formation 4-5. 9-10. The commoner forms of average 6-13. Certain points to be remembered in calculating and using the coefficient — — — — — — . 22-26. and properties 14-19. MEASURES OF DISPERSION. the re15-16. The correlation-coefficient. Summary which it is specially applicable — 27. 157-190 CHAPTEE X. ETC. and simpler properties — — — — — : — definition and relation to : mean and median — : comparison of the preceding forms of average geometric cases in mean its definition. The arithmetic mean its definition. XI CHAPTEE VII. and the — 21. 1-3. Measures of relative dispersion 26. The harmonic 106-132 mean : its definition and calculation CHAPTER Inadequacy VIII. The correlation 8-9. Causation of pauperism calculate r 11-13. The mean deviation : its definition. The quartile deviation calculation. and simpler properties 14-18. Necessity for quantitative detinition of the characters of a frequency-distribution 2. — : — — — — The method of grades or percentiles 133-156 CHAPTER The IX. calculation.— — CONTENTS. calculation. The general problem surface of rows and the line of means of columns : their relative positions in the case of independence and of varying degrees of correlation 10-14. 1. gressions. 20-24. and properties or semi-interquartile range 25. of dispersion of the range as a measure 2-13. The median : its definition. 1. PASES 1. Necessity for careful choice of variables before proceeding to 2-8. AVERAGES. The mode its calculation. CORRELATION: ILLUSTRATIONS AND PRACTICAL METHODS. Correthe movements of two variables: (a) Illustration iv. 15. Correlation-coefficient for a two x two- — — fold table — — — — 11. The coefficient of n-fold correlation 17. The correla: — : : — — tion-ratio 191-209 CHAPTER XI. Introductory explanation for Direct deduction of the formulse Special notation for the general generalised regressions 5. Influence of errors of observation on the correlation-coefficient (Spearman's theorems) 8. Certain rough methods of approximating to the correlation-coefficient 20-22. Mean and standard-deviation of an index 9. Influence of errors of observation and of grouping on the standard-deviation 6-7. The weighting of forms of average otlier than the arithmetic mean — — — — — . Standard-deviation of a sum or difference 3-5. Correlation due to heterogeneity of material 13..— XU Illustration iii. Example ii.deviations 7-8.Geometrical representation of correlation between three variables by means of a model 16. and standard . Correlation between indices 10. Reduction of the generalised regression 13. . Direct interpretation of the generalised regressions 10-11. Reduction of the generalised correlation-coefficient 14. 210-228 CHAPTER XII. Generalised correlations case Generalised deviations 6. Introductory 2. Fallacies work : . . . 1-2. Correlation-coefficient for all possible pairs of JV values of a variable 12. The weighted mean 18-19. the marriage-rate and foreign trade— 18. PARTIAL CORRELATION. Elementary methods of dealing with cases of non-linear regression 19. Application of weighting to the correction of death-rates. changes in Nop-periodio movements : — PAGE9 and general mortality 15-17. lation between infantile : CONTENTS. Reduction of correlation due to mingling of unoorrelated with correlated material 14-17. Expression of regressions and correlations of lower in terms of those of higher order 18. for varying sex and agedistributions 20. — — — — — — — 229-253 . Theorems concerning the generalised product-sums 9. Arithmetical two variables : — — 3. (6) Quasi-periodic movements Illustration v. MISCELLANEOUS THEOREMS INVOLVING THE USE OP THE CORKELATION-GOEFFICIENT. — — Example i.. The weather and the crops 14. Limiting inequalities between the values of correlation-coefficients necessary for consistence 19. etc. Reduction of the generalised standard-deviation 12. 4. 1. 1. CHAPTER SIMPLE SAMPLING OF ATTRIBUTES. Warning as to the assumption that three times the standard error gives the range for the majority of fluctuations of simple sampling of either sign 2. or standard error. Determination of the mean and standard-deviation of the number of successes in events— 6. Effect of divergences from the conditions of simple sampling : (a) effect of variation in p and q for the several universes from which the samples are drawn 11-12. Standard-deviation of simple sampling when the numbers of observations in the samples vary 12. for checking and controlling the interpretation of statistical results . Biological cases to which the theory is directly applicable 11. .—THEORY OF SAMPLING. Approximate value of the standarddeviation of simple sampling. XIII. SIMPLE SAMPLING CONTINUED: EPFEOT OP REMOVING THE LIMITATIONS OF SIMPLE SAMPLING. Summary — — — — — — 276-290 . and relation between mean and standard-deviation. Use of the standard-deviation of simple sampling. Warning as to the use of the observed for the true value of p in the formula for the standard error 3. when the chance of success or failure is very small^lS. 1. The inverse standard error. The same for the proportion of successes in n n events : the standard-deviation of simple sampling as a measure of unreliability or its reciprocal as a measure of precision 7. or standard en'or of the true proportion for a given observed proportion : equivalence of the direct and inverse standard 4-8. More detailed discussion of the assumptions on which the formula for the standard-deviation of simple sampling is based— 9-10. (6j EH'ect of variation in p and q from one sub-class to another within each universe 13-14. Verification of the theoretical results by experiment 8. The probleiti of the present Part chief divisions of the theory of sampling 3. — — PAGES 2. The two — — — — — — 254-275 CHAPTER XIV. Limitation of the discussion to the case of simple sampling 4. Definition of the chance of success or failure of a given event 5. (c) Effect of a correlation between the results of the several events 15. The importance of errors errors when n is large other than fluctuations of "simple sampling" in practice: unrepresentative or biassed samples 9-10.— CONTENTS. XUl PART III. 317-334 . PAGX8 1-2.: XIV CONTENTS. Determination of the frequency. The table of areas of the normal curve and its use 17. Investigation of Table III. The quartile deviation and the "probable error" 18.. The normal surface for two correlated variables regarded as a normal surface for uncorrelated variables rotated with respect to the axes of measurement arrays taken at any angle across the surface are normal distributions with constant standard-deviation : distribution of and correlation between linear functions of two normally correlated variables are normal : principal axes 7. Constancy of the standard-deviations of parallel arrays and linearity of the regression 5. Standarddeviations round the principal axes 8-11. q. Illustrations of the application of the normal curve and of the table of areas — — — — — — — — — — .. Outline of the principal properties of the normal distribution for n variables — — — — — — — . contour lines 12-13. The contour lines : a series of concentric and similar ellipses 6. Difficulty of a complete test of fit by elementary methods 16. Deduction of the general expression for the normal correlation surface from the case of independence 4..distribution for the number of successes in n events : the binomial distribution 3. Comparison with a binomial distribution for a moderate value of n 13... THE BINOMIAL DISTKIBUTION AND THE NOEMAL CURVE. The value of the central ordinate 12. Necessity of deducing. Deduction of the normal curve as a limit to the symmetrical binomial 10-11. Graphical and mechanical methods of forming re- — — presentations of the binomial distribution 6. Isotropy of the normal distribution for two variables 14. Direct calculation of the mean and the standard-deviation from the distribution 7-8. Dependence of the form of the distribution on p. normality of distribution obtained by diagonal addition. Fitting the curve to an actual series of observations 16. for large values of n. 1-3. for use in many practical cases. CHAPTER XV. constancy of standard-deviation of arrays. 291-316 CHAPTER XVI. to test normality: linearity of regression. and n 4-5. Outline of the more general conditions from which the curve can be deduced by advanced methods— 14... the terms of the binomial series 9. a continuous curve giving approximately. Chapter IX. NOEMAL COEEELATION.. Standard error of the interquartile range for the normal curve 9. THE SIMPLER CASES OE SAMPLING EOE VARIABLES PERCENTILES AND MEAN. and the Theory of Probability . 360-361 Supplements I. . Correlation between errors in two percentiles of the same distribution 8. The problem of sampling for variables : the conditions assumed 3. III. and regression 16. DiKEOT Deduction of the roEMtTL^ fok Eegebssions TiiE II. of Statistics. Simplified formula for the case of a grouped frequency-distribution 7. Effect of removing the restrictions of simple sampling.: CONTENTS. Statement of the standard errors of standard-deviation. . Standard error of the arithmetic mean 11. Eifect of the form of the distribution generally 6. Relative stability of mean and median in sampling 12. Short List of Works on the Mathematical Theory . the Exercises 393-400 401-415 GIVEN . Special values for the percentiles of a normal distribution 5. and Hints on the Solution of. — 362-383 363-367 367-386 . Law of Small Chances . Restatement of the limitations of interpretation if the sample be small — — — — — — — — — — — — — — . PAOES 1-2. 357-359 II. Standard error of a percentile 4. coefficient of variation. Goodness of Fit Additional Refeeences 387-392 Answers Index to. Standard error of the difference between two means 13. . The tendency to normality of a distribution of means 14. XV CHAPTER Xyil. and limitations of interpretation 10. 335-366 Appendix Appendix I. Effect of removing the restrictions of simple sampling 15. —Tables — for facilitating Statistical Work . correlation-coefficient. . The word "statist" is found.. INTKODUCTION. 5 Ziuimermann's work appears to have been written in English. 1914. chiefly by German writers. von Bielfeld. "To the several articles contained in this work. appear to be the sense that 2. .." * " Statistics " occurs again with a rather wider definition in the preface to A Political Swrvey of the Present State of Ev/rope. M. ii) . ' Act V. however.^ and in Paradise Regained The earliest occurrence of the word "statistics" yet (1671). the industry and civilisation of their inhabit and the wisdom of their governments. "It is about forty years ago. v). 2. ^ Act ii. Quarterly Publications of the American Statistical Associaiion.. — — — 1. and the adjective is also given (p. One of its chapters is entitled Statistics. 4. 287. iv. which has for its object the actual and relative power of the several modern states. " in accordance with present usage.'' "statistics." "theory of statistics. The introduction of the terms " statistics. W. the power arising from their natural advantages. sc. (3 vols. Defininineteenth century 7-9." "statistical. F. by E. Hooper." statistical. 1-3. 1 ." into the English language 4-6.. is become a favourite study in Germany" (p." says Zimmermann.^ Cymbeline (1610 or 1611). The change in meaning of these terms during the The present use of the terms 10. xiv. Professor of Natural Philosophy at Brunswick.. London. for instance. . A.D. from the Latin status. so.. acquired in mediisval Latin of a political The first term is. F. has been formed. more or less indirectly.THEOEY OF STATISTICS. in state. "that that branch of political knowledge. in Hamlet (1602). ' Bk. tions of "statistics." all derived. vol. 1770).. translated by W. of much earlier date than the two others. distinguished by the new-coined name of statistics." " statistical methods. * I cite from Dr W. by Baron J. Willcox. this science.^ issued in 1787. and contains a definition of the subject as "The science that teaches us what is the political arrangement of all the modern states of the known world. into a separate science. though he was a German. By the more convenient form it has now received . Zimmermann. The words it "statist. some respectable ants. . p.' noted is in The Elements of Universal Ervdition. " given at the end of the volume.. by Zimmermann and by Sir John Sinclair. as it was supposed that some term in our own language might have expressed the same meaning. I found that in Germany they were engaged in a species of political enquiry. is the volume in which the word * "statistik" appears to be first employed. through the northern parts of Eur^jpe. 4. meant simply the exposition of the noteworthy mode of exposition being almost inevitably at that time preponderantly verbal. he tells us. But in the course of a very extensive tour. Professor of Politics at Gottingeu.which they had given the name of Statistics . . p. Clergy of the Church of Scotland issued in May 1790. "Statistics" (statistik)." was certainly justified. the — — — but trustworthy figures were — scarce. ' Twenty-one . accordingly.^ he states Germany "' Statistical Inquiries.. the political circumstances. ^ Statistical Progress ' . xiii.' as they are called. and numerical statements. " Statistics " thus insensibly acquired a narrower signification. to. and other matters of state.. have been carried to a very great extent. notably by Sir John Sinclair. which I happened to take in 1786. Appendix to "The History of the Origin and . and I hope that it is now completely This hope naturalised and incorporated with our language." 3. their introIn the circular letter to the duction has been frequently ascribed.'^ to whom. dt. Account." 2 Statistical writers THEORY OF STATISTICS. The Abriss der Statswissenschaft der Earopdischen Keiclie (1749) of Gottfried Achenwall. the proIn the ductions of a country.*. have added a view of the principal epochas of the history of each country. After the commencement of official of the nineteenth century. The conciseness and definite character of numerical data were recognised at a comparatively early period more particularlj' by English writers characteristics of a state." and adds an explanatory "or inquiries footnote to the phrase "Statistical Inquiries" that in — respecting the population. indeed. the growth data was continuous. vol. xx. as I thought that a new word might attract more public attention. but the meaning of the word underwent rapid development during the half century or so following its introduction.. vols. however. began more and more to displace the verbal descriptions of earlier days. " Many people were at first surprised at my using the new words. Within the next few years the words were adopted by several writers." " History of the Origin and Progress " ^ of the work. Loc. but the adjective occurs at a somewhat earlier date in works written in Latin. 1791-99. viz>. I resolved on adopting it. the editor and organiser of the first Statistical Account of Scotland.. Statistics and Statistical.. " statiaticus . as the term is used by German writers of the eighteenth century. Stat. to be the ascertaining and bringing together of those facts which are calculated to It is. Symons' British Rainfall for 1899. further changes followed. Biometrika. poor-law statistics. Francis Galton. i. analogous to those which were originally formed for the study of the state." ^ admitted that " the statist commonly prefers to employ figures and tabular exhibitions. 1. Op. 1869). in a book on Latin verse. Scripture. 3 the exposition of the characteristics of a State by numerical methods. But similar data occur in many connections . but the transition appears to have been only half accomplished even after the foundaThe articles in the tion of the Royal Statistical Society in 1834. Cambridge Univ. p. but the official definition has no reference to method. as we speak of vital statistics. writing of " statistics concerning the mental characteristics of man. the characteristics of the Virgilian hexameter " are examined carefully with statistics. chap. in the words of the prospectus of this Society. 15. Such collections of numerical data were also termed " statistics. however. cit." INTKODUCTION." 5. E. Press. p. on almost any subject whatever." and consequently. Oct. for instance. under the headings bright average duU. in anthropology.^ and the author. is held to cover a collection of numerical data. illustrate the condition and prospects of society. issued in 1838-9. "may be said. ii."^ We are informed that. preface. Once. however. 18. vol. Soc." * 6. Thus we read of the inheritance of genius being treated "in a statistical manner. 1897. The New Psychology. the word was transferred to those series of figures with which it operated."* and we have now "a journal for the statistical Such phrases as " the statistical study of biological problems. at the present day. Athenceum. etc." "^ — — ' " * » " ' Jour. the firat number issued in 1901 . Hereditary Genius (Maomillan. 1903. We not only read of rainfall " statistics.^ We find a chapter headed " Statistics " in a book on psychology. the first change of meaning was accomplished. and so forth. From the name of a science or art of state-description by numerical methods. first volume of the Journal. It is difficult to say at what epoch the word came definitely to bear this quantitative meaning." even when applied to data trom other sources. "Statistics." we read. the word." " statistics of children. The development in meaning of the adjective " statistical was naturally similar. W." but of " statistics " showing the growth of an organisation for recording rainfall. p. in meteorology. The methods applied to the study of numerical data concerning the state were still termed " statistical methods. 3. are for the most part of a numerical character. to deal with highly complicated cases of multiple causation cases in which a given result may be due to any one of a number of alternative causes or to a number of — different causes acting conjointly. 9. but frequently cannot approximate closely to the experimental ideal . with physics or chemistry. but the the barometer. character 1 Let us turn to social science. The meteorologist.." for a moment. say. With the biologist. by Sir J. This simplification being impossible. in general. Larmor. ' investigation of the motion of molecules " the ordinary language of physicists. for A example. some community of character between them. apart from his control. the internal circumstances of animals and plants too easily evade complete control. or the same terms and the same methods would not be applied. What. Now the object of experiment is to replace the complex systems of causation usually occurring in nature by simple systems in which only one causal circumstance is permitted to vary at a time. anthropology. however. was on " the statistical and thermodynamical relations of radiant energy. that this is also precisely the characteristic of the observations in other fields to which statistical methods are applied." no longer bear any necessary They are applied indifferently in reference to " matters of state. as well as in the Diverse though these cases are. Willard Gibbs (Macmillan. 1902). stands out so markedly that attention has been repeatedly directed to it by " statistical " writers as the source of the peculiar the observer of social facts cannot exdifficulties of their science periment. matters are in somewhat better case. and consider its characteristics One characteristic as compared. Trans. He can and does apply experimental methods to a very large extent. have become part of a work entitled "the principles of statistical mechanics.— 4 THBOKY OF STATISTICS." physics."^ and the Bakerian lecture for 1909. the observer has. vol. (1878). FUl. . Camb. is of social science. biology. Hence a large field (notably the study of variation and heredity) is left.\ii. then. and "On Boltzmann's Theorem" ' By J. is this We find common 8. records of treated as they stand. It is unnecessary to multiply such instances to show that the words "statistics. of Clerk Maxwell. little consideration will show." "statistical. as the parent of the methods termed " statistical. and meteorology. thermometer. methods have either to aid or to replace the The physicist and chemist. there must be social sciences. "Theory Heat" (1871)." 7. in same position as the student points. but must deal with circumstances as they occur. and rain gauge have to be in almost precisely the He can experiment on minor which methods ' statistical of experiment. finally. . multiplicity of causes is of the essence of the case. ' p... 1855-58. V. Yule. absolutely perfect. of molecular physics. 1905. Roy. cannot be completely eliminated. Bj' far the best history of statistics down to the early years of the nineteenth century. KOBRRT VON. vibraFurther.Name'Statistik .. priiilSipaily latter half of vol. Oeschichte imd Liiteratv/r der Staatswissenschaften. in the problems tion. The History of the Words (1) (2) " Statistics. Weiss. and cannot be. Enke. draughts. or of moisture. . The History of (3) Statistics in General. statistical methods we mean methods specially adapted to the elucidation of quantitative data affected by a multiplicity of causes. A translation in Jour." Jour.' Statistical. mean the exposition of statistical The insertion in the first definition of some such words as " to a marked extent " is necessary. The motion of an atom or of a molecule in the middle of a swarm is dependent on that of every other atom or molecule in the swarm. referred to in the last sentences of § 6. as well as the observing instrument. Soc. Oeschichle der StntistiTc. Stat. In the light of this discussion. of pressure. Erlangen. statistical methods still find application. bis auf Quetelet ." " Statistical. the author died in 1900. whether the influence of many causes be large or not.) — INTKODUCTION. Teil. (4) MoHL. since the term " statistics " is not usually applied to data. Theirs are the sciences in which experiment has been brought to its greatest perfection. stand at the methods available for eliminating the effect though continually improved. The observer himself." John. iii. the efieots of changes of temperature. But even so.. Ixviii. Stuttgart. U. At the same time. is a source of error . of disturbing circumstances. 10. hoy. 1883. 391. for same year. like those of the physicist. " The Introduction of the Words 'Statistics. " statistical methods " are applicable to all such cases. V. we may accordingly give the In the first place. 1'° John. following definitions : By By statistics we mean quantitative data affected to a marked extent by a multiplicity of causes. Der.' into the English Language. Soc. are not. which are affected only by a relatively small residuum of disturbing causes. (All published . the other extremity of the scale.) . (For history of statistics see 3 vols. G. Enke. REFEKENCES. By theory of statistics we methods. 1884. Stat. Berne. vol. .) gives a very slight sketch. I... Theorie und Technik der Statistik Cnew edn. Westergaard's Die Grundzuge der Theorie der Statistik (Fischer.. e. reference may also be made to (6) Hull.. Falkner. and the biographical articles in Palgrave's Diiiionary of For its importance as regards the English Political Economy are useful. but the article 189)). (Gives an exceedingly useful outline of the history J. 1888. 1865. History of (8) Official Statistics. History of Theory of Statistics. . 1890). "Statistics" in the Encyclopcedia Britannica (11th edn. 1899. Hoepli. A History of the Mathematical Theory of Probability from the time of Pascal to that of Laplace .g. Meitzen's Geschichte. school of political arithmetic... Gatjaolio.—— 6 (5) TTIT^ORT OF STATISTICS.) Several works on theory of statistics include short histories. of official statistics in different countries. H. i. There is no detailed history in English. 1895. P. \ Bektillon. Milano. (Vol. Macmillan. Antonio. Parte storica. 2 vols. A. H.) . together with the Observations on the Bills of Mortality more probably by Captain John Graunt. Jena. 1903. Teoria generate ddla stalistica. Cambridge University Press. : From the purely mathematical side the following is important TODH0NTEK. The Economic Writings of Sir William Petty. and P. 2 vols. 2nd edn. American translation by R. Somewhat (7) slight information is given in the general works cited. Oours dUmentaire de statislique Soci^td d'dditions scientifiques. C. The quantitative the dumb and speaking. The arrangement of classes by order and aggiegate 13-14. The quantitative character may. The aggregate— 12. The methods of statistics. In the second place. He may record. contraries 10. Statistics of attributes and statistics of variables fundamental character 6-7.— PAET I. and Thus. arises solely in the counting. we may count the number of the blind and seeing. 2. The observations in these cases are quantitative ab initio. the numbers of petals in flowers. however. Sufficiency of the tabulation of the ultimate class15-17. for instance. and stating the numbers of tall and short (or A 7 . arise in two different ways. 1-2. or the insane and sane. Positive and negative attributes.— THE THEOEY OF ATTRIBUTES. as defined in the Introduction. The methods applicable to the former kind of observations. in a given count how many do or do not possess it. The class-frequency 9. deal with quantitative data alone. which may be termed statistics of attributes. of the positive class-frequencies 18. The order of a class 11. Classification by dichotomy single attributes and for combinations 8. population. CHAPTER I. the ages of persons at death. or statistics of variables. for example. the observer may note only the presence or some attribute in a series of objects or individuals. the prices of different samples of a commodity. In the absence of first place. Notation for of the former 3-5. in such cases. may be treated by simply counting all measurements as tall that exceed a certain limit. men. Or. character. — 1. frequencies — : — — — — — — — The class-frequencies chosen in the census for tabulation of statistics of infirmities 19. neglecting the magnitude of excess or defect. NOTATION ANI^ TERBIINOLOGY. are also applicable record of statures of to the latter. the observer may note or measure the actual magnitude of some variable character for each of the objects or individuals observed. the statures of men. better. Inclusive and exclusive notations and terminologies. the insane males. sane males. sight and blindness. Or. the members of each of the subclasses so formed according as they do or do not possess the third. Thus the members of the population of any district may be classified into males and females . It may be noticed that the fact of classification does not necessarily imply the existence of either a natural or a clearly defined boundary between the two classes. 3. and four similar classes of dwarf plants. The boundary may be wholly arbitrary. more strictly not-tall) on the basis of this classification. the observer classifying the objects or individuals observed. This and the next three chapters (Chapters I. however. we may treat the presence or absence of the attribute as corresponding to the changes of a variable which can only possess two values. and so on. tall with wrinkled green seeds. the members of each sex into sane and insane. If several attributes are noted. as suggested above. But the methods and principles developed for the case in which the observer only notes the presence or absence of attributes are the simplest and most fundamental. In the simplest case. pass into each other by such fine gradations that judgments may . and we may be able. so that we would have eight classes tall with round green seeds. where attention is paid to one attribute alone. Similarly. say and 1. with wrinkled seeds or round seeds. to draw further conclusions. and sane females into blind and seeing. 4. every class being divided into two at each step. with green seeds or yellow seeds. be continued indefinitely. Those that do and do not possess the first attribute may be reclassified according as they do or do not possess the second. The objects or individuals that possess the attribute. the methods that are specially adapted to the treatment of statistics of variables. by auxiliary hypotheses as to the nature of this variable. barometer readings as above or below some particular height. the process of classification may. insane females.8 THKOEY OF STATISTICS. tall with round yellow seeds. where prices are classified as above or — below some special value.g. tall with wrinkled yellow seeds.-IV. making use of each value recorded. and those that do not possess it. e. The division may also be vague and uncertain: sanity and insanity. may be said to be members of two distinct classes. they might be classified as tall or dwarf. If we were dealing with a number of peas {Pisvm sativum) of different varieties. only two mutually exclusive classes are formed. are available to a greater extent than might at first sight seem possible for dealing with statistics of attributes. For example. and are best considered first.) are accordingly devoted to the Theory of Attributes. we may assume that we have really to do with a variable character which has been crudely classified. i. division by dichotomy (cutting in two). the blind but not-deaf. include respectively the blind and deaf. 5. 6. the class ABG. AB. will be used to denote the several attributes. . AB. . The capitals A. the relation of dichotomy the fundamental case. Combinations of attributes will be represented by juxtaThus if. all the members of which possess the attribute A. if B stands for deafness. inthe neither blind nor deaf. The . are dealt with briefly. o. however. ^ sta. . B. Afi. to more elaborate (manifold. but dichotomy is In Chapter V. A represents blindness." or an object or individual not possessing the attribute A . the final judgment must be decisive any one object or individual must be held either to possess the given attribute or not. by means of which we specify the characters of the members of a class. Thus if A represents the attribute blindness. . G. 9 differ as the class in which a given individual should be possibility of uncertainties of this kind should always be borne in mind in considering statistics of attributes whatever the nature of the classification. aB. may be termed a class symbol. If the presence and absence of these attributes be noted. An object or individual possessing the attribute A will be termed simply A. . includes those who are at once deaf. the four classes so formed. For theoretical purposes it is necessary to have some simple notation for the classes formed. for most usually a class is divided into more than two sub-classes. and insane.nd%ior hearing. It is convenient to use single symbols also to denote the absence of the We shall employ the Greek letters. AB represents the combination blindness and deafness. ABy those who are deaf amd blind to entered. . The class. Generally "a" is equivalent to "non-. as above. has been termed by logicians classification. blind. viz. . instead of twofold or dichotomous) processes of classification. G. aB. —NOTATION AND TERMINOLOGY. B. y.: I. . and If a third attribute be noted. 7. natural or artificial. . B positions of letters. sanity. deafness. but not insani^ and so on. represents sight.c/. a j8. will be termed the class A. attributes A.4. and for the numbers of observations assigned to each. in which each class is divided into two sub-classes and no more. the deaf but not-blind. aj3. definite or uncertain.e. and the methods applicable to some such cases. ABy. The classifications of most statistics are not dichotomous. or. e. the class a is equivalent to the class none of the members of which possess the attribute A. Any letter or combination of letters like A. . non-blindness . denoted say by G. A classification of the simple kind considered. to use the more strictly applicable term. (a) . 6 of the first order.e.— I. i. may be . — — 11 —NOTATION AND TERMINOLOGY. statement. (A) = (AB) + (A/3) = (. and the twelve classes of the second order which can be formed where three attributes have been noted may be grouped into three such aggregates. (ABC) (ABy) (AjiC). Thus the frequencies for the case of three attributes should be grouped as given below . (AB) = (ABC) + (ABy) = etc. the whole number of observations denoted by the letter being reckoned as a frequency of order zero.e. for the case of three attributes. and 8 of the third. any class-frequency can always he expressed in terms of classfrequencies of higher order. been noted it would be sufficient to give the sixteen frequencies of the foiirth order. It however.4(7) + (Ay) = etc. gate of frequencies of the second order. the number of A's to the number of ^'s that are B together with the number of ^'s that are not B .'s together with the number of a's. In such a complete table for the case of three attributes. The whole number of observations must clearly be equal to the number of ^. twenty-seven distinct frequencies are given 1 of order zero. Class-frequencies should. in no case necessary to give such a complete is. {AB) {Afi) (AC) (Ay) (aC) (ay) (BC) (By) ifiO) (!)• (aB) Order 3. no more need be given. N {A) (a) (B) (/3) (G) (r) 2. (2) Hence.) = (B) + (P) = BU. instead of enumerating all the frequencies as under (1). N=^(A) + (a. Thus : — . 12 of the second. than If four attributes had the eight frequencies of the third order. i. since no attributes are N specified : Order Order Order 0. and so on. in tabulating. The classes specified by all the attributes noted in any case. be arranged so that frequencies of the same order and frequencies belonging to the same aggregate are kept together.j3) = etc. (aBG) (aBy) (a/3C) {Apy) 13. 1. classes of the rath order in the case of n attributes. = (AB) + (Aji) + (aB) + (a. 12. . {AB). C low — B nutrition. . the classsymbols for which both contain the same pair of letters {ABC) + {ABy) = {AB) = 338.g. i.g. the frequency of any second-order class. nerve signs. A development defects.4 or u. e.000. Hence we may say that it is never necessary to env/merate more than the ultimate frequencies. =2". {ABC) {ABy) {AfiC) {Afiy) 57 281 86 {aBG) {a. the number of order. . C or y).086 {AB) {AC) {BC) 338 143 135 57 286 {ABC) The number n attributes.) number of school children were examined for the presence or absence of certain defects of which three chief descriptions were noted. {A) is given by the the four third-order frequencies. . including the whole number of obser- vations N.— —— — 12 THEORY OF STATISTICS.By) (a/36') 453 (a^y) 78 670 65 8310 equal to the grand The whole number total: i\r= 10. Similarly. the class-symbols for which contain the same letter The frequency {ABC) -f {ABy) + {AjiG) + {Afiy) = {A) = 877. The complete results are N (-4) iB) (C) 14. e. Hence the whole number of ways in which the class-symbol may be written. (See reference 5 at the end of the chapter. Example i. or the of ultimate frequencies in the general case of number of classes in an aggregate of the nth is given by considering that each letter of the class-symbol be written in two ways (. is given by the total of the two third-order frequencies. total of of observations N is of any first-order class. find the frequencies the positive classes. may classes.e. 10. and that either way of writing one letter may be combined with either way of writing another. A termed the ultimate classes and their frequencies the ultimate frequencies.000 877 1. is 2x2x2x2 . All the others can be obtained from these by simple addition. B or ft. of Given the following ultimate frequencies. but are only indirectly given by the frequencies of the ultimate classes. Compare.8)-(a. the latter gives the second-order frequencies (AB). 2". B's. (The number of combinations of n things 2 together) (The number of combinations of on. for instance. The positive class-frequencies.. but any other set containing the same number of algebraically independent frequencies. Further. B.(a(7) + (a^C) = J!r-{A). The is 2". they become the symbols for the positive classes. series „ ^^-1) I "•" I n(n-l){n-2) 1. form one such set. and C.. {AG). ={a)-{aB) (a/3y) = N-{A)-{B) + {AB) = (a. thus (a/3) number of 16. 17. 1. Their number is. as given in Example i. viz. — AND TERMINOLOGY.(ABC) (3) (4) . 1 n n{n- 2.. and C's. none of these fundamentally important ligures without the performance of more or less lengthy additions. Otherwise the number N is made up 0.80 = N-{A).(B) . as may be readily seen from the fact that if the Greek letters are struck out of the symbols for the ultimate classes. which are necessary for discussing the relations subsisting between A. therefore the total set of positive class-frequencies is a most convenient one for both theoretical and practical purposes. Order Order Order Order (The whole number of observations) (The number of attributes noted) . 2". . 15.(B) + (AB) . including under this head the They are algetotal number of observations N.. braically independent . for which must be substituted.2 is + • • • • the binomial expansion of positive classes 1)" or 2". § 13.. no one positive class-frequency can be expressed wholly in terms of the others. moreover.. The expression of any class-frequency in terms of' the positive frequencies is most easily obtained by a process of stepby-step substitution .((7) + (AB) + (AC) + (BC) .2. in terms of the ultimate and the positive classes respectively... 13 —NOTATION The ultimate frequencies form one natural set in terms of which the data are completely given.. may be chosen instead. with the exception of a^y . n n(m-l)(ra-2) things 3 together) 1~2~3 and so But the T . — 1) ^r-o 3. and (BC). the two forms of statement. as follows .— — I. The latter gives directly the whole number of The former gives observations and the totals of A's.3 (1 -I- 1. work.{ABy) = (A) . of persons suffering from different " infirmities. iii.{Afiy) = 10.. used in an inclusive sense. but the following (neglecting minor distinctions amongst the mentally deranged and the returns of persons who are deaf but not dumb) :-^Dumb. others. the class-frequencies thus given are {A)..(C) + {BC) .57 = 281 {A^y) = {Ay) . (B). 14 Arithmetical THEORY OF STATISTICS.000 . The statement that the symbol A is used exclusively cannot mean. 19.135-1825 = 8310 and so on. blind or mentally deranged (lunatic. vol. the symbol A denoting that the object or individual possesses the attribute A. (ABy) = (AB) . § 13. (G). Examples of statistics of consideration are afforded by the census precisely the kind now under returns.1086 . blindness by B. (ABC) (cf. This set of p. but not or G or D. Check the work of Example i. blind. for England and Wales. table xlix. but at least one notation has been constructed on an exclusive basis {cf.. that the object referred to possesses only the attribute A and no others B . {A£y). obviously. and this list he must bear in mind." any individual who is deaf and dumb.. however. Census of England and Wales. The classes chosen for tabulation are.(AC) . e. for the reader cannot know what attributes a given symbol excludes until he has seen the whole list of attributes of which note has been taken. (aBC). Example ii. however. and deranged.286 + 135 . 18. dumb and deranged but not blind blind and deranged but not dumb . or whatever other attributes have been noted. for example. blind. = 877-143-281 = 453 (ajSy) = iPy) . The symbols of our notation are. An exclusive notation is apt to be relatively cumbrous and also ambiguous. {A/3C). signifying an object or individual possessing the attribute A with or without This seems to be the only natural use of the symbol. and mental derangement by C. should be executed from first and not by quoting formulae like the above. dumb and blind but not deranged . ref. Summary Tables. imbecile. or idiot) being required to be returned as such on the schedule.453 = 10.{Ajiy) = N-{B). If. frequencies does not appear to possess any special advantages. by finding the frequencies of the ultimate classes from the frequencies of the principles. tables 15 and 16. — positive classes.g.(ABC) = 338 . it should be remarked. neither the positive nor the ultimate classes. mentally deranged . in the symbolic notation. dumb. of 1891 or 1901.{ABy. Ivii.). the symbol A. deaf-mutism be denoted by A. Census of 1901. 1891. 5). 15 whatever. is excluded as well as what is included. as well as the symbols which may represent them. low nutrition. made to the notation used in Report on the Scientific Study of the Mental and Physical Conditions of Childhood" published by the Committee. (3) Yule.5. lix.. 1903. (6) Warner. 1901." Phil. cxciv. (1) Jevons. (5). Series A. Series A. (ABC) . are " Blind and forth. Parkes Museum." or " Blind. 257. cxcvii. Boy. "On the Theory of Consistence of Logical Class-frequencies and its Geometrical Representation. 2'rans. when classes are verbally described.. : Material has been cited from. "Notes on the Theory of Association of Attributes in Statistics. 125.) The following are the numbers of boys observed with certain classes of defects amongst a number of school-children. Soc." in the table relating been quite clear. " On vol. G. that the description is complete." Jour.. -S. Macmillan. etc. A. Boy. 189. with the notation slightly modified to that employed in the next three memoirs cited.. and Lunatic. U." is used in the sense " Blind and Dumb. ' ' . 91. EXERCISES. p.) (2) Yule. Stanley.. to " REFERENCES. and states what. Soc. Soc. 94 of (3). —NOTATION AND TERMINOLOGY. . nerve signs . "Mental and Physical Conditions among Fifty Thousand Children. 121. (The first three sections of (4) are an abstract of (2) and (3). Trans. Keprinted In Pure Logic and other Minor Works ." Biometrika. and reference (5) Wabner. p. F.ii. combined infirmities. Moy. p. The remarks made as regards the tabulation of class-frequencies at the end of (2) should be read in con- nection with the remarks made at the beginning of (3) and in this chapter cf. W... " On the Association of Attiibntes in Statistics.. but not Lunatic or Imbecile. . G. C. 1900. 1. vol. and others. and so on for the others. 3870. The " Blind " includes those who Dumb. In the first table the headings are inclusive.— I. footnote on p. are naturally used in an inclusive sense. Adjectives." and so But the heading " Blind and Dumb. p.. and Phil. it merely excludes the other attributes noted in the particular investigation. vol." Phil. (Figures from ref. D. (The method used in these chapters is that of Jevous.. a General System of Numerically Definite Heasoning. in the second exclusive. in this respect. G. 1896. and care should therefore be taken. Soc. Stat. denotes development defects . vol. The terminology of the English census has not. if anything. U. (4) Yule." Memoirs of the Manchester Lit." etc. eta. in the same way as our notation.. 1890. F. Dumb.. Tlie fallowing are the (5). (Figures from ref.— 16 THEOEY OF STATISTICS.) positive classes for the girls in the same investigation : jV . frequencies of the 2. . . say.') = Number — . For actual work on any given subject. used in this sense by writers on logic. and so on. we should strictly use living in 1901. and that is sufficient. 2. 1-3. denote the combination of attributes. the symbols : : — ( U — — B TI) \UA. to England in 1901. [. or simply the universe. Consistence 7-10. space. may be considered as specified by an enumeration of the attributes common to all its members.UB) = . of English males over 60 living in 1901. like any class. general. A insanity. however. over 60 years. living in 1901. the symbol ought to be written if. may be limited to England. those implied by the predicates It is not. or even to English males over 60 years of age in 1901. The of observation or univeree and its specification by symbols Derivation of complex from simple relations by specifying the universe 5-6. no term is required to denote the material to which the work is so confined the limits are specified. We know that such attributes must exist. 17 2 .— — CHAPTER IL CONSISTENCE. English male over 60 blindness. male. to take the illustration of § 1.. to English males in 1901. An investigation on the prevalence of insanity. insane English males over 00 living ill 1901. and the common symbol can be understood.g. Conditions of consistence for three field 4. may be adopted as familiar and convenient. But for theoretical purposes some term is almost essential to avoid circumlocution. e. In strictness. of age.. necessary to introduce a special letter into the classsymbols to denote the attributes common to all members of the universe. 1. Any statistical inquiry is necessarily confined to a certain time. {UAB)= blind blind and insane English males over 60 living in 1901. for instance. or material. — — — attributes. The universe. The expression the universe of discourse. Conditions of consistence for one and for two attributes 11-14. in' Ejujlish. = <UAB) + (UAm + (UaB) + {Ual3) = etc. Thus starting from the simple equation may : {a. A or B ox case U AB {ABG) = {ABGD) + {ABG^) and so on.(B) . (a/3y) we have = {y)-(Ay)-{By) + (ABy) = ]f-{A).— 18 — THEORY OF STATISTICS.(C) + (AB) + (AG) + (BG) . I. class-frequencies 5. writing in the latter of the universe. to denote the common general relations (2).4's that are not-.) = If-(A).. however. using attributes of all the members of the universe and {U) consequently the total number of observations If. =(UA) + (Ua) = (I7B) + (U/3) = eto. Specifying the universe. as y. Similarly. 3. the instead of the simpler symbols If (A) (JB) (AB). Chap. Hence any the attribute or comhirvatwn equatio7i the common to all the class-syniboU in an may be of attributes regarded as specifying universe within which equation holds good.ff. we have. The more complex may be derived from the simpler relations between class-frequencies very readily by the process of specifying the universe. should in strictness be written U in the form (CO (UA) = (UAB) + (UAI3) = {UAC) + (?7^-y) = eto. Clearly. § 13. (UA£) = (UABCr) + (UABy)== etc. again. e.5 within the iimiverse C." The equation {AG) = {ABG) + {Aj3G) be read "The number of A's is equal to the number of ^'s that are B together with the number of . /3.(ABC). by specifying the universe as {al3) = (/3)-{A/3) = If-{A}-{B) + (AB). we might have used any other symbol instead of to denote the attributes common to all the members or ABC. Any observed within one and the same universe which have been or might have been may IJe said to be ." i. Thus the equation just written may be read in words: "The number of objects or individuals in the universe ABC is equal to the number of D'b together with the number of not-i>'s within the same universe. 312 . for the case of n attributes. — They imply. any one frequency of the set may vary anywhere between and oo without becoming inconsistent with the others. all others must be so.470 + 42 + 147 + 86 . the conditions may be . (aySy) in fact. Clearly no class-frequency can be negative. (2) the positive class-frequencies. They might have been observed at different times. be limited by a simple consideration. and so verify the fact that they are positive. but others are by no means of an intuitive character. Apart from this. then. It follows from what we have just said that there is only one condition of consistence for the ultimate frequencies.525 . Suppose. For the positive class-frequencies. we may say that any given class-frequencies are inconsistent if they imply negative values for any of the unstated frequencies. the data are given N {A) {B) (C) 1000 525 312 470 (AB) {AC) (BC) (ABC) 42 147 86 25 Yet they there is nothing obviously wrong with the figures. (1) the ultimate. The conditions of consistence are some of them simple. 7.— II. Generally. that' they must all exceed zero. Hence we need only calculate the values of the ultimate class-frequencies in terms of those given. viz. definite universe. viz.25. As we saw in the last chapter. and do not in any way conflict. To test the consistence of any set of 2" algebraically independent frequencies. there must have been some miscount or misprint. —CONSISTENCE. being derived from the ultimate frequencies by simple addition. If the ultimate class-frequencies are positive. 19 consistent with one another. are alleged to be the result of an actual inquiry in a. but they cannot have been observed in one and the same universe. a negative value for {ajSy) — = 1000 . for instance. we should accordingly calculate the values of all the unstated frequencies. They conform with one another. are certainly inconsistent. there are two sets of 2" algebraically independent frequencies of practical importance. Otherwise they are consistent. however. and verify the fact that they exceed zero. This procedure may. = -57. If the figures. 6. consequently. = 1000-1307 + 275-25. in different places or on different material. — 20 — STATISTICS. THEORY OF expressed symbolically by expanding the ultimate in terms of the positive frequencies, and writing each such expansion not We will consider the cases of one, two, and less than zero. three attributes in turn. 8. If only one attribute be noted, say A, the positive frequencies The ultimate frequencies are (A) and (a), where are iV and (A). ia)^F-{A}. The conditions of consistence are therefore simply or, more conveniently expressed, (a) (4)^0 : (b) {A)i^]^ . . . (1) These conditions are obvious the number of A's cannot be less than zero, nor exceed the whole number of observations. 9. If two attributes be noted there are four ultimate frequencies The following conditions are given by (AB), {Afi), {aB), {ajS). expanding each in terms of the frequencies of positive classes (a) (^5)<t:0 (6) (c) (d) (a), (c), (AB)^{A) + {B)-N U^)>(^} (AB)1s>{B) or (.4^) would be negative „ „ (ap) „ „ (Afi) „ „ „ (aB) ' (2) . and (d) are obvious ; (b) is and is occasionally forgotten. It same type as the other three. None of these conditions are really of a new form, but may be derived at once from (1) (a) and or as jS respectively. The (1) (b) by specifying the universe as conditions (2) are therefore really covered by (1). 10. But a further point arises as regards such a system of limits as is given by (2). The conditions (a) and (6) give lower or minor limits to the value of (AB) ; (c) and (d) give upper or major limits. If either major limit be less than either minor limit the conditions are impossible, and it is necessary to see whether (A) and (B) can take such values that this may be the case. Expressing the condition that the major limits must be not less than the minor, we have perhaps a little less obvious, is, however, of precisely the ^ (yl)<0 1 (5)<0 of the (1), ( Those are simply the conditions {A) and (B) fulfil the conditions form (1). If, therefore the conditions (2) must be — II. —CONSISTENCE. (1) 21 possible. The conditions and ditions of consistence for the case of (2) therefore give all the contwo attributes, conditions of an extremely simple and obvious kind. 11. Now consider the case of three attributes. There are eight ultimate frequencies. Expanding the ultimate in terms of the positive frequencies, and expressing the condition that each expansion is not less than zero, we have — 22 {AB) and {BC). {ACT), — — — — THEORY OF STATISTICS. the conditions (4) give limits for the third, viz. replace, for statistical purposes, the ordinary rules of syllogistic inference. From data of the syllogistic form, they would, of course, lead to the same conclusion, though in a somewhat cumbrous fashion ; one or two cases are suggested as exercises for the student (Questions 6 and 7). The following will serve as illustrations of the statistical uses of the con- They thus ditions : Example i. — Given that {A) = {B) = {(J) = ^N The data are anA 80 per cent, of the ^'s are B, 75 per cent, of A's are G, find the limits to the percentage of B's that are C. and the conditions give (a) Ih) (c) ?(^<tl -0-8 -0-75 {d) {a) gives a negative limit <0-8 + 0-75-l >1 -0-8 +0-75 >.l +0-8 -0-75 and {d) hence they may be disregarded. From a limit greater than unity; (b) and (c) we have —that .55 per cent, nor more than 95 per can be G. Example ii. If a report give the following frequencies as actually observed, show that there must be a misprint or mistake of some sort, and that possibly the misprint consists in the dropping of a 1 before the 85 given as the frequency {BG). is to say, not less than cent, of the B'& — iVlOOO {A) {B) {G) 510 490 427 (^AB) {AG) {BG) 189 140 85 From (4) (a) we have <i:510 {BG) + 490 + 427 - 1000 - 189 - 140 H;98. But 85<98, therefore it cannot be the correct value If we read 185 for 85 all the conditions are fulfilled. of {BC). — II. — CONSISTENCE. 23 Example iii. In a certain set of 1000 observations (4) = 45, Show that whatever the percentages of ^'s (5) = 23, (C) = 14. that are A and of 6"s that are A, it cannot be inferred that any B's are C. ' — is conditions (a) and (6) give the lower limit of {BG), which find required. The We W (6) iv^ < i\r N ^^*'- <m^ (^) + (^)_.045 The first limit is clearly negative, since negative. The second must also be {AB)IN cannot exceed '023 nor (AC)/J!f '014. is any limit to (BC) greater This result is indeed immediately obvious when we consider that, even if all the B'a were A, and of the remaining 22 ^'s 14 were C's, there would still be 8 A'a that were neither B nor G. 14. The student should note the result of the last example, as it illustrates the sort of result at which one may often arrive by applying the conditions (4) to practical statistics. For given values of JV, (A), (B), (G), (AB), and {AC), it will often happen that any value of {BC) not less than zero (or, more generally, not less than either of the lower limits (2) (a) and (2) (6) ) will satisfy the conditions (4), and hence no true inference of a lower limit is possible. The argument of the type "So many ^'s are B and so many B's are C that we must expect some ^'s to be G " must be used with caution. Hence we cannot conclude that there 0. than REFERENCES. Logic, 1847 (chapter viii., "On the Numerically Definite Syllogism"). Laws of Thought, 1854 (chapter xix., "Of Statistical Condi(2) BooLB, G., tions"). The above are the classical works with respect to the general theory The student will iind both difficult to follow of numerical consistence. on account of their special notation, and, in the case of Boole's work, the special method employed. (3) YiiLK, G. U., "On the Theory of Consistence of Logical Class-frequencies and its Geometrical Representation," Phil. Trans., A, vol. cxovii. (Deals at length with the theory of consistence for (1901), p. 91. any number of attributes, using the notation of the present chapters.) (1) MoEBAN, A. HE, Formal — 24 THEORY OF STATISTICS. EXERCISES. 1. (For this and similar estimates c/. " Report by Miss Collet on the i Statistics of Employment of urban district of Bury, 817 per Women and years of age were returned as " thousand as married or widowed, what is the lowest proportion per thousand of the married or widowed that must have been occupied ? 2. If, in a series of houses actually invaded by small-pox, 70 per cent, of the inhabitants are attacked and 85 per cent, have been vaccinated, what is the lowest percentage of the vaccinated that must have been attacked 1 3. Given that 50 per cent, of the inmates of a workhouse are men, 60 per cent, are "aged "(over 60), 80 per cent, non-able-bodied, 35 per cent, aged men, 45 per cent, non-able-bodied men, and 42 per cent, non-able-bodied and aged, find the greatest If, in the Girls " 7564] 1894). thousand of the women between 20 and 25 occupied" at the census of 1891, and 263 per [C— and least possible proportions of non-able-bodied aged men. The following are the proportions 4. (Material from ref. 5 of Chap. I.) per 10,000 of boys observed, with certain classes of defects amongst a number .(i= development defects, 5=nerve signs, i5=mental of school-children, dulness. (A)= IB)= iV =10,000 877 1,086 (D) =789 {AS) = 3SS {BD) = i5o Show that some dull boys do not many at least do not do so. 5. exhibit development defects, and state : how The following are the corresponding figures for girls iV =10,000 682 iA)= 850 (£)= (D) =689 {AB) = 2i8 {£D) = S63 and state Show that some at least defectively developed girls are not dull, how many all B'a are C, therefore all A's are 6. C," express the premisses in terms of the notation of the preceding chapters, and deduce the conclusion by the use of the general conditions of consistence. 7. Do the same for the syllogism "All A's are B, no B's are C, therefore no .4's are C." 8. Given that {A) = {B) = {C) = iiN; and that {AB)/2ir={A0/N'=p, find what must be the greatest or least values otp in oi-der that we may infer that ( BC)IN exceeds any given value, say q. 9. Show that if must be so. Take the syllogism " All ^'s are B, (^=^ A' (.-B)_2^ (C)_ (AB)_{ACI)_(BO)_ JSf ~ N =^i the value of neither x nor y can exceed J. — - CHAPTER III. ASSOCIATION. 1-4. The 6-10. The conception of association and criterion of independence. 11-12. testing for the same by the comparison of percentages Numerical equality of the diiferences between the four second-order 13. Coefficients of associafrequencies and their independence values tion 14. Necessity for an investigation into the causation of an — — — — attribute A being extended to include uon-A's. of 1. If there is no sort of relationship, any kind, between two B, we expect to find the same proportion of A'& amongst the B'& as amongst the non-5's. We may anticipate, for instance, the same proportion of abnormally wet seasons in leap years as in ordinary years, the same proportion of male to total births when the moon is waxing as when it is waning, the same proportion of heads whether a coin be tossed with the right hand or the left. Two such unrelated attributes may be termed independent, and wG have accordingly as the criterion of independence for A and B attributes A and — {AE)_{m (^) If this relation ~ m ^' (/8) hold good, the corresponding relations (a^_(a^) (^) (/«) {AB) ^ (oB) (A) - (a) (^)_(a£) must also hold. For it follows at once from (1) that (B)-(AB) J/3)-iAft) (^) 25 (/8) ' — THEORY OF STATISTICS. ia • 26 that (aB)Jap) identities and the other two 2. may be similarly deduced. however, be put into a somewhat The equation different and theoretically more convenient form. second-order fre (1) expresses (AB) in terms of (B), (j8), and a quency (A^) ; eliminating this second-order frequency we have The criterion may, — {AB) _ (AB) + {Ap) _{A) (B) i.e. - (B) + {fi) N' is in words, " the proportion of A's, amongst the B's the same to recog- as in the universe at large." nise this equation at sight in The student should learn any of the forms ^^^ {AE)_iA) (B) (*) («) attributes The equation (d) gives the important fundamental rule If A and B are independent, the proportion of AB's in : the the universe is equal to the proportion of A'a midtiplied by the proportion of B's. The advantage of the forms (2) over the form (1) is that they give expressions for the second-order frequency in terms of the frequencies of the first order and the whole number of observations alone; the form (1) does not. Example i. If there are 144 A's and 384 B's in 1024 observations, how many AB'a will there be, A and being independent 1 — B — III. — —ASSOCIATION. cent, 27 less closely, c/. and therefore there must be 21 per §§ 7, (more or 8 below) of AB'a in the universe to justify the conclusion that 3. A and B are independent. It follows from § 1 that if of the foiir second-order frequencies, the relation (2) holds for any one e.ff. {AB), similar relations three. must hold from (1) for the remaining jAli) _ (y3) Thus we have (A) directly {AB) + {AP) {B) + {p) #' givmg And again, (aB)_{afi)_ {aB) (B) ip) + (ali) + (fi) (a) i\r' {B) which gives of Example iii. In Example i. above, what would be the number ajS'B, A and £ being independent ? (a) (/8) , — = 1024 -144 = 880 = 1024 -384 = 640 880x640 „ •• <"^>= 1-02^ = ^^^- _,„ The theorem is an important one, and the result may be deduced more directly from first principles, replacing (AB) by its value {A){B)IN in the expansions— {aB)={B).~{AB). = {A)-{AB). (aP) = {]r)-{A)-{B) + (AB). {Ap) as an exercise for the student. Finally, the criterion of independence may be expressed in yet a third form, viz. in terms of the second-order frequencies are independent, it follows at once from and alone. If This 4. is left A B equation (2) and the work of the preceding section that iABKap)JMmm, And evidently (aB)(Ap) is equal to the same fraction. {AB)(a. B more briefly. This form of criterion is a convenient one if all the four second-order frequencies are given. are not independent. When A and B are said to be associated. Suppose now that A and B are not independent.8) = 510. A and B are said to be negatively associated or disassociated. it is not meant merely that iome . however complicated.— 28 Therefore THEORY OF STATISTICS. are A and B independent or not f — (^5) = 110 Clearly so in (a5) = 90 (^/3) = 290 (a. it is not meant that no A's are B's. The student should notice that these words are not used exactly in their ordinary senses.B){A^). (3) _{AB) (Afi) _ '- The equation the (c) -S's is " The ratio of A's to a's amongst (6) may be read equal to the ratio of ^'s to a's amongst the /3's.^)> (a. but that the number of A's which are B's exceeds number to be exjyected if A and B are independent. Example iv.il's are B's." and similarly. but related some way or other. 5. or sometimes simply associated. If the second-order frequencies have the following values. but that the number of A'a which are B's falls short of the number to be expected if A and £ the when . A and are said to be positively associated. but in a technical sense. on the other hand. enabling one to recognise almost at a glance whether or not the two attributes are independent. {AB){aP) = {aB){Ap) {a) (AB) (IB) _ - {Afi) W) (oB) (a^e) ^^\ ^''^J . If. disassociated. A and B Then if (AB)>'^^. Similarly. B A and are said to be negatively associated or. — III.. When {AB) falls to either of these values.e. {A){B)IN' = 44 X 53 \qq = 23-32. This conception of degrees of association. so that we may speak of attributes being more or less. 6. 26 18 27 ^. the greater. highly or slightly associated. mere fact that sorne A'& are B's. 7. that such association cannot indicate any real connection between the results of the . is important. {A)." or more narrowly to the case when both these statements are true. The greater the divergence of {AB) from the value {A){B)IN towards the limiting value in either direction. N be said to be completely disassociated. from the nature of the case. while actually {AB) is 26. The lowest possible value of (which{AB).. where {AB) only differs from {A){B)/2f by a few units or by a small proportion. . If we use A to denote " heads " in the first toss. . between But it the result of the first throw and the result of the second. {B) = 53. t First toss tails and second heads . it may'be that such association. § 13). B " heads '' in Hence the second. and the tosses noted in pairs . is either zero or {A) + {B) ever is the greater). is not really significant of any definite relationship. Complete generally understood to correspond to one or other " or " All B's are A. „ J.. then 100 pairs may give such results as the following (taken from an actual record) association is A and B may B : First toss heads and second heads „ tails „ . Complete disassociation may be similarly taken as corresponding to one or other of the cases. —ASSOCIATION. When {AB) attains either of these values.7 talis . however great that proportion this principle is fundamental. on the other hand. is the intensity of association or of disassociation. and {B) is either {A) or {B) (whichever is the less). The greatest possible value of {AB) for given values of N. Hence there is a positive association." or it may be of the cases. 29 " Association '' cannot be inferred from the are independent. degrees which may in fact be measvired by certain formulae (c/.. . in the given record. . A and B may be said to be completely or perfectly associated. To give an illustration." or "no a's are /S. is fairly certain. and should be always borne in mind. When the association is very slight. we have from the above (-4) = 44. we may say. " All A's are more narrowly defined as corresponding only to the case when both these statements were true. "No A's are B. suppose that a coin is tossed a number of times. i. . The definition of § 5 suggests that we are to test the existence or the intensity of association between two attributes by a comparison value (as it (AB) with its independence{A){B)/M. for example. but others a negative association. . which all hold good for the case of positive association between A and B. is sometimes said to be due to chance. to an extremely complex system of causes of the general nature of which we are aware. how great the divergence must be before we can consider it as " well-marked. but of the detailed operation of which we are ignorant. to differences between small samples drawn from the same material. 8. follow at once from the definition of § 5. The conclusion is confirmed by the fact that." must be postponed to the chapters dealing with the theory of sampling. (c) and (d) follow from (a) and (6). 9. impossible to analyse. or dependence as at least unproved. and to the serious risk of interpreting a " cHance association " as physically significant. At present the attention of the student can only be directed to the existence of the difficulty. nor exactly the same proportion of male births when the moon is waxing as when it is waning. for instance. in practice. to adopt a method of comparing proportions. of a number of such records. e. (a) and (b). as leads. like the above occurrence of positive association. The discussion of the question. A little consideration will suggest that such associations due to the fluctuations Of sampling must be met with in all classes of statistics. or better to the chances or fluctuations of sampling. (B) and JV in the second. on multiplying across and expanding (A) and iV in the first case. To quote.30 . the two illustrations there given of independent attributes. we know that in any actual record we would not be likely to find exactly the same proportion of abnormally wet seasons in leap years as in ordinary years. from § 1. two throws it must therefore be due merely to such a complex system ot causes. The procedure is from the theoretical standpoint perhaps the most natural. but it is usual. THEORY OF STATISTICS. the proportion of A's amongst the B'& with the proportion in the universe at large. A large number of such comparisons are available for the purpose. But so long as the divergence from independence is not well-marked we must regard such attributes as practically independent.g. Au event due. as indicated by the inequalities (4) below. The first two. The deduction of the remainder is left to the of the actual value of may be termed) student. Such proportions are usually expressed in the form of percentiages or proportions per thousand. some give a positive association (like the above). . 1230] 1903. i. (Material from 64th Annual Report Reg.000.e. and between male-sex and death. [Cd.^^gj •0158. It is there is positive association usual to express proportions . or between (c) and {d). We find (AB) _J85fil8 (A) "15. Example vi. for then we have {A)IN= The meaning -7 X -9 + -4 X -1 = -67 of {a) or (6). with regard to the more important aspect of the problem under discussion. (a^)_ (a) 265. Where no definite question has to be answered or hypothesis tested both pairs of proportions may be tabulated. ments are equivalent if {B)IN=0'^. or the hypothesis to be tested. 1901 Females „ „ „ Of the Males died Of the Females died . however.618 265.e.848. Association between sex and death.000 = Therefore {AB)j{A)>{aB)/{a)." ^. .773. cannot be fully realised unless the value of {B)/]V (or {A)IN in the second case) is known. the number of deaths by (B) . . below). General. There still remains the choice between (a) and (b). again. the exact question to be answered. in fact.000 16.967 16.848. so that {A)IN approaches closely to {Aft)/{ft) or (B)/M to {aB)/{a) (cf.000 285.) — Males in England and Wales. This must be decided with reference to the second principle. An exception may. i.— 32 this — — THEORY <)F STATISTICS. the proportion of males that died and the proportion of females. and therefore (c) is to be preferred to (a). and {d) to {b). . But if it were only stated that {AB)I{B) between = -70 (A)IN= -67 Yet the two statethe association would appear to be small. then the natural comparison is between {AB)/(A) and {aB)/{a). be made in cases where the proportion of B'& (or A's) in the universe is very small.967 We may denote the number of males by (A). 15. A would mean a considerable positive association and B. as in Example vi. as illustrated by the examples below.773. Example v. to recognise such forms of statement as the following. 15'8 . /The student should learn. [Cd. . which is the point to be investigated.. but it should be remembered that the latter depends on the sex-ratio as well as on the causes that determine the death-rates amongst males and females.528. etc. as equivalent to the above : Proportion of males amongst those that died in the year . . bo that the above figures Death-rate „ among Males „ Females . however. IS'l per thousand. seeing that (A)/J!^ and (5)/iV' differ very (B).246 451 associated Eequired. 32. . of deafOne of the comparisons (a) or (b) may very well be used in this case. 1523. but the form is not desirable. and not with the sex-ratios of the living and the dead. 1 g^g '^ thousand. . Total population of England and Wales Number of the imbecile (or feeble-minded) Number of deaf-mutes Number of imbecile deaf-mutes .4/3)/(j8) and {aB)j(a) respectively. We may from (. imbecility. to find whether deaf-mutism is with mutes by little denote the number of the imbecile by (A). " Since (AB)/{S)>{AP)/(^). but is not so clear an indication of the difference between males and females. . with death-rates. illustrating very well Statisticians are concerned the remarks on the opposite page.000 48. 16'9 „ This brings out the difference between the death-rates of males and of the whole population. j Proportion of males amongst those {too that did not die in the year ] . to the population as rates per thousand .882 15. A comparison of the form (4) (c) is again valid for testing the association. would be written . —Deaf-mutism Summary (Material from Census of 1901. 18-1 per thousand. A comparison of the death-rate among males with the deathfor the whole population would be equally valid. that there is between A and £.— III. of deaths. and Imbecility. Tables. The question 3 .. .]) . whole population . positive association it follows. marriages. . . births. . Example vi. The above figures give rate — Death-rate „ among males for . as before. — 33 —ASSOCIATION. would be reckoned as giving two to the class AB and one to the class A^. ) | ) „„ 76 per cent. . amongst the imbecile.. .—— . 230 Required to find whether the colour of the son's eyes is associated with that of the father's. ._ whole population {£)/JV f . " . — 34 THEORY OF STATISTICS. . Either comparison exhibits very clearly the high degree of assobetween the attributes. . p. ) „„ „ | 'o per cent. 1 Proportion of deaf-mutes amongst the imbecile (AB)/(A) . . . Troms. Example vii.g. Proportion of deaf-mutes in the „. e. 2. (6) it is desired to exhibit the conditions will be preferable. In cases of this kind the father is reckoned once for each son . The best comparison here is Percentage of light-eyed amongst the sons of light-eyed fathers ..J " But the following Percentage Percentage is equally valid of light-eyed fathers of light-eyed sons amongst the . whether to give the preference to (a) or to (b) depends on the nature of the investigation we wish to make. Phil. If it is desired to exhibit the conditions among deaf-mutes (a) may be used : Proportion of imbeciles among deaf. J Proportion of imbeciles in the whole population = (^)/iV . that census data as to such infirmities are very untrustworthy. as given by Professor Karl Pearson. two sons light-eyed and one not. T I „ „ "^ P^'^ thousand. A. . however. lie j If. the . Eye-colour of father and son (material due to Sir Francis Galton. vol. „ (a/3) » . . cxov. . a family in which the father was light-eyed. ciation — Fathers with light eyes and sons with light eyes (AB) not light (A^) :. (1900). on the other hand. ill 151 148 . ) mutes = {AB)I{B) 39 ^ thousand. . . . and 3 of the memoir treated as light). 138 . „„ Percentage of light-eyed amongst the sons of not-light-eyed fathers . . of light-eyed amongst fathers of not-light-eyed sons .. . „ „ » not light light (aB) „ „ not light . It may be pointed out. the classes 1. ) " J . the power of estimating the character of parents from that of their offspring. . that it is often desirable to shall use the symbols employ single symbols to d«note them.(a/3)„ = {A^). =(«. = (a5)„ .8)o +8.{AB) = {B) .-. be shown that Similarly it may m Therefore. = 21 {aB). indicate equally clearly the tendency to resemblance between We father 11.{AB). We If 8 denote the excess of {AB) over {AB\.{cB). then we have (aJB) = {B) . and son.4^) (aB) {Aji) and {aj3). (a)(B) ' N N ' {jm i^m N ' ' jsr are of such great theoretical importance.-{aB). of parents to offspring. {A){B) . Supposing. want to make use of offspring to parents. The values that the four second-order frequencies take viz.8.—— — III. = {aB). for example. nor do we define heredity in terms of the resemblance Both modes of statement. we have {AjS) {AB) . — — 35 —ASSOCIATION. in the case of independence. {AB)-{AB).S JJ!f-{A)]{B) N = {aB)^ .{AB). .-h. = (a/3) . however. quite generally {Ap)^{AP). The reason why the former comparison is preferable is. that we usually wish to estimate the character of offspring from that of the parents. iV=100 then (^) = 60 (^) = 45 (a^)o {AB). = l8 (^/i)o=33 = 22. and of so much use as reference-values for comparing with the actual values of the frequencies (. . as a rule. and define heredity in terms of the resemblance of do not. or it will be liable to suggest a higher degree of association than actually exists. and we have 35-27 = 30-22 = 18-10 = 33-25 = 8.35 = 10.3) = 100 .g. (a. if A and B be disassociated and {AB) = say 19.j 35. and 8 I It is evident that the difference of the cross-products may be very large if be large. and {AB) — ^a.— — — — 36 If. Bring the terms on the right to a common denominator. Similarly. Example viii. we have 8 = jqq| 35x30-25x10 = ^/19x14-26x41 I =8 8. now. (^/5) = 60 .45 + 35 = 30. in = 2Q {Ap) = il (a/?) =14 19-27 = 14-22 = 18-26 = 33-41= -8.60 . then {aB) = 45 . We have by definition 8 = {AB)-{AB).35 = 25. the student {aB) will find that {AB) = \^ and 12. A and B are positively associated. = (A£)-i^. although S is really very small. The following data were observed for hybrids of N : — . taking the examples of § 11. That is to say. then we have N\ -[{AB) + {Ap)l{AB) + {aB)-\ ] = ^{{AB){alS)~{aB){Ap)\. e. THEOllY OF STATISTICS. and express all the frequencies of the numerator in terms of those of the second order . the common difference is equal to l/iVth of the difference of the " cross products " {AB){a^) and (aiB){Ap) . The value of this common difference 8 may be expressed a form that it is useful to note. this should be remembered the difference should be compared with N. In using the difference of the cross-products to test mentally the sign of the association in a case where all the four second-order frequencies are given. there is clearly a negative association.1 13. If we use the term "complete association" in the wider sense there defined. prickly laB) (aj8) „ 12 21 3 Investigate the association between colour of flower and character of fruit. i. 1 80 Per cent. 12x21 = 252.e. But S= 111/83 = 1 '3 only. 1 „„ Percentage of white-flowered plants with . . and . „„ " . so that in point of fact the association is small. 1902) Flowers „ violet. Working out the percentages we have Percentage of violet-flowered plants with prickly fruits prickly fruits . {AB) (-4/3) 47- smooth smooth Flowers white. — — —ASSOCIATION. Bateson and Miss Saunders. grouping the frequencies in a small table in a way that is sometimes convenient. + 1 when they are completely associated. it is often very convenient to measure the intensities of association in different cases by means of some formula or " coefficient. 1 . While the methods used in the preceding pages suffice for most practical purposes. Since 3x47 = 141." so devised as to be zero when the attributes are independent. 252-141 = 111. . . in the sense of § 6. so small that no stress can be laid on it as indicating anything but a fluctuation of sampling. fruits prickly .. we have. . the three cases of complete association : .1 when they are completely disassociated. Report Committee of the Eoyal Society. .. . „ . : 37 to the Evolution Datwra (W.— ril. {AB) {a^)<(aB} {AjS}. and at first sight this considerable difference is apt to suggest a considerable association. The {B) = {AB). SO that all A's are B and also all B'a are A. three corresponding oases of complete disassociation are (4) (5) (6) .— 38 THEORY OF STATISTICS. possessing the same properties but certain advantages. XJ.). § 5) the mere fact of 80. 1903. p. X. which has come into more general use. : £ £ : REFERJ^lSrCES. a measure which serves a similar purpose in the case of attributes to that served by cases of manifold (c/. For further illustrations of the use of this coefficient the reader is referred to the reference (1) at the end of this chapter. The question of the best coefficient to use as a measure of association is still the subject of controversy for a discussion the student is referred to refs.." Biometrika. "Notes on the Theory of Association of Attributes in (Contains an abstract Statistics. classification certain other coefficients in the (e/. that (c/. vol. the criterion of independence for two attributes A and B. It would be no use to obtain with great pains the result (c/. for the sake of emphasis. (1) Yule. j8's and £'s. V. (3) . based on theorems in the theory of variables. 257. and (6).) III. relations of an attribute A must not be confined to ^'s. G. (3). 90. vol.) that 29-6 per thousand of deaf-mutes were imbecile unless we knew that the proportion of imbeciles in the whole population was only 1 '5 per thousand . in the absence of information. it may be well to repeat. (4). (Deals fully with the theory of association : the association coefiBoient of § 13 suggested. to ref. (2) Yule.) and of variables Chap. 90. "On the Association of Attributes in Statistics. or 99 per cent. —ASSOCIATION. In concluding this chapter.. or 99 per implies nothing as to the association of A cent. XI. we can but assume that In order to apply 80. of the principal portions of (1) and other matter. U.. nor would it contribute anything to our knowledge of the heredity of deafmutism to find out the proportion of deaf-mutes amongst the oflFspring of deaf-mutes unless the proportions amongst the offspring of normal individuals were also investigated or known. Reference should also be made to the coefficient described in § 10 of Chap. of A's being with . the necessary information no comparison is otherwise as to a's is already obtainable) possible. though in the opinion of the present writer its use is of doubtful advantage.) . 1900. Hoy.. ii. and the references to Chaps. 14. but must be extended to a's (unless. and for a mode of deducing another coefficient. Example vi. p. 39 The coefBcient is only mentioned here to direct the attention of the student to the possibility of forming such a measure of association. cxciv. to ref. of a's may also be S. Trans. 121. (5). or concerning a universe that includes both a's and Hence an investigation as to the causal A's. for the modified form of the coefficient. G. of course.'' Phil.. Chap. Soc. it is necessary to have information concerning a's and /3's as well as A'a and ^'s. and XVI. IX. Series A. 000 males and 16. or negatively associated in each of the following cases B : (a) (6) N {A) =5000 (A) =2350 (B) (o) =3100 = (AB)= 3. Tcgl.) Lipps. vol. whether A and are independent.799. the coefficient of association of § 13 is again suggested independently. Peakson. math. Soc." Proc. 490 256 {AB)= 294 (. ix. Roy. 1.729. Series A.nAes. (Deals with the problem of measurement of intensity of association from the standpoint of the theory of variables. Siat. 1905. 3.000 females. (Deals with the general theory of the dependence between two characters. F. Karl. p. vol. 1913. and draw attention to any special points you notice. Trans.e. For a criticism see ref.) : .. 1915. Yui. How many of each sex for the same total number would have been deaf-mutes if there had been no association ? 2. viii. oxov. ref. Feb. critical survey of the various coefficients that have been suggested for measuring association and their properties : a modified form of the coefficient of § 1 3 given which possesses marked advantages. Roy. 113. At the census of State proportions exhibiting the association between deaf-mutism from childhood and sex. .) species that were above or below the average height.) Gkeenwood. and 3072 females. M. Soc. Soc. as briefly as possible. and David Heron. Ixxv..8)= 570 48 (^5) = 1600 (a£)= 380 (a. pp. and 6. and the conclusion that none of these coefficients are of much value for comparative purposes in interpreting statistics of the type considered. England and Wales in 1901 there were (to the nearest 1000) 15. Kakl... G. Pbaeson. and the interpretation of such statistics in general. 3. 159-332.aB)= 768 = (^.fertilised parentage Investigate the association between height and cross-fertilisation of parentage.and Self-fertilisation of Plants. Klassed. saclisischen Gesellschafl d. (Cited for the discussion of association coefficients in § 4. Leipzig. U." PAi'Z. p. p.-phys. 1. vol." Jour. TJ. (A reply to criticisms in ref. "On the Methods of Measuring the Association between Two (A Attributes. vol. 579-642." Bwmetrika.) ) — 40 (3) THEORY OF STATISTICS. 1900. Soy. (4) (5) (6) (7) EXERCISES.8)= 144 (Figures derived from Darwin's Gross. however classified .. Show. G. 294. 1.. Yule." Berichte d. "On Theories of Association. 3497 males were returned as deaf-mutes from childhood."DieBestimmungderAbhangigkeitzwischendenMerkmalen eines Gegenata. pp. stating separately those that were derived from cross-fertilised and from self. positively associated. 1912. Wissenschaften. Species. " On the Con-elation of Charactere not Quantitatively Measurable. The table below gives the numbers of plants of certain cf. " The Statistics of Anti-typhoid and Anti-cholera Inoculations. giving a method which has since been largely used only the advanced student will be able to follow the work. of Medicine. Trace the association between blindness and mental derangement from childhood to old age. Give a short verbal statement of your results.. tabulating the proportions of insane amongst the whole population and amongst the blind. together with the number of the blind {A).— III. the values oi{AB)f„ etc. 50 79 89 . . ) Investigate the association between eye colour' of husband and eye colour of wife ("assortative mating") from the data given below. 41 (Figures from same source as Example vii.dark eyes M/S) . . but material differently Investigate the classes 7 and 8 of the memoir treated as " dark. 6. Husbands „ not-light eyes {A$) with not-light eyes and wives with light eyes {oB) not-light eyes {a$) . i. „ .") ..) The figures given below show the number of males in successive age groups. (oif) Fathers with not-dark eyes and sons with dark eyes not-dark eyes (o/3) . .. i. 5. and also the association coefficient Q of § 13. etc. 309 214 132 119 Also tabulate for comparison the frequencies that would have been observed had there been strict independence between eye colour of husband and eye colour of wife. grouped Fathers with dark eyes and sons with dark eyes i-^^) not.e. iii. association between darkness of eye-colour in father and son from the following data: 4.. of the mentally-deranged (5). . (^S)o..... .e. and the blind mentally -deranged (AB). as in question 4. (AB) . . vol. .. „ . (§ 11). the values of {A-B)^. p. . 1891. (Figures from same source as above. . —ASSOCIATION. Husbands with light eyes and wives with light eyes . 782 Also tabulate for comparison the frequencies that would haTC been observed had there been no heredity. : the data cannot be regarded as trustworthy.. (Figures from the Census of Englwnd and Wales. 34. to the association of A with C and of B with C. Uncertainty in interpretation of an observed association 3-5. or whether it' is of any other particular kind that we may happen to have in our minds at the moment. If we find that in any given case iAB)> all or <(^. the argument is that the observed association between A and B is due to the associations of both with C that is known is that there is between A and B. Any interpretation of the meaning of the association is necessarily hypothetical. Chap. a relation itself of some sort or kind cannot tell as whether the relation is direct. living in very unhygienic conditions. 2. but due. PARTIAL ASSOCIATION. The result by : : i2 . hygienic conditions by C. Source of the ambiguity : partial associations 6-8. third 9.e. The commonest of all forms of alternative hypothesis is of this kind it is argued that the relation between the two attributes A and B is not direct. Estimation of the partial associations from the frequencies of the second order 10-12. An illustration or two will make the matter clearer (1) An association is observed between "vaccination" and " exemption from attack by small-pox. and the number of possible alternative hypotheses is in general considerable. more of the vaccinated than of the unvaccinated are exempt from attack. g§ 7-8). 1-2. whether possibly it is only due to " fluctuations of sampling" (c/. but is wholly due to the fact that most of the unvaccinated are drawn from the lowest classes. III. Illusory association due to the association of each of two attributes with a. It is argued that this does not imply a protective effect of vaccination." i. The case of complete independence. Denoting vaccination by A. exemption from attack by B. The total number of associations for a given number of attributes 13-14. — — — — — 1.— CHAPTEE IV. in some way. it cannot be for the reasons suggested in § 2. —PARTIAL ASSOCIATION. It is argued that this does not mean an influence of expenditure on the result of elections. and C. Denoting winning hj A. The fact would prove that the association between vaccination and exemption could not be wholly due to the association of both with hygienic conditions. (3) An association is observed between the presence of some attribute in the father and its presence in the son. and compare the percentages of Conservatives winning elections when they spend more than their opponents and when they spend less. and grandfather by A.IT. Thus. The biological case of the third illustration should be similarly treated. The ambiguity in such cases evidently arises from the fact that the universe of observation. B and C. and an association were still observed between vaccination and exemption from attack. but both. 3. or else all do not. and it is still sensible. in the second illustration. say. 43 (2) It is observed. or objects not possessing it. and Conservative by C. and that the Conservatives generally spent more than the Liberals. if the statistics of vaccination and attack were drawn from one narrow section of the population living under approximately the same hygienic conditions. possess the attribute. father. then the association first observed between A and G for the whole universe cannot have been due solely to the observed associations between A and B. B and C . but is due to the fact that Conservative principles generally carried the day. If the association between A and C be observed for those cases in which all the parents. and also between the presence of the attribute in the grandfather and its presence in the grandson. though of course others might remain. that a greater proportion of the candidates who spent more money than their opponents won their elections than of those who spent less. the supposed argument would be refuted. Again. in each case. contains not merely objects possessing the third attribute alone. If the percentage is greater in the former case than in the latter. spending more than the opponent by B. if we confine our attention to the " universe " of Conservatives (instead of dealing with candidates of both parties together). the argument is the same as the above (c/. If the universe were restricted to either class alone the given ambiguity would not arise. the question arises whether the association between A and C may not be due solely to the associations between A and B. Denoting the presence of the attribute in son. B. at a general election. Question 9 at the end of the chapter). in the first illustration. respectively. we shall avoid the possible fallacy. ^ (ySC) (AG) ^°'> These inequalities may easily be rewritten for any other case by making the proper substitutions in the symbols . throughout. thus to obtain the inequalities for testing the association between A and G in the universe of B's. (ABC)>(^^ and negatively associated (1) As in the converse case. we must have. III. (ABC) (AG) ^"'' (ABC) (AC) (BC) . If ^ABGn). . apply of course equally to the present case. or where the relations of more than three have to be taken into account.. it is argued that this is due to the association of both A and B with D. III. THEOKY OF STATISTICS.) when . If. 31).^Aomm. and vice versa. The remarks of § 10. as to the choice of the comparison to be used.). A and B will be said to be posiin the universe of C's tively associated in the universe of C's (c/. Chap. The associations observed between the attributes A and B and the universe of y's may be termed partial associations. it being remembered that the order of the letters in the class-symbol is immaterial. Chap. when it is observed that A and B are still associated within the universe of C's. the argument may be tested by still further limiting the £eld of observation to the universe GD. the association is most simply tested by a comparison of percentages or proportions (§ 9. B must be written for C. if A and B are positively associated within the universe of C's. A and B are positively associated within the universe of GD's. to quote only the four most convenient comparisons (c/. . II. Confining our attention to the more fundamental method.. III. Chap. and the association cannot be wholly ascribed to the presence and . ji for y. § 4 of Chap.44 4. ^"^ {BCy^ JCJ (ABC) UpC) (BG) > ^ (C) (ABC) ^' (aBG) (aC) (2) . Though we shall confine ourselves in the present work to the detailed discussion of the case of three attributes. 5. the deiinition of § 5 of Chap. III. it should be noticed that precisely similar conceptions and formulse to the above apply in the general case where more than three attributes have been noted. in the simpler case. although for some purposes a "coefficient of association" of some kind may be useful. p. to distinguish them from the total associations In terms of observed between A and S in the universe at large. . (4) (a)-(d).. — IV. however complicated.) The following are the proportions per 10. and also to mental-dulness . (Material from ref. The following figures illustrate. A and being thus common effects of the same cause B (or another attribute necessarily indicated by B).A) who 1 _ 338 _5o. The phrase " connecting link " is a little vague.B-universe and the /8A and universe B B D B B : For the entire material dull =(Z))/JV Proportion of the dull=(I>' roportion . Partial associations thus form the basis of discussion for any case. (B) the number of the "dull. : —PARTIAL If it ASSOCIATION. (p.. The two following examples will serve as illustrations for the case of three attributes. Example i.000 of boys observed with certain classes of defects. . . I. weie dvM=(. 31-2). may very well be used (cf. (B)/]^." Discuss this conclusion.AD)/(. the association of A and B being tested for the universe ODE. (B)/JV are small. . (A) denotes the number with development defects. and so on as far as practicable. then. ." — N (A) (B) (Z)) 10. or (2) (a) (b) above. comparisons of the form (4) (a) or (b) of Chap. and may be similarly treated by investigation of the for the universes and /3.k ~ ~ ^SfT .086 (AB) (AD) (BB) 789 (ABB) 338 338 455 153 The Report from which the figures are drawn concludes that " the connecting link between defects of body and mental dulness is the coincidept defect of brain which may be known by observation of abnormal nerve-signs. (£) with nerve-signs. pp. the association between for the whole universe. The case is thus similar to that of the first illustration of § 2 (liability to small-pox and to nonvaccination being held to be common effects of the same circumstances). III. . and not directly influencing each other. 31). . partial associations between A and As the ratios (A)/]!^. of and absence of be then argued that the presence and absence E is the source of association. = ) '°" = 7-9 per cent. but it may mean that the mental defects indicated by nerve-signs B may give rise to development-defects A.. defectively developed . the remarks in § 10 of the same chapter. 45 absence of D as suggested. nor to the presence C and D conjointly. 5 of Chap. amongst a number of school children.000 877 1. the process may be repeated as before. the . ) The original data are treated as in Example vii. ^Is^t-^y^i. This result does not appear to be in accord with the conclusion of the Report. families with not less than 6 brothers or sisters. the second would give 3 x 1 x 4 = 12 to the class ABC. .. p. every possible line of descent is taken into account. (ABC) (ABy) (AI3C) (Apy) 1928 596 552 508 (aJBG) (aJBy) (al3G) (oySy) .— — 46 THEORY OF STATISTICS. tight-eyed. The table only gives particulars for 78 large table 20. . parent by B. of the last chapter (p. ugUt^ei. A D Exwmple ii. 16 to aBC. The class-frequencies so derived from the whole table are. who \ . the association between A very high indeed both for the material as a whole (the universe at large) and for those not exhibiting nerve-signs (the )S-universe). . 216. „gjityed. — Grandparents p. taking the following two lines of the table.. = 12 to the class ABy.. were dull =(^£Z))/(^5). 4 13 4x1x3 303 225 395 501 7 LiglS-°eyed the first would give 4 x 1 x 1 = 4 to the class ABC. who \ _ 153 : .^ /3) 185 ~ _q>./~~338 For those not exhibiting nerve signs Proportion of the dull =(flZ))/(e) . Light-eyed. . 16 to aftC. as we have interpreted it. 5 to a/JC. .o 539" . 15 to oBy. so that the material is hardly entirely representative. Children Parents a. and none to the remainder. =_!^=41-9 _jp. =xpT = _ 3*7 . 12 to A^C. 12 to A/Sy. parent and child. 4 3 5 4 11 11 B. grandparent by C. but serves as a good illustration of the method. . C. defectively developed „ .o per cent.. Denoting a light-eyed child by A. 4 to A/SC. A. . for the association between and in the ^-universe should in that case have been very low instead of very high. defectively developed . Eye-colour of grandparent. (Material from Sir Francis Galton's Natwral Inheritance (1889). : For those exhibiting nerve signs Proportion of the dull =(5i)/(B) . 5 to aBC. and 15 to a^Sy . 34)./ were dull = (^ei?)/. but it is very small for those who do exhibit nerve- The results are extremely striking is and D signs (the 5-universe). Thus. / _ (-^3) _^^0_ (yS) 1956 " the above cases we are really dealing with the between parent and oiFspring. as might be expected. parents and children. —PARTIAL ASSOCIATION.„„. tested (1) where the parents are light-eyed. in order to throw light on the real nature of the There are two such partial associations to be resemblance.= * (ABy) ^' ggf^ 72 6 „ . respectively :— Gramdparents a/nd Parents. = -T^-r.= jggQ = 44 '9 l7J (By) 821 „ Parents and Children. . J . (2) where they are The foUowing are the comparisons not-light-eyed. : : Crrandpa/rents a/nd Grandchildren Proportion of light-eyed amongst the grandchildren of light-eyed grandparents ) : Parents light-eyed. [ 1 = (ABC) . = gSoT = 86 '4 596 1928 per cent.iparents J _ (-SC) _ 2231 _ ^^ percent ^ (C) 3178 = -j-i.) .1 We proceed now to test the pa/rtial associations between grandparents and grandchildren. .}parents J Proportion of light-eyed amongst the ^ grandchildren of not-liglit-eyed [grandparents . Proportion of light-eyed amongst the \ children of light-eyedgrandparentsj Proportion of light-eyed amongst the ) children of not-light-eyed grand. <^7 The following comparisons indicate the association between grandparents and parents. approximately the same . Proportion of light-eyed amongst the "j grandchildren of light-eyed grand. Proportion of light-eyed amongst the ") grandchildren of not-light-eyed |giandparonts . = -7--y — roo(5 = 60'3 ^'^' (Ay) 1104 . .—— IV. and grandparents and grandchildren. in the next case it is naturally lower In both association : Grandparents a/nd Gramdchildren. = (AC) = 5770 = ^8 '0 ttft 2480 l'') ^i'» percent. as distinct from the total associations given above. . \ _ (^^) _ 2524 _ '^^ " (-B) "3052 J ' P^ ° Proportion of light-eyed amongst the 1 children of not-light-eyed parents. and consequently the intensity of association is. Proportion of light-eyed amongst the children of light-eyed parents . and {ABC)/{BC) If If form an ascending series. It may be noted. unless C is independent of either A or B or both. (AB)/{B) > {A)/JV. If A and B . The general nature of the fallacies involved in interpreting associations between two attributes as if they were necessarily due to the most obvious form of direct causation is more clearly exhibited by the following theorem : are independent within the universe of C's and also within the universe of y's. Hence {A)IN.. _ -S^ . In both oases the partial association is quite well-marked and . giving {ABCI))/{BCD). as well as a parental heredity.=i-nAq = 50 '3 grandohildren of not-light-eyed J1009 Irindpr... etc. —--^-j . =71-6 per cent. however.. and so forth. (^^)/(-«) . then. be due wholly to the total associations between grandparents and parents. as it is comparatively of little consequence. Proportion of light-eyed amongst the \ children of light-eyed parents J- _ - „. The series would probably ascend continuously though with smaller intervals.. {AB)/(B).— 48 — THBOKY OF STATISTICS. A and C are positively associated in the universe of B's. amongst the 1 grandohildren of light-eyed grand. Thus we have from the given data '"°°^^!}= ^XSingen^r^^' (^V^ . „. etc. respectively There is an ancestral heredity. A and D being positively associated in the universe of BG's. {ABC)l{BCr) > {AB)I{B). The above examples will serve to illustrate the practical application of partial associations to concrete cases.. only. the series might be continued. (ABO) . as it is termed. . {ABCDE)I {BODE).= 58 ***' '3 per cent. the total association between grandparents and grandchildren cannot. etc.. Grandparents and Grandchildren Proiroi'tion of light-eyed : Parents not-light-eyed. 7 „ Proportion of light-eyed amongst the") children of light-eyed parents and h ={ABO)/(£C) = 86'i „ If the great-grandparents. A and E in the universe of BGD's.\ parents J = 6B2 ^ = -rj=. were also known. tliey will nevertheless be associated within the universe at large. m . A and B are positively associated. We need not discuss the partial association between children and parents. parents and children. that the most important feature may be brought out by stating three ratios positive . 6. as regards the above results. ^'"'J Proportion of light-eyed amongst the (ABy) 608 "I =-7^-?. for the right-hand side will not be zero unless either (AC) = (AC)^ or (BG) = (BC)^. negative. subtract (AB)^ from both sides of the above equation. In (2) it is argued that the positive associations between conservative and winning. give rise to an illusory positive association between In (3) the question is raised whether viinning and spending more. exemption from attack and hygienic conditions. — ASSOCIATION. while no degree of heterogeneity in the universe can influence the association between A and B if all other attributes are independent of either A or B or both. In (1) it is argued that the positive associations between vaccination and hygienic conditions. the resulting illusory association between A and B will be positive .^0. III. parent and child. The result indicates that. If both associations are of the same sign. 7. 35)— (AB). ^Misleading associations of this kind may easily arise through 4 . as in § 11 of Chap. an illusory or misleading association may arise in any case where there exists in the given universe a third attribute G with which both A and B are associated (positively or negatively). conservative and spending more. = (§9. (4) This proves the theorem . and we have {AB)-{AB). simplify. = ^[(AC)-{ACmBC)-{BC).] . if of opposite sign. WC).= fflQ. —PARTIAL 49 Tlie two data give (3) Adding them together we have (^B) = ~^^N{AC){BO-{A){Crj{B0)-[B){Cr){AG) + U){S){0)j Write.— IV. the positive association between grandparent and grandchild may not be due solely to the positive associations between grandparent and parent. .J'-f>. give rise to an illusory positive association between vaccination and exemjption from attack. (p. The three illustrations of § 2 are all of the first kind. that the death-rate for males (the case mortality) has been 30 per cent. It follows that there the second positively. be an illusory negative association between the first two death and treatment. with the relations of which we are The data show here concerned. have the following results tively. for example. further. on 80 per cent. the mingling of records. 100 males and 100 females. and m^ore females died than males . suffering from some disease. Take the following case. for females 60 per cent. Suppose. therefore the first attribute is associated nega- A new treatment is tried cent.. Suppose there have been 200 patients in a hospital. in fact. which a careful worker would keep distinct. of the males and 40 per and the results published without distinction of sex. will : . trealTnent and vnale sex. that more males were treated than females. respecting the two sexes. of the females.g.— 50 — THEORY OF STATISTICS. If the treatment were completely inefficient we would. with the third. e. are death. The three attributes. ASSOCIATION. — IV.io 25 per cent. and vice versd I in such a case A and (so far as the record goes) will both be associated with the observer's attention C. but only 17/70 = 24 per cent. ^ ^ ^ 1 ^ Male Female Mixed record. of the offspring of parents with the attribute possess the attribute themselves.. children with . 8. per cent.. we can make some conjecture as to their sign from the values of the second-order frequencies. 13 per cent. „ « „. 9. 51 line. _ . that {abg)J^^)^^^Kk (5) .. for instance.. the illusory association would be negative.^ . Again.. I „" " f.„ " J Here 13/30 = 43 per cent... -. —PARTIAL „c. Parents with attribute and „irentswith . C.. though we cannot actually determine the partial associations unless the third-order frequency {ABC) is given. Parents with attribute and irents . of the ofifspring of parents without the attribute. . > J \ .. and consequently an illusory association will be created. and even one observer may fluctuate in the generosity of his marking. In this case the recording of A and the recording of will both be associated with the generosity of the observer in recording their presence. i' Parents without attribute and children without . The student will see that if records for male-female and femalemale lines were mixed. due solely to the association of both with male sex. Suppose. a ^ " 17 ^ ' " Parents without attribute „f. If the observer's attention fluctuates. line. one observer may be more generous than another in deciding when to record the presence of A and also the presence of B. . . Illusory associations may also arise in a different way through the personality of the observer or observers. if the attributes are not well defined. The association between attribute in parent and attribute in offspring is. association between A and as B B B before. > ^0 and chUdren with } ../"'''" children without . however. and that if all four lines were combined there would be no illusory association at all. It is important to notice that. and an illusory will consequently arise. he may be more likely to notice the presence of A when he notices the presence of B. the universe of y's.(C) • (BC) (B) + (Ayl (y) • (Byl.s. concrete case. 1897. if (AB) be equal to the value of the first two terms. If.e. say. A and B must be positively associated in the one partial universe and negatively in the other. below.. and the proportions over 65 years of age. if S^ first be positive). or both.) The following are the death-rates per thousand per annum. (B) two terms + Sj — {AB)_(A£) (B) . obtained by dividing through by. (Figures compiled from Supplement to the Fiftyfifth Annual Report of tlie Registrar-General [C. (AB) fall short of the value given by the first two terms.^. is perhaps clearer than the general formula. and judge of the sign of the partial associations between A and accordingly. or both. Hence if . Then we have by addition B ^^^^JAO)m. Finally. and glass workers (over 15 years of age in each case) during the decade 1891-1900 in England and Wales. of occupied males in general. 8503]. as iu Example iii. A and B must be positively associated either in the universe of C"s. the universe of y's. SO that 8j and Sg are positive or negative according as A and are positively or negatively associated in the universes of C and y respectively.iAym.h+li (B) ^ (B) m ' ^'^ In using this expression we make use solely of proportions or percentages. Example iii. . textile workers. (e) the value of (AB) exceed the value given by the (i. A and B must be negatively associated in the universe of C's. The expression (6) may often be used in the following form. farmers. on the other hand. B A — — .— 52 THBOEY OF STATISTICS. or else independent in both. parent and child. must be less than for occupied males in general (the last is actually the case) . the nature of the fallacy involved in assuming that crude deathrates are measures of healthiness. Textile working. — Eye-colour . It is evident that age-distributions vary so largely from one occupation to another that total death-rates are liable to be very misleading so misleading. ii. over 65). light-eyed parent (4. the above procedure for calculating the comparative death-rates is extremely rough. if it falls short. unhealthy. . Thus we have the following calculated death-rates : Farmers. light-eyed grand- (The figures are those of Example A. on the other hand. appears to be unhealthy (14'6 15-9). that they are not tabulated at all by the Registrar-General . be The death-rate for either young farmers a healthy occupation. as one would expect. 102-3) apply to each of its separate. however. {A) B. but also iu the relative proportions of the two sexes. The simpler procedure brings out. light-eyed child parent. must on the whole.— IV.S) iV=5008 = 3584 3052 (i?) =:3052 (C) = 3178 = 2524 (AC) = 2480 -' (£C) = 2231 . or both. . only death-rates for narrow limits of age Similar fallacies are (5 or 10 year age-classes) are worked out. and glass working still more so (13-0<16"6).) . It is hardly necessary to observe that as age is a variable quantity. C. 53 to apply the principle of equation (7). . .] The < — Example iv. The death-rate of those engaged in any occupation depends not only on the mere proportions over and under 65. . age-groups (under 65. 102-3 x -016 = 13-0. Calculate what would be the death-rate for each occupation on the supposition that the death-rates for occupied males in general (11 '5. occupation must on the whole be healthy . Textile workers Glass workers . §§ 17-19. liable to occur in comparisons of local death-rates. or old farmers. the of the actual death-rate. 11-5 x -868 11-5 x -966 11-5 x -984 -t- -f -h 102-3 x •132 = 23-5. XI. the actual low total death-rates are due merely to low proportions of the aged. —PARTIAL ASSOCIATION. and see whether the total death-rate so calculated exceeds or falls short If it exceeds the actual rate. in grandparent. 102-3 x •034 = 14-6. the high death-rate observed is due solely to the large proportion of the aged. then. calculated rate for farmers largely exceeds the actual rate farming. but on the relative numbers at every single year of age. [See also Chap. better than a more complex one. in fact. owing to variations not only in the relative proportions of the old. n 10. B and C were omitted as unnecessary.e. Actually (AC) = 2480 . and 4 partials with the universe specified by two). there are 6 pairs to be formed from four attributes. D . As suggested by Examples i. be partial association In the absence either in the . possible associations — three For three attributes there are 9 three partials in totals. and we can find 9 associations for each pair (1 total.— 54 THEOKY OF STATISTICS. but this could not be proved without a knowledge of the class-frequency {ABC). however. then. If there were no partial association we would have probably {AB){BG) {APXPC) 1060 x 947 1956 2524x22 31 3052 "^ = 1845-0 + 513-2 = 2358-2. and for six attributes 1215 associations. the three total associations and the partial association between A and C were worked out..B-universe. for and three partials in negative universes. the total and partial associations between A and D were alone investigated . a partial association with the grandparent whether the line of descent passes through " light-eyed " or " not-light-eyed " parents. In Example i. for instance. or both. The total possible number of associations to be derived from attributes grows so rapidly with the value of n that the evaluation of them all for any case in which n is greater than four becomes almost unmanageable. Practical considerations of this kind will always lessen the amount of necessary labour. above. is Given only the above data. For five attributes the student will find that there are no less than 270. the nature of the problem indicates those that are required. the y8-universe. there must. and ii. In Example ii. the associations between A and B. B and were not essential for answering the question that was asked. it is not necessary in any actual case to investigate all the associations that are theoretically possible . investigate whether there a partial association betvireen child and grandparent. For four the number of possible associations rises to 54. of any reason to the contrary. but the partial associations between A and B. positive universes. that there is is a partial association in both .. attributes. it would be natural to suppose there i. again. 4 partials with the universe specified by one attribute. m of therefore the number algebraically independent is associations derived from n attributes values of n this gives only 2''-ra-|-l. of which there are 2" in the case of n For given values of the m-h 1 frequencies N. that theoretical considerawould enable us to lessen it still further. {A). —PARTIAL ASSOCIATION. can be For successive that n .. of order lower than the second. (G)... attributes. determinate values of all the possible class-frequencies 1 . It tions therefore correspond to associations. (B). assigned values of the positive class-frequencies of the second and higher orders must 11. all class-frequencies can be expressed in terms of those of the positive classes. .— IV. at first sight. 55 might appear. -1- of But the number of these positive the second and higher orders is only 2" . As we saw in Chapter I. for instance. that (8) is not a criterion for the . in general. regard must be had to practical considerations rather than to theoretical relations. B quantities in the brackets on the right represent. sidering the choice and number of associations to be actually tabulated. Therefore A B A B In the ne. Suppose. It must be noted.xt place it is evident from the above that relations of the general form (to write the equation symmetrically) {ABC)JA) {B) (C) • ^ must hold iV" iT TV • • (^. Again. (2) {d).e. This relation is the general form of the equation of independence. B and C in the universe of A's. the four known associations. in the first place. is not of such a simple kind that the term on Hence in conthe left can be. it may be shown that and C are independent in the universe of ^'s. Clearly. (p. 26). the bracket on the left the unknown association. ^j^. {ABy)^{AB)-{AB0J^S.'m0) __ {A){B){y) _ {Ay){By) and are independent in the universe of y's. 14. {AC). The four were independent in the universe of C's. that a state of complete independence. say. and C in the universe of a's. differences for the frequencies (AS). and A and C are independent in the universe of B's. i. the relation particular case in which all the 2" . as we may term it. III. that all other possible associations must be zero. exists.w + 1 given associais worth some special investigation. Chap.— 56 — THEORY OF STATISTICS. and the corresponding and {BG).' for every class-frequency. that we are given 13.e. It follows. however. The tions are zero Then it follows at once that we have ^^^ also {ABC)i. Similarly. mentally evaluated. ) . 257. 2" - classes It is only because n+ l =1 when 71 =2 that the relation N may be all " N ' N' treated as a criterion for the independent/e of A and B. and (a/3). (Q).. . and the fallacies due to (1) Yule. with numei-ous illustrations : a notation suggested for the partial co'efSoients. {A). and C iV in the sense that the equation imjj) iV is (J) iV a criterion for the complete independence of A and B. 1903." Phil. (A). if N.. —PARTIAL of ASSOCIATION.. especially §§ 4 and 5. 57 complete independence A. cxciv.. (G) are only four the equation (8) must therefore be shown to hold good ior fov/r frequencies of the third order before the conclusion can be drawn that it holds good for the remainder. p. but it does not follow that if the relation (9) hold good they are all independent. {Of. vol. G. Series A. we can draw no conchision without further information . and (C) be given. "On mixing of records. we know that similar relations must hold for (AfS). vol. (aB). "Notes on the Theory of Association of Attributes in Statistics. U. If REFERENCES.. (B). be given. U... (A). (A). Soc. i.. and the last relation quoted holds good. that a state of complete independence subsists. and (B). There are eight algebraically independent class-frequencies in the case of three attributes. .e. ii. Soy. and the equation (8) hold good. The direct verification of this result is left for the student. G. (Deals fully with the theory of partial as well as of total association. however. Trans.) IV. the relation (9) holds good . 1900. on the theory of complete independence. while If. the n (m>2) attributes are completely independent. B. . the Association of Attributes in Statistics. (B). (2) YuLB." Biometrika. Quite generally. the data are insufficient.) (A) ' (B) - (C)_ J\r N N N ^ ' before must be shown to hold good for 2" . the relation : (ABC . If we are given i\^. (B). If i\^. p.71+ 1 of the nth order it may be assumed to hold good for the remainder. 121. p. nerve signs. 45. D.58 THEORY OF STATISTICS. A. 1. and discuss them similarly. EXERCISES. but not necessarily using exactly the same comparisons. iV B. mental dulness . development defects. to see whether the conclusion that " the connecting link between defects of body and mental dulness is the coincident " defect of brain which may be known by observation of abnormal nerve signs seems to hold good. Take the following figures for girls corresponding to those for boys in Example L. N . —PAETIAL ASSOCIATION. B in wife.— IV. in son 59 A light-eye colour in husband. . The table of principle of a manifold classification double-entry or contingency table and its treatment by fundamental methods 5-8. I. i. and II. Homogeneity of the classifications dealt with distributions in this and the preceding chapters : heterogeneous classifications. so on. is entered in the compartment common to the mth column and the nth row. : 60 . The general theory of such a manifold as distinct from a twofold or dichotomous classification. each of these under u heads. The number of the objects or individuals possessing any combination of the two characters. . s heads. say A^ and B„. A^ A^ As. . and finally the grand total at the right-hand bottom corner gives the whole number of observations. . B^ C„. The general 2-4. . . below will serve as illustrations of such tables of double-entry or contingency tables. . the frequency of the class A^B„. Isotropic and anisotropic 14-15. The coefficient of contingency 9-10. each of the classes so obtained then subdivided under t heads. and Bt. the st compartments thus giving all The totals at the ends of rows the second-order frequencies. in the case of n attributes JV. — — — — — 1. and the feet of columns give the first-order frequencies. say. t. . 1. Tables I. MANIFOLD CLASSIFICATION. Classification by dichotomy is. u 2. and t rows headed i?j to B. Cj ultimate classes altogether..CHAPTER V. thus giving rise to s. . . .e. . § 5. only. . Cj. the numbers of . B^. Analysis of a contingency table by tetrads 11-13. If the classification of the A's be sfold and of the B's i-fold. would be extremely complex or characters ABC in the present chapter the discussion will be confined to the case of two characters. as they have been termed by Professor Pearson (ref.e. a special case of a more general form in which the individuals or objects observed are first divided under. A and B. 1).^^'s and B^'s. as was briefly pointed out in Chap. a simpler form of classification than usually occurs It may be regarded as in the tabulation of practical statistics. . . the frequencies of the st classes of the second order may be most simply given by forming a table with s columns headed Ay to Ag. . i. 000 were in London. of which 571.e.000 houses. 616. uninhabited but completed houses. — MANIFOLD CLASSIFICATION. in round numbers. (3) rural districts. Thus from the first row we see that there were in London.) . In Table I.— V.000 were inhabited.000 inhabited houses in England and Wales. and the houses in each into (1) inhabited houses." i. and 5000 in course of erection from the first column. : 61 3. Summary Houses in England and Wales. there were 6.260. 4. Table X.625. in course of erection. (3) houses that are " building.) (OOO's omitted. (2) : of these divisions are again classified Table I. 40. and 1.064. (Census of 1901. the division is 3 x 3-fold the houses in England and Wales are divided into those which are in (1) London.000 in rural districts.000 uninhabited. (2) other urban districts.000 in other urban districts. of which 571. Taking the first row. Rows 1 and 2 may be added together as but column 3 may be omitted altogether. . . as association.— — — 62 THEORY OF" STATISTICS. It then becomes possible to trace the association between any one or moro of the A's and any one or more of the B'a. as another illustration. of the £'s. a larger might be expected. ) The association is therefore negative. Similarly. pooling London district. and the other urban districts together and similarly adding the first two columns. any such table may be treated on fold (2 4. ) Proportion of all houses which j are in course of erection in > 12/1761 rural districts . from the first column. J. the proportion of houses uninhabited being greater in rural than in urban districts. it tells us that read similarly to the last. III. . a distinct positive proportion of houses being in course of erection in urban than in rural districts. the principles of the preceding chapters by reducing it in different ways to 2 X 2-fold form. The tables are a generalised form of the fourX 2-fold) tables in § 13. =7 . and 47 red hair. Proportion of all houses which are uninhabited in urban districts . 946 grey or green eyes. If. for example. of whom 1768 had blue eyes. 189 black hair. . or Taking Table I. we find — Proportion of all houses which j are in course of erection in I 50/5010= 10 per thousand.. of whom 1768 had fair hair. ) There is therefore. For the purpose of discussing the nature of the relation between the A's and the £'s. 807 brown hair.e.. . . there were 2811 nieri with blue eyes noted. as the houses which are only in course of We then have erection do not enter into the question. ) 325/4960 = 66 per thousand. and 115 brown eyes. . trace the association of both. so as to make no distinction between inhabited and uninhabited houses as long as they are completed. either in the universe at large or in universes limited by the omission of one or more of the A's. ) Proportion of all houses which | are uninhabited in rural > 124/1749 districts . there were 2829 men with fair hair. Chap. . the procedure will be rather different. = 71 „ . . before. urban districts . it be desired to trace the association between the " uninhabitedness " of houses and the urban character of the district. between the erection of houses and the urban character of a Adding together the first two rows i. which affords the answer to this question in the case of a dichotomous classification." is based.g. — — . : — MANIFOLD 63 The eye. and 4 placed. i. . 2. and columns 2. and the mode of calculation 5. . ^-^ " association is therefore well-marked. and columns Here we must add together rows 1 and 2 as before.. e.— V..e. . but the between the two percentages is rather less than in the treatment adopted in the preceding section rests and. and if so. rows 1 and 2 may be pooled together as representing the least. The figures are . it gives the most detailed information possible with regard to the relations of the two attributes. 3. i. brown eyes and The may black hair. The mode of on first principles. or the reverse ? The subject of coefficients of association. the column for red being really mis1. and the need for any summary coefficient is not so often nor so keenly felt. For comparison we trace the corresponding association between the most marked degree of pigmentation in eyes and hair. twenty -five or more.'''°''!'"'^"^ ^'^^ } ^'bkck Taif light-eyed with 288/857 = 34 per cent. — CLASSIFICATION. may be treated in a precisely aimilar fashion. . hLir^ j 935/5943 ^ ^g _ association is again positive and well-marked. are. is this dependence very close. ^'wack The difference last case. the " coefficient of contingency. was only dealt with briefly and incidentally. If. for it is still the subject of some controversy further. say. if fully carried out.. as red represents a comparatively slight degree of pigmentation. / nr l^»/8»^ = /opry. . quite simple and fundamental. — J 2714/5943 = 46 per ) I nnt a ic^nn's »a o cent. The ideas On which Professor Pearson's general measure of dependence. We then have Proportion of light-eyed with fiirhair Proportion of brown-eyed with fair hair . we desire to trace the association between a lack of pigmentation in eyes and in hair. At the same time a distinct need is felt in practical work for some more summary method a method which will enable a single and definite answer to be given to such a question as Are the A's on the whole distinctly dependent on the -5's.and hair-colour data of Table II. where there are only four classes of the second order to be considered the matter is not nearly so complex as where the number is. pigmentation of the eyes. moreover. and 4 may be pooled together as representing hair with a more or less marked degree of pigmentation. Then. (1) however. for the sum of all the values of 8. . . but the second the better. and of its relation to the theory of variables. student should refer to the original memoir (ref. squaring is very usefully and very frequently employed for the purpose of eliminSuppose. therefore. The advanced is therefore given in full in the following section. x^ . 1) for a completer treatment of the theory of the coefficient. let the frequency of AJs.4„5„) and and n. It will not do. (. that every 8 is calculated. as it leads to a coefficient easily treated by algebraical methods. Generalising slightly the notation of the preceding chapters. 6. or. because every 8 is zero. then. . say. be denoted by (^„). must be zero in any case. wo form a coefficient C given by the relation (4) . It is necessary. or (2) by squaring the differences and then summing the squares. simply to add them together. we must have for all values of and n m — {A^B„)J^^^ = {A^B„). some of which are negative and others positive. The first process is the shorter. and also the ratio of its square to the corresponding value of (AB)^. and the frequency of objects or individuals possessing both characters by {A^B^. the sum of both the (ABy& and the (AB)q's being equal to the whole number of observations If. If. (2) A coefficient such as we are seeking may evidently be_ based in some way on these values of 8. Let (A„5„)„ will not be identical for all values of the difference be given by m Kn={A. in symbols. and that the sum of all such ratios is. the frequency of ^„'s by (-S„). ^^ is necessarily positive. . (^m-"ii)l ii) • (^) • • Being the sum of a series of squares. to get rid of the signs. if the A's and jB's be completely independent in the universe at large. A and B are not completely independent. using 2 to denote " the sum of all quantities like " : : : % and 82. however. if A and B bo independent it is zero. then.nBn)~{A^B„\ . which the first process does not as the student will see later. If. . and this may be done in two simple ways (1) by neglecting them and forming the arithmetical instead of the algebraical sum of the differences 8. ating algebraical signs.— 64 THEORY OF STATISTICS. —MANIFOLD CLASSIFICATION. which he denotes by ^. in the simple form (4). 1) terms S a sub-contingency. the mean contingency. identical. Cis Professor Pearson's mean square contingency this coefficient is zero B coefficient. no sign should be attached to the root. This may be briefly illustrated as follows. so that (A^BJ) = (j1. and each contributes (ref. gency .^ and B^ is perfect.„) for all values of m .V. for the coefiBcient simply shows whether the two characters are or are not independent. has one disadvantage. for any finite number of classes the limiting value of C is the smaller the smaller the number of classes. and approaches more and more nearly towards unity as )^ increases. that the association between A. only unity if the number of classes be infinitely great .! The coefficient. and nothing more. the remaining frequencies of the second order being zero . the coefficient in the latter form tends to be the least. slight pigmentation of eyes and of hair appear to go together. (2) 3 X 3-fold form. in fact. In general. It is clearly desirable for practical purposes that two coefficients calculated from the same data classified in two different ways should be. that coefficients calculated on different systems of classification are not comparable with each other. f-A'^i)-^ and therefore. further. say. Replacing S„>„ in equation (3) by its value in terms of (^m-Sn) and {A^B„)^ we have— 7. but in some cases a conventional sign may be used.he diagonal compartments of the table. viz. If slight pigmentation of eyes had been associated with marked pigmentation of hair.„) = (5„) for all values of m. and suppose. Thus in Table II. the mean square contingency . x° the square continthe ratio v^/iV^. on which a dififerent coefficient can and the sum of all be based. the contingency might have been regarded as negative. denoting the expression in brackets • <^) by S. (1) 6 x 6-fold. 65 if the characters A and are completely independent. all the frequency is then concentrated in . s (6) Now suppose we have to deal with a * x i-fo\d classification in which (A J) = {B. With the present coefficienst this is not the case: if certain data be classified in. the S's of one sign only. ' Professor Pearson . The greatest possible Value of the coefficient is. at least approximately. and the contingency may be regarded as definitely positire. in such a table. The total value of — 66 iV'' to the sum S. and the 'V^This is the greatest possible value of C for a symmetrical t x «-fold classification. and therefore. for «= 2 C cannot exceed 0-707 . value of C 5 is accordingly tif.— THEORY OF STATISTICS. 67 (1768)2/1169 .V. — MANIFOLD CLASSIFICATION. red. considering the depth of the pigmentation. for every pair Taking the of columns or of rows. the sign of the association in all the elementary tetrads The colours will then run fair. — 68 The than. will be the same. as an illustration.+^B„)}. and this would seem to be the more natural order.. whole of the contingency table can be analysed into a series of elementary groups of four frequencies like the above. etc." and the associations in them all can (r 1) be very quickly determined by simply tabulating the ratios like {A^B„)/(K+A). associations in the six tetrads are accordingly + + + + - The negative sign in the two tetrads on the right is striking. (1) In an isotropic distribution the sign of the association is the same not only for every elementary tetrad of adjacent frequencies. black. brown.„5„H.— THEOKT OP STATISTICS.and eye-colour. the proportions {A„B„)/{{A^B„) + {A„. the more so as other tables for hair. (^. or equal to the ratio (^„5„+i)/(^„+i5„+i). or perhaps better. hut for every to common set of fov/r frequencies in the compartments two rows and two columns. less than. figures of Table II. and working from the rows. the proportions run as follows : For rows 1 and 2. i. as may be most convenient. as shown in It will be termed an isotropic disthe following theorems.1) such " tetrads. e. For rows 2 and 3.)/(^m+i^n+i)> etc. {A„JB. {A„^^.. 11.e. but The signs of the the fourth ratio is greater than the second. 1768/2714 807/2194 189/935 47/100 0-651 0-368 0-202 0-470 946/1061 1387/1825 746/1034 53/69 0-892 0-760 0-721 0-768 In both cases the first three ratios form descending series. A distribution of frequency of such a kind that the association in every elementary tetrad is of the same sign possesses several useful and interesting properties. each one overlapping its neighbours so* that an r«-fold table contains (s . arranged in But the the same way. peculiarity will be removed at once if the fourth column be placed immediately after the first if this be done. exhibit just the same characteristic..). if " red " be placed between " fair " and " brown " instead of at the end of the colourseries. : tribution.jfi^. .g. and cannot be rendered isotropic by any rearrangement of the order of rows and columns. or in whatever way the universe be limited by the omission of rows or columns. the association is evidently zero for every tetrad. The expression " complete independence " is therefore justified. — CLASSIFICATION. though they need not necessarily do so. elementary association is that is to say.4„.n^^S„){A„B^. (3) As the extreme case of the preceding theorem. the association is still positive though the two columns A^ and A^^^ are no longer adjacent. (1) (il„+A)(^„+A+.) and similarly. . • (2)" . Therefore the distribution remains independent in whatever way the table be grouped.„+ A) + (^. so that (A^£n){A^A+i)>{A. . the sign of the association in such fourfold table is the same as in the elementary tetrads of the original table. is not isotropic as it stands. the sign of the unaffected by throwing the {m+ l)th and (m + 2)th columns into one.A) = {A:){B„)IN only a 2 X 2-fold table aU values of and n. and (3) we have. it (2) An isotropic may be condensed distribution remains isotropic in whatever way Thus from (1) by grouping together adjacent rows or colvm/ns.)>(^„+A)(^^+A+i) Then multiplying up and cancelling we have (^„. From the work of the preceding section we may say that Table II. (3) That is to say. as otherwise different reductions for m to fourfold of course form may lead to associations of different sign. but may be regarded as a disarrangement of an isotropic distribution.„+A)]. in — 69 the elementary — MANIFOLD For suppose that the sign of association tetrads is positive.5„)(it„+A+i)>(^™+25„)(^™5„+i) . . The following will serve as an illustration of a table that is not isotropic.— V. It is best to rearrange such a table in isotropic order. by addition of adjacent rows and columns.+ A+i) + (-4™+A+0] > (^™-B„+i)[(^. we then have the theorem If an isotropic distribution be reduced to a fourfold distribution in a/ny way whatever. The case of complete independence is a special case of isotropy. adding* (^„5„)[(. we may suppose both rows and columns grouped and regrouped until is left . 12. For if {A. Phil. A. IV. vol. 2. Brown. Blue. the Frequencies of Different Combinations of Eye-colov/rs in Father and Son. 1. 3. Blue-green. cxcv. from Karl Pearson. p.) (1900). grey. (Data of Sir F. 138 . . 4. Dark grey. Father's Ete-oolour. hazel.— 70 THEORY OF Table Showing STATISTICS.. Trans. Galton. classifioation condensed. (3) Medical. or a similar summary method. that they the principle of division so to speak. in such a case. (2) Legal. each main heading is. genera. This property is of special importance in the theory of variables. in order to render possible those comparisons on which the discussions of associations and contingencies depend.'s into B^s.4. arbitrary and variable. Music. biological classifications into orders. the peculiarity might not be remarked. —MANIFOLD CLASSIFICATION. however." with the subheadings (1) Army. 13.)JAn)^B. Thiis A'a and a's are both subdivided into B's and ^'s." with the fresh sub-heads (1) Clerical. Many classifications are. The number of sub-heads under (8) Exhibitions.g. the first " order " in the list of occupations is "General or Local Government of the Country. and of occupations in the census.i^'s. the third order is " Professional Occupations and their Subordinate Services. the classifications of the causes of death in vital statistics. say.. and the distribution in every column similar to that in the column of totals . Were such a table treated by the method of the contingency coefficient.)J^{B. (7) Art. in concluding this part of the subject. however complex are. essentially of a heterogeneous character. (5) Literary and Scientific. To take the last case as an illustration. Games. and so on. etc. that in the case of complete independence the distribution of frequency in every row is similar to the distribution in the row of totals.)J^{B)„ and so on. (2) Local Government. {A^B. Clearly this is necessary . for in.. The next order is " Defence of the Country. (2) Navy and Marines not (1) National and (2) Local Government again the sub-heads are necessarily distinct. B^'a . B/a. (A„£..). and different for each main heading . The classifications both of this and of the preceding chapters have one important characteristic in common. "homogeneous" being the same for all the sub-classes of any one class. A^s . and species . 14. Similarly. and accordingly its origin demands explanation. e.). If we only know that amongst the A's there is a certain percentage of B'a. (6) Engineers and Surveyors. viz. and amongst the a's a certain percentage of C's. . — — — .V. It may be noted." subdivided under the headings (1) National Government. 71 the great majority of the tables. Drama. there are no data for any conclusion. alone. (4) Teaching. but so long as the classification remains purely heterogeneous.... the column A„ the frequencies are given by the relations — {A^B. iil. Even where a classification has remained verbally the same. . 15. to see whether the numbers of those engaged in the given occupation or succumbing But to the given cause of death have increased or decreased. render a certain number of comparisons impossible. however desirable in themselves. become. and so bring about a virtual change in the classification. . thus. This may be done in various ways according to the nature of the case. Biometric Series i. with a given age-group. Again. incomplete until a homogeneous division is introduced either directly or indirectly. and unfortunately it very seldom is fulfilled. and it then becomes possible to discuss the association of a given cause of death with one or other of the two sexes. or with a given desert. e. or occupation. and if they have remained strictly the same. heterogeneous classification should be regarded only as a partial process. there is no opportunity for any discussion It is causation within the limits of the matter so derived. by repetition. In any case. (1) Pearson. 1904. genera. or age. the classifications of deaths and of occupations are repeated at successive intervals of time . or species may be discussed in connection with the topographical characters of their habitats may of moor and we may observe statistical associar between given genera and situations of a given topographical The causes of death may be classified according to sex. improved methods of diagnosis may transfer many deaths from one heading to another without any change in the incidence of the disease.e. it is also possible to discuss the association of a given occupation or a given cause of death with the earlier or later year of observation i. only when a homogeneous division is in some way introduced that we can begin to speak of associations and contingencies. Thus the relative frequencies of different botanical families. "On the Theory of Contingency and its Relation to Association and Normal Cornlatinn. Ka. All practical schemes of classification are subject to alteration and improvement from time to time. Contingency. and these alterations. it is not necessarily really the same. type..— 72 it ) — THEOKY OF STATISTICS. (The memoir in which the coefficient of contingency is proposed.g. marsh. London. in the case of the causes of death. in such circumstances the greatest care must be taken to see that the necessary condition as to the identity of the classifications at the two periods is fulfilled. REFERENCES." Drapers' Company liestarch Memoirs. or — tions occupation. Dulau & Co. . vii. G. 6. (On the property of isotropy and some applications. showing the resemblance between brothers for athletic capacity and between sisters for temper. Karl. p. of the Anthrop. (As stated in § 7. p. Inst. for the correlation table. a table showing the numbers of candidates who passed or failed at an examination.) contingency) for the two tables below. ix. eines Gegenstaudes.. XV. (Includes an investigation as to the influence of bias 1906. . . p. .. (A kgl.. Peaeson. 1906. "On a Property which holds good for all Groupings of a Normal Distribution of Frequency for Two Variables. Sdchsischen Gesellschaft der Wissenscha/ten general discussion of the problems of association and contingency. Soc. p.. " On a Coefficient of Class Heterogeneity or Divergence. (Deals with a measure of dependence for a common type of table. U.) Contingency Tables of two (6) Rows only." Jour." Proc. vii. (The similar problem for the case in which the variable is replaced by an unmeasured quality. and of personal equation in creating divergences from isotropy in Statistics of Ill-defined Qualities. "Die Bestimmung Abhangigkeit zwischen . — MANIFOLD CLASSIFICATION. xxxvi. G. Find the coefficient of contingency (coefficient of mean square vol. in different Merkmalen by treating the observed frequencies of some quality Aj. when one Variable is given by Alternative and the other by Multiple Categories. U. "On a New Method of Determining Correlation. Inst. : Isotropy. Series A. Ixxvii. 198. der 73 den (2) LlPPS. districts of a country." Jour. with applications to the Study of Contingency Tables for the Inheritance of Unmeasured Qualities. (An application of the contingency coefficient to the measurement of heterogeneity. xxxiii. Show that neither table is even remotely isotropic. and Biometrika. ) for B. Klasse der Leipzig. yol. Soy. vol. and assumes a normal distribution of frequency (chap. e. 1909. 325. Pearson.. Yule. . p. Karl. contingency tables. Aj . 96.. 1910. e. vol.. Karl." (3) Biometrika.) EXERCISES.-phys.. vol. Pearson. 1906. while avoiding lengthy (1) arithmetic. the coefficient of contingency should not as a rule be used for tables smaller than 5 x 5-fold these small tables are given to illustrate the method. iii.g.) for variables. " On the Inheritance of the Mental and Moral Characters in Man. 324. vol. 1905." Biometrika. F. 248. vol. (Data from Karl Pearson.8 of which only the Percentage Intensity is B exceeds (or falls short of) a given recorded (7) Grade ot A. An in the different districts as rows of a contingency table and working out the coefficient the same principle is also applicable to the comparison of a single district with the rest of the country. {i) (5) Yule." BericTite der math. The table of such a type stands between the contingency tables for unmeasured characters and the correlation table Pearson's method is based on that adopted (chap. for each year of age.) ) ) ) ) : V. "On a a Measured Character of Cases wherein for each New Method of Determining Correlation between A and a Character . v." Biometrika. "On the Influence of Bias and of Personal Equation in Anthrop.g.. 74 THEORY OF STATISTICS. Athletic Capacity. . First Brother. Method of forming the table 5. or petals) on animals or plants. of glands. or more shortly a variable.g.PART IL—THE THEORY OF VARIABLES. I. CHAPTER VI. 1. The moderately asymmetrical distribution 15. The U-shaped distribution. adapted to the treatment of quantitative measurements. Magnitude of class-interval 6. Graphical representation of the frequency-distribution— 12. The dichotomous olagsi: 76 . on which arguments as to causation depend. rainfall records. but not as a rule available (with some important exceptions. 2. Treatment of intermediate observations— 9. As common examples of such variables that are subject to statistical treatment may be cited birth. the mind cannot properly grasp the significance of the record the observations must be ranked or classified in some way before the characteristics of the series can be comprehended. this section of the work may be termed the theory of variables.— 10. The methods described in Chaps. prices. Iiitroduotoiy 2. Since numerical measurement is applied only in the case of a quantity that can present more than one numerical value. as suggested by Chap. The extremely asymmetrical or J-shaped distribution 16. Process of classification— 8. Illustrations 4. Tabulation. I. — — — — : — — — — — — — 1. The symmetrical distribution 14. can be made with other series. THE FEEaUBNCT-DISTEIBUTION. Position of intervals 7. are applicable to all observations. that is. barometer readings. Necessity for classification of obserTationa the frequency distribution 3. Tables with unequal intervals 11. wages.or death-rates. § 2) for the discussion of purely qualitative observations. definitely. whether qualitative or quantitative we have now to proceed to the consideration of specialised processes. a varying quantity. and those comparisons. and measurements or enumerations {e. Ideal frequency-distributions 13. If some hundreds or thousands of values of a variable have been noted merely in the arbitrary order in which they happened to occur.-V. spines. (c/. and so on.. a large part of the information given manifold classification. the values of the variable chosen to define the successive classes should be equidistant.). [Table I. returns of birth. for the decade 1881-90. v. of the scale . is too crude if the values are according as they exceed or fall short of some fixed value. expressed as proportions per thousand of the population per annum. so that the niimbers of observations in the different classes (the class-frequencies) may be Thus for measurements of stature the interval comparable.-IV. over 13 "5 but under 14:'5. as it may be termed) might be 1 inch. for the class limits can be conveniently and precisely defined by assigned values of the variable. by intervals of five shillings or ten shillings. 3. : considered in Chaps. or 2 centimetres. Chap. are distributed over the successive equal intervals of the scale is spoken of as the frequency-distribution of the variable. the unit is naturally taken as the class-interval unless the range of The manner in which the observations variation is very great. or each successive 2 centimetres.e. since the classes may be made as numerous as we please. of the 632 registration districts of England and Wales. if desired to obtain a more condensed table. A few illustrations will make clearer the nature of such frequency-distributions. have been classified to the nearest unit i. as for example in enumerations of numbers of children in families or of petals on flowers. and numerical measurements lend themselves with peculiar readiness to a manifold classification. the numbers of individuals being counted whose statures fall within each successive inch. I. The frequency-distribution is shown by the : following table. .— 76 fioation. For convenience. . When the variation is discontinuous. and so on. or. returns of wages might be classified to the nearest shilling. chosen for classifying (the class-interval. the numbers of districts have been counted in which the death-rate was over 12 5 but under 13 '5. avoids the crudity of the dichotomous form.or death-rates might be grouped to the nearest unit per thousand of the population . however by the original record is lost. and the service which they render in summarising a long and complex record In this illustration the mean annual death-rates. (a) Table I. THEORY OF STATISTICS. merely classified as ^'s or a's A . Showing the Numbers of Registration Districts Wales with Different mean Death-rates per Thousand of the Population per Annum for the Ten Years 1881-90. 77 Table England and I. .— VI.) Mean Annual Death-rate. —THE FREQUENCY-DISTBIBUTION. (Material from the Supplement to the 55th Annual Report of the Registrar-General for England and m JTaZcsCC— 7769] 1895. Age at Death. . (1900). Soc. THEORY OF STATISTICS.) Families. vol.— 78 Table II. Beeton. Ixvii. (Cited from Proc. Years. 172. Showing the Numbers of Married Women. Karl Pearson. in certain Qunker Dying at Different Ages. Yule. On tJie Correlation between Duration of Life and Numher of Offspring. U. p. Hoy. and 6. by Miss M. etc. thirty makes a somewhat unwieldy table. The But if the variation be conunit will in general have to serve. number of classes lies less than. Dividing the diflference between these by. The two conditions which guide the choice are these (a) we desire to be able to treat all the values assigned to any one class.g. five and twenty. The position or starting-point of the 6. and its choice is a matter for judgment. (3) This choice having been made.g. A number of classes ten leads in general to very appreciable inaccuracy. Magnitvde 'of Class-Interval. (2) The position or origin of the intervals must then be determined. record should accordingly be made and the highest and lowest values be picked out. as if they were equal to the mid-value : — if the death-rate of every district in of Table I. generally be fulfilled if the interval be so chosen that the whole of the class-interval. so that the mid. 14-5-15-5. As already remarked.. say. i. 5. The actual value should be the nearest integer or simple fraction. but in general it is fixed either so that the limits of intervals are integers. in Table I. THE FKKQUENCY-DISTKIBUTION. 13-5-14-5. we must decide whether to take as intervals 12-13. Some remarks may be made on each of these heads. tables the preceding are formed in the following way (1) The magnitude of the class-interval. e. or 14 rays like To expand slightly the brief description given in §.e. It may. subject to the first condition. (4) The process of being finished. tinuous. A preliminary inspection of the. etc. intervals is. 13-14. and III. as the first class between 15 and 25. we have an approximate value for the interval. the complete scale of intervals is fixed. say. without serious error. a table is drawn up on the general lines of Tables I. and so on . the number of units to each interval.-IIL. of rays range lisual. as a rule. 79 The numbers being the most 4. or 12-5-13-5. 14-15. and the : — observations classification are classified accordingly. is first fixed . — 12.values are integers. e. Position of Intervals. as in Tables I. or. there is very little choice as regards the magnitude of the class^interval. in cases where the variation proceeds by discrete steps of considerable magnitude as compared with the range of variation. 13. — . from 6 to 20. (6) for convenience and brevity we desire to make the interval as large as These conditions will possible. there is no such natural class-interval. were exactly 13-0.VI. and a number over.2. showing the total numbers of observations in each class-interval. five units in the case of Table II.. and II. or at least take place by discrete steps which are small in comparison with the whole range of variation. say. one unit was chosen in the case of Tables I. the death-rate of every district in the second class 14'0.. more or less indifferent. so that no In limit corresponds exactly to any recorded value (c/. Thus." " 30 and under 40. Under such circumstances. for simplicity in classification. or : have been wrongly sorted. the Census of England and Wales. it is accordingly better to enter the values observed on cards. " 25 and under 35. there is no means of tracing the error. in order to avoid sensible error in the assumption that the mid-value is approximately representative of the values in the Thus.80 THEORY OF STATISTICS. the observations may be If the not large. 17'5. 8. observations is at all considerable and accuracy is essential. When there is any probability of a clustering of this kind occurring. and verifying that no cards fifth four. entry in a class is marked by a diagonal across the preceding by leaving a space. the observations exhibit a marked This clustering round certain values. since the clustering is chiei^y round class. § 8 below). In such a case. and transfer the entries of the original record to this sheet by marking a 1 on the line corresponding to any class for each entry It saves time in subsequent totalling if each assigned thereto. e. in which a different view is taken). in the case of ages. one to each observation. it is as well to subject the raw material to a close examination before finally fixing the classification. tens. the difficulty may be readily surmounted by working out the rate to another place . it intervals in a will be sufficient to column down the left-hand side of a sheet number of observations is mark the limits of successive of paper. however.g. The disadvantage in this process is that it offers no facilities for checking if a repetition of the classification leads to a different If the number of result. In some cases difficulties may arise in classifying." and so on (c/. is generally the case. tens. and the whole work checked by running through the pack corresponding to each class. some exceptional cases. the values round which there is a marked tendency to cluster should preferably be made mid-values of intervals." " 35 and under 45. owing to the tendency to state a round number where the true age is unknown. for instance.. in compiling Table I." etc. the classification of the English census. 1911.. or 18"5. some districte will have been noted with death-rates entered in the Registrar-General's returns as 16'5. and also ref. is a better grouping than " 20 and under 30. 7.. or tens and fives. vii. vol. be chosen. fixed. 5. where the original figures for numbers of deaths and population are available. any one of which might at first sight have been apparently assigned indifferently to either of two adjacent classes. however. in age returns. owing to the occurrence of observed values corresponding to class-limits. These are then dealt out into packs according to their classes. — The scale of intervals having been classified. Classification. moreover. practical purposes . The procedure is rough. 9. Death-rates that work out to half-units exactly do not occur in this example. In the case of Table II. —THE FEBQUBNCT-DISTRIBUTION. If the difficulty is not evaded in any of these ways. p. but a good deal more laborious. to the class 15'5-16'5.. and so there is no real difficulty. there is little to be said. to assign the intermediate observations to the adjacent classes in proportion to the numbers of other observations falling into the two classes. if to the nearest eighth of an inch. but probably good enough for p. it woul<J be slightly better. etc. and. if the actual day of birth and death be cited.. it is usual to assign one-half of an intermediate observation to each : : the result that half-units occur in the Tables VII. to state the manner in which the difficulty of intermediate values has been met or evaded. these being carried to a further place of decimals. p. 90. 96). if 16'4:98. the classintervals may be taken as 150'5-151'5. with (c/. and II.. The class-limits are perhaps best given as in Tables I. but may be more briefly indicated by the midvalues of the class-intervals. it be sorted to the class 16'5-17"5 . X. and XL. 15r5-152'5..— VI. than the values in the original record. 96. if necessary. Tabulation. because there is an odd number of days in the year. might have been given in the form adjacent class. again. Thus or a smaller fraction. the intervals may be 59^|— 60i|. As regards the actual drafting of the final table.. Thus Table I. except that care should be taken to express the class-limits clearly. class-frequencies — Death-rate per 1000 . half-years still cannot occur in the age at death. 81 of decimals will if the rate stated to be 16 '50 proves to be 16-502. and so on. the age at death is only calculable to the nearest unit . if statures are measured to the nearest centimetre. there is no difficulty if the year of birth and death alone are given. The difficulty may always be avoided if it be borne in mind in iixing the limits to class-intervals. 60||-61f|. .82 THEOKT OF STATISTICS. Stature in Inches. 1. from Qrmt Brilaln nxsessal to hihahilM House Duly in 1885-6. lUiy. (Cited fr Jour. p. 610. 83 Tabie IV..) .— VI. FKEQUKNCY-DISTRIBUTION. 1887. vol. atat. flhmm'vg the Avniial Value and Number of Dwelling-houses IE IV. —THE Soc. Maodonell. i. . R. Table T.) — Head-breadth in Inches. 220. giving the distribution of head-breadths for 1000 men. Biometrika.. p. (Cited from W. will serve as an example. Measurements taken to tlie nearest tenth of an inch. at Shoviing the Frequency-distribution of ffead-breadtJis for Students Cambridge.84 THEOEY OF STATISTICS. 1902. —THE FEEQUENCY-DISTEIBUTION. 85 .VI. . histogram. i. the frequencypolygon is too small polygon tends to become very misleading at any part of the In the moirtality disrange. the frequency rises so sharply . successive ordinates y-^. the area shown by the Fig. if y2 = i(?'i + 2's)> the areas of the two little triangles shaded in the figure being equal If y^ fall short of this value. 3. it is better to use the histogram. 3) is only correct if the tops of the three Tfc yi yi Fig. for this reason. yg lie on a line. if y^ exceed it.e. The is not the same. y^. tribution of Table I.. as suggested by the shown by the frequency-polygon over any interval with an area ordinate y^ (fig.86 interval THEORY OF STATISTICS. the area shown by the and if. 4. for instance. polygon is too great. number of observations is considerable say a thousand at least the run of the class-frequencies is generally suflSciently smooth to give a good notion of the form of the ideal distribution. 12. It occurs more frequently in the case of biometric. the use of the histogram is almost imperative. the extremely asymmetrical or J-shaped distribution.). XV.VI. For elementary purposes it is sufficient to consider these fundamental simple types as four in number. to the maximum that a histogram is. iv. The symmetrical distribution. 4 will be proportional to the area of the shaded strip in the figure. Ex. and very exceptional indeed in economic statistics. on the whole. measurements. Chap. Fi^. and in such a distribution as that of Table IV. the polygon and the histogram will approach more and more closely Such an ideal limit to the frequency-polygon to a smooth curve. If the class-interval be made smaller and smaller. 5 illustrates the ideal form of the distribution. and so on. but amongst these we notice a small number of comparatively simple types. and at the same time the number of observations be proportionately increased. in any actual case. and § 18. or histogram is termed a frequency-curve. § 15. which. —THE FKEQUENCY-DISTRIBUTION. The forms presented by smoothly running sets of numerous observations present an almost endless variety. the number of observed values greater than sc^ will similarly be given by the area of the curve to the right of the When. have very little significance (c/. In this ideal frequencycurve the area between any two ordinates whatever is strictly proportional to the number of observations falling between the Thus the number of corresponding values of the variable. observations falling between the values x^ and x^ of the variable in fig. this form of distribution is comparatively rare under any circumstances. the symmetrical distribution. the class-frequencies decreasing to zero symmetrically on either side of a central maximum. Table VI. more especially anthropometric. so that the class-frequencies may remain finite. from which the following illustrations are drawn. from data published by a British Association Committee in 1883. Being a special case of the more general type described under the second heading. from which many at least of the more complex distributions may be conceived as compounded. and the U-shaped — — distribution. with small numbers the frequencies may present all kinds of irregularities. the figures being given separately . most probably. 13. 87 . the better representation of the distribution of frequency. shows the frequency-distribution of statures for adult males in the British Isles. and is important in much theoretical work. the ordinate through x^. the moderately asymmetrical distribution. 6. and Wales. Scotland. Ireland. § 9). {cf.) J^th of an Inch. Table VI.— 88 THEOET OF STATISTICS. Showing the Frequency-distrihwlions of Statures for Adult Males born in England. Inches. Height without shoes. presumably 56i-|-57i|. Final Report of {Report. and so on Glass. As Measurements are stated to have been taken to the nearest p. the 57H-58if. the Anthropometric Committee to the British Association.Intervals are h^re Sec Fig. . 1883. 256. —THE FREQUENCY-DISTRIBUTION. . 89 Fig. 5.VI. —An ideal symmetrical Frequency-distribution. 415) . p. . gives two similar distributions from more recent relating respectively to sons over 18 years of age. and to students at Cambridge. The moderately asymmetrical distribution. 1903. See Figs. Table VII. Shmoing the Frequency-distribution of Statures for (1) 1078 JEnglish Sons (Karl Pearson. K. 220). Biomelrika. 1902. 7 and 8. they may all be held to be approximately symmetrical. Macdonell. The distribution of death-rates in the registration districts of England investigations. roughly speaking. with parents living. Both these distributions are more irregular than that of fig. but. in Great Britain. The polygons are shown in figs. i. 14. p. 9 (a) or (6). Biometrika.— 90 THEORY OF STATISTICS. illustrations occurring in statistics from almost every source. 7 and 8. 6. as in fig. ii. the class-frequencies decreasing with markedly greater rapidity on one side of the maximum than on the other.. Table VII. (2) /or 1000 Male Students at Cambridge (W. This is the most common of all smooth forms of frequency-distribution. (Table VII.) . 8. 7.00 ° 80 60 40 \ / 7 60 \ \ 6 8 70 P 20 58 Z 4 Z 4 6 8 60 Stature in tnches Fig. 91 ZOO K'so 160 "140 I ^.VI. Fio. 80 h 60 40 t HO \ 60 4 6 a 70 2 4 8 80 Stature in ouches. — Frequency -distribution of Stature for 1000 Cambridge Students. —THE FEEQUENCY-DISTKIBUTION.— Frequency-distribution of Stature for 1078 "English Sons.) xoa \iao to 160 140 / [120 too ." (Table VII. 92 and Wales. — Ideal distributions of the moderately asymmetrical form. and fig. districts (Table VIII. p. is a somewhat rough example . THEORY OF STATISTICS. 10) is type (o) of fig 9. 77. in Table I. The frequency smoother and more like the attains a maximum for . The distribution of rates of pauperism in the same W Jc^) T'lG 9. given of the type.. Jour. Stat. 347. . Tablk VIII. 10. of pauperism. —THE FKEQUBNCY-DISTKIBUTION.) See Fig.— VI. Shoioing the Nvmiber of Eegistration Districts in England and Wales with Different Percentages of the Population in receipt of Poor-law Belief on the Ist January 1891. lix.v. of the population in receipt of and then tails off slowly to unions with 6. Roy. 93 districts with relief. for distributions for earlier years. q. vol. Percentage of the Population in receipt of Relief.. Soc. 2f to 3J per cent.. p. (Yule. 1896. and 8 per cent. 7. .94 THEOBY OF STATISTICS. imo I \ViOO n. {Loc. 99-5-109-5. 95 Table IX. —THE FEKQUBNCY-DISTEIBUTION.. etc. Table VI. Ireland. cit. —Showing the Freguency-distritution of Weights for Adult Males bom in Englcmd. . consequently the true ClassIntervals are 89-5-99'5. (§ 9). Scotland.) Weights were taken to the nearest pound. and Wales.VI. the Freqiiency-distribuiion of Fecundity.e. Showing of the Number .. i. p.) See Fig. 12. 803. for Brood-mares {Race-horses) Covered Eight Times at Least. Table X. (1899). Tram. th-e Batic of Yearling Foals produced to the Number of Coverings. (Pearson. cxoii. and Moore. vol. Fhil.— 96 THEORY OF STATISTICS. A. Lee. 97 600 500 too •300- 'ZOO 100 .n. 700 —THE FREQUENCY-DISTRIBUTION. " very rapidly to the maximum. but occur occasionally. — . The actual figures for this case are given in Table XII. the rising falling so slowly that there is still number Table XII.. The distribution of deaths from diphtheria. 3.) See Kg. as in fig. {Supplement to 65iA Anmial Report of the Registrar-General. according to age. suggesting an ideal curve that meets the base (at one end) at a finite angl^ even a right angle. and thence an appreciable frequency for persons over 60 or 70 years of age. and illustrated by fig. 14 . are less frequent. affords one such example of a more asymmetrical kind. in such a way as to suggest that the ideal curve is tangential to the base. 14. 9 (b). 1891-1900. Cases of greater asymmetry. and it will be seen that the frequency of deaths reaches a maximum for children aged " 3 and under 4. Showing the Numbers of Deaths from Diphtheria at Different Ages in England and Wales during the Ten Tears 1891-1900.98 THEORY OF STATISTICS. p. the distribution. and so thus suggesting a maximum number of deaths at the beginning of life. by returns of the size of agricultural holdings. In economic statistics this form of distribution is particularly characteristic of the distribution of wealth in the population at large. 23.VI. but if the maximum is not absolutely at the lower end of the fast . It is only the analysis of the deaths in the earlier years of life by one-year intervals which shows that the frequency reaches a true maximum in the fourth year. and therefore the distribution is of the moderately asymmetrical type. 15.092. i.479. The distributions may possibly be a very extreme case of the last type . they would have run 49. e.348. In practical cases no hard and Fig. 99 on. 4. any more than between the moderately asymmetrical and the symmetrical type. a distribution of the present type. ref. as illustrated. —THE FKEQUENOY-DISTRIBUTION.. only. 4).e.g. and so on (c/. line can always be drawn between the moderately and extremely asymmetrical types. by income tax and house valuation returns. —An ideal Distribution of the extreme Asymmetrical Foim. . etc. Hollis. but the position of the greatest frequency indicates a in Table XIII. Annual Value in £100. though of course very unreliable. . are of some interest. however. 573. it is very close indeed thereto. Shoioing the Jfumbers and Annual Values of the Estates of those who had taken part in the Jacobite Rising of 1715. is founded. distribution might therefore be more correctly assigned to the second type. See Fig. See a note in Southey's CommMiplace Book. p. vol. that the greatest frequency does not occur actually at zero. but that there is a true maximum The frequency for estates of about £1 15 in annual value. Nonjurors. and the frequency continuously falling as the value increases. 1745. and others who refused to take the Oaths to his late Majesty King George. Figures of very doubtful absolute value. 16 that with the given classification the distribution appears clearly assignable to the present type. A close analysis of the first class suggests.— ) 100 THEORY OF STATISTICS. 16. the number of estates between zero and £100 annual value being morfe than six times as great as the number between £100 and £200 in annual value. and for this reason the data on which Table Xni. quoted from the Memoirs of T. It will be seen from the table and fig. usually give the necessary analysis of the frequencies at the lower end of the range to enable the exact position of the maximum to be determined . Official returns do not range. (Compiled from Gosin's Names of the Soman Catholics. i. London. 7 in- 8 3 10 13 £lOO (Table XIII. 16. 14 the distribution of numbers of deaths from 16- ti 10- B. of life instead of in the fourth year. 83.. —THE : FRKQUKNOY-DISTKIBUTION.) — BVequency-distribution of the Annual Values of certain Estates in England in 1715 : 2476 Estates. p. 8- 6- *l * 6 4 5 3 AnntLot value Fig. . 101 degree of asymmetry that is high even compared with the asymmetry of fig. diphtheria would more closely resemble the distribution of estatevalues if the maximum occurred in the fourth and fifth weeks The figures of Table IV. showing the annual value and nimiber of dwelling-houses.VI. good illustration of this form of distribution.— . common in official returns. Shovring the Frequencies of Different Nurribers of Petals for Three Series of Ranunculus bulbosus.v. hot. Qes. for details. 1894. Bd. (H. xii. 102 afford a THEORY OF STATISTICS. dtsch. q.) See Fig.. Ber. 17. de Vries. but marred intervals so by the unequal Table XIV. . Cloudiness. of Estimated Intensities of Cloudiness (See ref. 2. 19 illustrate an example based on a considerable number of observations. it This in is a rare but interesting form of distribution. Table XV. — Shovnng the Frequencies during the cU Breslau Ten Tears 1876-85. as stands somewhat marked contrast to the preceding forms.) See Fig. 18. —An ideal Distribution of the XT-shaped Form. and fig. 19. I'"iG.VL~THE FKEQUENCY-DISTRIBUTION. or estimated percentage of the sky covered by cloud. at the ends of the range and a The ideal form of the distribution minimum towards is illustrated by fig. at Breslau Table XV. 18. 103 the centre. . viz. the distribution of degrees of cloudiness. ed. 18^8. for Marriayes produt-ing Five Children or more. 19. and intermediates are more This form of distribution appears to be sometimes exhibited by the percentages of offspring possessing a certain attribute when one at least of the parents also possesses the attribute. A sky completely. or almost com- pletely. Showing the Percentages of Deaf-mutes among Children of Parents one of vihom at least was a Deaf-mute. but the figures are not in a decisive shape. Fay. i 1 1000. overcast at the time of observation is the rare. during the years 1876-85. ! IS 00. the Table XVI. gives the distribution for an analogous case. (Table Francis Galton in Natural Inheritance suggest such a for the distribution of " consumptivity " amongst the offspring of consumptives. o «> 500 4 Fio. viz.) Percentage of Deaf-mutes. Washington. The remarks i2000. Volta Bureau. (Compiled from material in Marriages of the Deaf in America.— 104 THEORY OF STATISTICS. a practically clear sky comes next. E. most common. . A.) S 10 doiccUfie^s —Frequency-distribution of Degrees of Cloudiness at Breslan 1876-85: 3653 observations. 5 6 XV. of Sir form Table XVI. on account of the large collection of frequency-distributions which is given. of the children are deaf-mutes are nearly three times as many as those in which the percentage lies between 60 and 80. livre iii. olxxxvi. i. however. what is the scale of observafig. "Supplement to a Memoir on Skew Variation. and from which some of the Without illustrations in the preceding chapter have been cited. vol. 1 what number of observiitions will the polygon 5"95 and 6'06. (1895).A. 1. Ixii.— 1 VI. Soy. Lausanne. . number of observations will the polygon show between death-rates of 16 'o and 1 7 '6 per thousand. "Skew Variation (4) (5) in Homogeneous Material. what is the scale of observations to the square centimetre ? what 3. 1896-7. 105 distribution of deaf-mutism of amongst the : offspring of parents one at least was a deaf-mute." Census Bulletin IS." Phil. and The elementary student may.. Soc. If the diagram to the inch and tions to the square inch ? If the scales are 100 observations per interval to the centimetre and 2 inches of stature to the centimetre. "Cloudiness: Note on a Novel Case of ITiequenoy. 287. Cours d'iconomie politique . to which reference was made in § 16.. Trans. vol. the first being the fundamental memoir. Eakl. In connection with the remarks in § 6. ? scales of 25 observations per interval to the inch and 2 per cent. Young. square centimetre 2. See especially tome ii. If fig. In general less than one-fifth at the other end of the range the of the children are deaf-mutes cases in which over 80 per cent. : "A EXERCISES.polygon be drawn to represent the data of Table V. 2 vols. what is the scale of observations to the 6 is redrawn to scales of 300 observations per interval 4 inches of stature to the inch. what is the scale of observations to the square inch ? If the scales are ten observations per interval to the centimetre and 1 per cent. Washington. refer to them with advantage. pp. If a frequency-polygon be drawn to represent the data of Table I. he may also note that each of our rough empirical types may be divided into several sub-types. the second and third supplementary. Pbakson. attempting to follow the mathematics. The numbers are." Proc. cxovii. to the inch. 443-459. Series A. Series A. (1897). Karl." The first three memoirs above are mathematical memoiis on the theory of ideal frequency-curves. Trans. vii. Reference should also be made to the Census of England and Wales. chap. 1911. the theoretical division into types being made on different grounds. Bureau of the Census. reference may be made to the following in which a ditferent conclusion is drawn as to the best grouping Discussion of Age Statistics. (1) (2) (3) Pbaeson. Allyn A.. Pabeto. "Ages and Condition as to Marriage." especially the Report by Mr George King on the graduation of ages. "La courbe des revenus. The fourth work is cited on account of the author's discussion of the distribution of wealth in acommunity. however. Roy. If a frequency. pp. (1901). —THE FEEQUENCY-DISTKIBUTION.. p. 1904. on the grouping of ages. REFERENCES. instead of the true number 236 show between head-breadths . Soc. instead of the true number 59 1 4. U. Kakl.. Vilfredo. too small to form whom a very satisfactory illustration. to the centimetre.. Soc. vol.. 1 is redrawn to . Roy. Pbaeson. vol." Phil. 343-414.S. of similar form. Summary comparison of the preceding forms 22-26. however. we have only to compare with each other two distributions of the same or nearly the same type. to this simple case. would discussion of causation.CHAPTER VII. there are two fundamental characteristics in which such distributions may 106 . in general. and to render possible those comparisons with other series which are essential for any Very little experience. : its definition and calculation. The arithmetic mean : its definition. however. Necessity for quantitative definition of the characters of a frequency2. The mode: its definition and relation to mean and median 21. compare the frequency-distributions of stature in two races of man. The commoner forms 6-13. simpler proof average 27. The perties. Measures of position (averages) and of dispersion— 3. distributions drawn from similar When we have to material are. it is desirable to take is the quantitative definition of the characters of the frequency-distribution. AVERAGES. then. As a matter of practice. distribution The dimensions of an average the same as those of the variable 4. Confining our attention. of the death-rates in English registration districts in two successive decades. 1. calculation. for example. that very difficult cases of comparison could arise in which. 2. The median: its definition. The geometric mean : its definition. calculation. so that quantitative comparisons may be made between the corresponding It might seem at first sight characters of two or more series. and the cases in which it is specially applicable — — — — — — — — — harmonic mean 1. Desirable properties for an average to possess 5. it was pointed out that a classification any long series is the first sfep necessary to make the observations comprehensible. of the numbers of petals in two races of the same species of Ranunculus. we seldom have to deal with such a case . show that classification alone is not an adequate method. and simpler properties 19-20. seeing In § 2 of the last chapter of the observations in only enables qualitative or verbal comparisons to be made. and of average simpler properties 14-18. we had to contrast a symmetrical distribution with a " Jthat it The next step that shaped " distribution. it is merely a certain value of the variable. or (2) they "may centre round the same value.VII. if the variable be a But there are percentage. 107 (1) they of the variable differ : may differ markedly in position. G. several different ways of approximately defining the position of a frequency-distribution. the degree of asymmetry of the distribution. 20. measures of the second are termed measures of dispersion. its average is a length .e. but the two properties may be considered independently. Fig. the desirable properties for an average to possess ? dispersion : The present chapter . and is therefore necessarily of the same dimensions as the variable i. that is. in fact. and so on. its average is a percentage.e. A. —AVEKAGES. 20. position. as it is termed. in the values round which they centre. viz. we may also take a third of some interest but of much less importance. Measures of the first character. as in fig. as in fig. i. In whatever way an average is defined. 3. if the variable be a length. and the question therefore arises. deals only with averages j measures of are considered in Chapter VIII. are generally known as averages . In addition to these two principal and fundamental characters. and measures of asymmetry are also briefly discussed at the end of that chapter. B. the distributions may differ in both characters at once. but differ in the range of Of course variation or dispersion. as in fig 20. it may be as well to note. 20. By what criteria are we to judge the relative merits of different forms 1 What are. there are several different forms of average. (b) An average should be based on all the observations made. An average that was merely estimated ^ould depend too largely on the observer as well as the data. if be the arithmetic mean. (f) Finally. however carefully they may be taken. the average of the whole should be readily expressed in terms of the averages of its parts. however. e. the average of the combined series should be readily expressed in terms of the averages of the component series if a variable may be expressed as the sum of two or more others. of course. The arithmetic mean of a series of 6. (c) It it is not really a characteristic of the whole distribution. but one form of average may show much greater differences than another. . values of a variable JTj. the 5. and the mode. in common use. (a) In the first place. but of service in special cases. Xg. The arithmetic mean. . 4. It is desirable that the average should be as little affected as may be possible by what we have termed Jhictuations of sampling. That is to say. the more stable is the better. it almost goes without saying that an average should be rigidly defined. the first named being by far the most widely used in general statistical work. is desirable that the average should possess some simple and obvious properties to render its general nature readily comprehensible an average should not be of too abstract a mathematical character. and not left to the mere estimation of the observer. XVII. + X^+ .108 THEORY OF STATISTICS. X^. (d) It is. A measure for which simple relations of this kind cannot be readily determined is likely to prove of somewhat limited application. . N M M=^{X^ + X.. There are three forms of average arithmetic mean.). . the median. more rarely used. be postponed to a later section of this work (Chap. Other things being equal. -fX„). To these may be added the geometric mean and the harmonic mean. the averages of the different samples will rarely be quite the same. If. that the measure chosen shall lend itself readily to algebraical treatment. desirable that an average should be calculated with reasonable ease Und rapidity. If not. At the same time too great weight must not be attached (e) to mere ease of calculation. Of the two forms. If diiFerent samples be drawn from the same material.g. in number. by far the most important desideratum is this. We will consider these in the order named. The full discussion of this condition must. to the neglect of other factors. two or more series of observations on similar material are given. . the easier calculated is the better of two forms of average. X„. is the quotient of the sum of the values by their number. : : — . a simple relation between the whole : N : N R. for its general nature is readily comprehensible. for it is rigidly defined and based on all the observations made. S 109 to denote or. this direct process becomes a little lengthy. all the values observed are added together and the total divided by the number of observations. it will be noted that the mean is actually determined without even the necessity of determining or noting all the individual values of the variable to get the mean wage we need not know the wages of every hand. Conversely. PjN pounds. it may. : — — . a process which in general gives an approximation that is quite sufficiently exact for practical purposes if the class-interval has been taken moderately As regards simplicity position. if families possess a total of C children. but only the total." ilf=l2(Z) The word mean . —AVERAGES. It may be shortened considerably by forming the frequency-table and treating all the values in each class as if they were identical with the mid-value of the class-interval. to get the mean number of children per family we need not know the number in each family. the arithmetic mean wage. if we are told that the mean wage pounds. without qualification.VII. of calculation. i. when anyone speaks of " the mean " or " the average of a series of observations. to express it more briefly by using the symbol " the sum of all quantities like. But if the number of observations be large. the total number of children in families is N. if the mean number of children per family is M. If the wages-bill for workmen is £P. in fact. as a rule. its parts.e. is £M. it fulfils condition (c). The arithmetic mean expresses. . but only the wages-bill . be assumed that the arithmetic mean is meant.number that each family would possess if the children were shared uniformly. is the amount that each would receive if the whole sum available were divided equally between them conversely.M — N and 7. we know this means that the wages-bill is Similarly. the mean takes a high In the cases just cited. (1) or average alone. . It is evident that the arithmetic mean fulfils the conditions laid down in (a) and (b) of § 4. is very generally used to denote this particular form of average that " is to say. Further. the mean number of children per family is GjN the . . If this total is not given. but we have to deal with a moderate number of observations so few (say 30 or 40) that it is hardly worth while compiling the frequency-distribution the arithmetic mean is calculated directly as suggested by the definition.M. . corresponding class-interval. make use of this term.X) is therefore replaced by the calculation of tif-t)The advantage of this is that the class-frequencies need only Joe multiplied by small integral numbers.. Each frequency /is then multiplied by its | and the products entered sixth class-interval . since . (4) The calculation of %{f. To keep the values of f as small as possible. VI.value of the If /denote the frequency of any class. . . and a little neai-er than the middle of the range to the estimated position of the mean. M= A + !.4 is (3) a constant. and the mid-value of X another. the value of the mean so obtained X may be written^- if=ls(/. is multiplied by the mid-value of the interval.%(/. . The consequent values of f are then written down as in column (3) of the table. observations. (2) 8. or 2(/. is sometimes termed the first moment of the distribution about the arbitrary origin A we shall not. the products added together. VI. The arbitrary origin A is taken at 3 '5 per cent. the middle of the : from the top of the table. the values starting. 9. Chap. of course. however. and the total divided by the number of. the mid.f) for the grouped distribution. i) . But this procedure is still further abbreviated in practice by the following artifices : (1) The class-interval is treated as the unit of measurement throughout the arithmetic . § 5). the ^'s must be a series of integers proceeding from zero at the arbitrary origin A. against the corresponding frequencies. from zero opposite 3 '5 per cent. The process is illustrated by the following example. using the frequency-distribution of Table VIII. Chap. It may be mentioned here that 2(f). . for A being the mid-value of a class-interval. and the class-interval being treated as a unit. A should be chosen near the middle of the range. (2) the diiference between the mean and the mid-value of some arbitrarily chosen class-interval is computed instead of the absolute — value of the mean. If A be the arbitrarily chosen value and X=A + ^ then or.110 THEORY OF STATISTICS.. .Z) . In this process each class-frequency small (e/. from the Figures of TdbU Fill. Calculation of the Arithmetic of the Percentages of the Population in receipt of Belief. viz.267. viz. Chap. VI. (1) — .776 and + 509 respectively. Dividing this by N. 632. —AVERAGES.^) = . p. 93.. Calculation of the Mean Mean : Example i. that is 0'21 per cent. 0-42 intervals. we have the difference of from A in class-intervals. Hence the mean is 3'5 -0'21 =3'29 M per cent. whence 2(/. Ill in another column (4).Til. The positive and negative products are totalled separately. giving totals .. |) by ])f.. The student must notice that. 88. measures having been made to the nearest eighth of an inch. In this case the classis given directly in^val is a unit (1 inch).— 112 interval THEOEY OF STATISTICS. 58-^^. 5S-5. As the process is an important one we give a second illustration from the figures of Table VI. VI. p. the mid-values of the intervals are STj'j-. M-A Caloitlation of the Mean: MxampU ii. etc. Care must also be taken to give the right sign to the quotient. Chap.. Table VI. 10. and accordingly the quotient 267/632 is hak'ed in order to obtain an answer in units. is half a unit. etc. (1) .. Calculation of the Arithmetic Mean Stature of Male Adults in the British Isles from the Figures of Chap.. so the value of by dividing S(/. VI. and not 57'5. of an indefinite interval for the extremity of the distribution renders the exact calculation of the mean impossible 11.VII. the figures of for col. mean must be the The student should note that a classification by uneg^ual the and the use (c/. § 10). but the value ultimately obtained same. Chap. intervals is. VI. We return again below (§ 13) to the question of the . at best. a hindrance to this simple form of calculatfon. —AVKKAGES. (4) will 113 It is evident that such calculation may an absolute check on tho arithmetic of any be effected by taking a different arbitrary : origin for the deviations all be changed. of the ideal moderately asymmetrical distribution. a vertical through the corresponding point on the base. 22 shows similarly the position of the mean in In a symmetrical distribution the mean coincides with the centre of symmetry.^) must be zero. 13. as in the present example. THEORY OF STATISTICS. 22. on the side of the greatest frequency towards the longer " tail " of the distribudistribution. taken with their : proper signs. In distributions of such a type the intervals must be made very small indeed to secure an approximately accurate value for the mean. a diagram has been drawn representing the frequencythe position of the mean may conveniently be indicated by. not as an abstraction. If of the degree of inaccuracy to be expected. The following examples give important properties of the arithmetic mean. and the vertical indicates the mean. Thus fig. Median Mi. evidently %{f. MM Mo MiM Fio. and Mode Mo.— Mean M. and at the same time illustrate the facility of its algebraic treatment : (a) The sum of the deviations from the mean. for if M and A are . 21 (a reproduction of fig.— 114 interval. frequency-distribution of the variable concerned. The student should mark the position of the mean in the diagram of every frequency distribution that he draws. is zero. Mm an ideal distribution. but always in relation to the tion: fig. 10) shows the frequencypolygon for our first illustration. and so accustom himself to thinking of the mean. In a moderately asymmetrical distribution at all of this form the mean lies. The student should test for himself the effect of different groupings in two or three different cases. so as to get some idea 12. This follows at once from equation (4) identical. 2.M^+ . two countries M= (346 X 67-78) + (741 x 66-62). of. .— VII. X]. we must have. all the sums or differences of corresponding series (of equal numbers of observations) is equal to the sum or difference of the means of the two series. For if class is identical (c) The mean of observations in two Z=Xi±X„ 2(X) = S(X. N. or (2) that the mean of the values in the with the mid-value of the class-interval..fi) + 2(/. that is. It is evident that the if form of the relation (5) . men bom .)+ +%{fAr) (7) The agreement of these totals accordingly checks the work. 2(20 = 2(X. —AVERAGES.M.. Mean „ stature of the 346 .)... quite general X„ the mean M M^ . For if we denote the values in the first series by X^ and in the second series by X^. if there be iVj observations in the first series and N^ '"^ the second. 741 „ Wales = 66-62 in the Hence the mean stature of the 1087 men born is given by the equation 109.. inches. »• as before. .^) = 2(/i.). we find .M^ For example. . Chap.) + 2(X. denoting the component series by the subscripts 1. if the same arbitrary origin A for the deviations t be taken in each case.M=NyM^ + N^.)±2(X. and the means of the two series be i/j.1. As an important corollary to the general relation (6). two component series. it may be noted that the approximate value for the mean obtained from any frequency distribution is the same whether we assume (1) that all the values in any class are identical with the mid-value of the class-interval. This follows almost at once..f. . .. N. +N.. M„ there are r series of observations of the whole series is related to of the component series by the . it is useful to note that. : That is. (5) from the data of Table VI. . 115 say.. . S(y. Xg the means if^ equation . Jf= 66-99 is . + N^. (6) If a series of N observations of a variable X consist the mean of the whole series can be readily expressed in terms of the means of the two components. in Ireland = 67"78 in. . (6) For the convenient checking of arithmetic. in. .M=FyM. .. . VI. . M^ respectively.. we see that 45 per cent. But if there be an even number. had 13 or more . M=M^±M^ Evidently the form if . . . 55 per cent. the (7i-l-l)th in order of magnitude is the only value fulfilling the definition. the median may be defined as that value of the variable the vertical through which divides the area of the curve into two equal parts. seeing that it is based on all the observations made... VI. ±x„ M=M^±M^± .. X : . any value between the mth and (ra-fl)th In such a case it appears to be usual to fulfils the conditions. It should also be noted that in the case of a discontinuous variable the second form of the definition in general breaks down if we range the values in order there is always a middlemost value (provided the number of observations be odd). — In the case of a smaller values occur with equal frequency. The actual measurein any such case is the algebraic sum of the true ment measurement Xj and an error -Zj. and measurements If. The median. M. (9) As a useful illustration of equation (8)..116 That is. fulfils the conditions (6) and (c) of § 4. the the arithmetic mean of the errors M^. had 13 or fewer There is no number of rays rays. Thus in Table III. if THEORY OF STATISTICS. . had 14 or more. But the definition does not necessarily lead in all cases to a determinate value. The median may be defined as the middlemost or central value of the variable when the values are ranged in order of magnitude.. but this is a convention supplementary to the definition. ±M^ . § 3 of Chap. latter be zero. like the mean. and only if. The mean of the actual is therefore the sum of the true mean Jfj. say If there be an odd number of different values of 2n+l. will the observed mean be identical with the true X M Errors of grouping (§11) are a case in point. conbider the case of measurements of any kind that are subject (as indeed all measures must be) to greater or less errors. as the vertical through Mi in fig..^ be the respective means. .. The median. so that X=X^±X^± . observed. 22. . M^. say 2n different values. frequency-curve. take the mean of the nth and (m-t-l)th values as the median. so that its nature is obvious. any value such that greater and less values occur with equal frequency. of the poppy capsules had 12 or fewer stigmatio rays.. but there is not. as a rule. and that it possesses the simple property of being the central or middlemost value. M. (8) of this result is again quite general. similarly 61 per cent. 14. or as the value such that greater and mean. 39 per cent. . § 15) there is no number of petals that even remotely fulfils the required condition. The work may be indicated thus : Half the total number of observations (8585) = 4292'5 = 3589 Total frequency under 66i|. VI. 10. taking the figures in our first illustration of the method of calculating the mean. and 100 more with between 2'75 and 3'25 per cent. The value of the median stature of males may be similarly calculated from the data of the second illustration. —AVERAGES. its position indicated by Mi in fig. = 703-5 = 1329 - = 67-47 inches. only of the observed values. 15. . instead of by 50 per cent. = 3'195 is The mean being 3'29. The median is therefore a form of average of most uncertain meaning in cases of strictly discontinuous variation. and is perhaps best avoided in any case.inches . of the population in receipt of relief. 15. hence the value of the median is taken as : 2-75 -^^. (Chap. . it may be remarked. Difference Frequency in next interval Therefore median = 66-^ -I- . In the case of the buttercups of Table XIV. Thus. its use in such cases is to be deprecated. . 21. the median is slightly less . When a table showing the frequency-distribution for a long series of observations of a continuous variable is given. Looking down the table. in which small series of observations have to be dealt with.— VII. no difficulty arises. as a sufficiently approximate value of the median can be readily determined by simple interpolation on the hypothesis that the values in each class are uniformly distributed throughout the interval. But only 89 are required to make up the total of 316 . or 20 per cent. the total number of observations (registration districts) is 632. An analogous difficulty may arise. of which the half is 316. for it may be exceeded by 5. we see that there are 227 districts with not more than 2 '75 per cent.i = 2-75 -t- 0-445 per cent. whether the variation be continuous or discontinuous. even in the case of an odd number of observations of a continuous variable if the number of observations be small and several of the observed values identical. 117 such that the frequencies in excess and defect are equal. Taking. 227 not exceeding 3-25.. and not exceeding 3-75. 327. not exceeding 2-75. be substituted. Graphical interpolation may. Plot the numbers of districts to the corresponding with pauperism not exceeding each value Example 2-25 is . the figures of for arithmetical interpolation. i. again. if desired.118 THEORY OF STATISTICS. it is evident that mean and median must coincide. 16. the smallness of the diflference arising from the approximate symmetry of In an absolutely symmetrical distribution the distribution. X f s 400- 300- S 200- 100- . 417. the number of districts with pauperism not exceeding 138. The difference between median and mean in this case is therefore only about one-hundredth of an inch. without the necessity of measuring all the objects to be observed. however. It is equally impossible to give any theorem analogous to equations The median of the sum or diflFerence of (8) and (9) of § 13. even if the median error be zero. but impossible the value of the resultant median depends on the forms of the component distributions. not merely complex and difficult. the resultant median will not coincide with the resultant mean. a number of men be ranked in order of stature. If. however. But if the two components be asymmetrical. As was shown in § 13. as already stated. and not on their medians alone. for the median respectively will show that on the score of brevity of calculation the median has a distinct advantage. i. in general. the mean of the resultant distribution can be simply expressed in terms components. have an advantage over the mean for special reasons. When. too much weight ought not to be attached. equal to the sum or difference of the medians of the two series . comparison of the calculations for the mean and 17.e. to give any theorem for medians analogous to equations (5) and (6) for means. pairs of corresponding observations in two series is not. If two symmetrical distributions of the same form and with the same numbers of observations. the median value of a measurement subject to error is not necessarily identical with the true median. the superiority lies wholly on the side of the mean. A : .yil. and he alone need be measured. but with different medians.e. in any case in which they can be arranged by eye in order of magnitude. the median may paratively circumscribed. a factor to which. or (whatever their form) if the degrees of dispersion or numbers of observations in the two series be diflferent. lie halfway between the means of the components. therefore. for instance. i. (o) It is very readily calculated . 119 involve the crude assumption that the frequency is v/niformly distributed over the interval in which the median lies. —AVERAGES. the resultant median must evidently (from symmetry) coincide with the resultant mean. The expression of the of the means of the median of the resultant distribution in terms of the medians of the components is. These limitations render the applications of the median in any work in which theoretical considerations are necessary comOn the other hand. (6) It is readily obtained. nor with any otlier simply assignable value. however. the ease of algebraical treatment of the two forms of average is compared. It is impossible. be combined. the stature of the middlemost is (On the other hand the median. 18. when several series of observations are combined into a single series. if positive and negative errors be equally frequent. for this is entirely dependent It . however. the median will be no more affected by the addition to the group of a man with the income of £50. If. the value wliich is in fact the fashion (la mode). ) is sometimes useful as a makeshift. or due to errors or blunders). Mi the median. If observations of any kind are liable to present occasional greatly outlying values of this sort (whether real. the distribution be asymmetrical. in a certain sense. median and mode coincide with the centre of symmetry. ref. though the term is of recent introduction (Pearson. It is evident that in an ideal symmetrical distribution mean. The mode is the value of the variable corresponding to the maximum of the ideal frequency-curve which — gives the closest possible fit to the actual distribution. will remain a batch representing the median 'price when prices are reckoned at so many eggs to the shilling. owing to its being less affected by abnormally large or small values of the variable. the median wage cannot be found from the total of the wages-bill..) The point is discussed more fully later (Chap. 19.g. VI. The Mode. (Chap. median Tnay sometimes be preferable to the mean. as in Table IV. (e) It may be added that the median is. Mo being the mode. tlie three forms of average are distinct. If a number of' men enjoy incomes closely clustering round a median of £500 a year. The stature of a giant would have no more influence on the median stature of a number of men than the stature of any other man whose height is only just greater than the median. § 10).000 than by the addition of a man with an income of £5000. and the mean. 22. XVII. and the total (c) It of the wages-bill is not known when the median is given. useless in the oases cited at the end of § 6 . It represents the value which is most frequent or typical. e. the mode is an important form of average in the cases of skew distributions. Clearly. But a difficulty at once arises on attempting to determine this value for such distributions as occur in practice.120 it is THEORY OF STATISTICS.). or even £600. when the observations are so given that the calculation of the mean is impossible. a particularly real and natural form of average. the median will be more stable and less affected by fluctuations of sampling than the arithmetic mean. to {d) The a final indefinite class. 11). for the object or individual that is the median object or individual on any one system of measuring the character with which we are concerned will remain the median on any other method of measurement which leaves the objects in the same relative order. owing. M is no use giving merely the mid value of the class-interval into which the greatest frequency falls. (In general the mean is the less affected. Thus a batch of eggs representing eggs of the median price. as in fig. when prices are reckoned at so much per dozen. it is evident that some process of smoothing out the irregularities that occur in the actual distribution must be adopted. and mode that appear? to hold good with surprising — closeness for moderately asymmetrical distributions.. taking the mean to three places of That is decimals. in a practical case. 3-289 3-195 0-094 Hence approximate mode = 3'289 .3 (Mean .— VII. 21 is the value of the accordance with our definition. if the intervals could be made indefinitely small and at the same time irregular. as it happens.. For the distribution of pauperism we have. in order to ascertain the approximate value of the mode. be indefinitely increased. which is sufficient accuracy for the final result. approaching the ideal type of fig.. though three decimal places must be retained The true mode.. 20. Mean Median Difference . But there is only one smoothing process that is really satisfactory.Median). at all events the relative values of these three averages for a great many cases with which the student will have to deal. 21 and 22). and that is the method of fitting an ideal frequency-curve of given equation to the of observations cies should number The value of the variable corresponding to the the actual figures. . in so far as every observation can be taken into account in the determination. It is no use making the class-intervals very small to avoid error on that account. . As the observations cannot. in Mo in fig. the value 299 being. to say. At the same time there is an approximate relation between mean.. — — 121 on the choice of the scale of class-intervals. 9. be left to the more advanced student. the median lies one-third of the distance from the mean towards the mode (compare figs. median. The determination of the mode by this the only strictly satisfactory method must. and it is one that should be borne in mind as giving roughly. found by fitting an ideal for the calculation.3 x 094 or 3-01 to the second place of decimals. very nearly coincident with the centre of the interval in which the greatest frequency lies. = 3-007. for the class-frequencies will then become small and the distribution What we want to arrive at is the mid-value of the interval for which the frequency would be a maximum.— AVERAGES. maximum of the fitted curve is then taken as the mode. It is expressed by the equation — — Mode = Mean .. mode so determined for the distribution of pauperism. however. be so increased that the class-frequenrun smoothly. Soy. 1881. lix. is 2-99. and 1891 (the last being the illustration taken above).) England and Wales. StcU. we give below the results for the distributions of pauperism in the unions of England and Wales in the years 1850. and similar distributions at four other stations. 1860.122 THEORY OF STATISTICS. vol. distribution. and also the results for the distribution of barometer heights at Southampton (Table XL.. As further illustrations of the closeness with which the relation may be expected to hold in different cases. § 14). VI. Comparison of the Approximate and True Modes in the Case of Five Distributions of Pauperism Belief) in the Unions of Soe.. {Percentages of the Population in receipt of (Yule. Chap. Jov/r. Year. 1870. 1896. . its value may be indeterminate. it cannot replace the latter. owing to the difficulty of its determination. —AVERAGES. the value of the negative values occur. its value is always determinate. the student will find a proof in most text-books of algebra. 123 it is simply calculated. somewhat more easily calculated from a given frequency-distribution than is the mean .. . XS^ . as we have seen (§ 20). for. Objection is sometimes taken to the use of the mean in the case of asymmetrical frequency-distributions. the logarithiti of the geometric mean of a series of values is the arithmetic mean of their logarithms. is a form of average hardly suitable for elementary use. is zero. Chap. X . The geometric mean of a given series of quantities is always less than their arithmetic mean . . and in most cases it is rather less affected than the median by errors of sanjpling. The median is.. X„. . would apply with almost equal force to the median. But no one in the least degree familiar with the manifold forms taken by frequency-distributions would regard the two as in general identical . (11) that is to say. it is sometimes a useful makeshift. The objection. and while the importance of the mode is a good reason for stating its value in addition to that of the mean.. 22. and it may become imaginary if even a single value of Excluding these cases.VII. The mode.X^. The magnitude of the difference depends largely on the amount of dispersion of the variable in proportion to the magnitude of the mean (c/. and in ref. and that its value is consequently misleading. . Xg. and the elementary student will do very well if he limits himself to its use. it may be noted. It is necessarily zero. X^. (10) may also be expressed in terms of logarithms. and its algebraic treatment is difficult and often impossible. it is true. . 10.. finally. if VIII. The arithmetic mean should invariably be employed unless there is some very definite reason for the choice of another form of average.X^ The definition . . but at the same time it represents an important value of the variable. The geometric mean G^ of a series of values Xj. log^ = -^2(logX) . . Question 8). on the ground that the mean is not the mode. iS| defined by the relation — G={X. it should be noticed. but its use is undesirable in cases of discontinuous variation.. The Geometric Mean. its algebraic treatment is particularly easy. and in a certain class of cases it is more and not less stable than the mean . the difference between mode and median is usually about two-thirds of the difference between mode and mean. of the ratios of For if X=X-^IX^. if X is given as the product of any number of others.X^ . § 4 (c) ) the geometric mean does not possess any simple and obvious properties which render its general nature readily comprehensible. of §§ 9 and 10. the geometric mean of the product is the product of the geometric means.. +Nr. At the same time. (a) If the series of observations series. . as the following examples show. X consist of r component there being N-^ observations in the first. If there are many observations. no doubt. Xf. 23. computation is a little long. and ii.. G=GJG^ (c) . . . different series. observations in r expressed in terms G^ of X^. JTj. X^ X„ by G=G^.e. then summing for all pairs of Xj's and X^s.. For evidently we have at once (as in § 13 (*))N... G^. G^. (13) Similarly. That is G^ .. if a variable i..^. and so on.\ogG^ + N^. .G. and is readily treated : X : algebraically in certain cases.\ogGr (6) . a table should be drawn up giving the frequency-distribution of log X.\ogG^+ . The geometric mean is always determinate and is rigidly defined. the relation . of the component series. . geometric means G^. (14) to say. and the mean should be calculated as in Examples i. partly. . of the X^ denoting corresponding the geometric mean G oi X is .. on account of its rather troublesome computation. . JTj. owing to the necessity of taking logarithms it is hardly necessary to give an example. the mean possesses some important properties.G^. The geometric mean has never come into general use as a representative average. . (12) tions The geometric mean in two series is equal corresponding observato the ratio of their geometric means.\ogG^N. the geometric mean O of the whole series can be readily expressed in terms of the geometric means ffj.. etc. . but principally on account of its somewhat abstract mathematical character (c/. logX=logXi-logJ-2. N^ in the second.124 THBOKT OF STATISTICS. X=XyX2.. as the method is simply that of finding the arithmetic mean of the (instead of the values of X) in accordance with logarithms of equation (11). the most reasonable assump1801 II 21 31 SI 61 11 81 91 1901 300 300 Gwib^rUxndi 250 -250 200- — 200 Dorset 150 CumherloTui ISO Dorset 100 . (16) . 31 w iffoi Census year.. and tion to Pn = P. 24. is . (15) The population midway between the two censuses P^. P^r^ two years after the first census. in estimating the The use of the geometric mean finds its simplest application numbers of a population midway between two epochs (say two census years) at which the population is known..^ ... 125 24. Fig. —AVERAGES. and so on.^Po-r^^iPo-i^nY therefore . so that the populations in successive years form a geometric series. — Showing the Populations of certain rural counties of England for each Census year from 1801 to 1901. make is that the percentage increase in each year has been the same. If nothing is known concerning the increase of the population save that the numbers recorded at the first census were Pg and at the second census n years later P„. ffereforU — 100 Hereford' SO- SO isor n 21 31 11 51 ei ?.VII. . P^r being the population a year after the first census. 25. for example. and in the remainder of the area the initial population is ^q and the common ratio r. 24 would be continuously convex to the base. and from which there is much emigration. 14. it cannot be assumed for the Counties if it be assumed for the Counties. i. This result must. i. the assumption is not self-consistent in any case in which the rate of increase is not uniform over the entire area and almost any area can be analysed into parts which are not similar in this respect. a constant percentage rate of increase be assumed for England and Wales as a whole. The rate of increase of population is not necessarily.126 THEORY OF STATISTICS. in some respects. " index-numbers. a peculiarly convenient form of average in dealing with ratios. Further. or even usually. If then. 15 for a discussion of methods that may be used for the consistent estimation of populations under such circumstances. be used with discretion." as they are termed. For if in one part of the area considered the initial population is Pq and the common ratio R.e. The property of the geometric mean illustrated by equation (13) renders it. it cannot be assumed for the country as a whole. and similar results will often be found for districts in which the population is not increasing very rapidly. a curve over any considerable period of time representing the growth of population as in fig. the population in year n is given by : — This ^oes not represent a constant rate of increase unless = r. however. The student is referred to refs. the geometric mean of the numbers given by the two censuses.e. whether the population were In the diagram it will be seen that increasing or decreasing. the curves are frequently concave towards the base. constant if it were so. Let : R ^ 0' -^ 0' -^ 01 • • . of prices. VII. G^^o denote the geometric means of the' Ps for the years 1 and 2 respectively. If the geometric mean be chosen. we have £20 / Y' _ I ^ 20 \yi -t • Y" 20 ' ± Y'" : Y" ^"10 Y"' . The question is. and G-^^. for 127 will afford form of average of the Ps any given year an indication of the general level of prices for that year. what form of average to choose. —^AVERAGES. provided the commodities chosen are sufficiently numerous and representative. and if there be metrical round log G if plotted to log a single mode.) Number in . as the equivalent of a deviation + x. The general applicability of the assumption made does net. instead of a deviation . appear to have been very widely tested.a. if JI be the harmonic mean. as in Tables X. (Data from A. log G will be that mode a logarithmic or geometric mode. If a distribution take the simplest possible form when relative deviations are regarded as equivalents. pp. § 11). The Harmonic Mean. 8. as it might be termed G will not be the mode if the distrias base. however. is (18) The following illustration. The frequency-curve will then be symas base. 2. will required for an serve to show the method of calculation.G. 31. XIII.128 THEORY OF STATISTICS. 9). that is. iii. Biometrika. G. bution be plotted in the ordinary way to values of The theory of such a distribution has been discussed by more than one author (refs. X — quantities the reciprocal of the arithmetic mean of their reciprocals. 27. the result of which is example in a later chapter (Chap. and the reasons assigned have not sufficed to bring the geometric mean into common use. D. 30. Darbishire. It may be noted that. The harmonic mean of a series of X — . The table gives the number of litters of mice. as the geometric mean is always less than the arithmetic mean. the fundamental assumption which would justify the use of the former clearly does not hold where the (arithmetic) mode is greater than the arithmetic mean. the frequency of deviations between G/s and G/r will be equal to the frequency of deviations between r. with given numbers (X) in the litter. in certain breeding experiments. and XL of the last chapter. deviation Gjr will be regarded as the equivalent of a deviation r.G and s. The official returns of prices in India were. we should have had 50 returns showing a price of Id.. the result is equivalent to the harmonic mean of prices stated in the ordinary way. per egg. Question 9. and measures of dispersion in general : includes much of the matter of (1). averages. Persons. 30 fourteen to the shilling." xviii. d. 1/Zr= 0-2831. VIII. 13-16. but useful to the economic student for references cited. d." Supposing we had 100 returns of retail prices of eggs. But if the prices had been quoted in the form usual for other commodities. and. lower than the arithmetic mean." The average annual price of a commodity was based on halfmonthly prices stated in this form. and 20 a price of l-2d. Thus retail prices of eggs were quoted before the War as "so many to the shilling.) per rupee. dispersion. AVERAGES. In the issues of " Prices and Wages in India" for 1908 and later years the prices have been stated in terms of "rupees per maund (82-286 lbs. 27= 3-532.. G. per egg. 30 showing a price of 0-857d. 1913. 129 Whence. "Ueber den Ausgangswerth der kleinsten Abweiohungssumme. Leipzig (The average defined as the origin from which the (1878).. 50 returns showing twelve eggs to the shilling. (also : mean (2) dealt with incidentally.-phys. dessen Bestimmung. 1897. by W. ZlZBK. Leipzig. : VII.) . 1. p. pp.) REFERENCES. numbered xi.) : (3) . If the prices of a commodity at different places or times are stated in the form "so much for a unit of money. a fortiori. it will be seen.). Leipzig. The arithmetic mean is 4-587. Lipps Engelmann. Holt & Co. until 1907. Utatiatical Averages. M. translated with additional notes. New York. Classe). T. (Posthumously published: deals with frequency-distributions." The change. equivalent to a price of 0-984d. arithmetic mean 0-997d.. herausgegeben von G. G. Verwendung nnd VerallgemeinAbh. given in the form of "Sers (2-057 lbs.." and an average price obtained by taking the arithmetic mean of the quantities sold for a unit of money.) Fechner. Fkanz. and "index-numbers" were calculated from such annual averages. legl. measured in one way or another. F. amounts to a replacement of the harmonic by the arithmetic mean price. the amount of difference depending largely on the magnitude of the dispersion relatively to the magnitude of the mean. and 20 ten to the shilling . sdehsischen Gesellschaft d. Kolleklivmasslehre. Fechnee. DunckerundHumblot. is a minimum geometric eriiiig. Chap. of the Abh. The harmonic mean of a series of quantities is always lower than the geometric mean of the same quantities. (1) General. maih. their forms. etc. 1908 English translation. or more than a unit greater. Wissenschaften. then the mean number per shilling would be 12-2. a slightly greater value than the harmonic mean of 0-984. {Cf. (Nonmathematical. T. DiestatistischenMittelicerthe. vol. 1883. 11. Macmillan." Trans. of any number of Positive Quantities is greater than the Geometric Mean.. xi-xii.." Proc. Roy. 1879. th£ Registrar-General for A. (Contains. (8) MoAlistbk. Sac. in Investigations in Cwrrency and Finance . 714. "Skew Variation in Homogeneous Material. Soc. Stanford. vol. p. The student will find copious (16) Edgbworth. references to the literature in the following E. 260. 345. vol. p.. (Tlie geometric mean applied to the measurement of price changes. ref. p. clxxxvi. Chap. F. etc.' The general theory of indej^-numbers and. pp. A. "Reports of the Committee : appointed for the .) ) ) — ) ) 130 THEORY OF STATISTICS. Roy. (16) Waters.. Method for estimating Mean last Intercensal Period. p." ibid. Donald. 1899-1900. 1903. Populations in the Soc. 365." Joim: Roy. "On the Modal Value of an Organ or Character.. vol. Skew Frequency -curves in Biology and Statistics (9) Noordhoff. (Some criticism of the reasons assigned by Jevons for the use of the geometric (4) Jbvons. Groningen. " The Geometric Mean in Vital and Social Statistics. xlvi. a generalisation of McAlister's law. Soc. Edin. i. 1884. Serious Pall in the Valiie of Gold ascertained Reprinted Social Effects set forth . C." Proc. lix. Pay. G.. 2618. 1901. Estimates of Population : England and Wales {Cd. G. Karl. (11) Pearson. vol. Ixiv.) For the methods actually used.General England and Wales for 1907.. p. 343. Stat. xviii.) (13) Pearson. " The Law of the Geometric Mean. Dawson." Jour. (The law of frequency to which the use of the geometric mean would be appropriate. vol. vol. for a different method based on the symptoms of growth such as numbers of births or of houses. 1902. "On the Method of ascertaining a Change in the Value of Gold. C. . 1907. Also reprinted in volume cited above. Stat. 1863. amongst other forms. London. xxviii. Series A. p.. Soc. 367. Math. "An See also refs. p. W. Supplement to Annual Report of Snow. 1865. p. E. 1896. "On the Variation of Prices and the Value of the Currency since 1782. Of. (6) Edgewokth. Index-mimbers. C. and Wm. (12) p. Stanley. "A Estimates of Population. XII. (Definition of mode. Francis. Yule. : : (14) Waters. vol. (5) Jevons. These were incidentally referred to in § 25." Biometrika. xxix. J.) "Notes on the History of Pauperism in England and Supplementary Note on the Determination of the Mode.. Roy. The Mode. Phil.. Wales. 1895. London. Y. The Geometric Mean. Stanlbt." Jour. Y. U.. (The note deals with elementary methods of approximately determining the mode the onethird rule and one other. the different methods in which they may be formed are not Cbnsidered in the present work. London. W. Kael. Stat. cxvii. Eapteyn. Soc.. Stat. 293.. A and its mean. vol. Boy.. 1 and 2. 343." Jour.. (A warning as to the inadequacy of mere inspection for determining the mode. see the Reports of the Registrav. Elementary Proof that the Arithmetic Mean (10) Crawfoed. Soc. of pp. cxxxii-cxxxiv. (7) Galton. and for 1910. taken as 100 : From the arithmetic mean of the ratios in col. H. and check your work by the method of § 13 (5). ." in the Board of Trade Report on Wholesale and Retail Prices in the United Kingdom. . 247). by the method of § 20. 96. Note that. Using a graphical method.. . EXERCISES. the latter being taken as 100. p.. (4) From the geometric mean of the ratios in col. and hence the approximate mode. Find the mean weiglit of adult males in the United Kingdom from the data in the last column of Table IX. on their average prices during the years 1867-77.. Wales. 4. 1889 (p... 3. — 131 —AVERAGES. p. 88. (17) Edgicworxh. median. 3. purpose of investigating the best methods of asoei-taining and measuring Variations In the Value of the Monetary Standard.. VI.. Similarly. Find the average ratio of prices in 1908 to prices in 1898. Jowr. (18) Fountain. <&c. . Mean Median . VI. 2. . F. Macmillan.. (3) From the ratio of the geometric means of cols. the last two methods must give the same result. fiiyi the mean. 95. by § 25. Ireland. 485). . Chap. Stature in Inches for Adult Males in England. Also find the median weight. Verify the following means and medians from the data of Table VI. and approximate value of the mode for the distribution of fecundity in race-horses. 1 and 2. (1) . and 1890 (p. Article " Index-numbers " in Palgrave's Dictiona/ry of Political Mconomy. Chap VI. 1 and 2. 133). In column 3 have been added the ratios of the index-numbers in 1908 to the index-numbers in 1898. 1896. p. VI. vol. /Stei. 1903.. "Memorandum on the Construction of Index-numbers of Prices. 67-31 67-35 68-55 68-48 66-62 66-56 67-78 67-69 In the calculation of the means. (Data from Sauerbeck. use the same arbitrary origin as in Example ii. find the median annual value of houses assessed to inhabited house duty in the financial year 1885-6 from the data of Table IV. 5.— VII. 1887 (p...) The figures in columns 1 and 2 of the small table b^low show the index-numbers (or percentages) of prices of certain animal foods in the years 1898 and 1908. Chap. 3. 83. Table X." British Association Reports. (2) From the ratio of the arithmetic means of cols. ii. 181). p. 1888 (p. Chap. Boy. 1.. March 1909. Scotland. Y. ) the rural sanitary districts of Essex. The table below shows the population of 6. at the censuses Estimate the total popiilation of the county at a date of 1891 and 1901. midway between the two censuses. (1) on the assumption that the percentage rate of Increase is constant for the county as a whole. Essex. and the borough of West Ham. (Data from census of 1901.132 THEORY OF STATISTICS. (2) on the assumpti(fti that the percentage rate of increase is constant in each group of districts and the borough of West Ham. the urban sanitary districts (other than the borough of West Ham). . calculation. of two distributions covering precisely the same range of variation. Clearly we should not regard two such distributions as exhibiting the same dispersion. ETC. Moreover. while the other exhibited an almost eyra distribution of frequency over the whole range. A measure subject to erratic alterations by casual influences in this way is clearly not of much use for comparative purposes. There are seldom real upper and lower limits to the possible values of the variable. Inadequacy of the range as a measure of dispersion 2-13. the figures of Table IX. The simplest possible measure of the dispersion of a series of values of a variable is the actual range. MEASURES OF DISPEESION. like the averages discussed in the last : 133 .e. very large or very small values being only more or less infrequent the range is therefore subject to meaningless fluctuations of considerable magnitude according as values of greater or less infrequency happen to have been actually observed. and properties 20-24. showing the frequency distributions of weights of adult males in the several parts of the United Kingdom. one individual was observed with a weight of over 280 lbs. i. I. The method of grades or percentiles.CHAPTER VIII. — mean — — — 1. Measures of asymmetry or skewness 27-30. While this is frequently quoted. thj? one showed the observations for the most part closely clustered round the average.. Some sort of measure of dispersion is therefore required. Chap. though they exhibit the same range. The deviation : its definition. The addition of the one very exceptional individual has increased tho range by some 30 lbs. the measure takes no account of the form of the distribution within the limits of the range . calculation. and properties— 14-19.. The quartile deviation or semi-interquartile range 25. or about one-fifth. 95. based. Measures of relative dispersion— 26. VI. -The standard deviation: its definition. Note. the difference between the greatest and least values observed. In Wales. it might well happen that. the next heaviest being under 260 lbs. p. it is as a rule the worst of all possible measures for any serious purpose.. for instance. ' in order to obtain a measure of dispersion. VII. as in the last chapter. . VII. There is a very simple relation between the standard deviation and the root-mean-square deviation from any other oi'igin. quantity analogous to the standard deviation may be defined in more general terms. Then we may define the root-mean-square deviation origin A by the equation s2 s from the = is(a. . since this sum is necessarily zero if deviations be taken from the mean. the measure should possess all the properties laid down as desirThere are three such able for an average in § 4 of Chap.134 THEORY OF STATISTICS. (2) In terms of this definition the standard deviation is the rootmean-square deviation from the mean. . Then ^^ = x^+'ix.%{x) + N. measures in common use the standard deviation. which the first is the most important. deviations being measured from the arithmetic mean of the observations. § 8) denote the deviation of A A X from . chapter. so that no single observation can have an unduly preponderant effect on its magnitude .d+d\ = 2(«2) + M. In order to obtain some quantity that shall vary with the dispersion it is necessary to average the deviations by a process that treats them as if they were all of the same sign. then the standard deviation is given by the equation 2. . Let M-A = d so that (3) ^=x + d. 3. and a deviation from the arithmetic mean by x.e. and the quartile deviation or semi-interquartile range. The standard deviation — o-2 = is(a.d^. If the standard deviation be denoted by o-.. the mean deviation.. (1) To square all the deviations may seem at first sight an artificial procedure. and stpiaring is the simplest process for eliminating signs which leads to results of algebraical convenience. The Standard Deviation. indeed.^) . i. 2(^2) . on all the observations made. let i=X-A. but it must be remembered that it would be useless to take the mere sum of the deviations. — of is the square root of the arithmetic mean of the squares of all deviations. Let A be any arbitrary value of X. and let f (as in Chap. Generally. therefore = o-2 + d2. and accordingly zero. therefore. or %(f. Chap. and Jf*S' be set off equal to the standard deviation (on the same scale in which the variable is plotted along the base)..f) is termed first moment (c/. the standard deviation is the least possible root-mean-square deviation. (4) Hence the root-mean-square deviation is least when deviations are measured from the mean. 25. then. If o. S(^^). 5.VIII. the method of calculating the standard It is illustrated by the deviation is perfectly straightforward. ETC. concrete idea of the way in which the root-mean-square deviation depends on the origin from which deviations are measured. 25). If we have to deal with relatively few. VII.|") is termed the mth moment. i. SA will be the root-meanThis construction gives a square deviation from the point A. be the vertical through the the hypotenuse.and d are the two sides of a right-angled triangle. A If. It will be seen that for small values of d the difference of s from awill be very minute. since A will lie very nearly on the circle with centre S and radius SM: slight errors drawn through in the mean due to approximations in calculation will not. is 135 But the sum of the deviations from the mean the second term vanishes. — MEASUKES s2 OF DISPEBSION. 4.t^) if we are dealing with a grouped distribution and/ is the frequency of f.e.. . ungrouped observations. just as 2(^) or SC/'. appreciably affect the value of the standard deviation. figures given below for the estimated average earnings of MH mean X M . S(/'. of a frequency-distribution (fig. say thirty or forty. is sometimes termed the second moment of the distribution about A. s is the : M Fig. § 8) we shall not make \\m of the term in the present work. 2^16^ (i= 1^ =421-5263 0-0693 00 = 0-2632. is unnecessary.. thus checking the value for the mean. it is not necessary to take the average to any higher degree of accuracy. <P= o-2 Hence = s2_ ^2 = 421-4570 0-= 20-529c. to give the are first of all totalled and the total divided by arithmetic mean M.— — THEOKY OF STATISTICS. viz. 4. 421-5 If we wish to be more precise we can reduce to the true mean by the use of equation (4).. ll^fd-. negative 290. but the work should be checked by totalling the positive and negative [The positive total is 300 and the diflferences separately. 15s.] Finally. each difference is squared. we have — .. .l^B. that small errors in the mean have little effect on the value found for the standard deviation. Treating the value taken for the mean as sensibly accurate. The first value is correct within a very small fraction of a penny. or 15s. 15s. : lid. 136 The values (earnings) agricultural labourers in 38 rural unions.018. differences to be squared are large (see list of Tables. lid. to the nearest penny. viz. and the squares entered in tables of squares are useful for such work if any of the col. as follows : . illustrating the fact mentioned at the end of § 4. The sum of the squares is 16. + 10/38. one penny being taken as the unit the signs are not entered. the difference of each observation from the mean is next written down as in col. The earnings being estimates. Evidently this reduction. as they are not wanted. 356). in the given case. Having found N the mean. p. 3. ETC. 1894. .) 1. 137 Caioi'LATion of the Standakd Deviation: Example i. vol. ( W. —MEASURES OF DISPERSION... part i. Little Labour Com: mission. Report. v. Calculation of Mean and Standard Deviation for a Short Series of Observations unEstimated Average Weekly Earnings of Agricultural Labourers grouped. in 1892-S. in Thirty-eight Rural Unions.— VIII. 9. and we accordingly use the same illustrations as in the last chapter. Thus in Example ii. from the class-interval as unit to the natural unit In this case the value found is 2'48 classof measurement. The arithmetic mean wage is 13s. remaining steps of the arithmetic are given below the table . the class-intervals is chosen as the arbitrary origin A from which to measure the deviations 4 tbe class-interval is treated as a unit throughout the arithmetic. The 768.e. of Chap. and the class-interval being half a nnit. 1. 192x4 = figures of that column again by f. and we are thus enabled to make an interesting comparison of the dispersions of the two. also given in the same Report. VII. frequency in any one interval by /. as his earnings and not the mere money wages he receives are the important matter to the labourer. cider. It might be expected that earnings would vary less than wages. Column 5 gives the figures necessary for calculating the standard deviation. below. and all the observations within any one class-interval are treated as if they were identical with we denote the If. 10). and is derived directly from col. 26-Od. the mid-value of the interval. i. The figures dealt with in this illustration are estimates of the weekly earnings of the agricultural labourers. 20 -Sd. . If we have to deal with a grouped frequency-distribution.— 138 • THEORY OF STATISTICS. these / observations contribute f^ to the sum of the squares of deviations and we have The standard deviation is then calculated from equation (4). such as coal. and as a fact we find Standard deviation of weekly earnings wages . intervals. 2. the same artifices and approximations are used as in the calculation The mid-value of one of of the mean (Chap. 5d. 3. as before. 4 by multiplying the Thus 90 x 5 = 450. . they include allowances for gifts in kind. VII. and 4 are the same as those we have already given in Example i. etc. The whole of the work proceeds naturally as an extension of that necessary for calculating the mean. however. that is 1'24 per cent. The work is therefore done very rapidly. if necessary. for the calculation of the mean. 6. cols. .^ The estimated weekly money wages are. the student must be careful to remember the final conversion. and so on. potatoes. 7. §§ 8. . 111. 139 CALCtTLATlON OF THE Standakd DEVIATION Example ii. {Cf. Calculation of the Standard Deviation of the Percentages of the Population in receipt of Relief. the work for the mean alone.— VIII. in addition to the Mean. p. ETC.) (1) Percentage in receipt of Relief. FI. —MEASURES OF DISPERSION. of : Chap. from the figures of Table VIII. Jour. . Boy.. figures slightly amended. Stat. 140 THEORY OF STATISTICS. vol. (From Yule. lix. Means and Standard Deviations of the Distributions of Pauperism {Percentage of the Population in receipt of Poor-law Belief) in the Unions of England and Wales since 1S50.) Year. 1896. Soc. 112 for the calculation of : mean alone. {Cf. p.. ETC.) — 141 VIII. 88. Calculation of the Stamdard Deviation of Stature of Male Adults in the British Isles from the figures of Table VI. Calculation of the Standard Deviation Example iii.—MEASURES OF DISPBKSION. p. (1) . Chap. 11. : M and N^ being the numbers of observations in the two comseries. we have as a special case If <T^ = hW + <r^) . N. . give a more definite and concrete meaning to the standard deviation. The standard deviation is the measure of dispersion which it is most easy to treat by algebraical methods. Let N-^ ponent the Then the mean-square deviations mean are. of observations of which the mean is consist of two component series. . the actual range is a good deal less than six times the standard deviation.aJ) + %{A\. It must not be expected to hold for short series of observations in Example i.of the whole series may be expressed in terms of the standard deviations o-j and o-j of the components and their respective means. XL). respectively. and also to check arithmetical work to some extent suificiently.^ + d. Therefore. but the work of § 3 has already served as one example. . . . for instance.dJ) . resembling in this respect the arithmetic mean amongst measures of position.<T^-^N^. o-„ and means diverging from the general mean of the whole series by d^. The majority of illustrations of its treatment must be postponed to a later stage (Chap. the standard deviation o. o-^. .^) + N. . of which the means are M^ and i¥. tively. Similarly. (5) the numbers of observations in the component series be equal and the means be coincident. . d„ the standard deviation cr of the whole series is given (using to denote any subscript) by the equation : It is evident that the if . (7) . for the whole series.{. M of the a--^ component a-^^ series about respec- + d^ and + d^ . by equation (4).<T^ = :if^{<T^^ + d. zxA N=N-^-vN^ the number in the entire series. that is to say.T. (6) so that in this case the square of the standard deviation of the whole series is the arithmetic mean of the squares of the standard deviations of its components. and we may take another by continuing the work of In that section it was shown that if a series § 13 (6).— 142 — — THEORY OF STATISTICS. to guard against very gross blunders. form of the relation (5) is quite general a series of observations consists of r component series with standard deviations o-j. m N. d^. .^) . .. VII. S 4). let us find the standard deviation of the first iV natural numbers.— VIII. deviation 0-2 a. for the checking of arithmetic.. though the student will have to take the statement on trust for the present..is therefore given by the equation 1)2. the standard deviation of a frequency-distribution in which all values of within a range + 1/2 on either side of the mean are equally frequent. ETC. +^{fr.i(if + = J^(i\r2_i) . or the relative intensity of some character in. as they are termed. so that the frequency-distribution may be represented by a rectangle. but merely by means of their respective positions when ranked in order as regards the character. 143 Again. _ = J(iV+ if cr2 1)(2]V+ 1) .')+ 12. The mean in this case is evidently {JV+ l)/2. (10) 13. whatever the character. and the standard deviation is therefore always that given by equation (9). (9) the relative merit of. — . the sum of the squares of the first iV natural numbers is ir(iv+i)(2iv+i) 6 The standard that is. it is convenient to note.g. . that it is. . marks awarded on some system of examination. VII. . VII. It will be seen from the preceding paragraphs that the standard deviation possesses the majority at least of the properties which are desirable in a measure of dispersion as in an average It is rigidly defined. namely. as is shown in any elementary Algebra. as in § 13 of Chap. Further. —MEASURES OF DISPERSION. observations made . unit then becomes negligible compared with i^. Another useful result follows at once from equation (9). and the standard deviation reduces to that of the first The single natural numbers when JV is made indefinitely large. and consequently This result is of service ' W X W . that if the same arbitrary origin be used for the calculation of the standard deviations in a number of component distributions we must have %{m=^u^. values outside these limits not occurring. as a rule. With in the same way as boys are numbered in a class. the different individuals of a series is recorded not by means of measurements.. it is based on all the (Chap.4. e. the measure least affected by fluctuations of . Tlie base I may be supposed divided into a very large number i\^ of equal elements..m (8) As another useful illustration.2 = g . it is calculated with reasonable ease it lends itself readily to algebraical treatment and we may add. individuals there are always if ranks.^)+%if. e. The Mean Deviation. as he advances further. the " modulus " (Airy). so the mean deviation is For least when deviations are measured from the median. "error of mean square" (Airy). Such root-mean-square quantities. have been termed the "fluctuation" (Edge worth) standard deviation multiplied by the square root of 2. By this displacement of the origin the sum of deviations in excess of the mean is reduced by m. of which we have adopted the most recent (due to Pearson. and also twice the the square. Let the origin be displaced by an amount c until it is just exceeded by to. and that the process of squaring deviations and then taking the square root of the mean seems a little involved. suppose that.c. nature is not very readily comprehended. it may be said that its general sampling. however. but the latter is the natural origin to use. frequently occur in other branches of science. standard deviation should always be used as the measure of dispersion. and will realise. taken without regard to their sign.1 of the values only. for some origin exceeded by m. The student will. Thus the terms "mean error" (Gauss). while the sum of The deviations in defect of the mean is increased by (iV-m)c. The mean deviation of a series of 14. The square of the standard deviation. Just as the root-mean-square deviation is least when deviations are measured from the arithmetic mean. the student will see later the reason for The reciprocal of the modulus has the adoption of the factor. ref. : : — — {N . until it coincides with the »ith value from the upper end of the series. new mean deviation is therefore. soon surmount this feeling after a little practice in the calculation and use of the constant. it may be The added. . and "mean square error" have all been used in the same sense.m)c .144 THEORY OF STATISTICS. 2) many of the earlier names are hardly adapted to general use. . It may Lo added here that the student will meet with the standard deviation under many different names.mc "*" 'N = A-t-i^(i\^-2m)c. deviations may be measured either from the arithmetic mean or from the median. the mean deviation has a value A. values out of N. On the other hand. as they bear evidence of their derivation from the theory of errors of observation. the advantages that it possesses. i. values of a variable is the arithmetic mean of their deviations The from some average. just as the arithmetic mean should be used as the measure of position. been termed the "precision" (Lexis). unless there is some very definite reason for preferring another measure. accordingly.VIII. of course. while d= -0-42 class-intervals. and then reduced to the mean as origin. or 17. is 776. the mean deviation from the median is 15d. but the median replaces the mean as the origin from which deviations are measured. 36. The calculation of the mean deviation either from the mean or from the median for a series of ungrouped observations is very simple. as it is necessary to remember. 101 per cent. and the (iV/2 + l)th observations. (p. 50.d to the sum found and subtract JV^d. their sum is 570 . 15. 16. and the deviations from the mean are written down in column 3. and of deviations in excess 509 total (without regard to sign) 1285. therefore t«(i\?i . In the present case iV^j = 327 and JV2 = 305. The mean deviation from the mean is therefore 590/38 = 15'53d. and so on . if the latter be indeterminate. the class-interval in which the mean (or median) lies. We have already found the mean (15s. ETC. : : — M~A and the sum of deviations from the mean is 1285 . The median is 15s. 145 The new mean deviation is accordingly less tlian the old so long as all origins if JTbe even. 6d. the unit of measurement being. 57. The mean deviation is therefore a minimum when deviations are measured from the median or. We have already found that the sum of deviations in defect of 3'5 per cent. we have to add ifj.9-2 = 1275"8. That is to say. and lies in the class-interval centring round 3-5 per cent. for con- 10 . and = d.but in precisely which the median (instead the mid-value of the interval in of the mean) lies should. the mean is 3'29 per cent.9-2. as before. In the case of a grouped frequency-distribution. the sum of deviations should be calculated first from the centre of. 137) as an illustration. from an origin within the range in which it lies. The mean deviation from the median is calculated in precisely the same way. Hence the mean deviation from the mean is 1275'8/632 = 2'019 class-intervals.0-42 X 22 = . and this value is the least if iV be odd. Adding up this column without respect to the sign of the deviations we find a total of 590. to the nearest penny). The mean deviation from the median should be found similar fashion. If the number of observations below the mean is N^ and above the mean N^. exactly. The deviations in pence run 63. lid. Take the figures of Example i.iVj) = . the mean deviation is constant for within the range between the Njith. and. the class-interval. . Thus in the case of Example ii.— MEASUKBS OF DISPERSION. the mean deviation is lowest when the origin coincides with the {N+ l)/2th observation. For a short series of observations like the wage statistics of Example i. 19. but the difference is too slight to affect the second place of decimals. Isles. Hence the mean deviation correction is from the median is 2'012 intervals. It should be noted that. an approximation. but depends upon their forms. As a rule. The value is really smaller than that of the mean deviation from the arithmetic mean. The deviation-sum with 3'0 as origin is found to be 1263. a regular result could hardly be expected: the actual ratio is 15-0/20'5 = 0-73. . be taken as origin. iV^j = 327. the median is venience. VII.. the mean deviation is usually very nearly four-fifths of the standard devia +039x22= X We tion. for example. it is more affected by fluctuations of sampling than is the standard deviation. N^ = 305. or again I'Ol per cent. can be calculated rather more rapidly than the standard deviation. § 5). It is not. the ratio found is 0-80. Hence 3-0 per cent. taken as the origin. again Chap. this method of calculation implies the assumption that all the values of within any one class-interval may be treated as if they were the mid-value of that interval. d = + 0-39 intervals. and in such cases the use of the mean deviation may be slightly preferable to that of the standard deviation. VI. 18. though in the case of a grouped distribution the difference in ease of calculation is not great. The mean deviation. It is a useful empirical rule for the student to remember that for symmetrical or only moderately asymmetrical distributions. but may be less affected if large and erratic deviations lying somewhat beyond the bulk of the distribution are liable to occur. a convenient magnitude for algebraical treatment . in some forms of experimental work.146 THEOKY OF STATISTICS. 5 and 9. the mean deviation of a distribution obtained by combining several others cannot in general be expressed in terms of the mean deviations of the component distributions. of course. Thus for the distribution of pauperism we have I'Ol mean deviation standard deviation 1-24 In the case of the distribution of male statures in the British" Example iii. This may happen. it will be seen. Thus in Example ii. § 15) 3-195 per cent. This is. student to find the correction to be applied if the values in each interval are treated as if they were evenly distributed over the interval. as in the case of the standard deviation. on the other hand. instead of concentrated at its centre (Question 7). and the +8-6. should be (Chap. for example. approaching the ideal forms of figs. but as a rule gives results of amply sufficient accuracy for practice if the class-interval be kept reasonably small have left it as an exercise to the {cf. for then there would be nine The observations only below the value chosen instead of 9*5. ii a value Q^ be determined such that three-quarters of all the values observed are less than Q^ and one-quarter only greater. But rigidly symmetrical. like median. like the median.Mi. ETC. If a 20. value Qj of the variable be determined of such magnitude that one-quarter of all the values observed are less than Q^ and threequarters greater. and Therefore falling half above it and half below. there are 38 observations. of all the observations. —MEASURES OF DISPERSION. we must substitute a We range of 7-|. are determined by simple arithmetical or by .times this measure. or better. is Q interquartile range — : Lower Upper and quartile Q^ quartile = 14s. a range of six times the standard deviation contains over 99 per cent. lid. The two quartiles and the median divide the observed values of the variable into four If Mi be the value of the median. VIII. 21. then Qj is termed the upper quartile. it is usual to take as the Q= Qs-Qi and termed the quartile deviation. In the case of a short series of ungrouped observations the quartiles are determined. 147 pointed out in § 10 that in distributions of the simple forms referred to. then Q-^ is termed the lower quartile. a symmetrical distribution — • Mi -Q^ = Q^.. The Quartile Deviation or Semi-interquartile Range. = ^^i = 12-5d the 22. the quartiles. which may be regarded as divided by the quartile. the semiit is not a measure of the deviation from any particular average the old name probable error should be confined to the theory of sampling (Chap. and 38/4 = 9'5: What is the lower quartile? The student may be tempted to take it halfway between the ninth and tenth observations from the bottom of the list but this would be wrong. If the mean deviation be employed as the measure of dispersion. XV. Q^= 16s. for instance. lOd. § 17). in classes of equal frequency. In the wage statistics of Example i. quartile must be taken as given by the tenth observation itself.. Similarly. by inspection. as and no distribution measure the difference may is be taken as a measure of dispersion. In the case of a grouped distribution. with great ease. we find of the standard deviation. generally speaking. §§ 15. i. and the three men picked out for measurement who stand in the centre and one-quarter from either end of the rank..148 THEORY OF STATISTICS. we have Thus for the Total frequency under 2-25 per cent. by measuring two individuals only. It follows from this ratio that a range of nine times the semiinterquartile range. 632 -=-4 =158 = 138 Difference Frequency in interval 2 '25 . the dispersion as well as the average stature of a group of men is required to be determined with the least possible expenditure of time. approximately.2 75 = = 20 89 per cent. gives the ratio 0-68. but the actual ratio. 5 and 9. Such uses are. is required to cover the same proportion of the total frequency (99 per cent. 24. If. the semi-interquartile range is usually about two-thirds Thus for Example ii. if necessary. Example ii. e. graphical interpolation (c/. of Example iii. they may be simply ranked in order of height. a little exceptional. the wage statistics in Example . 23.. For distributions approaching the ideal forms of figs. It is calculated. 16). This measure of dispersion may also be useful as a makeshift if the calculation of the standard deviation has been rendered difficult or impossible owing to the employment of an irregular classification of the frequency or of an indefinite terminal class. distribution of pauperism. Of the three measures of dispersion. the semi-interquartile range has the most clear and simple meaning. 0-61. and. like the median. The distribution The short series of statures. VII. Chap. „ Whence Gj = 2-25 + gg x Similarly 20 0-5 = 2-362 =4-130 we find §3 Hence It is Q=^^^ = 0-884 to „ left the student to check the value by graphical interpolation.g. however. does not diverge greatly. viz. and the quartiles may be found. or more) as a range of six times the standard deviation. could not be expected to give a result in very strict conformity with the rule.. g. quartile. It has. as Pearson has termed it. As already stated. for example. this method of measuring deviations. As was pointed out in 25. and deviations should be measured by their ratios to the geometric mean. the geometric mean seems the natural form of average to use. or the kilogramme as the unit of weight and the measure should accordingly be a mere number. mean deviation. It is a much more simple matter to allow for the influence of size by taking the ratio of the measure of absolute dispersion {e. — MEASURES OF DISPERSION. like the median. standard deviation. Measures of Relative Dispersion. Measures of Asymmetry orSkewness. but also deviations from the average. ref. Pearson has termed the quantity it shares with the median. unless simplicity of meaning is of primary importance.— VIII. Such a measure of relative dispersion is evidently a mere number. with its accompanying employment of the geometric mean. in comparing the relative variations of corresponding organs or characters in the two sexes the ratio of the quartile deviation to the median has also been suggested (Verschaeffelt. 26.e. W : the percentage ratio of the standard deviation to the arithmetic mean. Thus the difference between the deviations of the two quartiles on either side of the median indicates the existence of skewness. and its magnitude is independent of the units of measurement employed. if relative size is regarded as influencing not only the average. however. § 26. 8). VII. Such a measure of skewness should obviously be independent of the units in which we measure the variable e. or quartile deviation) to the average (mean or median) from which the deviations were measured. however. the coefficient of variation (ref. owing to the lack of algebraical convenience which it is obvious that the indeterminate. 149 semi-interquartile range as a measure of dispersion is not to be recommended. particularly for anthropometric work. the stone. Chap. some numerical measure of this character is desirable. Chapter VII. the skewness of the distribution of the weights of a given set of men should not be dependent on our choice of the pound. Further. ETC. § 14. and that the use of this measure of dispersion is undesirable in cases of the student should refer again to the discontinuous variation discussion of the similar disadvantage in the case of the median. but to measure the degree of skewness we should take the ratio of this — . been largely used in the past. has never come into general use. 7). — If we have to compare a series of distributions of varying degrees of asymmetry. and has used it.g. may become : — ti=100— I. or skewness. or 10 per cent.g. form a natural and convenient series of percentiles to use. The measure (12) is much more sensitive (c/.. and while its highest possible value is 2. for summarising such statistics as we have been considering. VII. but. and that is Pearson's measure (ref. . 9. in the case of frequency-distributions of the forms referred to. when the distribution is symmetrical . skewness to be positive if the longer tail of the distribution runs in the direction of ' high values of X. it may be noted that the numerator of the above fraction may. -^ — — — . may conclude this chapter by describing briefly a method that has been largely used in the past in lieu of the methods dealt with in Chapters VI. as a fact. The Method of Percentiles. (12) ^ ' evidently zero for a symmetrical distribution. and VII. of the observed values is mode and mean —We P P . then If a series of percentiles be determined for short intervals. . be replaced approximately by 3(mean . taking the interquartile range. the semiOur measure would then be.— 150 difference to — THEORY OF STATISTICS. Chap. or values of the variable which divide the total frequency into ten equal parts. . it would rarely in practice attain higher numerical values than ±1. skewness = (^^-^^l^i^^^ = «I±%^^ .median). A similar measure might be based on the mean deviations in excess and in defect of the mean. some quantity of the same dimensions.. e.g. (11) This would not be a bad measure if we were using the quartile deviation as a measure of dispersion its lowest value is zero. as they are sometimes termed) be ranged in order of magnitude. the value does not exceed unity for frequency-distributions resembling generally the ideal distributions As the mode is a difficult form of average to determine of fig.. There is. or value of the variable wliich has 50 per cent. If the values of the variable (variates. percentiles. by elementary methods. however. and a value of the variable be determined such that a percentage p of the total frequency lies below it and is termed a percentile. they suffice by themselves to show the general form This is Sir Francis Galton's method of of the distribution. 27. No upper limit to the ratio is apparent from the formula. only one generally recognised measure of skewness. The fifth decile. than (11) for moderate degrees of asymmetry. e. 5 per cent. § 20).. in which coincide. and the preceding paragraphs of this chapter. 100 -^ above. The deciles. 9) : skewness = This standard deviation -mode — mean=-t ^--. above). and ordinates are then erected to a horizontal base to represent on some scale these integrated frequencies a smooth curve is then drawn through the tops of the ordinates so obtained. showing the number of Districts of England and Wales in which the Pauperism on 1st January 1891 did not exceed any given percentage of the population (same data as Fig. is the median the two quartiles lie between the second and third and the seventh and eighth deciles respectively. : —Curve graphical curve used for obtaining the deciles by the graphical method in the case of the distribution of pauperism (Example ii. : 151 above it and 50 per cent.VIII. as the method is precisely § 22). the same as for median and quartiles (Chap. 28. they become indeterminate (c/. § 15. as will be seen from the figure. § 24). —MEASURES OF DISPERSION. It is hardly necessary to give an illustration of the former process. so as to give the total frequency not exceeding the upper limit of each class-interval. and above. This curve. rises slowly at first when the frequencies are small. below. the tsoo- 1300izoo- Percentage of the popiUaUon. may be determined either by arithmetical or by graphical interpolation. The deciles . 10. of course on a very much reduced scale. VII. 26. and finally turns over again and : becomes quite flat as the frequencies tail off to zero. The figures of the original table are added up step by step from the top. Fig. in/ receipt of relief Fig. then more rapidly as they increase. like the median and quartiles. 92) determination of Deciles. like the former constants. 26 shows. excluding the cases in which. p. The deciles. ETC. measured.) is (Example shown for comparison in fig. The method of percentiles has some advantages as a method of representation. 30. and erecting at each point so obtained a vertical proportional to the corresponding percentile. 26 redrawn so as to give the Pauperism corresponding to each grade Galton's "Ogive. 26 may be drawn in a different way by taking a horizontal base divided into ten or a hundred equal parts (grades. The ogive curve for the distribution of statures It will be noticed that the ogive curve does not bring out the asymmetry of the distribution of pauperism nearly so clearly as the frequencypolygon. 28. the capacity of the different boys in a class as regards some school subject cannot be directly iii." : Ogive form. 26. but it may not be very difficult for the master to . This gives the curve of fig. 29. and projecting the points so obtained horizontally across to the curve and then vertically down to the base. which was obtained by merely redrafting fig. 10. 27. p. An extension of the method to the treatment of non-measurable characters has also become of some importance. The curve of fig. fig. 27. as Sir Francis Galton has termed them). — The curve of Fig. For example. as the meaning of the various percentiles is so simple and readily understood.152 THEORY OF STiTISTICS. The construction is indicated on the figure for the fourth decile. The curve is of so-called O 10 20 30 10 so eO 70 so 90 100 10 20 30 W so 60 70 SO 90 lOO Grades Fib. may be readily obtained from such a curve by dividing the terminal ordinate into ten equal parts. 92. the value of which is approximately 2 '88 per cent. constants illustrative of certain aspects of the frequency-distribution. In all cases of published work. The method of ranks. grade. as the Given the application of other methods to the data is barred. to a sufficiently high degree of approximation. and thence to other constants. 28. : 153 arrange them in order of merit as regards this character if the boys are then " numbered up " in order. p. it is better if possible to obtain a numerical measure. 10 1 20 1 3f) 1 40 50 1 60 1 70 1 80 i 90 1 100 it- H 74 -7 -70 -68 -66 -6q -62 60- -60 O 40 SO GO 70 SO 90 Stature corresponding to each.— Ogive Curve for Stature. It should be noted that rank in this sense is not quite the same as grade if a boy is tenth. same data as Fig. from the 'bottom in a class of a hundred his grade is 9 "5. the reader can calculate not only the percentiles. But given only the percentiles. serious inconvenience may be caused. say. though. they are absolutely fundamental. in the case of a measurable character. with any degree of accuracy. therefore. the number of each boy. of course. tor aduit males in the British Isles. the remarks in § 12. he cannot pass back to the frequency-distribution.a Vm. . ETC. or his rank. 89. but any form of average or measure of dispersion that has yet been proposed. the percentiles are used not merely as . O 7t—\ 12. serves as some sort of index to his capacity {cf. or at least so few of them as the nine deciles. But if. 10 ZO 30 100 FlO. 6. —MEASURES OF DISPERSION. grades. or percentiles in such a case may be a very serviceable auxiliary.. but the method is in principle the same with that of grades or percentiles). but entirely to replace the table giving the frequencydistribution. table showing the frequency-distribution. the figures of the frequency-distribution should be given . (1) Fbohnbb. Standard-deviation. p. Pieree Simon.KSOHAEFFELT. ' Standard Deviation. have given a direct method that seems the simplest and best for the elementary student. 1906. kgl. Karl. 71. Kakl.. (4th Series). Stalistios Galton. bot. Ges. with Remarks on the vol. Series A. 80. xviii." £en deutsch. Law (6) of Frequency of Error. 454. Soc. London. Series A. including ftnartiles. pp. Ixxviii. 33-46. Palin. 1. T. Mag. vol.. Kakl. olxxxv.) Trachtenbekg. Relative Dispersion. p." p. Ges.. Stat. "A Note on a Property Boy. & E.) Calculation of Mean. 343. however. . I. d. (Proof that the mean deviation is a (4) minimum when taken about the median.) — 154 THEORY OF STATISTICS. (7) Peakson. clxxxvi. p. General. (A very simple proof of Method of (5) Percentiles. 350-55. 276-7. Boy. p. Trans. E. Frequency cu7-ves and Correlation C.. Series A. " pp. (Introduction of term. M. Layton.. sacks. On the Disseotionof Asymmetrical Frequency-curves). Verwendung und Veiallgemeinerung. xlix. of the Abh. 1895.) Mean (3) Deviation. Phil. vol. pp. A process of successive summation that has some advantages can.. 370. Boy. Soc. etc.) of the Median. (Introduction of the term "standard deviation. 1818. 1894. 1878. Marquis de. math. vol. Leipzig. be used instead. 1875. Heredity.. " Regression.. " Skew Variation in Homogeneous Material. d.. The student will find a convenient description with illustrations in (10) Eldbrton. 1915. (also numbered vol." Phil. with the quartile deviation as the measure of dispersion.) " Ueber graduelle Variabilitat von pflanzlichen (8) Vi'. W." Abh. Fkancis. Natural Inheritance Macmillan. p. vol. (2) Peakson. Soc. Bd. and Panmixia. d.. " by Intercomparison. or of the General Moments of a Grouped Distribution. We . Galton. xi. Fkanois." Skewness. (9) Peakson. 1894." PAiZ. Classe) . Laplace. Th^orie analytique des probabiliUa: 2'"' supplement." Jour. Eigensohaften. . "Contributions to the Mathematical Theory of Evolution (i. dessen Bestimmung. (Introduction of " coefficient of variation. (The method of percentiles is used throughout. Wiisenschaften. xii. Trans. clxxxvii. Trans. REFERENCES. " Ueberden AusgangswerthderkleinstenAbweiohungssumine. vol. G. Soc. the same property. p.-phys. 1896. Boy. 253. 1889." Phil. . Chap. .. 1. continuing the work from the stage reached for Qu. 1. —MEASURES OF DISPERSION. Cliap. VII.VIII. 155 EXERCISES. VI. Verify the following from the data of Table VI. ETC. ) Show that if deviations are small compared with the mean. show that if deviations are small compared with the mean. and above it n^ . 564. in that interval Wj. (W. Find the coiTection to be applied to this sum. we have approximately the relation : G . . loc. Gesellschaft d. " Ueber Mittelwerthe..) Similarly. (Scheibner.. 1873. ref. p. found by the use of this formula. 1899) as an empirical one. grouped frequency-distribution. S. cited by Fechner.G^=a^. is found to be S.156 (or median). on the assumption that the observations are evenly distributed over each class-interval." Berichte der kgl. VII. in a THEOKY OF STATISTICS. and <r the standard deviation and consequently to the same degree of approximation M^ . the 'second form of the relation is given by Duncker (Die Melhode der VariationsstatistiTc Leipzig. Wissenschaften. so that (x/M)^ may be neglected in comparison with x/M. 2 of Chap. do not dill'er from the values found by the simpler method of §§ 16 and 17 in the second place of decimals. and the distance of the mean (or median) from the arbitrary origin to be d. in order to reduce it to the mean (or median) as origin. 9. G is the geometric mean. Scheibrier. sachsischen 8. cit. the arithmetic mean. G=M(l-ii). Qu. Show that the values of the mean deviation (from the mean and from the median respectively) for Example ii. we have approximately where : M H being the harmonic mean. Take the number of observations below the interval containing the mean (or median) to be n^. statures of fathers and their sons (British). Table II. The general problem 8-9. The 4-5.. Table III. and the standard-deviations of arrays 15-16. V. the methods of classification employed in the preceding chapters may be applied to both.. The correlation coefficient. fertility of mothers and their daughters (British Table V. and a table of double entry or contingency-table (Chap. The line of means of rows and the line of means of columns : their relative positions in the case of independence and of varying degi'ees of correlation 10-14. "rows" are distinguished only by the accidental circumstance 157 .. two measurements on a shell (Pecten). and the consideration of the relations between them. Table IV. We have now to proceed to the case of two variables.. exhibiting the frequencies of pairs of values lying within given class-intervals. Numerical calculations Certain points to be 17.) be formed. Table VI. the regressions. districts of England and Wales. and the more important constants that may be calculated to describe certain characters of such distributions. If the corresponding values of two variables be noted together. and the total numbers of registration. births. — — — — — In chapters VI. 1.— CHAPTER IX.. the rate of discount and the ratio of reserves peerage). 1 -3.-VIII. Six such tables are given below as illustrations the following for variables Table I. Similarly. the proportion of to deposits in American banks. we considered the frequency-distribuof a single variable.. ages of husbands and wives in England and Wales in 1901. remembered in calculating and using the coefficient. The correlation surface correlation table and its formation 6-7. 2. every column gives the frequency-distribution of the second variable for cases in which the value of the first variable lies within the As " columns " and limits stated at the head of the column. tion : — male to total births. in the Each row in such a table gives the frequency-distribution of the first variable for cases in which the second variable lies within the limits stated on the left of the row. COEEELATION. a » a s 3 §4J s . ^ a si S "-I ^ .THEORY OF STATISTICS. 159 1 .IX.TION. —COREELA. 160 THEORY OF STATISTICS. tS f< . 0 J S ST. —CORRELATION. 161 Si4 3 o E-1 S - 8.-. B £ > fe>l fe-s - § I 8^ I -^ ..IX. fO *> r-H o I 9^-^ . • QJ ti Ph 9 ..» -5 il^l 23 ^ s y CQ .162 00 THEORY OF STATISTICS. « rrt 3 ill g o r . o 5> CM i-i ^8 4 *^ PC . IX. 163 1 . — CORRELATION. distribution of frequency for two variables may be by a surface or solid in the same way as the frequencydistribution of a single variable may be represented by a plane 4. the word array has been suggested as a convenient term to denote either a row or a column. and that of the son 62-5 in. The difficulty as to the intermediate observations values of the variables corresponding to divisions between class-intervals will be met in the same way as before if the value of one variable alone be intermediate. and thus quarters as well as halves may occur in the table. If both values of the pair be intermediates. as regards the choice of magnitude and position of class-intervals.164 of the THEORY OF STATISTICS. and the difference has no statistical significance. We may imagine the surface to be obtained by erecting . (Pearson. each pack can then be run through to see that no card has been mis-sorted. 60-5-61 "5. 6. Workers will generally form — — : own methods for entering such fractional frequencies during the process of compiling. in Table III. 3.. In this case the statures of fathers and sons were measured to the nearest quarterinch and subsequently grouped by 1-inch intervals a pair iii which the recorded stature of the father is 60'5 in. in one array are associated with values of Y between the limits Y^ — h and F„ + 8.^r. the observation must be divided between /omj* adjacent compartments. each pair of recorded values may be entered on a separate card and these dealt into little packs on a board ruled in squares. one set running vertically and the other horizontally.) The special kind of contingency tables with which we are now concerned are called correlation tables. to distinguish them from tables based on unmeasured qualities and so forth. The represented figure. in such a way as to avoid their fractional frequencies. When these have been fised. If the values of X. If facility of checking be of great importance. and the rows 61'5-62-5. is accordingly entered as 0'25 to each of the four compartments under the columns 59-5-60-5. Y„ may be termed the type of the array. It is best to choose the limits of class-intervals. the four dots should be placed in the position of the four points of the X and joined when complete. Nothing need be added to what was said in Chapter VI. the table is readily compiled by taking a large sheet ruled with rows and columns properly headed in the same way as the final table and entering a dot. or into a divided tray . as. but one convenient method is to use a small x to denote a unit and a dot for a quarter . ref. the unit of frequency being divided between two adjacent compartments. e. where possible. or small cross in the corresponding compartment for each pair of recorded observations. stroke. 62'5-63'5. somewhat truncated. It is impossible. scale. to group the majority of frequency-surfaces. and erecting on each square a rod of wood of height proportionate to the frequency. The simplest ideal type is one in which every section of the surface is a symmetrical curve the first type of Chap. and like the disthe surface is consequently asymmetrical. to the same scale. and fig. may be drawn from anthropometry. and the surface is symmetrical round the vertical through the maximum. if not all. VI. however. The maximum frequency occurs in the centre of the whole distribution. equal frequencies occurring at The next equal distances from the mode on opposite sides. 30 the distribution of Table III. 89). 165 at the centre of every compartment of the correlation-table a vertical of length proportionate to the frequency in that compartment. 5. under a few simple types the forms are too varied. will serve as an example. Most. just as the area of the frequency-curve over any interval of the base-line gives Models of the frequency of observations within that interval. or by marking out a baseboard in squares corresponding to the compartments of the correlation-table. actual distributions may be constructed by drawing the frequencydistributions for all arrays of the one variable. of the distributions of arrays are asymmetrical. and joining up the tops of the verticals. 29 shows the ideal form of the surface. of course. which approximates to the same the difierence in steepness is. This form is fairly common. in the same way as the frequency-curves. Fig. 9. tribution of fig. (fig. frequency-solid over any area drawn on its base gives the frequency of pairs of values falling within that area. 5.. this is a very on sheets : — rare form of distribution in economic statistics. and erecting the cards vertically on a base-board at equal distances apart. merely a matter of type. Like the symmetrical distribution for the single variable. The total distributions and the distributions of the majority of the arrays illustrations — — : — . p. anthropometry. If the compartments were made smaller and smaller while the class- frequencies remained finite. simplest type of surface corresponds to the second type of frequency-curve the moderately asymmetrical. 92 and the maximum does not lie in the centre of the distribution. meteorology. The data of Table II. etc. the irregular figure so obtained would approximate more and more closely towards a continuous curved corresponding to the frequencysurface a frequency-surface The volume of the curves for single variables of Chapter VI.IX. — — of cardboard. but approximate . Such solid representations of frequency -distributions for two variables are sometimes termed stereograms. and illustrations might be drawn from a variety of sources economics. — COKRKLATION. p. 166 THEOKY OF STATISTICS. . ory of Statistics. 29 ^e 60 no.(daWTable HI. 30. «% Sir. ] «•* «<? <ii.uency Stature of Surface for Stature of Father and ?.-Fre.) the : . . ] "^^^S Fie.Theory of Statistics. 31. — Frequency Surface for the Rate of Discount and Rati . . The distribution of frequency is very characteristic. It is clear that such tables may be treated by any of the methods discussed in Chapter V. the distributions of all parallel arrays are similar (Chap. the frequency in each compartment being represented by a square pillar. A few examples should be worked as exercises by the student (Question 3). high values of the one variable show any tendency to be associated with high (or with low) values of the other.. on the contrary.. relate solely to averages the most important and fundamental question is whether. the more central rows being nearly symmetrical. and slowly in the direction of old age. In applying any of these methods. and VIII. Outside these two forms. or the coefficient of contingency can be calculated (§§ 5-8). In general they are not the same. it seems impossible to delimit empirically any simple types. The distribution may be investigated in detail by such methods as those of § 4. and if so. V..g. Of the two constants. The classification should. and it is not necessary to retain the constancy of the class-interval. III. Fig. But the coefficient of contingency merely tells us whether. Tables V. § 13). on an average. 31 gives a graphical representation of the former by the method corresponding to the histogram of Chapter VI. and quite different from that of any of the Tables I. however.".IX. it is desirable to use a coarser classification than is suited to the methods to be presently discussed. two variables are independent. can be If the applied to the arrays as well as to the total distributions. are given simply as illustrations of two very divergent forms. however formed. must be the same. the more important. which are applicable to all contingency-tables. in general. and VI. means and standard deviations.' how closely. and much more information than this can be obtained from the correlationtable. the two variables are related.. or tested for. II. seeing that the measures of Chapters VII.. —CORRELATION. the skewness being positive for the rows at the top of the table (the mode being lower than the mean). The maximum frequency lies towards the upper end of the table in the compartment under the row and column headed " 30 . hence their averages and dispersions. 6. isotropy (§ 11). The frequency falls off very rapidly towards the lower ages. e. we also desire to know how great a divergence of the one variable from its average value is associated : . the mean is. If possible. be arranged simply with a view to avoiding many scattered units or very small frequencies. or IV. 7. and negative for the rows at the foot. 1 67 are asymmetrical. and the relation between the mean or standard deviation of the array and its type requires investigation. and our attention will for the present be conThe majority of the questions of practical statistics fined to it. 12. the 00 2 37ki 4- S 6X 4 6 6 7 8 <9. OF be the scales of the two values of means of arrays. Let M-^ be the mean value etc. of X. being successive class-intervals. 8. and the means of arrays must lie on the vertical and horizontal lines M^M.e. and to obtain some idea as to the closeness with which this relation is usually fulfilled. with a unit Sivergence of the other.168 THEORY OF STATISTICS. Suppose a diagram (fig. M^M. the distributions of frequency in aU parallel arrays are similar (Chap. i. the scales at the head and side of the table. and M^ the mean value of T. variables. 32) to be drawn representing the Lfet OX. 01. . § 13).. If the two variables be absolutely independent. V. 36-8. is to find formulse or equations which will suffice to describe approximately these curves. a purpose which will be served very fairly by fitting a in 2 JM. 32. 33. 33. nor coincident as in fig. —CORRELATION. 4 5 6 Fig. and further. (Of. 9. plotting the points representing means of arrays on a diagram like those of figures 36-38. 169 fig. it is found either (1) that the means of arrays lie very approximately round straight lines. to know merely whether on an average high values of the one variable show any tendency to be associated with high or with low values of the other. figs. fit such lines by a simple graphical method. The complete problem of the statistician. like that of the physicist. 36-38. straight line . by means of a stretched black thread shifted about till it appeared to run as near as — — We . but. and " fitting " lines to them. in a large number of cases. say. In the general case this may be a difficult problem.) and they are relatively more frequent than might be supposed the fitting of straight lines to the means of arrays determines might all the most important characters of the distribution. it often suffices. in the first place.IX. but standing at an acute angle with one another as Sli (means of rows) and CC (means of columns) in figs. or (2) that they lie so irregularly (possibly owing only to paucity of observations) that the real nature of the curve is not clearly indicated. and a straight line will do almost as well In such cases as any more elaborate curve. as already pointed out. since 2(ray) = 0. But such a method is hardly satismore especially if the points are somewhat scattered it leaves too touch room for guesswork.-^. the horizontal through M. and may accordingly be termed the mean of the whole distribution. be Sj. 10. Consider the simplest case in which the means of rows lie as exactly on a straight line (fig. Let i/^ be the mean value of Y. . and let cut AI^x. ho\lever irregularly the means may lie. i. let the slope of to the vertical. 34). the tangent of the angle M-^MR or ratio of Id to IM. for a given distribution. Some method is clearly required which will enable the observer to determine equations to the two lines factory. Then it may be shown that the vertical through must cut in M-^. S(a. it remains only to determine RR RR RR M OX 5W M RR .5jy. Knowing that passes through M. and therefore for the whole table. = m. in M.170 THEORY OF STATISTICS.e.) = 6j2(«y) = 0. Mx be denoted by x and y^ Then for any one row of type y in which the number of observations is n. and different observers obtain very diflferent results. the mean of X. and let deviations from My. simply and definitely as he can calculate the means and standard deviations. might be to all the points. For. i/j must therefore be the mean of X. (3) are usually (3) These two equations slightly different form. . %{x . The meaning of the above expressions when the means of rows and columns do not lie exactly on straight lines is very readily obtained. RB and GO— y=r—.e. .2(l . in- 171 by This may conveniently be done p of all pairs of associated deviations terms of the mean product z and y.6iy2.— IX. say + S)— then . (6) These equations may. of course.y)2 If Jj = iW. = r-'' .2. *2 = f. . Therefore for the whole table ^1 = ^2 • • (2) Similarly. — CORKELATION. if CC and 62 its slope to be the line on which lie the means of columns the horizontal. rs/sM. (4) Then 6.b^yf = iTo-. . (5) Or we may write the equations a. be expressed. (7) be given any other value.x . X 5(x-ii. (2) and r= written in a Let -^ .y . if desired. — p=^{^y) For any one row we have S(«2/) (1) = yS(a.»-2 + 32).(l-r2) (r . If the values of x and b^y be noted for all pairs of associated deviations.) = ra. in terms of the absolute values of the variables and Y instead of the deviations x and y. we have for the sum of the squares of the differences. i.=r-'' to 4. giving b^ its value from (5). 11. = i . Therefore for the whole table. If 62 be given the value r —".yy^^nsJ) + ^nd^).with respect to the line GG.35). %{x . . is n. Similar theorems hold good. the left-hand side being a minimum.. Hence we may regard the equations (6) as being.b^yf = nsj^ + n. This is necessarily greater than the value (7) . of course. But the first of the two sums on the right is unaffected by the slope or position of JiR.cP. the deviation of the mean of the row from and the standard deviation is s^. the second sum on the right must be a minimum also.. That is to say. SB ^x-b.b^yf has the lowest possible value when 6j is put equal to rcrja-y. 35). when 6j is put equal to r crja-j.b^. hence 2(a. for any one row in which the number of observations is d (fig. hence. the sum of the sqvures of the distances of the row-means from RR. either (a) equations for estimating each individual x its from associated y (and y from its associated x) in such a way . Further. each multiplied by the correspovding frequency. and also S(»i«^) (fig. is the lowest possible.172 THEORY OF STATISTICS. %(x .yy is a minimum.. IX. —CORRELATION. 173 as to least possible the sum of the squares of the errors of estimate the . make Aff& of Wife sv^ 30 4C 50 60 10 SO so \ 10 ^50 ^60 7*7 SO . when every mean is counted once for each observation on which it is based. or {b) equations for estimating the mean of the x's associated with a given type of y (and the meoM of the p's associated with a given type of x) in such a way as to make the sum of the squares of the errors of estimate the least possible. for the sum of the series of squares in equation (7) is then zero and the sum of a series of squares cannot be negative. value cannot exceed ±1. associated with large values of y.).174 are THEORY OF STATISTICS. negative if small values of x are associated with The numerical large values of y and conversely (as in Table V. If r= ±1. this : 6Z 64 63 R Father^ statLWe 66 68 70 72 66 C e? 5<59 IS «0 7/ 73 75 . -IV. it follows that all the observed pairs of deviations are subject to the relation xjy = a-Ja.). and conversely (as in Tables I. upward or downward. r being very small ( .. ) means of rows shown and means of columns by crosses : r= +0'21. III. 38. 10 R . + 0'51. Correlation between number of a Mother's Children' and her Daughter's Children (Table IV. 36. and tig. 3 trend. and endeavour to accustom himself to estimating the value of r from the general '^ appearance of the table. Figs. or simply the regressions. Fig. 6j being the regression of x on y. but the coefhcient of contingency G (for grouping of qu.. The student Nianber of Mother's Childreiv. however. and b^ being . and IV. — number of by oivcles should study such tables and diagrams closely. 39 will serve as an illustration of a case in which the variables are almost uncorrelated but by no means independent. to right or to left. 38 are drawn from the data of Tables II. 3) 0'47. the correlation being positive in each case. 37. The two quantities are termed the coefacients of regression. —CORRELATION. conveniently Table VI. or deviation in x corresponding on the average to a unit change in the type of y.. for which r has the values +0-91. 175 any definite Two variables for which spoken of as uncorrelated. 13. r is zero are.IX.0'014). and +0-21 respectively. Whilst the coefficient of correlation is always a pure number. lirSis -per 1000 TnrOis.-IV. : .176 THEORY OF STATISTICS. Since r is in Tables I. similarly the regression of y on x. the regressions are only pure numbers if the two variables have the same dimensions. ofMale. and consequently on the units in which x and y are measured. as their magnitudes depend on the ratio of o-Ja-y. Proportion. They are both necessarily of the same sign (the sign of r). I . " and equations (6) '' — the " regression equations. from the mean of all sons " towards the general mean. however. b^. s» is the standard deviation of the a. Chap. however. to a satisfactory degree of approximation. a case to which the distribution of Table III." The expressions " characteristic lines. — COREELATION. as in Example i. the cases of means and standard deviations. 8) would perhaps be better. = !. to assume that the regression is linear.e. b^. the work is quite straightforward. and 0'52 i." 15. where the regression is truly linear and the standard deviations of all parallel arrays are equal. or sufficient to justify the formation of a correlation-table. by straight lines. that such linearity extends beyond the limits of observation.b-^. and Sy as an average standard deviation of a column about GO. The variables are (I) — X— 12 ..x). the only new expression that has to be calculated in order to determine r. the form of the arithmetic is slightly different according as the observations are few and ungrouped. J^-'T^ «!. as in Tables V. Where the actual means of arrays appear to be given. Hence s^ and s„ are sometimes termed the " standard deviations of arrays. they step back or " regress may be termed the " ratio of regression.b^. In the first case. In an ideal case. Table VII.— . § 19 (3) )." " characteristic equations " (Yule. the estimated Example i. below. ref. or " regression " towards a more the idea of a " stepping back obviously so where or less stationary mean is quite inapplicable the variables are different in kind.-array and Sy the standard deviation of the y-array (c/.y). vV1 - '-" are of considerable importance. Proceeding now to the arithmetical work. It follows from (7) that s^ is the standard deviation of {x ." In general. and similarly Sy is the standard deviation of {y . may also be regarded as a kind of average standard deviation of a row about RR. 14.. are generally termed the " lines of regression. 177 Hence the sons of fathers of deviation x from the mean of all fathers have an average deviation of only 0*52a. and VI. ""i. Hence we may regard s^ and Sy as the standard errors (root mean square errors) made in estimating x from y and y from x by the respective characteristic relations x^b^y Sj y = b^. we may say It is not safe. is the product sum ^(xy) or the mean product^. and the term " coefficient of regression " should be regarded simply RB and GC as a convenient name for the coefficients 6j and b^. X. is a rough approximation. IX. As in 8„ and Sj.x. The two standard deviations »i = <'. 1. . Table VII.178 THEOKY OF STATISTICS. Theory of Ooeeblatiok: Example i. 15-94) and y hj (Y. the percentage of the population Chap. VIII.. A penny is rather a small unit in which to measure deviations in the average earnings. 17-53. (2) in receipt of Poor-law relief on the 1st January 1891 in each of the same unions {B return). The means of each of the variables are calculated in the ordinary way. The 6„ = r^=-0-50. making 0-^. The standard errors made in using these equations to estimate earnings from pauperism and pauperism from earnings respectively are (T.y)= -666'04: therefore the mean product ^ = 2(a. replacing x by (X. = l-28s. (b) the units being Is. for the pauperism. and 20-5x1 -29 There is therefore a well-marked relation exhibited by these data between the earnings of agricultural labourers in a district and the percentage of the population in receipt of Poor-law relief. every x is found as before (Chap. VIII.IX. —CORRELATION. and 5i=r^=-0-87.3 -67) and simplifying. 136). p. a!=-0-87y y=-0-50a. Jl -r^= 0-97 per cent.= 1-71. For practical purposes it is more convenient to express the equations in terms of the absolute values of the variables rather than the deviations: therefore. for the earnings and 1 per cent.. to a shilling. added up separately and the algebraic sum of the totals gives S(a. so for the regressions we may alter the unit of a.y)/iV= : . . (Tj v/r^= 15-4d. in terms of these units. regression equations are therefore. p. multiplied by the associated y and the product entered in column These columns are then 8 or column 9 according to its sign. Z— 179 average weekly earnings of agricultural labourers in 38 English Poor-law unions 6t an agricultural type (the data of Example i. These deviations are then squared (columns 6 and 7) and the standard deviations Finally.. 137). we have X= 19-13 -0-877 r=ll-64-0-50X . and then the deviations x and y from the mean are written down (columns 4 and 5) care must be taken to give each deviation the correct sign.. (a) . in earnings another means on the average a A natural confall of 1 in the percentage in receipt of relief. Which is the correct interpretation of the facts? The above in passing The equpt^on from one district to : 1Z 13 „i4- 15 le -11 18 19 zo m -^ (5.0 . but such a Equation (a) indicates. in earnings this might mean that the giving of relief tends to depress wages.180 THEORY OF STATISTICS. or 10|d. for instance.earnings in diminishing the necessity for relief. that every rise of a unit in the percentage relieved corresponds to a fall of 087 shillings. (b) tells us therefore that a rise of 2s. clusion would be that this means a direct effect of the higher . conclusion cannot be accepted oflfhand. That is. —COREELATION. . and drawing the lines. (1) the product-sum is calculated in the first instance with respect to an arbitrary origin. It will sometimes give a sensible correction even for work in the form of . Thus is determined from (a) = 19-13 and 7=6. the procedure is of precisely the same kind as was used in the calculation of a standard deviation. The diagram gives a very clear idea of the distribution . clearly the regression is as nearly linear as may be with so very scattered a distribution. determined from (6) by the points 12. (2^ the arbitrary origin is taken at the centre of a class-interval . being the sums of deviations from the mean. Then EJi and CO it X X= X= i = x +^^ rj = y + ^. this correction must be used. The most exceptional districts are Brixworth and St Neots with rather low earnings but very low pauperism. Marking in these poiats. When a classified correlation-table is to be dealt with.— IX. SB 181 doing this given by the regression equations (a) and (6). 16. will be found to meet in the mean. In is as well to determine a point at each end of both lines.Z=13-91: GC is by the points 7=0. using p' to denote the meanproduct for the arbitrary origin. 7=367. 7 = 1"14. since the second and third sums on the right vanish. and let 1^ be the co-ordinates of the mean. and there are no very exceptional observations. Let deviations from the arbitrary origin be denoted by It/. 7=5-64 and X=21.^. or bringing 2(a:y) to the left. and is afterwards reduced to the value it would have with respect to the mean . and Glendale and Wigton with the highest earnings but a pauperism well above the lowest over 2 per cent. (3) the class-interval is treated as the unit of measurement throughout the arithmetic. In any case where the origin from which deviations have been measured is not the mean. the same artifices being used to shorten the work. That is to say. and then to check the work by seeing that they meet in the mean of the whole distribution. P =P' . they 15-94. in terms of mean-products. summing. Therefore. l'29 intervals = 6 '45 per cent. The frequencies are then collected as shown in columns 2 and 3 of Table VIIIa.. in the negative 14 -1-6 = 20: for ^17 = 2. column 4 of Table VIII. treating the class-interval as the unit these are the figures in To ^ compartment : heavy type Table VIII. being grouped according to the value and sign of ^t. 10 + 4-5H-1 -1-4-5 = 20 in the positive quadrants. The following are the values found for the constants relief in 235 unions of Wales (2) 7. 182 Example and in that case. they negative. No. The figures refer to a one-day count (1st August 1890. or 3'5. . of calculating the correlation cois of As the arithmetical process efficient great importance. and so on. is first written in every calculate %{iri}. In making these entries the sign of the product may be neglected. for Y at the centre of the fourth row. doors" (in their own homes) to one "indoors" (in the workhouse). vol. the value of of the table against the corresponding frequency. The two variables are (1) X. the total frequency in the positive quadrants is 13-)= 21-5. or at 17 '5 per cent. vi. 613. should first of all be checked to see that no frequency has been dropped.— THEORY OF STATISTICS. p. 5-h2-)-l-|-3'5 = ll'5 in the When columns 2 and 3 are completed. M^ = fj= -I-0-36 intervals or units.. My = 3'86. (Economic Journal. Example ii.. 36. and the table is one of a series that were drawn up with the view to discussing the influence of administrative methods on pauperism. 1890). (the row and column for which being careful not to count twice the frequency in the compartment common to the two this grand total must clearly be equal to the total number of observations Jff. or 235 in the present case. the percentage of males over 65 years of age in receipt of Poor-law from a grouped table — a mainly rural character in England and the numbers of persons given relief " out.) was taken at the centre of the fourth The arbitrary origin for column.. Table "VIII.0'77 per cent.. the ratio of X . the second biological. of course. the first economic. whence 16-73 per cent. whence (r„ = 2-98 units. which may be readily done by adding together the totals of these two columns together with the frequency in row 4 and = 0). i. 1896. negative in the two others.. The algebraic sum of the frequencies in each line of columns 2 and 3 is in ^ . Thus for 8 '5 fi7 = l. but it must be remembered that this sign will be positive in the upper left-hand and lower righthand quadrants. the standard deviations will also require reduction to the mean. we give two illustrations. of the single distributions : 1= i7-j = -0-15-32 intervals= . — ) IX. 183 Table VIII. (The Frequencies are the figures printed in ordinary type. The numbers in heavy type are the Deviation-Products (|r)). Thbort op Cokkblation : Example ii. Number relieved Outdoors to One Indoors. Old-age Pauperism and Proportion of Out-relief. —CORRELATION. . 1. Table VIIIa. .184 THEORY OF STATISTICS. Calculation of the Pkoduct Sum S(|t)). the standard error from Y being is (Tx made in using the equation Jl -r^= 6'07. Jf. telling us rise of 1 in the ratio of to the numbers relieved in the workhouse corresponds on an average to a rise of 0'74 in the percentage in receipt of relief. 185 a. according to the magnitude and sign of fiy. in columns 2 and 3 of Table IXa. The result is such as to create a presumption in favour of the view that the giving of out-relief tends to increase the numbers relieved.— IX. as the equation of greatest practical interest.) The two variables are (1) X. The following are the values found for the constants of the single distributions : X J= -1'058 ir»= interTals= - 6*3 mm. actual. ^ . Example iii. for estimating X that. '707 2 "828 intervals = .. the actual magnification being 24 1. Table IX. = 987 mm. The entries in these two columns are next checked by adding to the totals the frequency in the row and column for which is zero. 18'5 mm. = 4'11 mm. and seeing that it gives the total number of observations The numbers in column 4 are given by deducting the (266). on drawing. entries in column 3 from those in column 2. The mother-frond was measured when the daughter-frond separated from it. and this can be taken as a working hypothesis for further investigation.. actual. mm. (Unpublished data . actual. The totals so obtained are multiplied by iy) (column 1) and the products entered values" of The table as before. measurements by G. and drawing a diagram like figs. and check both by calculating the means of the principal rows and columns.. —CORRELATION. 4 '32 mm. This we pass from one district to another. on drawing= every compartment of the fij are entered in and the frequencies then collected. 0771 mm. and 38. U. The units of length in the tabulated measurements are millimetres on the drawings. and = 0'74y. (2) Y. 37. + 0-34 X 6-45/2-98 = 0'74. mm. or the regression equation accordingly X= 13-9 + 0-747. the length of the daughter-frond. mm. 17'0 1-2 on drawing= ^=-0'203 By— 3"084 == Jllf2=103-8 = . actual. Yule. 36. The student should work out the second regression equation. on drawing. Measures were taken from camera drawings made with the Zeiss-Abbd camera under a low power. mm. the length of a motherfrond of duckweed {Lemna minor) . a the numbers relieved in their own homes — : The arbitrary origin for both and Y was taken at 105 mm. and the daughter-frond when its first daughter-frond separated. 1.186 THEORY OF STATISTICS. . Table IXa. Correlation letween (1) daughter-frond. in Lemma minor.)). Yule. 60-66 { . The numbers in heavy type are the deviation-products (Jt. Theory of Cokeblation : Illustration iii. U.— Theory of Statistics.] Table IX. [Unpublished data .] (The freque type. G. . as such an alteration will affect both standard deviations equally).) Finally. like all other statistical measures. 8). and to check the whole work by a diagram showing the lines of regression and the means of arrays for the central portion of the table. 17. Hence the regression equation giving the average actual length (in millimetres) is of daughter-fronds for mother-fronds of actual length X 7=l-48-fO-69X again leave it to the student to work out the second regression equation giving the average length of mother-fronds for daughter-fronds of length Y. for example. but will find a series of positive and to find r = negative values centring round 0. we are very unlikely ever absolutely. If we write on cards a series of pairs of strictly independent values of x and y and then work out the correlation coefficient for samples of. 40 or 50 cards taken at random. § 15. in the value of r^p/a-. The student should be careful to remember the following points in working (1) To give p' and $rj their correct signs in finding the true mean deviation-product p. 900. XVII. the work will be wrong unless means and standard deviations are expressed '{1 the same units. III. as above. (See Chap. (2) To express cr^ and a-y in terms of the class-interval as a unit. it must always be remembered that correlation coefficients. r= +0'3 may similarly be a mere fluctuation of If sampling. —CORRELATION. though again an infrequent one.— IX. a value of r= ± O'l might occur as a fluctuation of sampling of the same degree of infrequency. the coefficient of correlation or the coefficient of.contingency. a-y. No great stress can therefore be laid on small.and daughter-fronds. if iV=100. for these are the units in terms of which p has been calculated. merely a chance result (though a very infrequent one) . (3) To use the proper units for the standard deviations (not class-intervals in general) in calculating the coefficients of regression in forming the regression equation in terms of the absolute values of the variables. e. if iV"=36. The student must therefore be careful in interpreting his coefficients. a value of r= +0*5 may be small. Further.g. §§ 7. say. it should be borne in mind that any coefficient. Chap. gives : : We N= . 187 The regression of daughter-frond on mother-frond is 0'69 (a value which will not be altered by altering the units of measurement for both mother. are subject to fluctuations of sampling (cf. or even on moderately large. values of r as indicating a true correlation if the numbers of observations be For instance. determining his coefficient (Qalton's function. p. vol. Galton. vol." Jour. and based on the use of the median the method involves the use of trial and error to some extent. Edgeworth and A.. " Analyse matWmatique sur les : probability des erreurs de situation d'un point. 1886.). and Phil. Y. Boy. "Regression towards Mediocrity in Hereditary Stature.. "Family Likeness in Stature.. 1846. Fkancis. or the original the correlation table. p... 255. their Measurement. XVI. p. Mimoires prdse^itis par divers (2) Galtox. Boy. Boy." Proc.. 812.) (10) Edgeworth." Phil. 222. Soc. F. U. clxxxvii. but not a single symbol for a coefficient of correlation. data if no correlation table has been compiled. des Sciences savants. Feancis. xv. Soc.— 188 THEORY OF STATISTICS." Phil. 1892. Soc.. 246. and (4). 5th Series. correlation coefficient from to 1 by steps of a twelfth. t. Edgeworth. Edgeworth developed the theoretical side further in (5). only a part of the information afforded by the original data or The correlation table itself. p. II« s^rie." PAii. Joiir. (7) Yule.) to memoirs on (he theory of non-linear regression are given end of Chapter X. Ix. (Tables and diagrams illustrating the meaning of values of the 1907. unless considerations of space or of expense absolutely preclude the adoption of such a course. Anthrop. Correlated "On Averages.. Bowley. Karl. p. ix. variables differing entirely from that described in the preceding chapter. and Pearson introduced the product-sum formula in (6) both memoirs being vvritten on the assumption of a "normal" distribution of frequency (c/.. Boy.. Chap. in the case of Skew Correlation." Mem. (8) (9) Yule. xxiv. Mag. xlv. a. vol. as it was termed at first) graphically..) being assumed. the so-called " normal distribution " (Chap. in (2). Y." Jour. vol. Reference . xl. . p.. Daebishirb. (3) (4) (5) (6) Francis. 1897. 1896. Soc. vol. 1897. : may also 1902. be made here to F. p. and Proc. 42. D. In (1) Brayais introduced the product-sum. Boy. The method used in the preceding chapter — (1) is based on (7) and (8). G. "On a ISTew Method of reducing Observations relating to several Quantities." ^cofZ. should always be given. 253. a.." Proc.. REFERENCES. 1886. Heredity. Stat. "Some Tables for illustrating Statistical Correlation. Inst. U. 477. Series A.. 1888. G. (3). Boy. 184. For some illustrations see F. Mag... etc. li. and vol. 5th Series. "On the Theory of Correlation.. vol. Soc. The theory of correlation was first developed on definite assumptions as to the form of the distribution of frequency. Sir Francis Galton. vol. L. " Correlations and Soc. . Ix. xxxiv. XVI. vol. Stat. Soc. Trans. 190. 1887. Y. vol. Pearson. the significance of Bravais' Formulas for Regression. p.. of the Manchester Lit. developed the practical method. 1888. 341 Beferences at the et seq. "Regression." Proc. "On p. Ixv.. (A method of treating correlated p. 135. Beavais. and Panmixia. vol. p. xxv. Galton. . X.IX. The following figures show. the ratios of the numbers of paupers in receipt of outdoor relief to the numbers in receipt of relief in the workhouse. Find the correlation-coefficient for the following values of X and Y. for the districts of Example i.. —COHEELATION. (2) the percentage 1 . Find the correlations between the out.relief ratio for so : and (1) the estimated earnings of agricultural labourers of the population in receipt of relief. and the equations of regression Y. [As a matter of practice it is never worth calculating a correlation-coefficient few observations the figures are given solely as a short example on which the student can test his knowledge of the work. 189 1. ] 2. EXERCISES. . llH-12.) for both husband and wife. from Pearson. leaving central cols. so as to avoid small scattered frequencies at the extremities of the tables and also excessive arithmetic : I. . the coefficient of mean square contingency is 465. etc.. For cols.] IV. Regroup by 2-inch intervals. . .. 0. 58'5-60'5. 44-56. 8-H4. 56 upwards. . 28-44. ref. For cols. Regroup by ten-year intervals (15-. 11 and upwards. : . Rows. Group together (1) two top rows. II. group all up to 494 '5 and all over 521 '5. etc. 59'6-61"5. leaving centre of table as it stands. etc. . 35-. (2) three bottom rows. 9-hlO. 4. 3 -1- .. — etc. making the last group "65 and over. (4) four last columns. 1 of Chap. Rows singly up 20 then 20-28. (3) two first columns.. for both father and [Both results cited son). group 1-1-2. In calculating the coefficient of contingency (coefficient of mean square contingency) use the following groupings." III. . V. VI. 13 and upwards. If a 3-inch grouping be used (58'5-61'5. for father. . for son. 25-.• 190 THEOKT OF STATISTICS. 1 -^ 2. the field of choice is frequently very much limited.CHAPTEK X. like an average or a measure of dispersion. : Changes in infantile and general mortality 15-17. to should be careful to note that the coefficient of correlation. Correlation between the movements of two variables (os) Non-periodic movements Illustration iv. and. 1. and the real difficulties arise in the interpretation of the coefficient when obtained. No general rules can be laid down. they should afford the answers to specific and definite questions. CORRELATION: ILLUSTRATIONS AND PRACTICAL METHODS.rate and foreign trade^ 18. Elementary methods of dealing with cases of non-linear regression 19. but the following are given as illustrations of the sort of points that have to be considered. if several are to be dealt with. (i) Quasi-periodic move: : — — — — — — : ments Illustration v. Certain rough methods of approximating to the correlation : : — 1. The correlation ratio. Further. Necessity for careful choice of variables before proceeding to calculate r 2-8. Illustration iii. : Causation of pauperism 9-10. only exhibits in a summary and comprehensible form one particular aspect of the facts on which it is based. and not only are care and judgment essential for the discussion of such possible hypotheses. by deficiencies in the available data and so forth. care should be exercised from the commencement in the selection of the variables between which the correlation shall be determined. but it may be equally consistent with others. Illustration i. The marriage . Unfortunately. ooefiioient — 20-22. and the crops 14. The variables should be defined in such a way as to render the correlations as readily interpretable as possible. but also a thorough knowledge of the facts in all other possible aspects.: The weather ii. the student of economic statistics. The value of the coefficient may be consistent with some given hypothesis. The student — especially whom this chapter is principally addressed — 191 . and consequently practical possibilities as well as ideal requirements have to be taken into account. Illustration Inheritance of fertility 11-13. ref. (c) Age Distribution. (p.— 192 2. pauperism rose. 3. to deal with changes in pauperism and possible factors. in fact. Illustration i. seems better to deal with (b) (as in the illustration of Table — . The question was treated by correlating the percentage of the aged relieved in different districts with the ratio of numbers relieved outdoors to the numbers in the workhouse. we presumably mean that as administration England. pauperism will fall. then.g. the actual figures given by one of the only two then existing returns of the age of paupers being— 2 per cent. The returns give (a) cost. nationality of population). we mean that when industry recovers. percentage of the population in receipt of relief. 1 per cent. over 16 but under 65. (Return 36. as representative variables. It Pauperism. it would seem better to correlate changes in pauperism with chamges in various possible factors. or moral conditions (as illustrated. was given in Chap. IX. over 65. — It is required to variations of pauperism in the throw some light on the unions (unions of parishes) of {Cf. one from each of the above groups. under age 16.) It is practically impossible to deal with more than three factors. by the statisprices. we mean that changes in that variable are accompanied by changes in the . One became lax. density of population. The possible factors may be grouped under three heads (a) Administration. THEORY OF STATISTICS. It will be better. including the pauperism itself. or four variables altogether. either in the same or the reverse direction. : — — employment). and how shall we best measure — " pauperism " ? 4. 2. pauperism would decrease if we say that the high pauperism is due to the depressed condition of industry. Is such a method the best possible ? On the whole. When we say. If we say that a high rate of pauperism in some district is due to lax administration. 20 per cent. (J) numbers relieved.) bearing on a part of this question. Changes in the method or strictness of administration of the law. Changes in economic conditions (wages. tics of crime). 183). or that if administration were more strict. that any one variable is a factor of pauperism. social conditions (residential or industrial character of the district.) table (Table VIII. 1890. viz. therefore. The next question is what factors to choose. e. the percentage of the population between given age-limits in receipt of relief increases very rapidly with old age. Yule. (6) Environment. What shall we take. the influence of the giving of out-relief on the proportion of the aged in receipt of relief.. This is the most difficult factor of all to deal 6. or (2) as the percentage of numbers given out-relief on the total number relieved. that lends itself readily to statistical treatment. relief on 1st January 1871. Environment.i^rec? Poor Condition). In Mr Booth's work the factors tabulated were (1) persons per acre . IX. 1891. Chap. 1881. —COKEELATION: ILLUSTKATIONS AND METHODS.i. as before. 1894).). The ratio of numbers in receipt of outdoor relief to the numbers in the workhouse. with. the percentages of outdoor to total paupers. The former method was chosen. Aged Poor Condition. and relief in the applicant's home).e. and 1891 (the three census years). the figures are 94 per cent. partly on the ground that the use of the ratio separates the higher proportions of out-relief more clearly from each other. but the return for 1st January was The percentage of the population in receipt of actually used. though some writers have preferred the statement in terms of expenditure (e. say. and as the administrative methods of dealing with these two classes differ entirely from the methods applicable to ordinary pauperism. it seems better to alter the Returns are available giving official total by excluding them. there does not seem to be any special reason for taking the one return rather than the other. as the numbers in receipt of relief on 1st January and 1st July . (2) percentage of population living two or more to a room. the simpler and more important ratio for the present purpose.) union. was therefore tabulated for 1st January in the census years 1871. again. respectively. The returns. 193 numbers are more important than cost from the standpoint of the moral effect of relief on the population. however..g. which are so close that they will probably fall into the same array. The data relating to overcrowding were first collected — — — 13 . 1881. If we decide on the statement in terms of numbers. and these differences seem to have significance. Mr Charles Booth. and one 5. and 91 per cent.— X. shall we measure this proportion by cost or by numbers ? The latter seems. Thus a union with a ratio of 15 outdoor paupers to one indoor seems to be materially different from one with a ratio of. in every union. The most important point here. (3) rateable value per head (. 10 to 1 . "overcrowding". VIII. is the relative proportion of indoor and outdoor relief (relief in the workhouse The first question is. less lunatics and vagrants. generally include both lunatics and vagrants in the totals of persons relieved . we still have the choice of expressing the proportion (1) as the ratio of numbers given out-relief to numbers in the workhouse. but if we take. Administration. partly on the simple ground that it had already been used in an earlier investigation. instead of the ratios. was therefore tabulated for each (The investigation was carried out in 1898. years of age was therefore worked out for every union and tabuThis is not. of rateable value per head. Age Distribution. as the conditions are and were very different for rural and for urban unions. for those under that (A more complete age. the rateable value per head appears to be highly (negatively) correlated with the pauperism. 20 per cent. population of a district is increasing at a rate above the average. this strongly suggests that the industries are suffering from a temporary lack of prosperity or permanent decay. 1881.g. As already stated. viz. of course. decided to use a very simple index to the changing fortunes of a If the district. for those over 65. and are not available for earlier years. it is evidently a most important index. it seemed very desirable to separate the unions into groups But this cannot be done with any according to their character. of a small town with a considerable extent of the surrounding country. the proportions of the population — : .g. 1891. 223-25. rural. Chap. lated from the same three censuses. which is sensibly dependent on the proportion of the two sexes.) 8. method might have been used by correcting the observed rate of pauperism to the basis of a standard population with given numbers of each age and sex. say. but changes in the two are not very highly correlated Some trial was made : probably the movements of assessments are sluggish and irregular. The population of every union was therefore tabulated for the censuses of 1871. the movement of the population itself. and do not correspond at all accurately with the real changes in the After some consideration. XI. however. at all a complete index to the composition of the population as aflfecting the rate of pauperism. taking The percentage the value in the earlier year as 100 in each case. especially in the case of falling assessments in rural unions. and only 1-2 per cent. or not increasing as fast as the average. pp. Further. The changes in each of the four quantities that had been tabulated for every union were then measured by working out the ratios for the intercensal decades 1871-81 and 1881-91. consisting. the figures that are 7. it was value of agricultural land. As the percentage in receipt of relief was. below. For any given year. 1891. and for a group of unions of somewhat similar character. ratios so obtained were taken as the four variables. e. known clearly indicate a very rapid rise of the percentage relieved The percentage of the population over 65 after 65 years of age. exactness the majority of unions are of a mixed character. if the population is decreasing. this is prirndfade evidence that its industries are prospering. and the numbers of children as well. {Of.194 at the census of THEOIiY OF STATISTICS. but with not very satisfactory results. It might seem best to base the classification on returns of occupationSj e. whether. this is impossible i the age of the most important factor is only exceptionally given the wife in peerages. Fertility in One table. The metropolitan unions were also treated by themThe limit 0'3 for rural unions was suggested by the selves.) the average density of these was 0'25. 195 in agriculture. Chap. Finally. 15 years This will rather heavily reduce the number of records at Ipast. included.e. : : — — : : variables. and the is — — . family histories. but will leave a sufficient number for discussion. was given as an example in the last chapter man (i. It is desired to find whether it is also influenced by the heritable constitution of the parents. It would be desirable to eliminate the effect of late marriages in the same way by excluding all cases in which. unfortunately. whatever the age of the parents at marriage. from the memoir (Table IV. of the 38 were under 0-3. it was decided to use a classification by density of population. the grouping used being Eural. ref. the number of children born to a given pair) very largely influenced by the age of husband and wife at marriage (especially the latter). The effect of duration of marriage may be largely eliminated by excluding all marriages which have not lasted. density of those agricultural unions the conditions in which were investigated by the Labour Commission (the unions of Table VII. and similar works.^CORKELATION engaged : ILLUSTKATIONS AND METHODS. i. from which the All marriages must . The method by which the relations between four variables are discussed is fully described in Chapter XII. But.. IX. but a country district cannot reach this density unless it include a small town or portion of a town. 9. The lower limit of density for urban unions 1 per acre was suggested by a grouping of Mr Booth's (group xiv. 0'3 person per acre or less Mixed. but the statistics of occupations are not given in the census for individual unions.X. husband was over 30 years of age or wife over 25 (or even less) at the time of marriage. available. unless a large proportion of its inhabitants live under urban conditions.therefore be data must be compiled. {Cf. Illustration ii.) of course 1 person per acre is not a density associated with an urban district in the ordinary sense of the term. —The subject of investigation is the inheritance of fertility in man.). fertility is itself a heritable character. Pearson and others.) cited. and 34.e. and by the duration of marriage. say.e. more than 0'3 but not more than 1 person per acre Urban. allowance being made for the effect of such disturbing causes as age and duration of marriage. more than 1 person per acre. 3. at the present stage it can only be stated that the discussion is based on the correlations between all the possible (6) pairs that can be formed from the four — . say. i. 4) in the correlation-table. But the woman and (2) the first sisters). 2. may. Cambridge. The produce of a crop is dependent on the weather of a long preceding period. or only one pair 1 For theoretical simplicity the second process latter. Essex. which pair ? is distinctly the best (though it still further limits the available data). The group of eastern counties. for each crop. however. On the other hand.) Produce-statistics for the more important crops of Great Britain have been issued by the Board of Agriculture since 1885 the figures are based on estimates of the yield furnished by official local estimators all over the country.g. Bedford. and it is naturally desired to find the influence of the weather at all successive stages during this period. was selected as fulfilling these conditions. Estimates are published for separate counties and for groups of counties (divisions). for instance. 12. and 4 : are we to enter all three pairs (5. (For a much moire detailed discussion of the problem. The group includes the county with the largest acreage of each of the ten crops investigated. of the varying age at marriage must be estimated afterwards. and the allied problems regarding the inheritance of fertility in the horse. 7. and to determine. If it be adopted. e. (Gf. and the weather. Suppose. the student is referred to the original. ref. / Hooker. be taken in every case. Norfolk. The subject for investigation is the relation between the bulk of a crop (wheat and other cereals. children respectively If the (5. so as to avoid bias whom data are given.) 11. 0). Illustration iii. more homogeneous from the meteorological standpoint. it should be large enough to present a representative variety of soil. (5. generation is 5 (say the mother and her brothers and and that she has three daughters with 0. Hunts. turnips and other root crops. the area should not be too small . with the single exception of permanent grass. the number of children in 10. that the times of both sowing and in : for — : . etc. some regular rule will have to be made for the selection of the daughter whose fertility shall be entered the first daughter married the table. hay. correlation between (1) number of children of a number of children of her daughter will be further affected according as we include in the record all her available daughters or only one.).196 effect THEORY OF STATISTICS. and Hertford. 2). which period of the year is of most critical importance as regards weather.. It must be remembered. Suffolk. But the climatic conditions vary so much over the United Kingdom that it is better to deal with a smaller area. consisting of Lincoln. and who fulfils the conditions as to duration of marriage. 14. X. very cursory inspection of the figure shows that when the infantile mortality rose from one year to the next the general mortality also rose. do not give quite the sort of information that is required at temperatures below a certain limit (about 42° Fahr. and the average taken as the first characteristic of the weather. If. while the slower changes show no similarity. but baaed on 8 weeks' weather. 5-12.e. however. overlapping each other by 4 weeks. on an average of many years. we correlate the produce of the crop (X) with the characteristics of the weather (7) during successive intervals of the year. Correlation coefficients were thus obtained at 4-weeks intervals. —CORRELATION: ILLUSTRATIONS AND METHODS. The method of treating the correlations between three variables. Problems of a somewhat special kind arise when dealing with the relations between simultaneous values of two variables which have been observed during a considerable period of time. 13. i. as a rule . and the growth increases in rapidity as the temperature rises above this point (within limits).. It remains to be decided what characteristics of the weather The rainfall is clearly one factor are to be taken into account." i. 41 exhibits the movements of (1) the infantile mortality (deaths of infants under 1 year of age per 1000 births in the same year) . of great importance. as the second characteristic of the weather these "accumulated temperatures. and similarly. It was therefore decided to utilise the figures for "accumulated temperatures above 42° Fahr. it will be as well not to make these intervals too short. is deacrilaed in Chapter XII. therefore. It was accordingly decided to take successive groups of 8 weeks. etc. : following examples will serve as illustrations of two methods which are generally applicable to such cases. and these two will The weekly afford quite enough labour for a first investigation. and consequently. (2) the general mortality (deaths at all ages per 1000 living) in England and Wales during the period 1838-1904. for the more rapid movements will often exhibit a fairly close The two consilience.) there is very little growth. The student should refer to the original for the full discussion as to data. 197 harvest are themselves very largely dependent on the weather..e. rainfalls were averaged for eight stations within the area. Temperatures were taken from the records of the same stations. weeks 1-8. The average temperatures. show much larger variations than mean temperatures." moreover. when the — A . the total number of day-degrees above 42° during each of the 8-weekly periods. Fig. niustration iv. the limits of the critical period will not be very well defined. temperature is another. based on the three possible correlations between them. 198 infantile mortality THEORY OF STATISTICS. fell, tke general mortality also fell. _ There were, in fact, only five or six exceptions to this rule during the whole period under review. The correlation between the annual values of the two mortalities would nevertheless not be very high, as the general mortality has been falling more or less steadily since 1875 or thereabouts, while the infantile mortality attained almost During a long period of time the correlaa record value in 1899. tion between annual values may, indeed, very well vanish, for the two mortalities are affected by causes which are to a large extent different in the two cases. To exhibit, therefore, the closeness of the relation between infantile and general mortality, for such causes as show marked changes between one year and the next, it will be best to proceed by correlating the annual changes, and not the annual values. The work would be arranged in the following form (only sufficient years being given to exhibit the principle of the process), and the correlation worked out between the figures of cols. 3 and 5. 1. X. —COKRELATION: ILLUSTRATIONS AND METHODS. i.e. 199 ceeding to the second differences, differences of the differences in by working out the 3 and in col. col. successive 5 before corre- ^ 200. ''^' isto Fig. so eo TO so Yecws in ao 1900 « 41.— Infantile and General Mortality England and Wales, 1838-1904. It may even be desirable to proceed to third, fourth or higher differences before correlating. lating. 18SS no eo es lo "is so as HO ^1^ 3^ — 200 rise THEORY OF STATISTICS. from one year to the next when the other rises, and a fall when the other falls. The movement of both variables is, however, of a much more regular kind than that of mortality, resembling a series of " waves " superposed on a steady general the short-period trend, and it is the " waves " in the two variables movements, not the slower trends which are so clearly related. 16. It is not difficult, moreover, to separate the short-period oscillations, more or less approximately, from the slower movement. Suppose the marriage-rate for each year replaced by the average of an odd number of years of which it is the centre, the number being as near as may be the same as the period of the " waves " e.g. nine years. If these short-period averages were plotted on the diagram instead of the rates of the individual years, we should evidently obtain a smoother curve which would clearly exhibit the trend and be practically free from the conspicuous waves. The excess or defect of each annual rate above or below the trend, if plotted separately, would therefore give the "waves" apart from the slower changes. The figures for foreign trade may be treated in the same way as the marriage-rate, and we can accordingly work out the correlation between the waves or rapid fluctuations, undisturbed by the movements of longer period, however great they may be. The arithmetic may be carried out in the form of the following table, and the correlation worked out in the ordinary way between the figures of columns 4 and 7. — — 1. X. —CORRELATION: ILLUSTRATIONS AND METHODS. 201 any one year with the deviation of the exports and imports of the year before, or two years before, instead of the same year ; if a sufficient number of years be taken, an estimate may be made, by interpolation, of the timedifference that would make the correlation a maximum if it were possible to obtain the figures for exports and imports for periods other than calendar years. Thus Mr Hooker finds (ref. 5) that on an average of the years 1861-95 the correlation would be a maximum between the marriage-rate and the foreign trade of about one-third of a year earlier. The method is an extremely useful one, and .is obviously applicable to any similar case. The tion of the marriage-rate in iseo iseo Fig. 43. in (1) Marriage-rate and (2) Foreign Trade (Exports the Curves show Deviations -I- Imports per head) in England and "Wales from 9-year means. Data of K. H. Hooker, Jour. Boy. Stat. Soc, 1901. : — Fluctuations student should refer to the paper by Mr Hooker, cited. Reference may also be made to ref. 10, in which several diagrams are given similar to fig. 43, and the nature of the relationship between the marriage-rate and such factors as trade, unemployment, etc., is discussed, it being suggested that the relation is even more complex than appears from the above. The same method of separating the short-period oscillations was used at an earlier date by Poynting in ref. 16, to which the student is referred for a discussion of the method. 18. It was briefly mentioned in § 9 of the last chapter that the treatment of cases when the regression was non-linear was, in general, somewhat difficult. Such cases lie strictly outside the scope of the present volume, but it may be pointed out and that if a relation between be suggested, either by X T — 202 THEORY OF STATISTICS. it theory or by previous experience, that relation into the form may be possible to throw Y=A + B.<I>{X), A and B are the only unknown constants to be determined. a correlation-table be then drawn up between T and <j>{X) instead of Y and X, the regression will be approximately linear. be the rate of Thus in Table V. of the last chapter, if discount and Y the percentage of reserves on deposits, a diagram of the curves of regression, or curves on which the and Y means of arrays lie, suggests that the relation between is approximately of the form v^here If X X X(Y-£)=A, A and B being constants ; that is, XY=A+BX. Or, if we make XY a new variable, say Z, Z=A + BX. correlation-table Hence, if we draw up a new between X and Z the regression will probably be much more closely linear. If the relation between the variables be of the form Y==AB^ we have log r=log^ + X. log 5, and hence the relation between log if the relation be of the form Y and X is linear. Similarly, X"Y=A we have log Y- log A-n. log X, and so the relation between log Y and log is linear. By means of such artifices for obtaining correlation- tables in which the regression is linear, it may be possible to do a good deal in difficult cases whilst using elementary methods only. The advanced student should refer to ref. 17 for a different method of treatment. 19. The only strict method of calculating the correlation coefficient is that described in Chapter IX. from the formula X *'=W^ —. Approximations to this value may, however, be X. — COEKELATION: ILLUSTRATIONS AND METHODS. 203 (1) found in various ways, for the most part dependent either on the formulse for the two regressions r-^ and °'y r— ""^ , or (2) on the formulEe for the standard deviations of the arrays a-^ Jl - r^ and a-y s/l - 1^. Such approximate methods are not recommended for ordinary use, as they will lead to different results in different hands, but a few may be given here, as being occasionally useful for estimating the value of the correlation in cases where the data are not given in such a shape as to permit of the proper calculation of the coefficient. (1) The means of rows and columns are plotted on a diagram, and lines fitted to the points by eye, say by shifting about a stretched black thread until it seems to run as near as may be to all the points. If 6j, b^ be the slopes of these two lines to the vertical and the horizontal respectively, • r= as Jbyb^. Hence the value of r may be estimated from any such diagram figs. 36-40 in Chapter IX., in the absence of the original table. Further, if a correlation-table be not grouped by it may be difficult to calculate the product sum, but it may still be possible to plot approximately a diagram of the two lines of regression, and so determine roughly the value of r. Similarly, if only the means of two rows and two columns, or of one row and one column in addition to the means of the two variables, are known, it will still be possible to estimate the slopes of RB and GG, and hence the correlation equal intervals, coefficient. (2) The means and of one set of arrays only, say the rows, are o-j, and a-y. The two standard-deviations means are then plotted on a diagram, using the standard-deviation of each variable as the unit of measurement, and a line fitted by The slope of this line to the vertical is r. If the standard eye. deviations be not used as the units of measurement in plotting, the slope of the line to the vertical is r a-J a-y, and hence r will be obtained by dividing the slope by the ratio of the standardcalculated, also the deviations. This method, or some variation of it, is often useful as a makeshift when the data are too incomplete to permit of the proper calculation of the correlation, only one line of regression and the ratio of the dispersions of the two variables being required the ratio of the quartile deviations, or other simple measures of dispersion, will serve quite well for rough purposes in lieu of the As a special case, we may note that ratio of standard-deviations. : — THEORY OF 204 if STATISTICS. the two dispersions are approximately the same, the slope of to the vertical is r. Plotting the medians of arrays on a diagram with the quartile deviations as units, and measuring the slope of the line, was the method of determining the correlation coefficient ("Galton's function ") used by Sir Francis Galton, to whom the introduction (Kefs. 2-4 of Chap. IX. p. 188.) of such a coefficient is due. of errors of estimate like (.3) If s^ be the standard-deviation X- b^y, we have from Chap. IX. § 11 RR and hence •=y>-sif the dispersions of arrays do not differ largely, and the regression is nearly linear, the value of s^ ma,j be estimated from the average of the standard-deviations of a few rows, and r deterThus in Table III., mined or rather estimated accordingly. Chap. IX., the standard-deviations of the ten columns headed 62-5-63-5, 63-5-64-5, etc., are— But — — 2-56 2-11 2-55 2-24 2-23 2-60- ' 2-26 2-26 2-45 2-33 Mean 2-359 The standard-deviation approximately of the stature of all sons is 2 '75: hence /, /2-359Y = 0-514. This is the same as the value found by the product-sum method It would be better to take an to the second decimal place. average by counting the square of each standard-deviation once for each observation in the column (or "weighting" it with the number of observations in the column), but in the present case this would only lead to a very slightly different result, viz. o- = 2-362, ? = 0-512. The Correlation Ratio. The method clearly would not give an approximation to the correlation coefficient, however, in the case of such tables as V. and VI. of Chap. IX., in which the means of successive arrays do not lie closely round straight lines. 20. — , X. —COKEELATION: ILLUSTRATIONS AND METHODS. 205 In suih cases it would always tend to give a value for r markedly higher than that given by the product-sum method. The product-sum method gives in fact a value based on the standarddeviation round the line of regression ; the method used above gives a value dependent on the standard-deviation round a line which sweeps through all the means of arrays, and the second standard-deviation is necessarily less than the first. We reach, therefore, a generalised coefficient which measures the approach towards a curvilinear line of regression of any form. Let Sai denote the standard-deviation of any array of X's, and let n, as before, be the number of observations in this array (Chap. IX., § 11), and further let <rJ^%{n.Bj)lNThen o-^ is . . . . (1) an average of the standard-deviations of the arrays obtained as suggested at the end of the last section. Now ' let • <rJ = <rJ^(l-V^') or . (2) vJ=-^-"4 Then 18). it • • (3) tj^ is termed by Professor Pearson a correlation-ratio (ref. As there are clearly two correlation-ratios for any one table, should be distinguished as the correlation-ratio of JT on Z: it measures the approach of values of associated with given X values of to a single-valued relationship of any form. The calculation would be exceedingly laborious if we had actually to evaluate a-a^, but this may be avoided and the work greatly simplified by the following consideration. If M^ denote the mean of all Jf s, m,^ the mean of an array, then we have by the general relation given in § 11 of Chap. VIII. (p. 142) 7 Or, using a-^ to denote the standard-deviation of CT-i^ m^ • = o-a«- + o-«i/ V.v= • • • W (5) Hence, substituting in (3) — . . on Y is therefore determined when we The correlation-ratio of have found, in addition to the standard-deviation of X, the standard-deviation of the 21. X than the X on T cannot be -r^ a measure of correlation-coefficient for X and Y, and For the divergence of the regression of X on F from linearity. means of its arrays. The correlation-ratio of less rj^y^ is if desired.. The sum-total of the last column divided by the number of observations (1078) gives <7„„^ = 2-058. The square of the mean must then be subtracted from But o-^ is necessarily The magnitude of positive. and dividing by the number of observations.206 if THEORY OF STATISTICS. in the second. § 11. p. 172 a./. . r and -q are almost certain to differ slightly. column is given the type of the array (stature of father) . Both regressions.. 'r. IX.. IX. The form of the arithmetic may be varied. (6) from (2). therefore of yf--r^ measures the (t^ and divergence of the actual line through the means of arrays from the line of regression. it is as well to check the means of the arrays by recalculating from them the mean of the whole distribution.e. owing to the fluctuations of sampling. the mean stature of sons for that array .r^ must be compared with the values that may arise owing to fluctuations of sampling alone. Substituting for o-^ . and only slightly greater than the correlation-coefficient (0-51). by working from zero as origin. it : .^-nJ-r') (7) ly^j. therefore. the deviation of the mean of an array of X's from tho line of regression. d denote. Before taking the difierences for the third column of such a table. very nearly identical. instead of taking differences from the true mean. or a-„„ = 1 'iS. and therefore is 2(. The following table illustrates the form of the arithmetic for the calculation of the correlation-ratio of son's stature on In the first father's stature (Table III. The observed value of y^ . even though the regression may be truly linear. ref.„. as in Chap. in the third. IX. and in the sixth they are multiplied by the frequency of the array. before a definite significance can be ascribed to it (c/. Blakeman. and the formula cited therefrom on p. As the standard-deviation of the sons' stature is 2 -75 in.%l-r^)^<Tj + <T^^. of Chap. ret 22./. the difference of the mean of the array from the mean stature of all sons."»/)/-^ to give o-„. 22. = 0'52. we have by the relation of Chap. not less than r. In the fourth column these differences are squared. IX. It should be noted that. 160). two decimal places only having been retained as suflScient for the present purpose. summing.. multiplying each array-mean by its frequency. p.'^T. {cf. ?.. If the second correlation-ratio for this table be worked out in the same way. Pearson. the value will be found to be the same to the second place of decimals the two correlation-ratios for this table are. 19. that is. i. 352 below). . Chap. question 3). of the same chapter to differ considerably from each other and from the correlation.X. IX. : . In the case of a short series of observations such as that given in Table VII. Son's Stature on Father's Stature : Data of Table III. p. — 1.^^ is "higher since the line of regression of TonXis sharply curved.^ = 0-46. a result confirmed by the diagram of the regression lines (fig. For Table VIII. the The oonfii-mation of these values is left to the student.^^ = 0-38 (r= -0014): 17^ is comparatively low as proportions of male births differ little in the successive arrays. 183. p. 207 follows from the last section. — CORRELATION : ILLUSTRATIONS AND METHODS. 37. p. but i. p. On the other hand.. it is evident from fig. Chap. The student should sufficiently large for notice that correlation-ratio only affords a satisfactory test when the number of observations is a grouped correlation table to be formed. 39. 176. Calculation of the Corkblatiox-Ratio Example. The values found are i7^„ = 0-14. are very nearly linear. 17. . 160.. p. r]y^ = 0-39 (r = 0'34).. that we should expect the two correlation-ratios for Table VI. 178. 174). the two ratios are t. the method is inapplicable. . Jacob. New York. 1. 0. vol. (Detailed theory of the same extended method. "Comparaison numerique de courbes statistiques. .TISIICS. Illustrative Applications. ii. and in the Cotton and Silk Imports into Great Britain.) Correlation of the Weather and the Crops.. On tJie Relation of Fertility in Man to Social Status.. G. Economic Jour. Statistics. Beatrice M. . and : L. 269-279. Illustration i. Soc. G. (1) Yule.) chiefly during the last "An (3) Peakson.. R. 249. " Drapers' Co..eit\iod. . REFERENCES. 1901. Dulau & Co. ' pp. Soc.. p. p." Biometrika. principally to Economic and Practical Methods. P. 403-413. H. "The vol." I. "Genetic Inheritance of Fertility in Man and of (reproductive) Selection Fecundity in Thoroughbred Racehorses. p. x... pp." Phil.. pp... Stat.. vol. M. E. Ixii. analogous to that of illustration v. Soc. Hooker. London. {Of." p. vol.. R."0n the Correlations of Areas of Matured C'rop% and the Rainfall. is employed." ibid. U. tion due to Position in Time or Space. Ixxiv. G. 486. 1907. (The difference-method of Illustration iv. by graphical interpolation. p. p. 1895. "On the Changes in the Marriage and Birth Kates in England and Wales during the past Half Century. U." Jour. " Jour. 1899. {Cf. but the instantaneous average is obtained by an interpolated logarithmic curve. Series 257. 1902. Roy. vol. p. Ixix. 696. but obtaining the instantaneous average in the latter case. . and v. (4) Cave. Soc. pp. vol. Jioy. Roy. Stat. "Numerical Illustrations of the Variate-difference Correlation M." Froc." (5) Hooker. Statistical Studies in the New York Money Market Macmillan Co." Jour. 1905. p. U. 1914 . v.. 1914. -with an Inquiry as to their probable Causes.. J. 1914. of total Pauperism with Proportion ot 603. vol. vi.Browne-Cave. x. . Stat. ." iSirf.) Yule. (The method of Jour. Bramlet Mooric. 208 THEORY OF STA.) (6) Hooker. F. H. Ixx. Soc. Trans. 179-180. (Applications to financial statistics an instantaneous average.) ) ) . "On the Influence of the Time-factor on the Correlation between the Barometric Heights at Stations more than 1000 miles apart.. Karl.. de la sociSU de statistique de Paris. (Uses the methods of Illustrations iv. 847. Cave.. .method. vol. vol. {Of.) " Nochmals iiber The Elimination of Spurious CorrelaAnderson.. L. (2) Yule. "A Comparison of the Fluctuations in the Price of Wheat. p. . cxoii. 255 and 306. H. (The extension of the difference-method by the use of successive differences. vol. " On the (7) (8) (9) ' (10) (11) (12) (13) (14) (15) Correlation of Successive Observations : illustrated by Corn-prices. " On the Correlation Out-relief.. March.. 1905.. vol. 1899. 1896. and Karl Pearson. (16) PoYNTiNG. and toI. (The method of Illustration iv." Jour. " St[jdent. Roy.) the Correlation of the Marriage-rate with Trade." Biometrika.. 1906." if«m. D. x. A. Research Memoirs : Studies in National Deterioration. in England Inyestigation into the Causes of Changes in Pauperism two Interoensal Decades.. Illustration ii. 88. 340-355. pp. 1904. . J. 613. R. Hbkok... Roy.) Norton." "The Elimination of Spurious Correlation due to Position in Time or Space. S. lUusti-ation iii. Ixiv. Asiatic Soc.' " JSiomeiWte. Ixviii. . "On Illustration v. vol.. 1906. 1910. Bengal. used. Alice Lee. H.. (17) Pbakson. p. " On a Correction to be made to the Correlation Ratio.. 1884. 1905." PAiZ. (The "correlation ratio.... See also references to Chapter (25) XVI. ii. pp. Karl.) (22) Testa for Linearity of Regression in FrequencyJ. but is cited because the method of Illustration v. Harris." Biometrika. i. vol. Mag. vol." : . London. Research Memoirs Dulau & Co. vii.. on the process of averaging employed." Biometrika." Biometrileo. Roy. p. 1911. 1913. 78-84. 1911. ILLUSTRATIONS AND METHOD.. 254. "On a General Theory of the Method of False (A method of curve fitting by Position.. vol. 1901. Soc. —COEKKLATION: Stat. p. (18) (19) (20) (21) Peakson. iv. 1902. p. Aethue. 1909. Pbakson. Karl. p. Pbakson. Stat. (Not an approximation. C. "On the Combinations is Calculation of Intra-class and Inter-class Coefficients of Correlation possible from Class-moments when the Number of large. 265. Mag. (This paper was written before the invention of the correlation coefficient. E.. and vol. 1914. E.tes. Soc. Kakl. distributions. the use of trial solutions. Abbreviated Methods of Calculation.. June 1903. " On Lines and Planes of Closest Fit to Systems of Points in Space. (24) Slutsky.. 367. vol. ix. 6th Series. Pearson.. but a true short method. vol..") II. Biometrika. p. Blakeman. 209 Roy. pp. and Curve or Line fitting generally. Arthtje. On the Gteneral Theory of Skew Oorrelation and NonBibmetric Series. 1905. Karl.) Theory of Correlation in the case of Non-linear Kegression. vol. p. 34. viii. J." Jour. p." Biometrika.) J.) X. vol. vol." PMl. is used to separate the periodic from the secular movement : see especially § ix. " A (26) Harris. " On Restricted Lines and Planes of Closest Fit to Systems (23) Snow. vol. " Drapers' Oo. Short Method of Calculating the Coefficient of Correlation in the case of Integral Y3. Karl. ii.Tia. 6th Series. Ixxvii. of Pointe in any number of Dimensions. 559.." Fhil. 446-472r 14 . "On the Criterion of Goodness of Fit of the Regression Lines and the best Method of Fitting them to the Data. 214.. 1. 332. "On xxi. xlvii. " On the Systematic Fitting of Curves to Observations and Measurements." linear Regression. Mag. (The second part is useful for the fitting of curves in cases of non -linear regression. Introductory— 2. suffice to show that the correlation-coefficient can be treated with the same facility. if to be widely useful. Influence of errors of observation and of grouping on the standarddeviation 6-7. To find the Standard-deviation of the sum or difference of corresponding values of two variables JT^ and X^. x^ denote deviations of the several variables from their arithmetic means. The of death-rates.CHAPTER XI. Correlation due for all possible pairs of to heterogeneity of material— 13. for varying sex and weighting of forms of average other than the arithmetic mean. The arithmetic mean and the standard-deviation derive their importance largely from the fact that they fulfil this requirement better than any other averages or measures of dispersion . This might indeed be expected. Application of weighting to the correction age-distributions 20. and the following illustrations. Reduction of correlation due to 14-17.X two-fold table— 11. Then if Z evidently 210 . Standard-deviation of a sum or difference— 3-5. MISCBLLANEOTJS THEOEEMS INVOLVING THE USE OP THE GORRELATION-OOEFFIGIENT. Correlation-coefficient values of a variable— 12. Mean and standard-deviation 10. while giving a number of results that are of value in one branch or another of statistical work. The mingling of uncorrelated with correlated material weighted mean 18-19. for a N . 1. seeing that the coefficient is derived. like the mean and standard-deviation. CorrelationCorrelation between indices 9. — — — it is It has already been pointed out that a statistical measure. etc. Influence of errors of observation on the correlationcoefBcient (Spearman's theorems)— 8. Xj. Let 2.. 2. should lend itself readily to algebraical treatment. 1. of an index — — — coefficient two. by a straightforward process of summation. . . <Tj. 3. errors of X : X — . For the sum of a series of variables Xj. .ia. In this case if x-^ be an observed deviation from the arithmetic mean. . in effect. The student should notice that the assumption made does not imply the complete independence of and S he is quite at liberty to suppose that errors fluctuate more. +Cr„2-h2rj2.. Then. = o-. 211 Squaring both sides of the equation and summing. (1) If x-^ and a. Influence of Errors of Observation on the Standard-deviation.— COEEELATION : MISCELLANEOUS THEOREMS.2 + o-22 . In that case the contingency-coefficient between and S would not be zero. 2(2^) = %{x^^) + 2(V) ± 22(a. . The same process will evidently give the standard-deviation of a linear function of any number of variables. . the arithmetic mean error being zero. .2 are uncorrelated. say 8. we have from the preceding — X The effect of errors of observation is. the error.crjCr3 + r^2 . as might very probably happen. the arithmetic mean error being zero for all values of X. being the correlation beween X^ and X^. is uncorrelated with X.XI. ..(7iOr2-l-2»-j3. . consequently. . the arithmetic mean of the observations is approximately the true value. Let us suppose that. (2) The student should notice that in this case the standarddeviation of the sum of corresponding values of the two variables is the same as the standard-deviation of their difference. . Influence of Grouping on the Standard-deviation. . <r2 we have the important special case . The consequence of grouping observations to form the frequency distribution is to introduce errors that are. o-j o-2 = CTi2-|-o-22+2r. if any value of be observed a large number of times. x-^ That is. and x^ and o-. r^^ the correlation between X^ and Xg. -|-2r23.0-2O-3+ . and so on. if r be the correlation between the respective standard-deviations. x the true deviation.. to increase the standard-deviation above its true value. ..). 4. although the correlation-coefficient might still vanish as supposed. for example. The results of § 2 may be applied to the theory of errors of observation. X^ X„ we must have . 0-2 = (Ti^ -t- CTj^ -t- .o-i(r2 . with large than with small values of X. p. Xj and 8 would be uncorrelated. the standard-deviation of 8 would be cY12. shows that grouping tends to increase rather than reduce the standard-deviation. Instead of assigning to any observation its true value X. But the true frequency distribution is rarely or never a histogram. VIII. where Xi = X+S. If we assume. as in § 3. as before (the values of 8 being to a first approximation uniformly distributed over the class-interval when all the intervals are considered together). 15. that the correlation between 8 and X. The strict proof of the formula lies outside the scope of an elementary work it is based on two assumptions: (1) that the distribution of frequency is continuous. the mean value of 8 being zero for every value of Xj further.rd- To deduce from this equation a X : deviation of the true values X. and that trial (ref. 89. if o-.r2 = <r. (10)). thereby making an error 8.2-|! . it is possible to obtain the true standard-deviation cr^ if the further assumption is legitimate that the errors Sj and S^ are uncorrelated with each This is tion. On this assumption . eqn. et seq. we assign to it the value Xj corresponding to the centie of the class-interval. The formula would not give accurate results in the case of such a distribution as that of fig. is appreciably zero and that the standard-deviation of '8 may be taken as c'/l 2. . . 89. § 12. (2) that the frequency tapers off gradually to zero in both directions. 92. neither is it applicable at all to the more divergent forms such as those of figs. p. then we have . measurement. If certain observations be repeated so that we have in every case two measures Wj and x^ of the same deviation x. 5. standard-deviation of the grouped values Xj and o. and trial on any frequency distribution approximating to the symmetrical or slightly asymmetrical forms of fig. 92. formula showing the nature of the influence of grouping on the standard-deviation we must know If the original or Xy the correlation between the error 8 and distribution were a histogram. 14. . refs. 5. : other. 9 {b). 9 {a). instead of 8 and Xj. (4) a formula of correction for grouping (Sheppard's correc1 to 4) that is very frequently used.the 8tanda. 97. p.212 THKOET OF STATISTICS. 5. where c is the classHence. p. be the interval (Chap. or fig. 1) shows to give very good results for a curve approximating closely to the form of fig. p. or fig. c we will suppose x and y alone to — be correlated. we have %{x^y^-%{x^y^ ^ %{x. assumption that. If. no mere increase in the number of observations can in any way lessen. so that we have two measures x-^ and x^. . on the Equation (8) is correction formula (refs. e the errors of observation. however. 8j. Or. and 8. Spearman's Theorems. 6. (6) It follows at once that and consequently the observed correlation correlation. on assumptions similar to those made above. of equation (7). the observations of both 7. . 8. e^. On this assumption %{x^y^)=^%{xy) . y the true deviations. "^ ~ (r r Y ' ' ' ^ ' the original form in which Spearman gave his It will be seen to imply the 7). the true value of the correlation can be obtained by the use of equations For (5) and (6).XI. Influence of Errors of Observation on the Correlationrcoeffident. y. y^ be the observed deviations from the arithmetic means. it should be noticed. of the sii quantities x. X and y be repeated. \. is less than the true This difference.^y^)%{x^y^ ^ a_ — (7) if we use all the four possible correlations between observed values of x and observed values of y. Ejj "^ and y The correction given by the second part alone are correlated. y^ and y^ of every value of x and y. also suggested by Spearman. Of the four quantities x. y. Let a!j. § 7. X. seems. 213 and accordingly ' ]f • • • • W (This formula is part of Spearman's formula for the correction of the correlation-coefficient. as assumed in § 5. —CORRELATION: MISCELLANEOUS THEOREMS. . .) 6. cf. and neglecting terms of all orders above the second.) The means and standard-deviations of non-linear functions of two or more variables can in general only be expressed in terms of the means and standard-deviations of the original variables to a first approximation. i/. to be safer.ll. Let / be the mean of Z. .— (Rei. An insufficient though partial test of the correctness of the assumptions may be made by correlating x-^ .. on the assumption that deviations are small compared with the mean values of the variables. Thus let it be required to find the mean and standard-deviation of a ratio. or index Z=X-JX^. M2 =:^(i+v-4'-"i^2+>V) . = 0-2/^2. and M^ the means of X-^ and X^. assuming that xJM^ is so small that powers higher than the second can be neglected. it may vanish from symmetry without thereby implying that all the correlations of the errors are zero. Evidently. 8. if r be the correlation between x^ and x^. in terms of the constants for Xj and X^.^: this correlation should vanish. Then to this approximation "^-^^3^(^^-^) + ]it^^(V)]That ^2 is. however. and if v^ = irJM^.x^ with y-^ — y. (9) be the standard-deviation of Z we Expanding the second bracket again by the binomial theorem. Mean and Standard-deviation of an Index. Then whole. it eliminates the assumption that the errors the same series of observations.214 THEORY OF STATISTICS. are unoorrelated. in n\xJ n «:<'-^)(-C Expand the second bracket by the binomial theorem. for in X and in y. If ^=3^X1 If s -'^1^2 + ^2') have • . JTj -Zj and X^ being ^^ncorrelated. between the indices formed by the ratios of two of the measurements to the third. where.. The value of p is termed by Professor Pearson the "spurious correlation.) The following probaffords a further illustration of the use of the same method. 215 or from (9) 9. if z>2 'Wj . 11. nevertheless. To give another illustration. the indices formed by taking their ratios to a common denominator X^ will be correlated. probably approaching 0'5. be a positive correlation.a. these are given approximately by (9) and (10) of the last section. on three bones of the human skeleton.^1^2.p. — Required to find approximately the correlation between two ratios Z-y = Xj/X. 2 Neglecting terms of higher order than the second as before and remembering that all correlations are zero. being equal to 0'5 if and hence even if JTj and X^ are independent. (Ref. there may be little. 3 > neglected. there will. say. 2^ = XJX^.-A)(f^-/. The required correlation p will be given by ir. if two individuals both observe the same series of magnitudes quite independently.) =. we have finally This value of p fJ is obviously positive. lem Correlation between Indices. in the last step. a term of the order v^* has again been Substituting from (10) for Sj and s^.— XL —COEEELATION: MISCELLANEOUS THEOREMS. = = ." Thus if measurements be taken.. = 2g. Let the means of the two ratios or indices be /j /g and the standard-deviations Sj s^ .%(?^)-Kir X. we have W2 = ^'(1+3V)-V2 . and the measurements grouped in threes absolutely at random. however. ref. the table may be written in the form 1) and tration in § 11.g. there may be considerable correlation. is only a special one. be expressed as percentages of the magnitude observed.^j. for the general discussion cf.Xg = Xj and . ref. In some cases. and the answer to the question whether the correlation between indices or that between absolute measures is misleading depends on the further question whether the indices or the absolute measures are the quantities directly determined by the causes under investigation (cf. there will be a similar " spurious correlation between the absolute measurements ^j. — and consequently two rows and two columns {cf. correlation between their absolute errors. one illusand for others the references given in questions 11 and 12). For an interesting study of actual illustrations cf. . Using the notation of Chapters I. It is therefore of some interest to obtain an expression for the coefficient in this case in terms of the class-frequencies.-IV. 14. where X-^ X^ X^ are uncorrelated. The case considered. The Gorrelation-coeffioient for a two. Values of Second Variable. a theoretical value is obtainable for the coefficient.Xj = Xj. The correlation-coefficient is in general only calculated for a table with a considerable number of rows and columns.216 THEORY OF STATISTICS. such as those given in Chapter IX. 10. " If the indices are uncorrelated.x twofold Table. 13). which holds good even for the limiting case when there are only two values possible for each variable {e. 11. It does not follow of necessity that the correlations between indices or ratios are misleading. But if the errors any. ref. The two coefficients possess. i. —COKBELATION: o-j. that have been proposed. If. unlike the association-coefficient of Chap. and there are three brothers in — . (Aft) <r/ Finally. Xj Tj on a line. For moderate degrees of association. combining If observations in pairs in all possible ways. or the interquartile range and the standard-deviation different measures of dispersion. which is unity if either {AB) = {A) or {AB) = {B).table is formed by of a Variable. but. Obviously this alone renders the numerical values of the two coefficients quite incomparable with each other. and its value is lowered when the symmetrical table is rendered asymmetrical by increasing or reducing the number of A'a or B's. The Correlation-coefficient for all possible pairs of If values In certain cases a correlation. 3 of Chap. this reduces to = 8. the association coefficient gives much the larger values./3) can vanish so that (AB) and (aj8) correspond to the frequencies of two points Xj 7^. MISCELLANEOUS THEOREMS. for example. The student is again referred to ref. while the association coefficient is the same for all tables derived from one another by multiplying rows or columns by arbitrary coefficients. III. essentially difierent properties.8)/iV^. the correlation between brothers for statv/re. § 13. for a general discussion of various measures of association. the correlation coefficient (12) is greatest when (A)=(a) and (B) = (/3). III. and are different measures of association in the same sense that the geometric and arithmetic means are different forms of average.XI. in fact. Writing {AB)-{A){B)IN=h (as in Chap. including these and others. This is the only case in which both frequencies (aB) and (J. r only becomes unity if (AB) = (A) = (B). rj by their values. say. 11. ^xi/) = 1{{AB) + (a/3) - . But further.(aB)} - F^.e. §§ 11-12) and replacing 2(0!^) |. III. Whence ^mwrni' • ' ^ ^ This value of r can be used as a coefficient of association. a table is being formed to illustrate. when the table is symmetrical. o-g 217 The standard-deviations o-j2 are given by = 0-25-f2 = (4)(a)/ivr2 = 0-25-^2 = (5)(. this gives the successive values of >•= It is clear that the first value is right. . say and <t. the product sum may be ajg i. . due to a family with Jf members. the numbers of pairs being therefore N{N. (13) N= 2. written X-.e. the means and standard-deviations of the totals of the correlation-table are the observations. 12. -It + = X-^{%{x)-x-^]+x^{%{x)-x^]+x^{%{x)-x^} == . there being N{N . The entire table will be formed from the aggregate of such subsidiary tables.. 5 regarded as giving the six pairs 5 5 ft. — \. each due to one family. .x^ - .. Let it be required to find the correlation-coefficient. ft. XIV. 9 „ ft.^ .x. for a single subsidiary table.1) pairs.Xn -p XyJCa "P X-iX^ T" -p X^JC-i "t" . ft. they will nevertheless exhibit some correlation .. in fact. and 5 ft. « • ^2*^3 ~i~ *^2'^4 "r . and Y are uncorrelated in each of two If X records. ajj only determine the two points {x-^.No-^. 10 which may be entered into the table.— 218 THEOKY OF STATISTICS. • . . - x-^ .-\. — 1. of the variable occurs once in combination with every other value.. 9. one family with statures 5 ft.1).g. two values »!. e. whence. If x^ x^ same as for the original be the observed deviations. for .. . = . these are 9 with 5 5 „ ft. attributes.. even if the variables can only assume two values. —The following ivhen the theorem § 6 for some analogy with the theorem of Chap.. 4 ... 11.. 1 iVo-2 N{N-l)<T^ For N-l . . the equation (13) still holds. however.. -|— -J— XqpC-i XnXn T" X^ykit -P . x^). and 1. 10 with 5 11 ft. As each observed value N—\ N • M . The student should notice that a corresponding negative association will arise between the first and second member of the pair if all possible pairs are formed in a mixture of A's and a's. 10. This result is utilised in § 14 of Chapter Correlation due offers to Reterocfeneity of Material. . ft. IV. times. Looking at the association. 10 11 5 5 ft. from the standpoint of § 10. 3. and the slope of the line joining them is negative. 10 „ „ „ „ 5 ft. x^) and (x^. If r^ is the value that would be expected from other records. This follows almost at once.K). unless the mean value of in the second record is identical toith that in the first record. the product-sum of deviations about M. and M. E^. the means when the two records are mingled. Both the first can. M-^ = M^. or the mean value of Y in the second record is identical with that in the first record. only vanish if M-^ = M^ or Correlation may accordingly be created by the mingling K-^ = K^. or both. the and second terms X — _ S(. therefore.— CORRELATION : MISCELLANEOUS THEOREMS. the mean values of 7. but the correlation zero.) 13.M){K^ -K) + F^{M^ . M^ are the mean values of in the two records Zj. for if M^. and a correlation r^ is observed between pairs of bones presumed to come from the same skeleton. is iVj (M^ . the means and standard-deviations of x and y being the same as in the first series of observations. N^ the numbers of observations. Suppose that n^ observations of x and y give a correlation-coefficient Similarly. and subject to some uncertainty owing to doubts as to the allocation of certain bones. of two records in which and Y vary round different means.!^i_ . X X K K Evidently the first term can only be zero ii M = M-^ or E=Ky But the first condition gives that is. 20.' Whence Suppose. and w^e will have Xjxy) {n^+n^)<T^ar.M)(iq . _ tu) n n-^+n^ for example. 219 two records are mingled. (For a more general form of the theorem cf. N^. that a number of bones of the human skeleton have been disinterred during some excavations. this correlation being rather lower than might have been expected. .XI. ref. '>= _.y) will then be unaltered. the difference might be accounted for on the hypothesis . second condition gives K^ — K^. Reduction of Correlation due to mingling of uncorrelated with correlated pairs.ry) Now let ^2 pairs be added to the material. The value of %{x. 7d. and have been virtually (For a more general form of the theorem cf. Mr of r series of observations. in market B at an average price of 27s. and in market C at an average price of 28s. in a THEOEY OF STATISTICS. we may. (Chap.{N). as the unit. on the other hahd. not by taking the arithmetic mean of the several market prices. The second form of average would be quite correctly spoken of as a weighted mean of the means of the several series at the same time it is simply the arithmetic mean of all the series pooled together.{M)lr. it will be better to form an average price. we multiply each several observed value of by some numerical coefficient or weight W. If. VII.X)/%(W). Id. but by weighting each price in proportion to the quantity sold at that price. treating the market as the unit..r^jr-^ of all the pairs. estimated. the quotient of the sum of such products by the sum of the weights is defined as a weighted mean of X. we may form a general average by taking the arithmetic mean of all the means. paired at random.220 that. viz. weighting each mean in proportion to the number of observations in the series on which it is based. 4d. i.. : .e. M^ . '2. proportion (r^ .. again ref. treating the series as the unit. was defined as the quotient of the sum of values of a variable of those values by their number N. 20. § 13.) 15. and may be denoted by M' so that M' = :S. for the " weights " may be regarded as actual. or virtual frequencies.) The arithmetic mean Jlf of a series 14. i. To give an arithmetical illustration. X distinction between " weighted " and " unweighted " means should be noted.e. the bones do not really belong to the same skeleton... or X — M=%{X)IN. but do not know the number of observations in' every series. M^. But if we know the number of observations in every series it will be better to form the weighted mean '%{NM)I'S. The Weighted Mean. the arithmetic mean obtained by treating the observation and not the series as the unit. treating the unit of quantity as the unit of frequency. very often formal rather than essential. if known. The weighted mean then becomes simply an arithmetic mean.(W. in which some new quantity is regarded Thus if we are given the means M-^. if a commodity is sold at different prices in different markets. per quarter. if no statement is made as to the quantities sold at these prices (as very The it is. Thus if wheat has been sold in market A at an average price of 29s. X) = N{M. at C. But if we know that 23. our standpoint be that of some average consumer. or weight the index-numbers for the several commodities according to their importance from some point of view . Id.) as the general average. and 3933 qrs.930 qrs. 4d.. x 26) + (28s. whence M' = M+r<r^^ (15) . XI. . attached to the small markets In the case of index-numbers for exhibiting the changes in average prices from year to year (ef. for example.w + r<r„(r. For. it will be better to take the weighted mean (29s. were sold at A. we may take as the weight for each commodity the sum which he spends on that commodity in an average year.e. 4d. and it is If r be the required to find an expression for this difference. so that the frequency of each commodity is taken as the number of shillings or pounds spent thereon instead B of simply as unity.). only 26 qrs. and not as a rate per 1000 of the population. —COEEELATION : MISCELLANEOUS THEOKEMS. x 3933) 27889 to ~^^^- This is appreciably higher than the the nearest penny. . Eates or ratios like the birth-. it may make a sensible difference whether we take the simple arithmetic mean of the index-numbers for different commodities in any one year as representing the price-level in that year. total births „. It is evident that 2( W. the rate for the whole country different districts. correlation between weights and variables. We is the mean of the rates in the weighting each in proportion to its population. 221 often happens in the case of statements as to market prices). at B. § 25). 7d. which is lowered by the undue importance and C.930) + (27s. o-„ and o-^. we have at once illustrations in 16. Birth-rate of whole country = r~r^ rr' . use the weighted and unweighted means of such rates as §17 below. — total population 2(birth-rate in each district x population in that district) ^(population of each district) i. Chap. take the arithmetic mean (28s. VII. If. treating the rate for simplicity as a fraction. the standarddeviations. arithmetic mean price. any weighted mean will in general differ from the unweighted mean of the same quantities. and w the mean weight. or marriage-rates of a country may be regarded as weighted means. X 23. and much has been written as to the weights to be chosen. death-. . for instance. if negatively. Stat. r having a sensible value and a-^a-^/w a large value. In some the weighted mean is the greater . . and then weighting makes little difference.222 That THEORY OF STATISTICS. Thus we have the following figures for rates of pauperism 17. lix. 349). cases r is very small. if the weights and variables are positively correlated. the less. nearly always of importance. is to say. birth-rates or other rates (Jour. p. January 1. Soc. The difference of death-rates. between weighted and unweighted means on the population in different districts is. vol. (1896). but in others the difference is large and important. assuming that cr^/w) approximately the same for the decade 1881-90 as in 1891. . P2. . perhaps the country as a whole. 18.. etc. but. ttj ttj ttj given by the population of the standard district. which are largely affected by the age and sex-composition of the popula- Neglecting.^) . but with the weights. where 2(p)=l. in spite of lower deathrates in every class.XI. . ... dj. . (16) For some other district taken as a basis of comparison. suppose the of deaths are noted in a certain district for. pj. . dj. . -40. or vice versd. (18) . *" _ 32-34 -30-34 4-08 459 ^ 564 = + The of course.. etc. the age-groups 0-. The difiiculty may be got over by averaging the age-class death-rates in tke district not with the weights Pi P2 Pz given by its own population. Let the deathrates for the corresponding age-groups be d^. in which the fractions of the whole population are p^. the question of sex. . the death-rates and fractions of the population in the several age-groups may be 8j Sj S3 ttj tt^ Wj and the crude death-rate . . 223 is For the birth-rate.. numbers D=l(d. It may happen that really both districts are about equally healthy.. accidental.. and the deathrates approximately the same for all age-classes. A = S(8. on the other hand. Then the ordinary or cnide death-rate for the district is tion. D' = -S. tr^ we have. The corrected deathrate for the district will then be . owing to a difference of weighting. 10-. say.{d.p) .. The principle of weighting finds one very important application in the treatment of such rates as death-rates. etc. 20-. higher crude death-rate that the second. the first average may be markedly higher than the second. .-n) . closeness of the numerical values of r in the two cases is. there will be a larger proportion of the old in the former. for instance. (17) differ Now D and A may differ either tt's because the (fs and S's or both. for simplicity. .— COKRKLATION : MISCELLANEOUS THEOREMS. The comparison of crude death-rates is therefore liable to lead to erroneous conclusions. and it may possibly have a or because the p's and differ. being 4 '08. . If the first district be a rural district and the second urban. . e. i>" = i>x|.g. A' being given by A' = S(S..py This will hold good if. 'Both methods of correction that of § 18 and that of the present section are of great and growing importance.. Difficulty may arise in practical cases .e. but only the crude rates D and the fractional populations of the age-classes Px Pi. 19) that corrected average heights or corrected average weights — — . and is used for both these purposes in the Decennial Supplements to the Reports of the Registrar General for England and Wales (ref.7r) S(8. the same as D'. The death-rates must be noted for each sex separately in every age-class and averaged with a system of weights based on the standard population. They are obviously applicable to other rates besides death-rates.g. the death-rates in the standard population and the district stand to one another in the same ratio in all age-classes. and ly and A will be comparable as regards age-distribution.. 17. refs. they may readily be extended into quite different fields. pp. It is the crude death-rate that there would be in the district if the rate in every ageclass were the same as in the standard population..g. . only be the same if It can S((Z. (19) the rates of the standard population averaged with the weights of the district population. IV. P^ The difficulty may be partially obviated {cf.224 THEORY OF STATISTICS.. those engaged in given occupations.^d^d^ classes . are not known which it is desired to compare with the standard population.7r) ^{d.p) %(8. (20) D" is not necessarily.. birth-rates {cf. 51-3) by forming what may be termed a potential or standard death-rate A' for the class or • • • district. 18). 16). An approximate corrected death-rate for the district or class is then given by i. There is obviously no difficulty in taking sex into account as well as age if necessary. Further. SJd^ = SJd^ = S^/d^ = etc.method of correction is used in the Annual Summaries of the Registrar General for England and Wales. e. The method is also of importance for comparing death-rates in different classes of the population. e. § 9. 19. e. Chap. This.j9) . i. from the fact that for the districts or the death-rates d. Thus it has been suggested (ref. nor generally. .. as well as in different districts. . W.. " On the Influence of ' Broad Categories on Con-elation. which was communicated to Spearman in 1908. 1907. W. formed by finding the value of the variable for which the sum of the weights was greatest. C. vol.. an Elementary Proof of (4) Pbakson. — correlation: miscellaneous theorems. 20. " The Proof and Measurement Things... F. 1904. iii. In §§ 14-17 we have dealt only with the theory of the weighted arithmetic mean. allowing for the smoothing of Similarly. Thus a weighted median can be formed by finding the value of the variable such that the sum of the weights of lesser values is equal to the sum A weighted mode could be of the weights of greater values. "On ' Biometrika. Slat. Land. (Formula (8). and published by Brown and by Spearman in (8) and (10). 225 of the children in different schools basis might be obtained on the a standard school population of given age and sex composition. vol. i. 161. of Psychology. xxix. Con'elation. etc. Jonr. Grouping Observations. p. (2) Sheppaud. iii. (4) for the coiTeotion of the standard-deviation is Sheppard's result. Speabman. and on other allied points. 15 . Math.. vol.e. of Association between vol. F. p. 698.. vol. v. of log ^"- s(r. Effect of Errors of Observation (6) on the Correlation-coeflBcient.) Spearman. (Proof of formula (8). result given in eqn.— XI.'' (3) Shbppabd. Kael." Proc. W. xv. 1907. Jour. (8) "Demonstration of Formulae for True Measurement of Amer.. Kabl. 308." Biometrika. 1910.iogx) s(wo EEFERENCES. C. Sheppakd." Britith Jowr. (7) Spearman. "The Calculation of Moments of a Frequency-distribution.) Two 88." Amer. p. vol. "c/ojsr. (6) Pearson. p." Biometrika. or iudeed of given composition fis. 271. p. vol. ' On the Calculation of the Average Square. p. 450.. p. Boy.li.. a weighted geometric mean could casual fluctuations. Cube. be calculated by weighting the logarithms of every value of the variable before taking the arithmetic mean. of Psychology. 363. regards hair and eye-colour as well. 1913.) F. 1897. C. "On the Calculation of the most probable Values of Frequency Constants for Data arranged according to Equidistant (The Divisions of a Scale. Soc. of a large number of Magnitudes. 1904.. Effect of (1) . pp. xviii. Sheppard's Formulae for correcting Raw Moments." of Psychology. Soc. ix. and others [editorial]. but it should be noted that any form of average can be weighted.. 116-139. "Correlation calculated from Faulty Data. but on different lines to that given in the text. 1897 2619. Trans.. on applying Sheppard's correction for grouping. U.. 1897. etc.317-46. 88. Stevenson. 1896." Eugenics Laboratory Memoirs. (§ 7 contains remarks on the effects of errors on the correlations and regressions. M. in the rough value of the standard -deviation. 301.. lix. "Note to the Memoir by Prof. Soc. Correlations between Indices. Tbitj". (10) Beown.'' Phil. .." etc.. (Data from the decennial supplements to the Annual Reports of the Registrar-General for England and Wales. p. 1910. Greenwood. (p. 1906. (11) Peahson. p. " On the Interpretation of Correlations between Indices or Eatios. August 1S09.. p." Jour. Soc." Proceedings of the Sixth International Congress of Psychology.. (16) Correction of Death-rates. II. the Changes in the Man-iage and Birth Rates in (18) Yule." iM(?. p.pp. Stat. § 5). VI. Ixix. "Genetic (reproductive) Selection: Inheritance of Fertility in Man and of Fecundity in Thoroughbred Racehorses. I. Roy. p. H. and Frances Wood. 1908). the Correlations of Areas of Matured Crops and the Asiatic Soc. Chap. . Tatham.9... Karl Pearson on Spurious Correlation. W. (§§8. Karl. 644. vol.. S. 1. and PL II. " Proc. Bramley-Moore. and L.) The following particulars are . ii. .. vol. "On a Form of Spurious Correlation which may arise Indices are used in the Measurement of Organs. G. Alice Lee. 1895 8503.. U. "The Iniluenoe of Defective Physique and Unfavourable Home Environment on the Intelligence of School Children. Soc. A. yoI. Nkwsholme.. 847. p. 139) and iii.1914. lxxvii. M. vol. Bengal.. Also Supplement to Sixty-fifth Report : Introductory Letter to Pt. Stat. Sheppard's correction will make a difference of less than 0'5 per cent. 141) of Chapter VIII. Soc.) Proc." Jour. ibid. when . Stat. (20) Miscellaneous. "Some Experimental Eesults in Correlation. Pt. Roy." Soy. "Note on Reproductive Selection." (Eqn. Ixxiii. Karl. 2. p. John. The Weighted Mean.J. G.) 226 (9) THEOEY OF STATISTICS. 1910. etc. 498. Roy. Show that if a range of six times the standard-deviation covers at least 18 olass-iutervals (c/. vol. viii. (16). (p. Hoy. 257. "The Decline of Human (17) Fertility in the United Kingdom and other Countries. Dulau & Co." Mem. W. Jacob. . (A number of theorems of general application are given in the introductory part of this memoir. and T.) EXERCISES. Geneva. Series A. 3. Boy. Soc. (Cd. (13) Yule. vol.) (12) Galton. (19) Heron. Soc. Supplement to the Fifty-fifth Annual Report of the Registrar-General for England and Wales: 'hUroductory Letters to . (15) PBAiisoN. as shown by Corrected Birth-rates. "On Rainfall. some of which have been utilised in §§ 1213 of the preceding chapter. Pearson. p. 7769. 0. with especial reference to this problem. Francis. Find the values obtained for the standard-deviations in Examples ii. London. Ix. "On England and Wales during the past Half Century. 34. David. 1910. (14) Brown. 489. "A Study of Index-Correlations. vol. cxcii. 1899. Karl... . —CORRELATION : MISCELLANEOUS THEOREMS.— XI. 227 found for 36 small registration districts in which the number of births in a decade ranged between 1500 and 2500 : Decade. . A. find the coiTelations between a!j. Trans. p. is the effect on a weighted mean of errors in the weights or the quantities weighted.228 THEORY OF STATISTICS holds for all values of iKj. and the correlation table between parent and offspring reduces to the form 10. or with the variables (1) if the arithmetic mean values of the errors are zero (2) if the arithmetic mean values of the en'ors are not zero ! 11.. 1904. deviations from their respective arithmetic means).=]. If we consider the correlation between number of recessive couplets in parent and in offspring. the only possible numbers of recessive couplets are and 1. the correlation is found to be 1/3 for a total number of couplets n. such errors being uncorrelated with each other. with the weights. What — . 53). b and c. in our usual notation. Offspring. x^ and Kg in terms of their standard-deviations and the values of a. vol. coiii. (Pearson. !Cj and ajg (which are. Of. in a Mendelian jiopulation breeding at random (such as would ultimately result from an initial cross between a pure dominant and a pure recessive).. "On a Generalised Theory of Alternative Inheritance." Phil. If «. to a correlation between changes in out-relief and changes in proportion of old. -XI. Introduotoiy explanation— 3. — 14. for we have as yet no guide as to how far a correlation between 229 . Arithmetical work : Example : Example coefficient 17.— CHAPTER XII.. 15. and similarly the student will find it impossible to advance very far in the discussion of many problems in correlation without some knowledge of the theory of multiple correlation. Special notation for the general case : generalised regressions 5.. 16. Generalised deviations and standard-deviations 7-8. Reduction of the generalised regression 13. Theorems concerning the generalised product-sums 9. X. and the question might arise how far the first correlation was due merely to a tendency to give outrelief more freely to the old than the young. In such a problem as that of illustration i. or correlation between several variables. in order to be able to deal with the complex causation characteristic of statistics . 1. 1-2. Generalised correlations 6. and also with changes in the proportion of old . 18. Chap. Reduction of the generalised — — — — — correlation-coefiSoient ii — Geometrical representation of correlation between three variables by means of a model — The of M-fold correlation — Expression of regressions and correlations of lower in terms of those of higher order — Limiting inequalities between the values-of correlation-coefficients necessary consistence — Fallacies. Direct interpretation of the generalised regressions 10-11. the theory of the correlation-coefficient for a single pair of variables has been developed and its applications illustrated. PARTIAL CORRELATION.e. for 19. Reduction of the generalised standard-deviation 12. In Chapters IX. for instance. Direct deduction of the fommlie for two variables 4. The question could not at the present stage be answered by working out the correlation-coefficient between the last pair of variables. it might be found that changes in pauperism were highly correlated (positively) with changes in the out-relief ratio. i. But in the case of statistics of attributes we found it necessary to proceed from the theory of simple association for a single pair of attributes to the theory of association for several attributes. — — i. 2. the mean change in X^ associated with a unit change in X^ when all the remaining variables are kept constant. say.. the variables 1 and 2 can be accounted for by correlations between 1 and 3 and 2 and 3. X^. and the n—\ others.X^^hyX^^ . If the variables are X^X^X^ .. If in -I-6„. assigning such values to the constants as to make the sum of the squares of the errors of estimate as low as possible the more complicated case may be discussed by forming linear equations between any one of the n variables involved. parent and grandparent. The correlation between X-^ and X^ indicated by \ may be termed a partial correlation. such a generalised regression or characteristic equation we any one coefficient such as \. grandson and grandparent can or cannot be accounted for solely by observed correlations between grandson and parent. the corresponding problem is of great iinportance. and the question might arise whether the last result might not be due merely to a negative correlation between rain and accumulated temperature.230 THEORY OF STATISTICS. the bulk of a crop and the rainfall during §.. in fact. in which it is necessary to consider simultaneously the relations between at least three variables.X„. for the effects of changes in these variables are allowed for in the remaining terms on the right. or X„. The latter case was discussed by forming linear equations between the two variables. and possibly more. In the problem of inheritance in a population. Problems of this type. taking each in turn. may be treated by a simple and natural extension of the method used in the case of two variables.. which may be termed partial regressions. the crop being favourably affected by an increase of accumulated temperature if other things were eqwil. but failing as a rule to obtain this benefit owing to the concomitant deficiency of rain. and practically no correlation between the crop and the accumulated temperature during the same period . The magnitude of h^ gives. certain period. It is essential for the discussion of possible hypotheses to know whether an observed correlation between. Chap. again assigning such values to : the constants as to make the sum of the squares of the errors of estimate a minimum. partial coefBcients of correfind a sensible positive value for . Again. .. we know that there must be a positive correlation between Xj and X„_ that cannot be accounted for by mere correlations of X^ and X^ with X^. . as corresponding with the partial association of Chapter IV. X„. a marked positive correlation might be observed between. say.. X^ = a + h„. and it is required to deduce from the values of the coefficients b. in the case of illustration iii. as already indicated in Chapter IV. X. the equation will be of the form . Evidently a^j would be very large for values of aj that erred greatly either in excess or defect of the best value (for the given value of Sjj). 44. required.2 a minimum. it is required to make Sj. in a regression-equation. the least possible. but exceptionally. a curve would be obtained more or less like that of fig.. though possibly. § 14). if any. If therefore the values of «j 2 were plotted to the values of a^ on a diagram. be as well to revert briefly to the case of two variables.^ attained its so that «j.XII. and the meaning of the coefficient in the more general case was subsequently investigated. to obtain the greatest possible simplicity of treatment. Let us take the arithmetic means of the variables aS origins of measurement. and let x^.. Sj. Such a process is not conveniently applicable when a number of variables are to be taken into account. we write associated pairs Put more briefly. Then it is required to determine a^ and ij. The best value of Oj. for all deviations x^ and x„.2 is in using regression-equation (a) . 239-247). in the regression-equation X^ . x^ denote deviations of the two variables from their respective means. For examples of such generalised regression-equations the student may turn to the illustrations worked out below (pp.. introducing a notation that can be conveniently adapted to more. or when changes in these variables are corrected or allowed for. IX. to determine the coefficients and constant term. Chap. for which s^. and the problem has to be faced directly i. so far as this may be done with a linear equation. however.j being calculatted for each. 231 lation giving the correlation between Xj and variables when the remaining variables X^ or other pair of are kept constant. —PARTIAL CORRELATION. zero.j -a-^ + b-^^-x^Y. It will first. so as to make the sum of the squares of the errors of estimate a minimum. Suppose any value whatever to be assigned to Jjj.e. 3.j could never become negative. the value of the coefficient r ^^pjcr-^a-^ was deduced on the special assumption that the means of all arrays were strictly collinear.. the value of Sj. will take this problem first for the case of two variables. In Chapter IX. if the root-mean-square value of the errors of estimate (cf. and would continuously decrease as this best value was approached . X^ : We *i = «i + V^2 • • • • («) of so as to make S(a. With this explanatory introduction. and a series of values of a-^ to be tried. we may now proceed to the algebraic theory of such generalised regression-equations and of multiple correlation in general.. aj + b^^. S(a. or that the two lines of regression must pass through the mean (Chap. could be approximately estimated from such a diagram . S(a. *'^- "'^•^ S(V) . the corresponding valiies of Sj j are equal. If. by similar reasoning. we must have. neglecting value of 6i2-^'^*> the term in 8^. minimum value. the equation gives. now. but it can be calculated with much more exactness from the condition that if a\ a"i be two values close above and below the best. § 10).i . That is.sc^)^ — 2{x-^ .i - «! + b-^^. ij^. Let Then if ttj and (aj + 8) be two such values. IX. the value of a-^ is the best for the assigned evidently. ij^ + S. say o-j j.aj + 8 + b^^-""!)^ when 8 is very small.x^) = <h = 0. b^^ is to be assigned result of the best value. whatever the value of Jjj. for slightly differing values. that is. = (c) breaking up the sum. We may therefore omit any constant term. ^x^{x^~b^^. This is the direct proof of the that no constant term need be introduced on' the right a regression-equation when written in terms of deviations from the arithmetic mean.232 THEORY OF STATISTICS.x^) or. again neglecting terms in 8^. e. . . form . . and remains to be shown that r^^. and the second will be the subscript of the x to which it is attached. 233 which is the value found by the previous indirect method of Chapter IX. n • *^8 + • • • + "ln. „ may really be regarded as. ... .2. these may be called the primary subscripts.24 . and separated from them by a point. The order . . . . . .^ . n "21.. but the order of the primary subscripts is material ..2) 4. A coefficient with p be termed a regression of the ^th order. follow from this. From the fact that Sjj is determined so as to make the value of 2(ai . After the primary subscripts.g. » • ^2 * ^13.34 . . . figj. . . . and notably we have for cr^. For the partial regression-coefficients (the coefficients of the sc'a on the right) a special notation will be used in order that the exact position of each coefficient may be rendered quite definite. .34 .XII. . (d) apply the same method to the regression-equation Writing the equation in terms of deviations. the minimum value of Sj. the standard-deviation of errors of estimate cri/ = V(l-r. •''I — "1184 . Kj being the dependent variable in . . and may be termed total as distinct from partial regressions.^ secondary subscripts may '•l2=(*12-*2l)'- We shall generalise this equation in the '"12. quite indiiferent. etc. The first subscript affixed to the letter b (which will always be used to denote a regression) will be the subscript of the X on the left (the dependent variable). it should be noted. . the method of determination is sometimes called the method of least squares. .b^^^^Y ^^^ least possible. . it follows from reasoning precisely similar to that given above that no constant term need be entered on the right-hand for Now n variables. in the case of two variables may be regarded as of order zero. are placed the subscripts of all the remaining variables on the right-hand side as secondary subscripts. in the second. . . The regression-equation will therefore be written in the form side. 5. S^j. n) • • (2) it This is at present a pure definition of a new symbol. 612 g „ and 621 3 „ denote quite distinct coefficients.j. .23 . (n-1 ' •''n (1) which the secondary subscripts are written is. Evidently all the remaining results of Chapter IX. The regressions b-^^. . In the case of two variables. the correlation-coefficient r-^^ may be regarded as defined by the equation in the first case and x. n = ("12. fijg. —PARTIAL CORRELATION.34 . be assigned the "best" values. correlation-coefficient it. A 2(a. or... as a definisubscripts is and possesses all the properties of.234 THEOKY OF STATISTICS. . being regarded as of order zero. . .. as usual...2 .n •'^1.(612. and so on.„) = .„• .e. ......2. etc..elation of order 'p. « X. .... (4) standarddeviation denoted by a symbol with p secondary suffixes will be termed a standard-deviation of the pth order.34 . .. correlations. .. coefficient... x„Y being very small. .. 613. (n-l) 1) for determinagain for determining . .23.. tion we have . .. as it is sometimes called) denoted by a symbol with p secondary suffixes. . . In-ll ^n • (3) where x. as determined by the method of least squares. .n.23 .2 o-j. Such an error (or residual.5) There are a large number of these equations. r^j. will be given by equations of the form '' N being. the number observations. .. |„_i. (« ing the coefficients 61J. in the case of a correlationa corr... more briefly.„. neglecting the term in • • • n a.24 m etc.„ = 2(423. • ..i etc.^23 .... . that is.„. . 7. . That .. by the equation -Z\^-0i..] j. 2X^{X^ 6. Finally. The correlations rjj. as distinct from partial.„-„ 8^... .. we will define a generalised standard-deviation .34. eqn.2 + • + *1. n ^2 • " "13. the difference between the actual value of x-^ and the value assigned by the right-hand side of the regression-equation (1). however.. A correlation-coefficient with p secondary subscripts will be termed Evidently. (. . a name may.2+ .. pending the proof. be applied to the equation (2) is . and spoken of as total... From the reasoning of § 3 it follows that the " least-square values of the partial regressions 612. the order in which both primary and secondary written is indifferent.34 . 6.. etc. of ») . .23 . +6l„. If the regressions J1234 . + 6i„ 53 .^'i~ . will be denoted by a. • 2(a.„. i. . the standarddeviations (rj.23. "1. („-l) ^n) = 0. etc. .~ "ln. {cf.612. .J. for the right-hand side of unaltered by writing 2 for 1 and 1 for 2. the standarddeviations o-j cTj...^x^ x^ are assigned any one set of observed values.34 . will be termed a deviation of the pth order.23 .. may be regarded as of order zero. the error of estimate.. .„. .!•„)' = 2(^1 8 + SK+ is.23 ... . etc.j .24 .34 .2 ^1. in terms of the notation of equation (3). — ''^\~ O12. . (d) of § 3) of the first order. . a!2. .. is imaltered by adding to the secondary subscripts of the former the latter. the product-sum of any deviation of order p with a deviation of order p -I.34 . n !»2. .. —PARTIAL CORRELATION. . (n-1) • '''2.„. he will see that when the con" dition is expressed that b^^.84 . x^ enters into the product-sum with 3:1. (n-1)) . a. 2(Xl. . we may note that x^ is and x^ uncorrelated with Xj.Jjji. (n-1) .. „ :x^ the equal product-sums that may be obtained deviations is unaltered by omitting any or all of the secondary subscripts of either which are common to the two.q.„. conversely.. . n).. 8.. every x the suffix of which is included in the secondary suffixes of *i.. the p subscripts being the same in each case.».34 .».. m . .. ..2.n .34 .. As the simplest case.. .. .. Therefore.„) — ^\^l. . . «) = S(K] XiM . of any deviation of order zero with any deviation of higher order is zero. If the student will follow the process by which (5) was obtained.. . . The theorems of this of .— XII. amd. and so 3(^1. .. .34 .. . ^a). ») =2a!l. quite generally. : 235 the coefficients Jji 34 ..i X2. n • %34 . But it follows from this that ...«) = ^\pl. etc. and so on they are sometimes termed the normal equations. . . . and so on..i..si .34 . provided the subscript of the former occur among the secondary subscripts of the latter. .U .. . and of the preceding paragraph are fundamental importance... uncorrelated with X2. . . n Xg- . when the same condition is expressed for b^g^i n> ^s enters into the product-sum.34 .n shall possess the "least-square value.s4 Comparing in this way. • • . on. 71 enters into the product-sum. all (6) *'2. we see that the product-simi of any two any or all of the q additional subscripts of It follows therefore from (5) that amy product-sum is zero if all the subscripts of the one deviation occur among the secondary subscripts of the other.ii . . 2(lBl. 2(^1.34 Similarly. ^n) = S(a!l.34.n. . .34 ..23 .U . Similarly again. . n) — S(a..» = S(a!i. Taking each regression in turn. .... . and should be carefully remembered. . The normal equations of the form (5) are therefore equivalent to the theorem The product-svm.34 . in fact..Sl . . .. n . .23 .ni^^-iwA.Kjj). (n-1)) = '^{H'l. THEORY OF STATISTICS.— 236 9. We have now from g§ 7 and 8 = S(%34.. .. it will not be possible to estimate x^ with any greater accuracy from x^ and x^ than from x^ alone.. - hn. r2g= +0"5.. (I .hnM . . . Further. m) = 2(a!i. . .. . — . .34 .34 . . .M =Sa. say the inverse.. as it leads to rather unexpected results.rj.. for if we are estimating one variable from n others. . . . (n-l) /ii% (. .. (n-l)._i| by (n-l) • 6„2. (n-l) 7 • "n2.. ^ • (n-l) ~ °ln. . ff2.34 . 237 This is again the relation of the familiar form ai„ = ai(l-rl) with the secondary suffixes 23 (n-l) added throughout.ry(l .. § 13). It should be noted that. . or.34 . . .«-i. iCg x^-i.' Any regression of order p may be expressed in terms of For we have regressions of order p -1.i. („_i) This condition is somewhat interesting.. . like any correlation of order It also follows zero. 7 . rj3= +0'4. Xn will not increase the accuracy of estimate unless ri„_^ .... („_i) The student should note that 7 this is O iii "1 an expression of the form • "liM — 1 — bin I . (n-ll '^2. . order. It is clear from (9) that r^^^ . .si . for the value of rjg2 is zero (see below..) • (10) This is an extremely convenient expression for arithmetical use the arithmetic can again be subjected to an absolute check by eliminating the subscripts in a different.. it is obvious that the values must be identical . „ — Ol2.. .„_i|. .. .i. a. .34 ~ 7 ''2n.terms . . (n_l|). XII.rU. ''•2. in equation (9). S(a.34 . n) .M . .34 .. (n-l) . 11. (ii-l)S(!Ci.S4 . (n-l| • a%.6-.34 . 6n2. This is useful as affijrding an independent check on arithmetic.hnM . any other subscript can be eliminated in the same way as subscript n from the suffix of o-. ?^ . "vS 6_.23 „. .„. it is clearly indifferent in what order the latter are taken into account. cannot be numerically greater than unity.ii. —PARTIAL COKKKLATION. . . .34 . .„ = tr?(l . („_i) ain .. . so that a standard-deviation of order p can be expressed in p ways in terms of standard-deviations of the next lower order.23 (n-i) "^i^ ^® expressed in the same way in terms that we must have of o'l.S4 . . .34 .34 .23. (9). (not ri„) differ from zero. <r?.a (n-2)) ^^^ so on. . . . in a^ to a!„_i) . (»-i)(a!2 .. 0-2. from I . (n-l)) . (re-1) . 0-1. .. . . at once that if we have been estimating ojj from x^. if r-^^ = + 0"8.. (n-l) .. so . . 12.i. .. (n-i)) we have 612. .. . ..){l . . . 2>A34 . Apart from the algebraic proof.. ''J2. (n-l) ''„2. .31 >i= 612. = S(a..34 .a4 . . n . . .s4 (»-l) . b^^ . (n-Jai^si .34 . .rj^^) . (»-ii • <»n . . .34 Eeplacing . . For example... . .. . (n-l) + "ln.34 „ can be eliminated so as to obtain another equation of the same form as (12). (3) to calculate any required regressions by equation (8): the use of equation (11) for calculating the regressions of successive orders directl.»> %46 n being it is .. (2) to calculate any required standard-deviations by equation (10) . . From equation (11) we may readily obtain a corresponding equation for correlations.)'(l-rL34.. . .„ „ _ ^12. (n-l) ~ ^ln. (n-I) = "12.M i. we might also regard it as the partial regression of aj^j „ on ^2. and so on. ... . ~ -^i ^2n.. the arithmetic in the calculation of all partial coefficients of an order higher than the first.. the expression for three variables '"•" (l-0'(l-r-|..34 . is (1) to calculate the correlations of higher order by successive applications of equation (12) .34 . having calculated all the correlations and standard-deviations of order zero.34 . the first for . for any one of the secondary suffixes of r'i2.. . with the subscripts 34 coefficient 612. ... . (1. .34 . The equations now obtained provide all that is necessary for the arithmetical solution of problems in multiple correlation. . have been eliminated in lieu of n.. For (11) may be written I "12.34 .^ from each other is comparatively clumsy. . and the value obtained for r-i^M „ by inserting the values of the coefficients of lower order in the expression on the right must be the same in each case.. given. Evidently equation (12) permits of an absolute check or . 238 THEORY OF STATISTICS. (n-II> *n.. As any other secondary suffix might (n-l) being given.34 .(n-l.. (n-l) • ^2n.34 the partial regression of ic^g^ _ _ (^-d on X2... .3i ...34 .34 .. We will give two illustrations. ...34 .34 . .34 „ can be assigned interpretations corresponding to those of 612 34 „ above..34 0'2... and »-i2. . 1 J- (n-l) • ^271.34 n _ ^12.e.)' This is. . . ti • '"2. (n-l ) "'1. . („_i). 14.34 .-1) . . .. .....S4 . The best mode of procedure on the whole. The therefore be regarded as determined from a regression-equation of the form n K^a-J ^l. (n-l| Hence. 13.. similarly..... (n-l) ' (12) (l-rS. (n-l) added throughout. .. .^ .)» with the secondary subscripts added throughout.. (n-II • '''n. .. .34. writing down the corresponding expression for b^^si and taking the square root n2.„-l. (n-l). . ...34. .23 . . |n-l) ~ ^ln.. illustration of continuation of example cultural labourers we shall take will be a Chapter IX. — The first i. — PARTIAL CORRELATION. Using as our notation Xj = average earnings. these had.. .. First it will be noted that the logarithms of (1 . In col. In Question 2 of the same chapter are given (3) the ratios of the numbers in receipt of outdoor relief to the numbers relieved in the workhouse. Example tion i. equation (12) used direct in simplest form ». especially if logarithms are used. the first constants determined are Jlfj = 15-9 shillings 0-1 = 1 '71 shillings M^= lf. rj2=-0-66 r. t. regressions. 5-79 o-j o-„ = 1 '29 = 3-09 per cent. for these three variables.—— XII. accordingly. X^ = percentage of population in receiptof relief.= its 3-67 per cent. The introduction of more variables does not involve any difference in the form of the arithmetic. 239 three and the second for four variables. 3 the product term of the numerator of is The work in tabular form. 2 of the table below).g=-0-13 r„o=+0-60 is To obtain the partial correlations. Required to work out the partial correlations. but rapidly increases the amount. better be worked out at once and tabulated (col. in which the correla- was worked out between (1) the average earnings of agriand (2) the percentage of the population in receipt of Poor-law relief in a group of 38 rural districts. as 1. X^ = out-relief ratio. etc.r^)* occur in all the denominators . _ ^12 ~ ^13 -^28 best done systematically and the results collected many of the logarithms occur repeatedly. in the same districts. omitting the former entirely.may be calculated twice inde student. o-j.8 is. log log log rapidly done if the are The values found = 0-06146 = 1 "84584 0-3.1.2s/<''2.0-45 ail + 0-22 Wg (3) .' the values of log first-order coefiBcients (col. pendently by the formulae of the form = 0-1(1 BO as to -r?3>(l-»i. = 1-33917.2= : -0'*5 +0-85 : : log log log *i3..13- This transformation is a useful one and should be noted by the The values of each n. therefore. Jl .1= From log log log these and the logarithms of the r's = = 1 "64:993. 0'08116.i-i-2-18 x^ . ""ssi ^^^^^^ ^^^ standard-deviations of the first-order are not in themselves of much interest.8= 631.1. 631.r^ for the Having obtained the correlations we can now proceed to the If we wish to find all the regression-equations. may save needless arithmetic. 9). to calculate at once.i3 o-i23 0-2.r^ have been tabulated. 0-2. 632.12 = 0-34571 0-1. we shall have six regressions to calculate from equations of the form regressions.2 628. by replacing the standarddeviations of the first-order by those of the second. and the standard-deviations of the second-order are so.2 = 1-93024. *i2.)» is values of log check the arithmetic .g.g (2) x^= .j to the form We "12s ~ ''12. for reference in the calculation of standard- deviations of the second-order.240 THEORY OF STATISTICS of the denominators from those of the numerators we have the It is also as well logarithms of the correlations of the first-order. +2-18 That the regression-equations are (1) iBi= -l-21a.1 = T-36174. and transforming the above equation for Sjj. "12-3 ~ ''12-8 • "'iw^'is- These will involve all the six standard-deviations of the first order o-j.r3= -)-0-85 a.3 621. as being the standard-errors or root-mean-square errors of estimate made in using the regression-equations of the second-order. the work Jl .1 = 0-33891.13 = l'15 = 0'70 0-5.2-l-0-23a.3= -1-21 6.12 = 2-22 we have 6133= +0-23 6231= -t-0-22 632.8 • ''l.g.23 o"2. 612. g. 241 or. and the second regression-equation. and this is. IX. Chap. that out-relief may adversely affect Vae possibility of earning. 3.2= -hO'44) and the regression-equation (1). actually occur. The percentage of the population in receipt of relief. 1 per cent. however.0-45 Xj + 0-22 Xj Out-relief ratio Xl= -15-7 + 085 X^+2-18 X^ The units are throughout one shilling for the earnings Xj. It should be noticed. (1) (2) Earnings Pawperism Xj = + 19-0 - 1-21 X^ + 0-23 X^ (3) X^= + 9-55 . of the take a portion of the data from another investigation into the causation of pauperism. that described in the first illustration of Chapter X. indicate that in unions with a given percentage of the population in receipt of relief (X^ the earnings are highest where the proportion of out-relief is highest . The ratio of the niimbers given outdoor relief to the numbers relieved in tte workhouse. against the hypothesis of a tendency to lower wages.XII. The partial correlation coefficient (»'i3. —PARTIAL CORRELATION. The percentage of the population over 65 years of age. will Example work ii. 2. though very small (c/. viz. —-(Four variables. and 1 for the out-relief ratio Xj. by limiting the employment of the old.. The first and second regression-equations are those of most practical importance. e. however. that a negative ratio is clearly impossible. and the total coefficient (rj3=-0'13) between earnings (Xj) and out-relief (Xg). to which the student should refer for details. for the pauperism Xj. It remains possible. the argument might be advanced that the observed correlation (rjj = -f 0'60) between pauperism and outrelief was in part due to the negative correlation (»']3= -0'13) between earnings and out-relief. and consequently the relation cannot be strictly linear but the third equation gives possible (positive) average ratios for all the combinations of pauperism and earnings that . of course. Such a hypothesis would have little to support it in view of the smallness and doubtful significance of rj3. transferring the origins to zero. in the case of four variables. § 17). and is definitely contradicted by the positive partial correlation r^^^ = 4. does not seem inconsistent with such a hypothesis.0'69. The argument has heen advanced that the giving of out-relief tends to lower earnings. in so far. The variables are the ratios of the values in 1891 to the values in 1881 (taken as 100) of— 1.) As an illustration of the form we 16 . The third regression-equation shows that the proportion of out-relief is on the whole highest where earnings are highest and pauperism greatest. As regards pauperism. — 242 4. THEOEY OF STATISTICS. The pop^ilation itself, in the metropolitan constants follows : (means, group of 32 unions, and the fundamental standard-deviations and correlations) are as Table I. Xn. —PARTIAL CORRELATION. 243 Table II. 1. 244 sorrelations of the THEORY OF STATISTICS. first order (Table II. col. The first-order coefficients are then regrouped 4) are obtained. in sets of three, with the same secondary suffix (Table III. col. 1), and these are treated precisely in the same way as the coefficients of order be seen, the value of each coefficient two ways independently, and so the arithmetic is checked r-^^-u occurs in the first and fourth lines, for instance, rj324 in the second and seventh, and so on. Of course slight diiferences may occur in the last digit if a sufficient number of digits is not retained, and for this reason the intermediate work should be carried to a greater degree of accuracy than is necessary in the final result thus four places of decimals were retained throughout in the intermediate work of this example, and three in the final result. If he carries out an independent calculation, the student may differ slightly from the logarithms given in this and the following work, if more or fewer figures are retained. Having obtained the correlations, the regressions can be calculated from the third-order standard-deviations by equations of the form (as in the last example), zero. In this way, it will of the second order is arrived at in : ; "12-34 '1234^ "^2134 > BO the standard-deviations of lower orders need not be evaluated. Using equations of the form <^i.2,i = <ri(l - »^2)»(1 - '1a2)'(l - ^u.^y we find 1-35740 log <^i.234 log oTj 134 =1-50597 0-65773 log <r3.i24 log <^4.i23= 1-32914 = = o-i.23,=22-8 (r2i34 0-3.124 ^^4.1,3 = 32-l = 4-55 = 21-3 All the twelve regressions of the second order can be readily calculated, given these standard deviations and the correlations, but we may confine ourselves to the equation giving the changes pauperism (Xj) in terms of other variables as the most important. It will be found to be in iBj = 0-32.5a;2 + 1 -383x3 - 0-383a;4, and expressing the equation in transferring the origins percentage-ratios, or, terms of Zj = - 31-1 -I- 0-325X2 + 1-383Z3 - 0-383X4, XII. —PARTIAL CORRELATION. : 245 or, again, in terms of percentage-changes (ratio - 100) Percentage change in pauperism — , = + 1 '4 per cent. + 0'325 times the -fl"383 - 0'383 „ change „ in out-relief ratio. „ „ proportion of old. population. These results render the interpretation of the total coefficients, which might be equally consistent with several hypotheses, more clear and definite. The questions would arise, for instance, whether tlie correlation of changes in pauperism with changes in out-relief might not be due to correlation of the latter with the other factors introduced, and whether the negative correlation with changes in population might not be due solely to the correlation of the latter with changes in the proportion of old. As a matter of fact, the partial correlations of changes in pauperism with changes in out-relief and in proportion of old are slightly less than the total correlations, but the partial correlation with changes in population is numerically greater, the figures being ri2= -1-0-52 r,,= +0-il r-j,= »-i2 34=-l-0-46 ri3.2,= -l-0-28 -0-14 ri^,3=-0-36 So far, then, as we have taken the factors of the case into account, there appears to be a true correlation between changes pauperism and changes in out-relief, proportion of old, and population the latter serving, of course, as some index to changes in general prosperity. The relative influences of the three factors are indicated by the regression-equation above. [For the full discussion of the case cf. Jour. Boy. Stat. Soc, vol. Ixii., 1899.] 15. The correlation between pauperism and labourers' earnings exhibited by the figures of Example i. was illustrated by a diagram in — (fig. 40, p. 180), in which scales of "pauperism" and "earnings" were taken along two axes at right angles, and every observed pair of values was entered by marking the corresponding point with a small circle the diagram was completed by drawing in the lines of regression. In precisely the same way tlie correlation between three variables may be represented by a model showing the distribution of points in space ; for any set of observed values X,, Xj, X) may be regarded as determining a point in space, just as any pair of values Xj and Xj may be regarded as determining a point in a plane. Fig. 45 is drawn from such a model, constructed from the data of Example i. Four pieces of wood are fixed together : ; 246 like the THEORY OF STATISTICS. bottom and three Supposing the open sides of a box. a scale of pauperism is drawn vertically upwards along the left-hand angle at the back of the "box," the side to face the observer, Fig. 46. Model illustrating the Correlation between three Variables (1) Pauperism (percentage of the population in receipt of Poor-law relief) (2) Out-reiief ratio (numbers given relief in their homes to one in the Workhouse) (3) Average "Weekly Earnings of agricultural labourers, (data pp. 178 and 189). A, front view £, view of model tilted till the plane of regression for pauperism on the two remaining variables is seen : ; ; — as a straight line. ; : XII. — PARTIAL CORRELATION. 247 scale starting from zero, as very small values of pauperism occur a scale of out-relief ratio is taken along the angle between the back and bottom of the box, starting from zero atthe left finally, the scale of earnings is drawn out towards the observer along the angle between the left-hand side and the bottom, but as earnings lower than 12s. do not occur, the scale may start from 12s. at the Suitable scales are pauperism, 1 in. = 1 per cent. ; outcorner. relief ratio, 1 in. = 1 unit ; earnings, 1 in. = Is. ; and the inside measures of the model may then be 17 in. x 10 in. x 8 in. high, Given these three the dimensions of the model constructed. scales, any set of observed values determine a point within the " box." The earnings and out-relief ratio for some one union are noted first, and the corresponding point marked on the baseboard a steel wire is then inserted vertically in the base at this point and cut off at the height corresponding, on the scale chosen, to the pauperism in the same union, being finally capped with a The model small ball or knob to mark the "point" clearly. shows very well the general tendency of the pauperism to be the higher the lower the wages and the higher the out-relief, for the highest points lie towards the back and right-hand side of the model. If some representation of all three equations of regression were to be inserted in the model, the result would be rather confusing ; so the most important equation, viz. the second, giving the average rate of pauperism in terms of the other variables, may be chosen. This equation represents a plane the lines in which it cuts the right- and left-hand sides of the "box" should be marked, holes drilled at equal intervals on these lines on the opposite sides of the box (the holes facing each other), and threads stretched through these holes, thus outlining the plane as shown In the actual model the correlation-diagrams (like in the figure. fig. 40) corresponding to the three pairs of variables were drawn on the back sides and base they represent, of course, the elevations and plan of the points. The student possessing some skill in handicraft would find it worth while to make such a model for some case of interest to himself, and to study on it thoroughly the nature of the plane of regression, and the relations of the partial and total correlations. : : : : 16. If we write <^.23 I. = o"i(l - ^iixi . . . . «)) • • • (13) ajj is the correlation between be shown that R^^ nj and the expression on the right-hand side of the regressionequation, say 61,23 ....«) where it may *1.28.. . « = "12.34. ..n- ^2"'' °13M. • n • •''s''' ' "• + "l».2S. . . (n-I| • *» • (14) 248 For we have 2(a;i. 61.23 THEORY OF STATISTICS. „)=2ajj(a;j-a;i,23 .... „) = iV'(af - (r?.jj . . . „) and also 2(ef,j3 „) = 2(a;i - a;i.23 (aj ... x-^ „y = iV(cr| ei.23 0^.23 „) whence the correlation between - and ... „ is o-f.;3 ....„)* is the value of £^23 ....„> given by (13). The value of accordingly a useful datum as indicating how closely x-^^ can . x„, and be expressed in terms of a linear function of x^, x^ the valuer of the regressions may be regarded as determined by the condition that Ji shall be a maximum. Its value is essentially positive as the product-sum 2(xi.ei.23 ....„) is positive. maybe termed a coefficient of (7i-l)-fold (or double, triple, etc.) correlation ; for n variables there are n such correlations, but in the limiting case of two variables the two are identical. The value may be readily calculated, either from o-j.^s „ and o-j or directly from the equation i.e. . B . . B , . . . l-K^...«) = a-ri.){l It is obvious - rf3.2)(l -ri,.n) • • (1 -^In.,,...,„-:,)• (15) from this equation that since every bracket on the right is not greater than unity, Hence Bi,^ ....„) cannot be numerically less than r-^^. same reason, rewriting (15) in every possible form, For the ^1,.^ „, cannot be numerically less than rj^, r^,, .... r^^ i.e. any one of the possible constituent coefficients of order zero. Further, for similar reasons, B^^ ....„) cannot be numerically less than any possible constituent coefficient of any higher order. That is to say, B^^s the greatest „| is not numerically less than of all the possible constituent coefficients, and ie usually, though not always, markedly greater. Thus in Example i., B^^^) (the coefficient of double correlation between pauperism on the one hand, out-relief and labourers' earnings on the other) is 0'8.39, and the numerically greatest of the possible constituent coefficients is rj2g=-0'73. Again, in Example ii., ^11.^3,) is 0-626, and the numerically greatest of the possible constituent . . . . coefficients is »'i2.4= -fO'573. The student should notice that is necessarily positive. Further, even if all the variables Xj, X^, X„ were strictly uncorrelated in the original universe as a whole, we should expect ''121 ''13.21 ''1428' '^^''i ^'^ exhibit values (whether positive or negatived . . . B . XII. — PARTIAL CORRELATION. 249 differing from zero in a limited sample. Hence, B will , not on an average of such samples, to be zero, but will fluctuate round some mean value. This mean value will tend, be the greater the smaller the number of observations in the sample, and also the greater the number of variables. AVTien only a small number of observations are available it is, accordingly, little use to deal with a large number of variables. As a limiting case, it is evident that if we deal with n variables and possess only n observations, all the partial correlations of the highest possible order will be unity. 17. It is obvious that as equations (11) and (12) enable us to express regressions and correlations of higher orders in terms of those of lower orders, we must similarly be able to express the coefficients of lower in terms of those of higher orders. Such expressions are sometimes useful for theoretical work. Using the same method of expansion as in previous cases, we have = 2(a;i 23 ...... x^,si .... ,„_i,) The following table gives the limits of the third coefficient in a few special cases.Aa..»-?2 .:..J + »23. if j-j^j must lie between the limits -na3n3.3 + »13..1 + 2n2.. this gives as limits for rgg '•12^3 ± n/1 its . (i-r?„.»^3. 18. we take r-^^. but we propose to treat them only briefly here. These inequalities correspond precisely with those "conditions of consistence" between class-frequencies with which we dealt in Chapter II. + ri. simplest form for r^^ in Similarly writing (17) in ''i2-8> *'i3-2> terms of ^'^^ ''23.— 250 which is THEORY OF similarly the equation > STATISTICS. we must have r?2.sr-i3.. (?i.i < 1 and r^^^ are given.3<l or (^12 ^13 ''23) (l-rj3)(l-rl3) that is.1) added throughout.2ri3r-23 <1 If (18) if the three r's are consistent with each other.rls + rj^rfs. Equations (12) and (17) imply that certain limiting inequalities must hold between the correlation-coefficients in the expression on the right in each case in order that real values (values between ±1) may be obtained for the correlationcoefficient on the left.2r-23. + rfs + r^ - 2?-.2 + 'ii.3.2± Jl - r^as . »"?2 <1.)'(i-^U* with the secondary suffixes 34 . r^^-^ (19) and therefore.1) ^s must have »f2. Writing (12) in its simplest form for r^2. for the three coefficients of zero order and of the first order respectively : Value of . rjj as known. 19. Finally. permits no inference of any kind as to the correlation between 2 and 3. and this may lead to still more serious errors of interpretation. . of the first order and value unity are only consistent if one only. The general nature of such fallacies is the same as for the case of attributes. We do not think it necessary to add to this chapter a detailed discussion of the nature of fallacies on which the theory of multiple correlation throws much light. conversely. if of opposite sign.\ V(l -ris. We may thus easily misinterpret a coefficient of correlation which is zero. The values of the two given r's need to be very high if even the sign of the third can be inferred .^ may be. the set of three coefficients not -1. IV. of opposite sign to r-^^. +1. It suffices to point out the principal sources of fallacy which are suggested at once by the^ form of the partial correlation .1. -1.i){i--'rh. but On the other hand.1. Vy^.1. .1.1. ''•' (a) x/(l-rl3)(l-r?3) and from the form of the corresponding expression for r-^^ in terms of the partial coefficients „ _ ^12 - 3 + ^18-2 ^ 28-1 IJ. . Thus the quantity of a crop might appear to be unaifected. by the amount of rainfall during some period preceding harvest if From rj2 this might be due merely to a correlation between rain and low temperature.i) the form of the numerator of (a) it is evident (1) that even be zero. positive. or . we see that. —PAKTIAL COKKELATION.. indeed often is. which may lie anywhere between + 1 and . r^j will not be zero even though r-^^. §§ 1-8. the partial correlation between crop and rainfall being positive. unless either r-jj^ or r^gi is zero. . 251 The student should notice that the set of three coefficients of order zero and value unity are only consistent if either one only. .. +1. and important. or all three.: XII. or all three. + 1 . 1 and 3 are uncorrelated.e. If rjj and r^^ are of the same sign the partial correlation will be negative . +1. -1.. on the other hand. This corresponds to the theorem .1 and . (2) r-^^. i. or both. are zero. say. are positive. are negative the only consistent sets are +1. it may be noted that no two values for the known coefficients ever permit an inference of the value zero for the third the fact that 1 and 2. if the : two are equal. + 1.1.^ is zero. and was discussed fuUy in Chap. . From the form of the numerator of (J). pair and pair. they must be at least equal to \/0'5 or "707 .^ will not be zero unless either r^j or rgg. g and x^..s is the correlation between a. The process for deterChap... 3 and that we might determine the value of this partial correlation by drawing up the actual correlation table for the two Suppose. p. Stat. however. Stat. Heredity. (3) YULB. Jioy.g would class-intervals of its range. S aud 4. not be the same (or approximately the same) for all such tables.B should be regarded. vol. Series A. Greenwood. 182. 8. U. 1892. Study of Index-Correlations. W. 197. thorough and complete." Proc. X. I. "On the Theory of Correlation for any number of Variables treated by a New System of Notation. Hoy.. Ixxix. (1) (2) Edgkworth. of Chap. 1914. 891-4U. Soc. 194. The preceding chapter is written from the standpoint of refs. Ixix. 45-6). (4) Yum. § 20) might be useful. and Panmixia. 263. . it will be remembered. 7 and the theory more fully developed in ref. TJ. p. Tlie tlieory of correlation for several variables was developed by Edgeworth and Pearson (rets. vol. mining partial associations (cf. 1897. G. 1914.. vol. that instead of drawing residuals in question. Soc. Roy. IV. say.. "On the Theory of Correlation.). JI. Soc. "Note on Estimating the Kelative Influence of Pbaksoh. Stat. Traits. pp. of Chap. Chap. in the case of Skew Correlation.. Yule. (6) Yule. E." Jour. is illustrated by Example i." Froc. pp. in general. Hence r-^2.jj associated with values of x^ lying within successive In general the value of rj^... Soe. Ix. H.. Mag. We and ajj." Biometrika. IV. up a single table we drew up a series of tables for values of a. It is actually employed in the paper cited in ref.j... F. but would exhibit some systematic change as the value of a. 1897. and Fkanokb Wood. Series A.. clxxxvii. IV. (6) Hooker. " On the Significance of Bravais' Formulro for Eeeression....) is. Pb. 20. Ix. U. p. though exceedingly laborious. etc. vol. Two Variables upon a Third. 1906. Theory.252 THEORY OF STATISTICS. Hoi/. XVI." Phil. "On Correlated Averages. 1907. vol. have seen (§ 9) that j-jj. and with the notation and method of ref. Theoretical.) are probably exceptional.^ for everi/ value of x^ (cf. XVI. p.. TJ. L.. as we always obtain the actual tables exhibiting the association between.. vol. 1896. and indicates a source of fallacies similar to those there discussed.. p.) (8) ISSEKLIS. Jour." Jour. xxxiv. Moy. G. 1 and 2)from the standpoint of the "normal" distribation of frequency (c/. Soc. : : REFERENCES. (The partial or "solid correlation-ratio is used. 6th Series. For the general case an extension of the method of the " correlation-ratio " (Chap. "Oa the Partial Correlation-Ratio. Chap. vol. Kakl.. Soc. vol. Hoy. § 6. "A . It might sometimes serve as a useful check on (pp. partial-correlation work to reclassify the observations by the fundamental methods of that chapter.^ g and ^j. J. 317-46. X. nature of an average correlation the cases in which it measures the correlation between ajj. A and JB in the universe that these two associations may of C's and the universe of y's differ materially. Ixxvii. p. (7) Bbown. and G. 812. as of the increased." PAt/. 477. G. " Eegression. Y. 5. . XI. what are the values of the partial correlations of successive orders ? Under the same condition. by working out the partial . Chap. correlations. C. p. Stat. 1899. ^3= accumulated temperature above 42° England during 20 ffi F. G.j and %i ? 5.Zi-)-&. Ray. Check the answer to Qu. Soo.a certain district of ifi years. and X3=o. (The following figures must be taken as an illustration only the data on which they were based do not refer to uniform times or areas. floOKEB. Stat.. " The Application of the Method of Multiple Correlation to the Estimation of Post-censal Populations." Jmir. —PARTIAL COKEBLATION.. say = r. If all the correlations of order zero are equal. vol. (11) Show. H. in. The following means. XL.." Jour. a^.! ) XII.= +0-80 r]3=-0-40 = 28-02 M2= i-91 Ms=59i = i-i2 (r2=l-10 0-5=85 ?-23=-0-66 Find the partial correlations and the regression-equation for hay-crop on spring rainfall and accumulated temperature. 2. 576. Ixii. 6. 249. figures below for 30 urban areas in England and Wales. Xj. p. Soy.) found for 1. (10) yoLE. vol. and correlations are Jfj = seed-hay crops in cwts.. in spring.Xj=deaths of infants under mortality). : . R. 1911. (Ref. 1907. " An Investigation into tlie Causes of Changes in Pauperism in England. Ixriv.000. Soc. standard-deviations. ri5.. and Xg." Jam. Write down from inspection the values of the partial correlations for the three variables Zi. vol. 1. 10. 1 year per 1000 births in same year (infantile ^2= proportion per thousand of married women occupied for gain.Xj. 253 Illustrative ApplicationB of (9) . Slat. EXERCISES.. U. Chap. Sac. Boy. 9. "The Correlation of the Weather and the Ctovs. What is the correlation between ajj. what must the partial correlations Check the answer to Qu. X2= spring rainfall in inches. J/] = 164 jl/2=168 j)/g = 143 <ri= 20-0 <ra= 74-9 <r3= 22-4 Ci2=-f0-49 /i3= -I-0-78 ?-i4=-|-0-20 /•23=-l-0-15 r24=-0-37 /•34= -l-0'23 Mi=2Q5 o-4=130-0 3. find the partial correlations and the regression-equation for infantile mortality on the Taking the other factors. per acre. correlations.Xj= death -rate of persons over 5 years of age per 10. Ixx. by working out the partial If the relation holds for bel all sets of values of ajj. 7. etc. what is the limiting value of r if all the equal correlations are negative and n variables have been observed 4. ^4= proportion per thousand of population living 2 or more to a room (overcrowding). Economic Interest.. p. E. 1. when the chance of success or failure is very small 13. Standard-deviation of number — — — — — simple sampling when the numbers of observations in the samples vary 12. or its reciprocal as a measure of precision 7. or standard error.—THEORY OF SAMPLING. measures of dispersion and so forth cannot in general be assumed to indicate the action of definite and assign- Small differences may easily arise from indefinite and highly complex causation such as determines the fluctuating proportions of heads and tails in tossing a coin. The two chief divisions of the theory sampling 3. XIII. Determination of the mean and standard-deviation of the — — — of successes in n events 6. — — 1. The problem of of the present Part 2. ajid relation between mean and standard -deviation. or of cards bearing measurements within some given class-interval in drawing cards. Definition of the chance of success or failure of a given event— 5. Limitation of the discussion to the case of simple sampling 4. 264 . The same for the proportion of successes in n events : the standard-deviation of simple sampling as a measure of unreliability. for checking and controlling the interpretation of statistical results. Use of the standarddeviation of simple sampling. we may have noted 56 heads and only 44 tails.: PART III. CHAPTER SIMPLE SAMPLING OF ATTRIBUTES. Verification of the theoretical results by experiment 8. if on measuring the statures of 1000 men in each of two nations we find that the mean stature is sUghtly greater for able causes. In 100 throws of a coin. Biological cases to which the theory is directly applicable 11. On several occasions in the preceding chapters it has been pointed out that small differences between statistical measures like percentages. Approximate value of the standard-deviation of simple sampling. of black balls in drawing samples from a bag containing a mixture of black and white balls. but we cannot conclude that the coin is biassed on repeating our throws we may get only 48 heads and 52 tails. for example. More detailed discussion of the assumptions on which the formula for the standarddeviation of simple sampling is based 9-10. averages. Similarly. say. from an anthropometric record. The result of drawing a second ball is therefore is The theory of great : : .g. It doesnot hold good. and is unaffected by.. If associated measures of are recorded on each card. e. we only classify the balls drawn as black or as These cases correspond to the theory of attributes. —SIMPLE SAMPLING OF ATTRIBUTES. In tossing a coin we only classify the results of the tosses as heads or tails . for the tossing of a coin or the throwing of a die the result of any one throw or toss does not affect. and these will vary in a similar manner. the remainder may be either 2 black and 3 white. In the present and the three following chapters the theory of sampling attributes alone. If. the results of the preceding and following tosses. 255 nation A than for nation B. the greater part of the theory. Finally. This condition of independence holds good. in drawing balls from a mixture of black and white balls.. 2.4's and a's. but also from the theoretical forms of frequencydistribution to which it leads. or 2 white and 3 black. we can also form two variables and correlation-coefficients for the different batches. These cases correspond to the theory of variables. is dealt with for the case of importance and interest. and white. correlation-coefficients. on the other hand. according as the first ball was black or white. on the other hand. we put in a bag a number of cards bearing different and draw sample batches of cards. in successive samples. The theory of such fluctuations may be termed the theory of sampling. 3. etc. the number or proportion of A's in successive samples being observed. measures of dispersion. The theory of sampling attains its greatest simplicity if every observation contributed to the sample may be regarded as independent of every other. lying somewhat outside the limits of this work. for the drawing of balls from a bag if a ball be drawn from a bag containing 3 black and 3 white balls. not only from its applications to the checking and control of statistical results. owing to its difficulty. one or two of the more important cases of the theory of sampling for variables are briefly treated. we values of some variable can form averages and measures of dispersion for the successive batches. in Chapter XVII.XIII. and these averages and measures of dispersion will vary slightly from one batch to another. and there are two chief sections of the theory corresponding to the theory of attributes and the theory of variables respectively. we cannot necessarily conclude that the real mean stature is greater in the case of nation A possibly if the observations were repeated on diiFerent samples of 1000 men the ratio might be reversed. and it is the function of the theory of sampling for such cases to inform us as to the fluctuations to be expected in the : X X T averages. the general case may be represented as the drawing of a sample from a universe containing both . 4. the latter pjf. refer to an event the chance of success of which is p and the chance oi failure q. using terms which have become conventional. homogeneous circular disc. of which approximately 23^71 will be successes. if we may regard the ideal die as a perfect homogeneous cube. and the cliance of throwing six (or any other face) with a die is 1/6. in any long series of throws. If we may regard an ideal coin as a uniform. Similarly. it will tend. or with any given face uppermost one-sixth of the whole number of times. Suppose we take Jf samples with n events in each. we theoretical study and experimental verification. Successes |. consider first the single event (ra = I). as in the case of an arithmetical example : Frequency/. f^. we may expect. — pF pN pN F — pN . there is nothing which can make it tend to fall more often on the one side than on the other . The disturbance can only be eliminated by drawing from a bag containing a number of balls that is infinitely large compared with the total number drawn. or with. times in If trials. and the mean number of successes in a sample will therefore tend towards pn.n events. These results are sometimes expressed by saying that the chance of throwing heads (or tails) with a coin is 1/2. for there are J!f. dependent on the result of drawing the first. we shall in future. as in coin-tossing or dicethrowing the simplest cases of an artificial kind suitable for For brevity. may refer to such cases of sampling as simple sampling the — : implied conditions are discussed more fully in § 8 below. In this chapter our attention will be confined to the case of independent sampling. Take this frequency-distribution and work out the standard-deviation of the number of successes for the single event. say. 5. or by returning each ball to the bag before drawing the next. The single event may give either no successes or one success. therefore. To avoid speaking of such particular instances as coins or dice. ?# pif 1 — /f. to fall with each of its six faces uppermost an approximately equal number of times. What will be the values towards which the mean and standard-deviation of the number of successes in a sample will tend ? The mean is given at once. and will tend to give the former qlf. Obviously p + g' = 1.— 256 THEORY OF STATISTICS. As regards the standarddeviation. heads uppermost approximately half the times. that in any long series of throws the coin will fall with either face uppermost an approximately equal number of times. § 2. The student should particularly bear in mind that the standard-deviation of the number of successes. o-„ being the standarddeviation of the number of successes in n events. if we regard the observed proportion in any one sample as a more or less unreliable determination of the true proportion in a very large sample from the same material. • . SAMPLING OF ATTRIBUTES. and 257 We have therefore a\=p-p^=pq. of successes in But the number sum of successes for the single events of <ji = npq (1) This is an equation of fundamental importance in the theory of sampling. and so forth have been carried out by various persons in order to obtain exof the proportion of successes in — 17 . reciprocal of the standard-deviation (l/«). all the events being independent. and the standard-deviation of the proportion of successes s„ be given by — ^. The although the true proportion is the same throughout.=a>/n^=pq/n The standard-deviation .). and consequently the reliability or precision of am.XIII. —SIMPLE M=p. We return to this point again below (§ 8 and Chap. XIV. the greater the fluctuations of the observed proportion. observed proportion varies as the square root of the number of This is again a very important observations on which it is based. but as the square root of n. equation (2)). l/«th of the number in each sample. . 7. rule with many practical applications. in a group of n events varies. and. due to fluctuations of simple sampling alone.e. a group of n such events is the which it is composed. As this would amount to merely dividing all the figures of the original or rather the value record by n. (2) samples of such independent events varies therefore inversely as the square Now root of the number on which the proportion is calculated. we have therefore. and the exact conditions from which it has been deduced. not directly as n. by the usual rule for the standard-deviation of the sum of independent variables (Chap. but the limitations of the case to which it applies. In lieu of recording the absolute number of successes in each sample of n events. Experiments in coin tossing. as it is sometimes termed. dice throwing. we might have recorded the proportion of such successes. on the other hand. XI. should be borne in mind. the standard-deviation of sampling may fairly be taken as a measure of the the greater the standardwnreliahility of the determination deviation. 6. precision. i. the mean proportion of successes towards which the mean tends to approach— must be p. or. may be regarded as a measure of reliability. therefore p = g = 0'5. Totals of the columns in the table there given. Weldon. to remark that if ordinary commercial dice are to be used for the trials. 5. p. is to roll them down an inclined gutter of corrugated paper. Cheap dice are generally very much out of truth. xxii. 11th edn. and the marks not out very deeply. vol. (1) (W. we believe. Y. A convenient mode of throwing a number of dice. Successes. The following will serve as illustrations. suggested. cited by Professor F. in order to It may be as well acquire confidence in the use of the theory.. Brit. E.) Twelve dice were thrown 4096 times . Encycl. but the student is strongly recommended to carry out a few series of such experiments personally.. theoretical value of the standard-deviation o- M= 1-732. by the late Professor Weldon. care should be taken to see that they are fairly true cubes. F.258 THEORY OF STATISTICS. so that they roll across the corrugations. Edgeworth. 394. or 6 points reckoned a success. a throw of 4. . perimental verification of these results. Theoretical mean 6 . and if the marks are deeply cut the balance of the die may be sensibly affected. 0-816. 5 = 2/3.— XIII. : . Theoretical mean 1.= 1-296. and not to be always expected. Yule. The following may be taken as an illustra(3) (G. standard-deviation = 0-1667. /i = l/3.) Three dice were tion based on a smaller number of observations. 259 Actual proportion o. U. and the numbers of 5's or 6's noted at each throw. —SIMPLE SAMPLING OF ATTRIBUTES. agreeing with the theoretical value to the fourth place of decimals. thrown 648 times. Of course such very close agreement is accidental. Frequency-distribution observed Successes. Standard- Mean M= 2000. of successes 2-00/12 deviation. To revert to the case of deathrates. For if each sample incladed persons of both sexes and different ages. " six " with the dice was the same throughout we did not commence an experiment with dice loaded in one way and later on take a fresh set of dice loaded in another way. Consequently if formula (2) is to hold good in our practical case of sampling there must not be a : i. so that the chance throwing " heads " with the coins or.— 260 identically similar of THEORY OF STATISTICS. each sample only contained persons of one sex and one age. but. say. but also for every individual in every sample. the condition would be broken. must any essential change have taken place during the period over which the observations are spread. where we have more knowledge. we have also tacitly assumed not only that we were using the same set of coins or dice throughout. Consequently. if the observations have been made at different epochs. nor to death-rates in successive years during a period of conIn all such cases variations tinuously improving sanitation. so that the chances p and q were the same at every trial. so that the chances p and q were the same for every coin or die. again a very marked limitation. it may. be difficult or impossible to say what differences or changes are to be regarded as essential. Thus it is obvious that the theory of simple sampling cannot apply to the variations of the death-rate in localities with populations of different age and sex compositions. the chance of death during a given period not being the same for the two sexes. the conditions that regulate the appearance of the character observed must not only be the same for every This is sample. difference in any essential respect affect the proportion — . in any character that can observed between the localities from which the observations are drawn. nor for the young and the old. even if these samples w^ere all of the same age and sex composition. but also that all the coins and dice in the set used were identically similar. (6) In the second place. nor to death-rates in a mixture of healthj' and unhealthy districts. unless. Where the causation of the character observed is more or less unknown. our formulae deduced. nor. the condition laid down enables us to exclude certain cases at once from the possible applications of formula (1) or (2). formulae (1) and (2) would not apply to the numbers of persons dying in a series of samples of 1000 persons. further. if our formulae are to apply in the practical case of sampling.e. and living under the same sanitary conditions. throughout the experiment. of course. due to definite causes are superposed on the fluctuations of sampling. if we were observing hair-colours. The groups would not be homogeneous in the sense required by the conditions from which our formulae have been Similarly. but with some fluctuations closer and closer to some limiting value. 261 would not apply if the samples were compounded by always taking one person from district A. The third condition was explicitly stated (c) The individual "events. all the samples and all the individual contributions to each sample being taken under precisely the same conditions. we will find th&tp approaches not continuously. sample and observe in it the actual proportion of. It is this limiting value which is to be used in our formulse the value of p that would be observed in a very large sample. The standard-deviation of the number of sixes thrown with n dice. like the drawings of balls from a bag containing a number of balls that is very large compared with the number drawn. even if the dice be out of truth or loaded so that p is no longer : — — — the standard-deviation of the number of black samples of n drawn from an infinitely large mixture of black and white balls in equal proportions may be Jnpq even 1/6. speak of simple sampling in the following pages. our formulae would riot apply even if the sample populations were composed of persons of one age and one sex." or appearances of the character observed. and consequently it has been necessary to emphasise them specially. It may be as well expressly to note that we need not make any assumption as to the conditions that determine term is When we p If we draw a unless we have to estimate Jnpq a priori. these districts not being similar as regards the distribution of hair-colour. Reverting to the illustration of a death-rate. balls in . less erratic variations. another from district B. and (c). and so on.: XIII. must be completely independent of one another. and hence of dying from the disease. for example. Similarly.(/. on this understanding. e. and the individual " events " or appearances of the character being quite independent. A's draw another sample under precisely the same conditions. like the throws of a die. and so on. the intended to imply the fulfilment of all the conditions (a). or sensibly so. if we were dealing. may be Jnpq. railway accidents due to derailment. and consequently the annual returns show large and more or . The above conditions were only tacitly assumed in our previous work. —SIMPLE SAMPLING OF ATTRIBUTES. with deaths from an infectious or contagious disease. and explosions in mines if such an accident is fatal to one person it is probably fatal to others also. and observe the proportion of A's in the two samples together add to these a third sample. For if one person in a certain sample has contracted the disease in question. . (b). The same thing holds good for certain classes of deaths from accident. say. he has increased the possibility of others doing so. reason. degree of approximation in certain biological cases. XIV. >Jnpq. the larger of the two in every case but one. if we note the number of successive periods. is where p is the chance the standard-deviation We are not able to assign an of the proportion of male births. (p. Chap. the standard-deviation 9. however. as a rule. but this is not a necessary inference from the mere applicability of the formulse (c/. of Chap. that in these cases all the necessary conditions are fulfilled. The corresponding standard-deviations for Table VI. but it does apply fairly closely to the sex-ratios of births in different localities. considering the small numbers of observations. however. In the case of the sex-ratio at birth. § 15). p say. of Chap. standard-deviations of simple sampling corresponding to the midnumbers of births. for a few isolated groups of districts conIn both tables the taining not less than 30 to 40 districts each. portion of male births. and it will be seen that the two agree. notably in the proportions of offspring of different types obtained on crossing hybrids. and still more closely to the ratios in one locality during That is to say. for some {Of. slightly in excess of the theoretical values. it seems doubtful whether the rule applies to the frequency of the sexes in individual families of given numbers (ref. drop in dispersion as we pass from the small to the large districts The actual standard-deviations. to hold to a high modification. IX. but it is quite sufficiently accurate for practical purposes to use the proportion of male births actually observed if that proportion be based on a moderately large number of observations. are given in Qu. to the proportions of the two sexes at birth. based on the same data. In Table VI.) field evident that these conditions very much limit the an economic or sociological character to which formulse (1) and (2) can apply without considerable The formulse appear. and the is extremely striking. and show the same general agreement with the standard-deviations of simple sampling. and. IX. STATISTICS. and not 1/2 owing to the black balls. 10. accordingly. is §4. with surprising closeness. is approximately otherwise. male birth or. however. Chap. It of practical cases of of that of a number . ijpq/n . 9). a priori value to the chance p as in the case of dice-throwing. tending to slip through our fingers. with some limitations. are given at the foot of the table. 163) was given a correlationtable between the total numbers of births in the registrationdistricts of England and Wales during the decade 1881-90 and the proThe table below gives some similar figures. on the whole. The actual standard-deviation is.262 if THEORY OF is. again. the actual standard-deviations are. It is possible. XIV. 7 at the end of this chapter. males in a series of groups of n births each. 1/3. [Data based on Decennial Supplement to Fifty -fifth Annual Report of the Registrar-General for England and fVales.XIII.'] . for Groups of Districts with the Numbers of Births in the Decade lying between Certain Limits. —SIMPLE SAMPLING OF ATTKIBUTES. 263 Table showing Frequencies of Registration Districts in England and Wales with Different Ratios of Male to Total Births during the Decade 1881-90. /^ samples containing n^ individuals or observations each. in the seoond pq/n^. the mid-number of births 2000. as the means tend to the same values in all the groups. and assuming that it would be sufficiently accurate' for practical purposes to treat all the districts in one class as if the sex-ratios had been based on the mid-numbers of births. such a process does well enough. the harmonic mean number of observations in a sample must be substituted for n in equation (2)... where the number of observations varies from one sample to another. ^ iz /i TOj A Mj h ™3 and accordingly ^=f Thus the following percentages (taken (3) That is to say. we must have for the whole series 11. These values are given by simply substituting Thus for the proportions per 1000 for^ and q in the formula.. and some other procedure must be adopted.. in both cases the standard-deviagiven are standard-deviations of the proportion of male births ^er 1000 of all births.S^=pq(ii + ^ + '^+ of n-^ .o ' V 2000 / In the above illustration the difficulty due to the wide number of births n in different districts has been surmounted by grouping these districts in limited class intervals. then. But if the number of observations does not exceed. grouping is obviously out of the question. n. and so on What would be the standard-deviation of the observed proportions in these samples? Evidently the square of the standard-deviation in the first group would he pq/n-^. the first column of Table I. the proportion of males is 508 per 1000 births. and therefore The student should note that tions _/5 08x492 Y_. though it is not very good. 1000 times the values given by equation (2). that is. Given a sufficiently large number of observations. . to the nearest unit) of .. variation in the : : JV. /g containing n^. that a series of samples have been taken from the same material. 50 or 60 altogether. and so on therefore. .. /j containing n^. Suppose. perhaps. ).— — 264 THEORY OF STATISTICS. But if ff be the harmonic mean w. — XIII. crossed inter se (A. D. 265 albinos were obtained in 131 litters from hybrids of Japanese waltzing mice by albinos. Darbishire. . iii. p. Biometrika. 30) oentage. —SIMPLE : SAMPLING OF ATTRIBUTES. 266 THEORY OF of STATISTICS. The frequency-distributioa corps per the number of deaths per army annum was aths. . could not occur as a fluctuation of simple perhaps indicate a slight bias in the dice. The deviation observed sampling. Can the divergence from the exact theoretical result have ? arisen owing to errors of sampling only The numerical difference from the expected result is 23.152 Is this excess throws altogether). This proportion is 0'5116 instead may The problem might. The excess is 569 throws.) Certain crosses of Fisum sativum gave 5321 yellow and 1804 green seeds. and the difference observed bears the error as before. standard error of the proportion is of The ' = x/ix|x 49^52 =0'0*^226. —SIMPLE SAMPLING OF ATTRIBUTES. same ratio to the standard Example ii. The standard error is (r= — V0-25X 0-75x7125 = 36-8. p. or 1781. 25. of green seeds. have been ?. and may be called accordingly the standard error. (Data from the Second Report of the Evolution Committee of the Royal Society. It is 5'1 times the standard error. It is desired to know whether the deviation of a certain observed number or proportion from an expected theoretical value is possibly due to errors of sampling. The expectation is 25 per cent.576 expected (out of 49.XIII. of course. i. 5's. Three principal cases of comparison may be distinguished. — Example 5. for the number of observations contained in the sample. Case I. difference in excess O'OllG.ttacked equally well from the standpoint of the proportion in lieu of the absolute number of 4's. practically speaking. or 6's thrown. as of course it must. In this case the observed difference is to be compared with the standard error of the theoretical number or proportion. and. 1905. . or 6 were made in lieu of the 24. possibly due to mere fluctuations of sampling 1 The standard error is 0-= >yjx|x49152 = 110-9. the theoretical 0'5000.145 throws of a 4. 267 " standard-deviation of simple sampling " may be regarded as a measure of the magnitude of such errors. Henqe the divergence from theory is only some 3/5 of the standard error. — In the first illustration of § 7. and may very well have arisen owing simply to fluctuations of sampling. 72. \." for it is not a theoretical number given a priori as in the above illustrations. instead of the theoretical 0'25. in other samples taken in precisely the same way 1 This case corresponds to the testing of an association which is indicated by a comparison of the proportion of A's amongst B's and . then 4=Po<lo/ni. as before. by the (weighted) mean proportion in our two samples together. owing to fluctuations of sampling. the two universes being really — similar as regards the proportion of A's therein 1 (6) If the difference indicated were a real one. (a) Can the difference between the two proportions have arisen merely as a fluctuation of simple sampling. we have 0'2532 s= V0-25x 0-75/7125 = 0-0051. not the standard error for differences of {AB) from (A){B)/]V. the numbers of observations in thesamples being «j and n. If the . Two samples from distinct materials or different universes give proportions of A's p^ and p^. i. It should be noted that this method must not be used as a test of association by comparing the difference of (AB) from {A){B)lN with a standard error calculated from the latter value as a "theoretical number. the proportion of A's being really the same in both cases. whether the observed difference between p^ and P2 may not have arisen solely as a fluctuation of simple sampling. is only some . (A) and (B) being the numbers of heads thrown in the case of the first and the second coin respectively. and {A) and {B) are themselves If we formed an association-table liable to errors of sampling. Case II. and given.3/5 of the standard error.8's (a) We have no theoretical expectation in this case as to the proportion of A's in the universe from which either sample has been taken. Working from the observed proportion of green seeds. might it vanish. Let t. 4=PoQo/''k- samples are simple samples in the sense of the previous work. however. Let us find. and similarly the divergence from theory between the results of tossing two coins iV times. by (the best guide that we have).e.268 THEOEY OF STATISTICS. Cg ^^ ^^^ standard errors in the two samples. then the mean difference between pj and P2 will be zero. o-= JN.^ would be the standard error for the divergence of {AB) from the a priori value m/4. let us say. viz.^ respectively. . are not the same in the material from which the two samples are drawn. as a rule. and in that case either formula will place the difference outside the range of fluctuations of sampling. The justification of this usage we indicate briefly later (Chap. Below Average. (5) the observed difference is less than some three times i-^^ it arisen as a fluctuation of simple sampling only. (6) If the difference between p^ and p^ does not exceed some three times this value 6i €j2i it may be obliterated by an error of simple sampling on taking fresh samples in the same way from the same material. for they do not differ largely unless jOj and p^ differ largely.and self-fertilisation respectively. it is hardly possible that the true value of the difference can be zero. The difference between the values of ej2 given by (5) and (6) is indeed. {b) If. § 3). 17 17 12 22 cross- The figures indicate an association between tallness and fertilisation of parentage. the student should note that the value of Cj^ given by equation (6) is frequently employed. or may some have arisen solely as an " error of . Further. tjg. —SIMPLE 269 and the standard error of the difFereuce independent. but p-^ and ^2 are the true values of the proportions. of more theoretical than practical importance. will be given by the samples being 4=Fo?o(i+i) If .. 3 of Chap. — SAMPLING OF ATTRIBUTES. equation (6) gives approximately the standard-deviation of the true values of the difference for a given observed value. on the other hand. Height Above Average. Below Average. the standard errors of sampling in the two cases are may have «?=i'i!Zi/% ei=^2?2/"2 and consequently 4=M>+M^ . Height Above Average. . III. if the observed difference is greater or less than some three times the value of cjj given by (6). for testing the significance of an observed difference. for plants oi Lohelia fulgens obtained by cross. Parentage Self-fertilised. Example iii. Is this association significant of it real difference.. in lieu of that given by equation (5). Here it is sufficient to state that. . XIV. The following data were given in Qu. if n be large.—— XIII. . the proportions of A'e. and hence. — Parentage Cross-fertilised. — only slightly in excess of the standard error of the difference. and 35 per cent. Examvple iv. .. Medium. . by equation (6). vol.529 9.. Per cent. xxxvii. The proportion of plants above average height in the and self-fertilised) together is 29/68.764 Can the difference observed in the percentage of girls of medium hair-colour have arisen solely through fluctuations of sampling ? In the two towns together the percentage of girls with medium is 43'5 per cent. Memoir on the Pigmentation Survey of Scotland. Gray. but rather more marked. The standard-deviation of the differences due to simple sampling between the proportions of " tall " plants in two samples of 34 (cross- observations each is therefore /29 39 2V oTon The actual proportions observed are 50 per or 12'0 per cent. = (43-5x56-5)^x(^-^3H-^y = 0'56 per cent. . If 50 per cent. (^ 34 fluctuations of sampling. 4.— — 270 sampling " 1 two classes THEORY OF STATISTICS. As this difference cent. so long as experi- and consequently the actual difference might not infrequently be completely masked by ments were only conducted on the same small scale. definite significance could be attached to it The student will notice. .. the standard error of sampling for the difference between percentages observed in samples of the above sizes would be hair-colour . Total observed. 41-1 44-1 Edinburgh Glasgow . for samples of 34 observations drawn from identical material.743 39. or perhaps the real difference may be greater and may be partially masked by a fluctuation of sampling. (Data from J. were the true proportions in the two classes. of the Royal Anthropological Institute. 1907.008 17. difference 15 per cent. observed may be a real one. is — ^i2 = /50 X 50 35 X 65\* „ „ + 34 J =11 "9 percent. If this were the true percentage. however. the standard error of the percentage difference would be. and 35 per cent. no if it stood alone. Jowr. that all the other cases cited from Darwin in the question referred to show an association of Hence the difference the same sign.) The following are extracted from the tables relating to hair-colour of girls at Edinburgh and Glasgow — : Of Medium Hair-eolour. viz. treatment is similar to that of Case II. and could not have arisen through the chances of simple sampling. If Eji be the standard error of the difference between p^ and Pq. Case III. since errors in p^ and pj ^^^ uncorrelated.. But. — Two different universes. or over 5 times this. some three times this value of £„i. we arrive at the same value. 271 difference is 3'0 per cent.. say. p^. giving proportions of A's p^ and p^. but the work is complicated owing to the fact that errors in p-^ and jOj are not independent. —SIMPLE SAMPLING OF ATTRIBUTES. With such large samples the difference could not. writing it in terms of deviations in p^^ p^ and p^.XIII. as " ^ n^Pi + n^i n^ + n^ Required to find whether the difference between p^ and p^ can have arisen as a fluctuation of simple sampling. samples are drawn from distinct material or in the last case. we have at once ^j being the correlation between errors of simple sampling in and p^. 0-56 per cent. as before. multiplying by the deviation in p^^ and summing. This case corresponds to the testing of an association which is indicated by a comparison of the proportion of A'a amongst The general the B's with the proportion of A'a in the universe. from the above equation relating p^ to p^ and P2. If we assume that the difference is a real one and calculate the standard error by equation (6). viz. we have. it may have arisen solely by the chances of simple sampling. . ^q being the true proportion of A's in both samples. accordingly. but in lieu of comparing the proportion p-^ with p^ it is compared with the proportion of ^'s in the two samples together. where. be obliterated by the fluctuations of simple sampling The actual alone. rjj Therefore finally Unless the difference between p^ and p-^ exceed. edition. . The former is 41 "1 per cent. of sampling.) with the proportion amongst all offspring (viz. . both the subsamples have the same number of observations.. Downes. — = (43-5 X 56-5)'(j^^^^3y = 0-45 is per cent. be the same as in Example iv. As. G. suppose that we had compared the proportion of girls of medium haircolour in Edinburgh with the proportion in Glasgow and Edinburgh together. p. the allied problem whether. the standard error for a sample omit. and consequently it may have arisen as a mere fluctuation of sampling. REFERENCES. QtJETELET. Layton. for the cases dealt with in this chapter. as it should. As in the working of Example iii. 50 per cent. is generally treated by first determining the frequency-distribution of the number of This frequencj'-distribution is not considered till successes in a sample.507 observations is therefore c„. 29/68. Example vi.. The solution is a little complex as we no longer have e? =Po9'o/('h + '^a)Example v. The actual as a difference over five times this (the ratio must. See especially letter xiv. in this case. indicated by the samples were real. A. coin tossing. C..). 1846 (English translation by 0. difference in this case. or 42 6 per cent. n-.. difference 2 4 per cent. Taking now the figures of Example iv. 255 of the English. if the between p-^ and p^. €„j THEORY OF STATISTICS..). London. Taking the data of Example iii. suppose that we compare the proportion of tall plants amongst the offspring resulting from cross-fertilisations (viz. 374 of the 1849). The theory Experimental results of dice throwing. and the table on p. Chapter XV. French. and the student will be unable to follow much of the literature until he has read that chapter. Lettres sur la thiorie des probability Bruxelles. etc. of course. and could not have occurred mere error of sampling. the observed difference is only 1'25 times the standard error of the difference. the latter 43 '5 per cent. & E. it might be wiped out in other samples of the same size by fluctuations of simple sampling alone. The standard error of the difference between the percentages observed in the subsample of 9743 observations and the entire sample of 49.272 It will Wg. be observed that if «! be very small compared with approaches.=n„ = 34.. of Mj observations. and We — ' = l68^68^68J=°°^^ f29 39 IV nf^«n or 6 per cent. (1) . pp. Venn. 1829. "Tables of Poisson's Exponential Binomial Limit. (11) Edgeworth.. "Miscellaneous Applications of the Calculus of Probabilities. "Skew Variation in Homogeneous Material. Jena. vol. Die Orwndziige Jena. of the Manchester Lit. Soy. "Some Tables Mem. Roy. and G. li. The law of small chances (14) PoissoN." Biometrika.. 239. and Woods. "Methods jubilee volume. data regarding the distribution of sexes in families on p. Soc." Mimoires de I'Acad.. or on "Probability. V." 6 Trans. 425. 390 et seg. 1837. des Sciences. vol. Phil.) Lexis. Y.. IT.. Series 6. xxii. vol.. with a note by H. 1914. Student. Weldon. D. (12) Vigor.. 1881-90. Pearson. Ix. clxxxvi. "On the Sex-ratios of Births in the Eegistration Districts of England and Wales. 3rd edn. Mag. "^iomeirifca." Proc. 576. Boy. Yule. D.. 698. Boy. The Logic of Chamee. F. 36-71. Soc. W. Fischer. John. 273 Fischer. pp.. Soc. H. vol. . Camb. London. pp. F. with new matter. H. (especially Part II.. xvii. vol.. "The probability variations in the distribution of a particles. vol. .. p. Wbstergaard. 1907. vol. Karl. Ixi. (Sections 2 to on the binomial distribution. Stat. (2) (3) —SIMPLE SAMPLING OF ATTRIBUTES. (7) Lexis. 1895. Ixix. vol. V.. p. . " Sur la proportion des naissances des filles et des gargons. (15) (16) BoRTKBwiTSOH. Bateman. vols.. Y. 1907. and H. (6) PoissoN. (§ 12). 119). 280 . D. U. 1914. 18 . p. Leipzig. Dabbishikb. 1898.. 1906.. p. Article on the " Law of Error" in the Tenth Edition of the Sncyelopcedia Britannica..) L. xx. 264. F. p. Soc. Series A. H. (Contains. Geiger. the of Statistics. (17) Rutherford. Zwr Theorie der Massenerscheinungen in der miTischlichen : Freiburg. ix. X. and Phil. S. Abliandlungen zwr Theorie der BevSlkerungs und MoraUtatistik . Ixi. A. Edgbwoktii.. . 343.) ) XIII." JoKr. der Theorie der Statistik ." Biometrika.. vol. "Fluctuations of Sampling in Mendelian Ratios. "On the Error of Counting with a Haemacytometer.) (18) SoPBR.. 181.. E.. (10) to which reference was made in § 9. p. "On Poisson's Law of Small Numbers. G. and for illustrating Statistical Correlation. (The frequency of particles emitted during a small interval of time follows the law of small chances: the law deduced by Bateman in ignorance of previous work. Stat. 1888.. and vi. Yule. vol. S. reference may also be made to papers in vols. Paris. xxviii. D. (19) Whittaker. 205-7. 1903.) (8) (9) Edgeworth. 1877.'' Proc.) (13) As regards the sex-ratio. 1914. X. Lucy. 25-35. vol. (Principally theoretical the statistical illustrations very slight. 1902." Jour. vol. VON. E. Soc. etc." Joitr. p. {Cf. (4) ). p. p." Phil. Jiecherches sur la pvobaiiliti des jugemerds. 1885. Macmillan. 351. 1890. 1910. Y. of Biometrika by Heron." Eleventh Edition. W.. reprints of some of Professor Lexis' earlier papers in a form convenient for Oesellschaft reference.. Das Gesetz der kleinen Zahlen. (Pp. §11. Phil. Soc. Teubner. (Use of the harmonic mean as in Stat. General (5) : and applications to sex-ratio of births. 1897-8 (especially partii.. 5. 1. 4." Successes. 4 total of columns of all the 13 tables given. EXERCISES.274 THEORY OF STATISTICS. 1 2 14 103 302 711 1231 1411 . Frequency. (Ref.) Compare the actual with the theoretical mean and standard-deviation for the following record of 6500 throws of 12 dice. 1 Successes. or 6 being reckoned : as a "success. use Sheppard's correction for grouping (p. In the 4096 drawings on which Qu. 3. ard-deviation of the proportion with the given number of throws. 2 is based 2030 balls were black and 2066 white. maybe approximately determined from the mean and standard-deviation of the distribution. 163. 1 and Qu. If a frequency-distribution such as those of Questions 1. and state whether you would regard the excess of successes as probably significant of bias in the dice. . p. 263. 7. and 3 be given. 5. 212). Find n anip in this way from the data of Qu. of Chapter IX. — SIMPLE SAMPUNG OF ATTRIBUTES. show how n andp. In calculating the actual standard-deviation. if unknown. Is this divergence probably significant of bias ? 6. 275 4. 2. Verify the following results for Table VI. and compare the results of the different grouping of the table on p.XIII. The proportion of successes in the data of Qu. 1 is Find the stand'bOdT. Chap. (a) Effect of divergences from the conditions of simple sampling effect of variation in p and q for the several universes from which the samples are drawn 11-12. as a rule. The importance of errors other than fluctuations of "simple sampling " in practice unrepresentative or biassed samples 9-10. while the same range on the shortei side may extend beyond the limits of the distribution altogether. Summary. XV.. (J) Effect of variation in^ and q from one sub-class to another within each universe 13-14. p be less than 0'5. as they may become of importance when the number of observations is small. Thbrb are two warnings as regards the methods adopted in the examples in the concluding section of the last chapter which the student should note. or if p be 276 . he should remember that. Warning as to the assumption that three times the standard error gives the range for the majority of fluctuations of simple sampling of either sign —2. No theoretical rule as to the limits can be given. our assumed range may be greater than is possible for negative errors. The inverse standard error. therefore. or standard error of the true proportion for a given observed proportion equivalence of the direct and inverse standard errors when n is large— 4-8.CHAPTER XIV SIMPLE SAMPLING CONTINUED: EFFECT OF REMOVING THE LIMITATIONS OF SIMPLE SAMPLING. that a range of three times the standard error includes the great majority of the deviations in the direction of the longer " tail " of the distribution. while we have taken three times the standard error as giving the limits within which the great majority of errors of sampling of either sign are contained. XIII. In the first place. strictly the same for positive and for negative errors. As is evident from the examples of actual distributions in § 7. 1. (c) Effect of a coiTelation between the results of the several events 15. the distribution of errors is not strictly symmetrical unless p = q = 0-5. the limits are not. but it appears from the examples referred to and from the calculated distributions in Chap. Warning as to the use of the observed for the true value of ^ in the formula for the standard error 3. § 3. If. — : : — : — — — 1. replacIt should be remembered that the maximum ing IT hJ Tr± 3e. however. form of distribution. no means exact. where we were unable to assign any a priori value to p. The interpretation the number of cases in the sample. for each of which the true value of p is knowli. of the value of the standard error is also more limited in this Suppose a large number of observacase than when n is large. The maximum value of the standard error is therefore i /50 X 50V c OR =6-06 per cent. of Chap. and hence these values. observed value ir may lead to an under. this proportion is 6 per cent. on diff'erent masses of material. value of the product pq is given by ^ = g' = 0'5. 43 per cent.. the point is of importance only when n is small. —^— I On the other hand. greater than . the standard error is unlikely to be lower than that based on a proportion of 43 . the use of the serious error is possible. however. and a true proportion of 50 per cent. is 277 greater than 0*5. Chap. the approxitnate standard error e may first be calculated as usual from the observed proportion w. 25 X 75V K OK = 5-25 per cent. XIII. the observed proportion of The standard error of tall plants is 29/68. or in different universes. and no If. tions to be made. 68 J 1 —^— 3.18 = 25 per cent. —REMOVING LIMITATIONS OF SIMPLE SAMPLING. is therefore well within the limits of fluctuations of sampling. likely as a rule to lead to a serious mistake as stated at the commencement of this paragraph.) In the second place. (Gf.or over-estimation of the standard error which cannot be neglected. by means of samples of n observations each..XIV. for the properties of the limiting 2. for when n is large the distribution tends to become sensibly symmetrical even for values of p differing considerably from 0'5. n be small. The assumption is not. To get some rough idea of the possible importance of such effects. and then fresh values recalculated. we have assumed that it is suflElciently accurate to replace p in the formula for the standard error by the proportion actually observed. the student should note that. will give The procedure is by one limiting value for the standard error.. but may serve to give a useful warning. say ir. possible for positive errors. or. The two difficulties mentioned in §§ 1 and 2 arise when n. Thus in Example iii.is small. Where n is large so that the standard error of p becomes small relatively to the product pq the assumption is justifiable. XV. On these data we could . if within the limits of fluctuations of sampling. say. We have already referred to the use of the inverse standard error in § 13 of Chap. while the regression of ir on ^ is unity i. we can see that if n be large. for example. is (pq/ny is the standard-deviation of the array at right angles to this. on an average of all arrays. the last chapter is that the standard-deviation of an array of it's but the question may be asked —What associated with a certain true value p. Hence if n become very large. 0-5 cannot be neglected in comparison with o-y. the standard-deviation But of the true proportion p for a given observed proportion ir. i. the mean of the array of it's is identical with p. 278 THEORY OF STATISTICS. and the standard-deviation of the array of it's is. would give a distribution of p ranging uniformly between and 1. therefore. on an average. It we determine. the type of the array the regression of ^ on tt is less than unity. it should be for extreme values of p near noticed that. a-^ the standarddeviation of observed proportions in the samples. or indeed grouped symmetrically in any way round 0'5. correspondingly greater than the standard deviation of the array of p's the state- — ment not true for every pair of corresponding arrays. cr„ is therefore appreciably greater than o-y. with sufficient exactness. if n be small. On general principles. (r„ becomes sensibly equal to ctj.Trj/nf may be sensibly equal.erved proportion tt in a sample of n observaWhat we have found from the work of tions drawn therefrom. on the average of all values otp.. however. taken as giving. p. Further. (Case II. what is the standard-deviation of the true This is the inverse of the problem with which we have been dealing. observed for every conceivable subject. o-j becomes very small. 269). and as the proportions ? -f- standard-deviation of the differences. that a tabulation of all possible chances. any observed value n. the standard error of the difference is — . But o-j varies inversely as n. and conversely. If we assume. the array of p's associated with a certain observed proportion ir ? In other words. and therefore the standard-deviations of the arrays. are also If n be large. therefore.. greater of the two.greater than 0'5 will probably correspond to a true value of p slightly lower than ir.e. 8 is uncorrelated with p. in this table. XIII. and therefore if a-p be the standard-deviation of p in all the universes from which samples are drawn. while if n be small the standarddeviation of the array of it's will tend to be appreciably the For if tt =p S. form a correlation-table between the true proportion ^ in a given universe and the obs. and it is a much more difficult problem. especially and 1.e. [ir(l . the two standard-deviations will tend.— . to be nearly the same. given an observed proportion tt. V. but one which may lead to serious results [cf. whether an observed divergence of a certain proportion from a certain other proportion that might be observed in a more extended series of observations. or they might be represented by a number of lively black insects sheltering amongst white stones in neither case would the ratio of black balls to white. —REMOVING of LIMITATIONS OF SIMPLE SAMPLING. and we knew that the proportions in successive samples were subject to the law of simple sampling.XIV. Clearly. inferences as to the material from which the sample is drawn are of a very doubtful and uncertain kind. might or might not Their use is be due to fluctuations of simple sampling alone. true differences for the given observed It 4. viz. the characters not being well defined a source of error which we need not further discuss. say. (1) owing to variations of classification in sorting the A'a and a's. or that has actually been observed in some other series. and it is this uncertainty whether the chance of inclusion in the sample is the same for A's and a's. if on drawing samples from a bag containing a very large number of black and white balls the observed proportion of black balls was rr. whether the observed proportion tt in the sample may not diverge from the the universe from which it was drawn. proportion p existing owing to the nature of the conditions under which the sample was taken. in any parallel case. thus quite restricted. be represented in their proper proportions. 5 of Chap. ways. we could not necessarily infer that the proportion of black balls in the bag was approximately w. this may be taken. for in many cases of practical sampling this The principal question in is not the principal question at issue. For the black balls might be. or of insects to stones. even though the standard error were small. The use of standard errors must be exercised with care. much more highly polished than the white ones. ref.]. and to bear in mind that it covers those fluctuations alone which exist when all the assumed conditions are fulfilled. The formulae obtained for the standard errors of proportions and of their differences have no bearing except on the one question. many such cases concerns quite a different point. as approximately the standard-deviation difference. tt tending to be definitely greater or definitely less than Such divergence between tt and p might arise in two distinct p. far more than the mere divergences between different samples drawn in is m — : . provided n be large. (2) Owing to either ^'s To give or a's tending to escape the attentions of the sampler. so as to tend to escape the fingers of the sampler. very necessary to remember the limited assumptions on which the theory of simple samplinff is based. 279 between two observed proportions by equation (6) of that chapter. an illustration from artificial chance. the results might be unduly favourable . In such cases we can see that any sample. if say 10. if from the bottom layer. as the persons who make the returns will probably include an undue proportion of the more intelligent farmers whose crops will tend to be above average. and no mere examination of the sample itself can give any informa: : tion as to whether it exists or no. the estimates are likely to err in excess. to keep the necessary accounts. the volunteering of teachers for the work might in itself introduce an element of bias. the question would arise whether this method would tend to give an unbiassed sample bf all the children. one school in ten in a large town. even in the long run. but. Thus in collecting returns as to family income and expenditure from working-class households. on the other hand. the families with lower incomes are almost certain to be under-represented . taken in the way supposed. No assured answer could be given conjectures on the matter would be based in part on the way in which the schools were selected. 5.000 herrings were measured as landed at various North Sea ports. are compared. or in different places. Again. no certainty that it does not exist. if estimates as to crop-production are formed on the basis of a limited number of voluntary returns.280 THEORY OF STATISTICS.g. Whilst voluntary returns are in this way liable to lead to more or less unrepresentative samples. to what extent they are under-represented. There may be no definite reason for expecting definite bias in either case. however. Compulsion could not ensure equally accurate and trustworthy returns from illiterate and well-educated workmen. . the same way. 6. or to form any estimate as to the possible error when two such samples taken by different persons at different times. is likely to be definitely biassed. no assured answer could be given. they largely "escape the sampler's fingers " from their simple lack of ability It is almost impossible to say. in the sense that it will not tend to include. from intelligent and unintelligent farmers. and the question were raised whether the sample was likely to be an unbiassed sample of North Sea herrings. say. Thus if we noted the hair-colours of the children in. unduly unfavourable. compulsory sampling does not evade the difficulty. In other cases there may be no obvious reason for presuming such bias. which renders many statistical results based on samples so dubious. Again. but it may exist. equal proportions of the A's and a's in the original material. e. The following of some definite rule in drawing the sample may also produce unrepresentative samples if samples of fruit were taken solely from the top layers of baskets exposed for sale. 8. if an observed proportion ttj in a sample drawn from one universe be greater than an observed proportion -ir^ in a sample drawn from another universe. Similarly. The student must therefore be vfery careful to remember that even if some observed difference exceed the limits of fluctuation in simple sampling. for example. Similarly. or the it necessarily . But while the dissimilarity of subsamples would then be evidence as to the difficulty of obtaining a representative sample. p-^ most likely exceeds ^j j the standard error only warns us that this conclusion is more or less uncertain. on the standard-deviation of sampling. On the contrary. but Ti-Tj is considerably less than three times the standard error of the difference. great heterogeneity in the original material. Further. he must remember that if the standard error be small. and questionable therefore whether the five. XIII. so that there is some essential difference between the localities from which. ten or fifteen schools observed might not also have given an unrepresentative sample. On the other hand. jpj and p^. First suppose the condition (a) to break down. Such indicating one possible source of bias. it does not follow that it exceeds the limits of fluctuation due to what the practical man would regard and quite rightly regard as the chances of sampling. —REMOVING LIMITATIONS OF SIMPLE SAMPLING. 281 an examination may be of service. of divergences from the conditions of simple sampling which were laid down in § 8 of Chap. however. It may be quite untrustworthy for other reasons owing to bias in taking the sample. it does not. and tha. in the first illustration. If. of course. but merely to a greater or less extent untrustworthy if the standard error be large.— : XIV. it would. it by no means follows that the result is necessarily trustworthy the smallness of the standard error only indicates that it is not VMtrustworthy ovring to the magnitvde of fluctuations of simple sampling. the hair-colours of the children differed largely in the different schools much more largely than would be accounted for by fluctuations of simple sampling it would be obvious that one school would tend to give an unrepresentative sample. if the herrings in different catches varied largely. for some very different material which should have been represented might have been missed or overlooked.t possibli/ p^ may even exceed Pj. for instance. or owing to definite errors in classifying the A's and a's. — — — : should also be borne in mind that an observed proportion is not incorrect. Let us now consider the effect. be diificult to get a representative sample for a large area. 9. viz. are most probably equal. of course. be no evidence that the sample was representative. again. as 7. follow that the true proportion for the given universes. the similarity of subsamples would. of course. y. the chance of success -varying from time to time. viz.. instead of that of the absolute number. and substituting o-^ be the standard-deviation of p. (1) the formula corresponding to equation (1) of Chap. or Na-'^ = rCZifpq) + 7. even for individuals of the same age and sex. . . represent such circumstances in a case of artificial chance by supposing that for the first /j throws of n dice the chance of success for each die is^j.. conditions under which. ^P\1i being the square of the standard-deviation for these throws. equation (2) is sensibly of the form s' = sl + trl. or that some essential may change has taken place during the period of sampling. now. we have. XIII.n^a-^.^ 2/(p -p. just as the chance of death.npl .. Suppose. that the records of all these throws are pooled together. and so on.282 THEORY OF STATISTICS. we have sum is = «po . then the last 1 -^ for q. dividing through by n\ the formula corresponding to equation (2) of Chap. varies from district to district.— This is -.of the whole distribution is given by the sum of all quantities like the above.mo-y + T^a^ = npaqa + n{n-\)<j% . Hence the standard-deviation a. is the mean value '2i{fp)IN of the varying chance p. If TO be large and s^ be the standard-deviation calculated from the mean proportion of successes p^^. and «(pj -Po) *^^ difference between the mean number of successes for the first set and the mean for all the sets together. if ^.^Mo + IZ-V^ . . . Let o-p N. for the next/^ throws p^. for the next/g throws p^. samples are drawn. (2) 10. To find the standarddeviation of the number of successes at each throw consider that the first set of throws contributes to the sum of the squares of deviations an amount /i[™Pi?i + «^(Pi--?'o)^].. XIII. The mean number of successes per throw of the n dice is given by We where iV= 2(/) is the whole number of throws and ^(. we deal with the standard-deviation of the proportion of successes. . 283 Table showing Frequencies of Segislraiion Districts in England amd Wales with Different Proportions of Deathi in Childbirth {including Deaths from Puerperal Fever) per 1000 Births in the saine Year. —REMOVING LIMITATIONS OF SIMPLE SAMPLING. Decade 1881-90. XIII. Data from same soui-ce.XIV. for the same Groups of Districts as in the Table of Chap. § 10. 283 are given some does s fall short of «„. there appears to be a significant standard-deviation of some 6 . that if we make n large s becomes sensibly equal to a-p.284 THEORY OF STATISTICS. and in this case the effect of definite causes is relatively larger. The figures of this case also bring out clearly one important consequence of (2). ''^' the standard-deviation of sampling is approximately . If n be very large the actual standarddeviation may evidently become almost indefinitely large compared with the standard-deviation of sampling. five out of the eight values being very close to this average. viz. but being constant throughout the experiment. but differ from one die to another as they would in any ordinary set of badly made dice. in § 10 of Chap. 11. or the circumstances that regulate the appearance of the character observed the same for every individual or every sub-class in each of the universes from which samples are drawn. we want to obtain good illustrations of the theory of simple sampling n should be made small. at any one throw. certain registration districts of England. and so on. the chances varying for different dice. The values of suggest an almost uniform significant standard-deviation women per thousand births. The case differs from the last. viz. It will be seen that in the first group of small districts p. in one case only In the table on p. Now : . units in the proportion of male births per thousand. but varied from one throw to another now they are constant from throw to throw.cijn. Thus during the 20 years 1855-74 the death-rate in England and Wales fluctuated round a mean value of 22 -2 per thousand with a standard-deviaTaking the mean population as roughly 21 millions. while if we make n small s becomes more nearly equal to P(. 263. on the other hand. given in § 8 (6) of Chapter XIII. the condition that the chances p and q shall be the same for every die or coin in the set. consider the effect of altering the second condition of simple sampling. if. as in that the chances were the same for every die. for m^ dice. XIII. as one might expect. but in the more urban districts this falls to 1 or 2 units . tion of 0'86.. Suppose that in the group of n dice thrown the chances for m^ dice are p-^ q^ . different data relating to the deaths of women in childbirth in the same grpups of districts. Required to find the effect of these differing chances. p^ q^. Hence if we want to know the significant standard-deviation of the proportion p the measure of its fluctuation owing to definite causes n should be made as large as possible .Js^ a-p — si = 0'8 in the deaths of — — y This is 2221978 _ 0-032 only about one twenty-seventh of the actual value. say. and crp = |. n-- and using . Thus the standard-deviation of sampling for a deathrate of. the value of s if the chances are In most practical oases. . • (3) or if s be. 285 For the mean number of successes we evidently have M=mjPj^ + m^2 + mgPg+ . if the values of p are uniformly distributed over the whole range between and 1. and so on and these numbers of successes are all independent. the standard-deviation of the proportion of successes. 1 8 per thousand in a population of uniform age and s one sex is (18 x 982)*/ Jn= 133/^/m. if p be zero for half the events and unity for the remainder.. instead of 0'5/s/n. however. the deathrate is not. the standard-deviation of the rate within such a populaBut the effect of this tion is roughly about 30 per thousand. ^^P^_^ n n . but varies from a high value in infancy (say 150 per thousand). it should be noted that this may be regarded as made up of the number of successes in the mj dice for which the chances are p^ q^.XIV. so that s is zero. through very low values (2 to 4 per thousand) in childhood to continuously increasing values in old age .. To take another illustration. p^ = q^ = |. .1<4 . standard-deviation of p. VIII. as calculated from the mean proportion p^^. In a population of the age composition of that of England and Wales. § 12. Substituting \-p for q. uniform. — REMOVING LIMITATIONS OF SIMPLE SAMPLING. To find the standard-deviation Pg being the mean chance I. To take a limiting case. 143). and the effect may conceivably be considerable. /"o "= So = i before but a^ Hence s2 = 0-1667/?i. . as before. . together with the number of successes amongst the m^ dice for which the chances are p^ q^. .. 1/12 = 0-0833 (Chap. of course.(mp)/n. p. The effect of the chances varying for the individual dice or other "events" is therefore to lower the standard-deviation. the effect will be J in every case. of the number of successes at each throw. Hence : = l. (4) ' ^ 12. much less. as before. a-^ to denote the = re-^o^o . however.{mpq). still somewhat extreme. ^ = = 0'408/\/m. ...pq + 2pq{ri^ + r-^^+ r.). The standardieviation for each event is (pq)* as before. we must have (cf. but may nevertheless. and constant throughout the experiment.mpling will therefore be increased or diminished according as the average correlation between the results of the single events is positive or negative. XI. § 2) . correlation-coefficients. and if. We shall suppose. therefore. and so on for variables (number of successes) which can only take the values and 1. but the events instead. as o. i\„.may be reduced to zero or increased to n(j)qy. : expression o-^ = n. the chances^ and q being the same for every event at every trial. Chap. and the effect may be considerable. of the simple are no longer independent as 1 3. be treated as There are n(n-l)/2 ordinary variables {cf.pq.+ . r is the arithmetic mean where. for. as calculated from equation (4). .286 THEORY OP STATISTICS. however. etc. first and third events. quite s2 = -(18x982 -900) = 130/ Jn s compared with 133 / Jn. For the standard deviation of the "proportion of successes in each sample we have the equation The standard-deviation «^=^[1 +'-(»It 1)1 • • • • (6) should be noted that. — of the correlations we may (T^ write . is variation on the standard-deviation of simple sampling small. .. and to discuss the effect of a certain amount of dependence between the several " events " in each sample. XIII. r is the correlation-coefficient for a table formed by taking all possible pairs of results in the n events of each sample.. are the correlations between the results of the correlations first and second. that the two other conditions (a) and (b) are fulfilled. r^^. (r^ = n. . = npq[l+r{n-l)]. of course.. XL § 10). We have finally to pass to the third condition (c) of § 8. as the means and standard-deviations our variables are all identical. .. Chap. The problem is again most simply treated on the lines of § 5 of the last chapter. Chap. of (5) simple S9. therefore. for . ^) w = n. since a positive or negative correlation may arise for reasons quite different from those discussed in i 9-12.1). For drawing 2 balls out of 4. but the following will . cr becomes 0'816 {npqf . The cases (a).a limited universe. in the case of drawing half the balls out of a very large number. 0'745 {npqf' . if fatal at all. : : : whence 0-- = n. or nth ball of the sample the correlation-table formed from all possible pairs of every sample will therefore tend in the long run to give just the same form of distribution as the correlation-table formed from all possible pairs But from Chap. the case when r is negative for if the chances are not the same for every event at each trial. as the conditions are hardly ever constant from one year to another. r is positive. As a simple illustration. of drawing n balls in succession from the whole number «) in a bag containing^!* white On repeating such drawings a large balls and qw black balls. —KEMOVING LIMITATIONS OF SIMPLE SAMPLING. best kept distinct.npqy. to result in wholesale deaths. ball for the first. it approximates to {0-5. §§ 9-10 this introduces the positive correlation at once. number of times. (6) and (c) are. we have the obviously correct result that fr= (pq)\ as drawing from unlimited material: if. It is difficult to give a really good example from actual statistics.pq -n -^ w -I = 1. on the other hand. n — w. consider the important case of sampling from. 14.g.pq\{'-. and if n be large (as it usually is in such cases) a very small value of r may easily lead to a very great increase in the observed standard-deviation.l/(w . however. § 11 we of the w balls in the bag. <T becomes zero as it should.XIV. 287 It should also be noted that the case when r is positive covers the departure from the rules of simple sampling discussed in for if we draw successive samples from different records. for drawing 5 balls out of 10. know that the correlation-coefficient for this table is . even although the results of the events at each trial are quite independent of one Similarly. the case discussed in §§ 11-12 is covered by another. the mean chance of success for the remainder' must be below it. e. and the formula is thus checked for simple cases. and the chance of success for some one event is above the average. or of certain forms of accident that are apt. second. we are evidently equally likely to get a white ball or a black. XI. or 0-707 (ripq)*. If re — in In the case of contagious or infectious diseases. if the samples are drawn.the standarddeviation of sampling given by equation (5). but even if we ignore this. or whatever they may be from which 15. years.: 288 THEOKY OF STATISTICS. as calculated from the average values of the chances if the average chances are the same for each universe from which a sample is drawn. If we find that the : . but vary from individual to individual or from one subclass to another within the universe. or an average of 105 deaths per annum. the observed standard-deviation will be greater or less than the simplest theoretical value according as the correlation between the results of the single events is positive or negative. we have (in 1894). or the standard-deviation itself approximately 10'3. the standard-deviation observed will be greater than the standard-deviation of simple sampling. iigures. During the twenty years 1887-1906 there were 2107 deaths from explosions of firedamp or coal-dust in the coal-mines of the United Kingdom. '''' := +0-00012. (n-lK' Whence. to judge from the though not wholly. taking the numbers of persons employed underground at a rough average of 560. §§ 9-14. we see the chances p and q differ for the various universes. But the square of the actual standard-deviation is 7178. but the events are no longer independent. or its value 84-7. XIII. it follows that this should be the square of the standard-deviation of simple sampling. serve to illustrate the point. the magnitude of the standard-deviation can be accounted for by a very small value of the correlation y. For if Wq denote the standard-deviation of simple sampling. due to a general tendency to decrease in the numbers of deaths from explosions in spite of a large increase in the number of persons employed . 560000 X 105 that Summarising the preceding paragraphs. expressive of the fact that if an explosion is sufficiently serious to be fatal to one individual. the standard-deviation observed will be less than the standard-deviation of simple sampling as calculated from the mean values of the chances finally. From § 12 of Chap. from the above data. materials. o. is partly. if p and q are constant. districts.000. These conclusions further emphasise the need for caution in the use of standard errors. it will probably be fatal to others also. the numbers of deaths ranging between 14 (in 1903) and 317 This large standard-deviation. Referring to Question 7 of Chap.). as the condition that the sampling shall be random haphazard is not the only condition tapitly assumed. .. it is only a conjectural and not " a necessary inference that all the conditions of " simple sampling as defined in § 8 of the last chapter are fulfilled.e. We have thought it better to avoid this term.' " Biometrika. work out the values of the siguifioant standard-deviation ap (as in § 10) for each row or group of rows there given. i.. what are the chances of 0. (An expansion of one section of ref. or that the results of the events are negatively correlated inter se. deviation the actual standard-deviation fall short of the standardof simple sampling two interpretations are again possible. two interpretations are possible either that p and q are different in the various universes from which samples have been drawn {i. 1913. (If an event has succeeded p times in successes in subsequent n trials.. XIII." Philosophical Magazine. ) : —REMOVING LIMITATIONS OF SIMPLE SAMPLING. XIII. XIII. generally the references to Chap. Chap. Sampling which fulfils the conditions laid down in § 8 of Chap. while approximately constant from one universe to another. dealing with the first problem of our § 14. — — REFERENCES. and on the fitting of such Series to Observation Polygons in the Theory of Chance.e. p. ' . is generally spoken of as random sampling. .. vol. for example. there may be a positive correlation r between the results of the different events. Kakl. 6th Series. If . 236. pp. "On Errors of Random Sampling in certain Cases not suitable for the Application of a Normal Curve of Frequency. xlvii. but taking row 5 with rows 6 and 7. 1899. Even if the actual standard-deviation approaches closely to the standarddeviation of simple sampling. 10 of Chap. Possibly. or that the results of the events are positively correlated inter se.. 1. to which may be added Pearson. XIII. from the standpoint of the frequency-distribution of the number of white or black balls in the (1) samples. EXERCISES. 19 . simple sampling as we have called it.) (2) Greenwood. . 69-90. vol. XIII. 1. that the variations are more or less definitely significant in the sense of § 13.. M. drawing samples from a bag containing a limited number of white and black balls. masked by a variation of the chances p and q in sub-classes of each universe. ix. Cf. either that the chances p and q vary for different individuals or sub-classes in each universe. " On certain Properties of the Hypergeometrical Series. 289 standard-deviation in some case of sampling exceeds the standarddeviation of simple sampling.— XIV. m m trials 1 Tables for small samples. what is the standard-deviation of the number of successes. Chap. . If for one half of m events the chance of success is p and the chance of failure q. IX. The harmonic mean number of births in a district is 5070. 2. the events being all independent ? 4. whilst for the other half the chance of success is j and the chance of failure p. The following are the deaths from small-pox during the 20 years 1882-1901 in England and "Wales :— 182 . Find the significant standard-deviation a^ 3. For all the districts in England and Wales included in the same tahle (Table VI. ) the standard-deviation of the proportion of male births per 1000 of all births is 7 '46 and the mean proportion of male births 509-2.290 THEORY OF STATISTICS. or the throwing of ideally perfect dice (homogeneous cubes). Detei-mination of the frequency-distribution for the number of successes in n events : the binomial distribution 3. a continuous curve giving approximately. Difficulty of a complete test of lit by elementary methods 16. Necessity of deducing. Outline of the more general conditions from which the curve can be deduced by advanced methods 14. and therefore. 1-2. THE BINOMIAL DISTEIBUTION AND THE NOEMAL OUEVE. If we deal with one event only." 1. The value of the central ordinate 12. 2. The table of areas of the normal curve and its use— 17. Fitting the curve to an actual series of observation. the standard-deviation of the of successes in n events was determined for the several more important cases. and the chances p and q the same for each event and constant throughout the trials. Deduction of the normal curve as a limit to the symmetrical binomial 10-11. For the simpler cases of artificial chance it is possible. Dependence of the form 4-5. for large values of n. The quartile deviation and the " probable error " 18. Direct calculation of the mean and the standard-deviation from the distribution— 7-8." in which the events are completely independent. Graphical and mechanical of the distribution on p. The two events are quite independent. according to the rule of all 291 . q and n methods of forming representations of the binomial distribution 6. however. results of this first event the results of a second. and determine not merely the standard-deviation but the entire frequency-distribution of the number of " successes. number This we propose to do for the case of "simple sampling. and XIV. Comparison with a binomial distribution for a moderate value of w 13.^ 15. The case corresponds to the tossing of ideally perfect coins (homogeneous circular discs). — — — — — — — — — In Chapters XIII. iygSuppose we now combine with the failures and Np successes. the terms of the binomial series 9. and the applications of the results indicated. for use in many practical cases.— CHAPTER XV. to go much further. we expect in iV trials. Illustrations of the application of the normal curve and of the table of areas. a. a. Si. a. fen' a. "a. . a.292 THEORY OF STATISTICS. a. a. of so much importance that it is worth while considering the form in greater detail. 2 i* + .+Pr N N^ : . In trials of two events we would therefore expect approximately iVg^ cases of no success. 3 Npq^ cases of one success. The scheme is continued for the results of a fourth event. and Np^ cases of two successes. {Nq^)q will be associated (on an average) with failure of the third also. in fact in trials of n events are given by the successive terms in the binomial expansion of N{q + p)". N(q+pY for three events ]S[\q +pr „ „ for /ozir events „ „ ^{9. successes W . {Np)q will be associated (on an average) with failures of the second event and {Np)p with successes. and it is evident that all the results are included under a very simple rule the frequencies of 0.q''-^ + -^Y2-1P+ „ . for . 3. 1. i.— XV... 2 . . 2 . 2Npq cases of one success and one failure. (2) on the value of the exponent n. n(n-Vj .. Of the iVg^ cases in which both the first two events failed. success and one failure. m(m-l)(m-2) —f23 „ . however. 292). N{q'' + n. row 2 of the scheme on p. and {Nq)p with successes of the second event {cf. and Np^ cases of three successes. : and soon. 1. the frequencies of 0.e. tailing off in either direction from the mode. of the Nq failures of the first event {Nq)q will be associated (on an average) with failures . they are distributions of greater or less asymmetry. .. The distribution is.\ ) This is the first theoretical expression that we have obtained for the form of a frequency-distribution. q are equal. The results of a third event may be as in row 3 of the scheme. viz. 3 Np\ cases of two successes. Of the iNpq cases of one {N'q^)p with success of the third. combined with those of the first two in precisely the same way.. and similarly for The the Np^ cases in which both the first two events succeeded. (2N'pq)q will be associated with failure of the third event and {^Npq)p with success.. . as in row 5 of the scheme. — 293 —BINOMIAL DISTRIBUTION AND NORMAL CURVE. .. Similarly of the Np successful first events.of the second event. evidently the distribution must be symmetrical.. successes are given for one event by the binomial expansion of N{q+p) for <i«o events . This form evidently depends (1) on the values If p and of q and p. The general form of the distributions given by such binomial series will have been evident from the experimental examples given in Chapter XIII. trials of three events we should expect result is that in cases of no success. independence. Quite generally.1 j7 / . 5. If p and q are unequal. on the other hand.1. for the same value of n. the greater the inequality of the chances. proceeding by 0.294 THEORY OF STATISTICS. the distribution is asymmetrical.) Number of . p and any end q may be interchanged without altering the value of term. and consequently terms equidistant from either of the series are equal. Values o/p {Figures given to the nearest unit.1.000 {q+pf for from O'l to 0'5. and the more asymmetrical.1 to 0. — Terms of the Binomial Series 10. The following table shows the calculated distributions for m = 20 and values of p. cases of two successes are the A. When ^ = 0. from 0. ) — 295 XV. the efFeofc of increasing n is to raise the mean and the dispersion. however. Thus if we compare the first distribution of the above table with that given by m= 100. we have the following increase : p^q B. O'l)'™. li —BINOMIAL DISTRIBUTION AND NOIIMAL CURVE. If p is not equal to' q. for the same value of p and q. the greater n. but it also lessens the asymmetry . {Figures given Number . not only does an increase in n raise the mean and increase the dispersion. the less the asymmetry.000 (O'O + to the nearest tmit. — Terms of the Bvnomial Series 10. ob. in Q. then BQ=p. cutting AB:BC::q:p. be erected Ijetween them. ? = |. we have the polygon ab"c"d"e"f" for « = 3. so that BQ. b'c. say for the case p = ^.ob + q. 47) at any convenient distance apart AR BQ line.. cd respectively cut the intermediate verticals and project them horizontally to the right on to the thick verticals.296 THEORY OF STATISTICS. ob = 3072. and so on. The process may be continued 11 . Ic' =p. Similarly. This gives the polygon ab'e'd'e for on a horizontal base verticals) : = 2. and a third.AP + q. It will have been noted that any one term say the rth— in one series is obtained by taking q times the rth term together with p times the (r-l)th term of the preceding Now if AP. draw the binomial polygon for the simplest case m = 1 . \c = 1024. choosing a vertical scale. The polygons for higher values of n may now be constructed graphically. out the intermediate verticals are projected horizontally on to the thick verticals. 3 1. series. viz. and the polygon is abed.lc. OR (figure 46) be two verticals. Mark the points where ab. (the heavy verticals of fig. For ob' — q. and erect other verticals (the lighter dividing the distance between them in the ratio of q -. binomial distributions.p. — PR and considering the two (This follows at once on joining Consider then some divided. in the diagram iV has been taken = 4096. Next. be. etc.GR. if the points where ab'.) segments into which is Draw a series of verticals binomial. . —BINOMIAL DISTRIBUTION AND NORMAL CURVE. indefinitely. 297 though it will be found difficult to maintain any high degree of accuracy after the first few constructions.XT. apparatus consists of a funnel opening into a space say a J inch in depth between a sheet of glass and a back-board. This wedge is set so as to throw q parts of the stream to the left and p parts The wedges 2 and 3 are set so as to the right (of the observer). etc. At the foot these wedges are replaced by vertical strips. —The Pearson-Galton Binomial Apparatus. This space is broken up by successive rows of wedges like 1. wedge 3 throws pq parts of the original material The streams passing these wedges to the left and p^ to the right. 48. to divide the resultant streams in the same proportions.298 THEORY OF STATISTICS. 4 5 6. in the spaces between which the — — Fig. 2 3. Thus wedge 2 throws <^ parts of the original material to the left and qp to the right. which will divide up into streams any granular material such as shot or mustard seed which is poured through the funnel when the apparatus is held at a slope. comes from the funnel and meets the wedge 1. Consider the stream of material that material can collect. are therefore in the ratio of q^ 2qp p\ The next row of wedges is again set so as to divide these streams in the same proportions : : .. (the calculation was in fact given as an exercise in Question 8. VII. 6. Arrange the terms under each other as in col. as may be desired. : : (1) (2) (3) |.3 ^ n(n-l){n-2) ^ 1. XIII.. as well as by the method of Chap.. Chap. in The sum of col... The final set. The apparatus was generalised by Professor Pearson. This kind of apparatus was originally devised by Sir Francis Gal ton (ref.e. so that they (Eef. Chap. npl q"-' + {n-l)q"--p + ^ (n - l)(n - 2) f^ 'q"-y+ . i. as by the method of Chap. a stream of shot being allowed to fall through rows of nails. who used rows of wedges fixed to movable slides. will 299 as before.g^-'p n. 1 below. VIII.). —BINOMIAL : : : DISTSIBUTION AND NORMAL CUETE. The values of the mean and standard-deviation of a binomial distribution may be found from the terms of the series directly. But this snm is .2.. and the resultant streams being collected in partitioned spaces. will give the streams proportions q^ iq^p Gq^^ iqp^ p\ and these streams will accumulate between the strips and give a representation of the binomial by a kind of histogram. and Question 6. 13.p. That is.) could be adjusted to give any ratio of q -.cf-^p 2m(»-l)?'>-2p2 ~ J V '^j^ 2 m(9s-l)(7"-V r.2 2 i" unity. we are treating iV as and the mean is therefore given by the sum of the terms col. XIII . at : : and the four streams that result 3q^p 3qp^ p^. 1 is of course unity. XV. — 1 /f — n.q"-^p n. (4) Frequency/.{n-l){n-2) ? 1. the mean M is np.2 ' ^ 3n(n-l){n-2) 1. 1) in a form that gives roughly the symmetrical binomial.<• ] = np(q -l-j>)"~' = np. and treat the problem as if it were an arithsuccesses: as metical example. taking the arbitrary origin at iV is a factor all through. (3). tions q^ bear the proporthe heads of the vertical strips. /|. it may be omitted for convenience. Of course as many rows of wedges may be provided as shown. g" Dev. for certain cases with which we have already dealt. • • • }-»¥• But the series in the bracket is the binomial series {q +pY~^ It therefore with the successive terms multiplied by 1.— 1HE0RY OF 300 STATISTICS.000 This would not only be practically impossible without the use of certain methods of approximation. The question arises whether we can pass from this discontinuous formula to an equation suitable for representing a continuous distribution of frequency. or the number N . 5100. of heads thrown in tossing a coin. The distribution will be given by the terms of the series (0'49-)-0'51)i°'""* and the standard-deviation is. the frequency-distribution of the number of male births 10. the average or smoothed form of the distribution to which actual distributions will more or less closely approximate. n. therefore. as a formula which may be generally useful for describing frequency-distributions. 2. the proportion of 2-oards in the record being p. samples of n cards each be drawn from an indefinitely large If record of cards marked with A or a. that it only applies to a strictly discontinuous distribution like that of the number of . The square of the standard-deviation is given by the the terms in col. gives the difference of the mean of the said binomial from . p. but it would give the distribution in quite in batches of ! . and its sum is therefore {n-\)p + \.e. viz.4-cards in the sample. in round numbers. . determine of sampling. . the actual frequencies only 0. o-a=TOj)|g»-i+2(TO-i)g"-^j)+3 ^"'~ sum of ^_^~ V ~y+ . that is. indeed. almost a necessity for Consider. deviating from these by errors which are themselves fluctuations The three constants N.000 births. 8.4-cards drawn from a record containing A's and a's. the mean number being. and in order to obtain it we should have to calculate some 300 terms of a binomial series with an exponent of 10. (4) less the square of the mean. Such an equation becomes. 1. . 7. The terms of the binomial series thus afford a means of completely describing a certain class of frequency-distributions of giving not merely the mean and standard-deviation in i. 50 births. The distribution will therefore extend to some 150 births or more on either side of the mean number. eafih case. Considered. Therefore .1. example. say. 2. 3.np"^ = npq. but of describing the whole form of the distribution. (T- = np{(n-l)p + \]-rfip^ = np . however. then the successive terms of the series N{q+py give the frequencies to be expected in the long run of . the binomial series suffers from a serious limitation. then the frequency of k successes is the greatest. the curve being such that the area between any two ordinates y-^ and y^ will give the frequency of observations between the corresponding values of the variable x-^ and x^. taking probably 10 births as the class-interval. we would not have compiled a frequency-distribution by single male births.. to replace the binomial series by some continuous curve. It is possible to iind such a continuous limit to the binomial series for any values of p and q. therefore greater than the former so long as derived from this by latter frequency is ~m. • (1) tails off symmetrically on either side of this greatest Consider the frequency oi k + x successes .. {k + x) (2) and therefore y.. but would certainly have grouped our observations. y~ (k)(k-l){k-2) {k+l){k + 2){k+Z) (-D(-S(--D----(-^) "(-D(-l)(-l) • • " . therefore. . say equal to 2^ . The terms of the series are The frequency of m successes is \n ^(i)"| OT \ n-m is and the frequency of m+ 1 n successes The multiplying it by {n-m)/{m+l). for simplicity..>m+ 1 n-1 or m< 2 Suppose.. We want. (-'-i^X-S . the value is \2k ^' = ^(^)^|ITt^ (k-x+l) . having approximately the same ordinates. but in the present work we will confine ourselves to the simplest case in which p = q = Q'b.. .. —BINOMIAL : DISTRIBUTION AND NORMAL CURVE. and its value is ^^-^(^r^ The polygon ordinate. that n is even. 301 unnecessary detail as a matter of practice.XV. 9. and the binomial is symmetrical. 10. 5. the point x = Q. the value of y^ will be given by the fact that the areas of the two distributions. the constant k has been replaced by the standard-deviation cr. which is necessarily small if k On this assumption we may apply the logarithmic S^ log.302 THEORY OF STATISTICS. yx = l -2^-2 . the last two data are given by the standard-deviation and the mean respectively . lead in any simple and elementary algebraic way to an etpression for y^.(l+S) = S-|. The curve represented by this equation is symmetrical about Mean.+ |--^+ S^ S* to every bracket in the fraction (3). This condition does not. in the last expression. in fact. This assumption does not involve any difficulty. for o-^ = kj2. 89. that very large. so that (a.+-— )-^ h' Therefore. To this degree of approximation. which gives the greatest ordinate y = y^. that drawn in fig. and taken as the ideal form of the symmetriThe curve is generally cal frequency-distribution in Chap. and the curve is. . known as the normal curve of errors or of frequency./i)^ may be neglected compared with (a:/A). . If we desire to make a normal curve fit some given distribution as near as may be. and neglect all terms beyond the first. however. for we need not consider values of x much greater than three times the standard-deviation or 3 ^fh|2. though such a value could be found arithmetically to any desired degree of approximation. VI. For it is evident that (1) any alteration in .. Now h is and the series be large. i<:=-|(i+2+3+ _ x(x-\) X . . or the. ratio of this to k is 3/ \l2h. (4) where. and indeed large compared with x. p. or the numbers of observations which these areas represent. A normal curve is evidently defined completely by giving the values of y^ and o. finally. as suggested in § 8. let us approximate by assuming. and mode therefore coincide.and assigning the origin of x. must be the same. median. law of error. 357-8. doubling y^ doubles every ordinate y^. and therefore doubles (2) any alteration in o. 2'50664:. of every ordinate and therefore doubling a. or number of observations N. of course. the values are.produces a proportionate alteration in the area. Ordinates of the Curve y = e ''.g. The value of a may be found approximately by taking yp and <t both equal to unity. The table below gives the values of y for values of x proceeding by fifths of a unit . calculating the values of the ordinates y^ for equidistant values of x. for the values of y^ are the same for the the area same values of area. The area is represented. of' the curve. and taking the area. —BINOMIAL : DISTRIBUTION AND NORMAL CURVE. 303 ^g produces a proportionate alteration in the area of the curve. e. see list on pp. 11. or the therefore proportional to number of observations or we must have N= a X y^^a- where a is a numerical constant. {For references to more extended tables. the area is therefore.doubles the distance from the mean. and consequently doubles the y^a-. approximately. . as given by the sum of the ordinates multiplied by the interval.) X. xjar. the same for positive and negative values of x.XV. For the whole curve the sum of the ordinates will be found to be 12 53318. the interval being 0'2 units . therefore. Thus if to— 64. . Aswill be seen.. of 1/ J^. however. Chap. For this distribution the mean deviation = 0. 12. . however. do not is for . The closeness of approximation is partly due to the fact that. the normal curve gives the terms of the symmetrical binomial surprisingly closely even for moderate values of n. and the standard-deviation is 4.a-.\/2/7r = 0'79788 o-: the proof cannot be given within the limitations of the present work. to a high degree of approximation. Deviations x have therefore to be considered up to ±12 or more. is strictly true the normal curve only. A = 32.000 observations) up to x= +15.. If THEORY OF STATISTICS.304 Stirling (1730). . ji be large. that the mean deviation the origin of the use of J2x<r (the " approximately 4/5 of the standard-deviation. In point of fact. VIII. : . the terms of the -second order in expansions of corresponding brackets in numerator and denominator cancel each other these terms. from the annexed table. viz. VIII. 17). Another rule cited in Chap. In the proof of § 9 the assumption was made that k (the half of the exponent of the binomial) was very large compared with X (any deviation that had to be considered). in applying the logarithmic series to the fraction on the right of equation (3). and this is modulus ") as a measure of dispersion. \n= sj^mr -^ Applying have Stirling's theorem to the factorials in equation (1) we The complete expression for the normal curve is therefore ^=7W* The exponent may be written x^jc^ . (6) where c= tji. which is over 1/3 of k.. § 13). 2 or 1J2 becomes meaningless if the distribution be not normal. we have.cr as a measure of " precision.. The rule that a range of 6 times the standard-deviation includes the great majority of the observations and that the quartile deviation is about 2/3 of the standard-deviation were also suggested by the properties of this curve (see below g§ 16." and of 2cr^ The use of the factor as " the fluctuation " (c/. the ordinates of the normal curve agree with those of the binomial to the nearest unit (in 10. — XV. —BINOMIAL DISTKIBUTION AND NORMAL CURVE. 305 accumulate, but only the terms of the third order. There is only one second-order term that has been neglected, viz. that due to the last bracket in the denominator. Even for much lower values of m than that chosen for the illustration e.g. 10 or 12 (c/. Qu. 4 at the end of this chapter) the normal curve still gives a very fair approximation. — Table (2) shoviing (1) Ordinates of the Binomial Series 10,000 (i ; + J)^ and 32 Corresponding Ordinates of the Normal Curve y = —— 4v25r 10,000 « Term. 306 THEORY OF STATISTICS. Nelwmbitim, Pearl, American Ifaturalist, Nov. 1906). The question why, in such cases, the distribution should be approiimately normal, a form of distribution which we have only shown to arise if the variable is the sum of a large number of elements, each of which can take the values and 1 (or other two constant values), these values occurring independently, and with equal frequency. In the first place, it should be stated that the conditions of the deduction given in § 9 were made a little unnecessarily restricted, arises, therefore, tsoo , 1200 >aoo ^600 K300 XV. —BINOMIAL DISTRIBUTION AND NORMAL CURVE. 307 still, without introducing the conception of the binomial at all, by founding the curve on more or less complex cases of the theory of sampling for variables instead of for attributes. If a variable is the sum (or, within limits, some slightly, more complicated function) of a Icurge number of other variables, then the distribution of the compound or resultant variable is normal, provided that the elementary variables are independent, or nearly so (cf. ref. 6). The forms of the frequency-distributions of the elementary variables affect the final distribution less and less as their number is increased only if their number is moderate, and the distributions all exhibit a comparatively high degree of asymmetry of uniform sign, will the same sign of asymmetry be sensibly evident in the distribution of the compound variable. On this sort of hypothesis, the expectation of normality in the case of stature may be based on the fact that it is a highly compound character depending on the sizes of the bones of the head, the vertebral column, and the legs, the thickness of the intervening cartilage, and the curvature of the spine the elements of which it is composed being at least to some extent independent, i.e. by no means perfectly correlated with each other, and their frequency-distributions exhibiting no very high degree of asymmetry of one and the same sign. The comparative rarity of normal distributions in economic statistics is probably due in part to the fact that in most cases, while the entire causation is certainly complex, relatively few causes have a largely predominant influence (hence also the frequent occurrence of irregular distributions in this field of work), and in part also to a high degree of asymmetry in the distributions of the elements on which the compound variable depends. Errors of observation may in general be regarded as compounded of a number of elements, due to various causes, and it was in this connection that the normal curve was first deduced, and received its name of the curve of errors, or law of error. compare some actual distribution 14. If it be desired to with the normal distribution, the two distributions should be superposed on one diagram, as in fig. 49, though, of course, on a much larger scale. When the mean and standard-deviation of the actual distribution have been determined, 7/^ is given by equation (5) ; the fit will probably be slightly closer if the standard-deviation is adjusted by Sheppard's correction (Chap. XI. § 4). The normal curve is then most readily drawn by plotting a scale showing fifths of the standard-deviation along the base line of the frequency diagram, taking the mean as origin, and marking over these points the ordinates given by the figures The curve of the table on p. 303, multiplied in each case by y^. : more general — — 308 THEORY OF STATISTICS. , can be drawn freehand, or by aid of a curve ruler, through the The logarithms of y in the tops of the ordinates so determined. table on p. 303 are given to facilitate the multiplication. The only point in which the student is likely to find any difficulty is in the use of the scales he must be careful to remember that the standard-deviation must be expressed in terms of the class-interval as a unit in order to obtain for y^ a number of observations per interval comparable with the frequencies of his : table. The process may be varied by keeping the normal curve drawn to one scale, and redrawing the actual distribution so as to make the area, mean, and standard-deviation the same. Thus suppose a diagram of a normal curve was printed once for all to a scale, say, of y^ — 5 inches, o- = 1 inch, and it were required to fit the distribution of stature to it. Since the standard-deviation is 2-57 inches of stature, the scale of stature is 1 itich = 2'57 inch of stature, or 0'389 inches = 1 inch of stature ; this scale must be drawn on the base of the normal-curve diagram, being so placed that the mean falls at 67"46. As regards the scale of frequency-per-interval, this is given by the fact that the whole area of the polygon showing the actual distribution must be equal to the area of the normal curve, that is 5 J2ir= 12 "53 square inches. If, therefore, the scale required is n observations per interval to the inch, we have, the number of observations being 8585, TO X 2-57 which gives n = 266'6. Though the second method saves curve drawing, the first, on the whole, involves the least arithmetic and the simplest plotting. 15. Any plotting of a diagram, or the equivalent arithmetical comparison of actual frequencies with those given by the fitted normal distribution, affords, of course, in itself, only a rough test, of a practical kind, of the normality of the given distribution. The question whether all the observed differences between actual and calculated frequencies, taken together, may have arisen merely as fluctuations of sampling, so that the actual distribution may be regarded as strictly normal, neglecting such errors, is a question of a kind that cannot be answered in an elementary work (cf. ref. 22). At present the student is in a position to compare the divergences of actual from calculated frequencies with fluctuations of sampling in the case of single class-intervals, or single groups of class-intervals only. If the ; XV. — BINOMIAL DISTRIBUTION AND NORMAL CURVE. 309 expected theoretical frequency in a certain interval is /, the standard error of sampling is n^(N -/)/N ; and if the divergence of the observed from the theoretical frequency exceed some three times this standard error, the divergence is unlikely to have occurred as a mere fluctuation of sampling. It should be noted, however, that the ordinate of the normal curve at the middle of an interval does not give accurately the area of that interval, or the number of observations within it it would only do so if the curve were sensibly straight. To deal strictly with problems as to fluctuations of sampling in the frequencies of single intervals or groups of intervals, we require, accordingly, some convenient means of obtaining the number of observations, in a given normal distribution, lying between any two values of the variable. 16. If an ordinate be erected at a distance a;/o- from the mean, in a normal curve, it divides the whole area into two parts, the ratio of which is evidently, from the mode of construction of the curve, independent of the values of y^ and of o-. The calculation of these fractions of area for given values of a;/cr, though a long and tedious matter, can thus be done once for all, and a table giving the results is useful for the purpose suggested in § 15 and in many other ways. Eeferences to complete tables are cited at the end of this work (list of tables, pp. 357-8), the short table below being given only for illustrative purposes. The table shows the greater fraction of the area lying on one side of any given ordinate e.g. 0"53983 of the whole area lies on one side of an ordinate at '460 17 on the other side. O'lo- from the mean, and It will be seen that an ordinate drawn at a distance from the mean equal to the standard-deviation cuts off' some 16 per cent, of the whole area on one side ; some 68 per cent, of the area will therefore be contained between ordinates at ± a. An ordinate at twice the standard-deviation cuts off only 2-3 per cent., and therefore some 95'4 per cent, of the whole area lies within a range of +2a-. As three times the standard-deviation the fraction of area cut oif is reduced to 135 parts in 100,000, leaving 99'7 per cent, within a range of ± 3o-. This is the basis of our rough rule that a range of 6 times the standard-deviation will in general include the great bulk of the observations the rule is founded on, and is only For other forms of strictly true for, the normal distribution. distribution it need not hold good, though experience suggests The binomial distribution, that it more often holds than not. especially if jo and q be unequal, only becomes approximately normal when n is large, and this limitation must be remembered in applying the table given, or similar more complete tables, to cases in which : : the distribution is strictly binomial. 310 THEOKY OF STATISTICS. Table shoiving the Greater Fraction of fhe Side of an Ordinate of Abscissa xja, tables, see list on pp. 357-8.) Area of {For references a Normal Owve to One to more extended XV. —BINOMIAL DISTRIBUTION AND NORMAL CURVE. 311 unreliability of observed statistical results, and the term probable error is given to this quantity. It should be noted that the word "probable" is hardly used in its usual sense in this connection: the probable error is merely a quantity such that we may expect greater and less errors of simple sampling with about equal frequency, provided always that the distribution of errors is On the whole, the use of the "probable error" has little normal. advantage compared with the standard, and consequently little stress is laid on it in the present work ; but the term is in constant use, and the student must be familiar with it. It is true that the " probable error " has a simpler and more direct significance than the standard error, but this advantage is lost as soon as we come to deal with multiples of the probable error. Further, the best modern tables of the ordinates and area of the normal curve are given in terms of the standard-deviation or standard error, not in terms of the probable error, and the multiplication of the former by 0'6745, to obtain the probable error, is not justified unless the distribution is normal. For very large samples the distribution is approximately normal, even though j) and q are unequal ; but this is not so for small samples, such as often occur in practice. In the case of small samples the use of the "probable error" is consequently of doubtful value, while the standard error retains its significance as a measure of dispersion. The " probable error," it may be mentioned, is often stated after an observed proportion with the ± sign before it ; a percentage given as 20'5±2-3 signifying "20'5 per cent., with a probable error of 2 "3 per cent." If an error or deviation in, say, a certain proportion p only just exceed the probable error, it is as likely as not to occur iii simple sampling if it exceed twice the probable error (in either direction), it is likely to occur as a deviation of simple sampling about 18 times in 100 trials or the odds are about 4-6 to 1 against its occurring at any one trial. For a range of three times the probable error the odds are about 22 to 1, and for a range of four times the probable error 142 to 1. Until a deviation exceeds, then, 4 times the probable error, we cannot feel any great confidence that it is likely to be "significant." It is simpler to work with the standard error and take + 3 times the standard error as the critical range for this range the odds are about 370 to 1 against such a deviation occurring in simple sampling at any one trial. 18. The following are a few miscellaneous examples of the use of the normal curve and the table of areas. hundred coins are thrown a number of times. Exa/mple i. How often approximately in 10,000 throws may (1)' exactly 65 heads, (2) 65 heads or more, be expected % : — ; —A 312 THEORY OF STATISTICS. is The standard-deviation distribution as normal, j'(, >/0-5 x 0-5 x 100 = 5. Taking the = 797"9. The mean number of heads being 50, 65 - 50 = 3o-. The frequency of a deviation of 3a- is given at once by the table (p. 303) A as 797-9 X -0111 .... =8-86, or nearly 9 throws in 10,000. throw of 65 heads will therefore be expected about 9 times. The frequency of throws of 65 heads or more is given by the area table (p. 310), but a little caution must now be used, owing A throw of 65 heads is to the discontinuity of the distribution. equivalent to a range of 64-5-65'5 on the continuous scale of the normal curve, the division between 64 and 65 coming at 64-5. 64-5- 50= +2-9a-, and a deviation of +2-9.0- or more, will only occur, as given by the table, 187 times in 100,000 throws, or, say, 19 times in 10,000. Example ii. Taking the data of the stature-distribution of fig. 49 (mean 67-46, standard-deviation 2-57 in.), what proportion of all the individuals will be within a range of + 1 inch of the — meanl 1 inch =0-389(r. Simple interpolation in the table of p. 310 gives 0-65129 of the area below this deviation, or a more extended Within a range of table the more accurate value 0-65136. ± 0-389o- the fraction of the whole area is therefore 0-30272, or the statures of about 303 per thousand of the given population will lie within a range of + 1 inch from the mean. Example iii. In a case of crossing a Mendelian recessive by a heterozygote the expectation of recessive offspring is 50 per cent. (1) How often would 30 recessives or more be expected amongst 50 ofFspring owing simply to fluctuations of sampling 1 (2) How many oifspring would have to be obtained in order to reduce the probable error to 1 per cent. ? The standard error of the percentage of recessives for 50 — observations is 50 \/l/50 = 7-07. Thirty recessives in fifty is a deviation of 5 from the mean, or, if we take thirty as representing 29"5 or more, 4-5 from the mean; that is, 0'636.a-. positive deviation of this amount or more occurs about 262 times in 1000, so that 30 recessives or more would be expected in more than a quarter of the batches of 50 ofFspring. have assumed normality for rather a small value of n, but the result is sufficiently accurate for practical purposes. As regards the second part of the question we are to have A We •6745x50 n being the nearest unit. 7i>=l, This gives to=1137 to the number of offspring. 2 "32 times this. and 5'52 The corresponding fractions of that is. or. iv.) . it could certainly have occurred as a fluctuation of sampling. ref. Maomillan & Co. or once in a hundred. Calculating the ordinate of the normal curve directly we find the frequency 197'8. It is true that p and q are very unequal. errors as REFERENCES. difference. Francis. as The is evident from the form of the curve. 1889. of fig. better. the true limits of the interval are 61i§— 62lf.—BINOMIAL DISTRIBUTION AND NORMAL CURVE. nearest \ in. area are 0"96071 and 0'98418. a little too small. though a first approximation only. in.. mid-value 62'44. l-759cr and 2-148<r. and less than 63" is markedly less than the theoretical value. but then n is very large (8585) so large that the — difference of the chances is fairly small compared with' tjupq Hence we may take the distribution of roughly normal to a first approximation. (Mechanical method of forming a binomial or normal distribution. so we would expect an equal or greater deficiency to occur about 10 times in 1000 trials. Natv/ral Inheritance. The Binomial Machine. difference of theoretical and observed frequencies is therefore But the proportion of observations which should fall into the given class is 0*023. or fraction of area between the two ordinates. Multiplying this by the whole number of observations (8585) we have the theoretical frequency 201-5. (1) Galton. see below. and the standard error of the class frequency is accordingly As the actual deviation is only v'O-023 X 0977 x 8585 = 14-0. This is a deviation from the mean (67 "46) of 5-02. (about one-fifteenth). 63 . interval actually lies between deviations of 4-52 in. p. . This is certainly.. London. v. or 61'94-62'94. The 32 '5. 13. The question how often it might have occurred can only be answered if we assume the distribution of fluctuations of sampling to be approximately normal. The tables give 0-990 of the area below a deviation of 2"32o-. chap. Could such a difference occur owing to fluctuations of simple sampling . and if so. 313 Example —The diagram 49 shows that the number XV. how statures often might it happen % The actual frequency recorded is 169. To obtain the theoretical frequency we may either take it as given roughly by the ordinate in the centre of the interval. for Pearson's generalised machine. use the integral Remembering that statures were only recorded to the table. the proportion falling into other classes 0'977. of recorded in the group "62 in. 0'02347. 1906.. J. F. G. cxovii. Y. 102. XX. and similar Frequency-distributions. Soc. Edgeworth. Chap. 343. Stat. X." .. vol. 1879. p... 169.. 113-141 (and an appendix. (16) Sheppard. F. p. Ixix. (4) . (16) Perozzo. pp. not printed in the Oambridge Phil. 1907. xxix.. p. E. The memoir deals with curves derived from the general binomial. 1906. For a derivation of the same curves from a modified standpoint. 702-706. Wm. Ixii. F. and others. Leipzig. 1898 . vol.: 314 THEORY OF STATISTICS.. A Rejoinder. London. "On the Representation of Statistical Frequency by a Curve.. pp. F.. The student will find other citations in 6. Kollektivmasslehre (herausgegeben von G. pp.. F.. Karl. Trans." Mem." Cambridge Phil. . LniGi. 1905. 1901. 280. Nixon. Groningen . vol. 1913. Stat. p." Jour. Boy. 1903." PhU. Trans. and from a somewhat analogous series derived from the case of sampling from limited material. Y. J. C. F. Ixi. Izxxi. 7. 1902. Soc. Soc. Dawson & Sons. Y. 1899 and vol. Macalister. clxixvi. see § 1. p. 1897. 1900. p. L. Skew Frequency Curves in Biology and Statistics. 497. a Class of Normal Functions occurring in Statistics. Series A. (2) (3) Cttnningham. vol. "Skew Variation in Homogeneous Material. Biometrika." Jowr. Boy. p. vol.... 7). vol. For the early classical memoirs on the normal curve or law of error hy Laplace. ). C. Soc. ibid.. Edgbworth. of which 6. Pearson.) Kapteyn. vol. (Includes a geometrical treatment of the normal curve. Karl. 367. The literature of this subject is too extensive to enable us to do more than cite a few of the more recent memoirs. Lund. 36-65.. V. vol. "The Law of Error. F. vol. vol. Uhakliee.. Article on the "Law of Error" in the Eneydopcedia Britannica. 310. Ixxvi. Soc. Series A.. Donald. "Das Fehlergesetz und seine Verallgemoinerungen durch Fechner und Pearson " . " Nuove Applicazioni del Caloolo delle Probability alio Studio del Fenomeni Statistiei e Distribuzione dei Matrimoni secondo I'Eth degli Sposi. Boy. cxcii. (14) Pearson. cf. Soc. and 13 are of fundamental importance. and 14. Stat. Stat.. For the generalised binomial machine. i-xiv. dei Lincei. Soc. "The i»-Functions.. iv. (6) (6) (7) (8) (9) (10) (11) (12) (13) ref. Ixiii. Gauss. 1882. Fechner. 10th edn.. 1904. Noordhoff. or Law of Great Numbers. 101. Beale Accad. 18. " On the Application of the Theory of Error to Cases of Normal Distribution and Normal Correlation. p. Boy. etc. p.." Jour. x. T. Edgeworth. "The Law of the Geometric Mean. 8. vol. delta Classe di Seieme morali. Soc. see Todhunter's History (Introduction ref. Sujjplement to the memoir. vol. "Researches into the Theory of Probability" [CommuHicatitms from the Astronomical Observatory. Y. xxviii. 443. G." Phil. "The Generalised Law of Error. Lipps. "An Experimental Test of the Normal Law of Error.) YnLE. Soy. Lund) . Boy." Jovr. Boy. vol.. U.. Series 3. Y." Proc.. Engelmann. W.. Boy. Edgkworth. Frequency Curves.. Edgbwokth." Proc. 1908. ignoring the binomial and analogous distributions. Trans. Trans. 1895. W. 1898.... "On the Representation of Statistics by Mathematical Formnlse. vol. Ixx. " On the Distribution of Deaths with Age when the Causes (17) of Death act cumulatively. Series A. and compare them with those of the normal curve. (1) as found directly. Racial Differentiation. 1899. Soy. and a normal curve of the same area.. Stat. cited in (2). 157. 3. Show that if two symmetrical binomial distributions of degree n (and of the same number of observations) are so superposed that the rth term of the one coincides with the (r-H)th term of the other. i. Testing the Fit of an Observed to a Theoretical or another Observed Distribution. 1914. vol. 110. 1894. Karl. (22) "On the Criterion that a given System of Deviations from the Probable. p.e. 85-143. the second the full memoir. Earl. p. —BINOMIAL DISTKIBUTION AND NORMAL CURVE. Soc. 13.. formed by adding superposed terms is a symmetrical binomial of degree n+\.. section vi.) Jo-wr. 5th Series..." Phil. Karl. 230.. 6th Series. Calculate the ordinates of the binomial 1024 (0•5-^0•5)"'. (23) Pearson. p. The Resolution of a Distribution compounded of two Normal Curves into its Components. i. 1911. the compound curve is very nearly normal. " ' ' (19) Trans. Stat. (A binomial distribution with negative index. vi. (18) Peakson.. (2) as calculated from the standard-deviation. 1905. XV. and standard-deviation superposed thereon. Calculate the theoretical distributions for the three experimental oases (1). " Contributions to the Mathematical Theory of Evolution (on the Dissection of Asymmetrical Frequency Curves).. F." PhU. iv. ref. and (3) cited in § 7 of Chapter XIII. 1. Series A." Biometrika. p. 6. 1901. vol. and the related curve." Biomttrika.. p. 1910. "Per la risoluzione delle curve dimorfiche. also Biometrika. Mag. a special case of one of Pearson's curves. vol. ] 4. vol. 250 . Roy. 315 Roy. clxxxv... Rome. 5. 2." part ii. p. Show that if np be a whole number. is such that it can be reasonably supposed to have arisen from random sampling.) See also the memoir by Charlier. [Note it follows that if two normal distributions of the same area and standard-deviation are superposed so that the difference between the means is small compared with the standard-deviation. Pearson. 125. Kabl. 1900. On the Representation of Statistics by Mathematical Formulae. Ixii. mean. 26. On some Applications of the Theory of Chance to (20) Pbakson. vol. 71. Ixxiii. Draw a diagram showing the distribution of statures of Cambridge students (Chap.. (2). on the assumption that the distribution is normal. Y." PAtZ. in the Case of a Correlated System of Variables. Soc. vol. : . " On the Probability that Two Independent Distributions of Frequency are really Samples from the same Population. vol. EXERCISES. vol. Also memoir under the same title in the Transactions of the Reale Accademia dei Lincei. Compare the values of the semi-interquartile range for the stature distributions of male adults in the United Kingdom and Cambridge students.. x. the mean of the binomial coincides with the greatest term. viii. (The first is a short note. Table VII. Mag. Soc. pp. (21) Helgtjbko. Jour. of that memoir dealing with the problem of dissection. 1906.J. p. VI. the distrib ution 1.. . vol. Fernando de. Edge-worth. ? 11. and 4 per cent. 38 per cent. find approximately (assuming that the distribution is normal) the mean and standard-deviation of a series in which 58 per cent. Taking the mean stature for th? British Isles as 67 '46 in. In similar experiments.. mesocephalic. Example ii. might (a) 30 per cent. In what proportion of similar experiments based on (1) 100 seeds. certain crosses of Pisum. brachyeephalic. what number of seeds must be obtained to make the probable error " of the proportion 1 per cent. (the distribution of fig. and the common standard-deviation as 2 "56 in. ' ' . and brachycephcUic when the index is over 80. . (6) 35 per cent. if ever ? 10. are stated to be dolichocephalic. the mean for Cambridge students as 68'85 in. As stated in Chap.. or more.316 THEORY OF STATISTICS 7. (2) 1000 seeds. of green seeds.. Xlll. mesocephalic when the same index lies between 75 and 80. If skulls are classified as dolichocephalic when the length-breadth index is under 75. what percentage of Cambridge students exceed the British mean in stature.. the standard error being 0'51 per cent. assuming the distribution normal ? 8. satimim based on 7125 seeds gave 25 '32 per cent. centage of experiments based on the same number of seeds might an equal or greater percentage be expected to occur owing to fluctuations of sampling alone ? 9. be expected to occur. or more. of green seeds instead of the theoretical In what perproportion 25 per cent. 49). NOEMAL CORRELATION. Constancy of the standarddeviations of parallel arrays and linearity of the regression— 5. constancy of standard-deviation of arrays. The normal surface for two correlated variables regarded as a normal surface for uncorrelated variables rotated with respect to the axes of measurement arrays taken at any angle across the surface are normal distribution of and distributions with constant standard-deviation correlation between linear functions of two normally correlated variables are normal principal axes 7. IX. Outline of the principal properties of the normal dis- — : — : : — : — : — — tribution for n variables. But the generalised normal law is of importance in the theory of sampling it serves to describe very approximately certain actual 1. a knowledge of this special type of frequency-surface ceased to be so essential.CHAPTER XVI. if more than two variables are involved.g. contour lines 12-13. Isotropy of the normal distribution for two variables 14. notably the standard-deviations of arrays (and. 317 . and if it can be assumed to hold good. to test normality linearity of regression." is of great historical importance. as in Chap. or "normal correlation surface. of measurements on man) . The contour lines a series of concentric and similar ellipses 6. be familiar with the more fundamental properties of the distribution. normality of distribution obtained by diagonal addition. Chap. without reference to the form of the distribution of frequency. the partial correlationcoefficients). based on the assumption of such a distribution . as the earlier work on correlation is. expression that we have obtained for the " normal " dissingle variable may readily be made to yield a corresponding expression for the distribution of frequency of pairs of values of two variables. some of the expressions in the theory of distributions correlation. Thb tribution of a : {e. therefore. 1-3... Investigation of Table III.. Standard-deviations round the principal axes 8-11. though when it was recognised that the properties of the correlation-coefficient could be deduced. This normal distribution for two variables. can be assigned more simple and definite meanings than in the general case. The student should. almost without exception. IX. Deduction of the general expression for the normal correlation surface from the case of independence 4. assuming independence. and a similar statement holds for we the array of x^'s .. with the sam. these properties must hold good. If they are not x^. are a series of similar ellipses with major and minor axes parallel to the axes of x-^ and Xg ^^d proportional to o-j and o-j. XII. the frequency-distribution of pairs of values must. the correlation-coefficient being zero.i.j are uncorrelated. singly.318 THEORY OF STATISTICS. The contour lines of the surface. . equally frequent. that is to say. one special ajj = a consee that every section of the surface by a vertical plane parallel to the z^ axis. § 13). lines drawn on the surface at a constant height. the distribution of any array of a^'s. Equation stant. (2) (2) gives a normal correlation surface for If we put case.e mean and standard-deviation as the total distribution of ajj's. Consider first the case in which the two variables are completely independent. cKj . as the two variables are assumed independent (c/. V. 2.1 = '^2 " ''21"'''l Xj and Kj. (4) and x^ related by an equation of this form are. is a normal distribution. be given by -<4h-|) where . merely uncorrelated but completely independent. be 2'i = 3'i« (1) 2<r? 2'2 = y2« Then. and if the disand as also x^ . therefore.e. by the rule of independence. the equations to the contour lines being of the general form 3 + 4 = C^ Pairs of values of 3.. . . of course. 8) that "'2. To pass from of if this special case of independence to the general § case two correlated variables. i. Let the distributions of frequency for the two variables x-y and x^. remember (Chap. Chap. we must have for the frequency-distribution of pairs of deviations of ajj and x^.o-2(l -rjj)* ' 4. '1. therefore.o-i.1 ^' 12 ""ls.2 We have. 319 tribution of each of the deviations singly be normal. If we assign to x^ some fixed value.i 27r. we see iCi's.1' This is a normal distribution of standard-deviation r-i^-^'h^ o'2 o-j. since fx^ ."'2.1 * Evidently we would also have arrived at precisely the same expression if we had taken the distribution of frequency for Kj and jBj. (6) and Kj-j. (5) But = ^ + ^-2r.o-i. are independent. say Aj.a-2..2 -_2 (72. and reduced the exponent ^ 4. —NORMAL COKKELATION. .j. distribution of the array of x^'b of type h^.j.2 2ir.^ yi2=?''i2« ^"^ '"^ • • .o-i. Wj and ajj.g. that the standard-deviations of all arrays of x-^ are the same.2 2. we must have ^'^ ' 27r. we have the \ <ri.?!:? 01 0-1.^) Further.o-2. with a mean deviating by tion of (1) from the mean of the whole distribu- As ^2 represents any value whatever of ajj.XVI. .2 "fil 1. the general expression for the normal correlation surface for two variables -j(4++-^'. 5. independence.Lines of means Contour lines and Axes of normal correlaCion surface Fig. however.j is strictly linear. of course. we will find (1) that the standard-deviations of all arrays of x^ are the same (2) that the regression of x^ on a.. RR . a the major and minor axes are. no longer parallel to the axes of x^ and Kj. (2) that the regression of x-^ on x^ is strictly if we assign to ic-^ any value h^. Fig. — Principal Axes and Contour Lines of the normal Correlation Surface. o-jj : THEORY OF STATISTICS. 50 illustrates the calcuand CO being lated form of the contour lines for one case. : Axes of Measurement M = Mean of and IS whole surface also the summit of Che surface R 9 . The contour lines are. Similarly.CC.320 and equal to linear. hut make a certain angle with them. As each line of regression cuts every the lines of regression. as in the case of series of concentric and similar ellipses . 50. = ^ . It also follows that.XVI. aij. coincide in direction and sense with the Wg-axis. we must use a little trigonometry. i. 2(fj^j) together equations (8) and summing. 6.. and since for every angle through which it is turned the distributions of all Xj arrays and x^ arrays are normal.e. If fj. —NORMAL COEKBLATION. 321 array of Xj or of x^ in its mean. = 0. somewhat truncated. aTj. 29. 166. are uncorrelated. RR must bisect every horizontal chord and CC every vertical chord..e normal for every angle though which the surface is turned. (2) the correlation between any two linear functions of two normally distributed variables must be normal correlation. + 2r^. p. as illustrated by the two chords shown by dotted lines: it also follows that RR cuts all the ellipses in the points of contact of the horizontal tangents to the ellipses. Hence. multiplying = {a^-ai) sin 26 tan 2. 50. it follows that every section of a normal surface by a vertical plane is a normal curve. a normal surface for two correlated variables may be regarded merely as a certain surface for which r is zero turned round through some angle. and CC in the points of contact of the vertical tangents. is shown in fig. The surface or solid itself. and as the distribution of every array is symmetrical about its mean.. from the position for which the correlation is zero to the position for which the coefiBcient has some assigned value r. the angle 6 being taken as positive for a rotation of the jBj-axis which will make it. To find the angle 6 through which the surface has been turned. since ^j f. But these would give the distributions of functions like a. the distributions of arrays taken at any angle across the surface are normal. ^j-^^ ^^^ '^°' ordinates referred to the principal axes (the |j-axis being the Xj axis in its new position) we have for the relation between ^j. the distributions of totals given by slices or arrays taken at any angle across a normal surface must be normal distributions. ^j = Xy cos 6 + x^.Xj + J. sin 6 \ (8) But. if continued through 90°.ajj. as we see from fig. since the total distributions of x^ and x^ must b. The major and minor axes of the ellipses are sometimes termed the principal axes. and consequently (l)'the distribution of any linear function of two normally distributed variables x^ and x^ must also be normal . Since.<!-i<T2 cos 20 (9) 21 . ^2. Care must. They may be most readily determined as follows. equation (9) gives the angle that they make with the axes of measurement whether the distribution be normal or no.2i22" But these two values therefore of the central ordinate must be equal. while we have deduced (11) from a simple consideration depending on the normality of the distribution. the correlation between two variables may be expected to be approximately normal if that variable slightly may some more complex function) . say 2j and Sj. be taken to give the correct signs to the square root in solving. within limits. provided that these elementary component variables are independent. or nearly so. for evidently from § 2 the major and minor axes of the contour-ellipses are proportional to these two standard-deviations. (10) for Referring the surface to the axes of measurement. the major axes of the ellipses lying along ^^ but if r be negative. summing and adding. 2j -i. 8. § 13. 2j . ^j are uncorrelated. XV. + -^ = . if It should be noticed that we distribution for two variables as being a pair define the principal axes of anj of axes at right angles for which the variables fj.: 322 THEORY OF STATISTICS. about the 7. the central ordinate by equation (7) we have iV Referring it to the principal axes. principal axes are of some interest. for if As stated in Chap. Squaring the two transformation equations (8). by equation If (3) y\2-- 2ir. It should be noted that.2^ is also negative. and may be obtained at somewhat greater length from the equations for transforming co-ordinates. of a large number of other variables. it is really of general application (like equation 10). and 2j . we have ^.Sg is necessarily positive.22 also if r is positive. . The two standard-deviations. the frequency-distribution any variable may be expected to be approximately normal be regarded as the sum (or. however.^ may be very simply obtained in any arithmetical case. . 2A = <ri<ra(l-»^^)> (10) (11) and (11) are a pair of simultaneous equations from which 2j and '2.ji + oi . Similarly. . p. 206). the approximate normality of the total distributions for this table. whether the normal surface will fit the distribution of the same character in pairs of individuals we leave it to the student to test. in one instance at least. p. 160. —NORMAL COEKBLATION. We can now utilise Table III.— XVI. Stature is a highly compound character of this kind. or some sHghtly more complex function. of the normal distribution the constancy of the standard-deviation for all parallel arrays. of a large number of elementary component variables. when drawing arising as fluctuations of samples from a record for which the regression is strictly regression is appreciably linear. Subject to some investigation as to the possibility of the deviations that do occur simple sampling. viz. as far as he can do so by simple graphical methods. Chap. 204 the standarddeviations of ten of the columns of the present table. 37. as far as we can by elementary methods. and the closeness of the regression to linearity was confirmed by the values of the correlation-ratios (p. The first important property of the normal distribution is the linearity of the regression. from the column headed 62'5-63'5 onwards . 174. the intensity of correlation depending on the proportion of the components common to the two variables. p. 323 each of the two variables may be regarded as the sum. and we have seen that. 0-52 in each case as compared with a correlation of 0'51. IX. showing the correlation between stature of father and son. This was well illustrated in fig. The second important property is for two variables 2-56 . we may conclude that the : linear. X. these were 9.. the distribution oif stature for a number of adults is given approximately by the normal curve. We gave in Chap. to test.. the distributions of arrays are very irregular. as a rough test suggests that they distributions of all might have done 10. so. we find. but the totals of arrays at an angle of 45° will be given with sufficient accuracy for our present purpose by the totals of lines Eeferring again to Table of diagonally adjacent compartments. allied property of a normal correlation-table. and forming the totals of such diagonals (running up from left to right). IX. III. Chap. starting at the top left-hand : corner of the table.. Next we note that the arrays of a normal surface should themselves be normal. the following distribution : 0-25 . Owing. and not parallel to either From an ordinary correlationaxis of measurement (c/. and their normality cannot be tested we can only say that they do not in any very satisfactory way But we can test the exhibit any marked or regular asymmetry. § 6).. viz. table we cannot find the totals of such diagonal arrays exactly. a fact. however.— 324 THEORY OF STATISTICS. that the totals of arrays must give a normal distribution even if the arrays be taken diagonally across the surface. strictly normal. but. to the small numbers of observations in any array. fig. and. so far as the graphical test goes. sin ^=cos 6=1/ J2 fitting = 3'361). certainly there is no marked asymmetry. One of the greatest divergences of the actual distribution from the normal curve occurs in the almost central interval with frequency 78 the difference between the observed and calculated frequencies is here 12 units. = 0-51. the distribution is rather irregular but the fit is fair . the distribution may be regarded as appreciably normal. — NORMAL 0-3 CORRELATION.XVr. so that it may well have occurred as a fluctuation of simple sampling. rj2 325 and inserting find o-{ <rj = 2-72. Drawing a diagram and a normal curve we have 51 . = 2-75. : 100 80 eo k 4C 20 . but the standard error is 9'1. referred to therefore be written in the form (3-36)'^ (1-91)^ .= 67-70 Jf2 = 68-66 2-75 '^12 ' c{= 2-72 (7) 0-2= =" . may to the contour ellipses. Sj and Sg. 3 units for a frequency such fluctuations might of 9. 51 Hence we have from equation ^12 = 26-7 and the complete expression for the fitted normal surface is ac n \6-47 6-60 6-43 . square root (Chap. or 2 units for a frequency of 4 cause wide divergences in the corresponding contour lines. whence 20 = 91° 14'. but it is very much easier to draw To do this the ellipses if we refer them to their principal axes. and 2 the same constants for : the sons. not with a protractor. the principal axes standing very nearly at an angle of 45° with the axes of measurement.22 = 5-275 2j. § 12). and this implies a standard error of about 5 units at the centre of the table.2. To obtain 2j and 2^ we have from (10) and (11) 25 + ^=14-961 2^=12-868 22i Adding and subtracting these equations from each other and taking the square root. . 6 = 45° 37'. From (9).owing to the two standard-deviations being very nearly equal. XIII.-. Using the suffix 1 to denote the constants relating to the distribution of stature for fathers. The equations the principal axes. owing to the principal axes standing nearly at 45° the first value is sensibly the same as that found for o-f in § 10. = 1-447 whence 2j = 3-36. we must first determine &. 22=1-91. tan 26= -46-49. but by taking tan 6 from the tables (1-022) and calculating points on each axis on either side of the mean. 2^ -I. They should be set off on the diagram.'' The equation to any contour ellipse will be given by equating the index of e to a constant. #=1078 Jf.326 THEOKY OF STATISTICS. 10 and 20. we find c = l-83. 6-15.XVI. or the following 6S Fio.— Contour Lines for the Frequencies 5. 1-40 and 0'76. Pj -Pj. IX. 1-45. 52. ellipses drawn with these axes are shown in fig. 327 To the major and minor axes being 3-36 x c and 1-91 x c respectively. mean. —NORMAL CORRELATION. very much : — .. 4-70. 3-50. 2-55: semi-minor axes. : semi-major values for the major and minor axes of the ellipses The axes. Chap. and corresponding Cpntour Ellipses of the fitted Normal Surface. find c for any assigned value of the frequency y we have yi2=yi2« . principal axes M.-h^ c2_ 2(iogy'i2-iogyi2) log e Supposing that we desire to draw the three contour-ellipses for y = 5. 64 66 66 67 68 : 69 inches 70 71 n 73 Stature of Father 52.. P^ P^. 10 and 20 of the distribution of Table III. 2-67. the fit must be regarded as quite as good as we could expect with such small frequencies. and the various shapes and other particulars of its sections that were made by horizontal planes" (ref. to a mathematician. we have for the ratio of the cross-products (frequency of Xj x^ multiplied by frequency of Xi. there is a frequency of 18'75.. suggested that the contour lines of a similar table for the inheritance of stature seemed to be closely represented by a series of concentric and similar ellipses (ref. but if the ellipse be compared carefully with the table. 3. 4). 2) the suggestion was confirmed when he handed the problem.x-^ has been taken of the same sign as x\ — x^. for father's stature = 68 in. x^. fair. working without a knowledge of the theory of normal correlation. In the case of the central contour. to which all the theorems of Chap. Xa. It will be seen that the fit of the two lower contours is. in abstract terms. 102). divided by frequency of Wj x^ multiplied by frequency of — : Xi x^j. reduced. results as a whole. x^. the fit looks very poor to the eye.. y = 20. and an increase in this much less than the standard error would bring the actual contour outside the ellipse. the points on these polygons having been obtained by simple graphical interpolation between diagonal interpolathe frequencies in each row and each column tion between the frequencies in a row and the frequencies in a column not being used. x[. For if we isolate the four compartments of the §§ 11-12 apply. son's stature = 71 in. the figures suggest that here again we have only to deal with the eiFects of fluctuations of sampling. («l-a^l)(4-r>)- Assuming that the exponent is x\ of the . 12. p. Mr J. V. of course. correlation-table common to the rows and columns centring round values of the variables x^.. Hamilton Dickson (ref. on the whole. It is perhaps of historical interest to note that Sir Francis Galton. Again.. one of the squares shown representing a square inch on the original. The actual contour lines for the same frequencies are shown by the irregular polygons superposed on the ellipses. Hence the association for . son's stature = 70 in.328 THEORY Of STATISTICS. especially considering the high standard errors. The normal distribution of frequency for two variables is an isotropic distribution. asking him to investigate "the Surface of Frequency of Error that would result from these data. D. same sign as rjj. For father's stature = 66 in. from the original drawing. and an increase of a single unit would give Taking the a point on the actual contour below the ellipse. there is a frequency of 19. 14). V. of Chap. and also if the contingency or correlation be small. the influence of casual irregularities due to fluctuations of sampling may render it difficult to say whether the distribution maybe regarded In such cases some further conas essentially isotropic or no. to see whether the distribution of frequency may be regarded as approximately isotropic.ssintervals are equal or unequal.x 2-fold form for the calculation of the correlation If the table is not reducible coefficient by the process referred to. ment. or the association zero. or some process of "smoothing" by averaging the . the association is therefore same sign the sign of j-jg for every tetrad of frequencies the compartments common to two rows and two columns . the process of calculating the coefficient of correlation on the assumption of normality is to Clearly. If only reducible to isotropic form by some rearrange§§ 9-10). to isotropic form by any rearrangement. the distribution is isotropic. and could not be rearranged so as to give a grouping of normally distributed frequency. that — — to say.x 2-fold form must always be the same whatever the axes of division chosen. if r^2 is zero. therefore. even if the table be isotropic it need not be be avoided. large or small.table be not large. densation of the table by grouping together adjacent rows and columns.XVI. If the frequencies in a contingency. and the coefficient of correlation is determined on this hypothesis by a rather lengthy procedure (ref. It follows that every groiiping of a normal distribution is isotropic whether the clp. 329 this group of four frequencies is also of the same sign as r-^. the ratio of the cross-products being unity. Table II. could not be regarded as normal. These theorems are of importance in the applications of the theory of normal correlation to the treatment of qualitative The characters which are subjected to a manifold classification. this rearrangement should be effected before grouping the table to 2. 13. Before applying this procedure it is well. but at least the test for isotropy affords a rapid and simple means for excluding certain distributions which are not even remotely normal.^. and the sign of the association for a normal distribution grouped down to 2. — — . contingency tables for such characters are sometinies regarded as groupings of a normal distribution of frequency. or reducible to isotropic form by some alteration in the order of rows and columns (Chap. of the in is In a normal distribution. — NORMAL CORRELATION. might possibly be regarded as a grouping of normally distributed frequency if rearranged as suggested in § 10 of the same chapter it would be worth the investigator's while to proceed further and compare the actual distribution with a fitted normal distribution but Table IV. V. normal. is obviously not strictly isotropic as it stands we have seen. that it appears to be normal. however. for instance. We can apply a rough test by regrouping the table in a much coarser form.. isotropic : Table I. — (condensed from Table III. of Chapter IX. The frequencies in adjacent compartments. Son's Stature (inches). . correlation-table for stature in father and son (Table III. may be of service.: 330 THEOKY OF STATISTICS. within the limits of fluctuations of sampling.). IX. the limits of rows and of columns having been so fixed as to include not less than 200 observations in each array. and it should consequently be within such limits. Chap.). say with four rows and four columns the table below exhibits such a grouping. that the deviations x-^.. (n-2)ntr„. x^.1.3. referring the student for proofs to the original memoirs.e. +"r Ol "^l . . (15) ^^i2. + ---+. n — . then. a great variety of ways. x^.or 4- or two other condensations : of the original table to 3- x 4-fold form he will probably find them either isotropic. . i. .n 0^1. The student should form one 3. Denoting the frequency of the combination of deviations Xj..23 .. . %i> ^312' 6tc. x^-^^i ^tc. a„ is normally distributed.i °^ with x^ and x^^.i (13) \ ' and Z' 12 •••» = /. viz. xt ^. ••••'"-" . -2 "3. § 3 of the present chapter). Before concluding this chapter we may note briefly some of the principal properties of the normal distribution of frequency for any number of variables..in-ZI~. if the uncorrelated deviations x^.12 . . — —NORMAL x — 331 CORRELATION. X2. —'i^{n-Vln. (n-1) . . . 7/ yi2 n =1/' y 12 » - ^-J*(«ii^ "n) • (12) where 0(^1^2. . — nO^ilS .23 . O^in . if in (13) any fixed . . "1. .. 14.. and so on. ---^ —— + ...1). „. ... . . T . oil 0^1.l (n-ll Several important results may be deduced directly from the form Clearly this might have been written in (13) for the exponent. n ..„_i.lZ.. Our assumption. .. or diverging so slightly from isotropy that an alteration of the frequencies. x^ . . . allotting any primary subscript to the second deviation (except the subscript of the first).. isotropic. we must have in the notation of Chapter XII. well within the margin of possible fluctuations of sampling. . .. x„ hj 2/12 .. ..o"wi • (14) The expression (13) for the exponent <^ may be reduced to a general form corresponding to that given for two variables. '^. eralising the deduction of § 6.. .13 ..12 .1.. just as in § 3 we arrived at precisely the same final form for the exponent whether we started with the two deviations Kj and aig. that any linear function of x^. . Further.1 ^»)=3+3^+3^+ . . in the general normal distribution for n variables every array It will also follow. be completely independent (c/... commencing with any deviation of the first order.3 "^n. are normally distributed amounts to the assumption that all deviations of any order and with any suffixes are normally distributed.— XVI.X . will render the distribution isotropic.12 0-„.. genof every order is a normal distribution. ix. 255.nZ . and then throw 1^ into the form of a perfect square (as in § 4 for the case of two variables). . The expressions r-^^ n are of course the partial regressions ..2s is . les : probabilit& des erreurs de Mimoires preseviis par divers Soc. to increasing values of x^.a3.." Acad. p. .^ "^2. renders the meaning of partial correlation coefficients much more definite in the case of normal correlation than in the general case. II« s^rie. strictly . : 0"l. . if we assign any given value to x^.n «.23. .il 2.2S • • • nn. whatever the particular The latter fixed valves assigned to the remaining deviations.1231 s^nd all the following deviations. as we have Similarly. . . the on expanding x^. 1886. (8) Galton.12. But this is a linear function of h^.. say. 1846.332 THEORY OF STATISTICS. normal correlation. . /tg.. (2) the correlation between the said deviations is r^^j. des Sciences savants. .13.. iTjjj all the following deviations. linear.. n/<'"2. . Ray.2 " .i and x^^: in the normal case r„.. assigned to x-^. therefore in the case of normal correlation the regression of any one variable on any or all of the others o"i. In the general case r„.^. REFERENCES. n2.. (2) Galton. . "^S. 42. etc. the correlation coeflScient being r^^-^..S4.i3 n) ^tc.„. we obtain a normal distribution for x-^ in which the mean is displaced by values be assigned to correlation between and x-^ and x^. (1) t/ie correlation between any two deviations x^^ and x^^. between a. using k to denote any group of secondary suffixes. the correlation between the associated values of x-^ and x^ is ri2. vol. conclusion.. is. . °'l. if actually worked out for the various sub-groups corresponding.3. it will be seen. we have to note that if. «S + . on reducing ajg ^2 to the second order we shall 'find that the correlation between Xj... would probably exhibit some continuous change. etc. is normal correlation . " Analyse inathimatique sur situation d'un point. n (n-1) .24. 1889. if any fixed values be seen. ...^i represents merely the average correlation. <''l. (15) for <^.3 in the general case ri2. p. and so on. "Family Likeness In Stature..." Proo. That is to say.i ^^^ %i '^ normal correlation. ..n .. so to speak. to all the deviations except x^. (1) Bbavais.n ..n. General. xl.^j is constant for all the subgroups corresponding to particular assigned values of the other variables.i "^n. say h^. Franois. to 3:4.. in the expression case might be. '*2 +„ n3. («-l) . Maomillan & Co. Thus in the case of three variables which are normally correlated. Ifaiural Inheritance . we assign fixed values. a. increasing or decreasing as the Finally.W . . A3.. Francis. " Archives of Psychology (New York). Various Methods and their Relation to Normal Correlation. vol. vol. 1904. 1907. {Cf. 1896. D. Karl. Trans." Cwmbridge Phil. Trans. p. p. (14) Pearson... p. Proe. Y.. (Methods based on correlation of ranks difference methods. 1. Boy." FMl. 1901. "On Further Methods of Determining Correlation.. (6) Pearson. p.. p. 1907. "On the Influence of Natural Selection on the Variability and Correlation of Organs.) (20) Spearman. " On the Application of the Thooi-y of Error to Cases of Normal Distribution and Normal Correlation.. 248. vol. vol. "Regression. p." 5r«7. (12) Sheppard.. Pearson. 89. ii.. vii. (On the fitting of " principal axes" and the corresponding planes in the case of more (8) Edgewokth. Karl. "On the Generalised Probable Error in Multiple Normal Correlation. Soe. . and Alice I. Stat. (13) Sheppard. cc. 1909. 1886. Kabl. 1900.ee. (10) (11) 1897. (The suggestion of a " rank " method see Pearson's criticism and improved formula in (18) and Spearman's reply on some points in (20). U. E. Series A. Karl. (2). ii. vol. "Correlation calculated from Faulty Data. 182. 96. Series A. S (16) Pearson. vol. (4) (6) —NORMAL COERELATION.. " On the Theory of Contingency and its Relation to Association and Normal Correlation. Series A. Dulau & Co. of Psychology. Pearson. Soy. Heredity. Yule. "A Footrule for Measuring Correlation." Biometrilca. . vol. Appendix to F. "On the Calculation of the Double-integral expressing Normal Correlation. Trans. Kakl.. C. 1907. Kakl. Series A. vol. Soc. xl.. vol. etc. XVI.) ) . vol. C." Jour. 333 Dickson. vi.. 1900.. F. " On a New Method of Determining Correlation between a Measured Character A and a Character S. Thorndike. Soy. L. vii." Phil. (Based on the assumption of normal correlation. treated by a Applications to the Theory of Attributes. p. G. Yule. p. (7) Pearson. Hamilton. F. W. (15) Pearson.. : : of Psychology. 5th Series. 59." Biometrika. ) Dulau & Co. "On a New Method of Determining Correlation. London." Proc Soy. 271.. cxov. p.. 3 of Chap. "On the Theory of Correlation. vol. Karl. Biometric Series I. Ixxix. p. p. p. Correlated Averages." Phil.. "On the Correlation of Characters not Quantitatively Measurable... Trans.'' 5r»<. Java: (17) . Soc. 101.. p. J. Soc. p.. of which only the Percentage of Cases wherein exceeds (or falls short of) a given Intensity is recorded for each grade of A. Mag. Trans. 1908.) 1902." Phil. Ix. (21) iii. and Panmixia. 190. Kael. 812." (18) Drapers' Company Besearch MemA)irs. Karl. vol.1906. Pearson. U. Biometric Series IV.. xxxiv. 1898. 559. 1910. W. vol. Say." Drapers' Company Besearch Memoirs. "On (9) than two variables. G. 1910. Soc. 1892. p. "On Lines and Planes of Closest Fit to Systems of Points in Space." PAiZ. Soc. Soc. "Empirical Studies in the Theory of Measurement. when one Variable is given by Alternative and the other by Multiple Categories.. "On the Theory of Correlation for any number of Variables New System of Notation. Series A. 1. (19) Spearman. Pearson. Jour. criticism in ref. xix. vol. Soy." Biometrika. 23.. clxxxvii. Mag. 253. 63. See also the memoir (12) by Sheppard. cxcii." PM7. III. 6th Series.. London. Soy. vol. there must be some pair of axes at right angles for which the correlation is a maximum. at the medians. The coefficient of correlation with reference to the principal axes being zero.— 334 THEORY OF STATISTICS. the 1. and with reference to other axes something. Show that these axes make an angle of 45° with the principal axes.) 2.) A fourfold table is formed from a normal correlation table. Hence show that if the pairs of observed values of a^ and Xj are represented by points on a plane. (Slieppard..-„. .e. 10. EXERCISES. so that (^)=(o) (-») (3)=N/2. Deduce equation (11) from the equations for transformation of co-ordinates without assuming the normal distribution. (A proof will be found in ref. and a straight line drawn through the mean.). sum 4.(-«-. Show that = = A B . of the squares of the distances of the points from this line is a minimum the line is the major principal axis. is numerically greatest without regard to sign. taking the points of division between and a. ref 12. and that the maximum value of the correlation is if . and 5. i. 3. and so on until we have drawn n cards (a number small compared with the whole number in the bag). correlation-ratio and criterion for linearity of regression 16. correlation coefficient and regression. Standard error of the arithmetic mean 11. and limitations of interpretation 10. and our problem is to determine the standard-deviation that each such measure will exhibit. In Chapters XIII. 1-2. Kelative stability of mean and median in sampling 12. We 335 . now proceed to consider some of the simpler theorems for the case of variables (c/. THE SIMPLEE CASES OP SAMPLING FOR VAEIABLES PERCENTILES AND MEAN.. Correlation between errors in two percentiles of the same distribution 8. draw another. coefficient of variation. Suppose that we have a bag containing a practically infinite number of tickets or cards bearing the recorded values of some variable X. Chap. No one of these measures will prove to be absolutely the same for every sample. 2. Effect of removing the restrictions of simple sampling. The problem 3. XIII. standard-deviation. Special values for the percentiles of a normal distribution 5. of sampling for variables the conditions assumed Standard error of a percentile i. Simplified formula for the case of a grouped frequencydistribution 7. Let us continue this process until we have if such samples of n cards each.— — : CHAPTER XVII. and then work out the mean. Effect of the form of the distribution generally 6. we have been concerned solely with • the theory of sampling for the case of attributes and the frequencydistributions appropriate to that case. we must be careful to define precisely . Standard error of the interquartile range for the normal curve 9. median. etc. Statement of the standard errors of . Standard error of the difference between two means 13.-XVI. — 1. for each of the samples. so as to These conditions realise the limitations of any solution obtained. § 2). The tendency to normality of a distribution of means^ll. and that we draw a ticket from this bag. — — — — — — — — — — standard-deviation. In solving this problem.the conditions which are assumed to subsist. Restatement of the limitations of interpretation if the sample be small. Effect of removing the restrictions of simple sampling 15. note the value that it bears. (b) and (c) to denote the corresponding. Chap. This assumption is unnecessary.). 4. the last paragraph in § 8. It is for this reason that we spoke of the record. XIII. we must not make it a practice to take the first card from bundle number 1. it should be noticed. assumptions indicated (a) by the same letters in the section cited. or else the chance of drawing a card with a given value of X. at each drawing. is uncorrelated with the value of recorded on card 2. and so on. or a value within assigned limits. is the same at each sampling. if our card-record is contained in a series of bundles. the bag indefinitely large. the average of the following cards will be lowerthan the mean of all cards i. the interpretation of our results becomes simpler and more straightforward. using the letters (a). as before. XIII. may not be the same for each individual card at each drawing.— 336 THEORY OF STATISTICS. and so on. and the discussion in g§ 4-8. as containing a practically infinite number of cards. assume that we are drawing from precisely the same record throughout the experiment. If it be true. 3. bearing the numbers 1 to 10. for otherwise the successive drawings at each sampling would not be independent if the bag contain ten tickets only. the second card from bundle number 2. and we draw the card bearing 1. XIV. and we would refer the student to the discussion then given.e. speak of as simple sampling. as already pointed out for the : We X X : case of attributes (Chap. so that the value of recorded on card 1. XIII. on the other hand. § 8). but that each of our cards at each drawing may be regarded quite strictly as drawn from the same record (or from identically similar records) e. so that the chance of drawing a card with any given value of X. do not. we can. we draw the 10.ff. if. there will be a negative correlation between the number on the card taken at any one drawing and the card taken Without making the number of cards in at any other drawing. the average of the following cards drawn will be higher than the mean of all cards drawn . (b) We assume not only that we are drawing from the same record throughout. or a value within any assigned limits. eliminate this correlation by replacing each card before drawing the next. (c) We assume that the drawing of each card is entirely independent of that of every other. Here it is sufficient to state the assumptions briefly. were discussed very fully for the case of attributes (Chap. i. in § 1. Sampling conducted under these conditions we shall. make the further assumption that the sample is unbiassed. Chap. that the chance of inclusion in the sample is independent of the We recorded on the card (cf. § 3).e.." "the form of the frequency-distribution value of X X . for we can substitute for such phrases as " the standard-deviation of in a very large sample. As has already been emphasised in the passages to which reference is made above. 337 in large sample. for the sample. Standard Error of a Percentile. Let us consider first the fluctuations of sampling for a given percentile. e Therefore for the standard-deviation of corresponding to a proj^ortion p we have a/pq /pq yJ\ n or of the percentile 'p (1) 4.8. perhaps the majority of. A table calculated by Mr Sheppard (Table III. say p + h. 3." : relation tion of the record between the distribution of the sample and the distribufrom which it is drawn. But this ratio is quite simply determinable if the number of observations in the sample is sufficiently large to justify us in assuming that 8 is small so small that we may regard the element of the frequency curve — (for a very large sample) over'which X^ + € ranges as approximately a rectangle. If the frequency-distribution for the very large sample be a normal curve. in Tables for Statisticians and Biomet- 22 . 9. the values of %. and we denote the standard-deviation of in a very large sample by cr. as the problem is intimately related to that of Chaps. and the ordinate of the frequency curve at Xp when drawn with unit area and unit standard-deviation by y^. we know that these observed values wi U ten d to centre round p as mean. as well as observing the proportion of X's above X^.XVII." the phrases " the standard-deviation of " the form of the frequency-distribution in the origi/nal record " but in very many.. no examination of samples drawn under the same conditions can give any evidence on this head. —SIMPLER OASES OF SAMPLING JOE VARIABLES.-XIV. If we note the proportions of observations above Xp in samples of n drawn from the record. for the principal percentiles may be taken from the published tables. the standarddeviation of c will bear to the standard-deviation of 8 the same ratio that e on an average bears to 8. practical cases the very question at issue is the nature of the a very X in the original record. If this assumption be made. XIII. p. If now at each drawing. Let Xp be a value of such that of the values of in an indefinitely large sample drawn under the same conditions lie above it and qN below it. X e = ^. with a standard-deviation — X pN X ijpqjn. we also proceed to note the adjustment e required in Xp to make the proportion of observations above Xp + £ in the sample pn. which the student Standard error is (r/Vn multiplied by may sometimes Probable error find useful : is ir/Vn. and these have been utilised for the following student can estimate the values roughly by area and ordinate tables for the normal curve given in Chapter XV. Quartiles . for example a U-shaped distribution like that of fig. . or : THEORY OF STATISTICS. . be an undesirable form of average to employ. . so as to exhibit a value of y^ large compared with the standard-deviation. 19. . XV. in Appendix directly. 1-25331 1-26804 1-31800 1-42877 1-70942 1-36263 0-84535 0-85528 0-88897 0-96369 1-15298 0-91908 It will be seen that the influence of fluctuations of sampling on the several percentiles increases as we depart from the median the standard error of the quartiles is nearly one-tenth greater than that of the median. ref. On the other hand. and it will. Hence for a distribution in which y^ is small. 16. Table IV. multiplied by Median . § 17). as this is an important form of average.— 338 ricians. in so far. . etc. deciles. . Consider further the influence of the form of the frequencydistribution on the standard error of the median.. and 9 . the standard error of the median will be relatively low. . . and the standard error of the first or ninth deciles more than one-third greater. the standard error of the median will be relatively high. 18 or fig. We can create such a . and the values given in the second column for their probable errors (Chap. remembering to divide the ordinates given in that table by ^2:7 so as to make the area unityValue of « gives the values the a combined use of the I.. . 5.) : Median . Deciles 4 and 6 3 and 7 „ 2 and 8 „ 1 . . Deciles 4 and 6 3 and 7 „ 2 and 8 „ 1 and 9 „ Quartiles 0-3989423 0-3863425 0-3476926 0-2799619 0-1754983 0-3177766 Inserting these values of yp in equation (1).. we have the following values for the standard errors of the median. in the case of a distribution which has a high peak in the centre. For a distribution with a given number of observations and a given standard-deviation the standard error varies inversely as y^. having the same area. — SIMPLER CASES OF SAMPLING FOR VARIABLES.0-1 2-v/27r. being merely the reciprocal of . let us find for what ratio of the standard-deviations of two such curves. and let there be the standard-deviations of the two distributions. of the median will be less than crj^n. that is if 2sjTrp or p4 + 2/)3 + (2 . .. the value of i/p is I 2x^. in round numbers. be m/2 observations in each. 0-2 a- is of course the standard-deviation of the com- pound Let distribution. Then . standard-deviation a-Un. This equation may be reduced to a quadratic and solved by taking p + —as a new variable. if the curve is.47r)p2 + 2p -1- 1 = 0.(rJ V is 2 • • W Hence the standard error of the median (c) is equal to o-/\/m if (o-i -I- 0-2) Jorl 4. where o-j.. . The distribution . . or 0-4472 The the other. . The roots found give p = 2-2360 . To give some idea of the reduction in the standard error of . the one root standard error of the median will therefore be . in such a compound distribution. the standard error of the median reduces to "'/Jn. . the standard error that of the other. 339 " peaked " distribution by superposing a normal curve with a small standard-deviation on a normal curve with the same mean and a relatively large standard-deviation..g-j _ 2 vircTiO-J Writing (r2/<Ti = p.. .— XVII. about 2J times of the one normal If the ratio be greater.the median that may be effected by a moderate change in the form of the distribution. (a) On the other hand. which the standard error of the median is exactly equal to tTJJn is shown in fig. at a hasty glance it might almost be taken as normals In the case of distributions of a form more or less similar to that shown. for we can Let fj. class-interval at the given percentile— simple interpolation will give us the value with quite sufficient accuracy for practical purposes. Let <T be the value of the standard-deviation expressed in classintervals. 53. be the frequency-pereliminate <r from equation (1). it is evident that we cannot at all safely estimate by eye alone the relative standard error of : the median as compared with c/vm.340 for THEORY OF STATISTICS. we must have . the standard error of any percentile can be calculated very readily indeed. and if the figures run irregularly they may be smoothed. 53 it will be seen that it is by no means a very striking form of distribution . In the case of a grouped frequency-distribution. and let n be the number of observations as before. if the number of observations is sufiBcient to make the class-frequencies run fairly smoothly. i. to enable us to regard the distribution Fig.e. as nearly that of a very large sample. Then since y^ is the ordinate of the frequency-distribution when drawn with unit standard-deviation and unit area. 6. these standard errors are the same. Let us find the standard error of the first and ninth deciles as another illustration. Using the direct method. and. giving standard errors of 0-0471 and 0-0488. ^8585 1329 = 00349. . multiplying by the : given in the table in § 4. and equal to 0-027737x1-70942 = 0-0474. VII. Using the direct method of equation (2). we find by simple interpolation the approximate frequencies per interval at the first and ninth deciles respectively to be 590 and 570. The student should notice that the class -interval is. the standard error is factor 1'253 . we find the median to be 67-47 (Chap.). and this gives for the standard error of the median by (2) (the number of observations being 632) 0-1309 intervals. and VIII. in this case. take the distribution of stature used as an illustration in Chaps. (^) Jp As an example in which we can compare the results given by the two different formulse (1) and (2). i. and consequently the answer given by equation (2) does not require to be multiplied by the magnitude of the interval. that is 0-0655 per cent. and in The number of observations is 8585.. § 15).XVII. §§ 13. and the standard-deviation 2*57 in. VII.. with sufficient accuracy for our present purpose. mean 00479... In the case of the distribution of pauperism (Chap. which is very nearly at the centre of the interval with a frequency 1329. the value is practically the same as that obtained from the value of the standard-deviation on the assumption of normality. XV. on the assumption of normality of the distribution. 14 of Chap. . this gives 0-0348 as the standard error of the median. Taking this as being. 7. 341 But this gives at once for the standard error es^essed in terms of the class-interval as unit _ Jnpq %=— — 3 . On the assumption that the distribution is normal. In finding the standard error of the difference between two Example . VII. the fact that the class-interval is not a unit must be remembered. The frequency at the median (3-195 per cent. As we should expect.. —SIMPLER CASES OF SAMPLING FOR VARIABLES.) is approximately 96. identical with the unit of measurement. slightly in excess of that found on the assumption that the frequency is given by the normal curve. . the frequency per interval at the median. the distribution being approximately normal cr/^ra = 0-027737. These two percentiles divide the whole area of the frequency curve into three parts. and p^ become sensibly equal to one another. q^ =Pi = |. then r be the correlation between errors in we have gij and p^. and the correlation becomes unity. the first-named being the lower of the two percentiles. for which the values of p and q are p^ q-^.-j. and consequently correlation between errors in two percentiles If the _ \=^sJ~. = J. producing an error 8j in g-j. the correlation between errors in the two percentiles will be the same as the correlation between errors in g-j and p^ but of opposite sign. and the errors in the second percentile are directly proportional but of opposite sign to the errors in p^. q-^ and q^. V:tiPi The correlation Ml between the percentiles : tude but opposite in sign it is is the same in magniobviously positive. ej If tj and their respective standard errors. we range for the normal curve. Further. ^1 Pi Or.g'l -P2. V q^^ J . . p^ q^ respectively. since the errors in the first percentile are directly proportional to the errors in q-^. But if there be a deficiency of observations below the lower percentile. as we should expect. and p^. Let us apply the above value of the correlation between percentiles to find the standard error of the semi-interquartile Inserting q-^ =P2 =. or difference. 1 . Consider the two percentiles. Hence the standard error of the interquartile range applying the ordinary formula for the standard-deviation of a 2/J3 times the standard error of either quartile. IPii\ (3) J p-^ two percentiles approach very close together.342 THEORY OF STATISTICS. and will therefore tend to produce an error 82=-^' -Si Pi in p^. the missing observations will tend to be spread over the two other sections of the curve in proportion to their respective areas. inserting the values of the standard errors. percentiles in the same distribution. 8. find r is. the areas of which are proportional to q-^. the student must be careful to note that the errors in two such percentiles are not independent. We need not. 2) with reasonable accuracy. —SIMPLER CASES OF SAMPLING. Finally. we discussion of these points in §§ 4-8 of Chap. (4) normal distribution interquartile. might or might not be due to fluctuations of simple sampling alone. may note that the standard error of a percentile cannot be evaluated unless the number of observations is fairly large large enough to determine f^ (eqn. case. for instance.XVII. 343 the standard error of the «e»ii-iiiterquartile range 1/^/3 times the standard error of a quartile. XIV. § 2. the formulae of the preceding sections cease. of course. standard error of the semiinterquartile range in a ] (T > j = 0'78672—7= . enter again into a discussion of the efiect of removing the several restrictions. therefore. § 3). equation (1)). we have. the student may be reminded that the standard error of any percentile measures solely the fluctuations that may be expected in that percentile owing to the errors of simple sampling alone it has no bearing. FOE VARIABLES. finally. for the efieot on the standard error of p was considered in detail in §§ 9-14 Of Chap. or semirange can readily be worked out in any particular : using equation (2) and the value of the correlation given above it is best to work out such standard errors from first principles. Further. or : — : — . applying the usual formula for the standard deviation of the difference of two correlated variables (Chap. Of course the standard-deviation of the inter-quartile. It cannot and does not give any indication of the possibility of the sample being biassed or unrepresentative of the material from which it has been drawn. to hold good. save on the one question. the standard error almost certainly gives quite a misleading idea as to the accuracy attained iu determining the average stature for the United Kingdom the sample is not representative. 9.. nor can it give any indication of the magnitude or influence of definite errors of observation errors which may conceivably be of greater imIn the case of the distribution portance than errors of sampling. the several parts of the kingdom not contributing The student should refer again to the in their true proportions. Taking the value of the standard error of a quartile from the table in § 4. of statures. however. whether an observed divergence of the percentile. . from a certain value that might be expected to be yielded by a more extended series of observations or that had actually been observed in some other series. XIV. and the standard error of any percentile is directly proportional to the standard error of ^ (ef. XI. If there is any failure of the conditions of simple sampling. is observations numbe r of su ccesses in : agreeing with equation (5).) 10. if o. Laplace.344 THEORY OF STATISTICS.ir. drawn under the same conditions. The standard-deviation of the sum of the values recorded on the n cards is therefore tjn. Let us now pass to a fresh problem. 27 the preceding sections have been based on the work of : Edgeworth and Sheppard. Further. 11.be known. 6.Sheppard. Edgeworth. Standard Error of the Arithmetic Mean. The distribution being very approximately normal.pq: the standard-deviation of the total n samples of m observations each is therefore Jnm.. The standard-deviation of the values on each separate card will tend in the long run to be the same. For the distribution of statures used as an illustration in § 6 the standard error of the median was found to be 0-0349 the standard error of the mean is only 0'0277.of a. ''"•^^^ (5) This is a most important and frequently cited formula.. also § 16 below). cf. It is therefore of perfectly general application. the ratio of : . We can verify it against our formula for the standard-deviation of sampling in the case of attributes. and rath card of our sample. Suppose we note separately at each drawing the value recorded on the first. and the student should note that it has been obtained without any reference to the size of the sample or'to the form of the frequenoydistrfbution.. second. refs. (standard error of the median). 7. This is very readily obtained. ref. and the standard-deviation of the mean of the sample is consequently — 1/mth of this . in an indefinitely large sample. to test whether we may treat the distribution as approximately normal (c/. The standard-deviation of the number of successes in a sample of m Jm. ref. or. viz. the value recorded on each card is (as we assume) uncorrelated with that on every other.pq dividing by n we have the standard-deviation of the mean number of successes in the n samples. Supplement II. and in general the standard errors of the two stand in a somewhat similar ratio for a distribution not differing largely from the normal form. 5. and determine the standard error of the arithmetic mean. and. Jmpq Ijn. For a normal curve the standard error of the mean is to the standard error of the median approximately as 100 to 125 (cf. (As regards the theory of sampling for the median and percentiles generally. and identical with the standard-deviation o. third . § 4). 15. the standard-deviations of the two samples themselves differ more than can be accounted for on the basis of fluctuations of sampling alone (see below. or in which the frequency-distribution assumed a form resembling fig. § 15). If. than is the mean. in a practical case. in these to characterise some forms of experimental error.XTII. 53. and it would seem natural to take as this value the standard-deviation in the two samples thrown together. also used as an illustration in § 6. from that of errors of sampling.. the value of a. evidently e-^^. we must substitute an observed value. 12. =<4) . the standard error of the median was found to be 0'0655 per cent. If two quite independent samples of TOj and OTj observations respectively be drawn from a record.is not known of sampling. be affected considerably by small groups of widely outlying observations. which bears to the standard error of the median a ratio of 1 to 1"33. As such cases as these seem on the whole to be the more common and typical. the standard error of the difference of their means is given by the two standard errors. a priori. At the same time we also indicated the exceptional cases in which the median might be the more stable cases in which the mean might. Fig. 53 represents a distribution in which the standard errors of the mean and of the median are the same. the greater stability of the median is sufficiently marked to outweigh its disadvantages in other respects. we stated in Chap. § 18 that the mean is in general less affected than the median by errors of sampling. for example. In the case of the asymmetrical distribution of rates of pauperism. a point quite distinct zero. Further. but even more exaggerated as regards the height of the central " peak " and the relative length of the "tails. we evidently cannot assume that both samples have been drawn from the same record the one sample must have been drawn from a record or a universe exhibiting a greater standard-deviation : . the median may be the better form of average to use.. — SIMPLER CASES OF SAMPLING FOR VARIABLES. cases.(e) If an observed difference exceed three times the value of cjj given by this formula it can hardly be ascribed to fluctuations If. the average of which does not tend to be this is." Such distributions are not uncommon in some economic statistics. retical — — €?. assumes almost exactly the theomagnitude.. of course. in some experimental cases it is conceivable that the median may be less affected by definite experimental errors. and they might be expected If. . VII. however. The standard error of the mean is only 0*0493 per cent. 1'26. 345 viz. Chap.. viz. we (For a complete treatment of this problem in the case of samples drawn from two difierent universes 13. XIII. the point here . the values in the sample each divided by n... for errors in the mean of the one sample are correlated with errors in the mean of the two together. case find that this correlation is ijnjin^+n^. (7) indeed. If two samples be drawn quite independently from different universes.. than the other. the formula usually employed for testing the of the difference between two means in any case seeing that the standard error of the mean depends on the standard-deviation only. 22. the use of (6) or (7) is not justified. Chap. but if the student will refer to § 13. of the distribution. we can inquire whether the two universes from which samples have been drawn difier in mean apa/rt from any difference in significance dispersion. The distribution of means c/. We can only illustrate. but instead of comparing the mean of the one with the mean of the other we compare the mean «ij of the first with the mean m^ of both samples together. Following precisely the lines of the similar problem in § 13. ref. the distribution of means (and therefore also of the difierenoes between means) tends to become not merely symmetrical but normal. the standard error of the diiFerence of their means will be given by 4. that the distribution tends to be normal whenever the variable may be regarded as the sum (or some slightly more complex function) of a number of other variables. indefinitely large samples from which exhibit the standard-deviations cr. and hence III. XV. not prove. As an illustration of the approach to symmetry even for small values of . and we should expect the distribution to be the more nearly normal the larger n. and the symmetry will be the greater the greater the number of observations in the sample.=^+-^ This is.. If two quite independent samples be drawn from the same universe. and 0-2.. he will see that the genesis of the normal curve in this case is in accordance with what we then stated. In the present instance this condition is strictly fulThe mean of the sample of n observations is the sum of filled.: 346 THEORY OF STATISTICS.) samples drawn under the conditions of simple sampling will always be more symmetrical than the distribution of the original record. Further. . and not on the mean. 14. XVI. that the approach to normality. The warning is necessary. will arise from fluctuations of simple sampling alone. while it is appreciably asymmetrical. —SIMPLER CASES OF SAMPLING FOR VARIABLES. ref. ref. if our sample is large. distribution of If the original distribution be normal. But now find the distribution for the mean number of successes in groups of five throws.XVII. 30). 32 and the record of sampling there cited). is also given in Chap. is strictly normal. once from the fact that any linear function of normally distributed The variables is itself normally distributed (Chap. the frequency with which any given deviation from a theoretical value or a value observed in some other series. If the observations are not independent. Let us consider briefly the effect on the standard error of the mean if the conditions of simple sampling as laid down in § 2 cease to apply. 4 cases of 8 successes and 1 case of 9 successes. given as illustrations of the forms of binomial distributions in Chap. § 6). but first draw a series of samples from one record. p = 0'l. however. If the student will turn to the calculated binomials. the This follows at means. But the distribution of the number of successes for 100 events when g' = 0'9. § 3.000 throws. then another series from another record with a somewhat difierent mean and : — . he will find there the distribution of the number of successes for twenty events when g' = 0'9. even a fairly large sample may continue to reflect any asymmetry existing in the original distribution (c/. This will be equivalent to finding the' distribution of the number of successes for 100 such events. and thence tailing off to 20 cases of 7 successes in 10. is only rapid if the condition that the several drawings for each sample shall be independent is strictly fulfilled. however. XV. 347 we may take the following case. and it will be seen that. but are to some extent positively correlated with each other. and we may calculate. § 3. very greatly in symmetry thrtugh only five observations have been taken to the sample. be normal if the deviation of the mean of each sample is expressed in terms of the standard-deviation of that sample (c/. rising to high frequencies for 1 and 2 successes. starting at zero. but only to its scale. (a) If we do not draw from the same record all the time. the divergence from symmetry is comparatively small: the distribution has gained. under the same conditions. in an observed mean. p = 0'l the distribution is extremely skew. XV. even of small samples. that the distribution of means is approximately a normal distribution. of n. on that assumption. distribution will not in general. We may therefore reasonably assume. and then dividing the observed number of successes by five the last process making no difierence to the form of the distribution. we take the statures of samples of n men in a number of different districts of England. or if we draw the successive samples from essentially different parts of the same record. record the standard error of the mean will be a-J^n. If the total number of samples. : Kai. we are drawing from the same record throughout. but' the distribution will centre round a value differing by d-^ from the mean for all the records together and so on for the samples drawn from the other records. and the standard-deviation of all the statures observed is o-q. But the standard-deviation by o-q for all the records together is given iVr. if our samples are drawn from different records or from essentially difierent parts of the entire record. and the mean differs by together (as ascertained by d-i from the mean of all the records large samples in numbers proportionate to those now taken) . the second card from another part. and so on. and so on. (6) If . • • • (9) This equation corresponds precisely to equation (2) of § 9.cr'). and so on.(T^=2(Aa^)-t-2(&f). and the mean differs by d^ from the mean of all the records Then for the samples drawn from the first together. For suppose we draw first record. Hence. in large samples drawn from the subsidiary parts of the record from which the several cards are taken. but always draw the first card from one part of that record. For if. the standard error of the mean will be decreased. may be increased indefinitely as compared with the value it would have in the case of simple sampling. cr. for example. the standardo-„. the standard-deviation of the means for the different districts will not be a-jjn. but will have some greater value. Chap. writing 2(Acf) = if. . Hence. dependent on the real variation in mean stature from district to district. THEORY OF STATISTICS.. XIV. . the standard error will be greatly increased. If. k^ samples from the" second record.. . if 0-^ be the standard error of the mean. . The standard error of the mean.T + -^^"' .sJ„ '^"' = .348 standard-deviation. and these parts differ more or less. and the means differ by d^. d^. deviations are o-j. for which the standard-deviation yfcj samples from the (in an indefinitely large sample) is o-j. = ^{k~) + ^k. for which the standard-deviation is cTj. in fact. The result is perhaps of some practical interest. for 349 . .... the magnitude of the variable recorded on one card drawn is no longer independent of the magnitude recorded on greater risk of biassed error.. and then proceeded to form a set of samples by taking one man from each district for the first sample. o-Q = «„. the means of all the samples so compounded would be identical in such a case. shaking them up in a bag and taking out cards blindfold. There may. d„ from the mean a large sample from the entire record. be a a-jijn. To give another illustration. It shows that. different districts of which exhibit markedly different means for the variable under consideration. one man from each district for the second sample. district : X The conclusions seem in accord with common-sense.g. eqn. by any process of selecting the districts from which samples shall be taken by chance. suppose that. —SIMPLER CASKS OF SAMPLING FOR VARIABLES.. or using some equivalent process.— XVII. and take a contribution to the sample from each. if we are actually taking samples from a large area. 4). . XIV. if we break up the whole area into n sub-districts. we had measured the statures of men in each of n different districts. we have n Hence ^ ' n ' = ^-^ The . (10) last equation again corresponds precisely with that given for the same departure from the rules of simple sampling in the case of attributes (Chap. . if the cards from which we were drawing samples had been arranged in order of the magnitude of recorded on each. for the same number of observations. . § 11. (c) Finally. each as homogeneous as possible. the standard-deviation of the means of the samples so formed would be appreciably less than the standard error of simple sampling As a limiting case. to vary our previous illustration. and are limited to a sample of n observations . we would get a much more stable sample by drawing one card from each successive reth part of the record than by taking the sample according to our previous rules e. however. we will obtain a more stable mean by this orderly procedure than will be given. while our conditions (a) and (h) of § 2 hold good. and consequently <Tm = 0. If. it is evident that if the men in each were all of precisely the same stature. and so on. r is the . and so on. . that if the first card drawn at bears a high value. the case discussed under (6) is covered by the case of negative correlation. and (c) distinct. There are to(w-1)/2 correlations. if rj2 denote the correlation between the first and second cards. and if. or. If this correlation be positive. in addition to the mean itself. and for a given value of r the increase will be the greater. (11) As the means and standard-deviations of x-^. the others must on an average be drawn from parts containing relatively high values. Equation (11) corresponds precisely to equation (6). the next and following cards sample are likely to bear high values also. in other words. since a positive or negative correlation may arise for reasons quite different from those considered under : (a) and (6). to keep the cases (a). the standard error will be diminished. our reasons for not proceeding further with the discussion we must remind the student that in order to express the standard error of the mean we require to know. (6). the standard error of the mean will be increased.350 THEORY OF STATISTICS. As was pointed out in that chapter. of Chap.»-l. .'s will on the average be negative if some one card be always drawn from a part of the record containing low values of the variable. With this discussion of the standard error of the arithmetic mean we must bring the present work to a 6lose. such a positive correlation is at once introduced. Under stances.] . x^ x„ are all identical. e. It is as well. any sampling of the same these circumvalues on the another card. XIV. although the drawings of the several cards at each sampling are quite independent of one another. for if each card is always drawn from a separate and distinct part of the record. If r be negative. we may write therefore. To indicate 15. briefly of standard errors. Similarly. r may more simply be regarded as the correlation coefficient for a table formed by taking all possible pairs of the n values in every sample. on the other hand. the correlation between any two a. § 13.g. the case when r is positive covers the case discussed under (a) for if we draw successive : samples from different records.^»=^[l+r(. . the mean (deviation)^ with respect to the mean. the greater the size of the samples. arithmetic mean of them all. however. the standard-deviation about the mean. standard error of the standarddeviation in a distribution of / | >=. the mean (deviation)* with respect to the mean. values considerably greater twice as great or more than (12). therefore. . ref. in the general case. on reference deviations of any order (ref. Standard-deviation. 351 Similarly. but of standardIt will be noticed. again. measured from the mean and /ij the mean (deviation) ^ or the square of the standard-deviation n is assumed sufficiently large to make the errors in the standard-deviation small compared with Equation (13) may in some cases give that quantity itself. we must find this quantity for the given distribution and this would entail entering on a field of work which hitherto we have intentionally avoided or we must. XII. Either.— — XVII. — If the distribution be normal. If. (13) ^1^2-^ where /i^ is the mean (deviation)* deviations being./ ^^ ^ —^-j any form j ^ . if that be possible. 33). by no means exact the general expression is : : it is. standard error of the standard-deviation in a normal distribution ) ^ >= —y= . 17. to express the standard error of the standard-deviation we require to know. we have standard error of the coefficient of variation :]''wA'<m)r (»' . with a simple statement of the standard errors of some of the more important — — constants. as a fact. of course. equation (Gf. without the use of the differential and integral calculus. —SIMPLER OASES OF SAMPLING FOR VARIABLES.. that the standard error of the standardis less than that of the semi-interquartile range for a normal distribution. (12) ) ^. This is generally given as the standard error in all cases however. assume the distribution to be of such a form that we can express the mean (deviation)* in terms of the mean (deviation)^. To deal with the standard error of the correlation coefficient would take us still further afield. for the normal distribution. — — : — — deviation to equation (4) above.) (12) gives the standard error not merely of standard-deviations of order zero. § 8. to use the terminplogy of Chap. but the proof would again take us rather beyond the limits that we have set ourselves. the distribution be normal. Thfs can be done. however. We must content ourselves. then. For a normal distribution. and the proof would be laborious and difficult. if not impossible. Professor Pearson's original memoir on the correlation-ratio. 28. If the distribution be normal. total or partial (ref. ' ) J^ ' ' ^ As was pointed out ref.). 33). o-g^ y/n. in Chap. -7^ ^ / ^ V(l-^2)2_(l -r«)2-|-l . refs. for a normal distribution. (15) '^™ the use o'f a more general formula is the value always given which would entail the use of higher moments does not appear As regards the case of small samples. and 31.. 1). XI. that is to say. § 21. § 10). see ref. standard error of the correlation-coefficient for a fourfold table (Chap. standard error of f = 2 (18) For rough work the value of the second square root may be taken as nearly unity. The expression — standard error of the corj relation coefficient for > a normal distribution ) =l _ j-2 t=^ . in the bracket is usually very nearly unity. For the of a coefficient of any order. „. k denoting any collection of secondary subscripts other than 1 or 2. X. it may be taken as given sufficiently closely by the above expression for the standard error of the correlation coefficient. however. 10. to have been attempted... 352 . 34 the formula (15) does not apply. Correlation coefficient. the value of ^ test for linearity of regression. X. = -r^ is a Very approximately (Blakeman.—— THEORY OE STATISTICS. Equation (15) gives the standard error cf.4 for a normal distribution ) ) ""J-a* h. The general expression for the standard error of the correlation-ratio is a somewhat complex expression (cf. If the distribution be normal.. and in that case may be neglected. ref. in terms of our general notation. This : — : standard error of the ooefficient of regression \ /. and we have then the simple expression.e. _ „ \^ > =^1-^1 ""2 ) for a normal distribution t=^~ fg'^'"rv'* v» (^^) This formula again applies to a regression coefficient of any order. standard error of ^ roughly = 2/- . — standard error of correlationratio approximately 1 _I "" -if . : standard error of 612. total or partial i. In general. 18. Correlation ratio. Chap. (19) . Coefflfiient of regression. and repeated in § 9 above. § 3). If the sample be large.. XIV. it will be noted that the values of fp ^^ (2). and then replacing o. in (1).and yj. and of a. nor as to the magnitude of errors of observation. if n be small. the standard error ceases to measure with reasonable accuracy the standard-deviation of true values of the constant round the observed value (Chap. Chap. 23 . in conclusion.: XVII.in the expression for the standard error of the mean by cr ± three times its standard error so obtained. —SIMPLER CASES OF SAMPLING . of and (5).. 353 To convert any standard error to the probable error multiply by the constant 0-674489 . Following the procedure suggested in Chap.in (4) given for these constants o. i. the rule that a range of three times the standard error includes the majority of the fluctuations of simple sampling of either sign does not strictly apply. Consequently. i. by first working out the standard error of <r on the assumption that the values for the necessary moments are correct. But this is only the case in dealing with the problems of artificial chance in practical cases we have to use the values given us by the sample itself. as to the use of standard errors when the number of observations in the sample is small. that a standard error can give no evidence as to the biassed or representative character of a sample. XIV.FOE VAKIABLBS. the direct and inverse standard errors are approximately the same. in the case of the mean. but we may. some rough idea as to the possible extent of under-estimation or over-estimation may be obtained. 16. if the sample be small.g. If this sample is based on a considerable number of observations. Xiy. it will be remembered that unless the number of observations is large. e. the procedure is safe : enough. In the first place. Finally.e. or the values that they possess in the original record if the sample is unbiassed. are assumed to be known a priori. We need hardly restate once more tho warnings given in Chap.. we cannot interpret the standard error of any constant in the inverse sense. and the "probable error" becomes of doubtful significance. the values that would be by an indefinitely large sample drawn under the same conditions. XIV.e. Secondly.. we cannot in general assume that the distribution of errors is approximately normal it would only be normal in the case of the median (for which p and q are equal) and in the case of the mean of a normal distribution.. but if it be only a small sample we may possibly misestimate the standard error to a serious extent.. again emphasise the warnings given in §§ 1-3. J." Biometrika. p. vol. reference to which has been made in the lists of previous chapters : reference has also been made before to most of The probable . L. 1906. 81. iv." Biometrika. Trans. other methods.. 1906.. 371. 19.... 1916. Hekon. . " normal coefficient. p. Tests for Linearity of Regression in Frequency 1 Distributions.) — 354 THEORY OF STATISTICS.. 165. 1905. 1906.'' Phil. 1909. 411. Y. cf. and Karl Pearson. etc. Jour. xoii. "The Frequency Distribution of the Values of the Correlation Coefficient in Samples frqm an Indefinitely largo Fopnlation. Coefficient of contingency : 2. vol. of the result given in (33). i. 507. vol. xxiv.. (A proof. Coefficient of correlation (product-sum and partial correlations) : : : 10. A. 26. "The Calculation of Probable Errors of Certain Constants of the Normal Curve. 1902. 1887. 1910. REFERENCES. 28. Coefficient of correlation. vol. 33.. on ordinary algebraic lines. Laplaob. (11) GipsON. R. 9) Addendum. Thiorie des protabiliUs. 32. p. 35. pp. p. Edgewoeth. v. 31. Soc. ref. The following is a classification of some of the memoirs in the list below : General : 18. 139. Marquis de. vii. p. Layton. The Measurement of Groups and Series . A. "On Sac. 29. Edgewoeth.) Pearl. vol. p. D. p. vol.. xiv. xxii.. Edgewoeth. 332.. Y. 1915.. 1908. ' . 7. 1885. 1903. 499. Edgewoeth. BLji KEMAN.. "The Choice of Means. 20. x. "On the Conditions under which the Probable Errors ' of Frecjuenoy Distributions have a real Significance.) Heron. vol. F.. : 24. Pibeeb Simon. vol. 1906.. W. 6) 7) 8) Theory of Errors of Observation and the First Principles of Statistics. D. 23. 3) BowLEY. the Probable Errors of Frequency Constants." Biomstrika. for the case of three variables. 30. p." vol." Biometrika. errors of various special coefficients. 5th Series. " On the Probable Error of a Partial Correlation Coefficient. 651 and . Ixxii. 6.. vol. Eldbeton. 1814. L. " ^iomeirifo. 13. the memoirs concerning errors of sampling in proportions or percentages.. L. 36." Cambridge Phil. Fit of Theory to Observation. Averages and percentiles 5. A. vii. F. (12) (13) (14) (15) (16) Winifred. 381.) IssERLis." Buymetrika. p. "Tables for Testing the Goodness of (10) FiSHEE." Proc. Stat. & E. Palin. 386." Biometrika. "On "On Coefficient of Mean Square Contingency. : Theory of fit of two distributions 9. "Tables for Facilitating the Computation of Probable Errors.. vol.. p. J. vol. v.. 191. Roy. etc. are generally dealt with in the memoirs concerning them. vol.. 26. "Observations and Statistics: An Essay on the .. Series A. 190. 411. Address to Section F of the British Association. 268. vol. C.. 4) 5) BowLBY. Raymond. (A diagram giving the probabl" error for any number of observations up to 1000. Ixxi. F." Biometrika. Standard deviation 17.. 2" iin. the Probable Error of the 2) Blakbman. "An Abac to determine the Probable Errors of Correlation Coefficients. . p. London. p. Y. 23. As regards the conditions under which it becomes valid to assume that the 'stribution of errors is normal. " Problems in Probabilities. F. 14.. 12. 34. 5th Series. 1910. Y. iv. (With four supplements. Coefficients of association : 34. Mag. 1886. Boy.. p.. Mag." Phil.. . U." Biometrika. (19) (20) Peabson.) "SinDBNT. Filon. 1914. 1898. in certain cases.. v. ix. and on the Influence of Random Selection on Variation and Correlation. U. x.. Stat. vol. (Probable error of the correlation coefficient for a fourfold table. "On the Probable Error of a CoeflScient of Correlation as found from a Fourfold Table... and others (editorial). 1912." Biometrika. (34) Reference may also be made most part with the effects of errors to the following. "On the Probability that two Independent Distributions of Frequency are really Samples from the same Population. x. 1908.. 273. Karl." Biometrika.. Ixxvi. vii. vol. p. "Student. p.. Boy. "On the Probable Errors of Frequency Constants. 250. G. 112. Soc. Kakl. 1913. H. vol. and vol. Boy. "On the Application of the Theory of Error to Cases of Normal Distribution and Normal Correlation. 101. 210. Kabl. G." FMl. "On the Probable Error of a Coefficient of Mean (26) Khind. Pearson. 1903." Jour. Boy.. p. (27) (28) (29) (30) (31) Skew Frequency-distributions. 355 Peakl. 1909-10. oxoi. 1906. 1907. p. 5th Series. Shbppakd. p. 91. and vol. N.. Pearson. vii. Trans. 1908." iiomeirite. 386. Karl.. 229. 302. viii. p. p. 1906. 1911. 192-3 at end." " On the Probable Error of a M. "On the Probable Error of the Correlation Coefficient to a Second Approximation. vol.. 181. vol. 1898." Biometrika. (22) Pearson." Biometrika. with small samples. (On the amount of divergence. (17) — SIMPLER CASES OF SAMPLING FOE VARIABLES.. (Dseful for the general formulae given.) Pbabson. 590. (23) (24) (25) Pearson.. " On the Curves which are most suitable for describing the Frequency of Random Samples of a Population. p. p. a." Biometrika. v.. (See pp.. SoPBR. "Tables for Facilitating the Computation of Probable Errors of the Chief Coiistants of vol. vol. Roy. ix. ix. 1913. vol. Series A. ii. etc.. Karl. vol." "On the Distribution of Means of Samples which are not drawn at Random. p. 22. vol. "On the Theory of Correlation for any number of Variables treated by a New System of Notation.. 85." Phil. p. Soc." Biometrika. Yule. Series A. A. vi.) "On the Methods of Measuring Association between two Attributes." Biometrika.. 1900. Karl. cxcii. vol." Proc..) ) ) XVII. which deal for the other than errors of sampling:— . vol." Phil. 1909. "On certain Points concerning the Probable Error of the Standard-deviation. p. 1. 1908." (The problem of the probable error Biometrika. vol. Square Contingency. Soc. 182. based on the 1913. H. p. from the standard error (18) al\J1n in the case of a normal distributioti. p. 1. vol. p. 1915. p. G. p. E. 384. and L..Ti." Biometrika." BiomeVrika. (21) Pbakson. Pkaeson. Karl. x.. SoPBB. 1914. Mag. (32) (33) "Student. F. Trans. 172. general case without respect to the form of the frequency-distribution. 1. vol.. 127 and p. of association coefficients. (The standard error of the mean in terms of the standard error of the sample. vol. vi. p. vi. 157. "On the Probable Errors of Frequency Constants." Biometrika. Ixxix.." "On the Probable EiTor of a Correlation Coefficient. vol.ea. vol. Karl. Soc... " On the Criterion that a given System of Deviations from the Probable in the Case of a Correlated System of Variables is such that it can be reasonably supposed to have arisen from Random Sampling. "On the Probable Error of the Bi-serial Expression for the Correlation Coe&cient. E. "Note on the Significant or Non-significant Character of a Sub-sample drawn from a Sample.. Series Yule. W. p. Eaymokd. VI. A. (2) 1000 observations. Soc. Slat. The standard-deviation of the same distribution is 21 "3 lbs. "Relations between the Accuracy of an Average and Constituent Parts. with the ratio of the standard error of the semi-interquartile range to the semi-interquartile range. 1911. Boy. find the standard errors of the two quartiles 1. Ixxv. for values of r-=0. (142-5 3. Find the standard error of the mean. 1). 6. vol.'' EXERCISES. (Standard-deviation 2 '67 in. 0"6. . lbs." Jow. Work out the standard error of the standard deviation for the distribution of statures used as an Olustration in § 6. Slat. Roy.. BowLBT. For the data in the last column of Table IX. that of its L. p. For the same distribution. 5. find the standard error of the median (154"7 lbs. For the same distribution. .). 0-2. 8585 observations. "The Measurement of the Accuracy of Jowr. find the standard error of the semi-inter- quartile range. 168-4 lbs. assuming the distribution normal. Calculate a small table giving the standard errors of the con'elation coefficient. Soc. (36) BovfLBT... 2. and compare its magnitude with that of the standard error of the median (Qn. vol. L. 95. 0*8.. Chap. 0-4. an Average.) Compare the ratio of standard error of standarddeviation to the standard-deviation.. 4. assuming the distribution normal. p. 855.356 (35) THEORY OF STATISTICS. A. p. 1897.). 77.. Ix. based on (1) 100. and Asher & Co. Oube-roots. the student find a slide-rule exceedingly useful particulars and prices will be found in any instrument maker's catalogue.000. stereotype edition. but. of course. net. if greater accuracy is needed. Fob heavy arithmetical work an arithmometer invaluable . rocals of all Integer Nvmiiers up to 10. For greater exactness in multiplying or dividing. Cubes. & F. 1910. Barlow's tables Crelle's. or Peters' tables for more advanced work.000 . will : A : (1) Barlow's Tables of Squares. Spon. a 50-cm. London). rule. extended multiplication There are many of these. and four of tables are very useful. If it is desired to avoid logarithms. plain 25-cm. of trigonometric functions). but attention may perhaps be directed to the recently issued eight-figure tables of Bauschinger and Peters (W. i.APPENDIX I. is. in which the scale is broken up into a number of parallel segments. especially work not intended for publication. owing to their cost. are invaluable for calculating standard-deviations of ungrouped observations and similar work. as a rule. Cotsworth's. and Recip- N. ii. London and New York . CALCULATING TABLES. beyond the reach of the student. . E. Engelmann. vol. price 6s.. It is hardly necessary to cite special editions of tables of logarithms here. a Fuller spiral rule. A. arithmetic machines are. 357 . containing logarithms of all numbers from 1 to 200. TABLES FOR FACILITATING STATISTICAL WORK. For a great deal of simple work. vol. Leipzig. Square-roots. 6d. or if greater accuracy is desired. may be preferred. price 18s. rule will serve for most ordinary purposes. seven-figure tables must be used. London. logarithms are almost essential five-figure tables suffice if answers are ouly desired true to five digits . pensive and recommended for the elementary student. containing logs. or one of Hannyngton-pattern rules (Aston & Mander. Zimmermann's tables are inexdifferent forms are cited below. (Seven-figure logarithms of the function. price 15s. are contained in Tables for Statisticians and Biometricians. " An Abac to detertnine the Probable Errors of Correlation Coefficients. F.. W.. differences of O'OOl of the argument.) four-figure products.) German or in English. viii.. a. London price with thumb index. Several tables of service will be found in the works cited in Appendix II. . p. p. The Direct Calmlator. 1904. Reimer. etc. 1902. ii. second edition. New York. price 15s. London. (Product table to B. price 5s. Chapman & Hall . 1914. . vol. Statistical Methods.) 358 (2) THEORY OF STATISTICS.. 474. H." Biometrika. vii. 385..." Biometrika. ETC. p.000 more convenient than Orelle for forming Introduction in English. New Sechentafeln fur MuUiplilcation imd Division.. vol.bd. (Tables of area and ordinates of the normal curve." Biometrika. M. (Multiplication table giving all products up (3) Cbeue. "Tables of the Tetrachoric Functions for Fourfold Correlation Tables. 1899. 411. G. proceeding by 1909. vii. 25s.... 6s. (7) C. (12) Heuon. (Gives products up to 100 x 10. Reimer. majority of the tables in the list below. Series 0. p. gamma functions.) Berlin ." Biometrika. 155. L. p. logarithms." British Association Report. (Products of all numbers up to 100 x 1000 subsidiary CoTSWOBTH... p. English edition. for all numbers up to 1000 at the foot of the page.. p. e.g. (A diagram giving the probable error for any number of observations up to 1000. probable errors of the coefficient of correlation.) (13) Lee. (6) ZiMMEEMANN. "Tables for Facilitating the Computation of Probable Errors. . square-roots.. 1910. J.. 21s. Wjnifp. a table of Gamma Functions in Elderton's book (12) and a table of six-figure logarithms of the factorials The of all numbers from 1 to 1100 in De Morgan's treatise (11)." Biometrika. (5)Peteks. J. (Functions occurring in connection with Professor Pearson's frequency curves. i. (10) EVEKITT. ref. vol. "TablesforTestingtheGoodnessof Fit of Theory to Observation.) (8) DuFEELL. iv. H. together with others. Ernst & Son. Biometrika. 1000 X 1000. net). and of the Sums of Powers of the Natural Numbers from 1 to 100" (gives powers up to seventh). SPECIAL TABLES OF PTJNCTIONS. Eldbrton. 1910. p. D. 43. cf. loith especial reference to BioVariation . v) and S{r. W. French or German. Mechentofel. Can be obtained with explanatory introduction in to 1000x1000. which were originally published in Biometrika. "Tables of Powers of Natural Numbers. powers. London. Sechentafeln. cubes. P.) logical (9) Davenport. 1906. B. Asher & Co. 1912. (Tables for facilitating the calculation of the correlation coefficient of a fourfold table by Pearson's method on the assumption that it is a grouping of a normally distributed table . XVL) (11) Gibson. G. edited by Karl Pearson (Cambridge University Press.. "Tables of the Gamma-function. B. (4)Eldebton. vol. vol. without index. etc. vii. Berlin . Alice. price 15s. 385. 437. vol.. Berlin . " Tables of F{r. ) M'Corquodale & Co. . p. W. i4ofChap. . nebst Sammlung haufig gebrauohter Zahlenwerthe. and vol.. cube-roots and reciprocals. : : tables of squares. John Wiley . v) Functions. p. "£«)?ne<riAa." Biometrika. p. ' ' 359 Leb. 1907. p. (15) Rhixd... when the 'tail' is larger than the body. for the ordinates which divide the area into a thousand equal parts. 127 and p. but also a table of the ordinates to the same degree of accuracy.. "Tables of the Gaussian Tail-functions.." Biometrika. "Tables for Facilitating the Computation of Probable Errors of the Chief Constants of Skew Frequency -distributions. "New Tables of the Probability Integral. vol.. ii. —SPECIAL TABLES QF FUNCTIONS. in terms of the standard-deviation as unit. Aliok. a. 208. (Includes not merely table of areas of the normal 1903. (17) Shbppaed. ETC. 386. F.. "Table of Deviates of the Normal Curve" (with introductory article on Orades and Deviates by Sir Francis Galton). x. vii.) APPENDIX (14) I. . p. (A table giving the deviation of the normal curve. 1914:. 17i. W. curve (to seven figures).. F. v. vol. W. vol.) (16) Sheppabd. vol. 1909-10. 404. Biometrika. (1) AlET. Barth. Leipzig. A. (Part 2 on the theory of correlation applications . are in the library of the Royal Statistical Society. Newsholme's scientifiques. BoEBL. Paris.) (3) (4) (6) (6) (7) Bektkani).. Jacques Bertillon's Cours SlSmentaire de statistique (Socidt^ d'^ditions international in scope).. The student may find the following short list of service.. Sammelforschung J. 1889. Ars Wissenschaften. the latter containing. und psych. The economic student who wishes to know more of the practical side of statistics may be referred to Mr A. : to experimental psychology. opus posthumum: Accedit tractatus de seriebus infinitis. L. Wahrscheinlichktitsrechnung Teubner. 1895 Vital Statistics (Swan Sonnensohein.. . London. Bowley's "Elements" (6 below). 1909. (8) Bruks. Leipzig^ 1906. The great majority of the works list. Dr A. 1879. 1901 Srd edn.. Ueber Korrelation Beihefte zur Zeitschrift fiir ang. mentioned in the following with others which it has not been thought necessary to include. und Kollektivmasslehre 360 . by the same writer (useful as a general guide to English statistics). Sir G. 1899) will also be : of service to students of that subject. Psybh. Srd edn. 108. (2) Bernoulli. . Betz. Moments de la tMorie des probabiliMs Hermann. APPENDIX II. B.) ) . the Algebraical and Nwmerical Theory of Errors oj 1st edn.. J. 1713. Oalcul des probabilit4s Gauthier-Villars. On Observations . Elements of Statistics P. 3rd edn... . S. London. 107. et epistola gallic^ scripta de ludo pilot reticularis. L. 1907. . King. 1st edn. . as supplementing the lists of references given at the ends of the several chapters. A. (Applications to p^chology. as a rule. SHORT LIST OF WORKS ON THE MATHEMATICAL THEORY OF STATISTICS AND THE THEORY OF PROBABILITY. W. to An Elementary Manual of Statistics (Macdonald & Evans. conjectandi. L. BowLEY. E. . 1911.. 1910). Brown. Nos. The Essentials of Mental Measurement Cambridge University Press... H. and to M. 1911. W. original memoirs only. (A German translation in Ostwald's Klassiker der exakten J. 1861 . Paris.. F.. London.. W. (19) Lexis...) (22) QuETELET. Marquis de. Calcul des probability .) .) (13) (14) Fechnee. G.. G... Marquis de. W. Jena. A. Jena. (16) JoHANNSEN. 1896.. Elemente der exakten ErUichkeitslehre . Leipzig... 1814.wnd Moral(17) statistik . Laplace. (11) (12) De MoiiGAN. T. a. Macmillan. D. Abhandlungen eur Theorie der Bevolkertmgs. 361 CoURNOT. Treatise on the Theory of Proialilities (extracted from . . Jena. Leipzig. 1904. L. 1814. A.. .. Fischer. 1897. Galloway. J. . . 2ndedn. (20) PoiNCABi. (15) Gauss. correlation. A. loith especial reference to its Logical Bearings cmd its Application to Moral and Social Science amd to StatiMcs 3rd edn. 1837.. Lettres sur la tMorie des probabilit^s. H. C. & E. E. London. edited by 6. 1888. Science Press. . WahrscJieinlichkeiisrechnung imd ihre Anwendung auf Fehlerausgleichung.. 1839.) E. 1913. Die OrundzUge der Theorie der Statistik (23) Thokndike. with supplements 1 to 4. F. 1856. Pibkrb Simon. Exposition de la tMorie des chances et des prohdbilU4s. (21) PoissoN. the Enayclopcedia MetropoUtana). Fischer. interest. (25) Westergaard. Schnuse. ThSorie analytigiie des probability (18) 2nd edn. S. Kollektivmasslehre (posthumously published. L. Essai pMlqsophique sur les probability. i.. . 1849. 1906. New York. Eldeeton. (German translation by C. p. traduits par J. —SHORT LIST OF WORKS. Frequency Curves and Correlation C. 1908-10. J. "W. 1843. Paris. F. Statistik wnd Lelensversicherung Teubner.) Laplace.. Treatise on ProhaUlUy (republished from the 7th edn.. Venn. Pikkke Simon. 1890. of the Encyclopaedia Britannica). Gauthier-Villars. H. separately printed with some modifications. An Introduction to the Theory of Mental and Social Measurements. 1841. Layton. T. The Logic of Chance: an Essay on the Fovmdations cmd (24) Province of the Theory of Probability. APPENDIX (9) II. (Deals with Professor Pearson's frequency curves and with illustrations chiefly of actuarial . prMdies des rigles ginirales du calcul des probabilitis. 1903. (10) CzTJBER. 1837.. 1846. Bertrand. Mithode des moindres carris: Minwires sur la combinaison des observations. (The introduction to 18. H. Becherches sur la probability des jugements en matiire criminelle et en matiire civile. Lippsj Engelmann. (Verylargely concerned with an exposition of the statistical methods. (English translation by 0. appliquie aux sciences morales et politiques. vol. a'^ Ausgabe. Downes. Fischer. y) = 0. That as is. Similarly.SUPPLEMENTS. 171. x' and y' being a pair of associated deviations. Differentiating with respect to a^.) To those who are acquainted with the diiferential calculus the It is on the lines of the following direct proof may be useful. we have %{x-a. and differentiating with respect to %{x-b^.' — a^ + a 6j^ .yf with respect to a^ and to \ and equating to zero. it is required to determine values of a^ and h-^ in the equation (where x and y denote deviations from the respective means) that will make the sum of the squares of the errors like M = a. %{x) = %{y) = Q. Taking first the case of two variables (Chapter IX. I. we will find . But and consequently we have Dropping a^. 2/' minimum.x' that will make the sum of the squares of the errors like v = y' a minimum. a^ = 0.^ + h^. ^ = |^^-^^ cry 2(2/2) if on p. we determine the 2/ values of a^ and h^ in the equation = aa + \x -a^ + b^. and XII. § 3.rj)y = 0.). _ %{xy) o-„ 362 . {Supplementary to Chapters IX. &j^. The required equations for determining a^ and 6j^ will be given by differentiating %{u^)--^%{x-a^ + h^. proof given in Chapter XII. DIRECT DEDUCTION OF THE FOBMULJE FOB BEGBESSIONS. then the probability of having at least m successes in the m + m trials is evidently the sum of the m' + 1 terms of the expansion But this probability. successes and 2 failures. the chance of either : this is /I™. successes and 1 failure. . m OT(OT. especially § 7).. The now be directed to the limit reached when either p ox q becomes very small. may term P^. . q. the probability of this is The first m+2 m the (m + 2)"" trial 2 ! ^ ^ In a similar way we find for the contribution oi successes and 3 failures.23 . This gives the equations of the form there stated. as made up of m + m =n trials. m m+l of (b) is (c) mp'^ . .+ l)(m + 2) „ 3! 3 . + Oi„. a number ar^ involved. as above. for which the chance of success at each trial is p. If a constant term be introduced. + 1 trials might give (S) The first the latter not to happen on the (m+iy^ trial (a condition already successes and 1 failure. —THE LAW OF SMALL CHANCES. trials might give not to be a failure (So as to avoid a repetition of either of the preceding cases) . {cf. as the failure might trials. . its "least square" value will be found to be zero. But the probability of the latter at a specified trial.) We have (p + qY when n is large student's attention will seen that the normal curve is the limit of the binomial and neither p nor q very small.. the equations for determining the coefficients will be given by diflerentiating .34 . THE LAW OF SMALL CHANCES. (Supplementary to Chapter J[V. which we of (p + q)" beginning with ^". the complete probability out oi occur in any one of m m m . 363 of variables as in Chapter XII. and. 3(^1 — Oi2. (n-1) • i*n) with respect to each coefficient in turn and equating the result to zero. is p™ q.. giving m + 3 trials. Let us regard the n trials of the event. II. n • £^2+ . covered by (a)). . The required For instance result might happen in any one of «»' + 1 ways. §§ 4 et seq. can be expressed in another and more convenient form with the help of the following reasoning.— SUPPLEMENTS If. but n is so large that np or nq remains finite. (a) Each of the first m trials might succeed . we have .-.is the chance of exactly one failure. we have verify.-).-.x. Let us also suppose 2' that n is so large that (7) nq^ = \\& finite... if n be large and g small. under similar conditions. if m j>"-^[l+(r.(».->(.23! ('-3"(>-r"t-^---"since — and smaller fractions can be neglected. so that — = ratio of also very small. and (8) reduces to e-\ Put m'= 1.2. when n and.and putting m = ?i — 9w'. we put »i' = 0..=. so that e-^. If . n But (1 j • • m' " !.. becomes 1.364 Ultimately we reach THEOEY OF STATISTICS.-2)g + ("-y-l) g2] ii ! Let us now suppose that q failures to total trials is is very small.A. we have the chance that the event succeeds every time. For instance. L 2 ! m m ! This expression is of course equivalent to the first m' + 1 terms of the binomial expansion beginning with /j™. and the terms . is shown in books on algebra to be equal to e -\ where e is the base of the natural logarithms. is infinite ('-3-->Hence. so that = 2. as the student can = m ... and we get the chance that the event shall not fail more than once. Writing = . e-'^(l + A). The convergence of seen from the fact that r cannot exceed -. etc. In other words.. 365 within the bracket give us the proportional frequencies of 0. and we shall give one of the methods of proof adopted by modern statisticians.when q Expanding the second bracket. failures. p. but the result has been reached independently by several writers since Poisson's time.. (?> + ?)" = (! -? + ?)" = (! -#(1 + ^4-)' • W inde- The first finitely small. 19. . 2 it (9a) which reduces to the series is — when T very small. right Hence the second bracket on the ri] of (9) may be written and (9) is identical with (8).. 273). The investigation contained in the preceding paragraphs was published in 1837 by Poisson. 2. (8) is the limit of the binomial (p + j)" when q is very small but nq finite. so that (8) may be termed Poisson's limit to the binomial . bracket on the right is equal to e~*.q q is r . and the to substitution of this value in (9o) reduces q^ (l-y)\' which vanishes with q.1 SUPPLEMENTS —THE LAW OF SMALL CHANCES. we have is The ratio of the (r + 1)"" to the r"" term is ——r -t- J^^l \ . which the student may perhaps find easier to follow than that of Poisson (see ref. 1. .x. 0-2= -6079.366 THEORY OF STATISTICS. Chapter XIII. 6. of distribution. of this theorem is due to the fact that study of problems involving small. the mean is -61. effectively different values of represented by (8). . we have for the mean the actual value of q (or p) The theorem has .^. in the statistics used in par. 12.2-\2 = \. Chapter XV. certain relations must obtain between the conUsing the stants of the statistics (see par. have the mean equal to the square of the standard deviation. .. tribution given by the binomial distribution of M being large. + A. Bortkewitsoh and also been applied to cases in which. Hence any statistics produced by causes conforming to Poisson's limit should. and for cr^ e-^(\ + 2\^ + ^^+ . 12 of Chapter XIII. within the limits of sampling. it may safely be assumed It should be noticed tliat. o-=-78. = A.. tables of which for X have been published by v. .). indeFor instance. n things in JV pigeon-holes being of equal size and equally accessible).)-X2 = A. For instance. if (8) is the real law to be very small. the pigeon-holes the dis- .+i!t •) = \e-^( .-«(wx./l F-IX iV-lN" would be others. although is unknown.. if we desired to find the (all The frequent rediscovery its value is felt in the pendent probabilities. method of par. we theoretical frequencies from (8). . putting have the following results — Deaths. 367 we now compute the \= '61.: SUPPLEMENTS If — GOODNESS : OF FIT. ». 4 . . that ellipse Conversely.. It was proved on pp. Hence. the above principle remains valid although it With ceases to be possible to give a graphic representation. . n^ we actually find m^. 166 and p. m„ = TOo + TOi+ . »ij. in Chapter XII. at once suggests that the same principle will apply when there are two variables. or. the generalisation of the theorems of Chapter IX. . 246. when the number of variables is 3. these data. . but throughout the equation of the contour of equal probability is of the ellipse type (c/. n„ = N'. shown to hold between the normal curve and the surface of normal correlation. . have a normal system of frequency in two variables x and y. then the chance that on random sampling we should obtain the combination x y is measured by the corresponding ordinate of the surface. . m^ m„. tions less likely to occur than a. .). . if we prefer. The problem to be solved is whether the observed system of deviations from the most probable values might have arisen in . m^ + m^+ . where N . = volume divided by the total volume of the surface will be the the probability of obtaining in sampling a result not worse than X y' . with four variables another dimension is involved.. The reader who has compared the figures on p. this slices from x = x' and y = y' down to a. and followed the algebra of pp. should be distributed into n+\ groups conn„ each. y = y' to x — y=co. and y = 0. 319-321 that the contours of a normal Now suppose we surface are a system of concentric ellipses. .' y ellipse. three variables the contour ellipse becomes an ellipsoidal surface. and the feet of all ordinates of equal height will lie upon an ellipse which will therefore be the locus of all combinaAny combinations of X and y equally likely to occur as is x' y tion more likely to occur than x y will have a taller ordinate. 331-332. will have no difficulty in seeing that. in number.' y' will be represented by ordinates located upon ellipses wholly surrounding the a. and the four-dimensioned frequency " volume " must be dissected into tridimensional ellipsoids . and so on . Let us now suppose that if a certain set of data is derived from a statistical universe conforming to a particular law. . . and then the fraction is the chance of obtaining as bad a result as x' y'.368 relation there THEORY OF STATISTICS. or a worse result.. we may sum from x = x'. if we dissect the surface into indefinitely thin elliptical slices and determine the total volumes of the sum of . n^. Instead of this taining respectively n^. and as the locus of its foot must also be an ellipse. . . combinawill be contained within the x y ellipse. e. The reason for adopting the latter device is that. is then the equation of the " ellipsoid " delimiting the two portions of the "volume" corresponding to combinations more or less likely to occur than m^. Accordingly. But as the limits of integration of the angular (not of the vectorial) element will be the same in the numerator and denominator. m^ m„. .. Let us now suppose that the distribution of deviations is normal.SUPPLEMENTS —GOODNESS OF FIT. . 331. . When n such elements are transformed. iV being given. . a vectorial element dr. and a " solid " angular element. m^. To reduce this ?i-fold integral to a single integral. etc. . and a term in r. In this book we have been concerned with summations the elements of which were finite. Since. The reader is probably aware that when the element summed is taken indefinitely small the summation is called an integration. dx dx 24 . the radius vector. which we will write for the present in the form = a constant. In the first place the ellipsoid. . when two rectangular elepnents dx. the summation from the ellipsoid to the ellipsoid oo X^ ..\^\ power and there is an infinitesimal vectorial element. i. . dy are transformed to polar co-ordinates. to find the chance of a system of deviations as probable as or less probable than that observed. the integral vectorial factor is raised to the n . Then the equation of the frequency "solid" is of the type set out in equation (15) of p. fixing the contents of any n of the classes determines the n + V'\ there are only n independent variables. adding together the elliptic elements from the ellipsoid -^ to the ellipsoid oo and to divide this summation by the total volume. x"'^ . dx^. dr. and the infinitesimal element being written dx. Hence the multiple integral reduces to a single integral and the expression becomes 2 i X e-W e-ix2 . referred to its principal axes. while ^ may be treated as the vectorial element or ray. In the present case we have to reduce an n-fold integral the summation relating to n elements (fa.!("-i . is transformed into a spheroid by stretching or squeezing. the following method is adopted. the symbol / replacing or S. we have to dissect the frequency solid. 36§ random sampling. we replace them by an angular element dQ. and the system of rectangular co-ordinates transformed into polar co-ordinates. these cancel out. 342 we reach sampling in the p*'^ and y"" classes. its integration. desirable. The arithmetical process is illustrated upon the two examples of dice-throwing given on p. x^ is determined by evaluating the standard deviations of the n variables and their correlations two at a time (the higher partials being deducible if the correlations of zero order are known). usually denoted by the letter P. 358. 257. we have ' '•=V4-"i)^ for the standard error of sampling in the content of the p*^ class while by a similar adaptation of the reasoning on p. the proof given assumes that deviations from the expected frequencies follow the normal law. for if it is very small the It is distribution of deviations will be skew and not normal. There are three points which the student should note as regards the practical application of the method. the reduction of which. all n+ l classes of the frequency Values of the probability that an equally likely or less likely system of deviations will occur. have been computed for a considerable range of x^ ^^d of n' — n+l= the number of classes. to group together the small frequencies in the " tail " of the frequency distribution. can be effected in terms of X ^7 methods described in text-books of the integral calculus. but the student should have no difficulty Its in following the steps given in pp. therefore. infra). for the correlation of errors of value is 71=0 In the summation extending to distribution. Everything turns. therefore. 47. As we have seen. With these data. as is done in the second illustration below. By an application of th. so as to make the expected frequency a few In the case of the first illustration it might have units at least. 370 THEORY OF STATISTICS. and are published in the Tailes for Statisticians and Bio-metricians mentioned on p. x^ "an be deduced (the actual process of reduction is somewhat lengthy. 258. This is a reasonable assumption only if no theoretical frequency is very small.e method of p. successes with that of been better to group the frequency of . In the first place.. 370-2 of ref. upon the computation of the function x. of Successes. 5.StJPPLKMENTS — GOODNKSS OF FIT. 4. 258). . a throw of a success {p. 371 Twelve Dice thrown 4096 times. or 6 poimts reckoned No. assumes that the In a large number. the constants of a frequency curve from the observations themwe determine the selves. . and negative again for II and 12 successes. distribution according to the . This is almost the first case supposed. Area in both tails 9999998551 -0000001449 -0000002898 ( -t- so that the probability of getting such a deviation or —) on random sampling is only about 3 in 10. If we regroup the signs of m' — m. The value found f or P (•0015) by the grouping used is therefore in some degree misleading.— 372 I THEORY OF STATISTICS.000. not from a priori considerations "independence values" of the frequencies for a contingency table from the given row and column totals. of practical cases in which the test is apWe determine. : P : Greater fraction of the area of a normal curve for a deviation 5-13 Area in the tail of the curve . example on the preceding page aU the differences are negative up to 5 successes. the proof outlined is theoretical law known a priori.000. and it may happen that x" ^^^ quite a moderate value and is not small when all the positive differences are on one side of the mode and all the negative differences on the other. and the frequency of success. for example. The method used pays no attention to the order of these signs. a priori considerations. in the section headed "Comparison Frequencies based on the Observations. and in fact we have already found (p. . From Table II. or that the differences are negative in both tails so that the standard deviation shows In the first an almost impossible divergence from expectation. attention should be paid to the run of the signs of the differences m' — m. 267) that the mean deviates from the expected value by 5*1 (more precisely 5'13) times its standard error. of Tables for Statisticians we have In the second place. . so that the mean shows a deviation from the expected value that is quite outside the limits of sampling." Finally. perhaps almost the majority. 12 successes with that of 11 successes. . again not from This general case is dealt with below. . plied this condition is not fulfilled. positive from 6 to 10 successes. we find Successes. In the second example the signs are fairly well scattered. The regrouped distribution is — P : Successes. Such a regrouping of the frequency distribution by the runs of classes that are in excess and in defect of expectation would appear often to afford a useful and severe test of the real extent of agreement between observation and theory. or practically 27. 373 is this comparison n' is 3.— SUPPLEMENTS For —GOODNESS OF FIT. the mean being in almost precise agreement with expectation. and the regrouping has a comparatively small effect . . and about "000001 a value much more nearly in accordance with that suggested by the mean. ^^ is 26-96. x'- .374 THEORY OF STATISTICS. is uniform over the whole range from to 1. Thus for a rough grouping into four classes the above series of trials gave : P.— SUPPLEMENTS say. . 375 when we fulfil the conditions of simple sampling. —GOODNESS OF FIT. r be the number of rows. ref. we are taking it as one more than the number of algebraically independent frequencies. observed frequency j central figure. to Lower Tint on . Table XXI. so Twenty observers then that there were 256 cards in the pack. (Yule. medium. that this is a reasonable rule if he considers that when we take n as the number of classes. ref. difference S. or darlc) assigned to each of two Pieces of Photographic Paper on a Card : 256 Cards and 20 Observers. each one naming each tint either "light. the comparison frequencies being given a priori." or "dark.). The same statement must hold good Hence. Y. Sixteen pieces of photographic paper were printed down to different depths of colour from nearly white to a very deep blackish brown. Upper figure. 5 of Chap. for every row. went through the pack independently.." The student will realise Table showing the Name {light. Name assigned Card. if THEOEY OF STATISTICS. and the tables must be entered with the value TO' = (r-l)(c-l) + l. combining scraps from the several sheets in all possible ways. c the number of columns. two scraps on each card one above the other. the number of algebraically independent values of S is (r— l)(c— 1). Small scraps were cut from each sheet and pasted on cards.) bottom figure. since the total number of observations is fixed.376 zero." "medium. independence frequency . 5 of Chapter Y. The following will serve as an illustration (Yule. SUPPLEMENTS 4225/785 — GOODNESS OF TIT. 377 . .378 THEORY OF STATISTICS. and considered the difference between the proportions of white flowers for prickly and for smooth fruits respectively. with the by the ^^ method.... is That agreeing. Table XIV... The same result would again have been obtained had we worked from the columns instead of from the rows. 6 of Chapter III...— SUPPLEMENTS — — GOODNESS : OF FIT... 379 Interpolating in the table of areas of the normal curve on 310. Example ii. Greater fraction of area for a deviation of '84 in the normal curve Area in the tail Area in both tails . as great as or greater than that actually observed is -401. of "7995 -2005 "401 to say. we have p. the probability of getting a difference. ..— (Data from ref. within the accuracy the arithmetic. or taking the required figure directly from Table II. of Tables for Statisticians.. of either sign.) The following table shows the result of inoculation against cholera probability given on a certain tea estate : . were cited. For the association table there is only one algebraically independent value of S. a positive association. 6 of Chapter III. telling us what is the probability of getting by mere random sampling a series of divergences from independence as The question is usually great as or greater than those observed. — P is '96470 '03530 "07060 both methods must lead to the same result... Add up all the values of application of the present general rule. perhaps.. its The ratio of the difference to •01853/'01025. — — All may give. this better answer is given by the method is not quite satisfactory. but the values of may run so high that we do not feel any great conThe question then arises fidence even in the aggregate result. different estates in the same group. Hence if we are testing the divergence from independence of an aggregate of association tables. Chapter IV. in view of the fallacies that may be introduced by pooling {cf.. y^ for the different tables. P x'- . from which the data of Example ii. answered by pooling the tables . for the aggregate as whether we cannot obtain a single value of a whole.380 THEORY OF STATISTICS. we take the following values of -^ and of They refer to six for six tables that include that example. but. take n' as given by w'=l + 2(i'-l)(c-l). thus obtaining the value of -j^ for the aggregate.. or 1-808. standard error is therefore Greater fraction of normal curve for a deviation of 1'808 Fraction in tail Fraction in the two tails . Thus from ref. §§ 6 and 7). we must add together the values of x" and enter the P-tables with n' taken as one more than the number of tables in the aggregate. It may often happen that we have formed a number of contingency or association tables more for similar data from different often the latter than the former As before. and enter the P-tables with a value of n' equal to the total of algebraically independent frequencies increased by unity fields. P A : that is.. An Aggregate of Tables. on random sampling. The formulse for the general case. roundly.SUPPLEMENTS — GOODNESS OF FIT. can therefore regard the results as significant with a high degree of confidence.000. as for the special case in which the frequencies with which comparison is made are given a priori. Experimental Illustrations of the General Case. and the value of x^ computed for each table for divergence from independence. 1-3 times. grouping together the frequencies from x^ = 15 upwards.000. 28 29 -000094 -000061 of is -000081. For the two cases we have to' = (3 X 3) 4." It will be seen that the agreement between expectation and observation is If the goodness excellent for so small a number of observations.e.000 trials but 81/64 or. we should only expect to get a total of y^'s as great as or greater than this. ra' = (lx7)-f-l = 8 Differencing the columns for corresponding to these two values of n'. Adding up the values of x^. i. 381 association between inoculation and protection from attack positive for each estate. we obtain the theoretical frequency-distributions given in the columns headed "Expectation" in Table A. The numbers of beans counted in each of the sixteen compartments of the revolving circular tray mentioned on p. can be checked by experiment. we find P. may. for the observed event (S(x^) = 28-4 and all associations positive) is therefore only -0000013. should therefore only expect to get an equal or greater total value of y^ and tables all shomng positive association. giving P=0"52 in the first case and 0-22 in the second. of fit be tested by the x^ method. so that n' is 4. (2) with 2 rows and 8 columns. I think. 81 times in 1. and entering the column for ra' = 7 (one is The P more than the number of tables considered). not 81 times in 1. 374 above were entered as the frequencies of a table (1) with 4 rows and 4 columns. but for only one of the tables is the value of so small that we can say the result is very unlikely to have arisen as a fluctuation of sampling. The observed distributions of the values of x^ in 100 experimental tables are given in the columns headed " Observation. x'. go further for all the observed associations are positive. and in six cases there are 2* or 64 possible permutations of sign. x^ is found to be 2-27 for the tables and 4-36 for the 2x8 tables. P 4x4 .000 trials.1 = 10 whence by interpolation the value P We We : We P — and respectively. the total is 28-40. — 382 Table THEORY OF STATISTICS. calculated from Independence-values. in the second as 8.ents. 50. in Tables with 16 Compartm. compared with the Actual Distributions given by 100 Eomervmenial Tables. Theoretical Distnbution of -j^. In the first case n' must be taken as 10. (Ref. A.) x' . and P 0-97. The values of y^ for the 350 fourfold tables of Table B were added together in pairs. 383 given for evaluating P for an aggregate of by the experimental data of Tables C and D. and testing goodness of fit between theory and observation. — m Sum of X^'s. n' is 8. y^ is 5-53. n' is 9. grouping the values of v^ 7 and upwards. and 4 in the second. The results of theory and observation are compared in the first pair of columns of Table C. we get the observed distribution on the right of Table C. . n' must be taken as 3 the first case. compared with the Actual Distributions given by Experimental Tables. Table C.SUPPLEMENTS — GOODNESS OF FIT. Testing goodness of fit. According to theory The theorem is last tables illustrated the resulting frequency-distribution for the totals of pairs of y^'s should be given by differencing the column of the P-table for M =3. and the theoretical distribution by difi'erencing the column of the P-table for n'=i. giving 175 pairs. x^ is 2-18. and Pis 0-60. Grouping the values of x' for the 350 experimental tables similarly in sets of three and summing. Grouping values of x^ 8 and upwards. Theoretical Distribution of Totals of x^ {calculated from Independence-valves) for Pairs and for Sets of Three Tables with 2 Rows and 2 Columns. — Sum of two . and is 0-39. compared with the Actual bistrihwtion given by Experimental Tables. n' is 5. -being given in the second column. P Table D. Taking the two groups at the bottom of the table together and testing goodness of fit.384 THKOEY OF STATISTICS. ^^ is found to be 4-11. from Independeiice-values)for Pairs of Tables with 4 Sows and 4 Columns. Theoretical Distribution of Totals of -j^ {calculated. 385 Table of the Values of V for Divergence from Independence in the Fourfold Table. .SUPPLEMENTS —GOODNESS OF FIT. 386 THEORY OF STATISTICS. . . 1918. Ass. Roy. Irving. and Cost of Livingin Australia. Contingency (2) (p.. tlieir Development and Progress in many Countries. Soc. vol.. (edited by)." Commonwealth of Australia. March 1921. vol. ADDITIONAL EEFERENCE3. 1918. 1921. Frances. Karl.. in Frequency Curves. 1920. 1916. 1921. ' On measuring association with special reference to 4 x 3-fold classifications. xi.." Biometrika. The Maemillan Co. Ixxxiii. "The Correlation Coefficient of a Polyohorio Table. Price-Indexes. 121. vol. Gaz. p. The Mode (5) (p. (Gives a proof of the relation noted on p.. (An extension of the method of contingency coefficients to p. Pub. written by a specialist for each country. Biometrika. H. Toohee. L. Working Classes. p. Stat. i.) T." ." Jour. W. xi. F. M. p. L. 93. Stat. Roy. DooDSON. the General Theory of Multiple Contingency with Special Reference to Partial Contingency. 103. : There are useful discussions as to method in the following (6) (7) (8) (9) Knibbs. The History of Statistics. 1912.M. 167. Soc.." Jour.." Biometrika. 40. (11) Flux. S<(rfjs<ics. (14) March... and J. W. "The Theory of Measurement of Changes in the Cost (10) of Living. p. T. p. Soy.. p. Stat. "The Measurement of Price Chsinges. 343. 1.. 1921. '' (4) RiTOHiE-ScoTT. History of (1) 387 Official Statistics (p. Cost of Living Cohimittee. Index-nvunljers (p. vol. Median and Mean. of Compilation. 73). Stat..." Quart. 1918). xii. J." Jour." Biometrika. arithmetical examples are provided in the undermentioned paper. 1919." Rev. 1. Soc. Econ. "The Course of Real Wages in London. Ixxvii. L. p. No. 4.. 6). 455. March 1920 and Feb. New York. A. A. "The Measurement of Changes in Cost of Living. Report (Cd. 429. vol. p. iii. 6. 1918. vol. "Fisher's Formula for Index-numbers. (13) Persons. 130). 1913-14. (A collection of articles. xi. 1900-12. KoBEN. 145. Kabl. 8980. A. mainly on the progress of official statistics. 533.. Roy.— ) ) — ) SUPPLEMENTS— ADDITIONAL KEFEKENCES. 1916.. (12) Fishes. Arthur "Relation of the Mode. "Les modes de mesure du mouvement general desprix. For the student of the cost of living in Great Britain the following are useful : (15) "Labour Gazette Index Number: Scope and Method Lab..." Jour. Bennett. Soc. (Considers various methods of ' Pearson.. vol. "Prices. Ixxxii. 159. H.. 1921. Ixxxiv." Metron. Wood. 130). Stationery Office. Report No. p. "On Criteria for the Existence of Differential Death-Rates. Stat. (3) Pkaeson. vol. 1916-17. BowLET. classification subjected to various conditions . Amer. vol. p. vol. Labour and Industrial Branch. "The Best Form of Index-number.. "The Correlation between a Component. p." Biometrika. v. "On foTcnings Tidskrift. 51. "The Correlation Function of Type A. vol. 298." Biometrika. "On the Time-correlation 4. Correlation (23) : Effect of Errors of Observation. (p. Psychology. usually employed instead of "correction.. "General Ability. Bernard. 1921. Karl. 84. 1921) ciitical notices of the same in the Labour Gazette for Aug. WiCKSELL. 854. p. xiii. Pearson. WiCKSELL. Boy. and the Sum of the Remaining Components of a Variable.. 32 Eccleston Sq. Problem.." Quart. L.. 25. Pearson. Svenska Vetenskapsakademiens Handl. 1912.." Jour..." Meddelande fran iMnds Svenska AktuarieAstronomiska Ohservatorium." Kungl. 209). " On the Application of Goodness of Tit Tables to test Regression Curves and Theoretical Curves used to describe Observational or Experimental Data. (17) BowLBY. Bd. p. American Stat. J. 188). 1917.. G. p. Pub." Metron. 33. 1921." Correction or Standardisation of Death-rates (p. 226). 1917. (26) Yule. Jour. Jowr. : History (p. p. 225).. and between the Sum of Two or More Components. p. and and Nature.. L. 209). its Existence Brit." Biometrika. U. S. " Final Report on the Cost of Living of the Parliamentary Committee of the Trades Union Congress" (The Committee. xi. S. Deaths. " On a General Method of Determining the Successive Terms in a Skew Regression lAne. Prices and Wages in the United Kingdom.— 388 (16) THEORY OF STATISTICS. vol. XV. K. .] (1911). vol. Miscellaneous 226). Arthur. Ixxxiv. 1921.. 1913. . No. 1920 (Clarendon Press). Ass.) Correlation in Case of Non-linear Regression (20) (p. London. 1920. A. vol. (21) WiCKSBLi.. 6578. Correlation (18) xiii.. etc.. Soe. (Criticises and extends the work of Slutsky.. Sept.. No.. (22) Pearson. Spearman. 1921 and review by A. and Sept. K.. D. 1921. Hart. D. ..." and a "potential or standard death-rate " is termed an "index death-rate": for the methods of standardisation in present use see is The term "standardisation" now (24) Seventy-fourth Annual Report and Marriages in England and Wales Correlation (26) : of the Registrar General of Births. [Cd. Iviii. Harris. i. p. vol. Econ. (p. Bowley. "Notes on the History of Correlation. 1917. Fit of Regression Lines (19) (p.. (27) S. 497.. C. Oxford. 237. "An Exact Formula for Spurious Correlation. 1914-20. D. Logarithmic Correlation. with an Application to the Distribution of Ages at First Marriage. vol. 1916-17. vol. Stat. and G. vol." Biometrika. 2^ livr.." Biometrika. 273). the description of type frequency curves contained in references (1) and ' (3) of p. Stat. 1916-17. 1918. 309. Series A. ' (p. the Partial Correlation Ratio. (39) Edgbwobth.) ) ) — SUPPLEMENTS Partial Correlation (28) —ADDITIONAL llEFERENCES. Mathematical Theory of Statistics " (1912). "Tables to facilitate the Calculation of Partial Coeflicients of Correlation and Regression Equations. 105. Ixxix. xi. 273). Stat.." Genetics. . xvi. .. "On the Mathematical Representation of StatistiIxxx. T.. 1916. vol. 273. Karl. vol. "On Numerical. C. Udny Yulb. tion of some Bacteriological Methods employed in Water Analysis. 1921. (29) Vj{-i--rls){\-A) and ris'-^a/ N/(l-r?3)(l-4). 1919. the undermentioned deal with the forms which are suitable for the representation of particular classes of data.) The advanced student who desires to compare the merits of different frequency systems proposed. Second Supplement to a Memoir on Skew Variation.. de Stat.. 389 252)." Jour. Soc. Roy. p. Series A.. (p. F. "On .'' (Completes Bhil. issued from the Astronomical especially "Contributions to the (These papers are concerned with the general theory of frequency systems . the Statistical InterpretaGkbbnwood. and Partial Correlation Ratio Kblley. vol.. xci. BoRTKiBWicz. ix. Dbtlefsbn. 50. 225. xiii.. p. 211. 411 . Bm/. should consult the following : (38) Charhbr. " Realiamus und Formalismus in der mathematisohen Statistik.. x." Explanation of Deviations from Poisson's Law in Practice. Numerous papers Department of Lund. "Ueber die Zeitfolge Zufalliger Ereignisse. p.) Peakson. pp. Boy. p. L. Frequency Curves (37) Pearson. Part (30) IssEKLis. 1915. vol. Random Occurrences in Space and Time when MoKANT. VON.. (Applies a criterion developed from Poisson's limit to the discrimination of water analyses numerous arithmetical examples. (Continues the discussion initiated by the paper of Miss Whitaker. tome xx. cal Data. especially statistics of epidemic disease.." Bulletin. vol.. 314). 266. iii. 492. of the University of Texa s.. 1916." Allgemein.. 429. Y." Proc. p. cited on p.. 322. vol. 1917. "Fluctuations of Sampling in a tion. 1918.BiomciJ'ifca. 599." Jov/mal of Hygiene. 1916.. . vol. de rinstitui Int. "An "On See also reference 40. No. " On the Partial Correlation Ratio. V.. L. Arch. ocxvi.. p. 6. 36. ii. Soc. (33) (34) (35) (36) L. . Karl. Ixxxi. Trails. 27." Bull. J. L. vol. 1916. "Student. p. p. Mendelian Popula- The Law of Small Chances (32) (p. 466 65. BoBi'KiBWioz." .. followed by a Closed Interval. Sampling of Attributes (31) (p. p. p. 1906-12.. 1915. Soc.. M. (Tables giving the values of L. VON. A. E. (After correspondence with Mr Fisher I wish to withdraw the statement on p.) from Contingency Tables. Karl. D. L. Soc. vol. and Hilda P. xxxi." Appendix A (Contains a to vol. xoiii. 1916-17. xi. pp. with experimental illustrations. (52) SOPER. vol. "On the Distribution of the Correlation Coefficient in Small Samples. 1917. of Census of the Commonwealth of Australia. ^ p." Jour. p. vol. G. 255. vol. H. M... U. Sir Ronald. 212 and 225. vol.. Sir Ronald. 1910-11." Phil. G. 328." Journ. the Study of u. (The appendix to this paper summarises the author's results and those of Sir Ronald Ross . full discussion of the application of various frequency systems to vital (41) Brownlee." Pts. 367). xi. Ixxxi. p. p. STATISTICS. 262.) ) ) 390 (40) THEORY OF Brownlee. Ixxxv. 1920. J. vol. Probable Errors : General References (p.. Karl.. p." Biometrika. 1922. 292. p. G. (51) IssERLis. (45) (46) MoiR. 369. 1916-17. vol. xxx. 87. 1913. . Soc. that a full proof [of the general theorem as ap])lied to contingency tables] seems he has convinced me that his proof covers the case.. Karl. H. Ix. 1916.. "The Mathematical Theory of Population. Boy. vide infra. x." Proc. 1917. 97 of this paper. Hudson. p." Biometrika. Soc. Boy. A.. 1922.." Jour. " On the Application of the x^ Method to Association and Contingency Tables. (Numerous graphs of mortality rates in different xviii. D (6th series). Boy.. "The Mathematical Theory of Random Migration and Epidemic Distribution. Roy. "On the Interpretation of and the Calculation of P. A. (63) Pearson. and III. priori Pathometry. p." Biometrika. Actuarial Soc. II.. Edin." Trans.. Ixxxv.. (49) Fisher. and Others. "Certain Aspects of the Theory of Epidemiology in Special Reference to Plague.) Application of the Theory of Probabilities to (42) Ross. 204. vol. Ixxxiii. 28.. 85. Medicine. Soc. Stat. (48) Pearson. p. vol. "Mortality Graphs.. Sect. with particular reference to the Occurrence of Multiple Attacks of Disease or of Repeated Accidents. Stat. . xcii. Epidemiology aiid State Medicine. 75. vol.. i. (50) YcLE. (44) Knibes. Pearson. " On a Brief Proof of the Fundamental Formula for testing the Goodness of Fit of Frequency Distributions and on the Probable Error of P. Boy. "An "An statistics. 1916. 355). J. "On the Value of a Mean as calculated from a Sample.. Greenwood.. vol. Theory of Probabilities to the Study of a priori Pathometry. vol. U. Goodness of Fit (47) (p. R. Stat. A. Soc. "On the Probable Error of Biserial ?.." Proc. Mag.. Boy." Proc.. still to be lacking : See also reference 19. (A modification of the goodness-olfit test to cover such statistics as those indicated by the title. 311. p.. Proc. vol. Boy. " An Enquiry into the Nature of Frequency Distributions representative of Multiple Happenings. Soc. Soc. 315 and p. p. Application of the (43) Ross. Soc. 95.. "Multiple Cases of Disease in the same House. p." Jour. Stat. H.. 1918. classes and periods. America. and Yule.. 1918. Boy. " Journal of the Board of Agriculture. xiii. v. T. Wood. 16. Robinson. 1912. xii. p." Metron. R. E." Univ. D. etc. M. 263." Jour.. J. Harris." Jour." Proc. L. 1921. (With an appendix by ' ' Student " desftribing the chessboard method of conducting yield trials. S. of work has been done on this particular branch of the and the following references may be useful Berry. (App. of Illinois Agr.. D." Biometrika. and R. Gauthier-Villars. T. xiii. B... J. p. L. Le jeu. Exp. 1916-17. 89. 1914. 283. 1910. Supplement 7. 430. Mercer. p. 1911.. Contingency without Approximation. Mitchell. 140 and 185. A.. Science. 1039. of Agronomy.. "The Interpretation of Experimental Results. Biometrika. Collins. Paris." Proc. iii.. p. 144. "On the Mathematical Expectation of the Moments of Frequency Distributions... "The Feeding Value of Mangels. of the Partial Correlation Coefficient in Samples of Thirty. Hall. 1921. Soc." Jour. des prohahilitis..." Amer. 1916.. A. Young. feeding experiments. and H. 391 of a vol. and F. 6. Agr. M. " On a Criterion of Substratum Homogeneity (or Heterogeneity) in Field Experiments. Science.. vol. "The Experimental Error of Field Trials.. D. p. p." Jour. 1913. H. vol. (Contains a collection of papers on error in field trials. A. and Kakl Peakson.. . Russell. vol. Agr. vol. et le hasard. Stkatton. and O'Brien. pp. vol. vol.. Agr. vol." Pt. "Some Experiments to estimate Errors in Field Plat Tests.... Wood. 417. Roy. vol. i. 1905. "On the Probable Error of Sampling in Soil Surveys.. Paris. R. 1910. "On the Probable Error of a Coefficient of Correlation deduced from a Small Sample. B. (72) Baohklier. S. "On the Probable Error Coefficient of xi. L. A. J. p. 1918-19. Wood." Jour.. II. T. H." Jour.Siomein'fe. Flammarion. 1920.U." Jour. (62) Hai. xiii. (58) 1920. Naturalist. "Errors in Feeding Experiments subject. Science. Station. Agr. 1911.. p.testing. p. and A. p.. Bulletin 165. Wood. Fisher. vol. Agr. la chance. 225. 1916. Agr. Andrew. 1921." ." "An Experimental Determination of the Probable Error of l)r Spearman's Correlation Coefficients. 361). (55) Editorial. "The Element of Uncertainty in the Interpretation of Feeding Experiments.. Surface. W. a. horticultural work. iii. 113. B. p.l. 1916. "On the Probable Errors of Frequency Constants. F. TIL. "An . W. Amer. viii. p. Geindley. T. Agr. W.) — SUPPLEMENTS (64) —ADDITIONAL REFEKENCES. etc. "The Interpretation of the Results of Agricultural Experiments. xi. Lloyd. 3. Pickering. Science. 1921. p. i. S. TcHOUPROFF. 275. (69) Errors of Sampling in Agricultural Experiment. Baohblier..) Lyon. 107. Sci. 4. "Variation in the Chemical Composition of Mangels. milk. A. B. Soc. p. J.. 1911. ''A Method of Correcting for Soil Heterogeneity in Variety Tests. 215. G. and vol. vol. (56) "STtTDENT. and Raymond Pearl.. B. Science. H. and W. E." Biometrika. A. : A good deal (60) (61) with Cross-bred Pigs. xovii.. vol. No. Calcul tome i.. iv.. T. vol. (63) (64) (65) (66) (67) (68) (69) (70) Works on Theory (71) of Probability. A. Research. Bebuy.. vol. iii.. p. Experimental Determination of the Distribution (57) BispiiAM.. S. Addendum (78) York (Maomillan). King. (77) Fisher. i. Bruxelles (Dewit). Seidel. 17 on p. J. 1920. 1917 (Layton). FoRCHER.. London. (79) Jones.. Hugo. to Frequency Curves amd Correlation.. & Columbus. L. Statis(80) : tique thiorique. P. Introduction to Mathematical Statistics. " Ajjplieations of Mathematics to Statistics " the two Parts can now be purchased separately. JuLiN. Wien. Adams Co. (81) Keynes. Palin. L." 392 (73) THEORY OF STATISTICS. The Mathematical Theory of Probabilities and its Application to Frequency Curves and Statistical Methods. 1913 (Veit). 1921) in the -series entitled " Les maitres de la penste soientifique. W. An inexpensive reprint of Laplace's Essai philosophique {ve{. London. 1921. 1915.. Die statistische ForsehwngsmetAode. Arne. London. 1921. (This edition has been much extended in Part II. Die statistische Methode als selbstandige Wissenschaft. Paris (Rivifere). A. Bell & Sons.. A First Course in Statistics. BowLET. Principes de slatistique thiorique etappliguie tome i. 4th ed. The Combination of Observations.. C.. Elements of Statistics. (75) OzUBER. 361) has been published by Gauthier-Villars (Paris. W. New (76) Elberton. 1918..) David. 1917. 1921... ^ Treatise on Probability. 1921. Macmillan. J. vol. C. (82) West. Leipzig.. M. Cambridge University (74) Bkcnt. A. London. . E. Press. ANSWEES TO. AND HINTS ON THE SOLUTION OF. THE EXERCISES GIVEN. . there would have been 3176 male and 3393 female deaf-mutes. . CHAPTEE . 3. 380/570=2/8. among 1. since 294/490 = 3/5. negative association. Percentage of Plants above the Average Height. since (b) («) (^5)o = 1457. III. : 2. independence. since 256/768 = 1/3. 48/144 = 1/3. .394 THEORY OF STATISTICS. Deaf-mutes from childhood per million among males 222 females 133 there is therefore positive association between deaf-nmtism and male sex if there had been no association between deaf-mutism and sex. (a) positive association. (2) . 216-6.. 0-651. Median. A.2. Standard deviation 21-3 lb. 7. 156-73 lb. (r=17-3. upper quartile 168-4... VI. 0-68. ninth decile 5.. 6|d. 0-6391. 0-29. The assumption that observations are evenly \/n. 20. ratio 114'9. B. lie between '177 . 2. 3. Mean.pq.. V. Ratios: m. <lC. 2nd qual. M=n% 6.. The proof is given in Chapter XV.) 4.|>0.. ratio 115-2.2-l) AB (3) >i{Zx + 23?). Mean deviation 16-4 lb. Note that x cannot exceed J. 4. inference is possible from positive associations of inference is only possible from negative associations if x lie between "183 . BO and '224 . If the terms of the given binomial series are multiplied by 0. proof is given in Chapter XV. 146-25. found by fitting a theoretical : frequency curve.) 150-6 lb. and •215 . CHAPTER 1. =0-77./s. . 7. whence $=12-95. 2. 10s.= 17-5. (1)921. (Note mean and the median should be taken to a place of decimals further is desired for the mode the true mode. = £94 Skewness. that the than Mean. 1200. distributed over the . Lower quartile 2. 8. (4)115-2. VII. 154-67 lb. 1.. . Approximately lower quartile =£26-1.d. (2) Means 77-4. Geometrical means 77*2.i4C and . the standard deviation is the higher the coarser the grouping. 9s. 142-5. CHAPTER 1. 89-0. . CHAPTER 2.) 6330.1.. e/3. An and BC. 0-36. an inference is possible from negative associations if a. 2^d.. (1) 116-0. 100. 1st qual. 3.<i(6a. 200. (2)916. (1) j¥=73-2. 2. 3 note that the resulting series is also a binomial when a common factor [The full is removed. . is 151 '1 lb. VIII. Median.). o. . (3) 88-9.5x .. § 6.-2a. § 6. subject to the conditions y-^O. (2) Jf=73. is Mode (approx. £35-6 approximately. As in (2). n.963. (3) (Note that while the mean is unaffected in the second place of decimals. no inference is possible from positive associations of .] CHAPTER = 0-61.p. Mode (approx. 5. upper quartile =£54 -6. (r = 18-0. (True mode 0-653.d..) 396 No THEORY OF STATISTICS. Note that x cannot exceed J.d. 6.507. 3. r=-l-0'81. expressed as a fraction of the class-interval. X=0-5r+0-5. 0=2/3. corrected standard-deviation is 0'9954 of the rough value.8=-0-13.— ANSWERS the ETC. <2=077. 3 for out-relief ratio. 2. (The latter. n-2.a= -fO-OOr. is given its proper sign. 58 per cent. The 4. 397 intervals does not affect the sum of deviations.) 2'556 in. (against 1240 per cent. hence the entire correction is d(ny -jij) + jj2(0-25 +iP]. and hence errors in the quantities are If r become considerusually of more importance than errors in the weights. 3. The others may be 10. rij3=-f0-60. 0"43. 2 for pauperism. ir3=3-09: r. 8.is=0-694. r=l-3X-l-l-l. and the former consequently probably too high. 12. : 2. CHAPTER XII. except for the interval in which mean or median lies : for that interval the sum is Wj (0'26 +d^). No effect at all. against 2'572 in. " 11. Using the subscripts 1 for earnings. §10. ru. Chap. ffy=2-280. <rx IX. ras= -HO-759. (2) If the down from symmetry. ffs-i2=70-l. = l-414. and in the weights the value found for the weighted mean true value -i-d-r. but it does not hat the second term would become the most important in practical cases. Notice that the n^ and of this question are not the same as the iVj and N^ of § 16.. which can be independently calculated. X^ 9-i. seem probably able errors in the weights may be of consequence. Cf. 1. % CHAPTER 1.) 1. of course. and In this expression d is. (1) written e. 0-30. is too low. CHAPTER XL 1'232 per cent. 9 -31 -1.2s=2-64. XIV. 5. ^ «j(«(-F«) d is the important term. rm= -0-436. mean value of the errors in variables is is d. Estimated true standard-deviation 6 '91 : standard-deviation of fluctuations of sampling 9 '38. (Tx-irn The If r is small. fa = 579. TO EXERCISES GIVEN.3 '37 Xj -h -00364 = . X. and the divergence is 5-4 times this. = 5"23. (a) Theo.$)-6i-Z per cent. M=S. but . and would frequently occur as a fluctuation of simple sampling. Theo. ilf=3-47. The standard deviation of the proportion is 0-00179.1= = '-13'a = '-23. (r (t <r (c) 3. Xi = 53 + 0-127 . = l-llS Actual Jf= 2-48.^.1= — !• +1. r = 5 Actual if= 50 -11. '•23i4= -0-433. (r=l-225: o- „ . The diflFerenoe might therefore occur frequently as a fluctuation of sampling. and the difference from expectation 18. : {A$)l(. S. M=2-97. = = 3. 2.. 4. r]4.2= 12 "9 per cent. Difference from expectation 7 "5 standard error 10-0. n-2 Hence if r be negative. 0-4. or Case III. CHAPTER 1. = l-323 : : = 1-26. The actual 8. 7-13. •/•g4. 9. seem to indicate any real variation. the cannot be numerically greater than unity and r 1). fl: (b) . (Ji)/iV=71'l percent.24 «ri. 0-2. p = l-(T^/M. : CHAPTER Row. The standard deviation of the number drawn is 32. and thence €.significant.. 49-2..3= '"l2-3 -1. (A)/N=S7'6 per cent. 7i=110-4. 5. 5. and might. . only fluctuations of sampling.23i = 9-l?. . . XIII.: (^ j3)/(3) = 80 "0 per cent. actual actual Theo. »•«.. M=50. »-]2. iIf=2-5. therefore. and thence ei2=3-40 per cent.iB= +0-397. The actual difference is less than this. rather infrequently.398 2- THEOKY OF STATISTICS. (a) (AB)/(S) = 69-1 per cent. = l-40. occur as a fluctuation of simple sampling.f2+0-587 Jfg + O-OSiS X^. The correlation of the pth order is r/(l +^r). -0-149. 6. M=6. correlation of order 4. Standard deviation of simple sampling 23 -0 per cent. The test can be applied either by the formulae of Case II. 0-3. XIV. and therefore almost certainly .24= +0-803. 10. (r = l-14. ir = l'7S2 : Actual Jf=6-n6. r.i2= 12-5. n^M/p :^=0-510. 103 = 105 -4. There is no significance. Difference 5-8 per cent.134 'u 3J= +0-680. standard-deviation does not. J!f=3-5. cannot exceed (numerically) l/(n -J-12. (i) {AB)/{B) = 70-1 per cent. The actual difference is 1'7 times this. Case II. is taken as the simplest. Difference 10-9 percent. ir=l-732. 13= -0-553. 71=12-0 :p = 0-454.3 = 723. ETC.ANSWEKS. .. 399 CHAPTER (1) XV. TO EXERCISES GIVEN. 50. 4. n=100. 3. upper Q. and the question is solved on determining its magnitude. frequency 1554. standard error 0-28 lb.orO'76 per cent. standard 4. 6.400 THEOKT OF STATISTICS. . 2. 0'18 lb. of the standard-deviation : error of the median.. Lower Q. Estimated frequency 1472. 5. In fig. of that range. 1. sector with an angle between one and two right angles. 0'24 lb. standard error 0-26 lb. the standard error of the semi -interquartile range is 1"23 per cent. less than the standard error 0'34 lb. suppose eyery horizontal array to be given a slide to the nght until its mean lies on the vertical axis through the mean of the whole distribution : then suppose the ellipses to he squeezed in the direction of this vertical The original quadrant has now become a axis until they become circles. 00196 in. frequency 1116. 17 per cent. r. 0-0316 0-0304 0-0266 0-0202 0-0114 0-0 0-2 0-4 0-6 0-8 0-096 0-084 0-064 0-036 . CHAPTER XVII. O'l «=1000. metic. deaths from (law of small chances). 54-56. generally.. use of terms " error of mean square " and " modulus. 265-266. 15. the problem. refs. Association. 25-59 def. Airy. in ignorance of third. 52-53 . Ages. 0. Association. Agriculture. 51-54 refs.. deaths and occupation.. of estates in 1715. followed by citations of the authors' papers or books in the Usts of references. hair and eye-colour data cited from. or theoretical results of general interest . Theory of Errors of Observation. 204-205. 57. 32-33 . of grandparent." 144. 61. women . testing by . See Mean. 391. of husband and wife . 236-237. (correlation). Ammon. 2. 28 . 164 . deaf -mutism and imbecility. 42-59. 45-46 ... 78 . 333 . 3). arithmetical treatment. Gottfried. in normal correlation. and offspring. defects in schoolchildren. 177. inoculation. 42-43.order frequencies.. .] Ability. of dweUing-houses 83 . Abriss der Staatswissenschaft. 35-36 . arith(table).. parent. 53-54 . 173 constants (qu. standard-deviation of. 216-217 . 377-378. degrees of. [The references are to pages. Aggregate. 33-34. at death of certain (table). in all such oases the number of the question cited is given. 189. 46-48. testing. complete independence. coefficients of. refs. 44-48 . number of. In the case of authors' names. def. 34-35 eye-colour : . del. Pearson's coefficient case of . Annual value examples deaths and sex. Anderson. of classes. 198 . illusory or misleading.. Aohenwall. constancy of difference from independence values for the secondorder frequencies. 333.. comparison of percentages. B. 37-39 . Array. citations in the text are given first. 388. O. 159 56-57 based on normal correlation (refs. colour and priokfiness of Datura fruits. 10-11. 39-40. eye-colour of father and son... 29-30 . Sir G. 100 diagram. table. 101.. 30-35 . 90-102 relative positions . for n attributes. INDEX. 40. errors in. Accident. 48-51 total possible . refs. partial. 208. 319-321. ). Agricultural labourers' earnings. generally. Arithmetic mean. 401 26 . general. The subject-matter of the Exercises given at the ends of the chapters has been indexed only when such exercises (or the answers thereto) give the constants for statistical tables in the text. use of ordinary correlationcoefficient as measure of association. experiment. diagram. total and partial. 379381. 360. Asymmetrical frequency-distributions. refs. See Earnings. 36-37. correlation difference method. Eefs. 44 . Caktd des probabilites. 360. illustrations sampling of. hasard. 25-59. Brownlee. 149-150... Averages. 1 and qu.. 389. 357. W. . A. 9-10. J. refs. 391. Elementary Manual of 360 . use of 1. H. 108 . 95. refs. ultimate classes. J. von. 12 positive classes.. 121-122. See Mean. 10 order and aggregate of classes. tables of squares. 107108 . R. 37. Bateson. L.. refs. 354 360. 369. 313. 391 . data cited from. 299-300 deduction of normal refs. Reports on index-num130-131. on sampUng. frequency curves coefficient. .. J. L. Laws of Tlwught. 354 . 294. Address by 354. positive and negative. cost of living. graphic method of forming 371 a representation of series. 122. refs.. 295-297. def.. Attributes. E. table. 332. refs. average in sense of arithmetic mean. 391.. 389 .. Betz. A. refs. generally. Brown. 261-262. W. 273. cost of living. Bertrand. 6. 392 Bennett. see Stature. Measurement of Groups and 354 . stature. Series. . on pauperism.. .. 113-114. 226.. Oours eUmentaire de statistique.. L. refs. errors in feeding experiments.. (epidemiology). 188. effect of errors on an average. A. 23. L. 279-281. refs... 356 . forms of. Bernoulli. W. refs. mean. A... 273. 391. consistence of class-frequencies. 252.. 17-24 (see Consistence) . in correlation. refs. law of small chances. median. 193. correlation. 390. Boole. Barometer heights. T. G. W. L. curve from.. 2). 259 (qu.. 1-59. Bravais.. 301-302 258. law of small chances. generally. .. W. 402 of THEOKY OF STATISTICS... refs. Axes. Ueber Korrelalion. Asymmetry in frequency-distributions. and modes. Bias in samplin<j. refs. 360. Bowley. 321322. Bielfeld. def. Blakeman. ref. Booth. 377-378. 343. notation. P. Theorie des probabilites. 291-293 295 of. refs. 96 diagram. Brown. variation in mangels. . index correla- tions. J. 360. F. errors of sampling in partial correlations. Mode. probabilites. Baohblibe. . theory of. 360.. principal.. 106-132 . 297-299 refs. diagrams. Borel. 274. Statistics. J. 78. Elements of Statistics. 387.. 209... Charles. . Bateman. 195. 360. 97 means. 129-130. data cited from. Refs. refs. time-distributions.. la chance etc. association of. refs. and mode in. " statistics. 107 . 7... Ars Conjectandi. . J. . 336-337.. 387 Prices and Wages. law of small chances. British Association." Binomial in 291-300 . refs. F.. 10-11 . measures of. 388. effect of experimental errors on the correlation226 TTie Essentials of Mental Measurement. 107. . . ref. Beeton. 254-334 (see Sam- pling of attributes). 314. 67. 1914-20. 354. 14-15 . . See also Frequency-distributions.. Bowley on sampling.. Median.. 13-14 . Calcul des 391 Lejeu. refs. data cited from. 88.. Miss M. weight. . Weight bers . J.. medians. genesis sampling of attributes. desirable properties of. calculated series for different values of p and experimental n. Bortkewitsch.. 109 refs. Bispham. tests for linearity of regression. mechanical method of forming a representation of series. Berry... von. Bertillon. 366 . direct determination of mean and standard-deviation.. probable error of contingency coefficient. 377-381 (see Association) word of. Barlow. et le L.. Baron series. . 353.. J. 82-83 .foragroupedtable. resolution of a compound normal curve. refs. 104. refs. 167. 175-177. 76. of correlation. (qu. 360. . 164 . Cave. 379-381. 226. refs. 211-212 . 30 . Beatrice M. example of Brunt. conditions. classifica- tion of ages. 103 .. (including correction of moments generally). for one or two attributes..). INDSX. Correlation. 8). 21-22. by dicho- tion to correlation tables. standard-deviations of arrays. 10 . correlation difference method. . of variation. 9 . 7. 391. generally. data as to ages of husbands and wives cited from. 76 homogeneous and heterogeneous. construction of tables.. etc. partial or multiple contingency (refs. between movements of two variables. 363-367 . 175. Classification.181-]88. Cholera andinooulation. 205 . 202-204. 225-226. refs. def.. 33-34 classification of occupations. 208-209. 315. . 61-63 .. Contingency tables. 159. calculation of coefficient for ungrouped data. 208. 140. 174. def. 256. H. def. errors of agricultural experiment. 12 .. of divergence from independence. 314. of standard. correlation . 208. 10 .. 388. 197-199. 16. 113-114. 105 . 14-15 data as to infirmities cited from. direct deduction. 388. 223-225 . 198 .. 60-74.. Class. 115. fluctuation method. in sense of complex causation. Coefficient. of class-frequencies for attributes. 387. Consistence. 403 W Collins.. 68-71. 76 . 8. 177- 231-233 direct 181. D. 149. 177.. def.deviation for grouping of observations. 213-214 . . 389. 60 treatment of. representation of frequency-distribution by surface. order of a class. Chance. Cave Browne Cave. 212. Bruns. refs. Childbirth. The Combination of Observations. H. treatment of table by coefficient of contingency. for cases of non-linear regression. rough methods for esti- mating coefficient. isotropy. Census (England and Wales). 3) 189 . 167 174. 265-266. 375-377. Chances. difference E. 250-251. by elementary methods. method. 10. refs. 164. 9). for age and sex-distribution. refs.. 80. 6. 225. as example of a heterogene. refs. def. 201-202 . CharUer. positive and negative classes. contingency. 362-363. F. 76. correlation-coefficient. 8). . S. 64-67 . standard error. 37-39 . deduction. of success or failure of an event. Contrary classes and frequencies (for attributes). C.).. 72 . 18-19 . 388 elementary methods . 273.. 170def. refs. correlation-coeffiConsistence of cients. 204. 391 . naming a pair. L. 198 refs. 392. 389. application of theory of sampling. 64-67 testing . for three attributes. 282-284. cor- . Class-interval. 79-80 desirability of equality of intervals.. appUcacoefficient of. influence of magnitude on mean. V. 376-377. 20. and refs. Correction of correlation-coefficient for errors of observation. Colours. difference method.. 328-331 .. frequency-distribution. 7. 71-72 of a variable for frequency. ahrschdrdichkeitsrechnung und KoUeklivmasslehre.. 24. 80-81.. see Correlation. 8 class symbol. 9 manifold. Cloudiness at Breslau. generally. Correction of death-rates. 351 .. (qu. 23. regressions. ous classification. law of small. (qu. tabulation of infirmities in. of contingency.. 8. ultimate classes. refs. 17-24 . 10. tomy. in theory of attributes. class-frequency. 199-201. of association. 157—253 . diagram. 355. theory of frequency-curves.iUustrations. 157. 116. refs.. choice of magnitude and position. case of equality of contrary frequencies (qu. deaths in. 59. on standard deviation.. 165-167 . generally. distribution or correlation table. standard error of (refs. 349350 . cance of generalised regressions and correlations. 160 diagrams. partial regressions and correlations. 233-234 normal equa. 333. 206-207 . 159 . 4) 334 applications to theory of quaUtative observations (refs. direct. correlation . 189. out-relief. 189 . isotropy of normal correlation table. 247. ooefSpient. diagram. Illustrations and Examples. 249-250 consistence of coefficients. Changes Movements of infantile and general mortality. 175 diagram. Partial. 325 . 239-241 geometrical representation.— . regression. 1) 289. 173 . Correlation. 251252 limitations in interpretation . Refs. 388.). 321-322 . normal. 225-226. application to theory of weighted mean. 177181 constants. pauperism and . 237-238 .). Normal. 387. 216-217. diagram. 3). 199-201. 221-223 . of correlation. Movements of marriage-rate and foreign trade. correlation in theory of sampling. contour lines. 185-187. 218—219 effect of adding unoorrelated pairs to a given table. 250-251 . correlation due to heterogeneity of material. Ratio. facing 166. deduction of expression for two vari- constancy of 318-319 . Sex-ratio and numbers of births in different districts. efEeot of errors of observation on the pauperism and out- reUef. correlationratios. 189. correlation between Twodiameters of a shell {Pecten). 245ers. 3). treatment by partial correlation. and motherLengths of daughter-frond in Lemna minor. 197-199. see below. 229-231. 317-334. standard-deviation of arrays and ables. 229-253. 342. 234-235 . 352. of the partial correlation-co- . 327. 236-237. 189 . . . on assumption of normal correlation (Pearson's coefficient) (refs. For Illustrations. of . Refs. . the problem. 404 THEORY OF STATISTICS. 213-214 . 207 . 238 arithmetical treatment. 188. 328-331 outline of theory for any number of variables. 176 . 189. 236 reduction of standard-deviation. constants (qu. 331-332 . to fourfold form round medians (Sheppard's theorem)^ (qu. coefficient for a fourfold table. correlation between indices. 175. Statures of father and son. 252 . 320-321 . 389. 332-333. coefficient for a normal distribution grouped linearity of regression. 196-197. 158 . 219-220 . Weather and crops. signifitions. 215-216 . constants (qu. 387. possible pairs of values. 271. 321 . relief. partial correlation. outproportion of old and population. 7) 275 and (qu. fundamental theorems on product-sums. 182-185. for all 40. Correlation. correlations . . : 319-320 normality of linear functions of two normally distributed variables. partial. testing normaUty of table. 245-247 coefficient of «-fold correlation. Earnings of agricultural labour- N in pauperism. standard error of coefficient. fallacies. of contour-lines fitted with ellipses of normal surface. 217218 . 195-196. 238245 representation by a model. and regressions in . 387. constants (qu. 241-245. diagram of diagonal distribution. 322-328 . constants (qu. . 207 . Discount rates and percentage of reserves on deposits. 312-328 . . testing for normality of correlation table for stature. 192-195 . 247-249 expression of . 208-209. direct deduction. standard-deviations of arrays and comparison with theory of sampUng (qu. 3). 163. 204-207. 2) 189. diagram. 239 correlation-ratios. 362-363 notation and definitions. 3). (qu. terms of those of higher order. 333. constants (qu. Ages of husband and wife. 3). 162.ratios. 388. Old-age relation-ratio. 174. Fertility of mother and daughter. 286-289. 161. facing 166.. principal axes. of a sum — or difference. 252. De Morgan.. 144-. 154 calculation of. Cosin. refs.. mean. 77 . proof that arithmetic mean exceeds geometric. probability. association with with occupation 32-33 sex. data cited from.) 387. 358. refs. 252-253. 358. .. Czuber.. 100. correlation of movements. . refs. 282-284. tables. 282-284. other names for. table. 37. efficient. J. omega-funcref a. deaths from explosions in mines. 204. 211212 range of six times the s. Cournot. Cotsworth. refs. 15 census tabulation of. standard. 144-145 refs. 361. refs... 23 . deaths in childbirth.. . A. little 135 affected by small errors in the mean. 177. E. D. 62-53. E. 104. 211 .. Darwin.. round . Refs. 158.. 389. 45-46. amongst offspring of deaf-mutes. 144 . : . Detlefsen. criteria (refs. root-mean-square. correlation. N consecutive natural numbers. 138-141 influence of grouping. 287-288 inapplicability of the theory of simple sampling. 188. quartile. 14-15. . diagram. (qu. refs. values of estates in refs. 98. applications of theory of sampling deaths from accident. 273. 134-144 def.. 234. Wahrscheinlich. 204-207 error. 143 . and Deaf-mutism. association of. 387-388. data cited from. the median. 146 of magnitude with standarddeviation. 12. For' standarddeviations of sampling. Eefs. Crelle.. B. Deciles... 389. of rectangle. Davenport. 33-34. 236-237 . tiache ForschungsmetJiode. effect of errors of observation on. . table. 285-286. A. 10) 275. 134 relation toroot-mean-square deviation from any origin. for age and sex-distribution. keitsrechnung. 142-143 table.. 331-332. death-rates. refs. . 135-137. 265-266. 361. see Error. (qu. data cited from. 102. 143 . A. partial correlation in case of normal distribution of frequency. A. . 287288 . 389.. . 97 . of an index. 134-135 is the least possible root-meansquare deviation. 150-162 337-341. and 1716. Deviation. Defects in school children... Deviation. of a series . 140. INDEX. refs. A. 252. 7) tages Crops and weather. correction of. 38. 361 Die statis- curve. 197-199 . 252 . a. 388 . Dareishire. in England and Wales. compounded of of others. standard error of. 265. L. Theory of Probabilities. 146-147 . 304. of generalised deviations (arrays). 155-156 comparison of advanwith standard-deviation. 378. Charles.. infantile and general. partial. 319-320. 392. C. 1881-1890. statistical cited from. 299-300. 309. De Vries. 223-225. 332-333. standard 352 . data as to Pecten Refs. 405 partial association and partial correlation.. 214r-215 . contains the bulk of the observations. Correlation ratio. tions. Formal Logic. B.. association between colour pricldiness of fruit.. is least . 196197. of arrays in theory of correlation. of normal . See Quartiles.. theory of Crawford. . E. 358. (partial correction for age-distribution). 314. 135 calculation for ungrouped data.d. 209 . 226. refs. 140-142. association with imfrequency becility. Cost of living. 210-211 .. from diphtheria. Cunningham. multiplication table. Datura.. M. standard. 52-53 . standard. 130. of binomial series. 205. 144-147: refs. 145-146.. H.. def. 134 . G. .. illustrations of 128. 269-270. See Deviation. multiplication table. . Deaths.. generally. 260-261. correlation. for a grouped distribution. fluctuations of sampling in Mendelian population. ref. 38. 163 diagram. relation between geometric and arithmetic 9). 252. 354 . 6. 2. 154 .. actual or virtual. Exclusive and inclusive notations for statistics of attributes. 246. For general references. Refs. 154. of arithmetic mean. 344. . 72 . 256-257 . .. Dice. Normal 166. mother. normal correlation. 239-247 diagram of model. 406 THEORY OF STATISTICS.. ref. translation of Meitzen's Theorie der Statistik. 147 . for father and son. 358. table. 49-51 . 328. 144. T. 273. 267. 63. of dispersion. of moments. J. parent. See Sampling. probable errors. 38. 358 Palin. curve. Fallacies. Difference method in correlation. 98 . when chance of success or failure is small. quency-distribution.. records of throwing. 180 by partial correlation. . 258 median. numbers. illustrations. 344-350 . A.. 147 . Hamilton. ). association between father and son. theory of. standard. 34-35. 273. 48-49. genertheory of ally... Dufiell. mode. curve of. R. 354-355. D.. Everitt. 1. Doodson. as illustrating theory of sampling. 273. p. median.. dissection of normal Elderton. 46-48. Explosions in coal-mines. 310-311. 70-71. refs. correlation-ratio and test for linearity of regression. 185187. tables of function. 123 See Deviation. 371 . . probable. 389. and mean. table. of number or proportion of successes in n events. gamma(qu. measures 156 . theory of. correlation due to .. 226. — — 215-216 . refs. See also Sampling. annual value of. refs. . 3) 274. 264^-265 . P. 252. terms formeasures dioe-throwings probable error of (Weldon). unsuitability of.. Y. Estates. ages at death from. 144 . F. 133- standard. 107. Refs. . and child. errors. See Value. mean. tables for testing fit. 358.. Falkner. law of . 289. correlation with : out-relief. pauperism and Edgeworth. 392. 352 . range as relative. 156.. Indexcorrelation. 70-71 association between grandparent. 352 . . etc. correlation between Duckweed. 149 a measure. refs. curve. 97. of coefficients of correlation and regression. 135-137 . 337-341 . def. Discounts and reserves in American banks. . standard Distribution of Frequency. owing to changes of classification. 273.. refs. mean square. 265-266 . 288. 314.. 333 law of error (normal law) and frequency-curves. 14^15. in interpreting correlations " spurious " correlation between indices. refs. facing . .. Dispersion. constants (qu. see Error. W.. in theory of sampling. 351 . of mean square. Error. 197-199 . etc. 2) 189. of standarddeviation and coefficient of variation. 361. 390-391. of quartiles. Eye-colour. in interpreting associations theorem on. when numbers in samples vary. J. 388. sampling. diagram.and daughter-frond. tables for calculating Pearson's coefficient for a fourfold table. Diphtheria. calculation table of powers. H. etc.. 130-131 188. 145 . theory of. Frequency Curves and CorSee relation. normal correlation surface. See Freof . quartiles.. 267 . Dickson. Duncker. mean Earnings of agricultural labourers calculation of standard-deviation. diagram. 354. Error. 387. of percentiles (median. 315. 177-181. 239. 144. F. Quartiles. in sense of semi-interquartile range. 61 66-68 non-isotropy of con tingeney table. 154. 258-259. 333. mean refs. mean deviation.. G. (qu. 358 . testing for significance of divergence from theory. 370-373 . 53-54 contingency with hair-oolour.. Deviation. 144. refs. deaths from. ref. Fit of a theoretical to an actual frequency-distribution. 380experimental illustrations. normal. ability. of barometer heights at Southampton. T. Fisher. of material. measures of dispersion. 375-386 contingency tables. ally. H. 367-386 . 384 . N.. . testing goodness of fit. 301-313 tion. Normal curve . 375-377 association tables.. 370-373 . . Feohner... 161. . of degrees of cloudiness at Breslau. of fecundity of brood mares. construction 84. illustrations and examples. A.. extremely asymmetrical (J-shaped). etc. illustrations : of death-rates in . index-numbers.. 389-390 . refs. testing goodness of fit. L. refs. probable errors. hypergeometrioal . 388. 314. table. 354. 387.. E. . 87-105. . G. ref. 361. of pauperism in . of weights of males in the U. 95 . 83 . 385-386 refs. See Binomial series . 77 of ages at death of certain women. ' . . . 96 . frequency-distribution of consumptivity. 144. correlation. Galton. diagram. theoretical forms. 3 . P-table for use with association tables. 78 of . annual values of dwelling-houses in Great Britain. 152 . of. 175 . 76 formation of. 361.. 87-90. A. Frequency-curve. . 90-98. normal.. W. refs. mode. 96 . 291-300 . Forcher. Feeding trials. refs. diagram. 104 grades and percentiles..gener- 289 . 301-313. 226. refs. .v. 98-102. goodness of fit in contingency tables. ref. 88. See also Correlation. 391. (qu. 129. 131 . errors in. Fountain. facing 166 jSee forms and ex164-167 ... of headbreadths of Cambridge students. 377380 aggregate of tables. Fay. Sir Francis. ref. 98 of annual values of estates. 10. binomial series. ref. errors in. 251-252.. different districts of England and Wales. median. errors of sampling in correlation-coefficient. diagrams. Gabaqlio. 83-87 ideal forms symmetrical... U-shaped. 381 381-384. 78 of stigmatic rays on poppies. index-numbers of Frequency prices. averages. Hereditary Oenius. refs.. inheritance refs... Galloway. 289. Irving. data cited from Marriages of the Deaf in America. 79-83 graphic representation of. . normal curve (q. A. experimental illustration. frequency-dis- tributions. moderately asymmetrical. of comparison frequencies given a priori. 154 . of a class. mean. Fecundity of brood-mares. 387.. and 3) 389-390. normal.K. normal curve... 150. R. 375. Teoria generate delta statistica. 93 . Flux. 392.. 129. comparison frequencies based on the observations. Fisher. measurement of price-changes. of. 96 . 94 . 166. . 102-105 . 90 . constants (qu.). 218difference of sign of total and partial correlations.. Correlaseries (ref.). 407 heterogeneity 219 .. . 104. 391. ideal forms of. 6. England and Wales. A. of statures of males in the 84 U. regression. 391. amples 166.. 105. refs. 374^375 . A. Die statiatische Methode als selbstdndige WissenacJiaft. tables for. 226. 76.. 367-375 cautions. 314. ref. measure of dispersion. def. 195-196. diagram. 103 . 208. teBting. of percentages of deaf-mutes in offspring of deafmutes. 102 . .).K. Field trials. Frequency-surface. 104.— INDEX. . 390. 314. Frequency-distributions. : Frequency-polygon. Fertility of mother and daughter.. Correlation. 87 ... Filon. H. 370-373. T. 100 . refs. 354. . 3) 189 . ages at death from diphtheria. Kollektivmasslehre. Mathematical Theory of Probabilities. (ref. . Treatise on Prob- 392. 208. 315. 131. G. 358. 390 Fluctuation. of petals in Ranunculus bulbosus. 361. 358 . refs. etc. correlation between movements of two variables. 354.. Hart. in rural and urban districts. ref. 270-271. 128. 116. monic. refs. 152. 6. HaU. together with the Observations on the Bills of Mortality more probably by Captain John Oraunt. refs. compound normal dissecting curve. Hull. inoculation statistics and association. 390.. 61-63. Hypergeometrical Series. 159 . influence on standard-deviation. of movements 201. . correlation between ages. 203-204 of Histogram. Principlee of Statistical Mechanics. binomial machine. 328 299 data cited from. Graphic method. See Mean. intra-class coefficients.. S3 . Geometric mean.. 313. 61annual value of. correlation. 289. 354. 173 . errors in. of contingency. Greenwood. Hair-oolotjr : and eye-colour. normal curve. Gray.. H. 46. constants. general ability.. index correlations. H. 188. refs.. interpolation for median or percentiles.. 226 Natural binomial machine. miscellaneous. 68. refs. H. . 272. multiple happenings. inhabited and uninhabited. 226. 85. '' mean 314 . 208. refs. Gauss. Observations on the Bills of Mortality. de. B.204. and 358. 113-114. 115. 388 . H. H. fre- of application of correctionforage-distribution. refs. Gibbs. J. . 388. cited re Cosin's Names of the Roman Catholics. C. M. 253 theory of partial correlation. Inheritance. ref. of representing correlation between two variables. 100. Tables for computing probable errors. 391. method Refs. of. 118. Horticulture. choice of class-interval. Harmonic mean.. refs.. intelligence. short method of calculating coefficient of correlation. 208 . J. . Refs. . har- 176. correlation. refs. John.... 84. 79-80 . Grindley. table. on mean. error in field experiments. Helguero. geometric mean. 84 . Head-breadths of Cambridge students. 153. Graunt. 34. Refs. influence Hooker. 315.. F. 4) 131 .. law of small chances. table. of least squares. 69 . 391. table. ref. 151-152 . C." 144. refs. 154. Husbands and wives. 70. 289 . between two variables. 313 .. 354.. refs. forming one binomial polygon * from another. 188. Heron.. 140. normal correlation. of statistics generally. 295-297. 5-6.. Harris. 332 relation between indices. . (qu. 332. 40 .. Grouping of observations to form frequency-distribution. Hollis. data cited from. frequencycurves (epidemiology). probable error of a partial correlation coefficient.. cor154 . diagram. 130 percentiles. 226 .. . 62 median. 180-181 . 6. theory of sampling applied to certain data. refs. of correlation. The Economic Writings of Sir William Petty. 388. 208 .eto. P. 212. abac giving probable errors of correlation coefficient. (qu. 4. . Geiger. geometric. errors of agricultural experiment. ex- ample 67 .. of estimating correlation coefficient. Geometric (logarithmic) mode.. association. Gibson. errors of feeding trials. 196 .. 389 . John. A. 200. 387. correlation between weather and crops. use of term error.. 269. A.. 270. R. 3) 155. Willard. 252. quartiles. F. Houses. 209 . 391. relation between fertility and social defective physique status. 40 application of law of small chances. . errors of sampling (small samples). Galton'afuiiotion(correlationcoeffioieiit).. of representing quency-distributions. Grades.. diagram. construction History. 390.. (qu.. D. 408 THEORY OF STATISTICS. 83-87 . Winifred. 66- non-isotropy. 391. . weather and crops. See Mean. 252 . 3) 189. S. T. 209 .. D. Hudson. data cited from.. measures of 208.. 265-266 366-367. John. Illusory asaooiations. refs. refs. criterion of. def.. Little. W. 388. 5.. Lyon. T. 389 .' 144. 390. price index numbers. use of harmonic mean. T. M. 360. Linearity of regression. fertility and fecundity. ref. Laplace. 154. refs. following law of small chances. refs.. dependence (association. 392. earnings of agricultural. 185-187. 361. tables.. 352 . use of geometric 127. refs. correlation between lengths of mother. census tabulation of. 314. age statistics. Refs. LobeUa.. error in soil survey. 67-71 of normal correlation table. 129 refs. 38. 390. C. system Lee.. Julin... 14^15 . 273 . form of contingency or correlation table in case of. ' reasoning numerically definite (theory of attributes). ). Essa. appUoation of theory of sampling to certain data. ratio. 328- Larmor. 354 . 226. quency-distribution. Principes de Statistique. refs. goodness of fit test for. 361 .. 56-57 . 128. 359. data as to agricultural labourers' earnings cited from... 15 tions in Currency and Finance. refs. Logarithmic increase of population. J. logarithmic mode. Stanley. indexnumbers. use of word " statis 331 . Erblichkeitslehre. test for. 33-34. Johannsen.. def. Pierre Simon. 273. 164. L. Keynes. refs. Infirmities.. classification of. frequency distributions. Abhandlungen zur Theorie der Bevolkerungs. frequency-curves. crops and rainfall. King George.. 252. 25-28 . . 137.. fre. 71 . 389.. Imbecility. mean deviation least about the median. 205-206. 73 Isserlis. L.. A. 130.. E. 391... v. Sir tical. tistih. Pure Logic and Investigaother Minor Works. 154 . case of complete. See Earnings. Theorie der Massen erscheinungen. 391. W. in a.. 392. Lexis. Refs. A Treatise on Kick Probability. correlation between. Kapteyn. cholera. 226. tistics. Independence. 126127 . errors of agricultural experiment. refs. A First Course in - Statistics. J-shaped 98-102. in correlation table. 375-381. partial correlation- 252. 130-131. Alice. deaths from. laotropy. 48-51.. association between deafmutism and imbeciUty. 208. correlacontingency... of a horse. 387 . Marquis de. probable error of median. 361. W. J. 68 . 344. 215216 . refs. Jacob. graduation of. associations with deafmutism. 272... 387-388. 226 tables of functions. generally." 4. refs. . 358. inheritance of 160. 269-270. Indices... 40 Feohner's Kollektivmasslehre. 80-81 . G. Koren. refs. L. normal curve. 122. Elemente der exakten ... Index-numbers of prices. Befs. for attributes. use of term " precision. 33-34. tion. for attributes.. Refs. probable error of mean. W. G.. J. 392.. H.. 287. Intermediate observations. Jevons. C. refs.. etc. Skew Frequency-curves in Biology and Sta- 130. 126 use of geometric mean for. 161. 314. conditions for real significance of probable errors. INDEX.and daughterfrond. Inoculation. 130 .und Moralstatistik. Inclusive and exclusive notations for statistics of attributes. 129. refs. J. M. 105. Lemna minor. 125-126 . Oeschichte der Sta- Jones. 361. 388. 96. 409 Kelley. 14-15. W. refs. 392. philosophique. 38. Theorie anaZytique des prob abilites. . Knibbs.. refs. S. Lloyd. 15 . Lipps. of mean. History of Statistics^ Labouebrs. 354. examples. F.. 379-381. . logarithmic or geometric mode. 113. Mendelian breeding experiments as illustrations. slight influence of outlying values on. errors of feeding trials. refs. nature of. H. 121-122. 84. W. 144.. correlation. weighting of. 226. Theorie und Teehnilc der Statistik. 3-5. 125-126 convenience for index-numbers. for ageand sex. statistical. Macdonell. 117 graphical determination of. 130. .. 223-225 refs. 304. (refs. refs. diagrams. Maxwell. less than arithmetic mean. position relatively to mode and . harmonic mean. 199-201. 226 . cultural experiment. when numbers in samples vary.) . . 128-129. def. 122 . 108. 220-225 of binomial series. 115 . 128-129 proportions of albinos in litters.. 264-265. B. purport of. refs. Robert von.. . origin from normal curve. standard error of. in estimating intercensal populations. def. def. generally. sum of deviations from. mean and mode. of series compounded of others. 391. 114. 109-113. is less than arithmetic and 128 geometric means. Mean. Median. (qu. Maoalisteb. 8) 156 . is zero. comparison with median.. arithmetic. Sir Donald. (qu. H. position relatively to Meitzen. 221-223 ap. approximate determina- of. (refs.. A. generally. weighting error.. use of word " statistical. errors in. Mode. 108-109". 5. 314.. weighted. 128 .. 124 . . 220-225 March. refs. 225. standard. R. (refs. 389. unsuited to disseries. .. errors of agri- 355. Geschichte. 120 . See Deviation.. 220 . 123 . W. 337-341. from mean and median. John.. 123. influence of grouping. refs. mean is the best for all general purposes. of series compounded of others. .. median. 115-116. Mean deviation. correlation of movements. Deviation. 299 standard error of. . Mercer. . Mice. Clerk. 144 . 116-120. weighting of. calculation. of sum or difference. 118 comparison with arithmetic mean. 120-123 def. Mitchell. ref. def. Milk testing. 267-268. 119 summary comparison with median and mode.) 387 diagrams. 116.. 123-128.. 37. law of geometric mean. 114. 108 . 121-122. 225. calculation of. 387. fluctuations of sampUng in. 116-117. 124 . 115. plication of weighting to correction of death-rates. 114. data cited from. 9) 156 use in averaging prices of index-numbers. 144. for a grouped distribution. 119 advantages in special cases. Mohl. 108 generally. def. Marriage-rate and trade.. 114. 264-265. 390. 6. 208 index-numbers. 108 . 264-265.. 122-123 .distribution. 128. Mean square difference error. 130.. 391. etc. standard Methods. mean. refs. generally. calculation. 116 . 128. 127-128 ." 4. 129 in theory of sampling. 119-120 . 108116.. tion. Geschichte . P. 90.. . 124 . 113. def. of series of ratios or products. refs. . fluctuations compared with theory of samphng. 126-127 . 410 THEORY OF STATISTICS. between weighted and unweighted means. 113.. . 129 difference from arithmetic mean in terms of dispersion. 120. . difference from arithmetic mean in terms of dispersion. refs. . use on ground that deviations vary with absolute magnitude. Milton. weighting of. harmonic. use of word " statist. 109 calculation of... 121diagrams showing position relatively to mean and median." 1. indeterminate in certain cases. See Error. generally. 387. 113-114.) 387. 38. 273. L. 334-350... geometric. continuous observations and small 116-117. Modulus as measure of dispersion. 391. numbers in litters. 226. bi273. 333. . Mortality. 3. 304 series comparison with binomial for moderate value of n. H. 389. NEGATrvE 10. refs. calculation of first. calculation of general methods of curve-fitting. 358-359. J. 355 errors in variety tests. See Death-rates. P. with out-relief. Ref.. proportion of aged. 182185 . . data cited from. 96. 307-308 . partial. 117. contingency. 391. 226 . correlation of characters not quantitatively measurable. correlation of. 192-195.. experimental test of normal law. Moore. . Karl. refs. Morant. 149 inheritance of skewness. 314. normality in fluctuations of sampling of the mean. Refs. etc.. refs. 333. 122. 333. 72-73. Raymond. tility (ref. Pearson. 346-347. dissection of compound curve. Refs. 65 ? mode.. Vital Statistics. New O'Brien. Dictionary of Political Economy. 390... 225 selection. with earnings and out-relief. 209. Order. I. . 154. 160. 144 ooefiicient of variation. 310. 120 . 390. . 391. Refs.. 311-313. 110 second and general.def. normal. 2) 189. Bramley. INDEX. 306 tion . 215 nomial apparatus... fitting of principal axes and planes. probable errors^ . Literatur 411 der iStaatswissen- scJiaften. bibetween indices. 151-152. quartile deviation and probable error. 314 . 177-181. For normal correlation. J. 226 ..W. series. probable errors. testing fit of theoretical to actual distribution. . H. . 209 . 105. 138-140 145-146 . normal distribution of number of seeds in Nelumbium. 289. moments. Partial association. D. 78. . more general methods of deduction. refs. and standard deviations. percentiles. methods. etc. 314. 387 frequency curves. refs. A. . 354. refs. 161. weighted mean. Ref. 310-311. refs. Nixon. 113 cajtable.. 162. Norton. . G. numerical examples of use of tables. Palgkave. means. 105. 208. See Correlation. in two 197-201 . tables. V. diagrams. outline of general. 309-310 .. contingency. 304 table of ordinates. in England and Wales. Pearl. Normal curve of errors . 388. 252. G.. 241. und Moir. inheritance of fer208. .. and its use. 6. . 306. 148 . 96. 40. 389. See Association. . 333 . Partial correlation.. birth-rates.. correlation and correlation-ratio. hypergeometrical series. errors in feed- ing experiments. and fecundity. 315 . quartiles. 195 spurious correla. from. 388. fitting to a given distribution. 226. 305-307 . 303 mean deviation and modulus. dissection of curve.. . 315. deviations. compound normal . 3X5 . 299 deduction data cited of normal curve. 161. of generalised correlations. 226 . 315..) 154. 245-247 . 359. reproductive moments. L. deduction correction Pauperism.. 239-241. 209. 233-234. 135 . (mortality). 111 of median. the table of areas. 63. frequency-curves . Moment. deviation. regressions. 118. 93 culation of mean. del. Statistical Studies in the York Money Market. correlation with out-reHef. Pareto. Newsholme. classes and attributes. data cited from. medians. . partial. 301-302 value of central ordinate. 10 . 314. Sir R. (qu. of a class.. .. 304-305 from binomial 245. 208. 122 stanmean dard-deviation. 289. 70. 389 nomial distribution and machine. etc. standard-deviation. and modes for other years. 225.. fertility. 90. Cours d'economie politique.' variables. 92. correlation between indices. . 149 . 188.. . see Correlation. Movements. 130. 388. for age-distribution. inheritance of fertility.. frequency-distributions. Random sampling. 78 . L. works on. 341-342 . to standard-deviation... in terms of.. 358. Peters. Positive classes and refs.. 147-149. applications of theory of probability to correlation of ages at marriage. methods of correlation based on (refs. facing 166. 52-53. sex-ratio. 283. to median as measure of relative dispersion. 289. for tabulation. refs.. standard errors. S. D. 362-363. 365.d. 102 . 162 diagram. (qu. QtfABTiLE deviation. Precision. Principal axes. J. 154. W. refs. Petals of Ranunculus bulbosus. U. See Quartiles... ratio of q.. L. correlation. frequency of. 304. 310 . ref. 391 . 208. def. refs. 144. refs. 152-153 use for unmeasured characters.. S. 147. 355. Pecten. Percentage. 149. frequency of petals. auffioienoy of. unsuitability of dispersion. PoppieSjStigmatic rays on.. 126 . 152153. non-linear. Ranks. 257. Poynting. 359. Regressions.. standard errors of. 333 . Pickering. 201-202. 226. pling of attributes. 134 generaUy.. refs. errors of agricultural experiment.d. 391. . bilites. refs.. Quartiles. .. 387-388. multiplication table. H. index-numbers of. tables for computing probable errors. of. 130 data cited from Reports.). 341-343. 175 total and partial.. Calcul dea proba361. quartile deviation and semi-interquartile range. . refs.. . . generally.. 147-148 . . 153 . Probability. advantages of q. 126 . advantages and disadvantages. dard errors of. 412 THEOKY OF STATISTICS. attributes. in sense of simple samphng. refs. Perozzo. Relative dispersion. unsuitability of median in case of such a distribution.. ref. 163. as a measure Petty.. 197-199. mean. 333. difference between deviations of quartiles from median as measure of skewness. 253. 150 de. q.. 273. of. 158 . 388. constants. . expression of other frequencies. 256257 . Percentiles. 352 . estimation censuses. . .d.. 6. when numbers in samples See also Samvary. direct deduction. standard error of.. 148. 199-201.. 125-126 . Poisson... refs. Rhind. 13-14. refs. J. 333. use Prices. Becherches sur la probabilite dea jugements. Range. 143. as a 310 measure of dispersion. 3) 189. 130. refs. Reserves and discounts in American banks. 117. Sir W. 390. 130-131.. Quetelet. 285-286. estimates of 224. tables for statisticians. Poincare. 367. theory of. Lettres sur la theorie des probabilites. 149-150 . 98. 388 population. 387. 117. 133. 366. 337-341 . termination. of normal curve. A. 273 .. index-numbers. . 361. 321322 . 116. refs. 10 13 13 . 361. 361. determination. of harmonic mean. J. 352. refs. 272. refs. applications of theory of sampling to experiments in crossing. law of small chances. 222. Population. 129 . 201 . Peas.M. 233 stan: . 263.. correlation between errors of sampling in. 149 . H. between Registrar-General correction or standardisation of death-rates. 391-392. number of positive classes. defs. 32-33. . ref. 150-153 def. ratio of q. 358. refs. Banunculus. 314. 267-268.d. 151-152. 175-177 def. 337341. unsuitability of median in such a distribution.. Economic Writings. correlation of fluctuations. 205-206. correlation between two diameters of shell. 148-149 . 102 unsuitability of median for such distributions. 284. A. of geometric 355. 77.f requency.. Persons.. in correlation. 208209. 264^265.. Saunders.. when numbers in samples vary.. 2) 289. use of words " statistics. 307 . 254355. 273. " statist. series Hypergeometrical series . 163. . . correction of the standard-deviation for grouping. normal normal curve. 355. theory of. Sir R. W. curve . Skew or asymmetrical frequencydistributions. normal curve and correlation. 313-315. constants. errors of agricultural experiment. 390-391. Ross. (qu. 391. " statistics. G. 346-347 . 391. 1. 175.. 273. 225 . of correlation-ratio and test for linearity of regression. 347-350 . Correlation. Sheppard. 317-334. (qu. 207 . of arithmetic 345-346 . 37. del. refs. 262-264. 344.. . E. inverse interpretation. 300-313 correlation. measures of. 337-341 dependence of standard error of median on the form of the disof difference tribution. conditions Sampling of attributes assumed in simple sampUng. standard error of tion of standard-deviation and coefficient of variation. 264^-265 ." 2. 333... (qu. 279-281 effect of removing conditions of samsimple sampling. refs. appUcation of the theory of sampUng to. arithmetic and harmonic means. 176 . 289. 291-300 . 390391. 272also Binomial 389." W. of coefficients of correlation and regression. 351 . 352 . 266. Significant differences. use of word when n is small. E. 276-279 . Russell. mean. 255256. . 287 binomial distribution. W. (qu. Robinson. correlation of polychoric table.. Scheibner. Scripture. 149-150. : See Quar- Sex-ratio of births correlation with total births. 212. . 8 deviation of number or proportion of successes in n events.. (qu. . . 259-262 . generally. 107 .. R. of variables. 267 . 265-266 standard error. 273. ref. Ritchie-Scott.. 90-102. data cited from. P. use of word 3. 354-365. 335standard errors of percentiles Refs. refe. refs. See also Frequency-distributions. 337 standard errors of percentiles. INDEX. 289 . 262-264 . standard 1. conditions asin simple sampling. the problem. sample with theory." tiles. (median and quartiles). 271-272 interpretation of standard error 299-300 . 3) 189 . Sampling. 9) 156. 359. E. J. . 4) 334 normal curve tables. Skewness of frequency-distributions. Rutherford. . Shakespeare. A. refs. random in sense of simple sampUng. theory of sampling. difference between arithmetic and geometric. comparing a. and qu. 258-259 . standard: 354-356. error of ratio of male to female births. Sampling sumed 337 . Normal normal." " statistical.... W. calculation and correction of moments. when chance of success or failure is small. law of small chances. 281-289 pling from limited material. comparing one sample with another combined limitations to with it. 352 . (qu. effect of removing conditions of simple sampling on standard error of mean. limits as a measure of untrustworthiness.. 413 percentiles. Sir John. 11) 275... Semi -interquartile range. 387. 338-340 . error in soil surveys. 268-271 . . refs.. W. between two 341-343 . 314.. application tw sex-ratio.. normality of distribumean. 264r-256. 7) 275. examples from artificial . chance.. Miss E. 344-350 of difference between two means. Sinclair. 390. refs. 267-268 comparing one sample with another independent therefrom. tables of normal function and its integral. refs. (See 273. theorem on correlation of a normal distribution grouped round medians. frequency-curves (epidemiology). diagram.. 256-257. 5) 355.. Small chances. 414 THEORY OF STATISTICS. I. James. " Student " (pseudonym).. Statures of males in U. mean 153 . John. refs. 273. Todhunter.. (qu. . 154. refs. probable errors. 164. 5. 391. . . for . refs. correlation of... method 215- Standard-deviation.. 160 . of a frequencydistribution. 329-331 . 253 . Slutaky. standard. E. . 216 . dia- Thorndike. 223- 225 . law of. errors of agricultural experiment. 387. refs. methods of measuring correlation. refs. 208-209.l distribution. E. lines and planes of closest fit.. 322-328 Names etc. 8. gram. 390 of bi-serial expression for correlation coefficient. 141 . refs. J. Symons. Stigmatic rays on poppies. probable error of correlation coefficient. F. : ment in meaning matical expectation of moments. (qu. distribution fitted to normal curve. expression for factorials of large numbers.. refs. Refs. Tatham.. use of word " statistics " in British Rainfall. J. 88. C. 209. 3.K. 12 . effect of errors of observation on the standard-deviation and coefficient of correlation.. refs. 391. 333 Theory of Mental and Social Measurements. 226. Statist.. (qu. 1 1-14. 389.. def. for age-distribution. of a correlation table. property ninth deciles. L. Statistical. F. of fitted contour-lines. Soil surveys. def. Tohonproff. fit of regression lines. methods. median for such distributions. tables. 91 calcula90 means and tion of mean. 5. Tocher. 174 .. standard-deviation. 355 .. . effect of errors of observation. C. H. A. classes and frequencies. 304. 87-90. 273. diagonp. 389 . 213-214. 226. 164.. 388. freunsuitability of quency. 78 .... refs. test- Type of array. 355. 5.. and . correction of death-rates.. M. M. refs. Normal curve. 273. 388 rank ... def. 333. refs. cited re Cosin's ing for normality. A.. 391. 344-345 of standard-deviation and semiinterquartile range. Tabulation. errors in. Spearman. 341. contingency. 343. See also Frequencydistributions . 6. purport of. 391 errors of Spearman's correlation coefficients. H. 12-13. tables of exponential binomial limit. 209. J. dard-deviation. Stevenson. Robert. diagrams. 325. Stratton.. standard errors of of first mean and median. Symmetrical frequency .. 306 . of median. . sufficiency of . Snow. 355 . 388.. 81-83 . Surface. C. 388. introduction and develop- ment 1-5 . 1. 391. percentiles. introduction and develop- of statistics of attributes. T. 225-. refs.. errors in variety tests. 1) 131 stanof word. for father and son... occurrence of the word in Shakespeare and in Milton. 388. 361. refs.. Spurious correlation of indices. 3) 189 . E. See Deviation. 305-306... 117. M. probable of correlation. 112 medians. 197-201 refs. Soper.. History of the Mathematical Theory of ProbTraohtenberg. facing 166. Account of Scotland. 2 .. 265-266. refs. E. 307-308. error in field trials (chessboard method). 226. correction of. Royals.. Stirling. Statistics. 1) 155 . refs. refs. mathe- 1-5 theory of. diagram of isotropy. 333.distributions. for tabula- tion. I. 3-5. Southey. 130. Time-correlation problem. birthrates.. estimates of population. 327. 37 . constants. 100. 116. law of small chances. Society. 226. Ultimate def. refs. def. deviation and quartiles. 3. 363-367 . 89.. in the meaning of the word. of the Roma/n Catholics. (qu. E. Standardisation of death-rates. diagrams. 391. 314.. 273. 273. Working classes. Wages of agricultural labourers. E. U - shaped frequency . refs. weighted also Mean. . estimating Weather and interoensal populations. 101. Allyn tics. F.. age statisrefs. refs. 252 . 273. use words " statistics. Wood Frances. 253. W. R... Wicksell. . table. 192 data cited from 78. 361.. .. Andrew. 382. multiplication table. refs. Variation. refs.. 388 application of law of small chances. 258-259. 226. 391. theory of. crops. Venn. . 273 . EDINBURGH.. (qu. C. numbers. 370-373. 105. 149 . . Earnings. refs. 154. time-correlation problem. F. 273 .. 130.. etc. study of defects in school-children. Refs. LTD. Vigor. 273. errors in. 387. probable sex-ratio. 388. 361. 226. 130.. E. Whittaker." attributes.. sampHng in Mendelian ratios. 2) 131. data cited from. D. standard error of. 4) 131 . 140. measure of relative dispersion.. 391." " statistical. specification of. table.. Befs. 387. Yule. John. A. Young. quartilea. J.. A. G. coefficient of. refs. Weight of males in U. INDEX. relative dispersion. 57 . Young. refs. 389. T.. Westergaard. Universe.K. 39. generally. 15.. B.. 3) 155. ZiMMEBMANN. index-numbers.. 75... see Mean. 163. 149. statistical. dice-throwing experiments. refs. problem of pauperism. 389 . Water analysis. 226 . H. 351-352. real. geometric Median Mode. Introduction Mathematical Statistics. 415 refs.. 208. 150. F.. . . 5 . 94 mean. consistence. 93. of estates in 1715. Zizek. 163. probable errors. 40. 83 . error of coefficient of contingency. Logic of Chance. law of small 102-105.. 188. West. 100 diagram. refs. Value.. isotropy. correlation between indices.. 252. Lucy. correlation. fluctuations of sex-ratio. 208. 129. Waters. 23.. correlation. index correlations. notation for statistics of attributes.. refs. Variables. def. 390.. birth-rates. Die statistischen Mittelwerthe and translation. W. F. 358. (qu. 196197 . see 185.distributions. 95 diagram. 122. cost of living. goodness of fit in association and contingency tables.. Variety tests. . 17 17. 1.. refs. median. def. 390. 73 correlation. association. 18. facing 186. 2) 155. use of term characteristic lines (lines of regression). refs. def. 387-388.." " in EngUsh. citation of Bielfeld. .. ref 4.." Weldon. Variates. influence of bias in statistics of qualities. median. refs. annual.. A. C. table. 1. 391. D.. 6. WiUcox. 392. 7. (qu. S. 208. Versohaeffelt. methods. 388. Refs. frequency-curves. Theorie der Statistik. 177 . to H. 75253 .. of dwelling-houses. Warner. mean deviation and quartiles. 15. 226.. H. U. PRINTED IN GREAT BRITAIN BY NEILL AND CO. errors of agricultural experiment. sex-ratio.. 355 . W.. Weighted mean. of the (qu.. and mode. pauperism. 388. standard-deviation. history of words " statistics. Wages. refs.. 259. Documents Similar To An Introduction to the Theory of StatisticsSkip carouselcarousel previouscarousel nextFashion Design Case and DastarCorpora in Cognitive Linguisticsall_of_statistics_finalContract TheoryVarian - Microeconomic AnalysisMath Stats TextAuthorship and Datecantonese-english_SLIBootstrapThe Principles of Readability10.1.1.87Introduction to Mathematical StatisticsSalanie 2005 - The Economics of ContractsMood - Introduction to the Theory of StatisticsDeceiving Authorship DetectionR BootstrapApplying GLMMs to Word Counts to Analyze the Literary Style of Detective Short Stories by A. Conan Doyle, G. K. Chesterton, and E. W. Hornung to Detective Short StoriesSampling Theory and MethodsHypothesis Testing3 Vol 4 No 1220120109 Whos Hot and Whos NotBootstrap01. Exploratory Data AnalysisTexas AuthorshipClustering de Textos CortosAuthorship Analysis 4 CcIntro BootstrapShakespeare Authorship FAQQuantitative Data analysisFooter MenuBack To TopAboutAbout ScribdPressOur blogJoin our team!Contact UsJoin todayInvite FriendsGiftsLegalTermsPrivacyCopyrightSupportHelp / FAQAccessibilityPurchase helpAdChoicesPublishersSocial MediaCopyright © 2018 Scribd Inc. .Browse Books.Site Directory.Site Language: English中文EspañolالعربيةPortuguês日本語DeutschFrançaisTurkceРусский языкTiếng việtJęzyk polskiBahasa indonesiaSign up to vote on this titleUsefulNot usefulMaster Your Semester with Scribd & The New York TimesSpecial offer for students: Only $4.99/month.Master Your Semester with a Special Offer from Scribd & The New York TimesRead Free for 30 DaysCancel anytime.Read Free for 30 DaysYou're Reading a Free PreviewDownloadClose DialogAre you sure?This action might not be possible to undo. Are you sure you want to continue?CANCELOK
Copyright © 2025 DOKUMEN.SITE Inc.