Module 1. Descriptive statistics

Lesson 1

DEFINITION AND SCOPE OF STATISTICS

1.1 Introduction

Statistics is a field of mathematics concerned with the analysis of data. For the last few centuries it has remained a part of mathematics, as the original work was done by mathematicians such as Pascal, James Bernoulli, De-Moivre, Laplace, Gauss and others. Until the early nineteenth century, statistics was mainly concerned with official statistics, that is, the collection of information on the revenue, population etc. of a state or kingdom. The science of statistics developed gradually and its field of application widened steadily. In fact, the term statistics is generally used to mean numerical facts and figures.

1.2 Meaning of the Word Statistics

The word ‘statistics’ seems to have been derived from the Latin word ‘status’, the Italian word ‘statista’ or the German word ‘Statistik’, each of which means a ‘political state’. In ancient times governments used to collect information regarding the ‘population’ and the ‘property or wealth’ of the country, the former giving the government an idea of the manpower of the country (to safeguard itself against external aggression, if any) and the latter providing a basis for introducing new taxes and levies.

The seventeenth century saw the origin of ‘vital statistics’. Captain John Graunt of London, known as the father of vital statistics, was the first man to study the statistics of births and deaths. The computation of mortality tables and the calculation of the expectation of life at different ages led to the idea of life insurance, and the first life insurance institution was founded in London in 1698.

The theoretical development of so-called modern statistics began during the mid-seventeenth century with the introduction of the Theory of Probability and the Theory of Games and Chance. The chief contributors were Pascal, De-Moivre, James Bernoulli, Laplace, Gauss, Sir Francis Galton, Karl Pearson, W. S. Gosset, Helmert and Sir R. A. Fisher.

1.3 Indian History of Statistics

In India, an efficient system of collecting official and administrative statistics existed more than 2000 years ago, in particular during the reign of Chandragupta Maurya (324-300 B.C.). From Kautilya’s Arthashastra it is known that even before 300 B.C. a very good system of collecting vital statistics and registering births and deaths was in vogue. During Akbar’s reign Raja Todarmal, the then land and revenue minister, maintained good records of land and agricultural statistics. In Aina-e-Akbari, written by Abul Fazal, we find detailed accounts of the administrative and statistical surveys conducted during Akbar’s reign.

1.4 Definition of Statistics

Statistics has been defined differently by different authors from time to time. In ancient times statistics was confined only to the affairs of the state but now it embraces almost every sphere of human activity.

Webster defines statistics as “classified facts representing the conditions of the people in a state, especially those facts which can be stated in numbers or in any other tabular or classified arrangement”. This definition confines statistics to data pertaining to the state and is therefore inadequate, as the domain of statistics is much wider.

Bowley defines statistics as “Numerical statements of the facts in any department of enquiry placed in relation to each other”. He himself defines statistics in three different ways:

(i)     Statistics may be called the science of counting.

(ii)    Statistics may rightly be called the science of averages.

(iii)   Statistics is the science of the measurement of social organism, regarded as a whole in all its manifestations.

The above definitions are inadequate. The first is inadequate because statistics is not confined to the collection of data; other aspects such as presentation, analysis and interpretation are also covered by it. The second is inadequate because averages are only one of the statistical tools used in the analysis of data; others include dispersion, skewness, kurtosis, correlation and regression. The third is inadequate because it restricts the application of statistics to sociology, whereas today statistics has found application in almost every field of science.
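The point that averages alone are not enough can be illustrated with a small sketch. The two herds and their daily milk yields below are invented for illustration only; the sketch uses Python's standard `statistics` module to show two data sets that share the same average yet differ widely in dispersion.

```python
import statistics

# Hypothetical daily milk yields (litres) for two small herds.
# The numbers are invented purely for illustration.
herd_a = [10, 10, 10, 10, 10]
herd_b = [2, 6, 10, 14, 18]


def summarize(values):
    """Return the mean and the population standard deviation of the values."""
    return statistics.mean(values), statistics.pstdev(values)


mean_a, sd_a = summarize(herd_a)
mean_b, sd_b = summarize(herd_b)

# Both herds have the same average, but very different spread.
print(f"Herd A: mean = {mean_a}, SD = {sd_a:.2f}")
print(f"Herd B: mean = {mean_b}, SD = {sd_b:.2f}")
```

A description based on the average alone would treat the two herds as identical; a measure of dispersion is needed to tell them apart.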

Perhaps the best definition is the one given by Croxton and Cowden, according to whom statistics may be defined as ‘the science which deals with the collection, analysis and interpretation of numerical data’.

Statistics, therefore, is defined as the science of collection, compilation, tabulation, analysis and interpretation of quantitative data. It is essentially a branch of applied mathematics, i.e. mathematics applied to observational data. Statistics essentially means the procedure by which we understand data.

The singular word ‘statistic’ means a particular kind of estimate computed from a set of observations, usually according to some algebraic formula. ‘Statistics’ used as a singular noun is the name given to the body of scientific methods (the statistical methods) which are meant for the collection, compilation, analysis and interpretation of numerical data.

‘Statistics’ used as a plural noun means numerical data which result from a host of uncontrolled, and mostly unknown, causes acting together. It is in this sense that the term is used when our daily newspapers give vital statistics, crime statistics, sports statistics, agriculture and dairy statistics, food production statistics etc.

Statistics has two broad functions:

·         Descriptive Statistics - The first function is to describe and summarize the information so as to make it more usable.

·         Inductive Statistics - The second function is to draw inferences about the population from the information contained in a sample, which is only a part of the population; we thus pass from the particular to the general. Here the induction has to be achieved within a probabilistic framework.
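The two functions can be contrasted in a minimal sketch. The sample of ten milk yields below is hypothetical, and the 95% confidence interval uses the normal approximation, which is an assumption made here for simplicity; for a sample this small an interval based on the t-distribution would be somewhat wider.

```python
import math
import statistics

# Hypothetical sample: daily milk yield (litres) of 10 cows drawn from a larger herd.
sample = [8.2, 9.1, 7.5, 10.4, 9.8, 8.9, 7.9, 9.4, 10.1, 8.7]

# Descriptive statistics: summarize the sample itself.
n = len(sample)
sample_mean = statistics.mean(sample)
sample_sd = statistics.stdev(sample)  # sample standard deviation (n - 1 divisor)

# Inductive statistics: an approximate 95% confidence interval for the
# unknown herd mean, using the normal approximation (an assumption here).
z = statistics.NormalDist().inv_cdf(0.975)  # roughly 1.96
standard_error = sample_sd / math.sqrt(n)
ci_low = sample_mean - z * standard_error
ci_high = sample_mean + z * standard_error

print(f"Sample mean = {sample_mean:.2f} litres, sample SD = {sample_sd:.2f}")
print(f"Approximate 95% interval for the herd mean: {ci_low:.2f} to {ci_high:.2f}")
```

The mean and standard deviation describe only the ten cows observed; the interval is a probabilistic statement about the whole herd, which is the step from the particular to the general.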

1.5 Application of Statistics

Early applications of statistics were mainly concerned with reduction of large amounts of observed data to the point where general trends become apparent. At the same time, emphasis in many sciences turned from the study of individuals to the study of the behavior of aggregates of individuals. Statistical methods were found suitable for such studies, aggregate data fitting consistently with the concept of a population.

Statistical science has wide applications in dairy production, processing and management. In dairy production, the productive and reproductive performance of various breeds/species of animals is evaluated through various statistical measures. For example, age at first calving, body weight, lactation length, dry period, inter-calving period etc. are closely monitored for the best production performance of female animals. In the field of animal nutrition, many experiments have been devised to discover the significance of various vitamins, proteins and diets in the different phases of animal production. Similarly, production parameters like daily/monthly lactation yield, fat, SNF, protein and minerals, as well as microbiological parameters in milk, are closely monitored to obtain safe milk of the best quality for human consumption.

In industry, Statistics is very widely used in ‘Quality Control’. In production engineering, statistical techniques such as control charts and acceptance sampling plans are of extreme importance for finding whether a product conforms to specifications; these will be discussed in the modules on Statistical Quality Control. In dairy processing, various value-added dairy products are developed, for which the proportions of ingredients must be chosen so as to get a least-cost product mix that fulfils certain minimum requirements. The chemical, microbiological and sensory attributes of such products are also monitored over different periods of storage. Various statistical techniques are employed to meet these requirements.

Statistics also plays an important role in engineering. For example, such topics as the study of heat transfer through insulating materials per unit of time, performance guarantee testing programs, production control, inventory control, standardization of fits and tolerances of machine parts, job analyses of technical personnel, time and motion studies and many other specialized problems in research and development make great use of probabilistic and statistical methods. Agricultural engineering, which combines the practices of engineering and agriculture, has also benefited greatly from the use of statistical methods.

In dairy management, the cost of calf rearing and maintenance of animals has to be worked out. Similarly, the cost of milk production for various categories of animals has to be computed across different seasons, regions etc., taking into consideration the various fixed and variable costs involved. The cost of processing milk into various dairy products also has to be computed. There is likewise a need to monitor milk production, utilisation and marketed surplus across various categories of producers and consumers, and to assess the demand for and supply of milk and milk products. All these aspects require the use of various statistical techniques to achieve the desired objectives.
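As a concrete illustration of the quality-control use mentioned in this section, the sketch below checks whether new batches fall within control limits. It is a simplified illustration only: the fat percentages are hypothetical, and the limits are set at the mean plus or minus three standard deviations of a baseline period, a common convention assumed here rather than one taken from this lesson.

```python
import statistics

# Hypothetical fat percentages of milk batches from a period when the
# process was considered stable (invented values for illustration).
baseline = [4.1, 4.0, 4.2, 3.9, 4.1, 4.0, 4.2, 4.1, 3.9, 4.0]

centre_line = statistics.mean(baseline)
sigma = statistics.stdev(baseline)

# Assumed convention: control limits at three standard deviations
# on either side of the centre line.
lower_limit = centre_line - 3 * sigma
upper_limit = centre_line + 3 * sigma


def out_of_control(value):
    """Return True if the value lies outside the control limits."""
    return value < lower_limit or value > upper_limit


# Hypothetical new batches to be checked against the limits.
new_batches = [4.0, 4.3, 3.2]
flags = [out_of_control(value) for value in new_batches]

for value, flagged in zip(new_batches, flags):
    status = "outside limits" if flagged else "within limits"
    print(f"Batch with fat {value}%: {status}")
```

A batch flagged as outside the limits does not prove a fault; it signals that the process should be examined.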

1.6 Distrust of Statistics

We often hear the following interesting comments on Statistics:

(i)     ‘An ounce of truth will produce tons of Statistics’,

(ii)   ‘Statistics can prove anything’,

(iii) ‘Figures do not lie. Liars figure’,

(iv) ‘If figures say so it can’t be otherwise’,

(v)   ‘There are three types of lies: lies, damned lies and Statistics, wicked in the order of their naming’, and so on.

Some of the reasons for the existence of such divergent views regarding the nature and function of Statistics are as follows:

·        Figures are innocent, easily believable and more convincing. The facts supported by figures are psychologically more appealing.

·        Figures put forward for arguments may be inaccurate or incomplete and thus might lead to wrong inferences.

·        Figures, though accurate, might be moulded and manipulated by selfish persons to conceal the truth and present a distorted picture of the facts to the public to serve their own ends. When skilled talkers, writers or politicians, through forceful writings and speeches, or business and commercial enterprises, through advertisements in the press, mislead the public or belie its expectations by quoting wrong statistical statements or manipulating statistical data for personal motives, the public loses its faith in the science of Statistics and starts condemning it. We cannot blame the layman for his distrust of Statistics, as he, unlike a statistician, is not in a position to distinguish between valid and invalid conclusions drawn from statistical analysis.

It may be pointed out that Statistics neither proves nor disproves anything. It is only a tool which, when rightly used, may prove extremely useful and, if misused, might be disastrous. According to Bowley, “Statistics only furnishes a tool, necessary though imperfect, which is dangerous in the hands of those who do not know its use and its deficiencies”. It is not the subject of Statistics that is to be blamed but those people who twist the numerical data and misuse them, either through ignorance or deliberately for selfish motives. As King points out, “Science of Statistics is the most useful servant but only of great value to those who understand its proper use.”

1.7 Limitations of Statistics

·         It does not deal with individual measurements.

·         It deals only with quantitative characteristics.

·         Statistical results are true only on an average.

·         It is only one of the methods of studying a problem.

1.8 Statistical Agencies

The responsibility for the collection, processing, tabulation and dissemination of statistics lies with statistical agencies. The following are the major agencies at national level:

1.     Central Statistical Organisation (Department of Statistics, Ministry of Planning and Programme Implementation), New Delhi.

2.     National Sample Survey Organisation (Department of Statistics, Ministry of Planning and Programme Implementation), New Delhi.

3.      Registrar General of India (Ministry of Home Affairs), New Delhi.

4.      Directorate General of Commercial Intelligence and Statistics (Ministry of Commerce), Calcutta.

5.     Directorate of Economics and Statistics (Department of Agriculture and Cooperation, Ministry of Agriculture), New Delhi.

6.      Labour Bureau (Ministry of Labour), Shimla and Chandigarh.

7.     Department of Economic Analysis and Policy, Reserve Bank of India, Mumbai.

8.     Office of the Economic Advisor, Department of Industrial Development, New Delhi.

9.      Directorate General of Employment and Training, (Ministry of Labour), New Delhi.

10.  Ministry of Food Processing Industries, New Delhi

11.  Agricultural and Processed Food Products Export Development Authority (APEDA), New Delhi

12.  National Dairy Development Board (NDDB), Anand

Apart from these, each Government of India ministry has either a full-fledged statistical division or a statistical section. Public sector organizations have their own arrangements for the collection and maintenance of statistics. In the states and Union Territories (UTs), there are State Statistical Bureaus. On the whole, the statistical system in India is a decentralized one; the responsibility for the collection and dissemination of statistics is divided between the union and state governments. Statistics are also collected through all-India statistical operations such as the Census of India, the Annual Survey of Industries (ASI), the National Sample Survey etc.

Official statistical websites are:

·         http://www.nic.in/stat

·         http://www.mospi.nic.in

·         http://www.nddb.org

·         http://www.apeda.gov.in

·         http://www.mofpi.nic.in

·         http://www.censusindia.net

·         rgindia@hub.nic.in (e-mail address)

·         http://www.rbi.org.in

Finally, it is worth quoting the Hon’ble Director General, Indian Council of Agricultural Research, and President, Indian Society of Agricultural Statistics, Dr. S. Ayappan: “Statistic is like a salt in food, no one recognizes its importance when it is there, everyone feels its importance when it is not there”. This clearly emphasizes the importance of Statistics in all branches of Agriculture and Dairy Sciences.