Module 4. Concepts of sampling methods

Lesson 15

ELEMENTARY CONCEPTS OF OTHER SAMPLING TECHNIQUES

15.1  Introduction

In the preceding lesson we saw the nature of simple random sampling and how a random sample can be drawn. Simple random sampling requires a frame, i.e. a complete and up-to-date list of the population units to be sampled. In practice such a list is not readily available in many enquiries, which restricts the use of this sampling design. Moreover, in field surveys covering a fairly large area, the units selected in a random sample are likely to be widely scattered geographically, so collecting the requisite information or data may be quite time consuming and costly.

If each element is drawn individually from the population at large, the sample is an unrestricted sample; simple random sampling is of this kind. Restricted sampling is where additional controls are imposed on the selection; in other words, it covers all other forms of sampling. Because the sampling units are distributed more effectively, restricted random sampling is generally more efficient than unrestricted random sampling: the cost of sampling for a given precision of the population estimate can be reduced or, equivalently, more information can be obtained from a given sample size. Several such complex probability sampling designs are available as more viable and effective alternatives to simple random sampling. In this lesson we will discuss some of these sampling techniques.

15.2  Stratified Random Sampling

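In simple random sampling without replacement, the sampling variance of the sample mean is given by $V(\bar{y}) = \frac{N-n}{N} \cdot \frac{S^2}{n}$, where $N$ is the population size, $n$ is the sample size and $S^2 = \frac{1}{N-1}\sum_{i=1}^{N}(Y_i - \bar{Y})^2$ is the population mean square. This implies that the variance of the sample estimate of the population mean is

o   inversely proportional to the sample size $n$ (apart from the finite population correction $(N-n)/N$)

o   directly proportional to the variability of the sampling units in the population, $S^2$.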
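The dependence of the variance on the sample size and on the population variability can be checked numerically. The sketch below assumes the standard textbook formula for simple random sampling without replacement, $V(\bar{y}) = \frac{N-n}{N} \cdot \frac{S^2}{n}$; the function name and the small population of five values are hypothetical and serve only as an illustration.

```python
def srswor_variance_of_mean(population, n):
    """Variance of the sample mean under simple random sampling
    without replacement: (N - n) / N * S^2 / n (assumed standard formula)."""
    N = len(population)
    if not 1 <= n <= N:
        raise ValueError("sample size n must satisfy 1 <= n <= N")
    mean = sum(population) / N
    # Population mean square S^2 with divisor N - 1.
    s2 = sum((y - mean) ** 2 for y in population) / (N - 1)
    return (N - n) / N * s2 / n


# Hypothetical population of five values; here S^2 = 10.
values = [2, 4, 6, 8, 10]
for n in (1, 2, 5):
    print(n, srswor_variance_of_mean(values, n))
```

A larger sample gives a smaller variance, and when the whole population is enumerated (n = N) the variance is zero.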

Since the precision of an estimate is the reciprocal of its sampling variance, apart from increasing the sample size $n$, the other way of increasing precision is to devise a sampling technique that effectively reduces $S^2$, i.e. the population heterogeneity. A sampling plan that does so is called stratified sampling. When the population is heterogeneous with respect to the variable or characteristic under study, stratified random sampling is used to obtain more efficient results. Stratification means division into layers or groups. In stratified random sampling the population is divided into mutually exclusive and collectively exhaustive strata or sub-groups, and a simple random sample is then selected within each stratum. For example, the cows in a big herd can be divided into strata on the basis of breed, age group, body weight group, lactation length, lactation order, daily or lactation milk yield group, etc. Stratified random sampling involves the following steps:

1.   Stratify the given population into a number of sub-groups or sub-populations, known as strata, such that:

a)   The units within each stratum (sub-group) are as homogeneous as possible.

b)   The differences between the various strata are as marked as possible, i.e. the stratum means differ as widely as possible.

c)   Various strata are non-overlapping. This means each and every unit in the population belongs to one and only one stratum.

The criterion used to stratify the universe into strata is known as the stratifying factor. In general, geographical, sociological or economic characteristics form the basis of stratification of the given population. Some of the commonly used stratifying factors are age, sex, income, occupation, educational level, geographical area, economic status, etc. Thus in stratified sampling the given population of size $N$ is divided into, say, $k$ relatively homogeneous strata of sizes $N_1, N_2, \ldots, N_k$ respectively such that

$$N_1 + N_2 + \cdots + N_k = \sum_{i=1}^{k} N_i = N$$

2.   Draw a simple random sample (without replacement) from each of the strata. Let a random sample of size $n_i$ be drawn from the $i$th stratum ($i = 1, 2, \ldots, k$) such that

$$n_1 + n_2 + \cdots + n_k = \sum_{i=1}^{k} n_i = n$$

where $n$ is the total sample size from a population of size $N$. The sample of $n$ units is known as a stratified random sample (without replacement), and the technique of drawing such a sample is known as stratified random sampling.
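The two steps above can be sketched in code. This is a minimal illustration only, assuming the strata have already been formed and the stratum sample sizes already decided; the function name, the breed labels and the tag numbers are hypothetical.

```python
import random


def stratified_sample(strata, sample_sizes, seed=None):
    """Draw a simple random sample without replacement from each stratum.

    strata       -- dict mapping stratum label -> list of units in that stratum
    sample_sizes -- dict mapping stratum label -> number of units to draw
    """
    rng = random.Random(seed)
    sample = {}
    for label, units in strata.items():
        n_i = sample_sizes[label]
        if n_i > len(units):
            raise ValueError(f"stratum {label!r} has fewer than {n_i} units")
        # random.sample selects without replacement.
        sample[label] = rng.sample(units, n_i)
    return sample


# Hypothetical herd stratified by breed; units are tag numbers.
herd = {
    "breed_A": list(range(1, 61)),     # 60 cows
    "breed_B": list(range(61, 91)),    # 30 cows
    "breed_C": list(range(91, 101)),   # 10 cows
}
chosen = stratified_sample(herd, {"breed_A": 6, "breed_B": 3, "breed_C": 1}, seed=1)
print(chosen)
```

The total sample size is the sum of the stratum sample sizes, here 6 + 3 + 1 = 10.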

The sample size allotted to each stratum may be proportional to the stratum size, chosen optimally, or otherwise disproportional to the stratum size.

15.2.1  Proportional allocation

In this method, the items are selected from each stratum in the same proportion as they exist in the population. The allocation of sample sizes is termed proportional if the sampling fraction, i.e. the ratio of the sample size to the population size, remains the same in all the strata. Mathematically, the principle of proportional allocation gives

$$\frac{n_1}{N_1} = \frac{n_2}{N_2} = \cdots = \frac{n_k}{N_k} = \frac{n}{N}, \quad \text{i.e.} \quad n_i = n \cdot \frac{N_i}{N}, \quad i = 1, 2, \ldots, k$$
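As an illustration of proportional allocation, the sketch below computes stratum sample sizes from the rule that each stratum receives the share n × (stratum size) / (population size). Because these shares are not always whole numbers, the code rounds them with a largest-remainder rule so that the sizes still add up to n; that rounding rule is an assumption of this sketch, not something prescribed by the lesson, and the stratum sizes are hypothetical.

```python
def proportional_allocation(stratum_sizes, n):
    """Allocate a total sample of size n to strata in proportion to their sizes.

    Returns a list of whole-number sample sizes that sum to n.
    Rounding uses the largest-remainder rule (an assumption of this sketch).
    """
    N = sum(stratum_sizes)
    if not 0 <= n <= N:
        raise ValueError("total sample size must satisfy 0 <= n <= N")
    # Whole part and remainder of n * N_i / N, using exact integer arithmetic.
    sizes = [n * N_i // N for N_i in stratum_sizes]
    remainders = [n * N_i % N for N_i in stratum_sizes]
    shortfall = n - sum(sizes)
    # Give one extra unit to the strata with the largest remainders.
    order = sorted(range(len(stratum_sizes)), key=lambda i: remainders[i], reverse=True)
    for i in order[:shortfall]:
        sizes[i] += 1
    return sizes


# Hypothetical strata of 500, 300 and 200 units with a total sample of 100.
print(proportional_allocation([500, 300, 200], 100))  # [50, 30, 20]
```

In this example the sampling fraction is 100/1000 = 0.1 in every stratum.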

15.2.2  Optimum allocation

In this case the sizes of the samples to be drawn from the various strata are determined by the principle of optimization, i.e. obtaining the best results at the minimum possible cost. In optimum allocation, the numbers of units $n_i$ ($i = 1, 2, \ldots, k$) to be drawn from the strata are determined so that

(i)   Variance of sample estimate of the population mean is minimum (i.e., its precision is maximum) for a fixed total sample size n. (Neyman’s Allocation)

(ii)  Variance of the estimate is minimum for a fixed cost of the plan.
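The lesson does not state the allocation formulas themselves. The sketch below assumes the standard textbook results: for a fixed total sample size, Neyman's allocation takes the stratum sample size proportional to (stratum size) × (stratum standard deviation); for a fixed budget with a per-unit cost in each stratum and no overhead cost, it is proportional to (stratum size) × (stratum standard deviation) / sqrt(unit cost). The function names, the standard deviations and the costs are hypothetical, and the results are left unrounded.

```python
from math import sqrt


def neyman_allocation(stratum_sizes, stratum_sds, n):
    """Case (i): n_i proportional to N_i * S_i for a fixed total sample size n."""
    weights = [N_i * S_i for N_i, S_i in zip(stratum_sizes, stratum_sds)]
    total = sum(weights)
    return [n * w / total for w in weights]


def cost_optimum_allocation(stratum_sizes, stratum_sds, unit_costs, budget):
    """Case (ii): n_i proportional to N_i * S_i / sqrt(c_i), scaled so that
    the total cost sum(c_i * n_i) equals the budget (overhead cost ignored)."""
    weights = [N_i * S_i / sqrt(c_i)
               for N_i, S_i, c_i in zip(stratum_sizes, stratum_sds, unit_costs)]
    denom = sum(N_i * S_i * sqrt(c_i)
                for N_i, S_i, c_i in zip(stratum_sizes, stratum_sds, unit_costs))
    return [budget * w / denom for w in weights]


# Hypothetical strata: sizes, standard deviations and per-unit costs.
sizes = [500, 300, 200]
sds = [10, 20, 30]
print(neyman_allocation(sizes, sds, n=100))
print(cost_optimum_allocation(sizes, sds, unit_costs=[1, 4, 9], budget=1000))
```

Under these assumptions a stratum receives a larger sample when it is larger, more variable, or cheaper to survey. In practice the stratum standard deviations are rarely known and must be estimated, and an allocated size that exceeds the stratum size has to be capped.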

15.3  Systematic Sampling

Systematic sampling differs slightly from simple random sampling: only the first sample unit is selected at random, and the remaining units are then selected automatically in a definite sequence at equal spacing or equal time intervals from one another. This technique is usually recommended when a complete and up-to-date list of the sampling units, i.e. the frame, is available and the units are arranged in some systematic order such as alphabetical, chronological or geographical order.

Let us suppose that the $N$ sampling units in the population are arranged in some systematic order and serially numbered from 1 to $N$, and that we want to draw a sample of size $n$ such that $N = nk$, i.e. $k = N/n$, where $k$ is usually called the sampling interval. Systematic sampling consists in selecting one unit at random from the first $k$ units, numbered 1 to $k$, and then selecting every $k$th unit in succession. Thus, if the first unit selected at random is the $i$th unit, the systematic sample of size $n$ consists of the units numbered $i, i+k, i+2k, \ldots, i+(n-1)k$. The random number $i$ is called the random start, and its value in fact determines the whole sample. As an example, suppose we want to select 50 cows from a list of 2,000 cows arranged systematically by identification tag number. Here $n = 50$ and $N = 2{,}000$, so

$$k = \frac{N}{n} = \frac{2000}{50} = 40$$

We select any number from 1 to 40 at random, and the cow with the corresponding serial number is selected. Suppose the random number selected is 25. Then the systematic sample will consist of the 50 cows in the list at identification numbers 25, 65, 105, …, 1905, 1945, 1985.
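The selection rule can be sketched as follows, assuming as in the text that the population size is an exact multiple of the sample size. The function name and its parameters are hypothetical; the `start` argument allows the random start of 25 from the example to be reproduced.

```python
import random


def systematic_sample(N, n, start=None, seed=None):
    """Return the serial numbers i, i + k, ..., i + (n - 1)k with k = N / n.

    If start is None, the random start i is drawn uniformly from 1..k.
    Assumes N is an exact multiple of n.
    """
    if n <= 0 or N % n != 0:
        raise ValueError("this sketch assumes N is an exact multiple of n")
    k = N // n  # sampling interval
    if start is None:
        start = random.Random(seed).randint(1, k)
    if not 1 <= start <= k:
        raise ValueError("random start must lie between 1 and k")
    return [start + j * k for j in range(n)]


sample = systematic_sample(2000, 50, start=25)
print(sample[:3], sample[-3:])  # [25, 65, 105] [1905, 1945, 1985]
```

Once the random start is fixed, no further random numbers are needed, which is why the random start determines the whole sample.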

15.4  Cluster Sampling

When the population is very large, the sampling methods described above run into several difficulties. A sampling frame may not be available, and preparing one may be too expensive and time consuming. In addition, surveying widely scattered sampling units is costly and administratively difficult, and the elementary units may not be easy to identify and locate. In such cases cluster sampling is useful. The total population is divided, depending on the problem under study, into recognizable sub-divisions termed clusters. A specified number of clusters is selected at random, and each and every unit in the selected clusters is then observed, measured or interviewed. The clusters are called the primary units. Cluster sampling is also known as area sampling. For example, a cluster may consist of all households in a village, so that there are as many clusters as there are villages in a district. Note that a cluster is a heterogeneous sub-population, whereas a stratum is a homogeneous sub-population. The following precautions should be taken when using cluster sampling:

·         Each elementary unit should belong to one and only one cluster.

·         The clusters constituting the population should together include each and every elementary unit belonging to the population.

·         All clusters should be distinct, meaning thereby that there should neither be overlapping nor omission of units.

·         Clusters should be as small as possible consistent with the cost and limitations of the survey.
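The cluster sampling procedure described above can be sketched as follows: a fixed number of clusters is chosen by simple random sampling, and every unit in each chosen cluster enters the sample. The function name, village labels and household labels are hypothetical.

```python
import random


def cluster_sample(clusters, m, seed=None):
    """Select m clusters at random and enumerate all units within them.

    clusters -- dict mapping cluster label -> list of elementary units
    Returns (selected cluster labels, list of (cluster, unit) pairs).
    """
    rng = random.Random(seed)
    # Clusters are the primary units, so the random draw is made on them.
    selected = rng.sample(sorted(clusters), m)
    # Every elementary unit in a selected cluster is observed.
    units = [(label, unit) for label in selected for unit in clusters[label]]
    return selected, units


# Hypothetical district with 6 villages (clusters) of 4 households each.
district = {f"village_{v}": [f"household_{v}_{h}" for h in range(1, 5)]
            for v in range(1, 7)}
villages, households = cluster_sample(district, m=2, seed=11)
print(villages)
print(len(households))  # 8 households: all 4 in each of the 2 selected villages
```

Only a list of clusters is needed to draw the sample; a list of elementary units is required only for the clusters actually selected.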

15.5  Multistage Sampling

Instead of enumerating all the sampling units in the selected clusters, one can often obtain more efficient estimators by sub-sampling within the clusters. This technique is called two-stage sampling, the clusters being termed primary units and the units within the clusters secondary units. It may be generalised to what is called multistage sampling. As the name suggests, multistage sampling is carried out in several stages. The population is regarded as consisting of a number of primary (first stage) units, each of which is in turn composed of a number of second stage units, and so on down to the units in which we are interested. For example, if we want a sample of, say, n households from a particular state, the first stage units may be districts, the second stage units villages within the districts, and the third stage units households within the villages. Each stage thus results in a reduction of the sample size.

Multistage sampling consists in first sampling the first stage units by some suitable method of sampling. From the selected first stage units, a sub-sample of second stage units is drawn by some suitable method, which may be the same as or different from the method used for selecting the first stage units. Further stages may be added to arrive at a sample of the desired sampling units. Multistage sampling is more flexible than other methods of sampling.
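The district, village and household example can be sketched as a three-stage procedure. For simplicity this illustration assumes simple random sampling without replacement at every stage, although the stages may use different methods; the function name, the labels and the numbers of units are hypothetical.

```python
import random


def multistage_sample(population, n_districts, n_villages, n_households, seed=None):
    """Three-stage sample: districts, then villages within the selected
    districts, then households within the selected villages.

    population -- dict: district -> dict: village -> list of households
    Returns a list of (district, village, household) triples.
    """
    rng = random.Random(seed)
    selected = []
    # Stage 1: sample the first stage units (districts).
    for district in rng.sample(sorted(population), n_districts):
        villages = population[district]
        # Stage 2: sub-sample villages within each selected district.
        for village in rng.sample(sorted(villages), n_villages):
            # Stage 3: sub-sample households within each selected village.
            for household in rng.sample(villages[village], n_households):
                selected.append((district, village, household))
    return selected


# Hypothetical state: 4 districts x 5 villages x 10 households.
state = {f"D{d}": {f"V{d}{v}": [f"H{d}{v}{h}" for h in range(10)]
                   for v in range(5)}
         for d in range(4)}
sample = multistage_sample(state, n_districts=2, n_villages=2, n_households=3, seed=4)
print(len(sample))  # 2 * 2 * 3 = 12 households
```

A household list is needed only for the villages selected at the second stage, which is the practical advantage of sampling in stages.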