Module 6. Analysis of variance
Lesson 22
TWO WAY CLASSIFICATION
22.1 Introduction
In the one-way classification analysis of variance explained in the previous lesson, the treatments constitute different levels of a single factor which is controlled in the experiment. There are, however, many situations in which the response variable of interest may be affected by more than one factor. For example, the milk yield of cows may be affected by differences in treatments (i.e., the feeds fed) as well as by differences in the breed of the cows, and the moisture content of butter prepared by churning cream may be affected by different levels of fat and churning speed. When two independent factors might have an effect on the response variable of interest, it is possible to design the experiment so that an analysis of variance can be used to test the effect of the two factors simultaneously. Such a test is called a two-factor analysis of variance. In a two-way classification the data are classified according to two different criteria or factors. The procedure for the analysis of variance is somewhat different from the one followed earlier while dealing with problems of one-way classification.
22.2 Two Way Classification
Let us plan the experiment in such a way as to study the effect of two factors at a time in the same experiment. For each factor there will be a number of classes or levels. Consider the case where the two factors which may affect the variate values are operators and machines. Suppose the N observations are classified into p categories (or classes) O1, O2, …, Op according to factor A (Operator) and into q categories M1, M2, …, Mq according to factor B (Machine), giving pq combinations AiBj (OiMj), i = 1, 2, …, p; j = 1, 2, …, q, often called cells. This scheme of classification according to two factors is called a two-way classification and the analysis is called two-way analysis of variance. The number of observations in each cell may be equal or different, but we shall consider the case of one observation per cell, so that N = pq, i.e., the total number of cells is N.
Let Xij be the observation on the ith level of Operator (Oi) and the jth level of Machine (Mj), i = 1, 2, …, p; j = 1, 2, …, q. These N observations, the marginal totals and their means can be represented in tabular form as follows:
Table 22.1
| Operators | M1 | M2 | … | Mj | … | Mq | Total | Mean |
|---|---|---|---|---|---|---|---|---|
| O1 | X11 | X12 | … | X1j | … | X1q | T1. | X̄1. |
| O2 | X21 | X22 | … | X2j | … | X2q | T2. | X̄2. |
| ⋮ | ⋮ | ⋮ |  | ⋮ |  | ⋮ | ⋮ | ⋮ |
| Oi | Xi1 | Xi2 | … | Xij | … | Xiq | Ti. | X̄i. |
| ⋮ | ⋮ | ⋮ |  | ⋮ |  | ⋮ | ⋮ | ⋮ |
| Op | Xp1 | Xp2 | … | Xpj | … | Xpq | Tp. | X̄p. |
| Total | T.1 | T.2 | … | T.j | … | T.q | G = T.. |  |
| Mean | X̄.1 | X̄.2 | … | X̄.j | … | X̄.q |  |  |
22.2.1 Assumptions
i. The observations are independent random variables having normal distributions with mean μij and common but unknown variance σ². Under this assumption the model for this problem may be taken as Xij = μij + eij, where the eij vary from observation to observation and are independent random variables having normal distributions with mean zero and variance σ², so that E(Xij) = μij.
ii. The observations in the p rows are independent random samples of size q from p normal populations having means μ1., μ2., …, μp. and a common variance σ².
iii. The observations in the q columns are independent random samples of size p from q normal populations having means μ.1, μ.2, …, μ.q and a common variance σ².
iv. The effects are additive.
Here μi. (i = 1, 2, …, p) are called the fixed effects due to the factor operators Oi, and μ.j (j = 1, 2, …, q) are the fixed effects due to the factor machines Mj.
22.2.2 Mathematical model
Here the mathematical model can be written as

Xij = μ + αi + βj + eij,   i = 1, 2, …, p; j = 1, 2, …, q

where

i) μ is the general mean effect, given by μ = (Σi Σj μij)/pq.

ii) αi (i = 1, 2, …, p) is the effect due to the ith operator, αi = μi. − μ, so that Σi αi = 0.

iii) βj (j = 1, 2, …, q) is the effect due to the jth machine, βj = μ.j − μ, so that Σj βj = 0.

iv) The eij are independently and normally distributed with mean zero and variance σe², i.e., eij ~ N(0, σe²).
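The additive structure of this model can be illustrated with a small simulation. The following sketch is not part of the lesson; the values p = 4, q = 6, μ = 30 and the spread of the effects are arbitrary illustrative choices made here. It generates a p × q table satisfying Xij = μ + αi + βj + eij with the effects constrained to sum to zero.

```python
import numpy as np

rng = np.random.default_rng(0)
p, q, mu, sigma = 4, 6, 30.0, 1.5               # illustrative values only
alpha = rng.normal(0.0, 2.0, p)
alpha -= alpha.mean()                           # row (operator) effects sum to zero
beta = rng.normal(0.0, 3.0, q)
beta -= beta.mean()                             # column (machine) effects sum to zero
e = rng.normal(0.0, sigma, (p, q))              # e_ij ~ N(0, sigma^2), independent
x = mu + alpha[:, None] + beta[None, :] + e     # X_ij = mu + alpha_i + beta_j + e_ij
print(x.round(2))
```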
22.2.3 Null hypothesis
We set up the null hypothesis that the operators and the machines are homogeneous. In other words, the null hypotheses for operators and machines are respectively:

HoA: μ1. = μ2. = … = μp.  or  α1 = α2 = … = αp = 0

HoB: μ.1 = μ.2 = … = μ.q  or  β1 = β2 = … = βq = 0

against the corresponding alternative hypotheses

H1A: at least two of the μi. differ;  H1B: at least two of the μ.j differ.
22.2.4 Computation of the different sums of squares
a) Total Sum of Squares (TSS)

TSS = Σi Σj Xij² − G²/N

where G is the grand total of all the observations and N = pq. The expression Σi Σj Xij², i.e., the sum of squares of all the observations, is known as the Raw Sum of Squares (R.S.S.) and the expression G²/N is called the Correction Factor (CF).
b) Sum of Squares due to factor A (Operators), denoted by SSA

To find the sum of squares due to factor A (SSA), i.e., the sum of squares among the rows (SSR), divide the square of the total of each row by the number of observations in that row, sum these quantities, and subtract the correction factor from this sum, i.e.,

SSA = Σi Ti.²/q − CF

where Ti. is the total of the observations pertaining to the ith row.
c) Sum of Squares due to factor B (Machines), denoted by SSB

To find the sum of squares due to factor B (SSB), i.e., the sum of squares among the columns (SSC), divide the square of the total of each column by the number of observations in that column, sum these quantities, and subtract the correction factor from this sum, i.e.,

SSB = Σj T.j²/p − CF

where T.j is the total of the observations pertaining to the jth column.
d) Sum of Squares due to residuals or error, denoted by SSE

The sum of squares of the residuals, also called the error sum of squares, is obtained by subtracting the sum of squares due to factor A (SSA) and the sum of squares due to factor B (SSB) from the total sum of squares (TSS), i.e.,

SSE = TSS − SSA − SSB.
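These computing formulas translate directly into code. The following is a minimal sketch, not part of the lesson itself; the function name two_way_anova_ss and the NumPy dependency are our own choices. It assumes a p × q table with one observation per cell.

```python
import numpy as np

def two_way_anova_ss(x):
    """Sums of squares for a two-way classification with one observation per cell.

    x is a p x q array: rows = levels of factor A, columns = levels of factor B."""
    p, q = x.shape
    n = p * q
    g = x.sum()                                 # grand total G
    cf = g ** 2 / n                             # correction factor CF = G^2 / N
    tss = (x ** 2).sum() - cf                   # total sum of squares
    ssa = (x.sum(axis=1) ** 2).sum() / q - cf   # among rows (factor A)
    ssb = (x.sum(axis=0) ** 2).sum() / p - cf   # among columns (factor B)
    sse = tss - ssa - ssb                       # residual (error) sum of squares
    return tss, ssa, ssb, sse
```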
Prepare the analysis of variance table as follows:
Table 22.2 ANOVA Table
| Source of variation | d.f. | S.S. | M.S.S. | F-Ratio |
|---|---|---|---|---|
| Among levels of factor A (Operators) | (p − 1) | SSA | MSA = SSA/(p − 1) | F1 = MSA/MSE |
| Among levels of factor B (Machines) | (q − 1) | SSB | MSB = SSB/(q − 1) | F2 = MSB/MSE |
| Error | (p − 1)(q − 1) | SSE | MSE = SSE/[(p − 1)(q − 1)] |  |
| Total | pq − 1 | TSS |  |  |
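Building on the two_way_anova_ss sketch above, the mean squares, F-ratios and tabulated F values of Table 22.2 can be obtained as follows. This is again only a sketch; the function name two_way_anova_table is ours, and scipy is assumed to be available for the tabulated F values.

```python
from scipy import stats

def two_way_anova_table(x, alpha=0.05):
    """Mean squares, F-ratios and tabulated F values for the layout of Table 22.2."""
    p, q = x.shape
    tss, ssa, ssb, sse = two_way_anova_ss(x)    # hypothetical helper defined earlier
    err_df = (p - 1) * (q - 1)
    msa, msb, mse = ssa / (p - 1), ssb / (q - 1), sse / err_df
    f1, f2 = msa / mse, msb / mse               # F1 = MSA/MSE, F2 = MSB/MSE
    return {
        "F1": f1, "F1_crit": stats.f.ppf(1 - alpha, p - 1, err_df),
        "F2": f2, "F2_crit": stats.f.ppf(1 - alpha, q - 1, err_df),
        "MSE": mse, "error_df": err_df,
    }
```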
Interpretation
By comparing the calculated values F1 and F2 with the tabulated values of F for the respective degrees of freedom at the α level of significance, the null hypotheses of homogeneity of the levels of factor A (Operators) and of the levels of factor B (Machines) may be rejected or accepted at the desired level of significance.
Standard error
a) The estimated standard error of the difference between two means of factor A, i.e., between the means of two operators, is

SEd(A) = √(2 MSE/q)

b) The estimated standard error of the difference between two means of factor B, i.e., between the means of two machines, is

SEd(B) = √(2 MSE/p)

c) The Critical Difference (C.D.) or Least Significant Difference (L.S.D.) can be calculated as

C.D. = SEd × tα,(p-1)(q-1)

where SEd is the standard error of the difference between two means, MSE is the error mean square, α is the level of significance and (p-1)(q-1) is the error degrees of freedom.
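As a sketch of this calculation (the function name critical_difference is ours, and the t value is taken from scipy rather than from printed tables), the C.D. for comparing two level means can be computed as below; for factor A means use n_per_mean = q, and for factor B means use n_per_mean = p.

```python
import math
from scipy import stats

def critical_difference(mse, n_per_mean, err_df, alpha=0.05):
    """C.D. (L.S.D.) for comparing two level means, each based on n_per_mean observations."""
    se_d = math.sqrt(2.0 * mse / n_per_mean)          # SE of the difference between two means
    t_crit = stats.t.ppf(1.0 - alpha / 2.0, err_df)   # two-sided t value at level alpha
    return t_crit * se_d
```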
The treatment means are X̄i. = Ti./q for the levels of factor A (operators) and X̄.j = T.j/p for the levels of factor B (machines). These can be compared with the help of the critical difference: any two treatment means are said to differ significantly if their difference is larger than the critical difference (C.D.). The procedure of two-way ANOVA is illustrated through the following example:
Example 1: The average particle size of dried ice-cream mix spray powder dried by varying the inlet temperature and the atomiser speed was measured in an experiment with 6 inlet temperatures and 4 atomiser speeds. The results obtained from the experiment are given below:
Table 22.3 Average particle size at each combination of atomiser speed (rows) and inlet temperature (columns)

| Atomiser Speed | T1 | T2 | T3 | T4 | T5 | T6 |
|---|---|---|---|---|---|---|
| S1 | 35.7 | 39.0 | 42.1 | 25.1 | 29.9 | 27.3 |
| S2 | 32.9 | 33.6 | 37.7 | 24.0 | 23.2 | 24.3 |
| S3 | 35.6 | 32.5 | 37.4 | 21.0 | 24.9 | 23.1 |
| S4 | 30.7 | 35.8 | 40.1 | 26.3 | 28.3 | 26.4 |
Analyze the data and discuss whether inlet temperature and atomiser speed have any significant effect on the particle size of ice-cream mix powder.
Solution:
HoA: μ1. = μ2. = μ3. = μ4., i.e., the mean particle size of ice-cream mix powder is the same at the different atomiser speeds.

H1A: at least two of the speed means μi. differ.

HoB: μ.1 = μ.2 = μ.3 = μ.4 = μ.5 = μ.6, i.e., the mean particle size of ice-cream mix powder is the same at the different inlet temperatures.

H1B: at least two of the temperature means μ.j differ.
Prepare the following two way table:
Table 22.4 Calculation of treatment totals, means and the grand total

| Atomiser Speed | T1 | T2 | T3 | T4 | T5 | T6 | Total | Mean |
|---|---|---|---|---|---|---|---|---|
| S1 | 35.7 | 39.0 | 42.1 | 25.1 | 29.9 | 27.3 | T1. = 199.1 | 33.1833 |
| S2 | 32.9 | 33.6 | 37.7 | 24.0 | 23.2 | 24.3 | T2. = 175.7 | 29.2833 |
| S3 | 35.6 | 32.5 | 37.4 | 21.0 | 24.9 | 23.1 | T3. = 174.5 | 29.0833 |
| S4 | 30.7 | 35.8 | 40.1 | 26.3 | 28.3 | 26.4 | T4. = 187.6 | 31.2667 |
| Total | T.1 = 134.9 | T.2 = 140.9 | T.3 = 157.3 | T.4 = 96.4 | T.5 = 106.3 | T.6 = 101.1 | G = 736.9 |  |
| Mean | 33.73 | 35.23 | 39.33 | 24.10 | 26.58 | 25.28 |  |  |
Correction Factor (CF) = G²/N = (736.9)²/24 = 22625.9004

a) Total Sum of Squares (TSS) = Σi Σj Xij² − CF = 23513.27 − 22625.9004 = 887.3696

b) Sum of Squares due to factor A (Speed)

SSA = Σi Ti.²/6 − CF = (199.1² + 175.7² + 174.5² + 187.6²)/6 − 22625.9004 = 22692.5517 − 22625.9004 = 66.6513

c) Sum of Squares due to factor B (Inlet temperature)

SSB = Σj T.j²/4 − CF = (134.9² + 140.9² + 157.3² + 96.4² + 106.3² + 101.1²)/4 − 22625.9004 = 23401.9925 − 22625.9004 = 776.0921

d) Sum of Squares due to residuals (SSE)

SSE = TSS − SSA − SSB = 887.3696 − 66.6513 − 776.0921 = 44.6262
Prepare the following ANOVA table:
Table 22.5 ANOVA Table
| Source of variation | d.f. | S.S. | M.S.S. | F-Ratio |
|---|---|---|---|---|
| Among levels of factor A (Speed) | (4 − 1) = 3 | 66.6513 | 66.6513/3 = 22.2171 | F1 = 22.2171/2.9751 = 7.4677 |
| Among levels of factor B (Temperature) | (6 − 1) = 5 | 776.0921 | 776.0921/5 = 155.2184 | F2 = 155.2184/2.9751 = 52.1728 |
| Error | (4 − 1)(6 − 1) = 15 | 44.6262 | 44.6262/15 = 2.9751 |  |
| Total | 24 − 1 = 23 | 887.3696 |  |  |
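The whole worked example can be cross-checked in software. The following sketch uses pandas and statsmodels (the column names size, speed and temp are our own labels, not from the lesson); for this balanced layout its output should reproduce the sums of squares, degrees of freedom and F-ratios of Table 22.5 up to rounding.

```python
import numpy as np
import pandas as pd
from statsmodels.formula.api import ols
from statsmodels.stats.anova import anova_lm

# Particle size data: rows = atomiser speeds S1..S4, columns = inlet temperatures T1..T6
data = np.array([
    [35.7, 39.0, 42.1, 25.1, 29.9, 27.3],
    [32.9, 33.6, 37.7, 24.0, 23.2, 24.3],
    [35.6, 32.5, 37.4, 21.0, 24.9, 23.1],
    [30.7, 35.8, 40.1, 26.3, 28.3, 26.4],
])
p, q = data.shape
df = pd.DataFrame({
    "size": data.ravel(),
    "speed": np.repeat([f"S{i+1}" for i in range(p)], q),
    "temp": np.tile([f"T{j+1}" for j in range(q)], p),
})
model = ols("size ~ C(speed) + C(temp)", data=df).fit()
print(anova_lm(model))   # sums of squares, d.f. and F-ratios for speed, temp and residual
```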
From Fisher and Yates' tables, the tabulated F values for 3 and 15 d.f. and for 5 and 15 d.f. at the 5% level of significance are 3.2874 and 2.9013 respectively. Since the observed values of F for factor A (atomiser speed) and factor B (inlet temperature) in the analysis of variance table are greater than the respective 5% tabulated F values, F1 and F2 are significant at the 5% level of significance. Hence both the null hypotheses HoA and HoB are rejected at the 5% level of significance.
Critical difference
C.D. (for comparison of different speeds) = t0.05,15 × √(2 MSE/6) = 2.131 × 0.9958 = 2.12

C.D. (for comparison of different inlet temperatures) = t0.05,15 × √(2 MSE/4) = 2.131 × 1.2197 = 2.60
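These two C.D. values can be verified numerically; a minimal check (using scipy for the t value instead of a printed table) is:

```python
import math
from scipy import stats

mse, err_df = 2.9751, 15
t_crit = stats.t.ppf(0.975, err_df)            # ~= 2.131 for 15 d.f. at the 5% level (two-sided)
cd_speed = t_crit * math.sqrt(2 * mse / 6)     # speed means averaged over 6 temperatures, ~= 2.12
cd_temp = t_crit * math.sqrt(2 * mse / 4)      # temperature means averaged over 4 speeds, ~= 2.60
print(round(cd_speed, 2), round(cd_temp, 2))
```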
Conclusion
It can be concluded that the mean particle size of ice-cream mix powder differs significantly among the levels of inlet temperature as well as among the levels of atomiser speed. The mean particle size of ice-cream mix powder was found to be maximum at atomiser speed S1 (33.1833), which is at par with speed S4 (31.2667); a similar argument holds for speeds S4 and S2 as well as for speeds S2 and S3. Similarly, the mean particle size of ice-cream mix powder was found to be maximum at temperature T3 (39.33), followed by temperatures T2 (35.23) and T1 (33.73), which are statistically at par with each other. A similar argument holds for temperatures T5, T6 and T4.
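The pairwise comparisons behind this conclusion can be tabulated with a short loop. The sketch below uses the rounded speed means from Table 22.4 and the C.D. of about 2.12 computed above; the same pattern applies to the temperature means with a C.D. of about 2.60.

```python
import itertools

speed_means = {"S1": 33.18, "S2": 29.28, "S3": 29.08, "S4": 31.27}
cd_speed = 2.12                                 # C.D. for comparing two speed means
for a, b in itertools.combinations(speed_means, 2):
    diff = abs(speed_means[a] - speed_means[b])
    verdict = "differ significantly" if diff > cd_speed else "at par"
    print(f"{a} vs {b}: |difference| = {diff:.2f} -> {verdict}")
```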