Module 6. Analysis of variance
Lesson 22
TWO WAY CLASSIFICATION
22.1 Introduction
In the one-way classification analysis of variance explained in the previous lesson, the treatments constitute different levels of a single factor which is controlled in the experiment. There are, however, many situations in which the response variable of interest may be affected by more than one factor. For example, the milk yield of cows may be affected by differences in treatments (i.e., the feeds fed) as well as by differences in the breed of the cows, and the moisture content of butter prepared by churning cream may be affected by different levels of fat and churning speed. When two independent factors might have an effect on the response variable of interest, it is possible to design the experiment so that an analysis of variance can be used to test the effect of the two factors simultaneously. Such a test is called a two-factor analysis of variance. In a two-way classification the data are classified according to two different criteria or factors. The procedure for the analysis of variance is somewhat different from the one followed earlier while dealing with problems of one-way classification.
22.2 Two Way Classification
Let us plan the experiment in such a way as to study the effect of two factors at a time in the same experiment. For each factor there will be a number of classes or levels. Consider the case where the two factors which may affect the variate values are operators and machines. Suppose the N observations are classified into p categories (or classes) O1, O2, …, Op according to factor A (Operator) and into q categories M1, M2, …, Mq according to factor B (Machine), giving pq combinations AiBj (OiMj), i = 1, 2, …, p; j = 1, 2, …, q, often called cells. This scheme of classification according to two factors is called a two-way classification and the analysis is called two-way analysis of variance. The number of observations in each cell may be equal or different, but we shall consider the case of one observation per cell, so that N = pq, i.e., the total number of cells is N.
Let Xij be the observation on the ith level of Operator (Oi) and the jth level of Machine (Mj), i = 1, 2, …, p; j = 1, 2, …, q. These N observations, the marginal totals and their means can be represented in tabular form as follows:
Table 22.1
| Operators | M1 | M2 | … | Mj | … | Mq | Total | Mean |
|---|---|---|---|---|---|---|---|---|
| O1 | X11 | X12 | … | X1j | … | X1q | T1. | X̄1. |
| O2 | X21 | X22 | … | X2j | … | X2q | T2. | X̄2. |
| ⋮ | ⋮ | ⋮ |  | ⋮ |  | ⋮ | ⋮ | ⋮ |
| Oi | Xi1 | Xi2 | … | Xij | … | Xiq | Ti. | X̄i. |
| ⋮ | ⋮ | ⋮ |  | ⋮ |  | ⋮ | ⋮ | ⋮ |
| Op | Xp1 | Xp2 | … | Xpj | … | Xpq | Tp. | X̄p. |
| Total | T.1 | T.2 | … | T.j | … | T.q | G = T.. |  |
| Mean | X̄.1 | X̄.2 | … | X̄.j | … | X̄.q |  |  |
22.2.1 Assumptions
i. The observations are independent random variables having normal distributions with mean μij and common but unknown variance σ². Under this assumption the model for this problem may be taken as Xij = μij + eij, where the eij vary from observation to observation and are independent random variables having normal distributions with mean zero and variance σ², so that E(Xij) = μij.
ii. The observations in the p rows are independent random samples of size q from p normal populations having means μ1., μ2., …, μp. and a common variance σ².
iii. The observations in the q columns are independent random samples of size p from q normal populations having means μ.1, μ.2, …, μ.q and a common variance σ².
iv. The effects are additive.
Here μi. (i = 1, 2, …, p) are called the fixed effects due to the factor operators Oi, and μ.j (j = 1, 2, …, q) are the fixed effects due to the factor machines Mj.
22.2.2 Mathematical model
Here the mathematical model can be written as

Xij = μ + αi + βj + eij,   i = 1, 2, …, p; j = 1, 2, …, q

where

i) μ is the general mean effect, given by μ = (Σi Σj μij)/pq.

ii) αi (i = 1, 2, …, p) is the effect due to the ith operator, αi = μi. − μ, so that Σi αi = 0.

iii) βj (j = 1, 2, …, q) is the effect due to the jth machine, βj = μ.j − μ, so that Σj βj = 0.

iv) The eij are independently and normally distributed with mean zero and variance σe², i.e., eij ~ N(0, σe²).
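The additive structure of this model can be illustrated with a small simulation. The following sketch is not part of the lesson; the values p = 4, q = 6, μ = 30 and the spread of the effects are arbitrary illustrative choices made here. It generates a p × q table satisfying Xij = μ + αi + βj + eij with the effects constrained to sum to zero.

```python
import numpy as np

rng = np.random.default_rng(0)
p, q, mu, sigma = 4, 6, 30.0, 1.5               # illustrative values only
alpha = rng.normal(0.0, 2.0, p)
alpha -= alpha.mean()                           # row (operator) effects sum to zero
beta = rng.normal(0.0, 3.0, q)
beta -= beta.mean()                             # column (machine) effects sum to zero
e = rng.normal(0.0, sigma, (p, q))              # e_ij ~ N(0, sigma^2), independent
x = mu + alpha[:, None] + beta[None, :] + e     # X_ij = mu + alpha_i + beta_j + e_ij
print(x.round(2))
```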
22.2.3 Null hypothesis
We set up the null hypothesis that the operators and the machines are homogeneous. In other words, the null hypotheses for operators and machines are respectively:

HoA: μ1. = μ2. = … = μp.  or  α1 = α2 = … = αp = 0

HoB: μ.1 = μ.2 = … = μ.q  or  β1 = β2 = … = βq = 0

against the corresponding alternative hypotheses

H1A: at least two of the μi. differ;  H1B: at least two of the μ.j differ.
22.2.4 Computation of the different sums of squares
a) Total Sum of Squares (TSS)

TSS = Σi Σj Xij² − G²/N

where G is the grand total of all the observations and N = pq. The expression Σi Σj Xij², i.e., the sum of squares of all the observations, is known as the Raw Sum of Squares (R.S.S.) and the expression G²/N is called the Correction Factor (CF).
b) Sum of Squares due to factor A (Operators), denoted by SSA

To find the sum of squares due to factor A (SSA), i.e., the sum of squares among the rows (SSR), divide the square of the total of each row by the number of observations in that row, sum these quantities, and subtract the correction factor from this sum, i.e.,

SSA = Σi Ti.²/q − CF

where Ti. is the total of the observations pertaining to the ith row.
c) Sum of Squares due to factor B (Machines), denoted by SSB

To find the sum of squares due to factor B (SSB), i.e., the sum of squares among the columns (SSC), divide the square of the total of each column by the number of observations in that column, sum these quantities, and subtract the correction factor from this sum, i.e.,

SSB = Σj T.j²/p − CF

where T.j is the total of the observations pertaining to the jth column.
d) Sum of Squares due to residuals or error, denoted by SSE

The sum of squares of the residuals, also called the error sum of squares, is obtained by subtracting the sum of squares due to factor A (SSA) and the sum of squares due to factor B (SSB) from the total sum of squares (TSS), i.e.,

SSE = TSS − SSA − SSB.
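These computing formulas translate directly into code. The following is a minimal sketch, not part of the lesson itself; the function name two_way_anova_ss and the NumPy dependency are our own choices. It assumes a p × q table with one observation per cell.

```python
import numpy as np

def two_way_anova_ss(x):
    """Sums of squares for a two-way classification with one observation per cell.

    x is a p x q array: rows = levels of factor A, columns = levels of factor B."""
    p, q = x.shape
    n = p * q
    g = x.sum()                                 # grand total G
    cf = g ** 2 / n                             # correction factor CF = G^2 / N
    tss = (x ** 2).sum() - cf                   # total sum of squares
    ssa = (x.sum(axis=1) ** 2).sum() / q - cf   # among rows (factor A)
    ssb = (x.sum(axis=0) ** 2).sum() / p - cf   # among columns (factor B)
    sse = tss - ssa - ssb                       # residual (error) sum of squares
    return tss, ssa, ssb, sse
```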
Prepare the analysis of variance table as follows:
Table 22.2 ANOVA Table
| Source of variation | d.f. | S.S. | M.S.S. | F-Ratio |
|---|---|---|---|---|
| Among levels of factor A (Operators) | (p − 1) | SSA | MSA = SSA/(p − 1) | F1 = MSA/MSE |
| Among levels of factor B (Machines) | (q − 1) | SSB | MSB = SSB/(q − 1) | F2 = MSB/MSE |
| Error | (p − 1)(q − 1) | SSE | MSE = SSE/[(p − 1)(q − 1)] |  |
| Total | pq − 1 | TSS |  |  |
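Building on the two_way_anova_ss sketch above, the mean squares, F-ratios and tabulated F values of Table 22.2 can be obtained as follows. This is again only a sketch; the function name two_way_anova_table is ours, and scipy is assumed to be available for the tabulated F values.

```python
from scipy import stats

def two_way_anova_table(x, alpha=0.05):
    """Mean squares, F-ratios and tabulated F values for the layout of Table 22.2."""
    p, q = x.shape
    tss, ssa, ssb, sse = two_way_anova_ss(x)    # hypothetical helper defined earlier
    err_df = (p - 1) * (q - 1)
    msa, msb, mse = ssa / (p - 1), ssb / (q - 1), sse / err_df
    f1, f2 = msa / mse, msb / mse               # F1 = MSA/MSE, F2 = MSB/MSE
    return {
        "F1": f1, "F1_crit": stats.f.ppf(1 - alpha, p - 1, err_df),
        "F2": f2, "F2_crit": stats.f.ppf(1 - alpha, q - 1, err_df),
        "MSE": mse, "error_df": err_df,
    }
```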
Interpretation
By comparing the calculated values F1 and F2 with the tabulated values of F for the respective degrees of freedom at the α level of significance, the null hypotheses of homogeneity of the levels of factor A (Operators) and of the levels of factor B (Machines) may be rejected or accepted at the desired level of significance.
Standard error
a) The estimated standard error of the difference between two means of factor A, i.e., between the means of two operators, is

SEd(A) = √(2 MSE/q)

b) The estimated standard error of the difference between two means of factor B, i.e., between the means of two machines, is

SEd(B) = √(2 MSE/p)

c) The Critical Difference (C.D.) or Least Significant Difference (L.S.D.) can be calculated as

C.D. = SEd × tα,(p-1)(q-1)

where SEd is the standard error of the difference between two means, MSE is the error mean square, α is the level of significance and (p-1)(q-1) is the error degrees of freedom.
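As a sketch of this calculation (the function name critical_difference is ours, and the t value is taken from scipy rather than from printed tables), the C.D. for comparing two level means can be computed as below; for factor A means use n_per_mean = q, and for factor B means use n_per_mean = p.

```python
import math
from scipy import stats

def critical_difference(mse, n_per_mean, err_df, alpha=0.05):
    """C.D. (L.S.D.) for comparing two level means, each based on n_per_mean observations."""
    se_d = math.sqrt(2.0 * mse / n_per_mean)          # SE of the difference between two means
    t_crit = stats.t.ppf(1.0 - alpha / 2.0, err_df)   # two-sided t value at level alpha
    return t_crit * se_d
```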
The treatment means are X̄i. = Ti./q for the levels of factor A (operators) and X̄.j = T.j/p for the levels of factor B (machines). These can be compared with the help of the critical difference: any two treatment means are said to differ significantly if their difference is larger than the critical difference (C.D.). The procedure of two-way ANOVA is illustrated through the following example:
Example 1: The average particle size of dried ice-cream mix spray powder dried by varying the inlet temperature and the atomiser speed was measured in an experiment with 6 inlet temperatures and 4 atomiser speeds. The results obtained from the experiment are given below:
Table 22.3 Average particle size at each combination of atomiser speed (rows) and inlet temperature (columns)

| Atomiser Speed | T1 | T2 | T3 | T4 | T5 | T6 |
|---|---|---|---|---|---|---|
| S1 | 35.7 | 39.0 | 42.1 | 25.1 | 29.9 | 27.3 |
| S2 | 32.9 | 33.6 | 37.7 | 24.0 | 23.2 | 24.3 |
| S3 | 35.6 | 32.5 | 37.4 | 21.0 | 24.9 | 23.1 |
| S4 | 30.7 | 35.8 | 40.1 | 26.3 | 28.3 | 26.4 |
Analyze the data and discuss whether inlet temperature and atomiser speed have any significant effect on the particle size of ice-cream mix powder.
Solution:
HoA: μ1. = μ2. = μ3. = μ4., i.e., the mean particle size of ice-cream mix powder is the same at the different atomiser speeds.

H1A: at least two of the speed means μi. differ.

HoB: μ.1 = μ.2 = μ.3 = μ.4 = μ.5 = μ.6, i.e., the mean particle size of ice-cream mix powder is the same at the different inlet temperatures.

H1B: at least two of the temperature means μ.j differ.
Prepare the following two way table:
Table 22.4 Calculation of treatment totals, means and the grand total

| Atomiser Speed | T1 | T2 | T3 | T4 | T5 | T6 | Total | Mean |
|---|---|---|---|---|---|---|---|---|
| S1 | 35.7 | 39.0 | 42.1 | 25.1 | 29.9 | 27.3 | T1. = 199.1 | 33.1833 |
| S2 | 32.9 | 33.6 | 37.7 | 24.0 | 23.2 | 24.3 | T2. = 175.7 | 29.2833 |
| S3 | 35.6 | 32.5 | 37.4 | 21.0 | 24.9 | 23.1 | T3. = 174.5 | 29.0833 |
| S4 | 30.7 | 35.8 | 40.1 | 26.3 | 28.3 | 26.4 | T4. = 187.6 | 31.2667 |
| Total | T.1 = 134.9 | T.2 = 140.9 | T.3 = 157.3 | T.4 = 96.4 | T.5 = 106.3 | T.6 = 101.1 | G = 736.9 |  |
| Mean | 33.73 | 35.23 | 39.33 | 24.10 | 26.58 | 25.28 |  |  |
Correction Factor (CF) = G²/N = (736.9)²/24 = 22625.9004

a) Total Sum of Squares (TSS) = Σi Σj Xij² − CF = 23513.27 − 22625.9004 = 887.3696

b) Sum of Squares due to factor A (Speed)

SSA = Σi Ti.²/6 − CF = (199.1² + 175.7² + 174.5² + 187.6²)/6 − 22625.9004 = 22692.5517 − 22625.9004 = 66.6513

c) Sum of Squares due to factor B (Inlet temperature)

SSB = Σj T.j²/4 − CF = (134.9² + 140.9² + 157.3² + 96.4² + 106.3² + 101.1²)/4 − 22625.9004 = 23401.9925 − 22625.9004 = 776.0921

d) Sum of Squares due to residuals (SSE)

SSE = TSS − SSA − SSB = 887.3696 − 66.6513 − 776.0921 = 44.6262
Prepare the following ANOVA table:
Table 22.5 ANOVA Table
| Source of variation | d.f. | S.S. | M.S.S. | F-Ratio |
|---|---|---|---|---|
| Among levels of factor A (Speed) | (4 − 1) = 3 | 66.6513 | 66.6513/3 = 22.2171 | F1 = 22.2171/2.9751 = 7.4677 |
| Among levels of factor B (Temperature) | (6 − 1) = 5 | 776.0921 | 776.0921/5 = 155.2184 | F2 = 155.2184/2.9751 = 52.1728 |
| Error | (4 − 1)(6 − 1) = 15 | 44.6262 | 44.6262/15 = 2.9751 |  |
| Total | 24 − 1 = 23 | 887.3696 |  |  |
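The whole worked example can be cross-checked in software. The following sketch uses pandas and statsmodels (the column names size, speed and temp are our own labels, not from the lesson); for this balanced layout its output should reproduce the sums of squares, degrees of freedom and F-ratios of Table 22.5 up to rounding.

```python
import numpy as np
import pandas as pd
from statsmodels.formula.api import ols
from statsmodels.stats.anova import anova_lm

# Particle size data: rows = atomiser speeds S1..S4, columns = inlet temperatures T1..T6
data = np.array([
    [35.7, 39.0, 42.1, 25.1, 29.9, 27.3],
    [32.9, 33.6, 37.7, 24.0, 23.2, 24.3],
    [35.6, 32.5, 37.4, 21.0, 24.9, 23.1],
    [30.7, 35.8, 40.1, 26.3, 28.3, 26.4],
])
p, q = data.shape
df = pd.DataFrame({
    "size": data.ravel(),
    "speed": np.repeat([f"S{i+1}" for i in range(p)], q),
    "temp": np.tile([f"T{j+1}" for j in range(q)], p),
})
model = ols("size ~ C(speed) + C(temp)", data=df).fit()
print(anova_lm(model))   # sums of squares, d.f. and F-ratios for speed, temp and residual
```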
From Fisher and Yates' tables, the tabulated F values for 3 and 15 d.f. and for 5 and 15 d.f. at the 5% level of significance are 3.2874 and 2.9013 respectively. Since the observed values of F for factor A (atomiser speed) and factor B (inlet temperature) in the analysis of variance table are greater than the respective 5% tabulated F values, F1 and F2 are significant at the 5% level of significance. Hence both the null hypotheses HoA and HoB are rejected at the 5% level of significance.
Critical difference
C.D. (for comparison of different speeds) = t0.05,15 × √(2 MSE/6) = 2.131 × 0.9958 = 2.12

C.D. (for comparison of different inlet temperatures) = t0.05,15 × √(2 MSE/4) = 2.131 × 1.2197 = 2.60
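These two C.D. values can be verified numerically; a minimal check (using scipy for the t value instead of a printed table) is:

```python
import math
from scipy import stats

mse, err_df = 2.9751, 15
t_crit = stats.t.ppf(0.975, err_df)            # ~= 2.131 for 15 d.f. at the 5% level (two-sided)
cd_speed = t_crit * math.sqrt(2 * mse / 6)     # speed means averaged over 6 temperatures, ~= 2.12
cd_temp = t_crit * math.sqrt(2 * mse / 4)      # temperature means averaged over 4 speeds, ~= 2.60
print(round(cd_speed, 2), round(cd_temp, 2))
```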
Conclusion
It can be concluded that the mean particle size of ice-cream mix powder differs significantly among the levels of inlet temperature as well as among the levels of atomiser speed. The mean particle size of ice-cream mix powder was found to be maximum at atomiser speed S1 (33.1833), which is at par with speed S4 (31.2667); a similar argument holds for speeds S4 and S2 as well as for speeds S2 and S3. Similarly, the mean particle size of ice-cream mix powder was found to be maximum at temperature T3 (39.33), followed by temperatures T2 (35.23) and T1 (33.73), which are statistically at par with each other. A similar argument holds for temperatures T5, T6 and T4.
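The pairwise comparisons behind this conclusion can be tabulated with a short loop. The sketch below uses the rounded speed means from Table 22.4 and the C.D. of about 2.12 computed above; the same pattern applies to the temperature means with a C.D. of about 2.60.

```python
import itertools

speed_means = {"S1": 33.18, "S2": 29.28, "S3": 29.08, "S4": 31.27}
cd_speed = 2.12                                 # C.D. for comparing two speed means
for a, b in itertools.combinations(speed_means, 2):
    diff = abs(speed_means[a] - speed_means[b])
    verdict = "differ significantly" if diff > cd_speed else "at par"
    print(f"{a} vs {b}: |difference| = {diff:.2f} -> {verdict}")
```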