2013_Ch.14_Notes
8/17/2019 2013_Ch.14_Notes
QMDS 202 Data Analysis and Modeling
Chapter 14 Analysis Of Variance
Definition: ANOVA is the name given to the approach that allows us to use sample
data to see if the values of two or more unknown population means are
likely to be equal.
Assumptions: 1. The populations under study are normally distributed.
2. The samples are drawn randomly, and each sample is independent of
the other samples.
3. The populations from which the sample values are obtained all have
the same unknown population variance (σ²). That is,
σ₁² = σ₂² = σ₃² = … = σₖ², where k = no. of populations under study.
Hypothesis Testing Procedure
1. H₀: µ₁ = µ₂ = … = µₖ
   H₁: Not all population means are equal.
2. Set the level of significance (α).
3. F-distribution as the testing distribution.
4. Find the critical value with df₁ = k − 1 and df₂ = n − k and state the rejection rule.
   n = the total number of items in all samples = n₁ + n₂ + … + nₖ
   n_j = sample size of the j-th sample
5. Compute the test statistic.
6. Make the statistical decision.
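Steps 4 to 6 of the procedure can be sketched in a few lines of Python. This is only an illustration: the function names `anova_degrees_of_freedom` and `decide`, the sample sizes, and the numbers passed to `decide` are hypothetical, and the critical value is assumed to be read from an F table.

```python
def anova_degrees_of_freedom(sample_sizes):
    """Return (df1, df2) = (k - 1, n - k) for a one-way ANOVA."""
    k = len(sample_sizes)   # number of populations (one sample from each)
    n = sum(sample_sizes)   # total number of items in all samples
    return k - 1, n - k


def decide(test_statistic, critical_value):
    """Rejection rule: reject H0 when the test statistic exceeds the critical value."""
    if test_statistic > critical_value:
        return "Reject H0"
    return "Do not reject H0"


# Hypothetical sample sizes for three populations.
df1, df2 = anova_degrees_of_freedom([5, 6, 7])
print(df1, df2)

# Hypothetical test statistic and table value, for illustration only.
print(decide(2.0, 3.5))
```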
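Key Terms in ANOVA
SS(Total) = Total Sum of Squares
$$SS(\text{Total}) = \sum_{j=1}^{k}\sum_{i=1}^{n_j} (X_{ij} - \bar{X})^2 = \sum_{j=1}^{k}\sum_{i=1}^{n_j} X_{ij}^2 - \frac{\left(\sum_{j=1}^{k}\sum_{i=1}^{n_j} X_{ij}\right)^2}{n}$$
$$\bar{X} = \frac{\sum_{j=1}^{k}\sum_{i=1}^{n_j} X_{ij}}{n} = \text{Grand mean}$$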
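SST = Sum of Squares Between Groups = Sum of Squares of Treatment groups
$$SST = \sum_{j=1}^{k} n_j (\bar{X}_j - \bar{X})^2 = \sum_{j=1}^{k} \frac{\left(\sum_{i=1}^{n_j} X_{ij}\right)^2}{n_j} - \frac{\left(\sum_{j=1}^{k}\sum_{i=1}^{n_j} X_{ij}\right)^2}{n}$$
where $\bar{X}_j$ is the mean of the j-th sample.
SSE = Sum of Squares Within Groups = Sum of Squares of Error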
Brand C   3.2   3.5   3.5   T₃ = 10.2
Brand D   3.5   3.8   3.8   T₄ = 11.1
                            T  = 43.2
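$$SST = \left(\frac{11.7^2}{3} + \frac{10.2^2}{3} + \frac{10.2^2}{3} + \frac{11.1^2}{3}\right) - \frac{43.2^2}{12} = 156.06 - 155.52 = 0.54$$
$$\sum\sum x_{ij}^2 = (3.6)^2 + (4.1)^2 + \dots + (3.8)^2 + (3.8)^2 = 156.7$$
SS(Total) = 156.7 − 155.52 = 1.18
SSE = 1.18 − 0.54 = 0.64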
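The three sums of squares can be checked numerically. The sketch below applies the computational formulas from the Key Terms section to the cholesterol data, taking the rows for Brands A to D as they are tabulated in Example 2; the function name `sums_of_squares` is an illustrative choice, not part of the notes.

```python
# Cholesterol contents by brand (rows as tabulated in Example 2).
samples = {
    "A": [3.6, 4.1, 4.0],
    "B": [3.1, 3.2, 3.9],
    "C": [3.2, 3.5, 3.5],
    "D": [3.5, 3.8, 3.8],
}


def sums_of_squares(groups):
    """Return (SS(Total), SST, SSE) using the computational formulas."""
    all_values = [x for g in groups for x in g]
    n = len(all_values)
    correction = sum(all_values) ** 2 / n                  # T^2 / n
    ss_total = sum(x * x for x in all_values) - correction
    sst = sum(sum(g) ** 2 / len(g) for g in groups) - correction
    sse = ss_total - sst                                   # within-groups remainder
    return ss_total, sst, sse


ss_total, sst, sse = sums_of_squares(list(samples.values()))
print(round(ss_total, 2), round(sst, 2), round(sse, 2))   # 1.18 0.54 0.64
```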
The One-way ANOVA Summary Table of this problem:
Source        df    SS      MS      F
Treatments    3     0.54    0.18    2.25
Error         8     0.64    0.08
Total         11    1.18
H₀: µ₁ = µ₂ = µ₃ = µ₄     H₁: Not all population means are equal
α = 0.05
Independent populations. Assume that all the populations are normally
distributed with an equal variance σ² ⇒ the F-distribution will be used in the
ANOVA test.
df₁ = k − 1 = 4 − 1 = 3
df₂ = n_T − k = 12 − 4 = 8
Reject H₀ if TS > 4.07
TS = 0.18 / 0.08 = 2.25
TS = 2.25 is not greater than 4.07 ⇒ Do not reject H₀
Conclusion: The mean cholesterol contents of the four diet foods are not significantly different.
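As a rough check of the summary table and the decision, the sketch below rebuilds the mean squares and the F statistic from the sums of squares found above. The helper name `one_way_anova_table` is hypothetical, and the critical value 4.07 is simply the table value used in these notes (α = 0.05, df 3 and 8), not something the code computes.

```python
def one_way_anova_table(sst, sse, k, n):
    """Build the one-way ANOVA quantities from SST and SSE."""
    df_treatments, df_error = k - 1, n - k
    mst = sst / df_treatments      # mean square for treatments
    mse = sse / df_error           # mean square for error
    return {
        "df": (df_treatments, df_error, n - 1),
        "MST": mst,
        "MSE": mse,
        "F": mst / mse,
    }


table = one_way_anova_table(sst=0.54, sse=0.64, k=4, n=12)
critical_value = 4.07              # F table value taken from the notes
decision = "Reject H0" if table["F"] > critical_value else "Do not reject H0"
print(table["df"], round(table["F"], 2), decision)
```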
OR
x₁      (x₁ − x̄₁)²    x₂      (x₂ − x̄₂)²    x₃      (x₃ − x̄₃)²    x₄      (x₄ − x̄₄)²
3.6     (−0.3)²       3.1     (−0.3)²       3.2     (−0.2)²       3.5     (−0.2)²
4.1     (0.2)²        3.2     (−0.2)²       3.5     (0.1)²        3.8     (0.1)²
4.0     (0.1)²        3.9     (0.5)²        3.5     (0.1)²        3.8     (0.1)²
11.7    0.14          10.2    0.38          10.2    0.06          11.1    0.06
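$$\bar{x}_1 = \frac{\Sigma x_1}{n_1} = \frac{11.7}{3} = 3.9 \qquad \bar{x}_2 = \frac{\Sigma x_2}{n_2} = \frac{10.2}{3} = 3.4$$
$$\bar{x}_3 = \frac{\Sigma x_3}{n_3} = \frac{10.2}{3} = 3.4 \qquad \bar{x}_4 = \frac{\Sigma x_4}{n_4} = \frac{11.1}{3} = 3.7$$
n = 3 + 3 + 3 + 3 = 12     k = 4
$$\bar{x} = \frac{11.7 + 10.2 + 10.2 + 11.1}{12} = 3.6$$
$$MST = \frac{n_1(\bar{x}_1 - \bar{x})^2 + n_2(\bar{x}_2 - \bar{x})^2 + n_3(\bar{x}_3 - \bar{x})^2 + n_4(\bar{x}_4 - \bar{x})^2}{k - 1}$$
$$= \frac{3(3.9 - 3.6)^2 + 3(3.4 - 3.6)^2 + 3(3.4 - 3.6)^2 + 3(3.7 - 3.6)^2}{4 - 1} = \frac{0.54}{3} = 0.18$$
$$MSE = \frac{\Sigma(x_1 - \bar{x}_1)^2 + \Sigma(x_2 - \bar{x}_2)^2 + \Sigma(x_3 - \bar{x}_3)^2 + \Sigma(x_4 - \bar{x}_4)^2}{n - k}$$
$$= \frac{0.14 + 0.38 + 0.06 + 0.06}{12 - 4} = \frac{0.64}{8} = 0.08$$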
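The deviation-based route gives the same mean squares as the computational formulas. A minimal sketch, assuming the same four samples (rows as tabulated in Example 2) and using only the sample means and the grand mean; the helper `mean` is defined here for illustration:

```python
groups = [
    [3.6, 4.1, 4.0],   # Brand A
    [3.1, 3.2, 3.9],   # Brand B
    [3.2, 3.5, 3.5],   # Brand C
    [3.5, 3.8, 3.8],   # Brand D
]


def mean(values):
    return sum(values) / len(values)


k = len(groups)
n = sum(len(g) for g in groups)
grand_mean = sum(sum(g) for g in groups) / n

# MST: squared deviations of each sample mean from the grand mean, weighted by n_j.
mst = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups) / (k - 1)

# MSE: squared deviations of each observation from its own sample mean.
mse = sum((x - mean(g)) ** 2 for g in groups for x in g) / (n - k)

print(round(grand_mean, 1), round(mst, 2), round(mse, 2), round(mst / mse, 2))
# 3.6 0.18 0.08 2.25
```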
The One-Way ANOVA Table
Source of      Degrees of       Sum of Squares    Mean Square    F-Statistic
Variation      Freedom (df)     (SS)              (MS)           (TS)
Treatments     k − 1            SST               MST            F = MST / MSE
Error          n − k            SSE               MSE
Total          n − 1            SS(Total)
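$$SS(\text{Total}) = \sum\sum x_{ij}^2 - \frac{T^2}{bk}$$
where T = total of all observations, b = no. of blocks and k = no. of treatments.
$S_i$ = total of the observations in the i-th block
SSB = Sum of Squares for Blocks
$$SSB = \frac{1}{k}\left[S_1^2 + S_2^2 + \dots + S_b^2\right] - \frac{T^2}{bk}$$
MSB = Mean Square for Blocks = SSB / (b − 1)
SSE = SS(Total) − SST − SSB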
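For reference, the two test statistics that follow from these quantities can be written out as below. This is a sketch of the standard randomized block layout with k treatments and b blocks, not a formula quoted from the notes; the error degrees of freedom (k − 1)(b − 1) are an assumption that agrees with the value 6 used in Example 2 (k = 4, b = 3).

```latex
\begin{align*}
MST &= \frac{SST}{k-1}, &
MSB &= \frac{SSB}{b-1}, &
MSE &= \frac{SSE}{(k-1)(b-1)} \\[4pt]
F_{\text{treatments}} &= \frac{MST}{MSE}
  \quad \text{with } df_1 = k-1,\; df_2 = (k-1)(b-1) \\[4pt]
F_{\text{blocks}} &= \frac{MSB}{MSE}
  \quad \text{with } df_1 = b-1,\; df_2 = (k-1)(b-1)
\end{align*}
```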
Example 2 In the previous problem, suppose now we learn something that we did not
know earlier: the measurements of the cholesterol contents were
performed in different laboratories. The first value of each sample, we
learn, came from one laboratory, the second value came from another
laboratory, and the third value came from a third laboratory. We might
picture the original data as follows:
            Lab. 1    Lab. 2    Lab. 3
Brand A     3.6       4.1       4.0
Brand B     3.1       3.2       3.9
Brand C     3.2       3.5       3.5
Brand D     3.5       3.8       3.8
Test whether there is a difference in the mean cholesterol contents among
the four diet foods as well as whether there is a difference in the mean
results given by the three laboratories. Use α = 0.05.
Solution:
            Lab. 1    Lab. 2    Lab. 3    Total
Brand A     3.6       4.1       4.0       11.7
Brand B     3.1       3.2       3.9       10.2
Brand C     3.2       3.5       3.5       10.2
Brand D     3.5       3.8       3.8       11.1
Total       13.4      14.6      15.2      43.2
SS(Total) = 1.18     SST = 0.54
$$SSB = \frac{1}{4}\left[13.4^2 + 14.6^2 + 15.2^2\right] - 155.52 = 155.94 - 155.52 = 0.42$$
SSE = 1.18 − 0.54 − 0.42 = 0.22
Source        df    SS      MS        F
Treatments    3     0.54    0.18      4.90
Blocks        2     0.42    0.21      5.72
Error         6     0.22    0.0367
Total         11    1.18
For the treatments:
H₀: µ₁ = µ₂ = µ₃ = µ₄
H₁: Not all population means are equal
α = 0.05
df₁ = 3     df₂ = 6     C.V. = 4.76
TS = 4.90 > 4.76 ⇒ Reject H₀
∴ The mean cholesterol contents of the four diet foods are significantly
different.
For the blocks:
H₀: µ₁ = µ₂ = µ₃
H₁: Not all population means are equal
α = 0.05
df₁ = 2     df₂ = 6     C.V. = 5.14
TS = 5.72 > 5.14 ⇒ Reject H₀
∴ The mean results given by the three laboratories are significantly different.
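The whole of Example 2 can be reproduced with a short script. The sketch below treats the brands as treatments and the laboratories as blocks; the function name `randomized_block_anova` and the dictionary keys are illustrative choices, and the critical values 4.76 and 5.14 are the table values quoted in the notes, not computed by the code.

```python
# Rows = brands (treatments), columns = laboratories (blocks).
data = {
    "A": [3.6, 4.1, 4.0],
    "B": [3.1, 3.2, 3.9],
    "C": [3.2, 3.5, 3.5],
    "D": [3.5, 3.8, 3.8],
}


def randomized_block_anova(rows):
    """Return sums of squares and F statistics for a randomized block design."""
    k = len(rows)                     # number of treatments
    b = len(rows[0])                  # number of blocks
    total = sum(sum(r) for r in rows)
    correction = total ** 2 / (b * k)                      # T^2 / (bk)
    ss_total = sum(x * x for r in rows for x in r) - correction
    sst = sum(sum(r) ** 2 for r in rows) / b - correction
    block_totals = [sum(r[i] for r in rows) for i in range(b)]
    ssb = sum(s ** 2 for s in block_totals) / k - correction
    sse = ss_total - sst - ssb
    mse = sse / ((k - 1) * (b - 1))
    return {
        "SST": sst,
        "SSB": ssb,
        "SSE": sse,
        "F_treatments": (sst / (k - 1)) / mse,
        "F_blocks": (ssb / (b - 1)) / mse,
    }


result = randomized_block_anova(list(data.values()))
reject_treatments = result["F_treatments"] > 4.76   # C.V. from the notes, df (3, 6)
reject_blocks = result["F_blocks"] > 5.14           # C.V. from the notes, df (2, 6)
print(result, reject_treatments, reject_blocks)
```

Because the script does not round MSE to 0.0367 before dividing, its F values come out at about 4.91 and 5.73 rather than the 4.90 and 5.72 shown in the table; both decisions are unchanged.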
Example 3 The sample data in the following table are the marks in a statistics test
obtained by nine students from 3 majors who were taught by 3 different
instructors:
                    Instructor A   Instructor B   Instructor C
Marketing major     88
Finance major       88 9 8
Accounting major    85 95 2
(Several digits of the marks are missing from this table in the source.)
At 5% significance level, test whether the mean scores of the majors are
the same by using the instructors as blocks.
Solution:
                    Instructor A   Instructor B   Instructor C
Marketing major     88
Finance major       88 9 8
Accounting major    85 95 2
k =          b =
n =
SST =
SSB =
ΣΣx_ij² =
,,%total& (
,,< (
Source df SS MS F Treatments
'locks
Error
Total
/or the treatments:)+:
):
α ( +.+4, ( ," ( 5.V. (
Decision:
5onclusion:
/or the blocks:
)+:
):
α ( +.+4
, ( ," ( 5.V. (
Decision:
5onclusion:
Assumptions in this problem:
Review Problems: ., .3, .5, .56, .58.