Caballero QG Chp. 2, Problem set Q1.b (R)

(b) Is the population in Hardy-Weinberg equilibrium for that locus?

In order to test if the population is in Hardy-Weinberg equilibrium (HWE) we need to determine the observed vs. expected distribution of the different genotypes given the observed allele frequencies. Above we estimated the frequences of A₁,A₂,A₃. Let’s recall our allele frequencies for A₁ = p¹, A₂ = p², and A₃ = p³.

print(p1)
[1] 0.3375
print(p2)
[1] 0.45
print(p3)
[1] 0.2125

Now, let’s work out the expected frequencies of each of expected genotype:

A₁A₁ = g1

g1 <- p1*p1*200
print(g1)
[1] 22.78125

A₂A₂ = g2

g2 <- p2*p2*200
print(g2)
[1] 40.5

A₃A₃ = g3

g3 <- p3*p3*200
3
print(g3)
[1] 9.03125

A₁A₂ = g4

g4 <- 2*p1*p2*200
print(g4)
[1] 60.75

A₁A₃ = g5

g5 <- 2*p1*p3*200
print(g5)
[1] 28.6875

A₂A₃ = g6

g6 <- 2*p2*p3*200
print(g6)
[1] 38.25

Now let’s summarize the observed vs. expectated genotype counts:

Genotype	A₁A₁	A₂A₂	A₃A₃	A₁A₂	A₁A₃	A₂A₃	Total
Observed	23	61	28	39	41	8	200
Expected	22.78	60.75	28.69	40.5	38.25	9.03	-

The observed vs. expected look pretty close! However, we need to use a significance test to decide how consistent these observations are with the expected values. We will use a $\chi$²-test, with the basic form of:

$\chi$² = $\sum$ $\frac{(O_i-E_i)^2}{E_i}$

Let’s create a variable for each of the expected genotype classes; we’ll create a e equivalent for all g values above:

e1 <- 23; e2 <- 39; e3 <- 8; e4 <- 61; e5 <- 28; e6 <- 41;

Now let’s generate our observed $\chi$² value, c

c <- ((g1-e1)^2/e1) + ((g2-e2)^2/e2) + ((g3-e3)^2/e3) + ((g4-e4)^2/e4) + ((g5-e5)^2/e5) + ((g6-e6)^2/e6)

print(c)
[1] 0.3950638

Our degrees of freedom for a HWE test is based upon the number of genotype classes (6) minus the number of parameters needed to quantify our expected values. We need 2 allele classes (the third is estimable from the first two) and the popoulation size (N). So we end up with 6-2-1 = 3df. There is a standard $\chi$²-test in R but it doesn’t allow us to specify the degrees of freedom (df) (it’s really setup for contingency tables). As found here, we can use the function pchisq to make our test:

pchisq(c,df=3,lower.tail=FALSE)
0.9412603061593

Given our value is much greater than 0.05, we can accept our null hypothesis that the population is probably in HWE!

Written on April 2nd , 2021 by P. Fields

Feel free to share!

Caballero QG Chp. 2, Problem set Q1.b (R)

(b) Is the population in Hardy-Weinberg equilibrium for that locus?

You may also enjoy: