The basic algorithm for handling
loops in LINKAGE is described in Ott's book, pages 170-171.
We describe it here using
more algorithmic language.
As described in the accompanying document on
Pedigree Traversal in FASTLINK,
the basic goal is to compute for each individual
x and each genotype g the conditional
probability
that x has genotype g.
This probability
is conditioned on the parts of the pedigree traversed so far and
the candidate
.
The probability that x has genotype g
may depend on the genotype
of the loop breaker. Because the loop breaker is treated as two different
people, we must force those two people to have the same genotype during
a given traversal. This means that we must re-traverse the pedigree for
each possible genotype of the loop breaker and sum the probabilities for
each traversal. For example, suppose the loop breaker may have any of three
genotypes { 1, 2, 3 }.
Then for i=1, 2, 3 we compute the conditional
probability
that
x has genotype g where we condition on
the additional assumption that the loop breaker has genotype
i.
Then P(x,g) would be computed as
, where B(i)
is the probability that the loop breaker has genotype i.
Since each possible genotype of the loop breaker requires another complete
traversal of the loop, it is highly desirable to choose loop breakers that
have the fewest number of possible genotypes.
When there are multiple loops, the algorithm must do one traversal
for each combination of possible genotypes of all the loop breakers.
For example, if the first loop breaker has 3 possible genotypes and
the second loop breaker has 4 possible genotypes,
traversals of the pedigree will be done for each likelihood evaluation.
This explains why it may be impractical to handle pedigrees with
more than 2 loops, which in turn explains why the code is distributed
with maxloop set to 2.
In FASTLINK 2.0, I introduced the following optimization. Those parts of the pedigree whose conditional genotype probabilities are independent of the genotype of the loop breaker for the first loop (labeled as 2 in column 9 of pedin.dat) are only traversed once. In LINKAGE these parts are traversed once for each possible genotype of the loop breaker.
Up: Loops in FASTLINK
Previous: Loop Ordering