I have learned the hard way that the way loops are encoded in pedin.dat is not prominently explained in the LINKAGE documentation. Here is my attempt at a comprehensive explanation.
The presence or absence of loops is encoded in column 9 of pedin.dat. The entries in column 9 range from 0 to (1 + number of loops). For each loop the user must designate one individual to be the ``loop breaker'' (this is not a standard term). The user should then make two copies of the row of data for the loop breaker. All the genotype data in those two rows is the same, but the first columns are different. In one copy the loop breaker has no parents but has children, and in the other the loop breaker has parents but no children.
The two copies of the loop breaker for the ith loop must have the
number i+1 in column 9, unless one of them is also the proband, in which
case one copy has a 1 and the other has i+1 in column 9. I.e., the
two copies of the loop breaker for the first loop have a 2, the two
copies of a loop breaker for the second loop have a 3, etc. In each
pedigree the user may designate one non-loop breaker to be the
``proband'' by giving that person a 1 in column 9. Any other
individuals who are not the proband and not loop breakers get a 0 in
column 9. Note that by this definition, each individual can be the
loop breaker for at most one loop.
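The column 9 rules above can be checked mechanically. The following is a minimal sketch, not part of LINKAGE or FASTLINK: the function name `check_column9` and the input shape (tuples of pedigree number, individual number, and column 9 entry, already extracted from pedin.dat) are assumptions made for illustration. It also assumes at most one proband per pedigree and cannot confirm that a lone loop-breaker row and the proband row really describe the same person, since that would require comparing genotype data.

```python
from collections import Counter, defaultdict

def check_column9(rows):
    """Return a list of problems found in the column 9 entries.

    rows: iterable of (pedigree, individual, code) tuples, where code is
    the integer found in column 9. This input shape is hypothetical.
    """
    codes_by_pedigree = defaultdict(list)
    for pedigree, individual, code in rows:
        codes_by_pedigree[pedigree].append(code)

    problems = []
    for pedigree, codes in sorted(codes_by_pedigree.items()):
        counts = Counter(codes)
        if min(codes) < 0:
            problems.append(f"pedigree {pedigree}: negative entry in column 9")
        if counts[1] > 1:
            problems.append(f"pedigree {pedigree}: more than one proband")
        # Loop i uses code i+1, so the codes should be 2, 3, ... with no gaps.
        loop_codes = sorted(c for c in counts if c >= 2)
        if loop_codes != list(range(2, 2 + len(loop_codes))):
            problems.append(f"pedigree {pedigree}: loop codes {loop_codes} have gaps")
        for code in loop_codes:
            n = counts[code]
            # A single row is tolerated only when a proband row exists,
            # since the other copy may be the one carrying the 1.
            other_copy_may_be_proband = (n == 1 and counts[1] == 1)
            if n != 2 and not other_copy_may_be_proband:
                problems.append(
                    f"pedigree {pedigree}: loop {code - 1} has {n} row(s) with code {code}"
                )
    return problems
```

An empty list means no rule was violated; each string in a non-empty list names the pedigree and the rule that failed.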
For example, two lines that start:
1 3 0 0 7 0 0 1 2 [Genotype data here]
1 4 1 2 0 5 5 1 2 [same genotype data here]
would encode a loop. The 1 in column 1 indicates pedigree number 1. The 3 and 4 in column 2 are the two numbers assigned to the two copies of the loop breaker. This person is male due to the 1 in column 8 and is a loop breaker due to the 2 in column 9. Individual number 3 acts as a parent; the 7 in column 5 indicates that individual 7 is the first child of the loop breaker. Individual number 4 acts as a child and sibling. The 1 in column 3 indicates that individual 1 is the father of the loop breaker. The 2 in column 4 indicates that individual 2 is the mother of the loop breaker. The 5 in columns 6 and 7 indicates that individual number 5 is the next paternal sibling and next maternal sibling of the loop breaker. The numbers assigned to the two copies of the loop breaker in column 2 need not be consecutive. The two lines for a loop breaker do not have to occur consecutively in pedin.dat.
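To make the roles of the two copies concrete, here is a small sketch that reads lines like the two above and labels each loop-breaker row as the parent copy or the child copy. The helper `classify_loop_breaker_copies` is hypothetical and not part of LINKAGE or FASTLINK. It assumes the columns are separated by whitespace, and it only sees rows whose column 9 entry is 2 or more, so a copy that carries the proband's 1 is not picked up.

```python
def classify_loop_breaker_copies(lines):
    """Group loop-breaker rows by (pedigree, loop number) and label each copy."""
    roles = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 9:
            continue  # skip blank or malformed lines
        pedigree, individual, father, mother, first_child = (int(x) for x in fields[:5])
        code = int(fields[8])
        if code < 2:
            continue  # 0 = ordinary individual, 1 = proband
        if father == 0 and mother == 0 and first_child != 0:
            role = "parent copy"   # no parents, has children
        elif father != 0 and mother != 0 and first_child == 0:
            role = "child copy"    # has parents, no children
        else:
            role = "unexpected"
        # Loop i is encoded as i+1, so subtract 1 to recover the loop number.
        roles.setdefault((pedigree, code - 1), []).append((individual, role))
    return roles

example = [
    "1 3 0 0 7 0 0 1 2",
    "1 4 1 2 0 5 5 1 2",
]
print(classify_loop_breaker_copies(example))
# {(1, 1): [(3, 'parent copy'), (4, 'child copy')]}
```

Because the rows are grouped by pedigree and loop number rather than by position, the two copies are matched even when they are not consecutive in the file.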
Here is a tabular summary of what the columns encode:
Column 1: Pedigree number
Column 2: Individual number
Column 3: Number of father
Column 4: Number of mother
Column 5: First child
Column 6: Next sibling with same father
Column 7: Next sibling with same mother
Column 8: Male (encoded as 1) or female (encoded as 2)
Column 9: Neither proband nor loop breaker (0), proband (1), or loop breaker for loop i (i+1)
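The column summary maps naturally onto a small record type. The sketch below is illustrative only: the names `PedinRow`, `parse_pedin_line`, and `describe_column9` are invented here, and it assumes that the columns are whitespace-separated and that everything after column 9 is genotype data to be kept as uninterpreted tokens.

```python
from typing import List, NamedTuple

class PedinRow(NamedTuple):
    pedigree: int           # column 1
    individual: int         # column 2
    father: int             # column 3
    mother: int             # column 4
    first_child: int        # column 5
    next_paternal_sib: int  # column 6
    next_maternal_sib: int  # column 7
    sex: int                # column 8: 1 = male, 2 = female
    column9: int            # column 9: 0, 1, or i+1 for loop i
    genotype: List[str]     # remaining tokens, left uninterpreted

def parse_pedin_line(line):
    """Split one line into the nine leading columns plus genotype tokens."""
    fields = line.split()
    if len(fields) < 9:
        raise ValueError("expected at least 9 columns")
    head = [int(x) for x in fields[:9]]
    return PedinRow(*head, genotype=fields[9:])

def describe_column9(code):
    """Translate a column 9 entry into words."""
    if code == 0:
        return "neither proband nor loop breaker"
    if code == 1:
        return "proband"
    if code >= 2:
        return f"loop breaker for loop {code - 1}"
    raise ValueError("column 9 entries cannot be negative")

row = parse_pedin_line("1 4 1 2 0 5 5 1 2")
print(row.father, row.mother, describe_column9(row.column9))
```

Keeping the genotype tokens as plain strings avoids guessing at the locus encoding, which this section does not describe.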