From: softlib.cs.rice.edu
Last Mod: May 24, 1995

Map Functions Used In LINKAGE/FASTLINK

by Jeremy Buhler
Rice University

This README file tries to connect the discussion of mapping functions in Chapter 1 of Ott's book [3] to what actually happens in LINKAGE/FASTLINK.

LINKAGE/FASTLINK uses two functions for calculating map distance: Haldane's map function [1] and Kosambi's map function [2]. These functions are implemented as methods for calculating the recombination fraction between flanking markers given the recombination fractions between adjacent markers.

If we have three loci A, B, and C that lie on the chromosome in the order ABC, we say that A and C are flanking markers, and that A and B, as well as B and C, are adjacent markers. If we know the recombination fractions theta(AB) and theta(BC), we would like to determine the fraction theta(AC). One way to determine theta(AC) is to take the sum theta(AB) + theta(BC); this is Morgan's map function, which equates distance on the linkage map to recombination fraction. This approach implicitly assumes at most one crossover between adjacent loci, which is reasonable only for loci that are fairly tightly linked (theta < 0.1).
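
To see why simple addition breaks down for loosely linked loci, here is a minimal sketch. The function name morgan_add and the sample values are illustrative assumptions, not part of LINKAGE/FASTLINK. A recombination fraction cannot exceed 0.5, yet the plain sum does so once the adjacent fractions are large.

```python
def morgan_add(theta_ab, theta_bc):
    """Morgan's rule: flanking recombination fraction as a plain sum."""
    return theta_ab + theta_bc

# Hypothetical sample values: equal fractions for both adjacent pairs.
for theta in (0.05, 0.2, 0.3):
    total = morgan_add(theta, theta)
    status = "ok" if total <= 0.5 else "exceeds 0.5, not a valid recombination fraction"
    print(f"theta(AB) = theta(BC) = {theta}: sum = {total:.2f} ({status})")
```

With tightly linked loci the sum stays plausible; at theta = 0.3 for each pair it reaches 0.6, which no recombination fraction can be.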

Haldane's map function assumes that crossovers follow a Poisson distribution, with no interference between crossovers. Haldane's function x(theta) is given by

                   x = -1/2 ln(1 - 2 * theta)
or, inversely,
                 theta = 1/2 [1 - exp(-2x)]
Map distances are additive, so x(AC) = x(AB) + x(BC). From these formulas, we see that combining recombination fractions under Haldane's model is equivalent to the following manipulation:

x(AC) = x(AB) + x(BC) = -1/2(ln(1 - 2 * theta(AB)) + ln(1 - 2 * theta(BC)))
                      = -1/2(ln( (1 - 2 * theta(AB)) (1 - 2 * theta(BC)) ))

theta(AC) = 1/2 [1 - exp(-2 x(AC))]
          = 1/2 [ 1 - (1 - 2 * theta(AB)) (1 - 2 * theta(BC))]
          = 1/2 [ 1 - 1 + 2 * theta(AB) + 2 * theta(BC) 
                              - 4 * theta(AB) * theta(BC)]

theta(AC) = theta(AB) + theta(BC) - 2 * theta(AB) * theta(BC)
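
The derivation above can be checked numerically. The following is a minimal sketch, not code taken from LINKAGE/FASTLINK; the function names and the sample fractions 0.1 and 0.2 are assumptions chosen for illustration. It computes theta(AC) once by converting to map distance, adding, and converting back, and once with the closed-form addition formula.

```python
import math

def haldane_distance(theta):
    """Haldane map distance x for a recombination fraction 0 <= theta < 0.5."""
    return -0.5 * math.log(1.0 - 2.0 * theta)

def haldane_theta(x):
    """Recombination fraction for a Haldane map distance x >= 0."""
    return 0.5 * (1.0 - math.exp(-2.0 * x))

def haldane_add(theta_ab, theta_bc):
    """Closed-form theta(AC) from theta(AB) and theta(BC)."""
    return theta_ab + theta_bc - 2.0 * theta_ab * theta_bc

# Hypothetical sample fractions for the two adjacent intervals.
theta_ab, theta_bc = 0.1, 0.2

# Route 1: convert to distances, add them, convert back.
via_distance = haldane_theta(haldane_distance(theta_ab) + haldane_distance(theta_bc))
# Route 2: apply the addition formula directly.
closed_form = haldane_add(theta_ab, theta_bc)

print(f"via map distance: {via_distance:.6f}")
print(f"closed form:      {closed_form:.6f}")
```

Both routes give approximately 0.26, compared with 0.3 from plain addition.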

This formula appears throughout LINKAGE/FASTLINK. Moreover, if we wish to use a map-function-derived theta(AC) and a given theta(AB) to derive theta(BC), we can rewrite the addition formula to find that

theta(BC) (1 - 2 * theta(AB)) = theta(AC) - theta(AB)

theta(BC) = (theta(AC) - theta(AB)) / (1 - 2 * theta(AB))

This last formula is used in LINKMAP to recalculate theta(BC) from the known theta(AC) as B is moved incrementally across the gap between A and C.
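
As an illustration of this recalculation, here is a minimal sketch. How LINKMAP actually steps B across the interval is not described here, so the evenly spaced theta(AB) values, the number of steps, and the function names are all assumptions. For each position of B, theta(BC) is recovered from the fixed theta(AC), and the addition formula is applied again to confirm that the pair reproduces theta(AC).

```python
def haldane_add(theta_ab, theta_bc):
    """Closed-form theta(AC) under Haldane's map function."""
    return theta_ab + theta_bc - 2.0 * theta_ab * theta_bc

def haldane_subtract(theta_ac, theta_ab):
    """theta(BC) implied by a fixed theta(AC) and a given theta(AB) < 0.5."""
    return (theta_ac - theta_ab) / (1.0 - 2.0 * theta_ab)

theta_ac = 0.26   # hypothetical fixed flanking fraction
steps = 4         # hypothetical number of increments across the gap

for i in range(1, steps):
    # Assumed stepping scheme: theta(AB) grows in equal increments.
    theta_ab = theta_ac * i / steps
    theta_bc = haldane_subtract(theta_ac, theta_ab)
    check = haldane_add(theta_ab, theta_bc)
    print(f"theta(AB) = {theta_ab:.4f}  theta(BC) = {theta_bc:.4f}  "
          f"recombined theta(AC) = {check:.4f}")
```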

Kosambi's map function is based on a model of chiasmal interference. It is given by

    x = 1/2 arctanh(2 * theta) = 1/4 ln((1 + 2 * theta) / (1 - 2 * theta))
or, inversely,
    theta = 1/2 tanh(2x) = 1/2 (exp(4x) - 1) / (exp(4x) + 1)
Under this mapping function, addition of recombination fractions is equivalent to the following manipulation:
x(AC) = x(AB) + x(BC)
          1    / 1 + 2 * theta(AB) \   1    / 1 + 2 * theta(BC) \ 
      =   - ln | ----------------- | + - ln | ----------------- |
          4    \ 1 - 2 * theta(AB) /   4    \ 1 - 2 * theta(BC) /

      1    / 1 + 2 * theta(AB) + 2 * theta(BC) + 4 * theta(AB) * theta(BC) \
   =  - ln | ------------------------------------------------------------- |
      4    \ 1 - 2 * theta(AB) - 2 * theta(BC) + 4 * theta(AB) * theta(BC) /
theta(AC) = 1/2 (exp(4x(AC)) - 1) / (exp(4x(AC)) + 1)
        1 + 2 * theta(AB) + 2 * theta(BC) + 4 * theta(AB) * theta(BC) 
        ------------------------------------------------------------- - 1
    1   1 - 2 * theta(AB) - 2 * theta(BC) + 4 * theta(AB) * theta(BC)
  = - ---------------------------------------------------------------------
    2   1 + 2 * theta(AB) + 2 * theta(BC) + 4 * theta(AB) * theta(BC) 
        ------------------------------------------------------------- + 1
        1 - 2 * theta(AB) - 2 * theta(BC) + 4 * theta(AB) * theta(BC)


     1 4 * theta(AB) + 4 * theta(BC)
  =  - -----------------------------
     2 2 + 8 * theta(AB) * theta(BC)


theta(AC) = (theta(AB) + theta(BC)) / (1 + 4 * theta(AB) * theta(BC))

Kosambi's mapping function is used instead of Haldane's if, and only if, the user specifies that interference is to be included in the model and sets the parameter "independent" (in datain.dat) to 2.
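
The Kosambi derivation above can be checked numerically in the same way as Haldane's. This is a minimal sketch, not code from LINKAGE/FASTLINK; the function names and the sample fractions 0.1 and 0.2 are assumptions for illustration. It compares the route through map distance with the closed-form addition formula.

```python
import math

def kosambi_distance(theta):
    """Kosambi map distance x for a recombination fraction 0 <= theta < 0.5."""
    return 0.25 * math.log((1.0 + 2.0 * theta) / (1.0 - 2.0 * theta))

def kosambi_theta(x):
    """Recombination fraction for a Kosambi map distance x >= 0."""
    return 0.5 * math.tanh(2.0 * x)

def kosambi_add(theta_ab, theta_bc):
    """Closed-form theta(AC) from theta(AB) and theta(BC)."""
    return (theta_ab + theta_bc) / (1.0 + 4.0 * theta_ab * theta_bc)

# Hypothetical sample fractions for the two adjacent intervals.
theta_ab, theta_bc = 0.1, 0.2

# Route 1: convert to distances, add them, convert back.
via_distance = kosambi_theta(kosambi_distance(theta_ab) + kosambi_distance(theta_bc))
# Route 2: apply the addition formula directly.
closed_form = kosambi_add(theta_ab, theta_bc)

print(f"via map distance: {via_distance:.6f}")
print(f"closed form:      {closed_form:.6f}")
```

Both routes give approximately 0.2778. For the same inputs Haldane's formula gives 0.26, so the interference model yields a somewhat larger theta(AC).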

References

[1] Haldane, J.B.S. 1919. "The combination of linkage values and the calculation of distances between the loci of linked factors." J. Genet. 8:299-309.

[2] Kosambi, D.D. 1944. "The estimation of map distances from recombination values." Ann. Eugen. 12:172-175.

[3] Ott, J. 1991. Analysis of Human Genetic Linkage (Revised Edition). Baltimore: Johns Hopkins University Press.

