MAPMAKER/EXP Tutorial/Reference Manual 3.0


Automatically Finding Map Orders

Having divided our data set into chromosomes, we can now determine the likely map for each of those chromosomes. However, because of the quantity of data to manage, we will not try to do this using the manual mapping commands presented earlier (such as "try" and "compare"). As you will see, these functions may be performed in a more automated fashion using simple MAPMAKER commands.

First, we use the "sequence" command to select a particular chromosome for analysis. While again we could type the names of all the loci on that chromosome, MAPMAKER allows us to type the chromosome name alone as a convenient shortcut. For illustrative purposes, we select the set of markers assigned to chromosome 10.

Next, we use the "three point" command to pre-compute the likelihoods of all three-point crosses for this chromosome. This step is optional, but three-point analysis provides a powerful way to speed up the steps we perform below. Here's how it works:

We would like our mapping analyses to consider as many orders of any group as possible. However, most of these orders are very unlikely, given the observed data, and are simply not worth wasting computer time on. In MAPMAKER/EXP 3.0, three-point analysis simply excludes the majority of these "ridiculous" orders from consideration, allowing MAPMAKER to spend time carefully examining only those orders reasonably consistent with the observed data.

When you type the "three point" command, MAPMAKER first finds every linked triple of markers in the current sequence. For each triple, MAPMAKER computes the most likely map distances and likelihoods for all 3 possible orders. For each order, MAPMAKER displays the 'relative log-likelihood' of that order as compared to the most likely (or best) order of the triple. As before, the most likely order of the three has a relative log-likelihood of 0.0, while the others have negative relative log-likelihoods.

MAPMAKER will make use of these data as follows: any three-point order will be considered excluded if its relative log-likelihood is worse than the best by some threshold (by default, the threshold is 4.0). Any multiple locus order which contains one or more excluded three-point sub-orders will itself be considered excluded, and only non-excluded multipoint orders will be evaluated by full multipoint analysis.
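The exclusion rule can be illustrated with a minimal Python sketch. This is not MAPMAKER's own code: the function names are hypothetical, the numbers are copied from triplets 1 to 3 of the "three point" transcript in this section, and we assume that the columns a-b-c, b-a-c and a-c-b refer to the markers in the order they are listed on each row. Because a three-locus order and its reverse describe the same map, each order is identified here by the marker that sits in the middle.

```python
from itertools import combinations

THRESHOLD = 4.0  # default exclusion threshold quoted in the text

# Relative log-likelihoods for triplets 1-3 of the transcript.
# Assumption: columns mean a-b-c, b-a-c, a-c-b for markers listed as a, b, c.
THREE_POINT = {
    ("L062", "D030", "M067"): (0.00, -8.93, -0.58),
    ("L062", "D030", "M003"): (-8.63, 0.00, -0.74),
    ("L062", "D030", "M007"): (-5.00, 0.00, -4.11),
}


def excluded_middles(table, threshold=THRESHOLD):
    """For each triple, collect the middle markers of its excluded orders."""
    excluded = {}
    for (a, b, c), (abc, bac, acb) in table.items():
        # Key each relative log-likelihood by the marker in the middle.
        by_middle = {b: abc, a: bac, c: acb}
        # Assumption: "worse by the threshold" is a strict comparison.
        excluded[frozenset((a, b, c))] = {
            m for m, rel in by_middle.items() if rel < -threshold
        }
    return excluded


def is_excluded(order, excluded):
    """True if any three-point sub-order of `order` is excluded."""
    # combinations() keeps the input order, so y always lies between x and z.
    for x, y, z in combinations(order, 3):
        if y in excluded.get(frozenset((x, y, z)), set()):
            return True
    return False


EXCLUDED = excluded_middles(THREE_POINT)
print(is_excluded(["D030", "L062", "M067"], EXCLUDED))
print(is_excluded(["M067", "D030", "L062", "M003", "M007"], EXCLUDED))
```

The first order places L062 between D030 and M067, which triplet 1 scores at -8.93, so it is excluded. The second order contains no excluded sub-order among the three triplets tabulated here, so it would be passed on to full multipoint analysis.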

Had we not performed this step, then MAPMAKER would use full-multipoint analysis in the steps below to evaluate all possible orders. This definitely would be slower, but presumably would produce identical answers.

61> sequence chrom10
sequence #23= chrom10: assigned

62> three point
Linkage Groups at min LOD 3.00, max Distance 50.0
Triplet criteria: LOD 3.00, Max-Dist 50.0, #Linkages 2
'triple error detection' is on.
counting...143 linked triplets in 1 linkage group

                        log-likelihood differences
 count  markers          a-b-c   b-a-c   a-c-b
    1:  L062 D030 M067    0.00   -8.93   -0.58
    2:  L062 D030 M003   -8.63    0.00   -0.74
    3:  L062 D030 M007   -5.00    0.00   -4.11
    4:  L062 D030 M172   -4.19    0.00   -4.75
    5:  L062 D030 M139    0.00   -4.32   -1.36
    6:  L062 D030 M153    0.00   -1.24   -3.25
    7:  L062 D030 A037   -5.91    0.00   -3.33
    8:  L062 D030 A114    0.00   -4.62   -0.55
    9:  L062 D030 T032   -3.00    0.00   -5.78
   10:  L062 T031 M139   -1.89   -4.33    0.00
   11:  L062 T031 A114   -2.59   -3.75    0.00
   12:  L062 M067 M003   -9.36    0.00   -0.70
   13:  L062 M067 M007   -5.45    0.00   -3.77
   14:  L062 M067 M172   -4.56    0.00   -4.33
   15:  L062 M067 M139    0.00   -7.01   -0.37
   16:  L062 M067 M153    0.00   -1.59   -2.43
   17:  L062 M067 A037   -6.44    0.00   -3.08
   18:  L062 M067 A114   -0.30   -7.97    0.00
   19:  L062 M067 T032   -3.18    0.00   -5.21
   20:  L062 M003 M007    0.00   -0.80   -6.08
   21:  L062 M003 M172    0.00   -0.77   -7.77
   22:  L062 M003 M139   -0.62    0.00  -10.45
   23:  L062 M003 A037    0.00   -0.83   -3.53
   24:  L062 M003 A114   -0.66    0.00   -9.68
   25:  L062 M003 A063    0.00   -0.56  -11.32
   26:  L062 M003 T032    0.00   -0.69  -10.00
   27:  L062 M007 M175    0.00   -2.63   -5.85
   28:  L062 M007 M172    0.00   -8.74   -0.75
   29:  L062 M007 M139   -2.49    0.00   -5.77
   30:  L062 M007 A037   -0.78   -6.98    0.00
   31:  L062 M007 A114   -2.77    0.00   -5.34
   32:  L062 M007 A063    0.00   -5.33   -4.36
   33:  L062 M007 T032    0.00   -6.86   -2.92
   34:  L062 M175 M172   -4.35   -3.65    0.00
   35:  L062 M175 T032   -2.53   -5.42    0.00
   36:  L062 M172 M139   -3.00    0.00   -4.83
   37:  L062 M172 A037   -2.42   -6.05    0.00
   38:  L062 M172 A114   -3.36    0.00   -4.48
   39:  L062 M172 A063    0.00   -7.05   -2.93
   40:  L062 M172 T032    0.00   -9.23   -1.46
    .
    .                                           << More Output Follows
    .

We now use MAPMAKER's "order" command to find a linear order of the markers on chromosome 10. Briefly, this command performs the following analyses: (1) it tries to find a small subset of loci (by default, 5 loci), for which a single order is found to be much more likely than any other using a "compare" style analysis; (2) remaining markers which can be mapped to a unique position are added to this order one at a time; (3) at the end, any markers which cannot be mapped to a unique position in the order are mapped into multiple intervals. A more detailed explanation is provided in the reference section.
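Step (1) can be pictured with the following Python sketch, offered only as an illustration under stated assumptions. The function names, the marker names A to E, and the toy scoring function are all hypothetical: the score is a stand-in based on invented map positions, whereas MAPMAKER itself would use the multipoint log-likelihood of each order. The subset size of 5 and the log-likelihood threshold of 3.0 are the defaults shown in the "Starting Orders" line of the "order" output.

```python
from itertools import permutations


def distinct_orders(markers):
    """Yield each linear order once; an order and its reverse are the same map."""
    for perm in permutations(markers):
        if perm[0] < perm[-1]:
            yield perm


def unique_best_order(markers, score, threshold):
    """Return (best order or None, margin over the runner-up).

    The best order is accepted only if it beats every other order by at
    least `threshold`, in the spirit of a "compare" style analysis.
    """
    ranked = sorted(distinct_orders(markers), key=score, reverse=True)
    margin = score(ranked[0]) - score(ranked[1])
    return (ranked[0] if margin >= threshold else None), margin


# Hypothetical map positions, used only to build a toy score.
POSITIONS = {"A": 0.0, "B": 10.0, "C": 25.0, "D": 40.0, "E": 60.0}


def toy_score(order):
    """Toy stand-in for a log-likelihood: shorter total map length scores higher."""
    return -sum(abs(POSITIONS[x] - POSITIONS[y]) for x, y in zip(order, order[1:]))


best, margin = unique_best_order(list(POSITIONS), toy_score, threshold=3.0)
print(len(list(distinct_orders(list(POSITIONS)))))  # 60 distinct orders of 5 markers
print(best, margin)
```

Five markers already give 60 distinct orders, which is one reason a small starting subset is used before the remaining markers are added one at a time.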

We have a few hints on the use of the "order" command, and three-point analysis more generally. First, three-point analysis is not nearly as complete as multipoint analysis, and it is a mistake to rely on it for all map ordering.

We suggest that you use three-point analysis only with very conservative thresholds, as described in the reference manual. This way, MAPMAKER will default to using its more powerful multipoint analysis mechanism to resolve any questionable cases.

Secondly, three-point analysis is somewhat more sensitive to genotyping errors than full multipoint analysis (because the bulk of the raw data is not available to indicate the correct crossover positions). Not only does this mean that conservative thresholds are in order, but also that using three-point with "error detection" is wise in dense data sets which may have some small number of genotyping errors present (again see the reference manual for details). We selected this option in our initialization file using the "triple error detection" command.

Lastly, the ability of the "order" command to find a starting order depends heavily on its ability to find a good, highly informative subset. Thus, it is imperative that you use the "informativeness criteria" command to set good criteria for your data set (because the criteria depend on the nature of your data set, the default criteria are extremely lax). We also did this in our initialization file -- please see the reference manual for details.

Because the "order" command uses some randomized sorting, your results may disagree slightly with those shown here (in fact, if you run the command twice, you may see slightly different output). Of course, you should not see dramatic qualitative differences in the MAPMAKER analyses (or let's say that if you do, this likely indicates contradictory data in the raw data set, perhaps the result of genotyping errors).

The output from the "order" command can be quite lengthy, essentially because of the multiple automatic steps executed by this command. We explain this output on the next page.

63> order
Linkage Groups at min LOD 3.00, max Distance 50.0
Starting Orders: Size 5, Log-Likelihood 3.00, Searching up to 50 subsets
Informativeness: min #Individuals 44 (codominant only), min Distance 4.0
Placement Threshold-1 3.00, Threshold-2 2.00, Npt-Window 7
===========================================================================
Linkage group 1, 15 Markers:
  56 L062         63 D030         81 T031        109 M067        122 M003 
 123 M007        159 M175        164 M172        197 M024        212 M139 
 222 M153        263 A037        277 A114        278 A063        301 T032 

Most informative subset: 278 222 109 122 212 159 164
Searching for a unique starting order containing 5 of 7 informative loci...
Got one at log-likelihood 13.19

Placing at log-likelihood threshold 3.00...
Start:   109 122 164 278 159
Npt:     109 122 (263) 164 278 159
Npt:     109 (56) 122 263 164 278 159
Npt:     109 56 122 263 (123) 164 278 159
Npt:     109 56 122 263 123 164 (301) 278 159
No unique placements for 6 remaining markers

Map:
  Markers          Distance 
  109  M067         18.5 cM
   56  L062          2.4 cM
  122  M003          6.0 cM
  263  A037          2.3 cM
  123  M007          2.3 cM
  164  M172          3.5 cM
  301  T032          4.7 cM
  278  A063          8.9 cM
  159  M175       ----------
                    48.4 cM   9 markers   log-likelihood= -75.40

Markers placed relative to above map:
         109  56   122  263  123  164  301  278  159
          :-19-:--2-:--6-:--2-:--2-:--3-:--5-:--9-:
 277 2 **.:..*.:....:....:....:....:....:....:....:...
 212 2 **.:..*.:....:....:....:....:....:....:....:...
  63 2 .*.:.**.:....:....:....:....:....:....:....:...
 222 2 **.:..*.:....:....:....:....:....:....:....:...
  81 2 **.:....:....:....:....:....:....:....:....:.*.
 197 3 **.:..*.:....:....:....:....:....:....:....:.*.
---------------------------------------------------------------------------
Placing at log-likelihood threshold 2.00...
Start:   109 56 122 263 123 164 301 278 159
Npt-End: (222) 109 56 122 263 123 164 301 278 159
Npt:     222 (212) 109 56 122 263 123 164 301 278 159
Npt:     222 212 109 (63) 56 122 263 123 164 301 278 159
Npt:     222 (197) 212 109 63 56 122 263 123 164 301 278 159
Npt:     222 197 212 (277) 109 63 56 122 263 123 164 301 278 159
No unique placements for 1 remaining marker

As we walk through the "order" output, refer to the output section numbers indicated on the right of the page.

In section 1, MAPMAKER displays the parameters it will use in its analysis. These are described in detail in the reference section.

In section 2, MAPMAKER prints the markers in the group (by name and by number), as well as the most informative subset from which it will try to find a starting order. To save space, most of the "order" output uses numbers rather than names. In this case, MAPMAKER found a 5 marker subset for which one order was preferred over all others by a relative log-likelihood of at least 13.19.

In section 3, MAPMAKER has begun iteratively adding markers to the starting order. In this first pass, MAPMAKER is using a strict log-likelihood criterion of 3.0 -- only markers which map to a single interval at this log-likelihood are placed in the order. After this pass, nine of the group's 15 markers are in the order (the five starting markers plus the four placed here).

In section 4, MAPMAKER prints the map of the nine marker order it found. A cartoon is drawn showing the placements of the remaining six markers relative to these nine. Here a star indicates a possible map position, while two stars indicate the best map position.

In section 5, MAPMAKER has begun adding markers to the order again, this time at a less-strict log-likelihood threshold of 2.0. This time, all but one of the markers were mapped.
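The effect of the two placement thresholds can be sketched in Python as follows. This is a simplified illustration, not MAPMAKER's implementation: the function names and the interval scores are hypothetical, and we assume that an interval remains a candidate when its log-likelihood falls short of the best interval by less than the threshold.

```python
def candidate_intervals(rel_loglik, threshold):
    """Indices of intervals whose log-likelihood is within `threshold` of the best."""
    best = max(rel_loglik)
    return [i for i, ll in enumerate(rel_loglik) if best - ll < threshold]


def is_unique_placement(rel_loglik, threshold):
    """A marker is placed only when exactly one interval remains a candidate."""
    return len(candidate_intervals(rel_loglik, threshold)) == 1


# Hypothetical relative log-likelihoods for one marker across five intervals.
scores = [0.0, -6.1, -2.4, -7.8, -9.0]

print(candidate_intervals(scores, 3.0))   # [0, 2]
print(candidate_intervals(scores, 2.0))   # [0]
print(is_unique_placement(scores, 3.0))   # False
print(is_unique_placement(scores, 2.0))   # True
```

With these invented scores the marker has two possible intervals at threshold 3.0, so it is left unplaced in the first pass, but only one at threshold 2.0, so it is placed in the second.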

In section 6, MAPMAKER again displays the map of the extended order and the placement cartoon for the remaining unplaced marker.

In section 7, MAPMAKER gives the accepted order of loci a name (here "order1") so that we do not need to type the whole list of loci if we wish to perform further analyses. However, these names are a bit fleeting -- other MAPMAKER commands will also set the value of "order1" (etc.) to report their results.

In section 8, MAPMAKER displays the map(s) for any remaining unplaced markers. That is, each unplaced marker is placed in its most likely position, and a new map is calculated.

To conveniently name this order for future reference, we now give it a new name, "best", using MAPMAKER's "let" command. The name "best" will not be changed by MAPMAKER unless we tell it to do so.

Map:	<< Order Output Continues
  Markers          Distance 
  222  M153          7.6 cM
  197  M024          3.7 cM
  212  M139          1.1 cM
  277  A114          6.0 cM
  109  M067          3.4 cM
   63  D030         16.6 cM
   56  L062          2.4 cM
  122  M003          6.0 cM
  263  A037          2.3 cM
  123  M007          2.3 cM
  164  M172          3.5 cM
  301  T032          4.7 cM
  278  A063          8.9 cM
  159  M175       ----------
                    68.2 cM   14 markers   log-likelihood= -102.55
Markers placed relative to above map:
         222  197  212  277  109  63   56   122  263  123  164  301  278  
          :--8-:--4-:--1-:--6-:--3-:-17-:--2-:--6-:--2-:--2-:--3-:--5-:--9-:
  81 2 **.:..*.:....:....:....:....:....:....:....:....:....:....:....:....:..

order1= M153 M024 M139 A114 M067 D030 L062 M003 A037 M007 M172 T032 A063 M175
other1= T031
---------------------------------------------------------------------------
Best placement of T031:
  Markers          Distance 
 (81)  T031          1.2 cM
  222  M153          7.6 cM
  197  M024          3.7 cM
  212  M139          1.1 cM
  277  A114          6.0 cM
  109  M067          3.4 cM
   63  D030         16.6 cM
   56  L062          2.4 cM
  122  M003          6.0 cM
  263  A037          2.3 cM
  123  M007          2.3 cM
  164  M172          3.5 cM
  301  T032          4.7 cM
  278  A063          8.9 cM
  159  M175       ----------
                    69.4 cM   15 markers   log-likelihood= -104.62
===========================================================================

64> let best=order1
best= M153 M024 M139 A114 M067 D030 L062 M003 A037 M007 M172 T032 A063 M175

