simibd V1.0
Last Modification: December 7, 1995
This directory and the files within contain the software implementation of the algorithms described in Davis et al., "Nonparametric simulation-based statistics for detecting linkage in general pedigrees," Am J Hum Genet 57(4):A190.
The software package is divided into two distinct executables.
The executable "simibd" performs the actual statistical
calculations and directs the second executable, "xslink", to
perform the simulations necessary to describe the simulated
null distribution. "xslink" is a version of "fastslink", which
is further described in the README.XSLINK
file.
REFERENCES
Please use the following four references when using results from these programs.
The algorithms used in simibd are based on:
1) Davis S, Schroder M, Goldin LR, and Weeks DE, "Nonparametric
simulation-based statistics for detecting linkage in general
pedigrees," Am J Hum Genet, 57(4): A190.
Also, please reference the following three papers that deal with the SLINK code:
SLINK implements a simulation algorithm developed by Jurg Ott and described in:
2) Ott J (1989) Computer-simulation methods in human linkage
analysis. Proc Natl Acad Sci USA 86:4175-4178
The algorithm was implemented in the original SLINK computer
program package by Weeks, Ott, and Lathrop:
3) Weeks DE, Ott J, Lathrop GM (1990) SLINK: a general simulation
program for linkage analysis. Am J Hum Genet 47:A204 (abstr)
The SLINK simulation program has been modified by Schaffer and
Weeks to use the algorithms developed by Cottingham et al:
4) Cottingham Jr RW, Idury RM, Schaffer AA (1993) Faster sequential
genetic linkage computations. Am J Hum Genet 53:252-263
Please cite references 1-4 if you use this package. Thank you.
In order to use the programs included here, you must compile them to run on your machine. We recommend that you read the README.XSLINK before doing this, as there are compilation instructions specific to "xslink" included there. However, simply issuing the command:
make
will produce the executables "simibd" and "xslink" that can
perform the SimIBD and SimISO calculations (see reference 1).
Note that if you have a version of the program already "made", you
must issue the command:
make cleanall
before attempting to make a new version.
In our experience, producing optimized code provides a
significant increase in speed while still affording correct
answers. If you are unfamiliar with producing optimized code when
compiling, see your system administrator for assistance. We
recommend using gcc with at least one level of optimization,
the default setting for compiling with make. The default compiler
and options can be changed by editing the "Makefile" in this directory.
USE
Using simibd requires only a LINKAGE-formatted pedigree file and locus file. Issue the command:
simibd <pedfile> <locfile>
where <pedfile> is the name of the pedigree file and <locfile> is the name of the locus or data file. You will be prompted for several pieces of information: the family weighting function, the total number of replicates (this number determines the accuracy of the resulting p-value), the number of replicates per xslink run (this number allows the run to be broken up into smaller pieces, thereby reducing the amount of disk space necessary for the calculation), the number of the trait locus, the value of the trait signifying affection, and the marker locus to be analyzed.
We recommend using the weighting function 1/sqrt(p), where p is the population frequency of a given allele. The total number of replicates depends on the level of accuracy desired for the p-value and, to a lesser extent, on the number of affecteds in the pedigree. The number of replicates per xslink run depends on the amount of disk space that you wish to devote to the replicates written by xslink. Try 10% of the total number of replicates as a starting value (e.g., if you are using 1000 replicates, try 100 replicates per xslink run).
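The 10% starting value suggested above can be worked out ahead of time with a short script. The following Python sketch is only an illustration and is not part of the simibd package; the function name plan_runs and the handling of a leftover partial run are assumptions, since this README does not say how simibd treats a total that is not a multiple of the run size.

```python
def plan_runs(total_replicates, fraction=0.10):
    """Suggest a replicates-per-run value and the resulting run sizes.

    `fraction` is the share of the total placed in each xslink run
    (10% is the starting value suggested in this README).
    """
    if total_replicates < 1:
        raise ValueError("total_replicates must be at least 1")
    per_run = max(1, round(total_replicates * fraction))
    full_runs, remainder = divmod(total_replicates, per_run)
    sizes = [per_run] * full_runs
    if remainder:
        # Assumption: any leftover replicates form one final, smaller run.
        sizes.append(remainder)
    return per_run, sizes


if __name__ == "__main__":
    per_run, sizes = plan_runs(1000)
    print(per_run, len(sizes))  # prints: 100 10
```

A smaller fraction lowers the peak disk space used by the replicate file at the cost of more xslink runs.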
If you would like to compute the SimISO or SimAPM statistic in addition to the SimIBD statistic, issue the same command as above, but with the optional "-a" or "-u" flag. For example, to get the SimAPM result, issue the command:
simibd -a <pedfile> <locfile>
The command
simibd -au <pedfile> <locfile>
will generate results for SimIBD, SimISO, and SimAPM.
Simibd attempts to keep you up to date about its progress and estimates the time needed to complete the current task. When the run is complete, a great deal of information is available. For your convenience, the brief output is written to the file "simibd.out". This contains only the summary values of the statistics and the associated p-values for each family. Other output files provide more detailed results, including histograms of the data.
The output files are summarized below:
FILE              CONTENTS
simibd.out        Brief output from all statistics run
                  & long output from the SimISO calculations
simibd.aff.out    Long output from the SimIBD calculations
simibd.un.out     Long output from the Unaffected calculations
apm.out           Long output from the SimAPM calculations
Temporary files used for communication with xslink are:
FILE              CONTENTS
simped.dat        Pedigree file for xslink to use
simdata.dat       Locus file for xslink to use
pedfile.dat       Simulated replicates generated by xslink
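After a run it can be handy to see which of the files listed above are actually present in the working directory. The Python sketch below is a convenience only and is not part of the simibd package; the function name report_files is hypothetical, and the file names are copied from the two tables above. Which files appear for a given run depends on the options used, and the script does not try to predict that.

```python
from pathlib import Path

# File names taken from the tables in this README.
OUTPUT_FILES = {
    "simibd.out": "Brief output from all statistics run, long SimISO output",
    "simibd.aff.out": "Long output from the SimIBD calculations",
    "simibd.un.out": "Long output from the Unaffected calculations",
    "apm.out": "Long output from the SimAPM calculations",
}
TEMP_FILES = {
    "simped.dat": "Pedigree file for xslink to use",
    "simdata.dat": "Locus file for xslink to use",
    "pedfile.dat": "Simulated replicates generated by xslink",
}


def report_files(directory="."):
    """Return a dict mapping each documented file name to True if it exists."""
    base = Path(directory)
    names = list(OUTPUT_FILES) + list(TEMP_FILES)
    return {name: (base / name).is_file() for name in names}


if __name__ == "__main__":
    for name, present in report_files().items():
        print(f"{name:16s} {'present' if present else 'missing'}")
```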
The accuracy of the p-value obtained is a function of the number of replicates
simulated to produce the null distribution. For example, if it is
required that the p-value be accurate to three decimal places, then at
least 1000 replicates must be simulated. However, it may be
beneficial to use a larger number of replicates when the pedigrees
being simulated have a large number of affected individuals because
covering the large sample space adequately enough to produce
a null distribution may take more than the number of replicates needed
for p-value accuracy. In practice, if there is much variability in
the p-value obtained from the same data run several times, increase
the number of replicates simulated.
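The relationship between the number of replicates and p-value accuracy can be illustrated with a short calculation. The Python sketch below is not taken from the simibd source. It assumes that the p-value is the fraction of simulated statistics at least as large as the observed one (the README does not state the exact estimator or the direction of the comparison), and it uses the standard binomial approximation for the run-to-run standard error. The function names are hypothetical.

```python
import math


def empirical_p_value(observed, null_values):
    """Fraction of simulated null statistics >= the observed statistic.

    Assumption: larger values of the statistic are more extreme.
    """
    if not null_values:
        raise ValueError("null_values must not be empty")
    hits = sum(1 for value in null_values if value >= observed)
    return hits / len(null_values)


def monte_carlo_se(p, n_replicates):
    """Binomial standard error of an empirical p-value from n replicates."""
    return math.sqrt(p * (1.0 - p) / n_replicates)


if __name__ == "__main__":
    # With N replicates the p-value can only change in steps of 1/N.
    print(1 / 1000)                               # prints: 0.001
    print(f"{monte_carlo_se(0.05, 1000):.4f}")    # prints: 0.0069
    print(f"{monte_carlo_se(0.05, 10000):.4f}")   # prints: 0.0022
```

Under these assumptions, 1000 replicates give a p-value that moves in steps of 0.001, but repeated runs on the same data can still differ by several thousandths when the true p-value is near 0.05. This is one reason to increase the number of replicates when results vary between runs.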
QUESTIONS AND COMMENTS
If you have any comments about how to improve this code, please address them to:
Sean Davis
University of Pittsburgh
Department of Human Genetics
A300 Crabtree Hall
130 DeSoto Street
Pittsburgh, PA 15213
davis@moriarty.hgen.pitt.edu
or
Daniel E. Weeks
University of Pittsburgh
Department of Human Genetics
A310 Crabtree Hall
130 DeSoto Street
Pittsburgh, PA 15213
daniel.weeks@well.ox.ac.uk