44. MAPL: A package of microcomputer programs for RFLP linkage mapping

Yasuo UKAI1, Ryo OHSAWA1 and Akira SAITO2

1) National Institute of Agro-environmental Sciences, Tsukuba, 305 Japan

2) National Institute of Agro-biological Resources, Tsukuba, 305 Japan

High-density restriction fragment length polymorphism (RFLP) maps are now being prepared for many crop plants such as tomato, maize, potato, pepper, and lettuce. In rice, detailed RFLP maps have been developed by McCouch et al. (1988) and Saito et al. (in prep.). Methods for estimating the recombination fraction and related values in RFLP linkage maps are essentially the same as those used in conventional linkage maps for ordinary Mendelian traits. However, the amount of calculation required is quite different. In RFLP mapping, the number of loci to be analyzed simultaneously is usually very large, often several hundred, and the number of locus pairs requiring a two-point linkage test and an estimate of the recombination value is far greater than can be handled by conventional calculation. In addition, determining the order of RFLP loci within a linkage group involves iterative calculations on a large matrix. Thus, the aid of a computer is indispensable in preparing RFLP linkage maps.

We have recently developed a package of programs named MAPL, which can be used to prepare RFLP linkage maps from RFLP segregation data in the F2 or backcross generation from a cross of two pure lines (P1 and P2). At present the package is composed of 8 independently executable programs: YOMU, SEGREG, LINKAGE, GROUP, ORDER, GRAPH, LBLOCK, and LOD. It is written in N88-Basic(86) on MS-DOS and runs on 16-bit or 32-bit microcomputers of the NEC PC9801 series.

YOMU is used for data loading. RFLP data for the experimentally analyzed loci are loaded from the keyboard or from data files prepared with Lotus 1-2-3 (Lotus Dev. Corp.). The three genotypes, i.e. P1-type homozygote, P2-type homozygote, and heterozygote, are coded as 1, 2, and 3, respectively. Missing values are coded as 0. For instance, a sequence of codes such as 133231312313321233113321333122 is loaded for the first 30 plants of a segregating generation at a particular RFLP locus.
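The coding scheme can be illustrated with a short Python sketch. This is not YOMU itself (which is written in N88-Basic); the function name `tally_genotypes` and the label strings are assumptions made for illustration. It simply tallies the genotype codes of the 30-plant example above.

```python
from collections import Counter

# Genotype codes as described in the text; the label strings are illustrative.
GENOTYPE_CODES = {
    "0": "missing",
    "1": "P1 homozygote",
    "2": "P2 homozygote",
    "3": "heterozygote",
}


def tally_genotypes(codes):
    """Count each genotype class in a string of per-plant codes."""
    unknown = set(codes) - set(GENOTYPE_CODES)
    if unknown:
        raise ValueError(f"unexpected genotype codes: {sorted(unknown)}")
    counts = Counter(GENOTYPE_CODES[c] for c in codes)
    # Report every class, including those with zero plants.
    return {label: counts.get(label, 0) for label in GENOTYPE_CODES.values()}


scores = "133231312313321233113321333122"  # the 30-plant example from the text
tally = tally_genotypes(scores)
print(tally)
```

For this sequence the tally is 9 P1 homozygotes, 7 P2 homozygotes, 14 heterozygotes, and no missing values.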

SEGREG provides chi-square values for testing the observed segregation ratio against the ratio expected for a codominant (1 : 2 : 1) or dominant (3 : 1) locus.
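A goodness-of-fit test of this kind can be sketched in Python as follows. This is a generic chi-square calculation, not the SEGREG code; the function name and the table of 5% critical values are assumptions made for illustration. The observed counts are those of the 30-plant code sequence given for YOMU (9 P1 homozygotes, 14 heterozygotes, 7 P2 homozygotes).

```python
# Upper 5% points of the chi-square distribution for 1 and 2 degrees of freedom.
CRITICAL_5PCT = {1: 3.841, 2: 5.991}


def segregation_chi_square(observed, ratio):
    """Chi-square statistic and degrees of freedom for observed class counts
    tested against an expected ratio such as (1, 2, 1) or (3, 1)."""
    total = sum(observed)
    ratio_sum = sum(ratio)
    expected = [total * part / ratio_sum for part in ratio]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, len(observed) - 1


# Class order: P1 homozygote, heterozygote, P2 homozygote (codominant locus).
chi2, df = segregation_chi_square([9, 14, 7], [1, 2, 1])
significant = chi2 > CRITICAL_5PCT[df]
print(f"chi-square = {chi2:.3f}, df = {df}, distorted at 5% level: {significant}")
```

Here the expected counts are 7.5, 15, and 7.5, giving a chi-square of 0.4 with 2 degrees of freedom, so the 1 : 2 : 1 ratio is not rejected.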

LINKAGE calculates chi-square values for linkage and the maximum LOD score for all pairs of loci. Here LOD stands for logarithm of odds: it is the common logarithm of the likelihood ratio for the observed segregation frequencies at a pair of loci. When linkage is significant at the 5% level and the maximum LOD is 1.30 or higher, LINKAGE further calculates the recombination value and its standard error by the maximum likelihood method. The results are recorded in a disk file.
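The quantities involved can be sketched for the simplest case. The example below assumes a backcross in which recombinant and parental gametes can be counted directly, so that the maximum likelihood estimate is the observed recombinant fraction; F2 data generally need an iterative solution, which is not shown. The function name is hypothetical and this is not the LINKAGE code. Note that a LOD of 1.30 corresponds to odds of about 20 : 1 in favour of linkage.

```python
import math


def backcross_linkage(recombinants, total):
    """Recombination value, its standard error, and the maximum LOD score
    for a pair of loci scored in a backcross (assumed case)."""
    r = recombinants / total

    def log10_likelihood(p):
        # Binomial log-likelihood without the constant term; 0 * log(0) is taken as 0.
        value = 0.0
        if recombinants:
            value += recombinants * math.log10(p)
        if total - recombinants:
            value += (total - recombinants) * math.log10(1.0 - p)
        return value

    # Recombination values above 0.5 are treated as free recombination.
    r_capped = min(r, 0.5)
    lod = log10_likelihood(r_capped) - log10_likelihood(0.5)
    se = math.sqrt(r * (1.0 - r) / total)
    return r, se, lod


r, se, lod = backcross_linkage(recombinants=10, total=100)
print(f"r = {r:.3f} +/- {se:.3f}, LOD = {lod:.2f}, passes LOD >= 1.30: {lod >= 1.30}")
```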

GROUP groups RFLP loci by sequentially adding to a group the locus that shows the smallest recombination value with it. A group formed in this way may therefore correspond to part or all of a linkage group.
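One possible reading of this procedure is sketched below in Python. The details are assumptions rather than a description of the GROUP program: a group is grown by repeatedly adding the unassigned locus with the smallest recombination value to any current member, growth stops when that smallest value exceeds a cut-off (`max_r`, a hypothetical parameter), and pairs without an estimate are treated as unlinked (0.5).

```python
def group_loci(loci, rec, max_r=0.30):
    """Greedy grouping of loci.

    rec maps frozenset({a, b}) to the recombination value of loci a and b;
    pairs absent from rec are treated as unlinked (0.5).
    """
    unassigned = list(loci)
    groups = []
    while unassigned:
        group = [unassigned.pop(0)]  # seed a new group with the next free locus
        while unassigned:
            # For each candidate, take its smallest recombination value to the group.
            best_r, best = min(
                (min(rec.get(frozenset((cand, member)), 0.5) for member in group), cand)
                for cand in unassigned
            )
            if best_r > max_r:
                break  # nothing left that is close enough to this group
            group.append(best)
            unassigned.remove(best)
        groups.append(group)
    return groups


# Hypothetical two-point results for five loci.
rec = {
    frozenset(("A", "B")): 0.05,
    frozenset(("B", "C")): 0.10,
    frozenset(("A", "C")): 0.14,
    frozenset(("D", "E")): 0.08,
}
groups = group_loci(["A", "B", "C", "D", "E"], rec)
print(groups)
```

With these made-up values, loci A, B, and C form one group and D and E another.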

ORDER determines the most likely order and relative positions of the loci within a group on the basis of multi-point linkage data. When the number of loci is large, sequential application of ordinary three-point analysis cannot give the correct order, since it provides no information as to which sets of three loci should be chosen for comparison. On the other hand, determining the order by comparing LOD scores for all possible orders would require an enormous amount of computation and is practically impossible when the number of loci exceeds 10. For the computer package MAPMAKER, Lander et al. (1987) proposed choosing 6 markers first, computing maps for every possible order of them, and then placing the remaining marker(s) relative to this 6-marker framework. However, the order determined by this method depends on which loci are chosen for the first set of 6 and on the sequence in which the remaining loci are inserted.
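The size of the exhaustive search can be made concrete with a short calculation. Assuming that an order and its mirror image are counted as the same map, n loci have n!/2 distinct orders; the function name below is illustrative.

```python
import math


def distinct_orders(n_loci):
    """Number of locus orders when an order and its reverse are equivalent."""
    if n_loci < 2:
        return 1
    return math.factorial(n_loci) // 2


for n in (6, 10, 20):
    print(n, distinct_orders(n))
```

Six loci give 360 orders and ten loci already give 1,814,400, which is consistent with exhaustive comparison being feasible for a 6-marker framework but not for much larger sets.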

Torgerson's (1952) metric multidimensional scaling (MDS) method was therefore adopted in ORDER. The MDS method provides the relative positions of elements in a space of a small number of dimensions from the distances between all pairs of elements. Since RFLP loci are arranged linearly on a chromosome and the genetic distance between loci is expressed as map distance (cM), applying MDS to RFLP linkage data is expected to give useful information about the relative positions and order of the loci within a linkage group. The MDS method allows the most likely order of RFLP loci to be computed automatically even when there are as many as 50 loci. It also makes it possible to detect misclassification, if any, in the segregation data. The dimensionality of the space needed to account for the multipoint linkage relationships of RFLP loci is very small: one for a short chromosome segment and two for a long chromosome. The map distance is calculated from the recombination value according to Kosambi's (1944) mapping function. The set of map distances for all possible pairs of loci is used as the distance matrix for the MDS method.
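The approach can be sketched in plain Python under stated assumptions. Kosambi's function is used in its standard form, d = 25 ln((1 + 2r)/(1 - 2r)) cM. The scaling step follows the classical Torgerson procedure (double-centring the squared distances and taking the leading eigenvector), here found by power iteration and restricted to the first axis only. Function names, the iteration count, and the test positions are hypothetical; this is not the ORDER program.

```python
import math


def kosambi_cm(r):
    """Kosambi map distance in cM for a recombination value r < 0.5."""
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


def mds_first_axis(dist, iterations=200):
    """Coordinates of the elements on the leading axis of classical MDS.

    dist is a symmetric matrix of pairwise distances (not all zero).
    """
    n = len(dist)
    sq = [[d * d for d in row] for row in dist]
    row_mean = [sum(row) / n for row in sq]
    grand = sum(row_mean) / n
    # Double-centred matrix B = -1/2 * J * D^2 * J.
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand)
          for j in range(n)] for i in range(n)]
    # Power iteration, started from the row of B with the largest diagonal entry.
    start = max(range(n), key=lambda i: b[i][i])
    v = b[start][:]
    eigval = 0.0
    for _ in range(iterations):
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        eigval = sum(w[i] * v[i] for i in range(n))  # Rayleigh quotient
        v = w
    norm = math.sqrt(sum(x * x for x in v))
    return [math.sqrt(max(eigval, 0.0)) * x / norm for x in v]


# Hypothetical loci listed in scrambled order, with made-up positions in cM.
true_pos = {"C": 25.0, "A": 0.0, "D": 45.0, "B": 10.0}
names = list(true_pos)
# Recombination values implied by those positions (inverse of Kosambi's function).
rec = [[0.5 * math.tanh(abs(true_pos[a] - true_pos[b]) / 50.0) for b in names]
       for a in names]
dist = [[kosambi_cm(r) for r in row] for row in rec]
coords = mds_first_axis(dist)
order = [name for _, name in sorted(zip(coords, names))]
print(order, [round(c, 2) for c in coords])
```

Because a map has no inherent direction, the recovered order may come out as A, B, C, D or as its reverse. With real data the distances are not exactly additive, and a second axis, as mentioned above for long chromosomes, would need the next eigenvector as well.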

GRAPH illustrates the RFLP linkage map, conventional linkage map, and cytological map for each chromosome. Any region(s) of a chromosome containing genes that show an abnormal segregation ratio are also marked on the illustrated chromosome.

LBLOCK shows graphical genotypes on all chromosomes for each plant. Crossover points on a chromosome for each plant are also shown.

LOD calculates the LOD score for all sets of a given number (9 at maximum) of RFLPs located consecutively on the same chromosome. The program is used to check the order of RFLP loci determined by the MDS method.

MAPL can be used by anyone, even without prior computer experience. Program execution requires only two floppy disks, one for the programs and the other for data recording and output. MAPL has been used successfully to construct a detailed RFLP linkage map of rice from F2 segregation data provided by a group of researchers at the National Institute of Agro-environmental Sciences. The programs are available upon request. Anyone who wants a copy of MAPL is requested to send two blank 5-inch high-density floppy disks to the first author.

References

Kosambi, D. D., 1944. The estimation of map distance from recombination values. Ann. Eugen. 12: 505-525.

Lander, E. S., P. Green, J. Abrahamson, A. Barlow, M. J. Daly, S. E. Lincoln and L. Newburg, 1987. MAPMAKER: An interactive computer package for constructing primary genetic linkage maps of experimental and natural populations. Genomics 1: 174-181.

McCouch, S. R., G. Kochert, Z. H. Yu, Z. Y. Wang, G. S. Khush, W. R. Coffman and S. D. Tanksley, 1988. Molecular mapping of rice chromosomes. Theor. Appl. Genet. 76: 815-829.

Torgerson, W. S., 1952. Multidimensional scaling: I. Theory and method. Psychometrika 17: 401-419.