C. 
Report of the Committee on Genetic Engineering
(Functional Genomics of Plants)
R. Wu, Convener

 
Department of Molecular Biology & Genetics, Cornell University, Ithaca, NY 14853, USA

 
     One ultimate goal of many plant scientists is to gain in-depth knowledge of the function of each gene in a given plant. Traditional approaches involve studying plant genes one at a time. Since there are at least 20,000 genes in any given plant species, it is believed that the functions of all these genes can be more readily determined by using plant genomics, in which the functions of many genes can be studied in parallel.
For convenience of discussion, a plant genomics project can be arbitrarily divided into three phases. Phase I involves mapping the genome by genetic and physical methods. Phase II includes cloning and sequencing all, or at least most, of the genes. Phase III entails determining the function of each gene. Phase III is not dependent on the completion of either Phase I or II work.
     Phase III work can be further divided into three steps. Step one is to construct a complete insertional-mutagenesis library for a large number of plants of a given species, with the hope of disrupting each gene separately. Step two is to determine the DNA sequence of one or both flanking regions of the inserted DNA and, in some cases, the chromosomal location of the disrupted gene as well. Step three is to identify the function of each disrupted gene by correlating it with a mutant phenotype, followed by detailed analyses, including attempts to obtain revertants or using complementation tests. Several additional approaches are needed to study the function of a gene, especially if there is no obvious phenotypic change. In this review article, I will cover mainly step one of Phase III research.
     For Arabidopsis and rice, the work on Phase I was completed by mapping with a large number of RFLP, RAPD and/or SSLP markers (see Pejic et al. 1998 for a review), as well as constructing large contigs of YAC or BAC clones that span most of the genomes. Phase II work is expected to be completed in mid-2000 for Arabidopsis and around 2003 for rice, when most of the DNA sequence information is expected to have been obtained. Thus, the major tasks for genomics research over at least the next 25-50 years will be to correlate data on the DNA sequence of each gene with biological functions.
     Work related to step one of Phase III, functional genomics of Arabidopsis, was started ten years ago, when a partial T-DNA insertional-mutagenesis library with approximately 8,000 insertional mutants was made (Feldmann et al. 1989, and subsequent work by the same group). Some work on steps two and three was also initiated in 1989. However, progress has been slow. Thus, new methods and strategies are required to expedite the progress.
     In work related to step two of Phase III, flanking sequences of T-DNA inserts can be determined by using TAIL PCR (Liu et al. 1995) or suppression PCR (Schupp et al. 1998), followed by sequence analysis. The chromosomal location of a gene-disrupted mutant can be determined by RFLP, RAPD, AFLP (Pejic et al. 1998), or SSLP (see Ponce et al. 1999 for a review).
     In work related to step three of Phase III, several methods involving cell biology and molecular biology techniques are being used to determine the functions of genes. However, additional methods need to be established. To assist with functional analysis, the expression levels of all cDNAs in a library can be determined by using the microarray method. In this method, small amounts of DNA samples from thousands of cDNAs are applied onto a microchip. Four to ten DNA microchips would therefore include all of the cDNAs in a library. Then, mRNAs are isolated from different tissues of a given plant and separately hybridized to several identical copies of DNA chips to determine tissue-specific expression (see Pennisi 1998 for a review). Alternatively, a plant is subjected to different abiotic stresses, or challenged with pathogens. Total mRNAs are isolated after each treatment and separately hybridized to identical copies of DNA chips to determine stress-induced gene expression, or pathogen-induced gene expression, in different plant tissues.
     In all three phases of a plant genome project, large sets of data are generated. Therefore, extensive computational analyses are required to handle these large data sets.
     For step one of Phase III work, over 100,000 insertional mutant plant lines (T-DNA-tagged mutants) have been produced to date by different groups of investigators. Since a T-DNA insertional library is a random library, approximately 100,000 insertional mutants (a 4-fold redundancy over the estimated 25,000 genes in Arabidopsis) are needed to cover most of the genome, about 98%, on a statistical basis. It is very time-consuming to produce a large number of insertional mutants, since each one comes from a separate successful transformation event.
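     The 4-fold redundancy and 98% coverage figures above can be checked with a minimal sketch. It assumes, as a simplification not stated in the text, that every insertion lands in exactly one gene and that all genes are equally likely targets; the function names are hypothetical.

```python
import math


def expected_coverage(n_mutants: int, n_genes: int) -> float:
    """Expected fraction of genes hit at least once.

    Assumes each insertion disrupts one gene chosen uniformly at random
    (a simplifying assumption; real insertions are not uniform).
    """
    return 1.0 - (1.0 - 1.0 / n_genes) ** n_mutants


def mutants_for_coverage(n_genes: int, coverage: float) -> int:
    """Smallest number of random insertions giving the requested coverage."""
    return math.ceil(math.log(1.0 - coverage) / math.log(1.0 - 1.0 / n_genes))


if __name__ == "__main__":
    genes = 25_000  # estimated gene number for Arabidopsis, from the text
    print(f"coverage with 100,000 mutants: {expected_coverage(100_000, genes):.3f}")
    print(f"mutants needed for 98% coverage: {mutants_for_coverage(genes, 0.98)}")
```

     Under these assumptions, a 4-fold redundancy leaves a fraction of roughly exp(-4), or about 2%, of the genes untouched, which is consistent with the 98% figure.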
     In principle, the number of insertional mutants should be proportional to the genome size, not the number of genes. If the process of generating insertional mutants is random and, if on average, there is a gene every 5 kb in the plant genome, one needs 24,000 mutants for Arabidopsis, which has a genome size of 1.2 x 10^8 bp. By allowing a margin of safety, most scientists would try to obtain a 10-fold redundancy, which means 240,000 mutants are needed. By using the same calculations and allowing a 10-fold redundancy, one needs 800,000 mutants for rice (genome size 4 x 10^8 bp), and 4,800,000 for maize (genome size 2.4 x 10^9 bp).
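     The arithmetic in this paragraph can be reproduced with a short sketch. The genome sizes, the 5 kb gene spacing and the 10-fold redundancy are taken from the text; the function and constant names are hypothetical.

```python
# Genome sizes in base pairs, as quoted in the text.
GENOME_SIZE_BP = {
    "Arabidopsis": 120_000_000,  # 1.2 x 10^8 bp
    "rice": 400_000_000,         # 4 x 10^8 bp
    "maize": 2_400_000_000,      # 2.4 x 10^9 bp
}

GENE_SPACING_BP = 5_000  # one gene every 5 kb on average (assumption from the text)


def mutants_needed(genome_bp: int, redundancy: int = 10,
                   spacing_bp: int = GENE_SPACING_BP) -> int:
    """Number of random insertional mutants for a given fold redundancy."""
    minimum = genome_bp // spacing_bp  # one insertion per 5 kb interval
    return minimum * redundancy


if __name__ == "__main__":
    for species, size in GENOME_SIZE_BP.items():
        print(f"{species}: {mutants_needed(size):,} mutants at 10-fold redundancy")
```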
     A second type of insertional library makes use of a transposable element, such as Ac/Ds of maize, to produce a relatively small number (such as 1,000) of primary anchor mutants. After crossing a Ds-containing plant with an Ac-containing plant, a large number of secondary insertional mutants can be generated from each primary mutant after transposition. One major advantage of this method is that from 1,000 anchor plant lines, over 200,000 secondary insertional-mutant plant lines can be generated without the need of additional time-consuming transformation steps (Hehl and Baker 1989; Bancroft and Dean 1993). The Ac/Ds system was improved by using enhancer-trap and gene-trap plasmids to transform Arabidopsis. This allows disrupted genes, which are non-phenotypic, to be detected by the expression of a reporter gene (such as Gus). Eventually, results from enhancer-trap and gene-trap experiments can be used to infer gene function. So far, this type of insertional-mutant library includes fewer than 10,000 Ac/Ds-tagged plant lines (Sundaresan et al. 1995; Martienssen 1998). Therefore, many additional plant lines are needed to cover approximately 98% of the genome. Another advantage of Ac/Ds-tagged plants over the T-DNA-tagged plants is that revertants can be obtained more easily.
     K. Shimamoto’s group has published several papers on using the Ac/Ds system to produce insertional mutant rice plants (Izawa et al. 1997; Enoki et al. 1999). These plants are simple insertional mutant lines without gene- or enhancer-trap features. So far, approximately 6,000 mutant plant lines have been produced, demonstrating that Ac can be used efficiently for functional analysis of the rice genome. However, a much larger population is needed to cover the entire rice genome. C.-d. Han’s group produced several hundred enhancer/gene trap Ds-vector or Ac/Ds-vector transformed mutant rice plant lines. This group plans to generate over 10,000 mutant plant lines over the next several years (Chin et al. 1999). The advantage of using the Ac/Ds system in rice, especially when enhancer-trap or gene-trap features are included, is that one can start with a mutant plant line which shows a phenotypic change and try to identify the gene that is responsible for it by using the flanking sequence adjacent to the Ds element, following the classical forward genetics approach.
     Recently, two abstracts reported production of Ac/Ds insertional mutants, which include enhancer-trap features. Greco et al. (1999) constructed plasmids carrying the maize transposon systems Ac-Ds and En-I (Spm) as enhancer and activation traps and transformed Japonica rice varieties with them. Having demonstrated transposition activity using these plasmids, this group is now developing transposon-tagged rice plant lines for functional genomics. Another group reported preliminary results on the genetic transformation of Basmati rice with Ac/Ds transposons for isolation of important genes (Dhaliwal et al. 1999).
     A third type of insertional-mutant library is produced by making use of an endogenous transposon, such as the Ac transposon or Mutator in maize (see references quoted in Hehl and Baker 1989; Walbot 1992; Bensen et al. 1995; Martienssen 1998), or the rice transposon Tos17 (Hirochika et al. 1996). Tos17 transposes only during the tissue-culture stage; thus, transposition events can be controlled. So far, the Tos17-based insertional mutant library has fewer than 9,000 rice lines (Sato et al. 1999). In this approach, there are usually multiple copies of the transposable element, often as many as 5 to 20 copies per plant. The major use of this type of mutant library is to determine the function of a gene by the reverse genetics approach. This is done by identifying a mutant plant line with PCR-based screening of the entire mutant library using two primers: one primer sequence derived from a specific gene of interest and the other derived from a portion of the endogenous transposon sequence (Ballinger and Benzer 1989; Bensen et al. 1995; Sato et al. 1999). Even though it is difficult to obtain revertants to confirm the specific assignment of a mutation with a given phenotype, one can use complementation tests to confirm an assigned gene function. As discussed earlier, by including a margin of safety of 10-fold redundancy, perhaps 800,000 mutant plant lines are needed. Of course, if a mutant plant line includes an average of five copies of the endogenous transposon, the number of mutant plant lines can be reduced fivefold.
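     The effect of multiple transposon copies per plant on the required library size can be illustrated with a small sketch. It assumes, hypothetically, that each copy in a line counts as one independent insertion at a distinct site; the 800,000 figure and the 5 to 20 copies per plant come from the text, while the function name is illustrative.

```python
def lines_needed(total_insertions: int, copies_per_line: int) -> int:
    """Plant lines required when each line carries several independent insertions.

    Assumes every copy is a separate, independent insertion (a simplification).
    """
    if copies_per_line < 1:
        raise ValueError("copies_per_line must be at least 1")
    # Integer ceiling division, so a partial line still counts as a whole line.
    return -(-total_insertions // copies_per_line)


if __name__ == "__main__":
    target = 800_000  # insertions for 10-fold redundancy in rice, from the text
    for copies in (1, 5, 20):
        print(f"{copies} copies per line: {lines_needed(target, copies):,} lines")
```

     With an average of five copies per line, this gives 160,000 lines, the fivefold reduction mentioned above.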
     The advantage of the above approach is that a large number of plant lines can be screened relatively quickly using PCR. The disadvantage is that a mutant plant line identified with a gene-specific primer, in the hope of learning more about the function of that gene, may not show any phenotype. Since in Arabidopsis and rice, so far, only a low percentage of insertional mutant plant lines give identifiable phenotypic changes, a considerable amount of effort may be consumed without learning the function of the gene of interest.
     All of the insertional mutant libraries described above have been constructed based on random insertions of a DNA (a T-DNA, an endogenous transposon, or a plasmid that contains a transposable element that can be introduced and result in transposition) into the plant genome. In other words, the insertional mutagenesis libraries are produced by a “shotgun”-type approach because the site of insertion is presumed to be random. In shotgun libraries, one usually needs to include a 4-fold redundancy in order to cover most of the genome on a statistical basis. Since, in fact, insertions are not random, one needs to include perhaps a 10-fold redundancy. It is very labor-intensive to produce and analyze approximately 800,000 insertional mutant plant lines in rice. Thus, a new method needs to be discovered to produce an insertional mutant library using a systematic approach. In this case, only a slight redundancy (perhaps 50%) would need to be included to cover most of the genome. Thus, in rice, only 120,000 insertional mutants would be needed. If a systematic approach can be devised to produce insertional mutant libraries, it would save a great deal of time and labor in both the construction and subsequent analyses of plant lines.
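     The comparison between the shotgun and systematic library sizes for rice can be written out as a short sketch. The genome size, the 5 kb gene spacing, the 10-fold shotgun redundancy and the 50% extra margin are all taken from the text; the names below are hypothetical, and the systematic method itself remains a proposal rather than an existing technique.

```python
RICE_GENOME_BP = 400_000_000  # 4 x 10^8 bp, as quoted in the text
GENE_SPACING_BP = 5_000       # one gene every 5 kb on average

# Minimum number of insertions if each 5 kb interval were hit exactly once.
MIN_INSERTIONS = RICE_GENOME_BP // GENE_SPACING_BP


def library_size(min_insertions: int, fold_redundancy: float) -> int:
    """Library size for a given fold redundancy over the minimum."""
    return round(min_insertions * fold_redundancy)


shotgun = library_size(MIN_INSERTIONS, 10.0)    # random insertion, 10-fold
systematic = library_size(MIN_INSERTIONS, 1.5)  # hypothetical, 50% extra

if __name__ == "__main__":
    print(f"shotgun library:    {shotgun:,} lines")
    print(f"systematic library: {systematic:,} lines")
    print(f"lines saved:        {shotgun - systematic:,}")
```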

 
References
Ballinger, D.G. and S. Benzer, 1989. Targeted gene mutations in Drosophila. Proc. Natl. Acad. Sci. USA 86: 9402-9406.
Bancroft, I. and C. Dean, 1993. Transposition pattern of the maize Ds in Arabidopsis thaliana. Genetics 134: 1211-1229.
Bensen, R.J., G.S. Johal, V.C. Crane, J.T. Tossberg, P.S. Schnable, R.B. Meeley and S.P. Briggs, 1995. Cloning and characterizing of the maize An1 gene. Plant Cell 7: 75-84.
Chin, H.G., M.S. Choe, S.H. Lee, S.H. Park, S.H. Park, J.C. Kao, N.Y. Kim, J.J. Lee, B.G. Oh, G.H. Yi, S.C. Kim, H.C. Choi, M.J. Cho and C.-d. Han, 1999. Molecular analysis of rice plants harboring an Ac/Ds transposable element-mediated gene trapping system. Plant J. 19: 615-623.
Dhaliwal, H.S., B. Singh, V.K. Gupta, R. Sharma and H. Uchimiya, 1999. Genetic transformation of Basmati rice with Ac/Ds transposons for isolation of important genes. In General Meeting of Int’l Program on Rice Biotechnology, September 20-24, 1999, Phuket, Thailand, p. 186.
Enoki, H., T. Izawa, M. Kawahara, M. Komatsu, S. Koh, J. Kyozuka and K. Shimamoto, 1999. Ac as a tool for functional genomics of rice. Plant J. 19: 605-613.
Feldmann, K.A., M.D. Marks, M.L. Christianson and R.S. Quatrano, 1989. A dwarf mutant of Arabidopsis generated by T-DNA insertion mutagenesis. Science 243: 1351-1354.
Greco, R., P. Ouwerkerk, H. Rage and A. Pereira, 1999. Functional genomics with transposons in rice. In General Meeting of Int’l Program on Rice Biotechnology, September 20-24, 1999, Phuket, Thailand, p. 42.
Hehl, R. and B. Baker, 1989. Induced transposition of Ds by a stable Ac in crosses of transgenic tobacco plants. Mol. Gen. Genet. 217: 53-59.
Hirochika, H., 1999. Insertional mutagenesis of rice using endogenous retrotransposons. Plant and Animal Genome VIII, San Diego, p. 66.
Izawa, T., T. Ohnishi, T. Nakano, N. Ishida, H. Enoki, H. Hashimoto, K. Itoh, R. Terada, C. Wu, C. Miyazaki, T. Endo, S. Iida and K. Shimamoto, 1997. Transposon tagging in rice. Plant Molec. Biol. 35: 219-229.
Liu, Y.G., N. Mitsukawa, T. Oosumi and R.F. Whittier, 1995. Efficient isolation and mapping of Arabidopsis thaliana T-DNA insert junctions by thermal asymmetric interlaced PCR. Plant J. 8: 457-463.
Martienssen, R.A., 1998. Functional genomics: probing plant gene function and expression with transposons. Proc. Natl. Acad. Sci. USA 95: 2021-2026.
Pejic, I., P. Ajmone-Marsan and M. Motto, 1998. Comparative analysis of genetic similarity among maize inbred lines detected by RFLPs, RAPDs, SSRs and AFLPs. Theor. Appl. Genet. 97: 1248-1255.
Pennisi, E., 1998. Sifting through and making sense of genome sequences. Science 280: 1692-1693.
Ponce, M.R., P. Robles and J.L. Micol, 1999. High-throughput genetic mapping in Arabidopsis thaliana. Mol. Gen. Genet. 261: 408-415.
Sato, Y., N. Sentoku, Y. Miura, H. Hirochika, H. Kitano and M. Matsuoka, 1999. Loss-of-function mutations in the rice homeobox gene OSH15 affect the architecture of internodes resulting in dwarf plants. EMBO J. 18: 992-1002.
Schupp, J.M., L.R. Price, A. Kleytska and P. Keim, 1998. Internal and flanking sequence from AFLP fragments using ligation-mediated suppression PCR. BioTechniques 26: 905-912.
Sundaresan, V., P. Springer, T. Volpe, S. Howard, J.D.G. Jones, C. Dean, H. Ma and R. Martienssen, 1995. Patterns of gene action in plant development revealed by enhancer-trap and gene-trap transposable elements. Genes & Develop. 9: 1797-1810.
Walbot, V., 1992. Strategies for mutagenesis and gene cloning using transposon tagging and T-DNA insertional mutagenesis. Annu. Rev. Plant Physiol. Plant Mol. Biol. 43: 49-82.