49. BLAST analysis of the terminal sequences of cloned markers from 
      the genetic map of rice
      S. Constantino, A. Resurreccion, B. Albano, J.-A. Champoux, 
      C. Villareal, G.S. Khush and J. Bennett
      Division of Plant Breeding, Genetics and Biochemistry,
      International Rice Research Institute, MCPO 3127, 
      Makati City 1271, Philippines

 
     Functional genomics seeks to assign biological function to the sequences of genes and intergenic regions. This assignment may be accomplished by forward genetics, reverse genetics or a variety of ancillary approaches. Experience shows that a large minority (25-42%) of anonymous gene sequences from a plant such as rice have sufficient sequence similarity to already characterized genes from rice or other organisms that a reasonable inference can be made as to the class of protein encoded by the gene (Uchimiya et al. 1992, Yamamoto and Sasaki 1997, Harushima et al. 1998). While this sort of analysis usually falls short of establishing biological function, it can provide important clues, especially when combined with the location of the gene on a genetic or physical map of rice (Harushima et al. 1998). It is for this reason that we have attempted to provide sequence data for 350 of the markers of the genetic map established principally by workers from Cornell University (McCouch et al. 1988, Causse et al. 1994). These data supplement the large database assembled from sequenced markers of the genetic map developed by the Rice Genome Research Project at Tsukuba (Harushima et al. 1998).
     Terminal sequencing was conducted principally on RZ clones (cDNA clones derived from RNA of etiolated leaves of IR36) and RG clones (genomic clones derived from IR36 DNA). Some barley (BCD) and oat (CDO) cDNA clones were also sequenced. The names and map locations of these clones are presented in Robeniol et al. (1996), which also describes the manual sequencing procedure. Both ends of each clone were sequenced to allow PCR-based amplification of longer DNA segments than is possible with data derived from the single-pass sequencing conducted by Uchimiya et al. (1992) and Harushima et al. (1998). The nucleotide sequences and the deduced amino acid sequences of each terminus were then compared to the GenBank databases using programs of the Basic Local Alignment Search Tool (BLAST) (Altschul et al. 1990). Table 1 shows hits with the BLAST-N program accessed through the Entrez Web site at www.ncbi.nlm.nih.gov.
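The step of deducing amino acid sequences from each terminus can be illustrated with a minimal sketch. It assumes the standard genetic code and translates a terminal read in all six reading frames, since the insert orientation and reading frame of a terminus are not known in advance. The function names and the example read are hypothetical and are not taken from the original study, which used the BLAST programs for the actual database comparison.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G (assumption:
# nuclear genes, standard code).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons; unknown codons (e.g. containing N) give 'X'."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame(seq):
    """Translate a read in three forward (+1..+3) and three reverse (-1..-3) frames."""
    frames = {}
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(strand_seq[offset:])
    return frames


if __name__ == "__main__":
    # Hypothetical terminal read, for illustration only.
    for frame, peptide in six_frame("ATGGCCTAA").items():
        print(frame, peptide)
```

Each of the six deduced peptides could then be used as a protein query, while the nucleotide read itself serves as the query for a nucleotide search such as BLAST-N.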
    A total of 76 clones, representing all twelve rice chromosomes, recorded hits on known sequences. The termini of each clone were designated as F or R depending on whether they were sequenced using the universal forward (F) or reverse (R) sequencing primer. As all inserts had been ligated into the vectors in random orientation, there was no relationship between the F and R ends and the direction of transcription of the gene. Most of the hits (71%) were among the RZ clones, consistent with these clones being from a cDNA library (Causse et al. 1994). Relatively few hits (24%) were found among the RG clones, which were random PstI genomic clones. The remainder (5%) were BCD and CDO cDNA clones. In 20% of the 76 clones registering hits, both termini hit on known genes, and in all of these cases the same class of protein was revealed, even if the name was different, as in the case of RZ244. For this clone, the F terminus hit ferric leghemoglobin reductase and the R terminus hit lipoamide dehydrogenase, but these proteins have the same function and in soybean nodules are probably the same protein. For the remaining 80% of clones, only the F or R terminus hit on a known gene. The usual reason for the lack of homology at the other terminus was that the terminus corresponded to a poorly conserved region of the gene such as the 3’-untranslated region.
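The percentages above can be converted back into approximate clone counts. The short sketch below does this arithmetic; the counts it produces are back-calculated from the rounded percentages given in the text and are therefore estimates, not figures reported in the study. The variable and function names are illustrative only.

```python
TOTAL_MARKERS = 350      # markers for which sequence data were sought
CLONES_WITH_HITS = 76    # clones recording hits on known sequences

# Shares of the 76 clones with hits, as stated in the text (rounded).
SHARE_BY_CLASS = {"RZ": 0.71, "RG": 0.24, "BCD/CDO": 0.05}
BOTH_TERMINI_SHARE = 0.20


def implied_count(share, total=CLONES_WITH_HITS):
    """Nearest whole number of clones implied by a rounded percentage."""
    return round(share * total)


# Estimated counts per clone class (assumption: simple rounding).
counts_by_class = {name: implied_count(s) for name, s in SHARE_BY_CLASS.items()}
both_termini = implied_count(BOTH_TERMINI_SHARE)
one_terminus = CLONES_WITH_HITS - both_termini
overall_hit_rate = CLONES_WITH_HITS / TOTAL_MARKERS

if __name__ == "__main__":
    print(counts_by_class)
    print("both termini:", both_termini, "one terminus:", one_terminus)
    print(f"overall hit rate: {overall_hit_rate:.1%}")
```

Under these assumptions the three class estimates sum to 76, and the overall hit rate of 76 out of 350 markers is a little under 22%, slightly below the 25-42% range cited for anonymous rice gene sequences.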
     Most of the hits were to rice and other plant genes and were unequivocal, including 19 hits on genes known only from dicotyledonous plants. Three hits were to non-plant genes (maize dwarf mottle virus coat protein, Caenorhabditis dolichol monophosphate mannose synthase, human isovaleryl CoA dehydrogenase). Further sequencing would be required to confirm these particular hits. The hits included several cases in which more than one member of a gene family was revealed. Alpha-tubulin was found on chromosomes U and 3, ferredoxin III on chromosomes 2 and 3, and cytosolic glyceraldehyde 3-phosphate dehydrogenase (GAPDH) on chromosomes 4 (twice) and 8. Translational elongation factors were found on chromosomes 2, 3 and 12. As further genes of known function are isolated and deposited in the public databases, the number of hits registered by the 350 cloned markers listed by Robeniol et al. (1996) is expected to increase from the 76 recorded here. In addition, the current hits will be examined more closely to determine their patterns of expression and their biological function in terms of specific traits.
Acknowledgement
This research was supported in part by grants from the Rockefeller Foundation and the German Federal Ministry for Economic Cooperation (BMZ).

 
 
References
Altschul, S.F., W. Gish, W. Miller, E.W. Myers and D.J. Lipman, 1990. Basic local alignment search tool. J. Mol. Biol. 215: 403-410.
Causse, M.A., T.M. Fulton, Y.G. Cho, S.N. Ahn, J. Chunwongse, K. Wu, J. Xiao, Z. Yu, P.C. Ronald, S.B. Harrington, G.A. Second, S.R. McCouch and S.D. Tanksley, 1994. Saturated molecular map of the rice genome based on an interspecific backcross population. Genetics 138: 1251-1274.
Harushima, Y., M. Yano, A. Shomura, M. Sato, T. Shimano, Y. Kuboki, T. Yamamoto, S.Y. Lin, B.A. Antonio, A.
Parco, H. Kajiya, N. Huang, K. Yamamoto, Y. Nagamura, N. Kurata, G.S. Khush and T. Sasaki, 1998. A
high-density rice genetic linkage map with 2275 markers using a single F2 population. Genetics 148:
479-494.
McCouch, S.R., G. Kochert, Z.H. Yu, Z.Y. Wang, G.S. Khush, W.R. Coffman and S.D. Tanksley, 1988. Molecular mapping of rice chromosomes. Theor. Appl. Genet. 76: 815-829.
Robeniol, J.A., S.V. Constantino, A.P. Resurreccion, C.P. Villareal, B. Ghareyazie, B.R. Lu, S.K. Katiyar, C.A.
Menguito, E.R. Angeles, H.-Y. Fu, S. Reddy, W. Park, S.R. McCouch, G.S. Khush and J. Bennett, 1996.
Sequence-tagged sites and low-cost DNA markers for rice. Proceedings of the Third International Rice
Genetics Symposium. p. 293-306.
Uchimiya, H., S. Kidou, T. Shimazaki, S. Aotsuka, S. Takamatsu, R. Nishi, H. Hashimoto, Y. Matsubayashi, N. Kidou, M. Umeda and A. Kato, 1992. Random sequencing of cDNA libraries reveals a variety of expressed genes in cultured cells of rice (Oryza sativa L.). Plant Journal. 106:1241-1255.
Yamamoto, K. and T. Sasaki, 1997. Large scale EST sequencing in rice. Plant Mol. Biol. 35: 135-144.