49. BLAST analysis of the terminal sequences of cloned markers from the genetic map of rice
S. Constantino, A. Resurreccion, B. Albano, J.-A. Champoux, C. Villareal, G.S. Khush and J. Bennett
Division of Plant Breeding, Genetics and Biochemistry, International Rice Research Institute, MCPO 3127, Makati City 1271, Philippines
Functional genomics seeks to assign
biological function to the sequences of genes and intergenic regions. This
assignment may be accomplished by forward genetics, reverse genetics or
a variety of ancillary approaches. Experience shows that a large minority
(25-42%) of anonymous gene sequences from a plant such as rice have sufficient
sequence similarity to already characterized genes from rice or other organisms
that a reasonable inference can be made as to the class of protein encoded
by the gene (Uchimiya et al. 1992, Yamamoto and Sasaki 1997, Harushima
et al. 1998). While this sort of analysis usually falls short of establishing
biological function, it can provide important clues, especially when combined
with the location of the gene on a genetic or physical map of rice (Harushima
et al. 1998). It is for this reason that we have attempted to provide sequence
data for 350 of the markers of the genetic map established principally
by workers from Cornell University (McCouch et al. 1988, Causse et al.
1994). These data supplement the large database assembled from sequenced
markers of the genetic map developed by the Rice Genome Research Project
at Tsukuba (Harushima et al. 1998).
Terminal sequencing was conducted
principally on RZ clones (cDNA clones derived from RNA of etiolated
leaves of IR36) and RG clones (genomic clones derived from IR36 DNA). Some
barley (BCD) and oat (CDO) cDNA clones were also sequenced. The names and
map locations of these clones are presented in Robeniol et al. (1996),
which also describes the manual sequencing procedure. Both ends of each
clone were sequenced to allow PCR-based amplification of longer DNA segments
than is possible with data derived from the single-pass sequencing conducted
by Uchimiya et al. (1992) and Harushima et al. (1998). The nucleotide sequences
and the deduced amino acid sequences of each terminus were then compared
to the GenBank databases using
programs of the Basic Local Alignment Search Tool (BLAST)
(Altschul et al. 1990). Table 1 shows hits with the BLAST-N program accessed
through the Entrez Web site at www.ncbi.nlm.nih.gov.
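As a minimal sketch of how a deduced amino acid sequence can be obtained from a terminal read before a protein-level comparison, the following Python translates a nucleotide sequence in all six reading frames using the standard genetic code. The function names and the example read are hypothetical and are not taken from the procedure of Robeniol et al. (1996); the searches reported here were run with the NCBI BLAST programs, not with this code.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons; codons containing ambiguous bases become 'X'."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translation(seq):
    """Translate a read in three forward (+1..+3) and three reverse (-1..-3) frames."""
    rc = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(seq[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


# Hypothetical 12-base read, used only to illustrate the call.
example_read = "ATGGCTAAGTGA"
for frame, peptide in six_frame_translation(example_read).items():
    print(frame, peptide)
```

Because the inserts were ligated in random orientation, all six frames of each terminus would need to be considered; translated BLAST programs perform this step internally.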
A total of 76 clones, representing all
twelve rice chromosomes, recorded hits on known sequences. The termini
of each clone were designated as F or R depending on whether they were
sequenced using the universal forward (F) or reverse (R) sequencing primer.
As all inserts had been ligated into the vectors in random orientation,
there was no relationship between the F and R ends and the direction of
transcription of the gene. Most of the hits (71%) were among the RZ clones,
consistent with these clones being from a cDNA library (Causse et al. 1994).
Relatively few hits (24%) were found among the RG clones, which were random
PstI genomic clones. The remainder (5%) were BCD and CDO cDNA clones. In
20% of the 76 clones registering hits, both termini hit on known genes,
and in all of these cases the same class of protein was revealed, even
if the name was different, as in the case of RZ244. For this clone, the
F terminus hit ferric leghemoglobin reductase and the R terminus hit lipoamide
dehydrogenase but these proteins have the same function and in soybean
nodules are probably the same protein. For the remaining 80% of clones,
only the F or R terminus hit on a known gene. The usual reason for the
lack of homology at the other terminus was that the terminus corresponded
to a poorly conserved region of the gene such as the 3’-untranslated region.
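The reported percentages can be checked against whole-clone counts. The sketch below assumes 54 RZ, 18 RG and 4 BCD/CDO clones; these counts are back-calculated from the percentages and the total of 76 and are not stated in the text, so they should be read as one consistent possibility and not as reported data.

```python
# Total number of clones with hits, as stated in the text.
TOTAL_CLONES_WITH_HITS = 76

# ASSUMPTION: counts inferred from the reported 71%, 24% and 5%.
assumed_counts = {"RZ": 54, "RG": 18, "BCD/CDO": 4}


def percent_share(counts):
    """Return each class's share of the total, rounded to a whole percent."""
    total = sum(counts.values())
    return {name: round(100 * n / total) for name, n in counts.items()}


shares = percent_share(assumed_counts)

# Approximate number of clones with hits at both termini (20% of 76).
both_termini = round(0.20 * TOTAL_CLONES_WITH_HITS)
one_terminus = TOTAL_CLONES_WITH_HITS - both_termini

print(shares)
print(both_termini, one_terminus)
```

Under these assumed counts the rounded shares match the figures in the text, and about 15 of the 76 clones would have hits at both termini.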
Most of the hits were to rice and
other plant genes and were unequivocal, including 19 hits on genes known
only from dicotyledonous plants. Three hits were to non-plant genes (maize
dwarf mottle virus coat protein, Caenorhabditis dolichol monophosphate
mannose synthase, human isovaleryl CoA dehydrogenase). Further sequencing
would be required to confirm these particular hits. The hits included several
cases in which more than one member of a gene family was revealed. Alpha-tubulin
was found on chromosomes 11 and 3, ferredoxin III on chromosomes 2 and 3,
and cytosolic glyceraldehyde 3-phosphate dehydrogenase (GAPDH) on chromosomes
4 (twice) and 8. Translational elongation factors were found on chromosomes
2, 3 and 12. As further genes of known function are isolated and deposited
in the public databases, the number of hits registered by the 350 clone
markers listed by Robeniol et al. (1996) is expected to increase from the
76 recorded here. In addition, the current hits will be examined more closely
to determine their patterns of expression and their biological function
in terms of specific traits.
Acknowledgement
This research was supported in part by grants from the Rockefeller
Foundation and the German Federal Ministry for Economic Cooperation (BMZ).
References
Altschul, S.F., W. Gish, W. Miller, E.W. Myers and
D.J. Lipman, 1990. Basic local alignment search tool. J. Mol. Biol. 215:
403-410.
Causse, M.A., T.M. Fulton, Y.G. Cho, S.N. Ahn, J. Chunwongse,
K. Wu, J. Xiao, Z. Yu, P.C. Ronald, S.B. Harrington, G.A. Second, S.R. McCouch
and S.D. Tanksley, 1994. Saturated molecular map of the rice genome based
on an interspecific backcross population. Genetics 138: 1251-1274.
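Harushima, Y., M. Yano, A. Shomura, M. Sato, T. Shimano, Y. Kuboki,
T. Yamamoto, S.Y. Lin, B.A. Antonio, A. Parco, H. Kajiya, N. Huang,
K. Yamamoto, Y. Nagamura, N. Kurata, G.S. Khush and T. Sasaki, 1998.
A high-density rice genetic linkage map with 2275 markers using
a single F2 population. Genetics 148: 479-494.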
McCouch, S.R., G. Kochert, Z.H. Yu, Z.Y. Wang, G.S. Khush,
W.R. Coffman and S.D. Tanksley, 1988. Molecular mapping of rice chromosomes.
Theor. Appl. Genet. 76: 815-829.
Robeniol, J.A., S.V. Constantino, A.P. Resurreccion, C.P.
Villareal, B. Ghareyazie, B.R. Lu, S.K. Katiyar, C.A.
Menguito, E.R. Angeles, H.-Y. Fu, S. Reddy, W. Park, S.R.
McCouch, G.S. Khush and J. Bennett, 1996.
Sequence-tagged sites and low-cost DNA markers for rice.
Proceedings of the Third International Rice
Genetics Symposium. p. 293-306.
Uchimiya, H., S. Kiduo, T. Shimazaki, S. Aotsuka, S. Takamatsu,
R. Nishi, H. Hashimoto, Y. Matsubayashi, N. Kiduo, M. Umeda and A. Kato,
1992. Random sequencing of cDNA libraries reveals a variety of expressed
genes in cultured cells of rice (Oryza sativa L.). Plant Journal. 106:1241-1255.
Yamamoto, K. and T. Sasaki, 1997. Large scale EST sequencing
in rice. Plant Mol. Biol. 35: 135-144.