This reports the protocol used to align the Rice_ESTcluster_PlantGDB features to tigrv4-genome.
Mon Apr 24 16:41:35 2006

Source of Rice_ESTcluster_PlantGDB: this is a set of EST clusters and singletons from the Gramene markers database, originally downloaded from the PlantGDB website.
http://www.plantgdb.org/download/Download/Sequence/ESTcontig/Oryza_sativa/Oryza_sativa.PUT.fasta.bz2

Alignment procedure details
---------------------------

141239 Rice_ESTcluster_PlantGDB features were aligned to tigrv4-genome using blat with the parameter -minScore=120, followed by pslReps with -singleHit. The alignments were then passed through the filtering procedure described below, which is applied in general to 'Coding-SameSpecies' data sets.

Initial summary
# alignments                                        : 125243
# unique Features these alignments represent        : 118987
% of total features these alignments represent      : 84.25 %

Distribution of the feature coverage

%coverage    no of alignments
--------     --------
9            16
19           253
29           830
39           1742
49           2734
59           5608
69           9134
79           8861
89           14551
90           2495
91           2779
92           3293
93           3964
94           4742
95           6109
96           7182
97           8599
98           9600
99           9145
100          23601

Alignments with less than 95 % coverage are deleted
# remaining Alignments                                        : 58243
# unique Features these remaining alignments represent        : 56220
% of total features these alignments represent                : 39.80 %

Gap distribution of the remaining features

Gaps         # alignments
--------     --------
1000         40219
2000         8242
3000         4411
4000         2268
5000         1102
6000         654
7000         373
8000         279
9000         189
10000        115
20000        266

Alignments with gaps > 4000 bp are deleted
# remaining Alignments                                        : 55140
# unique Features these remaining alignments represent        : 53196
% of total features these alignments represent                : 37.66 %

% Identity distribution of the remaining features

% Identity   # alignments
--------     --------
90           0
91           0
92           0
93           0
94           61
95           462
96           1050
97           2526
98           6745
99           29222
100          15074

Frequency distribution of the remaining features

# hits       # features
--------     --------
1            52387
2            478
3            123
4            66
5            37
6            25
8            47
9            14
10           10
20           4
30           2
40           1
50           1
100          1

Features that hit more than four times are deleted.
# remaining Alignments                                        : 53976
# unique Features these remaining alignments represent        : 53054
% of total features these alignments represent                : 37.56 %
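
The summary figures above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original pipeline: it recomputes the "% of total features" values from the reported counts and reapplies the final hit-count filter to the reported frequency table. The names (TOTAL_FEATURES, percent_of_total, hit_table, MAX_HITS) are illustrative assumptions; only the numbers come from the report.

```python
# Cross-check of the reported summary numbers (illustrative sketch only).

TOTAL_FEATURES = 141239  # features submitted for alignment, per the report

# Unique-feature counts reported after each stage of the procedure.
stage_features = {
    "initial": 118987,
    "coverage filter": 56220,
    "gap filter": 53196,
    "hit-count filter": 53054,
}


def percent_of_total(n_features, total=TOTAL_FEATURES):
    """Percentage of all input features, rounded to two decimals."""
    return round(100.0 * n_features / total, 2)


# Reported frequency table: number of hits -> number of features.
# Rows above 10 appear to be range bins, so only the small, exact
# rows are used when counting alignments below.
hit_table = {1: 52387, 2: 478, 3: 123, 4: 66, 5: 37, 6: 25, 8: 47,
             9: 14, 10: 10, 20: 4, 30: 2, 40: 1, 50: 1, 100: 1}

MAX_HITS = 4  # "Features that hit more than four times are deleted."

kept = {hits: n for hits, n in hit_table.items() if hits <= MAX_HITS}
kept_features = sum(kept.values())                         # features surviving
kept_alignments = sum(hits * n for hits, n in kept.items())  # their alignments

if __name__ == "__main__":
    for stage, n in stage_features.items():
        print(f"{stage:18s} {n:7d} features  {percent_of_total(n):6.2f} %")
    print("features kept after hit-count filter  :", kept_features)
    print("alignments kept after hit-count filter:", kept_alignments)
```

Under these assumptions the recomputed values agree with the report: the four percentages come out as 84.25, 39.80, 37.66 and 37.56, and the rows with one to four hits account for 53054 features and 53976 alignments.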