This report describes the protocol used to align the Rice_ESTcluster_PlantGDB features to Oryza_sativa_indica-chromosome-20070724.
Tue Aug 7 18:40:34 2007

Source of Rice_ESTcluster_PlantGDB: a set of EST clusters and singletons downloaded from the PlantGDB website.
http://www.plantgdb.org/download/Download/Sequence/ESTcontig/Oryza_sativa/Oryza_sativa.PUT.fasta.bz2

Alignment procedure details
---------------------------
141239 Rice_ESTcluster_PlantGDB features were aligned to Oryza_sativa_indica-chromosome-20070724 using blat with the parameter -minScore=120, followed by PslReps with -singleHit. This was followed by a filtering procedure, described below, that is applied in general to 'Coding-SameSpecies' data sets.

Initial summary
# alignments                                       : 117657
# unique Features these alignments represent       : 113911
% of total features these alignments represent     : 80.65 %

The following is the distribution of the feature coverage

%coverage    no of alignments
--------     --------
9            23
19           244
29           714
39           1424
49           2531
59           5296
69           8808
79           8683
89           13987
90           2479
91           2683
92           3122
93           3826
94           4877
95           5992
96           7277
97           8504
98           9519
99           10100
100          17567

Alignments with less than 95 % coverage are deleted
# remaining Alignments                                    : 53060
# unique Features these remaining alignments represent    : 51718
% of total features these alignments represent            : 36.62 %

GAP distribution of the remaining features

Gaps         # alignments
--------     --------
1000         36425
2000         7360
3000         3899
4000         1913
5000         997
6000         593
7000         360
8000         240
9000         178
10000        125
20000        509

Alignments with gaps > 4000 bp are deleted
# remaining Alignments                                    : 49597
# unique Features these remaining alignments represent    : 48303
% of total features these alignments represent            : 34.20 %

% Identity distribution of the remaining features

% Identity   # alignments
--------     --------
90           0
91           0
92           0
93           0
94           59
95           448
96           1023
97           2462
98           7607
99           30216
100          7782

Frequency distribution of the remaining features

# hits       # features
--------     --------
1            47207
2            1029
3            40
4            11
5            4
6            2
8            2
9            2
10           1
20           2
30           3
40           0
50           0
100          0

Features that hit more than four times are deleted.
# remaining Alignments                                    : 49429
# unique Features these remaining alignments represent    : 48287
% of total features these alignments represent            : 34.19 %
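
The three filtering steps above can be sketched in Python as follows. This is a minimal illustration, not the script that produced this report: the Alignment record, its field names, and the functions filter_alignments and summarize are hypothetical, and the report does not say how coverage and gap size were derived from the blat output. The sketch assumes that "gaps" refers to the largest single gap in an alignment and that coverage is already expressed as a percentage of the feature length.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Alignment:
    """Hypothetical record for one alignment; field names are assumptions."""
    feature: str     # name of the aligned feature (EST cluster or singleton)
    coverage: float  # percent of the feature covered by the alignment
    max_gap: int     # largest gap in the alignment, in bp (assumed meaning)


# Thresholds taken from the report text.
MIN_COVERAGE = 95.0  # alignments with less than 95 % coverage are deleted
MAX_GAP = 4000       # alignments with gaps > 4000 bp are deleted
MAX_HITS = 4         # features that hit more than four times are deleted


def filter_alignments(alignments):
    """Apply the coverage, gap, and hit-count filters in the reported order."""
    kept = [a for a in alignments if a.coverage >= MIN_COVERAGE]
    kept = [a for a in kept if a.max_gap <= MAX_GAP]
    # Count hits per feature only among alignments that survived so far.
    hits = Counter(a.feature for a in kept)
    return [a for a in kept if hits[a.feature] <= MAX_HITS]


def summarize(alignments, total_features):
    """Return (# alignments, # unique features, % of total features)."""
    n_features = len({a.feature for a in alignments})
    pct = round(100.0 * n_features / total_features, 2)
    return len(alignments), n_features, pct
```

With total_features set to 141239, a surviving set of 48287 unique features gives 34.19 %, which agrees with the final summary above. Counting hits after the coverage and gap filters mirrors the order of the report, where the frequency distribution is computed on the remaining features only.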
Last modified: Thu Sep 13 15:01:03 2007