This reports the protocol used to align the Rice_ESTcluster_PlantGDB features to tigrv4-genome.
Mon Apr 24 16:41:35 2006


Source of Rice_ESTcluster_PlantGDB: a set of EST clusters and singletons from the Gramene markers database, originally downloaded from the PlantGDB website.
http://www.plantgdb.org/download/Download/Sequence/ESTcontig/Oryza_sativa/Oryza_sativa.PUT.fasta.bz2

Alignment procedure details 
--------------------------- 

141239 Rice_ESTcluster_PlantGDB features were aligned to tigrv4-genome using blat with the parameter -minScore=120, followed by pslReps with -singleHit. The resulting alignments were then filtered as described below; this filtering procedure is applied in general to 'Coding-SameSpecies' data sets.
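
The two commands named above can be sketched as follows. This is only an illustration: the file names and the helper build_commands are hypothetical, the argument order follows the usual usage of blat (database, query, output) and pslReps (in.psl, out.psl, out.psr), and only the options -minScore=120 and -singleHit come from this report.

```python
import shlex


def build_commands(genome_fa, est_fa, prefix):
    """Build argv lists for the blat and pslReps steps (hypothetical file names)."""
    raw_psl = f"{prefix}.psl"        # all blat alignments
    best_psl = f"{prefix}.best.psl"  # alignments kept by pslReps
    psr = f"{prefix}.psr"            # repeat summary written by pslReps
    blat_cmd = ["blat", genome_fa, est_fa, "-minScore=120", raw_psl]
    reps_cmd = ["pslReps", "-singleHit", raw_psl, best_psl, psr]
    return blat_cmd, reps_cmd


blat_cmd, reps_cmd = build_commands(
    "tigrv4-genome.fa", "Oryza_sativa.PUT.fasta", "rice_est"
)
# Print the commands instead of running them; pass each list to
# subprocess.run(..., check=True) to execute.
print(shlex.join(blat_cmd))
print(shlex.join(reps_cmd))
```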

Initial summary
# alignments : 125243
# unique Features these alignments represent: 118987
% of total features these alignments represent : 84.25 %
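
The percentage in this summary is the number of unique features with at least one alignment divided by the total number of input features. A minimal check of that arithmetic, using only the counts reported here (the variable names are illustrative):

```python
TOTAL_FEATURES = 141239    # input Rice_ESTcluster_PlantGDB features
aligned_features = 118987  # unique features with at least one alignment

percent_aligned = 100 * aligned_features / TOTAL_FEATURES
print(f"{percent_aligned:.2f} %")  # 84.25 %
```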

The following is the distribution of the feature coverage 
% coverage	# alignments
--------	--------
9	 16
19	 253
29	 830
39	 1742
49	 2734
59	 5608
69	 9134
79	 8861
89	 14551
90	 2495
91	 2779
92	 3293
93	 3964
94	 4742
95	 6109
96	 7182
97	 8599
98	 9600
99	 9145
100	 23601

Alignments with less than 95 % coverage are deleted
# remaining Alignments : 58243
# unique Features these remaining alignments represent: 56220
% of total features these alignments represent : 39.80 %
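
A coverage filter of this kind can be sketched as below. The report does not define how coverage is computed, so the formula here (aligned query span divided by query length, using PSL-style qStart, qEnd and qSize) is an assumption, and the records are made-up examples rather than data from this run.

```python
from typing import NamedTuple


class Alignment(NamedTuple):
    q_name: str   # feature (query) name
    q_size: int   # feature length in bp
    q_start: int  # start of the aligned region on the feature
    q_end: int    # end of the aligned region on the feature


def coverage_percent(aln):
    # Assumed definition: fraction of the feature spanned by the alignment.
    return 100.0 * (aln.q_end - aln.q_start) / aln.q_size


def filter_by_coverage(alignments, min_coverage=95.0):
    # Keep alignments whose coverage is at least min_coverage percent.
    return [a for a in alignments if coverage_percent(a) >= min_coverage]


# Made-up example records (hypothetical names and coordinates).
examples = [
    Alignment("est_a", 1000, 0, 1000),  # 100 % coverage, kept
    Alignment("est_b", 1000, 0, 950),   # 95 % coverage, kept
    Alignment("est_c", 1000, 100, 900), # 80 % coverage, deleted
]
kept = filter_by_coverage(examples)
print([a.q_name for a in kept])  # ['est_a', 'est_b']
```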

Gap distribution of the remaining features
Gaps (bp)	# alignments
--------	--------
1000	 40219
2000	 8242
3000	 4411
4000	 2268
5000	 1102
6000	 654
7000	 373
8000	 279
9000	 189
10000	 115
20000	 266

Alignments with gaps > 4000 bp are deleted
# remaining Alignments : 55140
# unique Features these remaining alignments represent: 53196
% of total features these alignments represent : 37.66 %
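
The remaining-alignment count can be cross-checked against the gap table. The sketch below assumes each bin label is the upper bound of its gap range in bp; under that reading, summing the bins up to 4000 gives the count reported after the filter.

```python
# Gap distribution copied from the table above: upper bound (bp) -> # alignments.
gap_bins = {
    1000: 40219, 2000: 8242, 3000: 4411, 4000: 2268, 5000: 1102, 6000: 654,
    7000: 373, 8000: 279, 9000: 189, 10000: 115, 20000: 266,
}
MAX_GAP = 4000  # alignments with gaps > 4000 bp are deleted

kept_alignments = sum(n for upper, n in gap_bins.items() if upper <= MAX_GAP)
print(kept_alignments)  # 55140
```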

% Identity distribution of the remaining features
% Identity	# alignments
--------	--------
90	 0
91	 0
92	 0
93	 0
94	 61
95	 462
96	 1050
97	 2526
98	 6745
99	 29222
100	 15074

Frequency distribution of the remaining features
# hits	# features
--------	--------
1	 52387
2	 478
3	 123
4	 66
5	 37
6	 25
8	 47
9	 14
10	 10
20	 4
30	 2
40	 1
50	 1
100	 1

Features that hit more than four times are deleted.
# remaining Alignments : 53976
# unique Features these remaining alignments represent: 53054
% of total features these alignments represent : 37.56 %
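
The final counts follow from the hit-frequency table: features with one to four hits are kept, and each contributes as many alignments as it has hits. A short check using only the numbers reported above (variable names are illustrative):

```python
# Hit-frequency distribution copied from the table above: # hits -> # features.
hit_bins = {
    1: 52387, 2: 478, 3: 123, 4: 66, 5: 37, 6: 25, 8: 47,
    9: 14, 10: 10, 20: 4, 30: 2, 40: 1, 50: 1, 100: 1,
}
MAX_HITS = 4  # features that hit more than four times are deleted
TOTAL_FEATURES = 141239

kept = {hits: n for hits, n in hit_bins.items() if hits <= MAX_HITS}
features_kept = sum(kept.values())
alignments_kept = sum(hits * n for hits, n in kept.items())

print(alignments_kept)  # 53976
print(features_kept)    # 53054
print(f"{100 * features_kept / TOTAL_FEATURES:.2f} %")  # 37.56 %
```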