This report describes the protocol used to align the Sorghum_ESTcluster_PlantGDB features to tigrv4-genome.
Mon Apr 24 16:39:26 2006


Source of Sorghum_ESTcluster_PlantGDB: a set of EST clusters and singletons from the Gramene markers database, originally downloaded from the PlantGDB website.
http://www.plantgdb.org/download/Download/Sequence/ESTcontig/Sorghum_bicolor/Sorghum_bicolor.PUT.fasta.bz2

Alignment procedure details 
--------------------------- 

46755 Sorghum_ESTcluster_PlantGDB features are aligned to tigrv4-genome using blat with the parameter -minIdentity=50, followed by PslReps with -singleHit. The alignments are then filtered by the procedure described below, which is applied in general to 'CrossSpecies-Coding' data sets.
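
The two-step alignment above could be scripted roughly as follows. This is only a sketch: the file names are hypothetical placeholders, and only the -minIdentity=50 and -singleHit options come from this report. The function just builds the command lines; actually running them requires blat and pslReps to be installed.

```python
def build_alignment_commands(genome_fa, query_fa, raw_psl, best_psl, psr_out):
    """Return the blat and pslReps command lines as argument lists.

    All five arguments are file paths chosen by the caller (hypothetical here).
    """
    # blat takes: database, query, options, output PSL file
    blat_cmd = ["blat", genome_fa, query_fa, "-minIdentity=50", raw_psl]
    # pslReps takes: options, input PSL, output PSL, output PSR
    pslreps_cmd = ["pslReps", "-singleHit", raw_psl, best_psl, psr_out]
    return blat_cmd, pslreps_cmd


blat_cmd, pslreps_cmd = build_alignment_commands(
    "tigrv4_genome.fa",           # assumed genome FASTA name
    "Sorghum_bicolor.PUT.fasta",  # decompressed PlantGDB download
    "sorghum_raw.psl",
    "sorghum_best.psl",
    "sorghum_best.psr",
)
print(" ".join(blat_cmd))
print(" ".join(pslreps_cmd))
```

Each list can be passed to subprocess.run(cmd, check=True) to execute the step.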

Initial summary
# alignments : 29667
# unique Features these alignments represent: 28010
% of total features these alignments represent : 59.91 %
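
The percentage in this summary is the number of unique features with at least one alignment divided by the 46755 input features. A minimal check of that arithmetic (the function name is an illustrative choice, not part of the pipeline):

```python
TOTAL_FEATURES = 46755  # input Sorghum_ESTcluster_PlantGDB features


def percent_of_total(unique_features, total_features=TOTAL_FEATURES):
    """Percentage of input features represented, rounded to two decimals."""
    return round(100.0 * unique_features / total_features, 2)


initial_pct = percent_of_total(28010)
print(initial_pct)  # 59.91
```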

The lengths of the matches are distributed as follows
Hit_Length	# alignments
--------	--------
100	 3334
150	 3063
200	 3328
250	 3205
300	 3023
350	 2743
400	 2479
450	 1957
500	 1531
550	 1127
600	 795
650	 637
700	 456
750	 368
800	 336
10000	 1285

Alignments with matches shorter than 150 bp are deleted
# remaining Alignments : 23345
# unique Features these remaining alignments represent: 22291
% of total features these alignments represent : 47.68 %
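
The length filter above can be sketched as follows. The record layout (feature id, match length) and the feature ids are assumptions made for illustration; the report does not say which alignment field is used as the match length.

```python
MIN_MATCH_BP = 150  # threshold stated in this report


def filter_by_length(alignments, min_len=MIN_MATCH_BP):
    """Keep alignments whose match length is at least min_len bp."""
    return [(fid, length) for fid, length in alignments if length >= min_len]


def summarize(alignments):
    """Return (# alignments, # unique features)."""
    return len(alignments), len({fid for fid, _ in alignments})


# Toy data with hypothetical feature ids, not taken from the real data set.
toy = [("PUT-1", 420), ("PUT-1", 149), ("PUT-2", 150), ("PUT-3", 90)]
kept = filter_by_length(toy)
print(kept)             # [('PUT-1', 420), ('PUT-2', 150)]
print(summarize(kept))  # (2, 2)
```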

Frequency distribution of the remaining features
# hits	# features
--------	--------
1	 21749
2	 353
3	 64
4	 40
5	 35
6	 31
8	 6
9	 5
10	 4
20	 4
30	 0
40	 0
50	 0
100	 0

Features that hit more than three times are deleted.
# remaining Alignments : 22647
# unique Features these remaining alignments represent: 22166
% of total features these alignments represent : 47.41 %
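
The counts above follow directly from the frequency table: only features with 1, 2 or 3 hits are kept. The sketch below shows one way to apply such a filter and reproduces the reported totals from the table. The (feature id, match length) record layout and the function name are assumptions for illustration.

```python
from collections import Counter

MAX_HITS = 3  # features with more hits than this are dropped


def drop_multi_hit_features(alignments, max_hits=MAX_HITS):
    """Remove every alignment of any feature that has more than max_hits hits."""
    hits = Counter(fid for fid, _ in alignments)
    return [a for a in alignments if hits[a[0]] <= max_hits]


# Rows of the frequency table that survive the filter: {# hits: # features}
kept_rows = {1: 21749, 2: 353, 3: 64}
kept_features = sum(kept_rows.values())
kept_alignments = sum(n_hits * n_feat for n_hits, n_feat in kept_rows.items())
print(kept_features, kept_alignments)  # 22166 22647
```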

% Identity distribution of the remaining alignments
% Identity	# alignments
--------	--------
10	 0
20	 0
30	 0
40	 6
50	 13
60	 49
70	 230
80	 1868
90	 15761
95	 4503
100	 217

Following is the distribution of gaps
Gaps	# features
--------	--------
1000	 16746
2000	 3135
3000	 1308
4000	 474
5000	 213
6000	 114
7000	 72
8000	 54
9000	 42
10000	 19

Following is the final summary
# alignments : 22647
# unique Features these alignments represent: 22166
% of total features these alignments represent : 47.41 %