This report describes the protocol used to align the Wheat_ESTcluster_PlantGDB features to tigrv4-genome.
Mon Apr 24 16:39:58 2006


Source of Wheat_ESTcluster_PlantGDB: this is a set of EST clusters and singletons from the Gramene markers database, originally downloaded from the PlantGDB website.
http://www.plantgdb.org/download/Download/Sequence/ESTcontig/Triticum_aestivum/Triticum_aestivum.PUT.fasta.bz2

Alignment procedure details 
--------------------------- 

242257 Wheat_ESTcluster_PlantGDB features were aligned to tigrv4-genome using blat with the parameter -minIdentity=50, followed by pslReps with -singleHit. The resulting alignments were then filtered as described below; this filtering procedure is applied in general to 'CrossSpecies-Coding' data sets.
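
The two alignment commands named above could be assembled roughly as in the following Python sketch. It only builds the argument lists and does not run anything. All file names are hypothetical placeholders that do not come from this report, and the argument order assumes the usual usage of the two tools (blat database query [options] output.psl; pslReps [options] in.psl out.psl out.psr).

```python
def build_alignment_commands(genome_fa, est_fa, raw_psl, best_psl, psr_out):
    """Return the blat and pslReps argument lists (nothing is executed)."""
    # blat: genome as database, EST clusters as query, flag from the report
    blat_cmd = ["blat", genome_fa, est_fa, "-minIdentity=50", raw_psl]
    # pslReps: flag from the report; writes kept alignments to best_psl
    # and a companion .psr file to psr_out
    reps_cmd = ["pslReps", "-singleHit", raw_psl, best_psl, psr_out]
    return blat_cmd, reps_cmd


# Hypothetical file names, chosen only for illustration.
blat_cmd, reps_cmd = build_alignment_commands(
    "tigrv4-genome.fa",
    "Triticum_aestivum.PUT.fasta",
    "wheat_raw.psl",
    "wheat_best.psl",
    "wheat_best.psr",
)
print(" ".join(blat_cmd))
print(" ".join(reps_cmd))
```

The lists can be passed to subprocess.run once real paths are substituted.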

Initial summary
# alignments : 143196
# unique Features these alignments represent: 132945
% of total features these alignments represent : 54.88 %
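
The percentage above is the number of unique aligned features divided by the 242257 input features. A minimal Python check of that arithmetic, using only numbers from this report (the function name is an illustrative choice):

```python
TOTAL_FEATURES = 242257    # features submitted for alignment (from the report)
ALIGNED_FEATURES = 132945  # unique features with at least one alignment


def percent_of_total(n_features, total=TOTAL_FEATURES):
    """Percentage of the input features, rounded to two decimals."""
    return round(100.0 * n_features / total, 2)


print(percent_of_total(ALIGNED_FEATURES))  # 54.88
```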

The lengths of the matches are distributed as follows
Hit_Length	# alignments
--------	--------
100	 18588
150	 18373
200	 19474
250	 18285
300	 16177
350	 13120
400	 10300
450	 7573
500	 5702
550	 3951
600	 3131
650	 2152
700	 1595
750	 1245
800	 879
10000	 2651

Alignments with matches shorter than 150 bp are deleted
# remaining Alignments : 106597
# unique Features these remaining alignments represent: 98648
% of total features these alignments represent : 40.72 %
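
The length filter can be sketched in Python as below. This is an illustration under assumptions, not the script that produced this report: alignments are modelled as (feature_id, match_length) tuples, the report does not state how match length is measured, and the toy records are invented.

```python
MIN_MATCH_BP = 150  # threshold stated in the report


def drop_short_alignments(alignments, min_match_bp=MIN_MATCH_BP):
    """Keep alignments whose match length is at least min_match_bp."""
    return [(fid, length) for fid, length in alignments if length >= min_match_bp]


# Invented toy records: (feature_id, match_length in bp)
toy = [("est1", 120), ("est1", 150), ("est2", 149), ("est3", 400)]
kept = drop_short_alignments(toy)
unique_features = {fid for fid, _ in kept}
print(len(kept), len(unique_features))  # 2 2
```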

Frequency distribution of the remaining features
# hits	# features
--------	--------
1	 94110
2	 2938
3	 757
4	 359
5	 211
6	 170
8	 84
9	 5
10	 7
20	 7
30	 0
40	 0
50	 0
100	 0

Features that hit more than three times are deleted.
# remaining Alignments : 102257
# unique Features these remaining alignments represent: 97805
% of total features these alignments represent : 40.37 %
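
The hit-count filter, and a consistency check of the counts above, can be sketched in Python as follows. The tuple layout and function name are assumptions made for illustration. The check uses only the 1-, 2- and 3-hit rows of the frequency table.

```python
from collections import Counter

MAX_HITS = 3  # features with more than three hits are removed


def drop_multi_hit_features(alignments, max_hits=MAX_HITS):
    """Remove every alignment of a feature that has more than max_hits hits."""
    hits = Counter(fid for fid, _ in alignments)
    return [(fid, length) for fid, length in alignments if hits[fid] <= max_hits]


# Rows of the frequency table that survive the filter: hits -> features
hits_table = {1: 94110, 2: 2938, 3: 757}
remaining_features = sum(hits_table.values())
remaining_alignments = sum(h * n for h, n in hits_table.items())
print(remaining_features, remaining_alignments)  # 97805 102257
```

Both totals agree with the counts reported after this filtering step.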

% Identity distribution of the remaining alignments (the counts below sum to 102257, the number of remaining alignments)
% Identity	# alignments
--------	--------
10	 0
20	 2
30	 1
40	 8
50	 30
60	 106
70	 706
80	 6443
90	 63169
95	 28804
100	 2988

Following is the distribution of gaps
Gaps	# features
--------	--------
1000	 81523
2000	 12279
3000	 3669
4000	 1153
5000	 554
6000	 303
7000	 296
8000	 190
9000	 122
10000	 90

Following is the final summary
# alignments : 102257
# unique Features these alignments represent: 97805
% of total features these alignments represent : 40.37 %