This report describes the protocol used to align the Maize_ESTcluster_PlantGDB features to tigrv4-genome.
Mon Apr 24 16:38:01 2006


Source of Maize_ESTcluster_PlantGDB: a set of EST clusters and singletons from the Gramene markers database, originally downloaded from the PlantGDB website.
http://www.plantgdb.org/download/Download/Sequence/ESTcontig/Zea_mays/Zea_mays.PUT.fasta.bz2

Alignment procedure details 
--------------------------- 

129494 Maize_ESTcluster_PlantGDB features were aligned to tigrv4-genome using blat with the parameter -minIdentity=50, followed by pslReps with -singleHit. The resulting alignments were then filtered using the procedure described below, which is applied in general to 'CrossSpecies-Coding' data sets.
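
The two commands in this step might be assembled roughly as follows. This is a minimal sketch, not the pipeline's actual script: the file names are hypothetical placeholders, and only the options -minIdentity=50 and -singleHit come from this report. The commands are built and printed, not executed.

```python
import shlex

def build_alignment_commands(genome_fa, est_fa, raw_psl, best_psl, psr):
    """Return the blat and pslReps command lines as argument lists."""
    # blat usage: blat database query [options] output.psl
    blat_cmd = ["blat", genome_fa, est_fa, "-minIdentity=50", raw_psl]
    # pslReps usage: pslReps [options] in.psl out.psl out.psr
    pslreps_cmd = ["pslReps", "-singleHit", raw_psl, best_psl, psr]
    return blat_cmd, pslreps_cmd

# Hypothetical file names, for illustration only.
blat_cmd, pslreps_cmd = build_alignment_commands(
    "tigrv4_genome.fa", "Zea_mays.PUT.fasta",
    "raw.psl", "best.psl", "best.psr")

for cmd in (blat_cmd, pslreps_cmd):
    print(shlex.join(cmd))
```

Each list can be passed to subprocess.run(cmd, check=True) once the real input paths are known.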

Initial summary
# alignments : 71317
# unique Features these alignments represent: 68170
% of total features these alignments represent : 52.64 %
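
The percentage above is the number of unique features divided by the 129494 features submitted for alignment. A small sketch of that arithmetic (the function name is illustrative, not part of the pipeline):

```python
TOTAL_FEATURES = 129494  # features submitted to blat, as stated in this report

def percent_of_total(unique_features, total=TOTAL_FEATURES):
    """Share of all input features that have at least one alignment, in %."""
    return round(100.0 * unique_features / total, 2)

print(percent_of_total(68170))  # 52.64
```

The same function gives the percentages reported after each later filtering step when passed the corresponding unique-feature count.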

The lengths of the matches are distributed as follows
Hit_Length	# alignments
--------	--------
100	 12353
150	 8405
200	 8087
250	 7183
300	 6198
350	 5460
400	 4591
450	 3681
500	 2921
550	 2436
600	 1854
650	 1438
700	 1207
750	 948
800	 740
10000	 3815
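
A table like the one above can be produced by assigning each hit length to a bin. The sketch below assumes that each Hit_Length label is an inclusive upper bound and that the last bin (10000) collects everything longer than 800 bp; the report does not state this, so both points are assumptions.

```python
from bisect import bisect_left
from collections import Counter

# Bin labels copied from the table above.
BIN_EDGES = list(range(100, 801, 50)) + [10000]

def bin_label(length, edges=BIN_EDGES):
    """Return the label of the first bin whose upper bound is >= length."""
    i = bisect_left(edges, length)
    # Assumption: anything beyond the last edge falls into the last bin.
    return edges[min(i, len(edges) - 1)]

def length_histogram(lengths, edges=BIN_EDGES):
    """Return (label, count) pairs in the same order as the table."""
    counts = Counter(bin_label(n, edges) for n in lengths)
    return [(edge, counts.get(edge, 0)) for edge in edges]

# Made-up hit lengths, for illustration only.
for label, count in length_histogram([87, 100, 101, 150, 433, 2500]):
    if count:
        print(label, count, sep="\t")
```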

Alignments with matches shorter than 150 bp are deleted
# remaining Alignments : 50726
# unique Features these remaining alignments represent: 48730
% of total features these alignments represent : 37.63 %
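
The length filter could be sketched as below. It assumes that "matches" refers to the first column of the PSL output (the matches column) and that the feature name is the qName column; the report does not say which length measure was used, so treat both as assumptions. The demo rows and feature names are synthetic.

```python
MIN_MATCH_BP = 150  # threshold stated in this report

def psl_rows(lines):
    """Yield PSL data rows as field lists, skipping header or malformed lines."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) == 21 and fields[0].isdigit():
            yield fields

def keep_long_matches(lines, min_match=MIN_MATCH_BP):
    """Keep rows whose matches column (PSL column 1) is at least min_match."""
    return [f for f in psl_rows(lines) if int(f[0]) >= min_match]

def summarize(rows):
    """Return (# alignments, # unique features); qName is PSL column 10."""
    return len(rows), len({f[9] for f in rows})

def fake_row(matches, qname):
    """Build a synthetic 21-column PSL row for the demo (values are made up)."""
    fields = ["0"] * 21
    fields[0], fields[8], fields[9], fields[13] = str(matches), "+", qname, "chr1"
    return "\t".join(fields)

demo = [fake_row(120, "PUT-1"), fake_row(150, "PUT-2"),
        fake_row(480, "PUT-2"), fake_row(149, "PUT-3")]
kept = keep_long_matches(demo)
print(summarize(kept))  # (2, 1)
```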

Frequency distribution of the remaining features
# hits	# features
--------	--------
1	 47524
2	 878
3	 146
4	 63
5	 51
6	 34
8	 16
9	 7
10	 7
20	 4
30	 0
40	 0
50	 0
100	 0

Features that hit more than three times are deleted.
# remaining Alignments : 49718
# unique Features these remaining alignments represent: 48548
% of total features these alignments represent : 37.49 %
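
The multi-hit filter amounts to counting alignments per feature and dropping every alignment of a feature that exceeds the limit. A minimal sketch, assuming alignments are held as (feature_id, alignment) pairs (a hypothetical representation), followed by a cross-check against the frequency table in this report:

```python
from collections import Counter

MAX_HITS = 3  # features with more hits than this are removed entirely

def drop_multi_hit(alignments, max_hits=MAX_HITS):
    """Remove all alignments of any feature that has more than max_hits hits."""
    alignments = list(alignments)
    hits = Counter(feature_id for feature_id, _ in alignments)
    return [(fid, aln) for fid, aln in alignments if hits[fid] <= max_hits]

# Made-up example: feature "c" has four hits and is removed.
toy = [("a", 1), ("b", 1), ("b", 2), ("c", 1), ("c", 2), ("c", 3), ("c", 4)]
print(len(drop_multi_hit(toy)))  # 3

# Cross-check with the "# hits / # features" table reported above.
hits_table = {1: 47524, 2: 878, 3: 146, 4: 63, 5: 51,
              6: 34, 8: 16, 9: 7, 10: 7, 20: 4}
kept_features = sum(n for h, n in hits_table.items() if h <= MAX_HITS)
print(kept_features)  # 48548
```

The second result agrees with the number of unique features reported after this step.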

% Identity distribution of the remaining alignments
% Identity	# alignments
--------	--------
10	 0
20	 1
30	 2
40	 7
50	 19
60	 86
70	 389
80	 4303
90	 35599
95	 8883
100	 429

Following is the distribution of gaps
Gaps	# features
--------	--------
1000	 36406
2000	 7251
3000	 3058
4000	 1188
5000	 510
6000	 267
7000	 155
8000	 113
9000	 57
10000	 48

Following is the final summary
# alignments : 49718
# unique Features these alignments represent: 48548
% of total features these alignments represent : 37.49 %