This reports the protocol used to align the Maize_ESTcluster_PlantGDB features to tigrv4-genome.
Mon Apr 24 16:38:01 2006

Source of Maize_ESTcluster_PlantGDB: this is a set of EST clusters and singletons from the Gramene markers database, originally downloaded from the PlantGDB website.
http://www.plantgdb.org/download/Download/Sequence/ESTcontig/Zea_mays/Zea_mays.PUT.fasta.bz2

Alignment procedure details
---------------------------
129494 Maize_ESTcluster_PlantGDB features are aligned to tigrv4-genome using blat with the parameter -minIdentity=50, followed by pslReps with -singleHit. This was followed by a filtering procedure, described below, that is applied in general to 'CrossSpecies-Coding' data sets.

Initial summary
# alignments                                       : 71317
# unique Features these alignments represent       : 68170
% of total features these alignments represent     : 52.64 %

The lengths of the matches are distributed as follows

Hit_Length   # alignments
--------     --------
100          12353
150          8405
200          8087
250          7183
300          6198
350          5460
400          4591
450          3681
500          2921
550          2436
600          1854
650          1438
700          1207
750          948
800          740
10000        3815

Alignments with matches less than 150 bp are deleted
# remaining Alignments                                      : 50726
# unique Features these remaining alignments represent      : 48730
% of total features these alignments represent              : 37.63 %

Frequency distribution of the remaining features

# hits       # features
--------     --------
1            47524
2            878
3            146
4            63
5            51
6            34
8            16
9            7
10           7
20           4
30           0
40           0
50           0
100          0

Features that hit more than thrice are deleted.
# remaining Alignments                                      : 49718
# unique Features these remaining alignments represent      : 48548
% of total features these alignments represent              : 37.49 %

% Identity distribution of the remaining features

% Identity   # features
--------     --------
10           0
20           1
30           2
40           7
50           19
60           86
70           389
80           4303
90           35599
95           8883
100          429

Following is the distribution of gaps

Gaps         # features
--------     --------
1000         36406
2000         7251
3000         3058
4000         1188
5000         510
6000         267
7000         155
8000         113
9000         57
10000        48

Following is the final summary
# alignments                                       : 71317 before filtering, 49718 after
# unique Features these alignments represent       : 48548
% of total features these alignments represent     : 37.49 %
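
The two filtering steps reported above (dropping alignments with matches shorter than 150 bp, then dropping features with more than three hits) and the "% of total features" figures can be sketched in Python as follows. This is only an illustrative sketch, not the code that produced this report: the function names, the (feature_name, match_length) tuple layout, and the assumption that hits are counted after the length filter are all hypothetical.

```python
from collections import Counter

MIN_MATCH_BP = 150  # alignments with matches shorter than this are deleted
MAX_HITS = 3        # features with more hits than this are deleted


def filter_alignments(alignments, min_match=MIN_MATCH_BP, max_hits=MAX_HITS):
    """Apply the two filters to a list of (feature_name, match_length) tuples.

    The tuple layout is an assumption made for this sketch.
    """
    # Step 1: keep only alignments whose match length is at least min_match.
    long_enough = [a for a in alignments if a[1] >= min_match]
    # Step 2: count surviving hits per feature and drop every alignment
    # belonging to a feature that hits more than max_hits times.
    hits = Counter(name for name, _ in long_enough)
    return [a for a in long_enough if hits[a[0]] <= max_hits]


def percent_of_total(n_features, total_features):
    """Percentage of the input feature set, rounded to two decimals."""
    return round(100.0 * n_features / total_features, 2)
```

With the counts from this report, percent_of_total(48548, 129494) gives the final 37.49 figure, and the initial and intermediate percentages (52.64 and 37.63) follow the same way from 68170 and 48730.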