This reports the protocol used to align the Wheat_ESTcluster_PlantGDB features to tigrv4-genome.
Mon Apr 24 16:39:58 2006

Source of Wheat_ESTcluster_PlantGDB: this is a set of EST clusters and singletons from the Gramene markers database, originally downloaded from the PlantGDB website.
http://www.plantgdb.org/download/Download/Sequence/ESTcontig/Triticum_aestivum/Triticum_aestivum.PUT.fasta.bz2

Alignment procedure details
---------------------------
242257 Wheat_ESTcluster_PlantGDB features were aligned to tigrv4-genome using blat with the parameter -minIdentity=50, followed by PslReps with -singleHit. This was followed by the filtering procedure described below, which is applied in general to 'CrossSpecies-Coding' data sets.

Initial summary
# alignments                                        : 143196
# unique Features these alignments represent        : 132945
% of total features these alignments represent      : 54.88 %

The lengths of the matches are distributed as follows

Hit_Length  # alignments
----------  ------------
       100         18588
       150         18373
       200         19474
       250         18285
       300         16177
       350         13120
       400         10300
       450          7573
       500          5702
       550          3951
       600          3131
       650          2152
       700          1595
       750          1245
       800           879
     10000          2651

Alignments with matches of less than 150 bp are deleted.

# remaining Alignments                                    : 106597
# unique Features these remaining alignments represent    : 98648
% of total features these alignments represent            : 40.72 %

Frequency distribution of the remaining features

# hits    # features
--------  ----------
       1       94110
       2        2938
       3         757
       4         359
       5         211
       6         170
       8          84
       9           5
      10           7
      20           7
      30           0
      40           0
      50           0
     100           0

Features that hit more than three times are deleted.
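
The two filters described above (a minimum match length, then a maximum number of hits per feature) can be sketched in Python as follows. This is only an illustration of the stated rules, not the script that produced this report: the `Alignment` record, its `feature` and `matches` fields, and the `filter_alignments` function are hypothetical names, and the sketch assumes that "match length" means the number of matching bases reported for each alignment.

```python
from collections import Counter
from typing import NamedTuple


class Alignment(NamedTuple):
    feature: str  # name of the aligned EST cluster (hypothetical field name)
    matches: int  # matching bases in this alignment (assumed meaning of "match length")


def filter_alignments(alignments, min_match=150, max_hits=3):
    """Drop short alignments, then drop every feature that still hits too often."""
    # Step 1: alignments with fewer than `min_match` matching bases are deleted.
    long_enough = [a for a in alignments if a.matches >= min_match]
    # Step 2: count the surviving hits per feature.
    hits = Counter(a.feature for a in long_enough)
    # Step 3: features with more than `max_hits` hits are deleted entirely.
    return [a for a in long_enough if hits[a.feature] <= max_hits]


demo = [
    Alignment("est1", 400),  # kept
    Alignment("est1", 90),   # too short
    Alignment("est2", 200),  # est2 has four long hits, so all four are dropped
    Alignment("est2", 210),
    Alignment("est2", 220),
    Alignment("est2", 230),
    Alignment("est3", 149),  # just under the length threshold
]
print(filter_alignments(demo))
```

In this toy input only the 400 bp hit of `est1` survives. Hits are counted after the length filter, which matches the report's wording that the frequency distribution is taken over the remaining features. The counts obtained on the real data after the hit filter follow.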
# remaining Alignments                                    : 102257
# unique Features these remaining alignments represent    : 97805
% of total features these alignments represent            : 40.37 %

% Identity distribution of the remaining features

% Identity  # features
----------  ----------
        10           0
        20           2
        30           1
        40           8
        50          30
        60         106
        70         706
        80        6443
        90       63169
        95       28804
       100        2988

Following is the distribution of gaps

Gaps      # features
--------  ----------
    1000       81523
    2000       12279
    3000        3669
    4000        1153
    5000         554
    6000         303
    7000         296
    8000         190
    9000         122
   10000          90

Following is the final summary
# alignments                                        : 102257
# unique Features these alignments represent        : 97805
% of total features these alignments represent      : 40.37 %
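
As a cross-check, the final counts and the percentages in the summaries can be recomputed from the numbers already given in this report. The sketch below is plain arithmetic on those figures; the variable and function names are illustrative only, and only the rows for 1, 2 and 3 hits are used because those are the features that pass the hit-count filter.

```python
TOTAL_FEATURES = 242257  # Wheat_ESTcluster_PlantGDB features submitted to blat

# Rows of the hit-frequency table that pass the filter: hits per feature -> # features.
kept_hit_rows = {1: 94110, 2: 2938, 3: 757}


def percent_of_total(n_features, total=TOTAL_FEATURES):
    """Share of all input features, rounded to two decimals as in the report."""
    return round(100.0 * n_features / total, 2)


# Each kept feature is counted once, but contributes one alignment per hit.
kept_features = sum(kept_hit_rows.values())
kept_alignments = sum(hits * n for hits, n in kept_hit_rows.items())

print("features kept    :", kept_features)
print("alignments kept  :", kept_alignments)
print("initial  % total :", percent_of_total(132945))
print("length   % total :", percent_of_total(98648))
print("final    % total :", percent_of_total(kept_features))
```

The recomputed values agree with the report: 97805 features and 102257 alignments remain, corresponding to 54.88 %, 40.72 % and 40.37 % of the 242257 input features at the three stages.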