This reports the protocol used to align the Sorghum_ESTcluster_PlantGDB features to tigrv4-genome.
Mon Apr 24 16:39:26 2006

Source of Sorghum_ESTcluster_PlantGDB: this is a set of EST clusters and singletons from the Gramene markers database, originally downloaded from the PlantGDB website.
http://www.plantgdb.org/download/Download/Sequence/ESTcontig/Sorghum_bicolor/Sorghum_bicolor.PUT.fasta.bz2

Alignment procedure details
---------------------------
46755 Sorghum_ESTcluster_PlantGDB features were aligned to tigrv4-genome using blat with the parameter -minIdentity=50, followed by pslReps with -singleHit. The alignments were then filtered as described below; this filtering procedure is applied in general to 'CrossSpecies-Coding' data sets.

Initial summary
# alignments : 29667
# unique Features these alignments represent : 28010
% of total features these alignments represent : 59.91 %

The lengths of the matches are distributed as follows
Hit_Length   # alignments
--------     --------
100          3334
150          3063
200          3328
250          3205
300          3023
350          2743
400          2479
450          1957
500          1531
550          1127
600          795
650          637
700          456
750          368
800          336
10000        1285

Alignments with matches shorter than 150 bp are deleted.
# remaining Alignments : 23345
# unique Features these remaining alignments represent : 22291
% of total features these alignments represent : 47.68 %

Frequency distribution of the remaining features
# hits       # features
--------     --------
1            21749
2            353
3            64
4            40
5            35
6            31
8            6
9            5
10           4
20           4
30           0
40           0
50           0
100          0

Features that hit more than three times are deleted.
# remaining Alignments : 22647
# unique Features these remaining alignments represent : 22166
% of total features these alignments represent : 47.41 %

% Identity distribution of the remaining features
% Identity   # features
--------     --------
10           0
20           0
30           0
40           6
50           13
60           49
70           230
80           1868
90           15761
95           4503
100          217

Following is the distribution of gaps
Gaps         # features
--------     --------
1000         16746
2000         3135
3000         1308
4000         474
5000         213
6000         114
7000         72
8000         54
9000         42
10000        19

Following is the final summary
# alignments : 22647
# unique Features these alignments represent : 22166
% of total features these alignments represent : 47.41 %
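
The two filtering steps in this report (drop alignments with matches shorter than 150 bp, then drop features that still hit more than three times) can be sketched in Python as below. This is only an illustrative sketch, not the script that produced the report: the `Alignment` record, its field names, and the function names are hypothetical, and it assumes hit counts are taken after the length filter, as the order of the tables suggests.

```python
from collections import Counter
from typing import NamedTuple


class Alignment(NamedTuple):
    """Hypothetical minimal alignment record (not the real PSL layout)."""
    feature: str    # name of the aligned feature (EST cluster or singleton)
    match_len: int  # length of the match in bp


def filter_alignments(alignments, min_match=150, max_hits=3):
    """Apply the two filters described in the report, in order."""
    # Step 1: delete alignments whose match is shorter than min_match bp.
    long_enough = [a for a in alignments if a.match_len >= min_match]
    # Step 2: count remaining hits per feature and delete every
    # feature that hits more than max_hits times.
    hits = Counter(a.feature for a in long_enough)
    return [a for a in long_enough if hits[a.feature] <= max_hits]


def summarize(alignments, total_features):
    """Return (# alignments, # unique features, % of total features)."""
    n_features = len({a.feature for a in alignments})
    percent = round(100.0 * n_features / total_features, 2)
    return len(alignments), n_features, percent


# Toy input: "c" has four long hits and is dropped; "d" is too short.
toy = [
    Alignment("a", 100), Alignment("a", 200), Alignment("b", 150),
    Alignment("c", 300), Alignment("c", 310), Alignment("c", 320),
    Alignment("c", 330), Alignment("d", 149),
]
kept = filter_alignments(toy)
toy_summary = summarize(kept, total_features=4)

# Consistency check against the numbers printed in the report:
# features with 1, 2 and 3 hits survive the hit-count filter.
surviving = {1: 21749, 2: 353, 3: 64}
report_features = sum(surviving.values())
report_alignments = sum(h * n for h, n in surviving.items())
report_percent = round(100.0 * report_features / 46755, 2)
```

With the report's own frequency table, the check reproduces the final summary: 22166 features, 22647 alignments, and 47.41 % of the 46755 input features.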