This reports the protocol used to align the Sorghum_EST features to tigrv4-genome.
Fri Apr 14 18:08:18 2006

Source of Sorghum_EST : from the Gramene markers database, originally downloaded from GenBank with query 'txid4557[orgn] AND gbdiv_est[PROP]'

Alignment procedure details
---------------------------
232284 Sorghum_EST were aligned to tigrv4-genome using blat with blat parameters -minIdentity=50 followed by PslReps with -singleHit. This was followed by the filtering procedure described below, which is applied in general to 'CrossSpecies-Coding' data sets.

Initial summary
# alignments : 186281
# unique Features these alignments represent: 167835
% of total features these alignments represent : 72.25 %

The lengths of the matches are distributed as follows

Hit_Length   # alignments
--------     --------
100          15789
150          17971
200          21693
250          23017
300          23152
350          22088
400          20020
450          15732
500          10804
550          7553
600          4334
650          2462
700          1118
750          331
800          141
10000        76

Alignments with matches less than 150 bp are deleted
# remaining Alignments : 152888
# unique Features these remaining alignments represent: 136371
% of total features these alignments represent : 58.71 %

Frequency distribution of the remaining features

# hits       # features
--------     --------
1            129181
2            4008
3            806
4            938
5            592
6            409
8            148
9            64
10           129
20           96
30           0
40           0
50           0
100          0

Features with more than three hits are deleted.
# remaining Alignments : 139615
# unique Features these remaining alignments represent: 133995
% of total features these alignments represent : 57.69 %

% Identity distribution of the remaining features

% Identity   # features
--------     --------
10           0
20           0
30           0
40           6
50           29
60           131
70           929
80           9154
90           90738
95           37108
100          1520

Following is the distribution of gaps

Gaps         # features
--------     --------
1000         115573
2000         15181
3000         3554
4000         1007
5000         445
6000         330
7000         369
8000         292
9000         138
10000        84

Following is the final summary
# alignments : 139615
# unique Features these alignments represent: 133995
% of total features these alignments represent : 57.69 %
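
The two filtering steps reported above (drop alignments with matches under 150 bp, then drop features with more than three hits) can be sketched in Python as follows. This is only an illustration, not the pipeline's actual code: the (feature_id, match_length) tuple layout, the function names and the toy data are assumptions. The final assertions cross-check figures taken from this report, on the assumption that the "# hits" bins 1, 2 and 3 are exact counts.

```python
from collections import Counter

MIN_MATCH_BP = 150  # report: alignments with matches less than 150 bp are deleted
MAX_HITS = 3        # report: features that hit more than three times are deleted


def filter_alignments(alignments, min_match=MIN_MATCH_BP, max_hits=MAX_HITS):
    """Apply both filters to (feature_id, match_length) tuples (hypothetical layout)."""
    # Step 1: keep only alignments whose match length is at least min_match.
    long_enough = [a for a in alignments if a[1] >= min_match]
    # Step 2: count the remaining hits per feature and drop every alignment
    # of a feature that still has more than max_hits hits.
    hits = Counter(feature_id for feature_id, _ in long_enough)
    return [a for a in long_enough if hits[a[0]] <= max_hits]


def summarize(alignments, total_features):
    """Return (# alignments, # unique features, % of total features)."""
    features = {feature_id for feature_id, _ in alignments}
    pct = round(100.0 * len(features) / total_features, 2)
    return len(alignments), len(features), pct


# Toy input (made up): est3 has four long hits and is dropped entirely;
# est1 loses its short hit, est2 has no hit that is long enough.
toy = [("est1", 400), ("est1", 120), ("est2", 90),
       ("est3", 200), ("est3", 210), ("est3", 220), ("est3", 230),
       ("est4", 150)]
kept = filter_alignments(toy)
print(summarize(kept, total_features=5))

# Cross-check against the report: features with 1, 2 or 3 hits.
hit_histogram = {1: 129181, 2: 4008, 3: 806}
assert sum(hit_histogram.values()) == 133995
assert sum(n * count for n, count in hit_histogram.items()) == 139615
assert round(100.0 * 133995 / 232284, 2) == 57.69
```

The cross-check shows that the final summary is consistent with the hit-frequency table: the features with one to three hits account for exactly the 133995 features and 139615 alignments that remain.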