This reports the protocol used to align the Sugarcane_EST features to tigrv4-genome.
Fri Apr 14 18:15:59 2006

Source of Sugarcane_EST : from Gramene markers database, originally downloaded from GenBank dbEST with query 'saccharum AND gbdiv_est[PROP]'

Alignment procedure details
---------------------------
255964 Sugarcane_EST are aligned to tigrv4-genome using blat with blat parameters -minIdentity=50 followed by PslReps with -singleHit. This was followed by a filtering procedure described below and applied in general to 'CrossSpecies-Coding' data sets.

Initial summary
# alignments : 193325
# unique Features these alignments represent: 182755
% of total features these alignments represent : 71.40 %

The lengths of the matches are distributed as follows
Hit_Length    # alignments
--------      --------
100           15565
150           16412
200           19903
250           21757
300           21913
350           22680
400           22111
450           19113
500           14406
550           10193
600           6011
650           2285
700           655
750           215
800           50
10000         56

Alignments with matches less than 150 bp are deleted
# remaining Alignments : 161708
# unique Features these remaining alignments represent: 152839
% of total features these alignments represent : 59.71 %

Frequency distribution of the remaining features
# hits        # features
--------      --------
1             147365
2             4092
3             636
4             318
5             110
6             90
8             160
9             29
10            25
20            14
30            0
40            0
50            0
100           0

Features that hit more than thrice are deleted.
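The two deletion rules stated so far (matches shorter than 150 bp, then features with more than three hits) can be sketched as below. This is only an illustration under assumptions, not the script that produced this report: the `Alignment` record, the function name, and the constants are hypothetical, and it assumes that hits are counted after the length filter (as the order of the report suggests) and that a match of exactly 150 bp is kept.

```python
from collections import Counter
from typing import List, NamedTuple


class Alignment(NamedTuple):
    """Hypothetical minimal record for one alignment."""
    feature: str    # EST identifier (the aligned feature)
    match_len: int  # length of the match in bp


MIN_MATCH_BP = 150  # "Alignments with matches less than 150 bp are deleted"
MAX_HITS = 3        # "Features that hit more than thrice are deleted"


def filter_alignments(alignments: List[Alignment]) -> List[Alignment]:
    # Step 1: drop alignments whose match is shorter than the minimum.
    long_enough = [a for a in alignments if a.match_len >= MIN_MATCH_BP]
    # Step 2: count the surviving hits per feature and drop every alignment
    # of a feature that still hits more than MAX_HITS times.
    hits = Counter(a.feature for a in long_enough)
    return [a for a in long_enough if hits[a.feature] <= MAX_HITS]


if __name__ == "__main__":
    demo = [
        Alignment("est_a", 200),
        Alignment("est_a", 149),   # removed: too short
        Alignment("est_b", 150),
    ] + [Alignment("est_c", 300)] * 4  # removed: four hits
    for aln in filter_alignments(demo):
        print(aln.feature, aln.match_len)
```

Counting hits only after the length filter means a feature with several short spurious matches is not discarded as long as three or fewer of its matches remain.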
# remaining Alignments : 157457
# unique Features these remaining alignments represent: 152093
% of total features these alignments represent : 59.42 %

% Identity distribution of the remaining features
% Identity    # features
--------      --------
10            0
20            1
30            2
40            13
50            44
60            154
70            833
80            9022
90            104804
95            40349
100           2235

Following is the distribution of gaps
Gaps          # features
--------      --------
1000          126007
2000          21469
3000          5244
4000          1209
5000          563
6000          334
7000          246
8000          269
9000          98
10000         90

Following is the final summary
# alignments : 157457
# unique Features these alignments represent: 152093
% of total features these alignments represent : 59.42 %
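As a cross-check, the final summary figures can be reproduced from the hit-frequency distribution reported earlier. The sketch below is an illustration only: the function and variable names are hypothetical, and it assumes that the bins labelled 1, 2 and 3 each hold features with exactly that many hits, and that the percentage is taken over the 255964 input features.

```python
TOTAL_FEATURES = 255964  # Sugarcane_EST features submitted for alignment

# "# hits" -> "# features", copied from the frequency distribution
# reported after the 150 bp length filter (empty bins omitted).
HITS_HISTOGRAM = {
    1: 147365, 2: 4092, 3: 636, 4: 318, 5: 110,
    6: 90, 8: 160, 9: 29, 10: 25, 20: 14,
}


def summarize(hist, max_hits, total):
    """Return (alignments, features, percent) for features with <= max_hits hits."""
    kept = {hits: n for hits, n in hist.items() if hits <= max_hits}
    features = sum(kept.values())
    # Each feature in bin h contributes h alignments (assumes exact bins).
    alignments = sum(hits * n for hits, n in kept.items())
    percent = round(100.0 * features / total, 2)
    return alignments, features, percent


if __name__ == "__main__":
    print(summarize(HITS_HISTOGRAM, 3, TOTAL_FEATURES))
    # -> (157457, 152093, 59.42)
```

The result agrees with the final summary: 157457 alignments representing 152093 unique features, or 59.42 % of the total.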