This reports the protocol used to align the Wheat_EST features to tigrv4-genome.
Mon Apr 17 12:17:41 2006

Source of Wheat_EST: from the Gramene markers database, originally downloaded
from GenBank with the query 'txid4564[orgn] AND gbdiv_est[PROP]'

Alignment procedure details
---------------------------
623606 Wheat_EST are aligned to tigrv4-genome using blat with the parameter
-minIdentity=50, followed by PslReps with -singleHit. This was followed by the
filtering procedure described below, which is applied in general to
'CrossSpecies-Coding' data sets.

Initial summary
# alignments                                    : 464583
# unique Features these alignments represent    : 410784
% of total features these alignments represent  : 65.87 %

The match lengths are distributed as follows

Hit_Length   # alignments
--------     --------
     100        46176
     150        46791
     200        53978
     250        56147
     300        57641
     350        51940
     400        47804
     450        37852
     500        26182
     550        17258
     600        11080
     650         5802
     700         3082
     750         1438
     800          564
   10000          848

Alignments with matches less than 150 bp are deleted.
# remaining Alignments                                    : 372614
# unique Features these remaining alignments represent    : 324958
% of total features these alignments represent            : 52.11 %

Frequency distribution of the remaining features

# hits       # features
--------     --------
       1       301884
       2        13415
       3         3205
       4         2393
       5         1770
       6         1274
       8          788
       9          157
      10           40
      20           32
      30            0
      40            0
      50            0
     100            0

Features with more than three hits are deleted.
# remaining Alignments                                    : 338329
# unique Features these remaining alignments represent    : 318504
% of total features these alignments represent            : 51.07 %

% Identity distribution of the remaining features

% Identity   # features
--------     --------
      10            0
      20            2
      30            1
      40           13
      50           56
      60          234
      70         1407
      80        16261
      90       207339
      95       103626
     100         9390

Following is the distribution of gaps

Gaps         # features
--------     --------
    1000       270782
    2000        44855
    3000        10203
    4000         2587
    5000         1310
    6000          637
    7000         1176
    8000          451
    9000          254
   10000          178
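
The two filtering steps described in this report (deleting alignments with matches shorter than 150 bp, then deleting features with more than three hits) can be sketched in Python as follows. This is a minimal sketch, not the script that produced the numbers above: the function names, the constants, and the representation of an alignment as a (feature name, match length) pair are all assumptions made for illustration, and parsing of the blat output is left out. Hits per feature are counted after the length filter, which matches the order in which the report presents the steps.

```python
from collections import Counter

# Thresholds taken from the report; the constant names are hypothetical.
MIN_MATCH_BP = 150  # alignments with matches shorter than this are deleted
MAX_HITS = 3        # features with more hits than this are deleted


def filter_alignments(alignments, min_match=MIN_MATCH_BP, max_hits=MAX_HITS):
    """Apply both filters to (feature_name, match_length) pairs.

    The pair layout is an assumption; a real pipeline would read these
    values from the alignment output instead.
    """
    # Step 1: keep only alignments whose match length is at least min_match.
    long_enough = [(name, length) for name, length in alignments
                   if length >= min_match]
    # Step 2: count the remaining hits per feature, then drop every
    # alignment of a feature that still has more than max_hits hits.
    hits = Counter(name for name, _ in long_enough)
    return [(name, length) for name, length in long_enough
            if hits[name] <= max_hits]


def percent_of_total(n_features, total_features):
    """Percentage of all input features, rounded to two decimals."""
    return round(100.0 * n_features / total_features, 2)


def summarize(alignments, total_features):
    """Return (# alignments, # unique features, % of total features)."""
    n_features = len({name for name, _ in alignments})
    return (len(alignments), n_features,
            percent_of_total(n_features, total_features))
```

As a consistency check on the figures reported above, percent_of_total(318504, 623606) reproduces the final 51.07 %, and the first three rows of the hit-frequency table account for the final counts: 301884 + 13415 + 3205 = 318504 features and 301884 + 2 x 13415 + 3 x 3205 = 338329 alignments.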