This reports the protocol used to align the Sorghum_cDNA features to tigrv4-genome.
Fri Apr 14 11:45:20 2006

Source of Sorghum_cDNA : from Gramene markers database, originally no description of data source

Alignment procedure details
---------------------------
4720 Sorghum_cDNA are aligned to tigrv4-genome using blat with blat parameters
-minIdentity=50 followed by PslReps with -singleHit. This was followed by a
filtering procedure described below and applied in general to
'CrossSpecies-Coding' data sets.

Initial summary
# alignments : 2672
# unique Features these alignments represent : 2613
% of total features these alignments represent : 55.36 %

The lengths of the matches are distributed as follows

Hit_Length   # alignments
----------   ------------
100          377
150          389
200          436
250          395
300          361
350          247
400          173
450          113
500          76
550          40
600          31
650          13
700          10
750          4
800          3
10000        4

Alignments with matches less than 150 bp are deleted
# remaining Alignments : 1921
# unique Features these remaining alignments represent : 1880
% of total features these alignments represent : 39.83 %

Frequency distribution of the remaining features

# hits       # features
----------   ------------
1            1845
2            31
3            3
4            0
5            1
6            0
8            0
9            0
10           0
20           0
30           0
40           0
50           0
100          0

Features that hit more than thrice are deleted.
# remaining Alignments : 1916
# unique Features these remaining alignments represent : 1879
% of total features these alignments represent : 39.81 %

% Identity distribution of the remaining features

% Identity   # features
----------   ------------
10           0
20           0
30           0
40           0
50           0
60           1
70           11
80           144
90           1325
95           414
100          21

Following is the distribution of gaps

Gaps         # features
----------   ------------
1000         1593
2000         229
3000         52
4000         9
5000         4
6000         3
7000         3
8000         2
9000         1
10000        1

Following is the final summary
# alignments : 1916
# unique Features these alignments represent : 1879
% of total features these alignments represent : 39.81 %
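
The two filtering steps named in this report (deleting alignments with matches shorter than 150 bp, then deleting features that hit more than three times) can be sketched in Python as below. This is only an illustrative sketch, not the script that produced the report: the function names, the (feature name, match length) input layout, and the placeholder match length of 200 bp are all assumptions. The example rebuilds a synthetic alignment list from the hit-frequency table in the report and checks that the second filter step gives the reported counts.

```python
from collections import Counter

MIN_MATCH_BP = 150  # report: matches shorter than 150 bp are deleted
MAX_HITS = 3        # report: features that hit more than thrice are deleted


def filter_alignments(alignments, total_features):
    """Apply both filters to (feature_name, match_length) pairs.

    The input layout is hypothetical; the real pipeline works on blat output.
    Returns (kept alignments, number of unique features, percent of total).
    """
    # Step 1: keep only alignments whose match is at least MIN_MATCH_BP long.
    kept = [(name, length) for name, length in alignments
            if length >= MIN_MATCH_BP]
    # Step 2: count the remaining hits per feature and drop every alignment
    # of any feature that still has more than MAX_HITS hits.
    hits = Counter(name for name, _ in kept)
    kept = [(name, length) for name, length in kept
            if hits[name] <= MAX_HITS]
    unique = {name for name, _ in kept}
    percent = round(100.0 * len(unique) / total_features, 2)
    return kept, len(unique), percent


def alignments_from_hit_histogram(hist, match_length=200):
    """Build synthetic alignments matching a {hits: features} histogram.

    match_length is a placeholder (an assumption) chosen to pass step 1.
    """
    alignments = []
    feature_id = 0
    for n_hits, n_features in hist.items():
        for _ in range(n_features):
            feature_id += 1
            name = f"feature_{feature_id}"
            alignments.extend((name, match_length) for _ in range(n_hits))
    return alignments


if __name__ == "__main__":
    # Non-zero rows of the hit-frequency table in the report.
    hist = {1: 1845, 2: 31, 3: 3, 5: 1}
    alns = alignments_from_hit_histogram(hist)
    kept, n_unique, percent = filter_alignments(alns, total_features=4720)
    print(len(alns), len(kept), n_unique, percent)
    # prints: 1921 1916 1879 39.81
```

The single feature with five hits accounts for the drop from 1921 to 1916 alignments and from 1880 to 1879 unique features, which agrees with the summaries in the report.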