This reports the protocol used to align the Maize_ArrayGene_NSF58K features to tigrv4-genome.
Fri Apr 14 12:00:05 2006

Source of Maize_ArrayGene_NSF58K : Downloaded from TIGR
http://www.maizearray.org/files/remapping_version3_57452_fasta.zip

Alignment procedure details
---------------------------
57452 Maize_ArrayGene_NSF58K features are aligned to tigrv4-genome using blat
with blat parameters -minIdentity=50, followed by PslReps with -singleHit.
This was followed by the filtering procedure described below, which is applied
in general to 'CrossSpecies-Coding' data sets.

Initial summary
# alignments                                      : 38853
# unique Features these alignments represent      : 36465
% of total features these alignments represent    : 63.47 %

The lengths of the matches are distributed as follows

Hit_Length    # alignments
--------      --------
100           4408
150           3722
200           3919
250           3674
300           3311
350           2854
400           2670
450           2035
500           1717
550           1392
600           1091
650           926
700           878
750           766
800           700
10000         4790

Alignments with matches less than 150 bp are deleted.
# remaining Alignments                                     : 30783
# unique Features these remaining alignments represent     : 29132
% of total features these alignments represent             : 50.71 %

Frequency distribution of the remaining features

# hits        # features
--------      --------
1             28383
2             498
3             79
4             62
5             39
6             35
8             15
9             9
10            6
20            1
30            1
40            2
50            0
100           2

Features that hit more than thrice are deleted.
# remaining Alignments                                     : 29616
# unique Features these remaining alignments represent     : 28960
% of total features these alignments represent             : 50.41 %

% Identity distribution of the remaining features

% Identity    # features
--------      --------
10            0
20            0
30            2
40            29
50            38
60            92
70            389
80            2984
90            21226
95            4611
100           245

Following is the distribution of gaps

Gaps          # features
--------      --------
1000          21162
2000          4337
3000          1918
4000          739
5000          384
6000          174
7000          113
8000          85
9000          44
10000         33

Following is the final summary
# alignments                                      : 29616
# unique Features these alignments represent      : 28960
% of total features these alignments represent    : 50.41 %
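
The two filtering steps described above (drop alignments with matches under 150 bp, then drop features that hit more than thrice) can be sketched as follows. This is a minimal illustration, not the pipeline's actual code: the `Alignment` record, the function names, and the reading of "matches" as a per-alignment count of matching bases are all assumptions made for the example. The last function re-derives the reported post-filter totals from the 1, 2 and 3 hit rows of the frequency table, assuming those rows are exact hit counts.

```python
from collections import Counter
from typing import NamedTuple


class Alignment(NamedTuple):
    """Hypothetical record for one alignment (not the pipeline's own type)."""
    feature: str   # name of the array feature (query sequence)
    matches: int   # matching bases in this alignment (assumed meaning of "matches")


MIN_MATCH_BP = 150   # alignments with matches less than 150 bp are deleted
MAX_HITS = 3         # features that hit more than thrice are deleted
TOTAL_FEATURES = 57452


def filter_alignments(alignments, min_match=MIN_MATCH_BP, max_hits=MAX_HITS):
    """Apply the length filter, then the multi-hit filter, in that order."""
    # Step 1: keep only alignments with at least min_match matching bases.
    kept = [a for a in alignments if a.matches >= min_match]
    # Step 2: count hits per feature among the survivors and drop every
    # alignment of a feature that still has more than max_hits hits.
    hits = Counter(a.feature for a in kept)
    return [a for a in kept if hits[a.feature] <= max_hits]


def summarize(alignments, total_features=TOTAL_FEATURES):
    """Return (# alignments, # unique features, % of total features)."""
    n_features = len({a.feature for a in alignments})
    pct = round(100.0 * n_features / total_features, 2)
    return len(alignments), n_features, pct


def totals_from_histogram(histogram, total_features=TOTAL_FEATURES):
    """Derive (# alignments, # features, %) from a {hits: features} table."""
    n_features = sum(histogram.values())
    n_alignments = sum(hits * count for hits, count in histogram.items())
    return n_alignments, n_features, round(100.0 * n_features / total_features, 2)


# Rows of the frequency table that survive the "more than thrice" filter.
REPORTED_KEPT = {1: 28383, 2: 498, 3: 79}
```

For example, on a toy input where feature `f1` has one long and one short alignment, `f2` has one alignment, and `f3` has four, only the long `f1` alignment and the `f2` alignment survive. Calling `totals_from_histogram(REPORTED_KEPT)` gives 29616 alignments, 28960 features and 50.41 %, which agrees with the final summary in the report.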