This report describes the protocol used to align the Sorghum_cDNA features to tigrv4-genome.
Fri Apr 14 11:45:20 2006


Source of Sorghum_cDNA: Gramene markers database (no description of the original data source was provided)

Alignment procedure details 
--------------------------- 

4720 Sorghum_cDNA features were aligned to tigrv4-genome using blat with the parameter -minIdentity=50, followed by pslReps with -singleHit. The resulting alignments were then filtered as described below; this filtering procedure is applied in general to 'CrossSpecies-Coding' data sets.
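
The two commands named above can be sketched as follows. This is only an illustration: the file names are hypothetical, and the report does not say which other blat or pslReps options (if any) were used, so only -minIdentity=50 and -singleHit are taken from the text. The helper builds the command lines without running them.

```python
import shlex


def build_alignment_commands(genome_fa, cdna_fa, raw_psl, best_psl, psr_out,
                             min_identity=50):
    """Return the blat and pslReps command lines as argument lists.

    Nothing is executed here; pass the lists to subprocess.run() to run them.
    All file names are placeholders chosen for this sketch.
    """
    # blat usage: blat database query [options] output.psl
    blat_cmd = ["blat", genome_fa, cdna_fa,
                f"-minIdentity={min_identity}", raw_psl]
    # pslReps usage: pslReps [options] in.psl out.psl out.psr
    psl_reps_cmd = ["pslReps", "-singleHit", raw_psl, best_psl, psr_out]
    return blat_cmd, psl_reps_cmd


blat_cmd, psl_reps_cmd = build_alignment_commands(
    "tigrv4-genome.fa", "Sorghum_cDNA.fa", "raw.psl", "best.psl", "best.psr")
print(shlex.join(blat_cmd))
print(shlex.join(psl_reps_cmd))
```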

Initial summary
# alignments : 2672
# unique Features these alignments represent: 2613
% of total features these alignments represent : 55.36 %
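
The percentage in each summary block is the number of unique features with at least one alignment divided by the 4720 input features. A minimal sketch of that arithmetic, with hypothetical function names and assuming each alignment is identified by its feature ID:

```python
TOTAL_FEATURES = 4720  # number of Sorghum_cDNA features given in this report


def percent_of_total(n_features, n_total=TOTAL_FEATURES):
    """Percentage of input features represented, rounded to two decimals."""
    return round(100.0 * n_features / n_total, 2)


def summarize(feature_ids, n_total=TOTAL_FEATURES):
    """Summarize alignments given one feature ID per alignment."""
    n_unique = len(set(feature_ids))
    return {
        "alignments": len(feature_ids),
        "unique_features": n_unique,
        "percent_of_total": percent_of_total(n_unique, n_total),
    }


print(percent_of_total(2613))  # 55.36
```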

The match lengths are distributed as follows:
Hit_Length (bp)	# alignments
--------	--------
100	 377
150	 389
200	 436
250	 395
300	 361
350	 247
400	 173
450	 113
500	 76
550	 40
600	 31
650	 13
700	 10
750	 4
800	 3
10000	 4
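
A table like the one above could be produced by a binning step such as the following sketch. It assumes that each label is an inclusive upper bound of its bin; the report does not state this, although it is consistent with the first two bins holding 766 alignments while the 150 bp filter removes 751. The function and constant names are hypothetical.

```python
from bisect import bisect_left
from collections import Counter

# Bin labels as printed in the table: 100, 150, ..., 800, then 10000.
HIT_LENGTH_EDGES = list(range(100, 801, 50)) + [10000]


def bin_by_upper_bound(values, edges=HIT_LENGTH_EDGES):
    """Count values per bin, treating each edge as an inclusive upper bound."""
    counts = Counter()
    for v in values:
        i = bisect_left(edges, v)  # index of the first edge >= v
        label = edges[i] if i < len(edges) else f">{edges[-1]}"
        counts[label] += 1
    return counts


example = bin_by_upper_bound([90, 100, 101, 150, 151, 900])
for label in HIT_LENGTH_EDGES:
    if example[label]:
        print(label, example[label])
```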

Alignments with matches shorter than 150 bp are deleted.
# remaining Alignments : 1921
# unique Features these remaining alignments represent: 1880
% of total features these alignments represent : 39.83 %
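
The length filter can be sketched as below. The record layout, a (feature_id, match_length) tuple, is an assumption made for this example; the report does not say which PSL field is used as the match length.

```python
MIN_MATCH_BP = 150  # threshold stated in this report


def drop_short_alignments(alignments, min_match=MIN_MATCH_BP):
    """Keep alignments whose match length is at least min_match bp.

    Each alignment is a (feature_id, match_length) tuple (assumed layout).
    """
    return [a for a in alignments if a[1] >= min_match]


kept = drop_short_alignments([("f1", 149), ("f2", 150), ("f3", 420)])
print(kept)  # [('f2', 150), ('f3', 420)]
```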

Distribution of hits per feature among the remaining features
# hits	# features
--------	--------
1	 1845
2	 31
3	 3
4	 0
5	 1
6	 0
8	 0
9	 0
10	 0
20	 0
30	 0
40	 0
50	 0
100	 0

Features with more than three hits are deleted.
# remaining Alignments : 1916
# unique Features these remaining alignments represent: 1879
% of total features these alignments represent : 39.81 %
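
The hit-count filter can be sketched as follows, again assuming (feature_id, match_length) tuples and hypothetical names. The synthetic data is rebuilt from the non-zero rows of the hits table above (1845, 31, 3 and 1 features with 1, 2, 3 and 5 hits), so the counts can be checked against this report.

```python
from collections import Counter


def drop_multi_hit_features(alignments, max_hits=3):
    """Remove every alignment of any feature that has more than max_hits hits."""
    hits = Counter(fid for fid, _ in alignments)
    return [a for a in alignments if hits[a[0]] <= max_hits]


# Rebuild a synthetic alignment list from the hits-per-feature table.
hit_table = {1: 1845, 2: 31, 3: 3, 5: 1}  # hits -> number of features
alignments = []
n = 0
for n_hits, n_features in hit_table.items():
    for _ in range(n_features):
        n += 1
        alignments.extend([(f"feat{n}", 0)] * n_hits)

kept = drop_multi_hit_features(alignments)
print(len(alignments), len(kept), len({fid for fid, _ in kept}))
# 1921 1916 1879
```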

% Identity distribution of the remaining alignments
% Identity	# alignments
--------	--------
10	 0
20	 0
30	 0
40	 0
50	 0
60	 1
70	 11
80	 144
90	 1325
95	 414
100	 21

The gaps are distributed as follows:
Gaps	# features
--------	--------
1000	 1593
2000	 229
3000	 52
4000	 9
5000	 4
6000	 3
7000	 3
8000	 2
9000	 1
10000	 1

Final summary
# alignments : 1916
# unique Features these alignments represent: 1879
% of total features these alignments represent : 39.81 %