This report describes the protocol used to align the Sorghum_MethylFilter_Orion features to tigrv4-genome.
Fri Apr 14 17:26:10 2006


Source of Sorghum_MethylFilter_Orion: Sorghum_Orion_Genethresher_reads from the Gramene markers database; these were originally obtained from Orion Genomics.

Alignment procedure details 
--------------------------- 

136197 Sorghum_MethylFilter_Orion features were aligned to tigrv4-genome using blat with the parameter -minIdentity=50, followed by pslReps with -singleHit. The resulting alignments were then filtered by the procedure described below, which is applied in general to 'CrossSpecies-Genomic' data sets.
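
The alignment step above could be driven as in the following sketch, which only assembles the two command lines and does not run them. The file names are hypothetical placeholders, and this report does not record whether the original run used any blat or pslReps options beyond the two named above.

```python
def build_alignment_commands(genome_fa, features_fa, raw_psl, best_psl, psr):
    """Return the blat and pslReps argument lists for the step described above."""
    # blat usage: blat database query [options] output.psl
    blat_cmd = ["blat", genome_fa, features_fa, "-minIdentity=50", raw_psl]
    # pslReps usage: pslReps [options] in.psl out.psl out.psr
    reps_cmd = ["pslReps", "-singleHit", raw_psl, best_psl, psr]
    return blat_cmd, reps_cmd


# Hypothetical file names, for illustration only.
blat_cmd, reps_cmd = build_alignment_commands(
    "tigrv4-genome.fa", "Sorghum_MethylFilter_Orion.fa",
    "raw.psl", "best.psl", "best.psr",
)
print(" ".join(blat_cmd))
print(" ".join(reps_cmd))
```

Each list can be passed to subprocess.run once the input files exist.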

Initial summary
# alignments : 67747
# unique Features these alignments represent: 57111
% of total features these alignments represent : 41.93 %
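
The percentage reported in each summary is the number of unique features divided by the 136197 features submitted to blat. A minimal sketch of that arithmetic follows; the function name is illustrative and not part of the original pipeline.

```python
TOTAL_FEATURES = 136197  # features submitted to blat, as stated above


def percent_of_total(unique_features, total=TOTAL_FEATURES):
    """Percentage of all features that still have at least one alignment."""
    return round(100.0 * unique_features / total, 2)


print(percent_of_total(57111))  # 41.93
```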

The lengths of the matches are distributed as follows
Hit_length	# alignments
--------	--------
100	 18946
150	 10766
200	 7122
250	 5552
300	 4381
350	 3495
400	 2961
450	 2591
500	 2030
550	 1755
600	 1461
650	 1631
700	 1495
750	 1180
800	 589
10000	 1792

Alignments with matches shorter than 100 bp are filtered out
# remaining Alignments : 49074
# unique Features these remaining alignments represent: 41555
% of total features these alignments represent : 30.51 %
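
The length filter above might look like the following sketch when applied to PSL rows. This report does not say how match length was measured, so the sketch assumes it is the sum of the matches, misMatches and repMatches columns of the PSL format. The helper names and the toy rows are made up for illustration.

```python
def aligned_bases(psl_line):
    """Sum the matches, misMatches and repMatches columns of one PSL row."""
    fields = psl_line.rstrip("\n").split("\t")
    return int(fields[0]) + int(fields[1]) + int(fields[2])


def drop_short_alignments(psl_lines, min_bp=100):
    """Keep rows with at least min_bp aligned bases (assumed length measure)."""
    return [line for line in psl_lines if aligned_bases(line) >= min_bp]


def toy_row(matches, mismatches, name):
    """Build a made-up 21-column PSL row; only the first columns matter here."""
    size = matches + mismatches
    cols = [matches, mismatches, 0, 0, 0, 0, 0, 0, "+", name, size, 0, size,
            "chr1", 1000000, 5000, 5000 + size, 1,
            f"{size},", "0,", "5000,"]
    return "\t".join(str(c) for c in cols)


rows = [toy_row(80, 10, "read_a"), toy_row(95, 5, "read_b"), toy_row(300, 40, "read_c")]
kept = drop_short_alignments(rows)
print([row.split("\t")[9] for row in kept])  # ['read_b', 'read_c']
```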

Gap distribution of the remaining alignments
gaps	# alignments
--------	--------
1000	 42790
2000	 1508
3000	 278
4000	 173
5000	 83
6000	 54
7000	 79
8000	 61
9000	 44
10000	 42
20000	 313

Alignments with gaps > 4000 bp are filtered out
# remaining Alignments : 44749
# unique Features these remaining alignments represent: 38031
% of total features these alignments represent : 27.92 %
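
The remaining-alignment count can be cross-checked against the gap distribution above. The sketch assumes that each label in the gaps column is the upper bound of its bin; under that reading, the bins up to 4000 bp add up to the reported figure.

```python
# Gap histogram copied from the table above (bin label -> number of alignments).
gap_histogram = {
    1000: 42790, 2000: 1508, 3000: 278, 4000: 173, 5000: 83, 6000: 54,
    7000: 79, 8000: 61, 9000: 44, 10000: 42, 20000: 313,
}

MAX_GAP_BP = 4000  # alignments with larger gaps are filtered out

# Assumption: each bin label is the upper bound of that bin.
kept = sum(count for upper, count in gap_histogram.items() if upper <= MAX_GAP_BP)
print(kept)  # 44749
```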

Frequency distribution of the remaining features
# hits	# features
--------	--------
1	 35316
2	 1563
3	 289
4	 202
5	 116
6	 197
8	 270
9	 16
10	 29
20	 32
30	 1
40	 0
50	 0
100	 0

Features that hit more than three times are deleted.
# remaining Alignments : 39309
# unique Features these remaining alignments represent: 37168
% of total features these alignments represent : 27.29 %
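
A minimal sketch of the multi-hit filter, assuming each alignment is held as a (feature name, payload) pair; the function and variable names are illustrative. The second part checks the reported counts against the frequency table, reading the labels 1 to 3 as exact hit counts.

```python
from collections import Counter


def drop_multi_hit_features(alignments, max_hits=3):
    """Remove every alignment of a feature that has more than max_hits hits."""
    hits = Counter(name for name, _ in alignments)
    return [(name, payload) for name, payload in alignments if hits[name] <= max_hits]


# Toy input: f1 hits once, f2 twice, f3 four times (so f3 is removed).
toy = [("f1", "a"), ("f2", "b"), ("f2", "c")] + [("f3", str(i)) for i in range(4)]
print(drop_multi_hit_features(toy))  # [('f1', 'a'), ('f2', 'b'), ('f2', 'c')]

# Consistency check against the table above (hits -> number of features).
hit_histogram = {1: 35316, 2: 1563, 3: 289}
features_kept = sum(hit_histogram.values())
alignments_kept = sum(hits * count for hits, count in hit_histogram.items())
print(features_kept, alignments_kept)  # 37168 39309
```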

% Identity distribution of the remaining alignments
% Identity	# alignments
--------	--------
10	 0
20	 6
30	 41
40	 163
50	 585
60	 1348
70	 2576
80	 4543
90	 21296
100	 8751

Alignments with percent identity lower than 60 are deleted.
# remaining Alignments : 37368
# unique Features these remaining alignments represent: 35501
% of total features these alignments represent : 26.07 %
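
The identity filter above could be sketched as follows. This report does not state how percent identity was computed, so the formula here (matching bases over all aligned bases, using PSL column names) is an assumption, and the toy records are made up.

```python
def percent_identity(matches, mismatches, rep_matches=0):
    """Assumed definition: matching bases as a percentage of aligned bases."""
    aligned = matches + rep_matches + mismatches
    if aligned == 0:
        return 0.0
    return 100.0 * (matches + rep_matches) / aligned


def drop_low_identity(alignments, min_identity=60.0):
    """Keep alignments whose percent identity is at least min_identity."""
    return [
        a for a in alignments
        if percent_identity(a["matches"], a["misMatches"], a["repMatches"]) >= min_identity
    ]


# Toy records for illustration only.
toy = [
    {"qName": "read_a", "matches": 55, "misMatches": 45, "repMatches": 0},
    {"qName": "read_b", "matches": 60, "misMatches": 40, "repMatches": 0},
    {"qName": "read_c", "matches": 170, "misMatches": 20, "repMatches": 10},
]
print([a["qName"] for a in drop_low_identity(toy)])  # ['read_b', 'read_c']
```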

Final summary
# alignments : 37368
# unique Features these alignments represent: 35501
% of total features these alignments represent : 26.07 %