This report describes the protocol used to align the Maize_MethylFilter_CSHL features to tigrv4-genome.
Fri Apr 14 12:09:00 2006


Source of Maize_MethylFilter_CSHL : Methyl-filtered CSHL maize sequence, downloaded from GenBank with the query '(txid4577[ORGN] AND McCombie[AUTH] AND methyl[TITL] AND 2002[MDAT])'

Alignment procedure details 
--------------------------- 

The 66390 Maize_MethylFilter_CSHL features are aligned to tigrv4-genome using blat with the parameter -minIdentity=50, followed by pslReps with -singleHit. The resulting alignments are then passed through the filtering procedure described below, which is applied in general to 'CrossSpecies-Genomic' data sets.
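
As a rough sketch of how the two command lines named above might be assembled, assuming the usual argument order of "blat database query output.psl" and "pslReps in.psl out.psl out.psr". The file names are hypothetical placeholders, not the ones used for this run, and the commands are only printed, not executed.

```python
def build_commands(genome_fa, features_fa, raw_psl, best_psl, psr):
    """Return the blat and pslReps argument lists with the options from the report."""
    blat_cmd = ["blat", genome_fa, features_fa, "-minIdentity=50", raw_psl]
    reps_cmd = ["pslReps", "-singleHit", raw_psl, best_psl, psr]
    return blat_cmd, reps_cmd


# Hypothetical file names, for illustration only.
blat_cmd, reps_cmd = build_commands(
    "tigrv4-genome.fa",
    "Maize_MethylFilter_CSHL.fa",
    "raw.psl",
    "best.psl",
    "best.psr",
)
print(" ".join(blat_cmd))
print(" ".join(reps_cmd))
```

Keeping each command as a list makes it straightforward to hand to subprocess.run later without shell quoting problems.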

Initial summary
# alignments : 37990
# unique Features these alignments represent: 21014
% of total features these alignments represent : 31.65 %

The match lengths are distributed as follows
Hit_length (bp)	# alignments
--------	--------
100	 7239
150	 3527
200	 3276
250	 2664
300	 2134
350	 2050
400	 1970
450	 1878
500	 1883
550	 2262
600	 2563
650	 2536
700	 2346
750	 1194
800	 405
10000	 63

Alignments with matches shorter than 100 bp are filtered out
# remaining Alignments : 30810
# unique Features these remaining alignments represent: 15576
% of total features these alignments represent : 23.46 %
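
A minimal sketch of this length filter, assuming the alignments are PSL records and that "match length" means the first PSL column (matches). The report does not say which quantity was used, so that choice is an assumption, and the two records below are made up and truncated purely to exercise the function.

```python
MIN_MATCH_BP = 100  # threshold stated in the report


def match_length(psl_line):
    """Assumed definition: the 'matches' column (first field) of a PSL record."""
    return int(psl_line.split("\t")[0])


def filter_short(psl_lines, min_bp=MIN_MATCH_BP):
    """Keep alignments whose match length is at least min_bp."""
    return [line for line in psl_lines if match_length(line) >= min_bp]


# Two made-up, truncated records (only the leading PSL columns are shown).
example = [
    "95\t3\t0\t0\t0\t0\t0\t0\t+\tfeatureA",
    "412\t10\t0\t0\t1\t2\t1\t350\t+\tfeatureB",
]
kept = filter_short(example)
print(len(kept))
```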

Gap distribution of the remaining alignments
gaps (bp)	# alignments
--------	--------
1000	 30355
2000	 27
3000	 9
4000	 27
5000	 21
6000	 7
7000	 3
8000	 17
9000	 2
10000	 13
20000	 30

Alignments with gaps > 4000 bp are filtered out
# remaining Alignments : 30418
# unique Features these remaining alignments represent: 15245
% of total features these alignments represent : 22.96 %
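
The number of remaining alignments can be cross-checked against the gap table above. The sketch below assumes that each bin label is the inclusive upper edge of its bin, which the report does not state explicitly.

```python
# Gap histogram copied from the report: (bin label in bp, number of alignments).
GAP_BINS = [
    (1000, 30355), (2000, 27), (3000, 9), (4000, 27), (5000, 21),
    (6000, 7), (7000, 3), (8000, 17), (9000, 2), (10000, 13), (20000, 30),
]
MAX_GAP_BP = 4000  # threshold stated in the report


def alignments_within(bins, max_gap):
    """Sum the bins whose label does not exceed max_gap.

    Assumes each label is the inclusive upper edge of its bin.
    """
    return sum(count for upper, count in bins if upper <= max_gap)


kept = alignments_within(GAP_BINS, MAX_GAP_BP)
print(kept)  # 30418, the count reported after this filter
```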

Hit frequency distribution of the remaining features
# hits	# features
--------	--------
1	 11133
2	 1071
3	 736
4	 625
5	 507
6	 232
8	 453
9	 125
10	 177
20	 184
30	 1
40	 0
50	 0
100	 1

Features with more than three hits are deleted.
# remaining Alignments : 15483
# unique Features these remaining alignments represent: 12940
% of total features these alignments represent : 19.49 %
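
A small sketch of this per-feature filter, followed by a cross-check against the frequency table above. The (feature_name, alignment) pair layout is a hypothetical representation chosen for illustration, not the format used by the original pipeline.

```python
from collections import Counter

MAX_HITS = 3  # features with more hits than this are removed


def drop_repetitive(alignments, max_hits=MAX_HITS):
    """Remove every alignment of a feature that has more than max_hits alignments.

    `alignments` is a list of (feature_name, alignment) pairs.
    """
    hits = Counter(name for name, _ in alignments)
    return [(name, aln) for name, aln in alignments if hits[name] <= max_hits]


# Cross-check against the frequency table in the report: the first three rows
# are the features with 1, 2 and 3 hits.
features_kept = 11133 + 1071 + 736
alignments_kept = 1 * 11133 + 2 * 1071 + 3 * 736
print(features_kept, alignments_kept)  # 12940 15483
```

Both totals agree with the counts reported after this step.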

% Identity distribution of the remaining alignments
% Identity	# alignments
--------	--------
10	 0
20	 1
30	 20
40	 68
50	 121
60	 336
70	 1072
80	 1441
90	 5685
100	 6739

Alignments with percent identity lower than 60 are deleted.
# remaining Alignments : 14991
# unique Features these remaining alignments represent: 12506
% of total features these alignments represent : 18.84 %
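
The report does not define how percent identity was computed. The sketch below assumes one common definition, matching bases divided by matching plus mismatching bases, and applies the threshold of 60; both the formula and the function names are assumptions for illustration.

```python
MIN_IDENTITY = 60.0  # threshold stated in the report


def percent_identity(matches, mismatches):
    """Assumed definition: matching bases as a percentage of aligned bases."""
    aligned = matches + mismatches
    return 100.0 * matches / aligned if aligned else 0.0


def passes_identity(matches, mismatches, minimum=MIN_IDENTITY):
    """True when the alignment is kept, i.e. identity is not lower than minimum."""
    return percent_identity(matches, mismatches) >= minimum


print(passes_identity(300, 100))  # 75 percent identity, kept
print(passes_identity(110, 90))   # 55 percent identity, deleted
```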

Final summary
# alignments : 14991
# unique Features these alignments represent: 12506
% of total features these alignments represent : 18.84 %
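
The percentages reported at each stage follow from dividing the number of unique features by the 66390 features submitted. A short check, using only counts copied from this report (the stage labels are shorthand added here):

```python
TOTAL_FEATURES = 66390  # features submitted for alignment

# (stage, unique features remaining, percentage printed in the report)
STAGES = [
    ("initial", 21014, 31.65),
    ("match length >= 100 bp", 15576, 23.46),
    ("gaps <= 4000 bp", 15245, 22.96),
    ("at most three hits", 12940, 19.49),
    ("identity >= 60", 12506, 18.84),
]


def percent_of_total(count, total=TOTAL_FEATURES):
    """Percentage of all features, rounded to two decimals as in the report."""
    return round(100.0 * count / total, 2)


for name, features, reported in STAGES:
    computed = percent_of_total(features)
    print(f"{name:25s} {features:6d} {computed:6.2f} (reported {reported:.2f})")
```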