Amino acid and dipeptide composition
====================================

Input:  amino acid sequences in FASTA format
Output: amino acid composition or dipeptide composition

Output file format:

[SVM_light format]
  This can be used as the input file for SVM-light. The format is shown
  below. Note that the leading '0' is not the entry name; it is the label
  that identifies the type of each sequence (1 for a positive sample,
  -1 for a negative sample, or 0 for an unknown sample). In this output,
  all sequences are assigned '0'.

    0 1:value1 2:value2 3:value3 4:value4 5:value5 6:value6 7:value7 8:value8

[TAB-delimited table]
  This is a simple table format with the header in the first line.
  Sequence ID and length (aa) are listed before the frequencies.

The order of amino acid frequencies:

    K N T R S I M H P L E D A G V Q Y C W F

The order of dipeptide frequencies is specified as:

    (ARNDCQEGHILKMFPSTWYV) x (ARNDCQEGHILKMFPSTWYV)
    = AA AR AN AD AC AQ AE ... VS VT VW VY VV
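
The calculation described above can be sketched in Python as follows. This is only an illustration, not the tool's own code: the function names are hypothetical, frequencies are assumed to be plain fractions (count divided by the number of residues or residue pairs), feature indices are assumed to follow the orders listed above, and residues outside the 20 standard letters are assumed to be left uncounted.

```python
from itertools import product

# Feature orders taken from the description above.
AA_ORDER = "KNTRSIMHPLEDAGVQYCWF"
DIPEPTIDE_ALPHABET = "ARNDCQEGHILKMFPSTWYV"
# 20 x 20 = 400 pairs: AA AR AN ... VY VV
DIPEPTIDES = ["".join(p) for p in product(DIPEPTIDE_ALPHABET, repeat=2)]


def parse_fasta(text):
    """Yield (sequence_id, sequence) pairs from FASTA-formatted text."""
    seq_id, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if seq_id is not None:
                yield seq_id, "".join(chunks)
            # Assumption: the ID is the first whitespace-delimited token.
            fields = line[1:].split()
            seq_id = fields[0] if fields else ""
            chunks = []
        elif seq_id is not None:
            chunks.append(line.upper())
    if seq_id is not None:
        yield seq_id, "".join(chunks)


def aa_composition(seq):
    """Fraction of each amino acid, in AA_ORDER (assumed: count / length)."""
    n = len(seq)
    return [seq.count(aa) / n if n else 0.0 for aa in AA_ORDER]


def dipeptide_composition(seq):
    """Fraction of each overlapping residue pair, in DIPEPTIDES order."""
    counts = dict.fromkeys(DIPEPTIDES, 0)
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:  # pairs with non-standard letters are skipped
            counts[pair] += 1
    total = max(len(seq) - 1, 0)
    return [counts[d] / total if total else 0.0 for d in DIPEPTIDES]


def svm_light_line(values, label=0):
    """Format one sequence as 'label 1:v1 2:v2 ...' (label 0 = unknown)."""
    feats = " ".join(f"{i}:{v:.4f}" for i, v in enumerate(values, start=1))
    return f"{label} {feats}"


if __name__ == "__main__":
    demo = ">seq1 example\nMKTAYIAK\n>seq2\nAAK\n"
    for seq_id, seq in parse_fasta(demo):
        print(svm_light_line(aa_composition(seq)))
```

Passing dipeptide_composition(seq) instead of aa_composition(seq) to svm_light_line gives the 400-feature dipeptide line. The four-decimal formatting is an arbitrary choice for the sketch.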