Codename: peptide
Ala | Arg | Asn | Asp | Cys | Gln | Glu | Gly | His | Ile | Leu | Lys | Met | Phe | Pro | Ser | Thr | Trp | Tyr | Val |
A | R | N | D | C | Q | E | G | H | I | L | K | M | F | P | S | T | W | Y | V |
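The two rows above pair each three-letter code with its one-letter code. As a minimal sketch, the pairing can be written as a lookup table in Python. The function names are hypothetical and are not part of molconvert, and Trp is used as the standard three-letter code for tryptophan (W).

```python
# Three-letter to one-letter codes, in the order of the table above.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}
ONE_TO_THREE = {one: three for three, one in THREE_TO_ONE.items()}


def to_three_letter(sequence: str) -> str:
    """Expand a one-letter sequence such as 'GAG' to 'GlyAlaGly'."""
    # Lower-case input is accepted here because 'aptmppplpp' is listed as valid.
    return "".join(ONE_TO_THREE[ch] for ch in sequence.upper())


def to_one_letter(sequence: str) -> str:
    """Collapse a three-letter sequence such as 'GlyAlaGly' to 'GAG'."""
    if len(sequence) % 3 != 0:
        raise ValueError("three-letter sequence length must be a multiple of 3")
    codes = [sequence[i:i + 3] for i in range(0, len(sequence), 3)]
    return "".join(THREE_TO_ONE[code] for code in codes)
```

An unknown code raises KeyError in this sketch; how molconvert itself reports such input is not covered here.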
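A file may hold several sequences, one per line, as long as every sequence uses the same notation. Valid files look like this (each pair of lines is one file):
PPPALPPKKR |
aptmppplpp |
ProProProAlaLeuProProLysLysArg |
AlaProThrMetProProProLeuProPro |
but these are incorrect, because one-letter and three-letter codes are mixed within a file or within a single sequence:
PPPALPPKKR |
AlaProThrMetProProProLeuProPro |
ProProProAlaLeuProProLysLysArg |
AlaProThrMetPPPLPP |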
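The rule illustrated by these examples can be sketched as a small pre-check in Python. This is an illustration under stated assumptions, not molconvert's own parser: each pair of lines above is taken to be one file, one-letter codes are treated as case-insensitive (since aptmppplpp is listed as valid), three-letter codes are expected in the capitalization shown in the table, and a sequence that parses cleanly as three-letter codes is classified as such. The names `notation` and `file_is_valid` are hypothetical.

```python
ONE_LETTER = set("ARNDCQEGHILKMFPSTWYV")
THREE_LETTER = set(
    "Ala Arg Asn Asp Cys Gln Glu Gly His Ile Leu Lys Met "
    "Phe Pro Ser Thr Trp Tyr Val".split()
)


def notation(sequence: str) -> str:
    """Return 'one', 'three', or 'invalid' for a single sequence line."""
    seq = sequence.strip()
    if not seq:
        return "invalid"
    # Assumption: try the three-letter reading first, so 'AlaAla' counts as
    # two alanines rather than the one-letter sequence A-L-A-A-L-A.
    chunks = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    if len(seq) % 3 == 0 and all(chunk in THREE_LETTER for chunk in chunks):
        return "three"
    if all(ch in ONE_LETTER for ch in seq.upper()):
        return "one"
    return "invalid"


def file_is_valid(lines) -> bool:
    """A file is accepted only if all non-empty lines share one notation."""
    kinds = {notation(line) for line in lines if line.strip()}
    return len(kinds) == 1 and "invalid" not in kinds
```

With these assumptions, the two valid files above pass, the file that pairs PPPALPPKKR with a three-letter line is rejected for mixing notations, and AlaProThrMetPPPLPP is rejected on its own because it fits neither notation.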
--peptide <string> The string is a valid one-letter or three-letter sequence. Example:
molconvert --peptide FFKMLL mol -o peptide.mol
will convert a one-letter sequence to a molfile.
peptide:3 With this option the output will be a three-letter sequence. Examples:
echo "[H]NCC(=O)NC(C)C(=O)NCC(O)=O" | molconvert peptide:3
will convert the SMILES representation to a three-letter sequence.
molconvert --peptide GAG peptide:3
will convert a one-letter sequence to a three-letter sequence.
peptide:1 One-letter peptide sequence option. Example:
echo "[H]NCC(=O)NC(C)C(=O)NCC(O)=O" | molconvert peptide:1
will convert the SMILES string to a one-letter sequence.
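For scripted use, the command lines above could be wrapped in Python roughly as follows. This is a sketch that assumes molconvert is installed and on the PATH; the helper names are hypothetical, and only the arguments shown in the examples above (--peptide, the output format, -o, and peptide:1 or peptide:3) are used.

```python
import shutil
import subprocess


def peptide_command(sequence, out_format, out_path=None):
    """Build the argument list for 'molconvert --peptide <sequence> <format>'."""
    argv = ["molconvert", "--peptide", sequence, out_format]
    if out_path is not None:
        argv += ["-o", out_path]
    return argv


def smiles_to_sequence(smiles, letters=1):
    """Pipe a SMILES string into 'molconvert peptide:1' or 'peptide:3'."""
    if letters not in (1, 3):
        raise ValueError("letters must be 1 or 3")
    if shutil.which("molconvert") is None:
        raise RuntimeError("molconvert was not found on PATH")
    result = subprocess.run(
        ["molconvert", f"peptide:{letters}"],
        input=smiles + "\n",
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

Calling `subprocess.run(peptide_command("FFKMLL", "mol", "peptide.mol"), check=True)` would then correspond to the first example above.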
Apart from the 20 standard amino acids that are recognized by default, it is possible to define custom amino acids with non-standard side chains or with alternative protonation states. The usual format of the dictionary file is:
Ala A [CX4H3][C@HX4H1]([NX3])C=O 3 4
Arg R [N;X3][C@@H]([CH2][CH2][CH2][N;H1X3][C;X3]([N;H2X3])=N)C=O 1 10
Asn N [#7;X3][C@@H]([CH2]C([N;H2X3])=O)[C;X3]=O 1 7
Asp D [NX3][C@@HH1]([CH2]C([OX2H1])=O)C=O 1 7
...
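Each entry shown above consists of five whitespace-separated fields: the long name, the short name, a SMARTS string, and two integers. A minimal Python sketch for reading one entry follows. The class and field names are hypothetical, and the text does not state what the two integers mean; in the entries shown they coincide with the 1-based positions of the backbone nitrogen and the carbonyl carbon within the SMARTS string, so the sketch keeps them as plain integers.

```python
from typing import NamedTuple


class DictEntry(NamedTuple):
    long_name: str      # e.g. 'Ala'
    short_name: str     # e.g. 'A'
    smarts: str         # SMARTS pattern of the residue fragment
    first_index: int    # meaning not stated in the text (see lead-in)
    second_index: int   # meaning not stated in the text (see lead-in)


def parse_entry(line: str) -> DictEntry:
    """Split one dictionary line into its five fields."""
    fields = line.split()
    if len(fields) != 5:
        raise ValueError(f"expected 5 fields, got {len(fields)}: {line!r}")
    long_name, short_name, smarts, first, second = fields
    return DictEntry(long_name, short_name, smarts, int(first), int(second))
```

For example, `parse_entry("Ala A [CX4H3][C@HX4H1]([NX3])C=O 3 4")` yields an entry whose `smarts` field is `[CX4H3][C@HX4H1]([NX3])C=O`. Splitting on whitespace relies on the SMARTS string containing no spaces, which holds for the entries shown.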
To create a custom amino acid abbreviation, its name is expected to
start with X, followed by further characters
between parentheses. It is advised to use this same string for both the short
and the long name of the custom amino acid. Valid lines are:
X(Hcy) X(Hcy) [SX2H1][CH2][CH2][C@HH1]([NX3])C=O 5 6
X(1-foo) X(1-foo) [SX2H1][CH2][C@HH1]([NX3])C=O 4 5
X(b) X(b) [CH3][CH2][CH2][CH2][CH2][C@HH1]([NX3])C=O 7 8
...
Note that the SMARTS strings representing amino acid fragments specify the hydrogen counts and sometimes the connection numbers to avoid ambiguity. For example, if only the string C[C@H](N)C=O were used for alanine, it would also match many other amino acids, because some of them contain alanine as a substructure.
Users can store their custom amino acids in the custom_aminoacids.dict file in the .chemaxon directory (UNIX) or in the user's chemaxon directory on MS Windows.
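The naming convention for custom entries can be checked before a line is added to the dictionary file. The Python sketch below is an illustration only: the regular expression is an assumption derived from the valid lines above (X, then one or more characters other than spaces and parentheses, enclosed in parentheses), and the helper name `custom_entry` is hypothetical.

```python
import re

# Assumed pattern, based on the examples X(Hcy), X(1-foo) and X(b).
CUSTOM_NAME = re.compile(r"X\([^()\s]+\)")


def custom_entry(name, smarts, first_index, second_index):
    """Format one dictionary line, using the same string for both names."""
    if not CUSTOM_NAME.fullmatch(name):
        raise ValueError(f"custom names should look like X(...), got {name!r}")
    return f"{name} {name} {smarts} {first_index} {second_index}"
```

For instance, `custom_entry("X(Hcy)", "[SX2H1][CH2][CH2][C@HH1]([NX3])C=O", 5, 6)` reproduces the first valid line above, while a name such as `Hcy` is rejected by this check.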