Peptide Encoding
Peptide Encoding is the challenge of “backtracking” the central dogma of biology: finding DNA sequences within a genome that can encode a given peptide. Peptide sequencing presents some algorithms for solving this problem.
Approach
Computationally, each stage of the central dogma can be reversed with a reverse-lookup table:
For each amino acid in the peptide, find the codons (RNA triplets) that encode it (e.g. using the inverse codon table below). Note that the resulting RNA sequence is not unique:
A. Multiple codons can encode the same amino acid. For example, Arginine has 6 codons: (CGU, CGC, CGA, CGG; AGA, AGG).
B. If the peptide is cyclic, any rotation of the peptide string is an equally valid way to write it, so each rotation may need to be searched. (Note also that some cyclic peptides are non-ribosomal and are not encoded in the genome at all.)
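Point B can be illustrated with a minimal sketch that lists the distinct rotations of a cyclic peptide written as a string of one-letter amino acid codes. The function name `rotations` is hypothetical and chosen only for this example.

```python
def rotations(peptide):
    """Return the distinct rotations of a cyclic peptide, in order of first appearance."""
    candidates = (peptide[i:] + peptide[:i] for i in range(len(peptide)))
    # dict.fromkeys keeps insertion order and drops duplicates (e.g. for "AA").
    return list(dict.fromkeys(candidates))


print(rotations("MAR"))
```

Each rotation can then be treated as a separate linear peptide when searching for an encoding.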
Amino acid | RNA codons |
---|---|
Ala, A | GCU, GCC, GCA, GCG |
Arg, R | CGU, CGC, CGA, CGG; AGA, AGG |
Asn, N | AAU, AAC |
Asp, D | GAU, GAC |
Asn or Asp, B | AAU, AAC; GAU, GAC |
Cys, C | UGU, UGC |
Gln, Q | CAA, CAG |
Glu, E | GAA, GAG |
Gln or Glu, Z | CAA, CAG; GAA, GAG |
Gly, G | GGU, GGC, GGA, GGG |
His, H | CAU, CAC |
START | AUG |
Ile, I | AUU, AUC, AUA |
Leu, L | CUU, CUC, CUA, CUG; UUA, UUG |
Lys, K | AAA, AAG |
Met, M | AUG |
Phe, F | UUU, UUC |
Pro, P | CCU, CCC, CCA, CCG |
Ser, S | UCU, UCC, UCA, UCG; AGU, AGC |
Thr, T | ACU, ACC, ACA, ACG |
Trp, W | UGG |
Tyr, Y | UAU, UAC |
Val, V | GUU, GUC, GUA, GUG |
STOP | UAA, UGA, UAG |
Source: Inverse Codon Table
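As a rough sketch of the reverse lookup, the table above can be stored as a dictionary keyed by one-letter amino acid code. The names `INVERSE_CODONS`, `count_rna_encodings`, and `rna_encodings` are assumptions made for this example, and the ambiguous codes B and Z as well as the START and STOP rows are left out for simplicity.

```python
from itertools import product
from math import prod

# Inverse codon table: one-letter amino acid code -> RNA codons that encode it.
INVERSE_CODONS = {
    "A": ["GCU", "GCC", "GCA", "GCG"],
    "R": ["CGU", "CGC", "CGA", "CGG", "AGA", "AGG"],
    "N": ["AAU", "AAC"],
    "D": ["GAU", "GAC"],
    "C": ["UGU", "UGC"],
    "Q": ["CAA", "CAG"],
    "E": ["GAA", "GAG"],
    "G": ["GGU", "GGC", "GGA", "GGG"],
    "H": ["CAU", "CAC"],
    "I": ["AUU", "AUC", "AUA"],
    "L": ["CUU", "CUC", "CUA", "CUG", "UUA", "UUG"],
    "K": ["AAA", "AAG"],
    "M": ["AUG"],
    "F": ["UUU", "UUC"],
    "P": ["CCU", "CCC", "CCA", "CCG"],
    "S": ["UCU", "UCC", "UCA", "UCG", "AGU", "AGC"],
    "T": ["ACU", "ACC", "ACA", "ACG"],
    "W": ["UGG"],
    "Y": ["UAU", "UAC"],
    "V": ["GUU", "GUC", "GUA", "GUG"],
}


def count_rna_encodings(peptide):
    """Number of distinct RNA strings that translate to the peptide."""
    return prod(len(INVERSE_CODONS[aa]) for aa in peptide)


def rna_encodings(peptide):
    """Lazily yield every RNA string that translates to the peptide."""
    for codons in product(*(INVERSE_CODONS[aa] for aa in peptide)):
        yield "".join(codons)


print(count_rna_encodings("MA"))
print(list(rna_encodings("MA")))
```

The number of encodings is the product of the per-residue codon counts, so it grows exponentially with peptide length. For long peptides, enumerating every RNA string is impractical, which is why the generator yields results lazily.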
One can find amino-acid coding similarity using a codon wheel.
[Figure: codon wheel. Source: Codon Table page]
Convert from RNA to DNA (replace each U with T).
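As before, the DNA sequence encoding the RNA sequence is also not unique, since:
A. DNA is double-stranded, so the peptide may be encoded on either strand: a match can appear in the genome either as the coding sequence itself or as its reverse complement.
B. The genome contains both coding regions (exons) and non-coding regions (introns), so a coding sequence may be interrupted by introns, and a matching substring may lie in a region that is never translated.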
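Point A can be sketched as a search over both strands. Rather than enumerating every possible RNA encoding, this example slides a window of length 3 × (peptide length) along the genome and translates both the window and its reverse complement. It is a simplified illustration, not a definitive implementation: it assumes the genome is an uppercase string over A, C, G, T, assumes the coding sequence is contiguous (introns are ignored), and uses the standard genetic code. All function and variable names are hypothetical.

```python
from itertools import product

# Standard genetic code on DNA codons, listed in T, C, A, G order; "*" marks STOP.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
DNA_CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate a DNA string codon by codon; a trailing partial codon is ignored."""
    return "".join(DNA_CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def peptide_encoding(genome, peptide):
    """Return every genome substring that encodes the peptide on either strand."""
    k = 3 * len(peptide)
    hits = []
    for i in range(len(genome) - k + 1):
        window = genome[i:i + k]
        if peptide in (translate(window), translate(reverse_complement(window))):
            hits.append(window)
    return hits


genome = "ATGGCCATGGCCCCCAGAACTGAGATCAATAGTACCCGTATTAACGGGTGA"
print(peptide_encoding(genome, "MA"))
```

Working on DNA codons directly avoids the U/T conversion step. Translating each window keeps the cost proportional to genome length times peptide length, instead of growing with the number of possible encodings.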
History
In 1961, Marshall Nirenberg and Heinrich Matthaei discovered that RNA strands consisting only of uracil produced a peptide consisting only of phenylalanine (Phe). Scientists extended this technique to work out how the remaining RNA codons encode amino acids.