Introduction to Protein Structure
Writing Peptide and Protein Sequences
The primary structure (or sequence) of a peptide or protein is always written starting with the amino terminus on the left and progressing towards the carboxy terminus on the right. If the entire sequence does not fit on one line, it is simply continued on the next line, still following the left-to-right, amino-to-carboxy terminus convention.
Amino acid sequences can be written using either the three-letter code or the one-letter code. The exact formatting of sequences varies with the application; by convention, single-letter codes are always capitalized.
Here are the codes for each amino acid:
Amino Acid | 3-Letter Code | 1-Letter Code | Amino Acid | 3-Letter Code | 1-Letter Code
---|---|---|---|---|---
Alanine | Ala | A | Leucine | Leu | L
Arginine | Arg | R | Lysine | Lys | K
Asparagine | Asn | N | Methionine | Met | M
Aspartate | Asp | D | Phenylalanine | Phe | F
Cysteine | Cys | C | Proline | Pro | P
Histidine | His | H | Serine | Ser | S
Isoleucine | Ile | I | Threonine | Thr | T
Glutamine | Gln | Q | Tryptophan | Trp | W
Glutamate | Glu | E | Tyrosine | Tyr | Y
Glycine | Gly | G | Valine | Val | V
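The table above can also be expressed as a lookup for converting between the two notations. The following is a minimal sketch in Python, not a standard library routine: the names `THREE_TO_ONE` and `three_to_one` are hypothetical, and the function assumes the input contains only the twenty codes listed in the table, written either run together or separated by spaces.

```python
# Three-letter to one-letter codes, copied from the table above.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def three_to_one(sequence: str) -> str:
    """Convert a three-letter-code sequence to one-letter codes.

    Accepts run-together codes ("AspIleGlu") or space-separated codes in
    any letter case ("ASP ILE GLU"). Residue order is preserved, so the
    result still reads from amino terminus to carboxy terminus.
    """
    compact = "".join(sequence.split())  # drop spaces and line breaks
    if len(compact) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    residues = []
    for i in range(0, len(compact), 3):
        code = compact[i:i + 3].capitalize()  # "ASP" or "asp" -> "Asp"
        if code not in THREE_TO_ONE:
            raise ValueError(f"unknown three-letter code: {compact[i:i + 3]!r}")
        residues.append(THREE_TO_ONE[code])
    return "".join(residues)


print(three_to_one("AspIleGluPheArgValLeuHis"))  # DIEFRVLH
```

Because the one-letter codes are always capitalized, the function normalizes the case of the input rather than that of the output.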
Examples of primary structures:
Example | Three-Letter Code | One-Letter Code
---|---|---
A small peptide (8 residues) | AspIleGluPheArgValLeuHis | DIEFRVLH
Lysozyme* (129 residues) | LYS VAL PHE GLY ARG CYS GLU LEU ALA ALA ALA MET LYS ARG HIS GLY LEU ASP ASN TYR ARG GLY TYR SER LEU GLY ASN TRP VAL CYS ALA ALA LYS PHE GLU SER ASN PHE ASN THR GLN ALA THR ASN ARG ASN THR ASP GLY SER THR ASP TYR GLY ILE LEU GLN ILE ASN SER ARG TRP TRP CYS ASN ASP GLY ARG THR PRO GLY SER ARG ASN LEU CYS ASN ILE PRO CYS SER ALA LEU LEU SER SER ASP ILE THR ALA SER VAL ASN CYS ALA LYS LYS ILE VAL SER ASP GLY ASN GLY MET ASN ALA TRP VAL ALA TRP ARG ASN ARG CYS LYS GLY THR ASP VAL GLN ALA TRP ILE ARG GLY CYS ARG LEU | KVFGRCELAA AMKRHGLDNY RGYSLGNWVC AAKFESNFNT QATNRNTDGS TDYGILQINS RWWCNDGRTP GSRNLCNIPC SALLSSDITA SVNCAKKIVS DGNGMNAWVA WRNRCKGTDV QAWIRGCRL

*From chicken egg white.
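The one-letter lysozyme sequence above is grouped into blocks of ten residues, which makes long sequences easier to count and to continue over several lines. A minimal sketch of that layout in Python follows. The function name `format_sequence` and the default of five blocks per line are assumptions made for illustration, not a formatting standard.

```python
def format_sequence(sequence: str, block: int = 10, blocks_per_line: int = 5) -> str:
    """Group a one-letter sequence into blocks and wrap it over lines.

    Residues stay in their original order, so every line reads left to
    right from the amino terminus towards the carboxy terminus.
    """
    residues = "".join(sequence.split()).upper()  # strip existing spacing
    blocks = [residues[i:i + block] for i in range(0, len(residues), block)]
    lines = [
        " ".join(blocks[i:i + blocks_per_line])
        for i in range(0, len(blocks), blocks_per_line)
    ]
    return "\n".join(lines)


# Chicken egg white lysozyme, copied from the example above.
lysozyme = (
    "KVFGRCELAA AMKRHGLDNY RGYSLGNWVC AAKFESNFNT QATNRNTDGS "
    "TDYGILQINS RWWCNDGRTP GSRNLCNIPC SALLSSDITA SVNCAKKIVS "
    "DGNGMNAWVA WRNRCKGTDV QAWIRGCRL"
)

print(format_sequence(lysozyme))
print(len("".join(lysozyme.split())), "residues")
```

Stripping the spaces before counting gives the residue total, which can be checked against the 129 residues stated for lysozyme.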
Copyright © 1998, 1999, 2007 by Frank R. Gorga; Page maintained by F.R. Gorga; Last updated: 12-Mar-2007