This calculator translates a DNA or RNA sequence into an amino acid chain using the standard genetic code. You paste a nucleotide sequence, select a reading frame, and the tool converts each codon (group of three bases) into its corresponding amino acid, shown in one-letter code.
The page also serves as a compact reference on how codons work, how reading frames affect translation, and what assumptions this calculator makes. It is designed for students, educators, and anyone needing a quick way to convert gene sequences to protein.
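DNA and RNA are long chains built from four types of nucleotides: adenine (A), cytosine (C), guanine (G), and thymine (T) in DNA, with uracil (U) taking the place of thymine in RNA.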
During translation, the cell reads the sequence three nucleotides at a time. Each group of three is a codon. A codon either specifies one amino acid or acts as a start or stop signal.
Because there are four possible bases and three positions, there are $4^3 = 64$ possible codons. These map to 20 standard amino acids plus start and stop signals. Many amino acids have multiple codons, which makes the genetic code redundant but also more tolerant to some mutations.
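The count of 64 can be confirmed by enumerating every triplet. This is a minimal sketch; the names `BASES` and `all_codons` are illustrative and not taken from the calculator.

```python
from itertools import product

BASES = "TCAG"  # DNA alphabet; in RNA, U would take the place of T

# Every ordered choice of three bases is one codon: 4 * 4 * 4 possibilities.
all_codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]

print(len(all_codons))  # 64
print(all_codons[:4])   # ['TTT', 'TTC', 'TTA', 'TTG']
```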
The core idea behind the calculator is straightforward: clean the input, group the sequence into triplets according to the chosen frame, and map those triplets to amino acids.
Mathematically, suppose the cleaned sequence has length N nucleotides, indexed from 0. If you choose a reading frame offset f (0 for Frame 1, 1 for Frame 2, 2 for Frame 3), the number of full codons k that can be read is:

$$k = \left\lfloor \frac{N - f}{3} \right\rfloor$$
Only full codons are translated; any leftover bases at the end are ignored. Each codon is then looked up in a fixed codon table to determine the amino acid.
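The codon count described above, the floor of (N - f) / 3, can be checked with a few lines of Python. This is a sketch under the stated definitions; the function names are hypothetical, and the guard against sequences shorter than the offset is an added assumption.

```python
def count_full_codons(n: int, frame_offset: int) -> int:
    """Number of complete codons in a sequence of length n read from frame_offset (0, 1, or 2)."""
    if frame_offset not in (0, 1, 2):
        raise ValueError("frame_offset must be 0, 1, or 2")
    # max(..., 0) covers sequences shorter than the offset (assumed to yield no codons).
    return max(n - frame_offset, 0) // 3


def leftover_bases(n: int, frame_offset: int) -> int:
    """Number of trailing bases that do not fit into a complete codon."""
    return max(n - frame_offset, 0) % 3


# A 15-base sequence read in each of the three frames: offset, codons, leftover.
for f in (0, 1, 2):
    print(f, count_full_codons(15, f), leftover_bases(15, f))
# 0 5 0
# 1 4 2
# 2 4 1
```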
Biologically, the correct frame is usually set by the location of a start codon, but here you select it manually.
You can try different frames on the same sequence to see how frame shifts change the protein sequence or introduce early stop codons.
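This tool accepts both DNA and RNA-style input:

- DNA, written with T (for example, ATGGCC...)
- RNA, written with U (for example, AUGGCC...)

To make input more forgiving, the calculator strips whitespace and line breaks and removes any character that is not a recognized base letter, including ambiguity codes such as N, R, and Y.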
If the last one or two bases at the end of the cleaned sequence do not form a complete codon, they are left untranslated and do not appear in the amino-acid output.
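A cleaning step of this kind might look like the sketch below. Two details are assumptions rather than documented behavior of the calculator: that input is case-insensitive, and that U is read as T so a single DNA-style codon table can serve both input types. The function name `clean_sequence` is hypothetical.

```python
def clean_sequence(raw: str) -> str:
    """Return only the A/C/G/T letters of raw, in their original order."""
    # Assumption: lowercase input is accepted and RNA-style U is treated as T.
    normalized = raw.upper().replace("U", "T")
    # Whitespace, digits, punctuation, and ambiguity codes (N, R, Y, ...) are dropped.
    return "".join(base for base in normalized if base in "ACGT")


print(clean_sequence("ATG GAA TTT\nGCC TGA"))  # ATGGAATTTGCCTGA
print(clean_sequence("augNgcc"))               # ATGGCC
```

Note that dropping a character from the middle of a sequence, as with the N in the second call, shifts every downstream base by one position and therefore changes the codons that follow.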
Because codons have three bases, a single nucleotide string can be read in three different forward reading frames. Changing the frame completely changes which triplets are formed and therefore which amino acids are produced.
For example, consider the DNA sequence:
ATGAAACCC
- Frame 1: ATG AAA CCC → Met (M), Lys (K), Pro (P)
- Frame 2: TGA AAC CC... → starts with TGA, which is a stop codon
- Frame 3: GAA ACC C... → begins with GAA (Glu, E)

In a real gene, only one of these frames is used for the protein-coding region, and it usually begins at a start codon such as ATG (AUG in RNA). The calculator does not attempt to guess the correct frame; instead, you can explore all three and see the differences for yourself.
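The three groupings of ATGAAACCC can be reproduced with simple slicing. The helper below is an illustrative sketch, not the calculator's code; `split_into_codons` is a hypothetical name, and it returns the leftover bases separately so they are easy to inspect.

```python
def split_into_codons(sequence: str, frame: int) -> tuple[list[str], str]:
    """Split sequence into full codons for frame 1, 2, or 3; also return leftover bases."""
    if frame not in (1, 2, 3):
        raise ValueError("frame must be 1, 2, or 3")
    offset = frame - 1
    # Index just past the last complete codon in this frame.
    usable_end = offset + 3 * (max(len(sequence) - offset, 0) // 3)
    codons = [sequence[i:i + 3] for i in range(offset, usable_end, 3)]
    return codons, sequence[usable_end:]


for frame in (1, 2, 3):
    codons, leftover = split_into_codons("ATGAAACCC", frame)
    print(frame, codons, repr(leftover))
# 1 ['ATG', 'AAA', 'CCC'] ''
# 2 ['TGA', 'AAC'] 'CC'
# 3 ['GAA', 'ACC'] 'C'
```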
The calculator uses the standard genetic code for nuclear genes. A small subset of codons is shown below for reference:
| Codon(s) | Amino acid | One-letter code | Notes |
|---|---|---|---|
| TTT, TTC | Phenylalanine | F | Hydrophobic aromatic |
| TTA, TTG, CTT, CTC, CTA, CTG | Leucine | L | Six different codons |
| ATT, ATC, ATA | Isoleucine | I | ATA serves as a start codon in some mitochondrial codes, but not in the standard code used here |
| ATG | Methionine | M | Common start codon |
| GTT, GTC, GTA, GTG | Valine | V | Hydrophobic |
| TAA, TAG, TGA | Stop | * | Termination codons |
The full calculator internally includes all 64 codons. For translation, each valid codon is mapped to its one-letter amino-acid symbol, and stop codons are typically represented by an asterisk (*) or another clear marker.
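One compact way to hold all 64 entries is to pair the codons, enumerated in T, C, A, G order, with a 64-character string of one-letter symbols. The sketch below follows that common layout for the standard code; it is an assumption about how such a table could be built, not a description of the calculator's internals, and the names `CODON_TABLE` and `codons_for` are illustrative.

```python
from itertools import product

BASES = "TCAG"
# One-letter symbols for the standard code, in TCAG x TCAG x TCAG order; '*' marks a stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Invert the table to see the redundancy: which codons encode each symbol.
codons_for: dict[str, list[str]] = {}
for codon, aa in CODON_TABLE.items():
    codons_for.setdefault(aa, []).append(codon)

print(len(CODON_TABLE))         # 64
print(sorted(codons_for["L"]))  # ['CTA', 'CTC', 'CTG', 'CTT', 'TTA', 'TTG']
print(sorted(codons_for["*"]))  # ['TAA', 'TAG', 'TGA']
```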
This example shows exactly how the calculator behaves for a short DNA sequence.
Suppose you paste the following DNA sequence (with spaces and line breaks):
ATG GAA TTT
GCC TGA
The tool strips whitespace and keeps only the letters A, C, G, and T, giving:
ATGGAATTTGCCTGA
Select Frame 1 (start at the first base). The tool will split the cleaned sequence into codons:
ATG GAA TTT GCC TGA
Using the standard code:

- ATG → Met (M)
- GAA → Glu (E)
- TTT → Phe (F)
- GCC → Ala (A)
- TGA → Stop (*)
The resulting amino-acid sequence in one-letter code is:
M E F A *
Depending on how the interface is configured, you might see the amino-acid string without spaces (e.g., MEFA*) or with separators. Some implementations may also display the original codons aligned with their amino acids.
If you choose Frame 2 instead, the codons shift:
TGG AAT TTG CCT GA...
Now the amino-acid sequence is W N L P (Trp, Asn, Leu, Pro) rather than M E F A *, and the last incomplete codon (GA) is ignored. This illustrates how shifting the reading frame by a single base can completely alter the translated protein.
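The worked example can be reproduced end to end with a short script. This is a sketch of the clean, group, and map steps under stated assumptions: U is read as T, input is case-insensitive, and stop codons are shown as `*` without ending the output. The `translate` function and `CODON_TABLE` name are hypothetical.

```python
from itertools import product

# Standard genetic code, built from codons enumerated in TCAG order; '*' marks a stop.
CODON_TABLE = dict(zip(
    ("".join(c) for c in product("TCAG", repeat=3)),
    "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG",
))


def translate(raw: str, frame: int = 1) -> str:
    """Clean raw input, read it in frame 1, 2, or 3, and return one-letter amino acids."""
    if frame not in (1, 2, 3):
        raise ValueError("frame must be 1, 2, or 3")
    # Assumption: case-insensitive input, U treated as T, everything else discarded.
    cleaned = "".join(b for b in raw.upper().replace("U", "T") if b in "ACGT")
    offset = frame - 1
    # Stop at len - 2 so that only complete triplets are taken.
    codons = [cleaned[i:i + 3] for i in range(offset, len(cleaned) - 2, 3)]
    return "".join(CODON_TABLE[codon] for codon in codons)


pasted = "ATG GAA TTT\nGCC TGA"
print(translate(pasted, frame=1))  # MEFA*
print(translate(pasted, frame=2))  # WNLP
```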
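Once you click Translate, you receive at least one of the following:

- The translated amino-acid sequence in one-letter code (e.g., MKVLY*).
- A marker for each stop codon (*).

Points to keep in mind when reading the output:

- Stop codons are marked, but later codons may still be translated and shown after them.
- Trailing bases that do not form a complete codon are omitted from the output.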
The calculator models the core codon-to-amino-acid mapping but simplifies many biological details. The table below outlines some key differences.
| Aspect | Calculator behavior | Biological translation |
|---|---|---|
| Reading frame selection | User chooses Frame 1, 2, or 3; all are treated equally. | Frame is set by the start codon within a specific context on the mRNA. |
| Start codon handling | ATG/AUG is translated to methionine like any other codon; no special initiation logic. | Start codons recruit the ribosome and often define the N-terminus of the protein. |
| Stop codon handling | Stop codons are marked (e.g., as *), but translation of later codons can continue in the output. | Translation typically terminates at the first in-frame stop codon. |
| Strand direction | Only the sequence as entered (forward direction) is translated. | Genes can be on either strand; mRNA is synthesized in a defined orientation. |
| Genetic code used | Always uses the standard nuclear genetic code. | Some organisms and organelles (e.g., mitochondria) use variant codes. |
| Ambiguous bases (N, R, Y, etc.) | Ambiguous letters are removed and not translated. | In reality, ambiguous positions represent uncertainty but still correspond to a physical base in the molecule. |
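To approximate the biological behavior in the stop-codon row, the calculator's output can be cut at the first stop marker. The helper below is a small illustrative sketch; `truncate_at_first_stop` is a hypothetical name, and it assumes stops are rendered as `*`.

```python
def truncate_at_first_stop(protein: str, stop_symbol: str = "*") -> str:
    """Return the residues before the first stop symbol (the whole string if none is present)."""
    head, _, _ = protein.partition(stop_symbol)
    return head


print(truncate_at_first_stop("MEFA*"))    # MEFA
print(truncate_at_first_stop("MK*VLY*"))  # MK
print(truncate_at_first_stop("WNLP"))     # WNLP
```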
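To keep the tool simple and fast, the calculator makes the simplifying assumptions summarized in the table above: a manually chosen frame, no special start-codon logic, no termination at stop codons, forward strand only, the standard nuclear code only, and removal of ambiguous bases. Be aware of these when interpreting results.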
Within these limitations, the calculator is a convenient way to explore how changes in a nucleotide sequence affect the resulting amino-acid chain, to teach the principles of the genetic code, or to perform quick sanity checks on small fragments of genes.