💡 Quick Summary
Reverse Translate accepts a protein sequence and uses a codon usage table to generate a DNA sequence representing the most likely non-degenerate coding sequence. A consensus sequence derived from all the possible codons for each amino acid is also returned. Use Reverse Translate when designing PCR primers to anneal to an unsequenced coding sequence from a related species.
📋 How to Use
- Paste a raw protein sequence or one or more FASTA sequences into the top textarea. Valid single-letter amino acid codes are accepted. Stop codons (*) are supported. Input limit: 20,000,000 characters.
- Review or replace the codon usage table. The default is the standard E. coli GCG-format table. You can paste any GCG-format table from the Codon Usage Database.
- Click Run. The tool outputs: (1) most likely codons, (2) consensus (IUPAC degenerate) codons, and (3) a base probability graph for each sequence.
- Use the Copy buttons to copy any output section to your clipboard.
- Click Load Example to try with a sample 20-amino-acid sequence.
🧮 Formulas & Logic
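The selection of the most likely codons can be sketched as picking, for each residue, the codon with the highest fraction in the usage table. The snippet below is a minimal illustration, not the tool's actual code: the `USAGE` table is a hypothetical, truncated example with made-up fractions, and emitting `NNN` for a residue missing from the table is an assumption.

```python
# Hypothetical, truncated usage table: residue -> {codon: fraction}.
# The fractions are illustrative values, not real codon usage data.
USAGE = {
    "M": {"ATG": 1.00},
    "K": {"AAA": 0.74, "AAG": 0.26},
    "F": {"TTT": 0.58, "TTC": 0.42},
    "*": {"TAA": 0.61, "TGA": 0.30, "TAG": 0.09},
}


def most_likely_codons(protein, usage):
    """Build a DNA string from the highest-fraction codon of each residue."""
    codons = []
    for residue in protein.upper():
        table = usage.get(residue)
        if table is None:
            # Assumption: residues absent from the table become NNN.
            codons.append("NNN")
        else:
            codons.append(max(table, key=table.get))
    return "".join(codons)


result = most_likely_codons("MKF*", USAGE)
print(result)
```

With this toy table, each residue maps to exactly one codon, so the output length is always three times the protein length.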
📊 Result Interpretation
Most likely codons: a non-degenerate DNA sequence that uses the statistically most probable codon for each amino acid, based on the selected organism's codon usage.
Consensus codons: a degenerate DNA sequence that uses IUPAC ambiguity codes (R, Y, S, W, K, M, B, D, H, V, N) to represent, at each codon position, every base found in any codon for that amino acid. Useful for designing degenerate PCR primers.
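One way to derive a consensus codon is to collect the set of bases seen at each codon position across all synonymous codons and map that set to its IUPAC code. The following is a sketch under that assumption; the function name `consensus_codon` is hypothetical, and the codon lists come from the standard genetic code.

```python
# IUPAC nucleotide codes keyed by the set of bases they represent.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}


def consensus_codon(codons):
    """Collapse synonymous codons into one three-letter IUPAC codon."""
    return "".join(
        IUPAC[frozenset(codon[i] for codon in codons)] for i in range(3)
    )


# Synonymous codons from the standard genetic code.
leucine = ["TTA", "TTG", "CTT", "CTC", "CTA", "CTG"]
serine = ["TCT", "TCC", "TCA", "TCG", "AGT", "AGC"]
isoleucine = ["ATT", "ATC", "ATA"]

for name, codons in [("Leu", leucine), ("Ser", serine), ("Ile", isoleucine)]:
    print(name, consensus_codon(codons))
```

Because each position is collapsed independently, a consensus codon can match more than the amino acid's own codons. For example, the leucine consensus YTN also matches TTT and TTC, which encode phenylalanine.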
Base probability graph: for each amino acid, three sections (first, second, and third codon position) each show four bars (G, A, T, C). A single long bar means that one base dominates at that position, which indicates low degeneracy and is ideal for primer design.
Graph legend: g = guanine (lowercase), a = adenine (lowercase), T = thymine (uppercase), C = cytosine (uppercase). Each bar is followed by the base's frequency (0.00–1.00).
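The per-position frequencies behind such a graph can be approximated by summing the fractions of all codons that carry a given base at a given position. This is a sketch of that idea rather than the tool's implementation; `position_frequencies` is a hypothetical name and the lysine fractions are illustrative.

```python
def position_frequencies(codon_fractions):
    """Sum codon fractions per base at each of the three codon positions."""
    freqs = [dict.fromkeys("GATC", 0.0) for _ in range(3)]
    for codon, fraction in codon_fractions.items():
        for position, base in enumerate(codon):
            freqs[position][base] += fraction
    return freqs


# Hypothetical fractions for lysine, for illustration only.
lysine = {"AAA": 0.75, "AAG": 0.25}
freqs = position_frequencies(lysine)

for position, table in enumerate(freqs, start=1):
    summary = "  ".join(f"{base}={value:.2f}" for base, value in table.items())
    print(f"position {position}: {summary}")
```

In this example the first two positions are fully determined (A at frequency 1.00), while the third position is split between A and G.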
🔬 Applications
- Designing degenerate PCR primers to clone an unsequenced ortholog from a related species
- Generating the most likely DNA coding sequence for a known protein from a specific organism
- Identifying codon positions with minimal degeneracy for primer annealing
- Producing a codon-usage-optimised synthetic gene design starting from a protein sequence
⚠️ Common Mistakes & Warnings
The codon usage table must be in GCG format. The tool strips everything before the ".." marker and then parses the AmAcid, Codon, Number, /1000, and Fraction columns. Tables from the Codon Usage Database (kazusa.or.jp) are available in this format.
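A parser for this layout might look like the sketch below. It assumes whitespace-separated columns, skips any line that does not have exactly five fields, and falls back to parsing the whole text when no ".." marker is present; those choices, the function name `parse_gcg_table`, and the sample rows are all illustrative assumptions.

```python
def parse_gcg_table(text):
    """Return (amino_acid, codon, number, per_thousand, fraction) tuples."""
    marker = text.find("..")
    # Assumption: if no ".." marker exists, try to parse the whole text.
    body = text[marker + 2:] if marker != -1 else text
    rows = []
    for line in body.splitlines():
        fields = line.split()
        if len(fields) != 5:
            continue  # blank or malformed line
        amino, codon, number, per_thousand, fraction = fields
        try:
            rows.append((amino, codon.upper(), float(number),
                         float(per_thousand), float(fraction)))
        except ValueError:
            continue  # non-numeric value in a numeric column
    return rows


# Illustrative sample; the values are not taken from a real table.
SAMPLE = """Header text that the parser discards

AmAcid  Codon  Number  /1000  Fraction   ..

Gly     GGG     1743.00     9.38     0.13
Gly     GGA     1290.00     6.94     0.09
"""

rows = parse_gcg_table(SAMPLE)
for row in rows:
    print(row)
```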
Some entries in the Codon Usage Database list fraction as 0. The tool automatically recalculates fractions from the /1000 column to fix this.
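Recalculating a fraction from the /1000 column amounts to dividing each codon's per-thousand value by the per-thousand total of all codons for the same amino acid. The snippet below is a sketch of that arithmetic, not the tool's code; `recalc_fractions` is a hypothetical name and the input values are made up.

```python
from collections import defaultdict


def recalc_fractions(rows):
    """Map each codon to its share of the amino acid's /1000 total.

    rows: iterable of (amino_acid, codon, per_thousand) tuples.
    """
    rows = list(rows)
    totals = defaultdict(float)
    for amino, _codon, per_thousand in rows:
        totals[amino] += per_thousand
    return {
        codon: (per_thousand / totals[amino] if totals[amino] else 0.0)
        for amino, codon, per_thousand in rows
    }


# Hypothetical /1000 values, for illustration only.
table = [("Lys", "AAA", 30.0), ("Lys", "AAG", 10.0), ("Met", "ATG", 25.0)]
fractions = recalc_fractions(table)
print(fractions)
```

By construction, the fractions for each amino acid sum to 1 whenever that amino acid has a nonzero /1000 total.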
Residues coded as X are treated as equally likely to be any amino acid — all four bases receive 0.25 frequency at each codon position, producing N in the consensus sequence.