Multi Rev Trans
Degenerate reverse translation of a protein alignment with codon frequency chart

Protein alignment: FASTA or GDE format. All sequences must be pre-aligned (equal length including gaps). Input limit: 20,000,000 characters.

Codon usage table: GCG format with a ".." header marker. The default is the E. coli table (all coding sequences in GenBank). Replace it with any organism's table from the Codon Usage Database.

💡 Quick Summary

Multi Rev Trans accepts a protein alignment and a GCG-format codon usage table, then produces a degenerate DNA consensus sequence plus a per-column nucleotide frequency chart. Use it to identify regions of low nucleotide degeneracy when designing degenerate PCR primers for unsequenced coding regions.

📋 How to Use
  1. Paste a protein alignment in FASTA or GDE format into the top textarea. All sequences must be pre-aligned (equal length including gap characters). Input limit: 20,000,000 characters.
  2. Review or replace the codon usage table. The default is the standard E. coli GCG-format table. You can paste any GCG-format table from the Codon Usage Database.
  3. Click Run. The tool outputs: (1) the FASTA consensus sequence and (2) a bar chart for each alignment column showing nucleotide frequencies at each of the three codon positions.
  4. Use Copy to copy the full output to your clipboard.
  5. Click Load Example to try with three aligned signal peptide sequences.
🧮 Formulas & Logic
Gap contribution
A gap character (- or .) contributes 0.25 to each base frequency at each codon position
Base frequency
Sum of codon-usage fractions across all sequences at that column, divided by number of sequences
Consensus base
IUPAC degenerate code from the set of bases with frequency > 0 at that position
Bar scale
Bar length = round( base_fraction × 98 ) characters out of 98 maximum
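
The gap, base-frequency and consensus rules above can be sketched in a few lines of Python. This is a minimal illustration rather than the tool's own code: the function names are hypothetical, and the codon fractions in CODON_FRACTIONS are made-up values for three amino acids, not a real organism's table.

```python
# Hypothetical codon-usage fractions (illustrative values only).
# For each amino acid the fractions sum to 1.0.
CODON_FRACTIONS = {
    "M": {"ATG": 1.0},
    "W": {"TGG": 1.0},
    "K": {"AAA": 0.75, "AAG": 0.25},
}
GAP_CHARS = {"-", "."}
BASES = "GATC"

# IUPAC codes keyed by the alphabetically sorted set of bases they cover.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "AG": "R", "CT": "Y", "CG": "S", "AT": "W", "GT": "K", "AC": "M",
    "CGT": "B", "AGT": "D", "ACT": "H", "ACG": "V", "ACGT": "N",
}


def column_frequencies(residues):
    """Return one {base: frequency} dict per codon position for a column."""
    sums = [dict.fromkeys(BASES, 0.0) for _ in range(3)]
    for residue in residues:
        if residue in GAP_CHARS:
            # A gap adds 0.25 to every base at every codon position.
            for position in sums:
                for base in BASES:
                    position[base] += 0.25
        else:
            # Each codon adds its usage fraction to the base it carries.
            for codon, fraction in CODON_FRACTIONS[residue].items():
                for index, base in enumerate(codon):
                    sums[index][base] += fraction
    count = len(residues)
    return [{base: total / count for base, total in position.items()}
            for position in sums]


def consensus_codon(frequencies):
    """Pick the IUPAC code covering every base with frequency > 0."""
    codes = []
    for position in frequencies:
        present = "".join(sorted(b for b, f in position.items() if f > 0))
        codes.append(IUPAC[present])
    return "".join(codes)


print(consensus_codon(column_frequencies(["M", "W"])))  # WKG
```

In the example, Met (ATG) and Trp (TGG) differ at the first two codon positions, so the column collapses to W (A or T), K (G or T) and a fixed G.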
📊 Result Interpretation
Consensus FASTA

A single degenerate DNA sequence encoding all aligned amino acids. IUPAC codes (R, Y, S, W, K, M, B, D, H, V, N) represent positions where multiple bases are possible.

Column bar chart

For each alignment column, three sections (first, second, third codon position) each show four bars (G, A, T, C). A single long bar means that one base dominates at that position: degeneracy is low, which is what you want for primer design.

Bar labels

g = guanine (lowercase), a = adenine (lowercase), T = thymine (uppercase), C = cytosine (uppercase). Each bar is followed by the fraction (0.00–1.00).
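
As a rough sketch of how one codon position could be drawn, the Python below applies the bar-length rule (round(fraction × 98)) and the g, a, T, C labels described here. The function name, the "=" fill character and the exact spacing are assumptions for illustration; the tool's actual bar character and layout may differ. Python's round() sends exact halves to the nearest even integer, and how the tool handles such ties is not stated.

```python
BAR_WIDTH = 98                 # maximum bar length, in characters
LABELS = ["g", "a", "T", "C"]  # label case as described above


def render_bars(position_freqs, fill="="):
    """Return one text line per base: label, bar, fraction to two decimals.

    position_freqs maps uppercase bases ("G", "A", "T", "C") to fractions.
    The fill character is a placeholder, not necessarily what the tool prints.
    """
    lines = []
    for label in LABELS:
        fraction = position_freqs[label.upper()]
        bar = fill * round(fraction * BAR_WIDTH)
        lines.append(f"{label} {bar} {fraction:.2f}")
    return lines


# A position where adenine is the only possible base: one full-width bar.
for line in render_bars({"G": 0.0, "A": 1.0, "T": 0.0, "C": 0.0}):
    print(line)
```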

Fraction value

The codon-usage-weighted frequency of that base at this codon position, averaged over all sequences in the alignment (gaps count as 0.25 for each base).

🔬 Applications
  • Designing degenerate PCR primers to clone an unsequenced ortholog from a related species
  • Visualising which codon positions are most variable across a protein family alignment
  • Identifying conserved codon positions that would allow a minimal-degeneracy primer
  • Generating a codon-usage-weighted reverse-translated DNA sequence for synthetic gene design
⚠️ Common Mistakes & Warnings
Sequences must be pre-aligned and equal length

This tool does not perform alignment. Paste sequences that are already aligned (all the same length including gap characters).
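
If you want to check an alignment before pasting it, a small script along these lines will do. It is a sketch only: parse_fasta and check_aligned are hypothetical helper names, and it handles FASTA input, not GDE.

```python
def parse_fasta(text):
    """Return a list of (title, sequence) tuples from FASTA-formatted text."""
    records = []
    title, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if title is not None:
                records.append((title, "".join(chunks)))
            title, chunks = line[1:].strip(), []
        elif title is not None:
            chunks.append(line)
    if title is not None:
        records.append((title, "".join(chunks)))
    return records


def check_aligned(records):
    """Raise ValueError unless every sequence has the same length."""
    lengths = {title: len(seq) for title, seq in records}
    if len(set(lengths.values())) > 1:
        raise ValueError(f"sequences differ in length: {lengths}")
    return True


alignment = """>seq1
MKK-LLW
>seq2
MKKALLW
"""
print(check_aligned(parse_fasta(alignment)))  # True
```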

Codon table must be in GCG format with a ".." header marker

The tool strips everything before the ".." marker and then parses the AmAcid, Codon, Number, /1000 and Fraction columns. The Codon Usage Database (kazusa.or.jp) provides tables in this format.

Fractions are recalculated from /1000 values

Some entries in the Codon Usage Database list fraction as 0. The tool automatically recalculates fractions from the /1000 column to fix this.
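
The parsing and recalculation steps might look roughly like the Python below. It is a sketch under stated assumptions: the function names are hypothetical, the sample table uses made-up numbers, and the recalculation rule (each codon's /1000 value divided by the /1000 total for its amino acid) is an assumption about how the fractions are rebuilt, not a confirmed description of the tool's code.

```python
from collections import defaultdict


def parse_gcg_table(text):
    """Return (amino_acid, codon, per_thousand) rows found after the '..' marker."""
    _, marker, body = text.partition("..")
    if not marker:
        raise ValueError('codon table has no ".." header marker')
    rows = []
    for line in body.splitlines():
        fields = line.split()
        if len(fields) != 5:
            continue  # skip blank or malformed lines
        amino, codon, _number, per_thousand, _fraction = fields
        try:
            rows.append((amino, codon.upper(), float(per_thousand)))
        except ValueError:
            continue  # skip rows whose /1000 column is not numeric
    return rows


def recalc_fractions(rows):
    """Assumed rule: fraction = codon /1000 value over its amino acid's total."""
    totals = defaultdict(float)
    for amino, _codon, per_thousand in rows:
        totals[amino] += per_thousand
    return {codon: (per_thousand / totals[amino] if totals[amino] else 0.0)
            for amino, codon, per_thousand in rows}


# Made-up table whose Fraction column is all zeros.
table = """AmAcid  Codon  Number  /1000  Fraction   ..

Lys     AAA    30.00   30.00  0.00
Lys     AAG    10.00   10.00  0.00
Met     ATG    20.00   20.00  0.00
"""
print(recalc_fractions(parse_gcg_table(table)))
```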

❓ Frequently Asked Questions

What does the consensus sequence represent?
Each triplet in the consensus covers the amino acids found at the corresponding alignment column: at each codon position it uses the IUPAC degenerate code for all bases with non-zero frequency. For example, if the first base is always G or A, the consensus shows R. The full sequence is therefore a degenerate DNA oligomer that matches every codon for every aligned residue. Because the three codon positions are merged independently, it can also match codons for residues that are not in the alignment.
How do gap columns affect the result?
A gap (- or .) at a column adds 0.25 to each of the four base frequencies for all three codon positions. This reflects total uncertainty about the nucleotide: a gap is treated as an equally probable unknown base.
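
As a worked example, take a hypothetical column with three sequences: two methionines (Met has the single codon ATG, so its usage fraction is 1.0) and one gap. Writing f_1(X) for the frequency of base X at the first codon position, and applying the rule that the column sum is divided by the number of sequences:

```latex
f_1(A) = \frac{1 + 1 + 0.25}{3} = 0.75, \qquad
f_1(G) = f_1(T) = f_1(C) = \frac{0 + 0 + 0.25}{3} \approx 0.083
```

The second and third positions work the same way, with T and G respectively at 0.75. Every base now has a frequency above zero at every position, so the consensus for this column is NNN even though the bars still show one clearly dominant base.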
Where can I get a codon usage table?
The Codon Usage Database at kazusa.or.jp provides GCG-format tables for thousands of organisms. Search for your organism of interest and copy the GCG table into the codon table field.
What does a single-base bar mean?
If one bar spans the full width (fraction ≈ 1.00) and the other three are zero, every codon for every amino acid in that column carries the same nucleotide at that codon position, so there is zero degeneracy there. That position in a primer will match every sequence in the alignment perfectly.