Codon Plot
Visualise codon usage frequency across a DNA sequence

Paste a raw sequence or one or more FASTA sequences. Non-DNA characters are stripped automatically. U is treated as T. Input limit: 50,000,000 characters.

Codon Usage Table (GCG format)

The default table is E. coli (all coding sequences in GenBank). Replace it with any GCG-format table from the Codon Usage Database (kazusa.or.jp).

💡 Quick Summary

Codon Plot accepts a DNA sequence and generates a text-based plot with a horizontal bar for each codon. The bar length is proportional to the codon's usage fraction in the codon frequency table you supply. Use Codon Plot to identify portions of a sequence that may be poorly expressed in your organism of interest, or to visualise a codon usage table directly.

📋 How to Use
  1. Paste a raw DNA sequence or one or more FASTA sequences into the Input Sequence area. Input limit is 50,000,000 characters.
  2. Review or replace the Codon Usage Table. The tool ships with a default E. coli table from the Codon Usage Database. You can paste any GCG-format table in its place.
  3. Click Run. For each codon in the sequence the tool prints the codon, its position range, its amino acid, and a bar of X characters scaled to the usage fraction.
  4. A bar spanning the full width (96 X characters) represents fraction 1.0, meaning it is the only codon used for that amino acid. A short bar indicates a rarely used codon that may limit expression.
  5. Use the Copy button to copy the plot to your clipboard.
  6. Click Load Example to try a sequence containing one of each codon type with the E. coli default table.
🧮 Formulas & Logic
Bar length
bar_width = round( fraction × 96 ) characters of "X"
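The bar length formula above can be sketched in a few lines of Python. This is an illustrative sketch, not the tool's own code: the function name `bar` and the constant `BAR_FULL_WIDTH` are hypothetical, and rounding ties half up is an assumption, since the tie-breaking rule is not stated here.

```python
import math

BAR_FULL_WIDTH = 96  # full-width bar length, taken from the formula above


def bar(fraction: float, full_width: int = BAR_FULL_WIDTH) -> str:
    """Return a run of "X" characters scaled to the usage fraction."""
    # Assumption: ties round half up. Python's built-in round() rounds
    # ties to the nearest even integer, so it is avoided here.
    width = math.floor(fraction * full_width + 0.5)
    return "X" * width


print(len(bar(1.0)))   # full-width bar
print(len(bar(0.25)))  # a quarter of the full width
```

A fraction of 0 produces an empty bar, which is how an unknown codon would appear.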
Fraction (recalculated)
fraction = perThou_codon / Σ perThou_synonymous_codons (fixes tables where fraction is listed as 0)
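As a rough illustration of the recalculation above, the sketch below sums the /1000 values for each amino acid and divides each codon's value by that sum. The function name, the tuple layout, and the sample numbers are all hypothetical and are not taken from any real table.

```python
from collections import defaultdict


def recalc_fractions(entries):
    """Recompute usage fractions from per-thousand values.

    entries: iterable of (amino_acid, codon, per_thousand) tuples.
    Returns a dict mapping codon -> fraction among its synonymous codons.
    """
    entries = list(entries)
    totals = defaultdict(float)
    for amino_acid, _codon, per_thou in entries:
        totals[amino_acid] += per_thou

    fractions = {}
    for amino_acid, codon, per_thou in entries:
        total = totals[amino_acid]
        # Guard against an amino acid whose codons all have a /1000 value of 0.
        fractions[codon] = per_thou / total if total > 0 else 0.0
    return fractions


# Illustrative values only.
sample = [("Lys", "AAA", 30.0), ("Lys", "AAG", 10.0), ("Met", "ATG", 25.0)]
fractions = recalc_fractions(sample)
print(fractions)
```

Because the fractions are derived from the /1000 column, a table whose Fraction column is all zeros still yields usable bar lengths.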
📊 Result Interpretation
Sequences Processed

Number of FASTA records successfully processed.

Codons Plotted

Total number of codons analysed across all sequences.

Bar length

Longer bars = higher usage fraction. A full-width bar means this codon accounts for all usage of that amino acid in the table organism.

🔬 Applications
  • Identifying codon-poor regions in a heterologous gene that may limit expression in the host organism
  • Comparing codon usage of a synthetic gene against the host organism's preferences before synthesis
  • Visualising the codon composition of a coding sequence to guide codon optimisation efforts
  • Using a sequence of one of each codon type to display a codon usage table as a bar chart
⚠️ Common Mistakes & Warnings
Sequence length not a multiple of 3

If the cleaned sequence length is not divisible by 3, the last 1 or 2 bases are ignored and a warning is shown.
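The trailing-base behaviour described above might look like the following sketch. The function name `split_codons` is hypothetical, and reporting positions as 1-based inclusive ranges is an assumption about how the plot numbers its codons.

```python
def split_codons(seq):
    """Split a cleaned sequence into codons.

    Returns (codons, leftover), where codons is a list of
    (start, end, codon) tuples and leftover holds the 0, 1 or 2
    trailing bases that do not form a complete codon.
    """
    usable = len(seq) - len(seq) % 3
    codons = [
        (i + 1, i + 3, seq[i:i + 3])  # assumed 1-based, inclusive positions
        for i in range(0, usable, 3)
    ]
    leftover = seq[usable:]
    return codons, leftover


codons, leftover = split_codons("ATGAAAGG")
if leftover:
    print(f"Warning: ignoring {len(leftover)} trailing base(s): {leftover}")
print(codons)
```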

Codon not found in table

If a codon still contains characters other than A, T, G, or C after stripping and U→T substitution (for example, ambiguous IUPAC codes such as N), it cannot be looked up. Its bar is rendered at fraction 0 and its amino acid is shown as "???". Check that your sequence does not contain long runs of ambiguous IUPAC codes.
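A minimal sketch of this fallback, assuming the parsed table is held as a dictionary keyed by codon. The dictionary layout, the `lookup` helper, and the sample values are hypothetical.

```python
# Hypothetical two-entry table: codon -> (amino acid, usage fraction).
# Values are illustrative only.
TABLE = {"ATG": ("Met", 1.0), "AAA": ("Lys", 0.75)}


def lookup(codon, table):
    """Return (amino_acid, fraction), or ("???", 0.0) if the codon is unknown."""
    return table.get(codon, ("???", 0.0))


for codon in ["ATG", "ANA"]:
    amino_acid, fraction = lookup(codon, TABLE)
    print(codon, amino_acid, fraction)
```

Here "ANA" contains the ambiguous code N, so it falls through to the "???" entry with fraction 0.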

GCG table must contain the ".." header marker

The parser strips everything before the first ".." in the table. Standard GCG-format tables from the Codon Usage Database include this marker on the header line.
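The header-stripping step can be sketched as below. This is an assumption-laden illustration rather than the tool's parser: it assumes each data row has five whitespace-separated columns (amino acid, codon, number, /1000, fraction), and the function name and sample values are hypothetical.

```python
def parse_gcg_table(text):
    """Parse a GCG-format codon usage table into a list of row tuples."""
    marker = text.find("..")
    if marker == -1:
        raise ValueError('GCG table must contain the ".." header marker')
    body = text[marker + 2:]  # discard everything up to and including ".."

    rows = []
    for line in body.splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or incomplete lines
        amino_acid, codon = fields[0], fields[1].upper().replace("U", "T")
        try:
            number, per_thou, fraction = (float(f) for f in fields[2:5])
        except ValueError:
            continue  # skip rows whose numeric columns do not parse
        rows.append((amino_acid, codon, number, per_thou, fraction))
    return rows


# Illustrative values only, laid out in the assumed column order.
sample = """AmAcid  Codon  Number  /1000  Fraction   ..

Gly  GGG  100.00  10.00  0.25
Gly  GGA  300.00  30.00  0.75
"""
rows = parse_gcg_table(sample)
print(rows)
```

A table pasted without the ".." marker raises an error in this sketch, which mirrors why the marker is required.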

❓ Frequently Asked Questions

What is codon usage fraction?
For any given amino acid, multiple codons may encode it (synonymous codons). The usage fraction for one codon is its frequency relative to all synonymous codons: a fraction of 1.0 means only that codon is ever used, while 0.10 means it accounts for just 10% of that amino acid's codons. The tool recalculates fractions from the /1000 values to correct tables where fractions are stored as 0.
Where can I get codon usage tables for other organisms?
The Codon Usage Database at kazusa.or.jp provides GCG-format tables for thousands of organisms. Download the table for your host organism and paste it into the Codon Usage Table field, replacing the default E. coli table.
What does a short bar mean?
A short bar indicates a low usage fraction: the codon is rarely used by the table organism to encode that amino acid. A gene with many short-bar codons may be poorly translated in that host.
Can I process multiple sequences?
Yes. Paste any number of FASTA-formatted sequences. The same codon table is applied to each sequence, and each produces its own plot section in the output.
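One way to split multi-record input is sketched below. The function name `parse_fasta` is hypothetical, and labelling a raw sequence (no ">" title line) as "untitled" is an assumption rather than documented behaviour.

```python
def parse_fasta(text):
    """Split input into (title, sequence) records.

    Handles one or more FASTA records, and also a raw sequence
    with no ">" title line.
    """
    records = []
    title, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            # A new title closes the record collected so far, if any.
            if title is not None or chunks:
                records.append((title or "untitled", "".join(chunks)))
            title, chunks = line[1:].strip(), []
        elif line:
            chunks.append(line)
    if title is not None or chunks:
        records.append((title or "untitled", "".join(chunks)))
    return records


records = parse_fasta(">seq1\nATG\nAAA\n>seq2\nGGG\n")
for title, seq in records:
    print(title, seq)
```

Each record would then be cleaned and plotted separately against the same codon table.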
Why is U replaced with T?
The codon table uses DNA triplets. If you paste an RNA sequence (containing U), each U is automatically converted to T before lookup so the correct codon is matched.
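The cleaning step could be approximated as follows. Which characters count as "non-DNA" is not spelled out here, so this sketch assumes that letters of the IUPAC nucleotide alphabet are kept and everything else (digits, whitespace, punctuation) is removed. The function name is hypothetical.

```python
import re

# Assumption: IUPAC nucleotide letters survive cleaning; all else is stripped.
NON_DNA = re.compile(r"[^ACGTRYSWKMBDHVN]")


def clean_sequence(raw):
    """Uppercase the input, convert U to T, and strip non-DNA characters."""
    seq = raw.upper().replace("U", "T")
    return NON_DNA.sub("", seq)


print(clean_sequence("aug gcu\n123"))
```

With this input, the RNA codons AUG and GCU become ATG and GCT, and the whitespace and digits are dropped.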