Translate - TheBiologyBro

💡 Quick Summary

Translate accepts a DNA sequence and converts it into a protein sequence in the reading frame you specify. It supports the full IUPAC alphabet, an uppercase reading frame for case-masked sequences, direct and reverse-complement strands, and 17 NCBI genetic code tables.

📋 How to Use

Paste a raw DNA sequence or one or more FASTA sequences into the textarea. IUPAC degenerate bases (R, Y, S, W, K, M, B, D, H, V, N) are supported. Input limit: 200,000,000 characters.
Select the reading frame: frame 1, 2, or 3 starts translation at offset 0, 1, or 2 respectively. The uppercase frame strips all lowercase bases first and then translates the uppercase portion.
Choose the strand: direct translates the sequence as given; reverse takes the reverse complement before translating.
Select the genetic code appropriate for your organism (default: standard code, table 1).
Click Run. Each sequence is output in FASTA format with a header showing the reading frame and original sequence title.
Use Copy to copy the full output to your clipboard.

🧮 Formulas & Logic

Reading frame offset

Frame 1 = bases 1,2,3… | Frame 2 = bases 2,3,4… | Frame 3 = bases 3,4,5…
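
The offset rule can be illustrated with a short Python sketch. This is not the tool's own code; the function name `codons_in_frame` is hypothetical and only shows how a frame number might map to a starting offset.

```python
def codons_in_frame(seq: str, frame: int) -> list[str]:
    """Split seq into complete codons for reading frame 1, 2, or 3."""
    if frame not in (1, 2, 3):
        raise ValueError("frame must be 1, 2, or 3")
    start = frame - 1  # frame 1 -> offset 0, frame 2 -> 1, frame 3 -> 2
    # Stopping at len(seq) - 2 keeps only complete three-base codons.
    return [seq[i:i + 3] for i in range(start, len(seq) - 2, 3)]


print(codons_in_frame("ATGGCCTAA", 1))  # ['ATG', 'GCC', 'TAA']
print(codons_in_frame("ATGGCCTAA", 2))  # ['TGG', 'CCT']
print(codons_in_frame("ATGGCCTAA", 3))  # ['GGC', 'CTA']
```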

Uppercase frame

Lowercase bases are stripped; the remaining uppercase bases are translated from position 1
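
A minimal Python sketch of the stripping step, assuming it amounts to keeping uppercase characters in their original order (the helper name `uppercase_frame_input` is hypothetical):

```python
def uppercase_frame_input(seq: str) -> str:
    """Keep only the uppercase characters of seq, in their original order."""
    return "".join(ch for ch in seq if ch.isupper())


masked = "ATGgtaagtGCCtttTAA"  # lowercase marks the regions to skip
print(uppercase_frame_input(masked))  # ATGGCCTAA
```

The result is then translated from its first base, so an all-lowercase input leaves nothing to translate.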

Reverse strand

The reverse complement of the sequence is computed first, then the selected reading frame is applied
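
The order of operations matters: the frame offset counts from the first base of the reverse complement, not of the original sequence. A small Python sketch under that reading (the names are hypothetical, and the complement table follows the standard IUPAC pairings):

```python
# Complement for every IUPAC nucleotide code, e.g. R (A/G) <-> Y (C/T).
_COMPLEMENT = str.maketrans(
    "ACGTURYSWKMBDHVNacgturyswkmbdhvn",
    "TGCAAYRSWMKVHDBNtgcaayrswmkvhdbn",
)


def reverse_complement(seq: str) -> str:
    """Complement each base, then reverse the whole sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


rc = reverse_complement("ATGGCCTAA")
print(rc)      # TTAGGCCAT
print(rc[1:])  # reading frame 2 of the reverse strand: TAGGCCAT
```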

Unknown codon → X

Any three-base codon not covered by the selected genetic code table is translated as X

Partial last codon

If the number of bases left after the frame offset is not a multiple of 3, the trailing 1 or 2 bases are silently discarded
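
The last two rules can be sketched together in Python. This is an illustrative lookup against the standard code (NCBI table 1), not the tool's implementation; `translate` and `STANDARD_CODE` are hypothetical names, and the sketch simply treats any codon missing from the table as X.

```python
from itertools import product

# NCBI translation table 1, with codons enumerated in TCAG order.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product("TCAG", repeat=3), AMINO_ACIDS)
}


def translate(seq: str, table: dict[str, str] = STANDARD_CODE) -> str:
    """Translate seq from its first base; codons not in the table become X."""
    seq = seq.upper().replace("U", "T")
    residues = []
    # The range stops before any trailing 1- or 2-base remainder.
    for i in range(0, len(seq) - 2, 3):
        residues.append(table.get(seq[i:i + 3], "X"))
    return "".join(residues)


print(translate("ATGGCCTAA"))    # MA*
print(translate("ATGGCCTAAGG"))  # MA*  (trailing GG dropped)
print(translate("ATGNNNTAA"))    # MX*  (NNN is not in the table)
```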

📊 Result Interpretation

Output header

">rf {frame} {title}" (e.g., ">rf 1 sample sequence" for reading frame 1). The uppercase frame is shown as '>rf "uppercase" …'.

Stop codons

Stop codons are shown as * in the protein sequence.

X residues

An X in the protein indicates a codon that could not be unambiguously assigned to an amino acid under the selected genetic code.
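
One plausible way to decide whether a degenerate codon is unambiguous is to expand it into every concrete codon it can represent and check whether they all encode the same residue. The Python sketch below assumes that approach and the standard code; it is not confirmed to be how the tool works, and the names are hypothetical.

```python
from itertools import product

# NCBI translation table 1, with codons enumerated in TCAG order.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product("TCAG", repeat=3), AMINO_ACIDS)
}

# IUPAC nucleotide codes and the concrete bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def translate_codon(codon: str, table: dict[str, str] = STANDARD_CODE) -> str:
    """Return one residue if every expansion of the codon agrees, else X."""
    codon = codon.upper()
    if len(codon) != 3 or any(base not in IUPAC for base in codon):
        return "X"
    residues = {
        table["".join(bases)]
        for bases in product(*(IUPAC[base] for base in codon))
    }
    return residues.pop() if len(residues) == 1 else "X"


print(translate_codon("GCN"))  # A  (GCA, GCC, GCG, GCT all encode Ala)
print(translate_codon("TAR"))  # *  (TAA and TAG are both stops)
print(translate_codon("RAT"))  # X  (AAT is Asn, GAT is Asp)
print(translate_codon("NNN"))  # X
```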

Multiple sequences

Each FASTA record is translated independently with the same frame, strand, and genetic code settings.
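
To make the per-record behaviour concrete, here is a minimal Python sketch that splits multi-FASTA text into (title, sequence) pairs so each can be processed with the same settings. The function name `parse_fasta` is hypothetical, and giving a raw sequence without a header an empty title is an assumption, not documented tool behaviour.

```python
def parse_fasta(text: str) -> list[tuple[str, str]]:
    """Split FASTA text into (title, sequence) pairs, one per record."""
    records: list[tuple[str, str]] = []
    title, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the record collected so far, if any.
            if title is not None or chunks:
                records.append((title or "", "".join(chunks)))
            title, chunks = line[1:].strip(), []
        else:
            chunks.append(line)
    if title is not None or chunks:
        records.append((title or "", "".join(chunks)))
    return records


for title, seq in parse_fasta(">first\nATGGCC\nTAA\n>second\nATGAAATAG\n"):
    print(title, seq)
# first ATGGCCTAA
# second ATGAAATAG
```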

🔬 Applications

Translating a newly sequenced coding region to check for frameshifts or premature stops
Checking all three reading frames of a DNA fragment to find the correct ORF
Translating a reverse-complement strand to find genes encoded on the antisense strand
Translating case-masked sequences (lowercase marks regions to skip, such as repeats or introns) using the uppercase frame so that only the uppercase portion is translated
Verifying the protein product of a synthetic gene design before ordering

⚠️ Common Mistakes & Warnings

Frame selection shifts from the very first base in each sequence

Frame 2 skips the first base; frame 3 skips the first two. If your sequence has a FASTA header, it is stripped before the offset is applied.

Non-DNA characters are silently removed

Characters that are not in the IUPAC DNA/RNA alphabet (A, C, G, T, U, R, Y, S, W, K, M, B, D, H, V, N, and their lowercase equivalents) are stripped before translation. Numbers, spaces, and other symbols are ignored.
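
The practical consequence is that pasted text with position numbers or spacing still translates, because only the allowed letters remain. A Python sketch of an equivalent filter (the name `clean_sequence` is hypothetical; the character set is the one listed above):

```python
import re

# Anything outside the IUPAC DNA/RNA alphabet (either case) is removed.
_NON_IUPAC = re.compile(r"[^ACGTURYSWKMBDHVNacgturyswkmbdhvn]")


def clean_sequence(raw: str) -> str:
    """Drop digits, whitespace, and any other non-IUPAC characters."""
    return _NON_IUPAC.sub("", raw)


print(clean_sequence("1 atgGCC taa\n61 NNX-*"))  # atgGCCtaaNN
```

Because removal happens before the frame offset is applied, a stray character in the middle of a sequence silently shifts every base after it by one position.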

Uppercase frame requires mixed-case input

The uppercase frame only makes sense when the input sequence intentionally uses lowercase to mark regions you want to skip (e.g., repeat-masked sequences). If the entire sequence is lowercase, nothing will be translated.

❓ Frequently Asked Questions

What does the uppercase reading frame do?

It strips all lowercase characters from the DNA sequence before translating. This is useful with repeat-masked sequences where repetitive elements are lowercased and coding regions are uppercase — only the uppercase (non-masked) portions are translated.

Why do I see X in the output?

X appears when a codon cannot be mapped to an amino acid by the selected genetic code. This usually happens with highly degenerate IUPAC codons (e.g., NNN), codons that are genuinely ambiguous under the chosen table, or non-standard bases that survived the input filter.

How do I translate all six reading frames?

Run the tool six times: frames 1, 2, and 3 on the direct strand, then frames 1, 2, and 3 on the reverse strand. Each run translates a single frame on a single strand. For a fully automated six-frame search with ORF filtering, use the ORF Finder tool instead.
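
Outside the tool, all six translations can be produced in one pass. The following Python sketch assumes the standard code and maps any codon not in the table to X; `six_frames` and the other names are hypothetical and not part of the tool.

```python
from itertools import product

# NCBI translation table 1, with codons enumerated in TCAG order.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {
    "".join(codon): aa
    for codon, aa in zip(product("TCAG", repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def six_frames(seq: str) -> dict[tuple[str, int], str]:
    """Translate frames 1-3 on the direct and reverse-complement strands."""
    seq = seq.upper().replace("U", "T")
    strands = {"direct": seq, "reverse": seq.translate(COMPLEMENT)[::-1]}
    result = {}
    for strand, bases in strands.items():
        for frame in (1, 2, 3):
            result[(strand, frame)] = "".join(
                CODE.get(bases[i:i + 3], "X")
                for i in range(frame - 1, len(bases) - 2, 3)
            )
    return result


for (strand, frame), protein in six_frames("ATGGCCTAA").items():
    print(strand, frame, protein)
# direct 1 MA*
# direct 2 WP
# direct 3 GL
# reverse 1 LGH
# reverse 2 *A
# reverse 3 RP
```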

Which genetic code should I choose?

For most nuclear genes, use the standard (1) code. Mitochondrial genes require organism-specific tables (e.g., vertebrate mitochondrial for human/mouse mtDNA). The bacterial (11) table is appropriate for bacteria and plastids.
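
To see why the choice matters, the Python sketch below compares the standard code (NCBI table 1) with the vertebrate mitochondrial code (NCBI table 2) and lists the codons they translate differently. The amino-acid strings follow the NCBI table definitions; the variable names are illustrative only.

```python
from itertools import product

# Codons enumerated in the TCAG order used by the NCBI table definitions.
CODONS = ["".join(c) for c in product("TCAG", repeat=3)]

STANDARD = dict(zip(
    CODONS, "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"))
VERT_MITO = dict(zip(
    CODONS, "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSS**VVVVAAAADDEEGGGG"))

differences = {
    codon: (STANDARD[codon], VERT_MITO[codon])
    for codon in CODONS
    if STANDARD[codon] != VERT_MITO[codon]
}

for codon, (std, mito) in differences.items():
    print(f"{codon}: table 1 = {std}, table 2 = {mito}")
# TGA: table 1 = *, table 2 = W
# ATA: table 1 = I, table 2 = M
# AGA: table 1 = R, table 2 = *
# AGG: table 1 = R, table 2 = *
```

Translating a vertebrate mitochondrial gene with table 1 would therefore show false stops at TGA and miss the real stops at AGA and AGG. Table 11, by contrast, assigns the same amino acids as table 1 and differs only in which codons may serve as starts.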