ORF Finder
Find open reading frames and protein translations in a DNA sequence

Raw sequence or a single FASTA record. Input limit: 100,000,000 characters.

💡 Quick Summary

ORF Finder searches a DNA sequence for open reading frames and returns the position of each ORF together with its protein translation. Use it to find potential protein-coding regions in newly sequenced DNA. Supports all six reading frames (three on each strand), all IUPAC degenerate bases, and 17 genetic code tables.

📋 How to Use
  1. Paste a raw DNA sequence or a single FASTA record into the textarea. Input limit: 100,000,000 characters.
  2. Choose which codons can start an ORF: any codon (open search), atg only, or atg/gtg/ctg/ttg.
  3. Select the reading frame (1, 2, 3, or all three) and strand (direct or reverse complement).
  4. Enter the minimum ORF length in codons (default: 30). Only ORFs equal to or longer than this threshold are reported.
  5. Select the genetic code table appropriate for your organism.
  6. Click Run. Each ORF is reported in FASTA format: first the DNA sequence with its coordinates, then the protein translation.
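
The steps above can be sketched as a single-frame scan in Python. This is a minimal illustration, not the tool's actual code: the function name `find_orfs`, its parameters, and the hard-coded standard-code stop codons are assumptions made for the example. It opens an ORF at the first allowed start codon, closes it at the next stop codon, and keeps it only if the codon count (stop included) meets the minimum.

```python
# Assumption: standard genetic code stop codons; other tables use different sets.
STOP_CODONS = {"taa", "tag", "tga"}

def find_orfs(seq, frame=1, starts=("atg",), min_codons=30):
    """Return (start, end, is_partial) tuples with 1-based inclusive coordinates."""
    seq = seq.lower()
    orfs = []
    orf_start = None          # 0-based index of the first base of the open ORF
    i = frame - 1             # frame 1 -> index 0, frame 2 -> index 1, ...
    while i + 3 <= len(seq):
        codon = seq[i:i + 3]
        if orf_start is None and codon in starts:
            orf_start = i
        if orf_start is not None and codon in STOP_CODONS:
            n_codons = (i + 3 - orf_start) // 3   # length counts the stop codon
            if n_codons >= min_codons:
                orfs.append((orf_start + 1, i + 3, False))
            orf_start = None
        i += 3
    # An ORF still open at the end of the sequence is kept as a partial ORF.
    if orf_start is not None:
        n_codons = (i - orf_start) // 3
        if n_codons >= min_codons:
            orfs.append((orf_start + 1, i, True))
    return orfs

print(find_orfs("atgaaatag", min_codons=3))   # [(1, 9, False)]
```

To search the reverse strand with this sketch, pass the reverse complement of the sequence; the coordinates returned then refer to that reverse-complement sequence.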
🧮 Formulas & Logic
Minimum length
Number of codons (including the stop codon) that an ORF must reach to be reported
ORF coordinates
1-based positions in the input sequence (or in the reverse-complement sequence for reverse-strand ORFs)
Reading frame 1
Starts at base 1 (index 0) — first codon is bases 1–3
Reading frame 2
Starts at base 2 (index 1) — first codon is bases 2–4
Reading frame 3
Starts at base 3 (index 2) — first codon is bases 3–5
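
The frame offsets above can be illustrated with a short Python sketch. The helper name `codons_in_frame` is hypothetical and not part of the tool; it only shows how a 1-based frame number maps to a 0-based start index and which complete codons follow from it.

```python
def codons_in_frame(seq, frame):
    """Split seq into complete codons for reading frame 1, 2, or 3."""
    if frame not in (1, 2, 3):
        raise ValueError("frame must be 1, 2, or 3")
    offset = frame - 1  # frame 1 starts at index 0, frame 2 at index 1, frame 3 at index 2
    # Stop at len(seq) - 2 so that an incomplete trailing codon is dropped.
    return [seq[i:i + 3] for i in range(offset, len(seq) - 2, 3)]

print(codons_in_frame("atggccta", 1))  # ['atg', 'gcc']
print(codons_in_frame("atggccta", 2))  # ['tgg', 'cct']
print(codons_in_frame("atggccta", 3))  # ['ggc', 'cta']
```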
📊 Result Interpretation
ORF header line

"ORF number N in reading frame RF on the STRAND strand extends from base START to base END." Coordinates are 1-based.

DNA sequence block

The nucleotide sequence of the ORF, 60 bases per line.

Translation block

The amino acid sequence of the ORF. A stop codon is shown as *, and unknown codons are shown as X.

Reverse strand

Coordinates refer to positions in the reverse-complement sequence, not the original.
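
The output conventions described above (60 bases per line, * for a stop codon, X for an unknown codon) can be sketched in Python as follows. The names `translate` and `wrap` are hypothetical, and the sketch assumes the standard genetic code only. As a simplification, any codon containing a degenerate base is translated as X here; the tool itself may resolve some degenerate codons.

```python
BASES = "tcag"
# Standard genetic code (NCBI table 1), codons ordered t, c, a, g at each position.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

def translate(dna):
    """Translate complete codons; codons not in the table become X."""
    dna = dna.lower()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )

def wrap(seq, width=60):
    """Break a sequence into lines of at most `width` characters."""
    return "\n".join(seq[i:i + width] for i in range(0, len(seq), width))

print(translate("atgaaatag"))   # MK*
print(translate("atgnnntag"))   # MX*
```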

🔬 Applications
  • Finding protein-coding regions in newly sequenced genomic or cDNA sequences
  • Checking all six reading frames of a PCR product for unexpected open reading frames
  • Identifying the longest ORF in a sequence for expression vector cloning
  • Translating an annotated CDS to verify its protein sequence
⚠️ Common Mistakes & Warnings
Only the first sequence in multi-FASTA input is processed

The original SMS ORF Finder accepts a single sequence. Paste one FASTA record at a time.

"Any codon" start finds overlapping ORFs

With start = "any codon", every codon can begin an ORF, so overlapping ORFs at the same stop codon will each be reported. Use "atg" for conventional gene-finding.

Minimum length includes the stop codon

A minimum of 30 codons means the ORF must span at least 30 codons, counting the stop codon. A complete 30-codon ORF therefore encodes a 29-residue protein followed by the stop.

❓ Frequently Asked Questions

What does "any codon" mean as start codon?
With "any codon", the tool does not require a traditional start codon — it begins an ORF at the first codon in each reading frame window and extends until a stop codon is hit. This finds all possible ORF segments including those that may lack an upstream ATG in the input window.
How are reverse-strand ORFs reported?
The input sequence is first reverse-complemented, and ORF coordinates then refer to positions in that reverse-complement sequence. To convert to original-strand coordinates for a sequence of length N, an ORF reported from START to END corresponds to original positions N - END + 1 through N - START + 1.
Why might the tool find ORFs without a stop codon?
If an ORF reaches the end of the sequence before encountering a stop codon and its length meets the minimum threshold, it is still reported as a partial ORF. This is useful when the sequence is a fragment of a longer coding region.
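
For the reverse-strand question above, the coordinate conversion can be checked with a small Python sketch. The function names `revcomp_to_original` and `reverse_complement` are hypothetical, and the complement table covers only the four unambiguous bases for brevity.

```python
COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")

def reverse_complement(seq):
    """Reverse-complement a sequence of unambiguous bases."""
    return seq.translate(COMPLEMENT)[::-1]

def revcomp_to_original(start, end, seq_length):
    """Map 1-based inclusive reverse-complement coordinates to the original strand."""
    return seq_length - end + 1, seq_length - start + 1

seq = "aaatttgggccc"
rc = reverse_complement(seq)                       # gggcccaaattt
start, end = revcomp_to_original(2, 4, len(seq))   # (9, 11)
# The original-strand segment, reverse-complemented, matches the reported region.
assert reverse_complement(seq[start - 1:end]) == rc[2 - 1:4]
```

The converted pair is ordered so that the first value is the smaller original-strand position; the ORF itself still reads in the opposite direction on that strand.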