DNA Sequence

Raw sequence or a single FASTA record. Input limit: 100,000,000 characters.

Parameters

ORFs begin with

Reading frame

Strand

Min. ORF length (codons)

Genetic code

💡 Quick Summary

ORF Finder searches a DNA sequence for open reading frames (ORFs) and returns the position of each ORF together with its protein translation. Use it to find potential protein-coding regions in newly sequenced DNA. It supports all six reading frames (three on each strand), all IUPAC degenerate bases, and 17 genetic code tables.

📋 How to Use

Paste a raw DNA sequence or a single FASTA record into the textarea. Input limit: 100,000,000 characters.
Choose which codons can start an ORF: any codon (open search), atg only, or atg/gtg/ctg/ttg.
Select the reading frame (1, 2, 3, or all three) and strand (direct or reverse complement).
Enter the minimum ORF length in codons (default: 30). Only ORFs equal to or longer than this threshold are reported.
Select the genetic code table appropriate for your organism.
Click Run. Each ORF is reported in FASTA format: first the DNA sequence with its coordinates, then the protein translation.

🧮 Formulas & Logic

Minimum length

Number of codons (including the stop codon) that an ORF must reach to be reported

ORF coordinates

1-based positions in the input sequence (or in the reverse complement for reverse strand ORFs)

Reading frame 1

Starts at base 1 (index 0) — first codon is bases 1–3

Reading frame 2

Starts at base 2 (index 1) — first codon is bases 2–4

Reading frame 3

Starts at base 3 (index 2) — first codon is bases 3–5
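
The frame offsets above can be illustrated with a minimal Python sketch. The function names `codons_in_frame` and `codon_start_base` are hypothetical and are not part of the tool; the sketch only shows how a frame number maps to a 0-based index and a 1-based base position.

```python
def codons_in_frame(seq: str, frame: int) -> list[str]:
    """Split seq into complete codons for reading frame 1, 2, or 3."""
    if frame not in (1, 2, 3):
        raise ValueError("frame must be 1, 2, or 3")
    offset = frame - 1  # frame 1 -> index 0, frame 2 -> index 1, frame 3 -> index 2
    seq = seq.upper()
    # Stop before any trailing partial codon (fewer than 3 bases left).
    return [seq[i:i + 3] for i in range(offset, len(seq) - 2, 3)]


def codon_start_base(frame: int, codon_index: int) -> int:
    """1-based position of the first base of the codon at 0-based codon_index."""
    return (frame - 1) + 3 * codon_index + 1


print(codons_in_frame("ATGAAATAGC", 1))  # frame 1 begins at base 1
print(codons_in_frame("ATGAAATAGC", 2))  # frame 2 begins at base 2
print(codon_start_base(3, 0))            # first codon of frame 3 starts at base 3
```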

📊 Result Interpretation

ORF header line

"ORF number N in reading frame RF on the STRAND strand extends from base START to base END." Coordinates are 1-based.

DNA sequence block

The nucleotide sequence of the ORF, 60 bases per line.

Translation block

The amino acid sequence of the ORF. Stop codon is shown as *. Unknown codons become X.

Reverse strand

Coordinates refer to positions in the reverse-complement sequence, not the original.
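
The output conventions above (60 bases per line, `*` for a stop codon, `X` for an unknown codon) can be sketched in Python as follows. This is an illustration under stated assumptions, not the tool's implementation: it uses only the standard genetic code, the names `translate` and `wrap_sequence` are hypothetical, and any codon containing a degenerate base is mapped to `X`, whereas the tool itself may handle IUPAC degenerate bases differently.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
STANDARD_CODE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(orf_dna: str) -> str:
    """Translate codon by codon; stop codons give '*', unknown codons give 'X'."""
    dna = orf_dna.upper()
    return "".join(
        STANDARD_CODE.get(dna[i:i + 3], "X")  # assumption: anything not in the table is X
        for i in range(0, len(dna) - 2, 3)
    )


def wrap_sequence(seq: str, width: int = 60) -> str:
    """Break a sequence into lines of at most `width` characters."""
    return "\n".join(seq[i:i + width] for i in range(0, len(seq), width))


print(translate("ATGAAATAG"))
print(wrap_sequence("ATG" + "AAA" * 25 + "TAG"))
```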

🔬 Applications

Finding protein-coding regions in newly sequenced genomic or cDNA sequences
Checking all six reading frames of a PCR product for unexpected open reading frames
Identifying the longest ORF in a sequence for expression vector cloning
Translating an annotated CDS to verify its protein sequence

⚠️ Common Mistakes & Warnings

Only the first sequence in multi-FASTA input is processed

The original Sequence Manipulation Suite (SMS) ORF Finder accepts a single sequence, so any records after the first are ignored. Paste one FASTA record at a time.

"Any codon" start finds overlapping ORFs

With start = "any codon", every codon can begin an ORF, so overlapping ORFs at the same stop codon will each be reported. Use "atg" for conventional gene-finding.

Minimum length includes the stop codon

A minimum of 30 codons means the ORF must span at least 30 codons counting the stop. A 30-codon ORF encodes a 29-residue protein plus stop.
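
The arithmetic behind the minimum-length rule can be checked with a short Python sketch. The function names are hypothetical, and the sketch assumes the ORF length is given in bases and is a whole number of codons.

```python
def meets_minimum(orf_bases: int, min_codons: int = 30) -> bool:
    """True if the ORF, counted in codons including any stop codon, reaches the minimum."""
    return orf_bases // 3 >= min_codons


def protein_residues(orf_codons: int, has_stop: bool = True) -> int:
    """Number of amino acids encoded; the stop codon contributes no residue."""
    return orf_codons - 1 if has_stop else orf_codons


print(meets_minimum(90))     # 90 bases = 30 codons, reaches the default minimum
print(meets_minimum(87))     # 87 bases = 29 codons, too short
print(protein_residues(30))  # 30 codons including the stop
```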

❓ Frequently Asked Questions

What does "any codon" mean as start codon?

With "any codon", the tool does not require a traditional start codon — it begins an ORF at the first codon in each reading frame window and extends until a stop codon is hit. This finds all possible ORF segments including those that may lack an upstream ATG in the input window.

How are reverse-strand ORFs reported?

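The input sequence is first reverse-complemented, and ORF coordinates then refer to positions in that reverse-complement sequence. To convert a 1-based position p to original-strand coordinates, compute L - p + 1, where L is the total sequence length. An ORF that extends from START to END on the reverse complement therefore covers bases L - END + 1 to L - START + 1 of the original sequence.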
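
For 1-based inclusive coordinates, the mapping from the reverse complement back to the original strand can be sketched in Python. The names `reverse_complement` and `to_original_coords` are hypothetical helpers written for this example, not functions exposed by the tool.

```python
# IUPAC complement pairs: A-T, C-G, R-Y, K-M, B-V, D-H; S, W, and N pair with themselves.
COMPLEMENT = str.maketrans("ACGTRYKMSWBDHVN", "TGCAYRMKSWVHDBN")


def reverse_complement(seq: str) -> str:
    """Complement every base, then reverse the sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def to_original_coords(start_rc: int, end_rc: int, seq_len: int) -> tuple[int, int]:
    """Map a 1-based inclusive span on the reverse complement to the original strand."""
    return seq_len - end_rc + 1, seq_len - start_rc + 1


seq = "AACCGGTTAC"
rc = reverse_complement(seq)
start, end = to_original_coords(2, 4, len(seq))
# The original-strand span, reverse-complemented, matches bases 2 to 4 of rc.
print(start, end, reverse_complement(seq[start - 1:end]) == rc[1:4])
```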

Why might the tool find ORFs without a stop codon?

If an ORF reaches the end of the sequence before encountering a stop codon and its length meets the minimum threshold, it is still reported as a partial ORF. This is useful when the sequence is a fragment of a longer coding region.
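
The scanning behavior described in these answers can be sketched in Python for a single reading frame. This is a simplified illustration, not the tool's code: the name `find_orfs` is hypothetical, only the standard stop codons are used, and the sketch opens an ORF at the first qualifying codon after each stop, so it does not report nested starts that share a stop codon.

```python
STOPS = {"TAA", "TAG", "TGA"}  # assumption: standard genetic code only


def find_orfs(seq, frame=1, starts=("ATG",), min_codons=30):
    """Return (start, end, has_stop) tuples with 1-based inclusive coordinates.

    starts=None stands for the "any codon" option.
    """
    seq = seq.upper()
    orfs = []
    orf_start = None  # 0-based index of the first base of the currently open ORF
    end = frame - 1   # 0-based exclusive end of the codons read so far
    for i in range(frame - 1, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        end = i + 3
        if orf_start is None and (starts is None or codon in starts):
            orf_start = i
        if orf_start is not None and codon in STOPS:
            # The length counts the stop codon itself.
            if (end - orf_start) // 3 >= min_codons:
                orfs.append((orf_start + 1, end, True))
            orf_start = None
    # An ORF still open at the end of the sequence is kept as a partial ORF.
    if orf_start is not None and (end - orf_start) // 3 >= min_codons:
        orfs.append((orf_start + 1, end, False))
    return orfs


demo = "CCCATGAAATAGATGCCC"
print(find_orfs(demo, min_codons=1))               # ATG start, includes a partial ORF
print(find_orfs(demo, starts=None, min_codons=1))  # "any codon" start
```

The third element of each tuple flags whether a stop codon was found, which makes partial ORFs at the end of a fragment easy to tell apart from complete ones.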