Pairwise Align DNA
Global alignment of two DNA sequences using a configurable identity matrix

Raw sequence or FASTA format. Input limit: 20,000 characters.

Use the following parameters to specify how alignments are scored.

💡 Quick Summary

Pairwise Align DNA accepts two DNA sequences and determines the optimal global alignment. Scoring is based on a simple identity matrix with configurable match and mismatch values. Use Pairwise Align DNA to look for conserved sequence regions between two DNA sequences.

📋 How to Use
  1. Paste the first DNA sequence (raw or FASTA) into Sequence One. Input limit is 20,000 characters.
  2. Paste the second DNA sequence (raw or FASTA) into Sequence Two. Input limit is 20,000 characters.
  3. Set the Match value (reward for identical bases) and Mismatch value (penalty for non-identical bases).
  4. Set the Value for gaps preceding a sequence, Value for internal gaps, and Value for gaps following a sequence.
  5. Click Submit. The aligned sequences are shown in FASTA format with the alignment score.
  6. Click Load Example to align the two sample sequences from the original Sequence Manipulation Suite (SMS).
  7. Use the Copy button to copy the output to your clipboard.
🧮 Formulas & Logic
Alignment algorithm
Hirschberg divide-and-conquer (linear space, O(n) memory), with Needleman–Wunsch dynamic programming for the sub-problems
Scoring matrix
Identity matrix: the Match value is added for identical bases and the Mismatch value (normally negative) is added for non-identical bases
Gap penalty
The selected value is added directly to the running score: a positive value acts as a reward, a negative value as a penalty
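
The scoring rules above can be sketched in a few lines of Python. This is an illustrative full-matrix Needleman–Wunsch, not the tool's own code: the function name `global_align`, the parameter names, and the rule that a gap counts as "preceding" or "following" a sequence when that sequence has not yet started or is already fully consumed are all assumptions made for the example. The begin/end defaults of 0 are placeholders, not documented defaults.

```python
def global_align(a, b, match=2, mismatch=-1, begin_gap=0, gap=-2, end_gap=0):
    """Return (aligned_a, aligned_b, score). All gap values are ADDED to the score."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * begin_gap  # gaps preceding b
    for j in range(1, m + 1):
        score[0][j] = j * begin_gap  # gaps preceding a
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            up_gap = end_gap if j == m else gap    # gap in b; b used up when j == m
            left_gap = end_gap if i == n else gap  # gap in a; a used up when i == n
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + up_gap,
                              score[i][j - 1] + left_gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            up_gap = end_gap if j == m else gap
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1]); out_b.append(b[j - 1])
                i -= 1; j -= 1
            elif score[i][j] == score[i - 1][j] + up_gap:
                out_a.append(a[i - 1]); out_b.append("-")
                i -= 1
            else:
                out_a.append("-"); out_b.append(b[j - 1])
                j -= 1
        elif i > 0:  # only a remains: gaps preceding b
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
        else:        # only b remains: gaps preceding a
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


# One internal gap: three matches (+6) plus one internal gap (-2).
print(global_align("ACGT", "ACT", begin_gap=-2, gap=-2, end_gap=-2))
# ('ACGT', 'AC-T', 4)
```

The full matrix uses O(nm) memory, so the sketch is only suitable for short inputs; it is meant to show how the three gap values enter the score, not how the linear-space variant is organised.
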
📊 Result Interpretation
Aligned output

Gaps in the alignment are represented as "-"

Alignment score

Sum of match/mismatch scores plus gap values; higher is better

Seq 1 length

Number of valid DNA bases in the first input sequence

Seq 2 length

Number of valid DNA bases in the second input sequence
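
As a cross-check on the values described above, the score and the two lengths can be recomputed from the aligned rows. The following is a minimal sketch under stated assumptions: `score_alignment` is a hypothetical helper, and it assumes a gap is "preceding" when no base of its own row appears before it and "following" when none appears after it. The begin/end defaults of 0 are placeholders.

```python
def score_alignment(row1, row2, match=2, mismatch=-1,
                    begin_gap=0, gap=-2, end_gap=0):
    """Sum match/mismatch scores and gap values over the alignment columns."""
    if len(row1) != len(row2):
        raise ValueError("aligned rows must have equal length")

    def gap_value(row, k):
        # Classify the gap at column k by where this row's bases lie.
        if not row[:k].strip("-"):
            return begin_gap  # no base of this row yet
        if not row[k:].strip("-"):
            return end_gap    # no base of this row left
        return gap            # bases on both sides: internal gap

    total = 0
    for k, (x, y) in enumerate(zip(row1, row2)):
        if x == "-":
            total += gap_value(row1, k)
        elif y == "-":
            total += gap_value(row2, k)
        else:
            total += match if x == y else mismatch
    return total


row1, row2 = "ACGT", "AC-T"
print(score_alignment(row1, row2))                          # 3 matches + 1 internal gap = 4
print(len(row1.replace("-", "")), len(row2.replace("-", "")))  # sequence lengths: 4 3
```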

🔬 Applications
  • Comparing homologous gene sequences from different species
  • Identifying conserved regulatory elements between two genomic regions
  • Aligning two alleles or splice variants of the same gene
  • Quick global comparison before running a BLAST search
⚠️ Common Mistakes & Warnings
Non-DNA characters are removed

Characters that are not valid IUPAC DNA/RNA symbols are stripped before alignment. U is accepted and treated as T.
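
The cleaning step described above might look roughly like this. It is a sketch, not the tool's code: the helper name `clean_dna`, the exact symbol set, the upper-casing, and the handling of a single leading FASTA header line are assumptions.

```python
import re

# IUPAC nucleotide codes, plus U because it is accepted and mapped to T.
IUPAC_SYMBOLS = "ACGTURYSWKMBDHVN"

def clean_dna(text: str) -> str:
    """Drop a FASTA header if present, strip non-IUPAC characters, map U to T."""
    lines = text.strip().splitlines()
    if lines and lines[0].startswith(">"):
        lines = lines[1:]                  # FASTA title line is not sequence
    seq = "".join(lines).upper()           # assumption: case is normalised
    seq = re.sub(f"[^{IUPAC_SYMBOLS}]", "", seq)  # digits, spaces, "-", etc.
    return seq.replace("U", "T")


print(clean_dna(">seq1\nacgu nn-12\nGGX"))  # ACGTNNGG
```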

Long sequences may be slow

The Hirschberg algorithm takes O(nm) time for sequences of lengths n and m, although it needs only O(n) space. Sequences near the 20,000 character limit may take several seconds in the browser.
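
To see why memory stays linear while time does not, consider the building block that Hirschberg's method relies on: computing only the last row of the Needleman–Wunsch matrix while keeping two rows in memory. The sketch below is illustrative only. `nw_last_row` is a hypothetical name, and for brevity it uses a single gap value rather than separate begin, internal, and end values.

```python
def nw_last_row(a, b, match=2, mismatch=-1, gap=-2):
    """Last row of the global-alignment score matrix, using O(len(b)) memory."""
    prev = [j * gap for j in range(len(b) + 1)]   # row for the empty prefix of a
    for i in range(1, len(a) + 1):
        curr = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(prev[j - 1] + sub,  # align a[i-1] with b[j-1]
                          prev[j] + gap,      # gap in b
                          curr[j - 1] + gap)  # gap in a
        prev = curr                           # older rows are discarded
    return prev


print(nw_last_row("AC", "AC"))        # [-4, 0, 4]
print(nw_last_row("ACGT", "ACT")[-1])  # 4
```

Every cell is still visited once, which is where the O(nm) running time comes from. Hirschberg's method runs this pass forwards on one half of the first sequence and backwards on the other half, picks the split point where the two rows sum to the best score, and recurses on the two halves.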

❓ Frequently Asked Questions

What match and mismatch values should I use?
The SMS defaults are match = +2 and mismatch = -1. These work well for general DNA comparisons. Raising the match value relative to the size of the mismatch penalty rewards identity more strongly.
What do the gap values mean?
Gap values are added directly to the score, so a negative value is a penalty and a positive value is a reward; the default internal gap value is -2. Setting the begin/end gap values to 0 lets one sequence extend beyond the other at no cost.
What does "-" mean in the output?
A "-" character represents a gap introduced into one sequence to maximise the global alignment score.