Pairwise Align Protein
Global alignment of two protein sequences using BLOSUM or PAM matrices

Raw sequence or FASTA format. Input limit: 20,000 characters.

Use the following parameters to specify how alignments are scored.

💡 Quick Summary

Pairwise Align Protein accepts two protein sequences and determines the optimal global alignment. Choose from BLOSUM45, BLOSUM62, BLOSUM80, PAM30, or PAM70 substitution matrices. Use this tool to look for conserved sequence regions between two proteins.

📋 How to Use
  1. Paste the first protein sequence (raw or FASTA) into Sequence One. Input limit is 20,000 characters.
  2. Paste the second protein sequence (raw or FASTA) into Sequence Two. Input limit is 20,000 characters.
  3. Select a Scoring Matrix. BLOSUM62 is the default and works well for most comparisons.
  4. Set the Value for gaps preceding a sequence, Value for internal gaps, and Value for gaps following a sequence.
  5. Click Submit. The aligned sequences are shown in FASTA format with the alignment score.
  6. Click Load Example to align the human and Xenopus p53 proteins used as the example in the original Sequence Manipulation Suite (SMS).
  7. Use the Copy button to copy the output to your clipboard.
🧮 Formulas & Logic
Alignment algorithm
Hirschberg divide-and-conquer (linear space, O(n) memory) with Needleman–Wunsch DP for sub-problems
Scoring matrices
BLOSUM45, BLOSUM62, BLOSUM80 (Henikoff & Henikoff, 1992); PAM30, PAM70 (Dayhoff et al., 1978)
Gap penalty
The selected value is added directly to the running score: a positive value rewards a gap, a negative value penalises it
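The scoring logic above can be sketched as a plain Needleman–Wunsch fill and traceback with separate values for leading, internal, and trailing gaps. This is a minimal illustration, not the tool's own code: `toy_score` is a hypothetical stand-in for a BLOSUM/PAM lookup, the function and parameter names are invented here, and the rule for deciding which gap value applies (based on how much of the gapped sequence has been consumed) is an assumption. It also stores the full matrix, so it uses O(nm) memory rather than the linear space of Hirschberg.

```python
def toy_score(a: str, b: str) -> int:
    """Hypothetical stand-in for a BLOSUM/PAM lookup: +2 match, -1 mismatch."""
    return 2 if a == b else -1


def global_align(s1, s2, score=toy_score, lead_gap=0, internal_gap=-2, trail_gap=0):
    """Return (score, aligned_s1, aligned_s2) for a global alignment."""
    n, m = len(s1), len(s2)

    def gap_value(consumed, total):
        # Value of a gap placed in a sequence of which `consumed` of its
        # `total` residues have already been aligned (assumed rule).
        if consumed == 0:
            return lead_gap
        if consumed == total:
            return trail_gap
        return internal_gap

    # F[i][j] = best score for aligning s1[:i] with s2[:j]
    F = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        F[i][0] = F[i - 1][0] + gap_value(0, m)  # leading gaps in s2
    for j in range(1, m + 1):
        F[0][j] = F[0][j - 1] + gap_value(0, n)  # leading gaps in s1
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            F[i][j] = max(
                F[i - 1][j - 1] + score(s1[i - 1], s2[j - 1]),  # residue pair
                F[i - 1][j] + gap_value(j, m),                  # gap in s2
                F[i][j - 1] + gap_value(i, n),                  # gap in s1
            )

    # Traceback: re-check which move produced each cell, from the end.
    out1, out2 = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and F[i][j] == F[i - 1][j - 1] + score(s1[i - 1], s2[j - 1]):
            out1.append(s1[i - 1])
            out2.append(s2[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and F[i][j] == F[i - 1][j] + gap_value(j, m):
            out1.append(s1[i - 1])
            out2.append("-")
            i -= 1
        else:
            out1.append("-")
            out2.append(s2[j - 1])
            j -= 1
    return F[n][m], "".join(reversed(out1)), "".join(reversed(out2))
```

With the default gap values, `global_align("ACDE", "CD")` returns `(4, "ACDE", "-CD-")`: the two end gaps are free, so only the two matches contribute to the score.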
📊 Result Interpretation
Aligned output

Gaps in the alignment are represented as "-"

Alignment score

Sum of substitution scores plus gap values; higher is better

Seq 1 length

Number of amino acid residues in the first input sequence

Seq 2 length

Number of amino acid residues in the second input sequence

Matrix used

The substitution matrix selected for scoring
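
To make the alignment score concrete, the sketch below recomputes it from a pair of aligned rows by summing one substitution score per residue pair and one gap value per gap column. It is an illustration under assumptions: `toy_score` is a hypothetical replacement for the selected matrix, and a gap is treated as leading or trailing when it falls before the first or after the last residue of the gapped row.

```python
def toy_score(a: str, b: str) -> int:
    """Hypothetical stand-in for a BLOSUM/PAM lookup: +2 match, -1 mismatch."""
    return 2 if a == b else -1


def alignment_score(row1, row2, score=toy_score,
                    lead_gap=0, internal_gap=-2, trail_gap=0):
    """Sum substitution scores and gap values over two aligned rows."""
    if len(row1) != len(row2):
        raise ValueError("aligned rows must have equal length")

    def residue_bounds(row):
        # Column indices of the first and last residue in the row.
        cols = [k for k, c in enumerate(row) if c != "-"]
        return (cols[0], cols[-1]) if cols else (len(row), len(row))

    def gap_value(k, bounds):
        first, last = bounds
        if k < first:
            return lead_gap      # gap before the row's first residue
        if k > last:
            return trail_gap     # gap after the row's last residue
        return internal_gap

    b1, b2 = residue_bounds(row1), residue_bounds(row2)
    total = 0
    for k, (a, b) in enumerate(zip(row1, row2)):
        if a == "-" and b == "-":
            continue             # empty column: contributes nothing
        if a == "-":
            total += gap_value(k, b1)
        elif b == "-":
            total += gap_value(k, b2)
        else:
            total += score(a, b)
    return total
```

For example, `alignment_score("ACDE", "-CD-")` gives 4 (two matches, two free end gaps), while `alignment_score("ACDE", "A--E")` gives 0 because the two internal gaps cancel the two matches.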

🔬 Applications
  • Comparing orthologous proteins across different species
  • Identifying conserved functional domains between two protein sequences
  • Assessing overall sequence similarity before a BLAST search
  • Aligning two protein variants (e.g. wild-type vs. mutant) to locate changes
⚠️ Common Mistakes & Warnings
Non-amino acid characters are removed

Characters that are not valid IUPAC amino acid symbols are stripped before alignment.
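
A rough sketch of this kind of clean-up is shown below. The exact symbol set the tool accepts is not stated here, so the set used (20 standard residues plus the ambiguity codes B, Z, and X), the upper-casing, and the handling of a single FASTA header line are all assumptions; `clean_protein` is a hypothetical name.

```python
# Assumed symbol set: 20 standard residues plus ambiguity codes B, Z, X.
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWYBZX")


def clean_protein(text: str) -> str:
    """Drop a leading FASTA header and any character not in VALID_RESIDUES."""
    lines = text.strip().splitlines()
    if lines and lines[0].startswith(">"):
        lines = lines[1:]  # FASTA title line is not sequence data
    joined = "".join(lines).upper()
    return "".join(c for c in joined if c in VALID_RESIDUES)
```

With this sketch, `clean_protein(">demo\nMEEP QSD*\n12pl")` yields `"MEEPQSDPL"`: the header, the space, the asterisk, and the digits are dropped. Because stripping is silent, a sequence pasted with the wrong alphabet (for example DNA) would still produce output, so it is worth checking the reported sequence lengths.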

Long sequences may be slow

The Hirschberg algorithm takes O(nm) time but only O(n) space, where n and m are the lengths of the two sequences. Sequences near the 20,000-character limit may take several seconds in the browser.
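
The linear-space property comes from keeping only one previous row of the scoring table at a time. The sketch below shows that building block: it returns the last row of Needleman–Wunsch scores while still visiting every cell, which is why the time stays O(nm). Hirschberg's method runs a pass like this forward on one half of the first sequence and backward on the other half to choose a split point, then recurses. For brevity the sketch uses a single gap value and a hypothetical `toy_score` function rather than the tool's matrices and three gap settings.

```python
def toy_score(a: str, b: str) -> int:
    """Hypothetical stand-in for a BLOSUM/PAM lookup: +2 match, -1 mismatch."""
    return 2 if a == b else -1


def nw_last_row(s1, s2, score=toy_score, gap=-2):
    """Return the final row of global alignment scores using O(len(s2)) memory."""
    prev = [j * gap for j in range(len(s2) + 1)]  # row for an empty s1 prefix
    for i, a in enumerate(s1, start=1):
        curr = [i * gap]                          # s1[:i] against empty s2
        for j, b in enumerate(s2, start=1):
            curr.append(max(
                prev[j - 1] + score(a, b),        # align a with b
                prev[j] + gap,                    # gap in s2
                curr[j - 1] + gap,                # gap in s1
            ))
        prev = curr                               # discard the older row
    return prev
```

The last element of the returned row is the global alignment score. For `nw_last_row("ACDE", "CD")` the row is `[-8, -4, 0]`, so the score is 0 when end gaps cost the same as internal ones.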

Matrix choice affects results

BLOSUM62 is best for sequences with ~40–60% identity. Use BLOSUM80, PAM70, or PAM30 for highly similar sequences (low PAM numbers model short evolutionary distances), and BLOSUM45 for more distantly related ones.

❓ Frequently Asked Questions

Which scoring matrix should I use?
BLOSUM62 is the standard choice for most pairwise alignments. Use BLOSUM80, PAM70, or PAM30 for closely related sequences (>60% identity), or BLOSUM45 for distantly related sequences.
What do the gap values mean?
Each gap value is added directly to the alignment score, so a negative value penalises gaps and a value of 0 makes them free. The SMS defaults are: gaps preceding = 0 (free), internal gaps = -2 (penalty of 2 per gap), gaps following = 0 (free).
What does "-" mean in the output?
A "-" character represents a gap introduced into one sequence to maximise the global alignment score.