Color Align Conservation
Colour a sequence alignment by conservation — identical, similar, and different residues

FASTA or GDE format. All sequences must be pre-aligned (equal length including gaps). Input limit: 200,000,000 characters.

Comma-separated groups of similar amino acids (e.g. GAVLI, FYW, CM, ST, KRH, DENQ, P). Leave empty for DNA alignments.

Comma-separated starting positions to offset the per-sequence residue counter. Defaults to 0 for all sequences when left empty.

💡 Quick Summary

Color Align Conservation accepts a group of aligned sequences in FASTA or GDE format and colours the alignment by conservation level. Identical residues receive a black background, similar residues a grey background, and all others remain uncoloured. Use it to visually highlight conserved regions in a sequence alignment.

📋 How to Use
  1. Paste pre-aligned sequences in FASTA or GDE format into the textarea. All sequences must be the same length (gaps included). Input limit: 200,000,000 characters.
  2. Set residues per line — controls how many alignment columns are shown per block row (default: 80).
  3. Set the consensus threshold — the minimum percentage of sequences that must share a residue (or belong to the same similarity group) for conservation colouring to be applied (default: 100%).
  4. Choose the colour mode: Background fills identical residues black and similar residues grey; Text colours the characters instead (useful for printing on white paper).
  5. Enter similarity groups as comma-separated letter strings (e.g. GAVLI, FYW, CM, ST, KRH, DENQ, P). Leave empty when colouring a DNA alignment.
  6. Optionally enter starting positions as comma-separated integers to offset the per-sequence position counter (e.g. 0, 200, 0, -1).
  7. Click Run. Use Copy to copy the plain-text alignment.
🧮 Formulas & Logic
Identity check
(Number of sequences with the same residue in a column ÷ total number of sequences) × 100 ≥ threshold (%) → "identical" colour
Similarity check
(Number of sequences whose residue in a column belongs to the same similarity group ÷ total number of sequences) × 100 ≥ threshold (%) → "similar" colour
Position counter
For each sequence, the counter begins at that sequence's starting position and increases by the number of non-gap characters (anything other than - or .) in each displayed block
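The identity and similarity checks above can be sketched in Python as follows. This is a minimal illustration and not the tool's own code: the function names `parse_groups` and `classify_column` are hypothetical, and it assumes that gap characters count towards the total number of sequences but never count as matching residues.

```python
from collections import Counter

GAP_CHARS = "-."  # characters treated as gaps (assumption: same set as the position counter)


def parse_groups(text):
    """Turn 'GAVLI, FYW, CM' into a list of residue sets; empty text gives no groups."""
    return [set(g.strip().upper()) for g in text.split(",") if g.strip()]


def classify_column(column, groups, threshold_pct=100):
    """Label one alignment column as 'identical', 'similar' or 'none'."""
    residues = [c.upper() for c in column]
    total = len(residues)
    non_gap = [r for r in residues if r not in GAP_CHARS]
    if not non_gap:
        return "none"

    # Identity check: share of sequences carrying the most common residue.
    top_count = Counter(non_gap).most_common(1)[0][1]
    if top_count * 100 >= threshold_pct * total:
        return "identical"

    # Similarity check: share of sequences whose residue falls in one group.
    for group in groups:
        in_group = sum(1 for r in non_gap if r in group)
        if in_group * 100 >= threshold_pct * total:
            return "similar"
    return "none"


groups = parse_groups("GAVLI, FYW, CM, ST, KRH, DENQ, P")
print(classify_column("AAAA", groups))       # every sequence has A
print(classify_column("AVLI", groups))       # all residues fall in GAVLI
print(classify_column("AVLK", groups, 75))   # 3 of 4 residues fall in GAVLI
```

Passing an empty list of groups, as for a DNA alignment, leaves only the identity check active.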
📊 Result Interpretation
Black (background mode)

Identical residue: at least the threshold percentage of sequences (all of them at the default 100%) share the exact same character at this column.

Grey (background mode)

Similar residue: at least the threshold percentage of sequences (all of them at the default 100%) have a character belonging to the same physicochemical group.

No highlight

Residue is neither identical nor similar across the alignment at the selected threshold.

Position numbers

Printed at the end of each row. For each sequence, the number is its starting position plus the cumulative count of non-gap residues up to the end of that row.
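As a rough illustration of how the row-end position numbers could be derived, the sketch below walks through the alignment in blocks and keeps one running counter per sequence. The function name `block_end_positions` and its parameters are hypothetical; the tool's own implementation may differ in detail.

```python
GAP_CHARS = "-."  # gap characters that do not advance the counter


def block_end_positions(sequences, start_positions=None, width=80):
    """Return, for each displayed row, the position counter of every sequence."""
    if start_positions is None:
        start_positions = [0] * len(sequences)  # default: all counters start at 0
    counters = list(start_positions)
    length = len(sequences[0]) if sequences else 0

    rows = []
    for start in range(0, length, width):
        for i, seq in enumerate(sequences):
            block = seq[start:start + width]
            # Only non-gap characters advance this sequence's counter.
            counters[i] += sum(1 for c in block if c not in GAP_CHARS)
        rows.append(list(counters))
    return rows


aligned = ["ACD-EF", "A--DEF"]
print(block_end_positions(aligned, width=3))
print(block_end_positions(aligned, start_positions=[0, 200], width=3))
```

With three residues per line, the first call gives counters of 3 and 1 after the first row and 5 and 4 after the second; the second call shifts the second sequence by its starting position of 200.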

🔬 Applications
  • Visualising conserved residues in a multiple protein or DNA sequence alignment
  • Identifying structurally or functionally important positions shared across a protein family
  • Enhancing figures produced by sequence alignment programs for publication
  • Comparing conservation patterns between different alignment regions
  • Designing primers or probes that target highly conserved regions
⚠️ Common Mistakes & Warnings
Sequences must be pre-aligned and equal length

This tool does not perform sequence alignment. All input sequences must already be aligned and must have identical lengths (including gap characters).
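Before pasting, the equal-length requirement can be checked with a few lines of Python. The sketch below handles FASTA input only, and the helper names `parse_fasta` and `check_aligned` are hypothetical; it is meant as a quick sanity check, not a full format parser.

```python
def parse_fasta(text):
    """Parse FASTA text into a list of (title, sequence) tuples."""
    records = []
    title, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if title is not None:
                records.append((title, "".join(chunks)))
            title, chunks = line[1:].strip(), []
        elif title is not None:
            chunks.append(line)  # sequence lines may be wrapped; join them later
    if title is not None:
        records.append((title, "".join(chunks)))
    return records


def check_aligned(records):
    """Return the shared alignment length, or raise if the lengths differ."""
    lengths = {len(seq) for _, seq in records}
    if len(lengths) > 1:
        raise ValueError(f"sequences have different lengths: {sorted(lengths)}")
    return lengths.pop() if lengths else 0


records = parse_fasta(">seq1\nAC-D\nEF\n>seq2\nACGD\nEF\n")
print(check_aligned(records))  # both sequences are 6 columns long, gaps included
```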

Leave similarity groups empty for DNA alignments

The default similarity groups are designed for protein alignments. When colouring a DNA alignment, clear the similarity groups field so only identical nucleotides are highlighted.

Sequence titles are truncated to 20 characters

Only the first 20 characters of each FASTA title are used for display. Ensure the first 20 characters of each title are distinct enough to identify the sequence.
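One way to spot titles that would become indistinguishable is to truncate them yourself and look for repeats. The helper name `truncated_title_clashes` below is hypothetical, and the example titles are made up for illustration.

```python
from collections import Counter


def truncated_title_clashes(titles, width=20):
    """Return the truncated titles that more than one sequence would share."""
    counts = Counter(title[:width] for title in titles)
    return sorted(short for short, n in counts.items() if n > 1)


titles = [
    "Homo_sapiens_cytochrome_b",
    "Homo_sapiens_cytochrome_c",
    "Mus_musculus_cytb",
]
print(truncated_title_clashes(titles))  # the two Homo_sapiens titles collide
```

Moving the distinguishing part of each title (for example an accession or gene name) to the front avoids the clash.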

❓ Frequently Asked Questions

What do the similarity groups represent?
Each group is a set of amino acids that share physicochemical properties. For example, GAVLI groups the small and aliphatic non-polar residues. A column is coloured "similar" if at least the threshold percentage of sequences (all of them at 100%) have residues from the same group, even if the exact residues differ.
What does the consensus threshold do?
At 100% (default), colouring is applied only when every sequence in the alignment agrees. Lower the threshold (e.g. 80%) to highlight columns that are conserved in most but not all sequences. This is useful for large or divergent alignments.
Why does the position counter start at 0?
By default, position numbering starts at 0 for all sequences. To begin at a different position — for example to match the numbering of a reference sequence — enter the starting values in the Starting Positions field, one per sequence separated by commas.
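The Starting Positions field can be read as one integer offset per sequence. The sketch below shows one plausible interpretation; the function name `parse_start_positions` is hypothetical, and padding missing values with 0 (or ignoring extra values) is an assumption rather than documented behaviour.

```python
def parse_start_positions(text, n_sequences):
    """Convert '0, 200, 0, -1' into one integer offset per sequence."""
    values = [int(v) for v in text.split(",") if v.strip()]
    values = values[:n_sequences]  # assumption: extra values are ignored
    # Assumption: sequences without an explicit value start at 0.
    return values + [0] * (n_sequences - len(values))


print(parse_start_positions("0, 200, 0, -1", 4))
print(parse_start_positions("", 3))  # empty field: every sequence starts at 0
```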
Can I colour a DNA alignment?
Yes. Clear the similarity groups field (leave it empty) before running. Columns in which at least the threshold percentage of sequences share the same nucleotide receive the identity colour; no similarity colouring is applied.