Group Protein
Format a protein sequence with group spacing and position numbering

Raw protein sequence or one or more FASTA sequences. Non-letter characters are removed automatically. Input limit: 100,000,000 characters.

💡 Quick Summary

Group Protein adjusts the spacing of protein sequences and adds numbering. Specify the group size (residues per group) and the number of residues per line. The output serves as a convenient annotated reference — the numbering and spacing let you quickly locate specific residues.

📋 How to Use
  1. Paste a raw protein sequence or one or more FASTA sequences into the textarea. Input limit: 100,000,000 characters.
  2. Choose the group size: the number of residues in each space-separated block (3, 5, or 10).
  3. Set residues per line: how many residues to display on each line (30–100; default 80).
  4. Choose the numbering position: Left places the line's start position before each line; Above places a position ruler above each line of sequence; Right places the line's end position after each line.
  5. Click Run. Position numbering always starts at 1 for protein sequences. Use Copy to copy the plain-text output.
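
The steps above can be sketched in a few lines of Python. This is only an illustration of the grouping and left/right numbering described here, not the tool's actual code: the function name `group_lines`, its parameters, and the exact padding of the numbers are assumptions.

```python
def group_lines(seq: str, group_size: int = 10, per_line: int = 80,
                numbering: str = "left") -> list[str]:
    """Format seq into grouped lines with left or right position numbers.

    Hypothetical helper: the real tool's padding and layout may differ.
    """
    width = len(str(len(seq)))  # digits needed for the largest position
    lines = []
    for start in range(0, len(seq), per_line):
        chunk = seq[start:start + per_line]
        # Split the line into space-separated groups of group_size residues.
        groups = " ".join(chunk[i:i + group_size]
                          for i in range(0, len(chunk), group_size))
        if numbering == "left":
            # 1-based position of the first residue on this line.
            lines.append(f"{start + 1:>{width}} {groups}")
        else:
            # 1-based position of the last residue on this line.
            lines.append(f"{groups} {start + len(chunk)}")
    return lines


if __name__ == "__main__":
    demo = "MKTAYIAKQRQISFVKSHFSRQLEE"  # made-up 25-residue example
    print("\n".join(group_lines(demo, group_size=5, per_line=10)))
```

With the 25-residue example, a group size of 5 and 10 residues per line, the left-numbered lines begin at positions 1, 11, and 21.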
🧮 Formulas & Logic
Position (left/right)
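Left shows the 1-based index of the first residue on the line; Right shows the 1-based index of the last residue on the line. With W residues per line and a sequence of N residues, line k starts at (k - 1) × W + 1 and ends at min(k × W, N).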
Ruler (above)
Position numbers placed right-justified at every 10th residue, spaced to match the grouped sequence
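
One way to build such a ruler is to compute the column each residue occupies once the group spaces are added, then write each multiple of 10 so that its last digit sits in that column. The sketch below assumes a single space between groups; the name `ruler_for_line` and its arguments are hypothetical, and the tool's own ruler may be laid out differently.

```python
def ruler_for_line(start: int, length: int, group_size: int = 10) -> str:
    """Ruler for a line holding residues start+1 .. start+length.

    start is the 0-based index of the line's first residue.
    Hypothetical helper, assuming one space between groups.
    """
    # Width of the grouped text: residues plus one space between groups.
    n_cols = length + (length - 1) // group_size if length else 0
    ruler = [" "] * n_cols
    for j in range(length):
        pos = start + j + 1  # 1-based residue position
        if pos % 10 != 0:
            continue
        col = j + j // group_size  # column of residue j in the grouped text
        label = str(pos)
        first = col - len(label) + 1  # right-justify: last digit lands on col
        if first < 0:
            continue  # label would run past the left edge, so skip it
        ruler[first:col + 1] = label
    return "".join(ruler)


if __name__ == "__main__":
    print(ruler_for_line(0, 20))
    print("MKTAYIAKQR QISFVKSHFS")
```

In the usage example the labels 10 and 20 end directly above the 10th and 20th residues of the grouped line.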
📊 Result Interpretation
Group spacing

Spaces separate each block of residues, making it easier to count to a specific position within a line.

Position numbers

Show the cumulative residue number at the start or end of each line (or as a ruler above). Numbering always begins at 1.

🔬 Applications
  • Creating a numbered residue reference for a protein of interest
  • Locating specific residues (active site, mutations, post-translational modification sites) by position
  • Generating formatted protein sequence output for a publication or lab notebook
  • Cross-referencing residue positions between a sequence and a structural annotation
⚠️ Common Mistakes & Warnings
Sequence must use single-letter amino acid codes

Non-letter characters (spaces, digits, dashes) and letters that are not standard single-letter amino acid codes are stripped automatically. The stop symbol (*) is kept.
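
To preview what survives this stripping before pasting a sequence, a rough equivalent can be written as below. It is a sketch under stated assumptions: the accepted alphabet is taken to be the 20 standard residues plus `*`, input is upper-cased first, and a FASTA header line is simply dropped. The tool's exact rules (for example for ambiguity codes such as B, X, or Z, or for multiple FASTA records) may differ.

```python
# Assumed alphabet: the 20 standard single-letter codes plus the stop symbol.
VALID = set("ACDEFGHIKLMNPQRSTVWY*")


def clean_protein(raw: str) -> str:
    """Return raw with FASTA header lines and non-residue characters removed.

    Hypothetical helper for a single record; upper-casing is an assumption.
    """
    body = "".join(line for line in raw.splitlines()
                   if not line.startswith(">"))
    return "".join(ch for ch in body.upper() if ch in VALID)


if __name__ == "__main__":
    print(clean_protein(">demo record\nMKT AY-12\nIAK*"))
```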

❓ Frequently Asked Questions

Which group size should I use?
Use 10 (default) for easy positional counting: each full group is exactly 10 residues, so counting groups gives the position in tens. Use 5 for a finer view that still aids counting. Use 3 to highlight tripeptide motifs.
Can I use three-letter amino acid codes?
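No. The input must be in single-letter amino acid code (e.g. M, A, K, …). The tool does not recognize three-letter codes (Met, Ala, Lys, …): each letter is treated as its own single-letter code, so Met would be read as the three residues M, E, and T, and any letter that is not a standard code is stripped. The result is not the sequence you intended.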
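
If a sequence is only available in three-letter code, it can be converted before pasting. The sketch below uses the standard three-letter to one-letter mapping for the 20 common amino acids; the function name `three_to_one` is hypothetical, and it assumes the input consists of three-letter codes optionally separated by dashes or spaces.

```python
import re

# Standard three-letter to one-letter codes for the 20 common amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def three_to_one(text: str) -> str:
    """Convert e.g. 'Met-Ala-Lys' or 'MetAlaLys' to 'MAK'.

    Hypothetical helper; raises ValueError on an unrecognized code.
    """
    out = []
    for code in re.findall(r"[A-Za-z]{3}", text):
        key = code.capitalize()
        if key not in THREE_TO_ONE:
            raise ValueError(f"Unrecognized three-letter code: {code}")
        out.append(THREE_TO_ONE[key])
    return "".join(out)


if __name__ == "__main__":
    print(three_to_one("Met-Ala-Lys"))
```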
Why does numbering always start at 1?
Protein residue numbering conventionally starts at 1 at the N-terminal residue (typically the initiator methionine). Unlike Group DNA, Group Protein has no custom start number option. If you need a different starting number, one workaround is to paste the protein sequence into the Group DNA tool, but keep in mind that Group DNA is designed for nucleotide sequences.