💡 Quick Summary
Protein Stats returns the count and percentage of each amino acid residue in the sequence you enter. Percentage totals are also given for eight biologically meaningful residue groups — aliphatic, aromatic, sulphur-containing, basic, acidic, aliphatic hydroxyl, and the two tRNA synthetase classes — allowing you to quickly compare the composition of different sequences.
📋 How to Use
- Paste one or more protein sequences (raw or FASTA) into the text area.
- Click Submit. A residue composition table is displayed for each sequence.
- The upper section of each table shows individual residue counts, listed alphabetically from A to Z (the 20 standard residues plus B, X, and Z); the lower section shows group totals.
- Click Load Example to analyse two sample sequences.
- Use Copy All to copy the full text report to your clipboard.
🧮 Formulas & Logic
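The tool's own formulas are not reproduced in this section. As a minimal sketch, assuming each percentage is the residue count divided by the cleaned sequence length, multiplied by 100, the per-residue table could be computed as shown here. The function name `residue_stats` and the rounding to two decimals are illustrative assumptions, not the tool's actual code.

```python
from collections import Counter

# Assumed alphabet: the 20 standard residues plus the ambiguity codes B, X, Z.
ALPHABET = "ABCDEFGHIKLMNPQRSTVWXYZ"


def residue_stats(sequence):
    """Return {residue: (count, percentage)} for an already-cleaned sequence."""
    sequence = sequence.upper()
    counts = Counter(sequence)
    length = len(sequence)
    stats = {}
    for residue in ALPHABET:
        count = counts.get(residue, 0)
        # Assumed formula: percentage = count / length * 100.
        percentage = round(100.0 * count / length, 2) if length else 0.0
        stats[residue] = (count, percentage)
    return stats


stats = residue_stats("MKKLLA")
print(stats["K"])  # (2, 33.33)
```

An empty sequence yields 0.0 for every percentage rather than a division error; how the tool itself handles that case is not stated here.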
📊 Result Interpretation
Aliphatic: Non-polar, hydrophobic residues that often form the hydrophobic core of globular proteins.
Aromatic: Bulky hydrophobic residues; often buried or involved in stacking interactions.
Sulphur-containing: Cysteine can form disulfide bonds; methionine does not.
Basic: Positively charged at physiological pH. Often enriched in DNA-binding proteins.
Acidic: Negatively charged residues and their amides. D and E are negatively charged at pH 7; N and Q are uncharged.
Aliphatic hydroxyl: Common phosphorylation targets; also involved in hydrogen bonding.
tRNA synthetase class I: Residues attached to their tRNAs by class I aminoacyl-tRNA synthetases.
tRNA synthetase class II: Residues attached to their tRNAs by class II aminoacyl-tRNA synthetases.
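The group totals can be sketched in the same way as the per-residue figures. Only the acidic membership (B, D, E, N, Q, Z) is stated in this document; the other group memberships in the sketch are assumptions chosen for illustration, and `GROUPS` and `group_percentages` are hypothetical names.

```python
# Group definitions. Only the acidic set (B, D, E, N, Q, Z) is stated in this
# document; every other membership below is an assumption for illustration.
GROUPS = {
    "Aliphatic": "GAVLI",
    "Aromatic": "FWY",
    "Sulphur": "CM",
    "Basic": "KRH",
    "Acidic": "BDENQZ",
    "Aliphatic hydroxyl": "ST",
    "tRNA synthetase class I": "ZEQRCMVILYW",
    "tRNA synthetase class II": "BGAPSTHDNKF",
}


def group_percentages(sequence):
    """Return {group: percentage of residues that belong to that group}."""
    sequence = sequence.upper()
    length = len(sequence)
    result = {}
    for name, members in GROUPS.items():
        count = sum(1 for residue in sequence if residue in members)
        result[name] = 100.0 * count / length if length else 0.0
    return result


totals = group_percentages("MKKLLASTDE")
print(totals["Basic"])   # 20.0
print(totals["Acidic"])  # 20.0
```

Because a residue can belong to more than one group (for example, a physico-chemical group and a synthetase class), the group percentages are not expected to sum to 100.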
🔬 Applications
- Comparing amino acid composition between protein families or organisms
- Estimating hydrophobicity and charge properties before wet-lab work
- Checking for unusual amino acid distributions in designed or synthetic sequences
- Identifying proteins likely to be membrane-associated (high aliphatic/aromatic content)
- Verifying that a translated sequence has the expected composition (stop codons are stripped before counting, so their presence cannot be checked here)
⚠️ Common Mistakes & Warnings
B (Asp or Asn), X (any residue), and Z (Glu or Gln) are retained and counted separately. B and Z are also included in the acidic group (B, D, E, N, Q, Z). Sequences from automated pipelines may contain X residues, which count toward the sequence length and therefore shift the reported percentages.
Digits, whitespace, gap characters, stop codons (*), and any letter not in the standard 20 + B/X/Z set are removed before statistics are calculated. The reported length reflects the cleaned sequence.
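The cleaning step described above can be approximated with a single regular expression. This is a minimal sketch, not the tool's implementation: the name `clean_sequence` is hypothetical, converting lowercase input to uppercase is an assumption, and FASTA header lines are assumed to have been removed already.

```python
import re

# Keep only the 20 standard residues plus B, X, and Z; drop everything else
# (digits, whitespace, gap characters, '*', and other letters such as J, O, U).
_NOT_ALLOWED = re.compile(r"[^ABCDEFGHIKLMNPQRSTVWXYZ]")


def clean_sequence(raw):
    """Uppercase the input (assumption) and remove disallowed characters."""
    return _NOT_ALLOWED.sub("", raw.upper())


cleaned = clean_sequence("mk-t 12 AJ*X\n")
print(cleaned, len(cleaned))  # MKTAX 5
```

The length printed here corresponds to the cleaned sequence, which is the length the percentages would be calculated against.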