Protein Stats - TheBiologyBro

💡 Quick Summary

Protein Stats returns the count and percentage of each amino acid residue in the sequence you enter. Percentage totals are also given for eight biologically meaningful residue groups — aliphatic, aromatic, sulphur-containing, basic, acidic, aliphatic hydroxyl, and the two tRNA synthetase classes — allowing you to quickly compare the composition of different sequences.

📋 How to Use

Paste one or more protein sequences (raw or FASTA) into the text area.
Click Submit. A residue composition table is displayed for each sequence.
The upper section of each table shows counts and percentages for individual residues (the 20 standard amino acids plus B, X, and Z); the lower section shows group totals.
Click Load Example to analyse two sample sequences.
Use Copy All to copy the full text report to your clipboard.
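The first step accepts raw or FASTA input. As a rough illustration of how pasted text could be split into separate sequences, here is a minimal Python sketch. The function name `split_records` and the rule that a line starting with `>` opens a new record are assumptions made for illustration, not a description of the tool's actual parser.

```python
def split_records(text):
    """Split pasted input into (title, sequence) pairs.

    Assumption: a line starting with '>' opens a new FASTA record, and
    input without any '>' line is one raw sequence with an empty title.
    """
    records = []
    title, chunks = "", []
    seen_any = False
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if seen_any:
                # close the record collected so far
                records.append((title, "".join(chunks)))
            title, chunks = line[1:].strip(), []
        else:
            chunks.append(line)
        seen_any = True
    if seen_any:
        records.append((title, "".join(chunks)))
    return records
```

Separating title lines first matters because letters in a FASTA title would otherwise be counted as residues.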

🧮 Formulas & Logic

Count

Number of occurrences of the residue or group pattern in the cleaned sequence.

Percentage

(Count / sequence length) × 100, reported to two decimal places.
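The two formulas above can be sketched in a few lines of Python. This is a minimal illustration that assumes the sequence has already been cleaned; the name `residue_stats` is hypothetical, and unlike a full report it lists only residues that actually occur.

```python
from collections import Counter

def residue_stats(cleaned):
    """Return {residue: (count, percentage)} for an already-cleaned sequence."""
    length = len(cleaned)
    if length == 0:
        return {}  # avoid dividing by zero on empty input
    counts = Counter(cleaned)
    # Percentage = (count / sequence length) * 100, rounded to two decimals
    return {res: (n, round(n / length * 100, 2))
            for res, n in sorted(counts.items())}
```

For the six-residue sequence `MKVLLA`, L occurs twice and so accounts for 33.33% of the sequence, while each of the other four residues accounts for 16.67%.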

📊 Result Interpretation

Aliphatic (G,A,V,L,I)

Non-polar residues that commonly pack into the hydrophobic core of globular proteins. Glycine is grouped here although it has no side chain and is only weakly hydrophobic.

Aromatic (F,W,Y)

Bulky hydrophobic residues; often buried or involved in stacking interactions.

Sulphur (C,M)

Cysteine can form disulfide bonds; methionine cannot, and is rarely involved in cross-links.

Basic (K,R,H)

K and R are positively charged at physiological pH; H (side-chain pKa around 6) is only partly protonated. Basic residues are often abundant in DNA-binding proteins.

Acidic (B,D,E,N,Q,Z)

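Despite the group name, only D and E are acidic: both are deprotonated and negatively charged at pH 7. N and Q are their uncharged amide counterparts, and B (Asp or Asn) and Z (Glu or Gln) are ambiguity codes, so this total can overstate the true negative charge.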

Aliphatic hydroxyl (S,T)

Common phosphorylation targets; also involved in hydrogen bonding.

tRNA synthetase class I (Z,E,Q,R,C,M,V,I,L,Y,W)

Residues attached to their tRNAs by class I aminoacyl-tRNA synthetases. Z is placed here because both Glu and Gln are class I.

tRNA synthetase class II (B,G,A,P,S,T,H,D,N,K,F)

Residues attached to their tRNAs by class II aminoacyl-tRNA synthetases. B is placed here because both Asp and Asn are class II.
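The group totals can be reproduced with a simple membership count. The sketch below copies the eight group definitions from the headings above; the names `GROUPS` and `group_stats` are hypothetical, and the sequence is assumed to be already cleaned.

```python
# Group membership as listed in the headings above
GROUPS = {
    "Aliphatic": "GAVLI",
    "Aromatic": "FWY",
    "Sulphur": "CM",
    "Basic": "KRH",
    "Acidic": "BDENQZ",
    "Aliphatic hydroxyl": "ST",
    "tRNA synthetase class I": "ZEQRCMVILYW",
    "tRNA synthetase class II": "BGAPSTHDNKF",
}

def group_stats(cleaned):
    """Return {group: (count, percentage)} for an already-cleaned sequence."""
    length = len(cleaned)
    stats = {}
    for name, members in GROUPS.items():
        # A residue is counted once for every group that lists it
        count = sum(1 for res in cleaned if res in members)
        pct = round(count / length * 100, 2) if length else 0.0
        stats[name] = (count, pct)
    return stats
```

For `MKVLLAW` (7 residues), the aliphatic group contains V, L, L, and A (4 residues, 57.14%), and W is counted in both the aromatic group and class I, which is why group percentages overlap.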

🔬 Applications

Comparing amino acid composition between protein families or organisms
Estimating hydrophobicity and charge properties before wet-lab work
Checking for unusual amino acid distributions in designed or synthetic sequences
Identifying proteins likely to be membrane-associated (high aliphatic/aromatic content)
Verifying that a translated sequence has the expected length and composition (stop symbols are stripped before counting, so they do not appear in the report)

⚠️ Common Mistakes & Warnings

Degenerate codes are counted

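B (Asp or Asn), X (any residue), and Z (Glu or Gln) are retained and counted separately. B and Z are also included in the acidic group (B, D, E, N, Q, Z) and in tRNA synthetase classes II and I respectively, so they can inflate those group counts. X belongs to no group, but it still counts toward the sequence length, so sequences from automated pipelines that contain many X residues will show lower percentages for every other residue and group.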

Non-standard characters stripped before counting

Digits, whitespace, gap characters, stop codons (*), and any letter not in the standard 20 + B/X/Z set are removed before statistics are calculated. The reported length reflects the cleaned sequence.
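The cleaning step described above can be approximated with a single regular expression. This is a minimal sketch, not the tool's own code: the names `ALLOWED` and `clean_sequence` are hypothetical, and converting lowercase input to uppercase before filtering is an assumption that the text above does not state.

```python
import re

# The 20 standard residues plus the degenerate codes B, X, and Z
ALLOWED = "ACDEFGHIKLMNPQRSTVWYBXZ"

def clean_sequence(raw):
    """Remove every character that is not in the allowed residue set."""
    # Assumption: lowercase letters are uppercased rather than discarded
    return re.sub(f"[^{ALLOWED}]", "", raw.upper())
```

Under these assumptions, `clean_sequence("mkv 10 LL-A*")` returns `MKVLLA`, and the reported length would be 6 rather than the 12 characters originally pasted.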

❓ Frequently Asked Questions

Why are B, X, and Z included?

These are standard IUPAC degenerate amino acid codes that appear in sequences downloaded from databases. B represents Asp or Asn, Z represents Glu or Gln, and X represents any amino acid. They are counted separately so you can see how many ambiguous residues your sequence contains.

Do the group percentages add up to 100%?

No. Group categories are not mutually exclusive and are not designed to sum to 100%. For example, Trp (W) appears in both the Aromatic group and tRNA synthetase class I. Individual residue percentages, including B, X, and Z when present, do sum to 100% (up to rounding).

What is the tRNA synthetase classification based on?

Aminoacyl-tRNA synthetases (aaRS) are divided into two classes based on their active site architecture. Class I enzymes have a Rossmann fold and typically aminoacylate the 2′-OH of the terminal adenosine; class II enzymes are built around a seven-stranded antiparallel β-sheet and typically aminoacylate the 3′-OH. Phenylalanyl-tRNA synthetase, a class II enzyme that uses the 2′-OH, is a known exception.