Random Protein Sequence
Generate a random amino acid sequence of specified length

Maximum length: 10,000,000 residues. Each residue is chosen uniformly at random from the 20 standard amino acids.

💡 Quick Summary

Random Protein Sequence generates a random amino acid sequence of the length you specify. Each of the 20 standard residues is chosen with equal probability. Random protein sequences are useful as null models for evaluating the significance of sequence analysis results.

📋 How to Use
  1. Enter the desired length in residues (default: 1000; maximum: 10,000,000).
  2. Choose how many sequences to generate (1, 10, 50, or 100).
  3. Click Run. Each sequence is output as a FASTA record with an auto-generated title. Use Copy to copy the plain-text result.
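
The steps above can be approximated offline with a short Python sketch. It is not the tool's actual implementation: the function names `random_protein`, `to_fasta`, and `generate`, the title format, and the use of Python's `random` module are all assumptions made for illustration.

```python
import random

# One-letter codes for the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def random_protein(length, rng=random):
    """Return `length` residues, each drawn independently and uniformly."""
    return "".join(rng.choices(AMINO_ACIDS, k=length))

def to_fasta(title, sequence, width=60):
    """Format one FASTA record, wrapping the sequence at `width` characters."""
    lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return ">" + title + "\n" + "\n".join(lines)

def generate(length=1000, count=1):
    """Build `count` FASTA records of `length` residues each."""
    records = []
    for n in range(1, count + 1):
        # Hypothetical title format; the tool's real titles may differ.
        title = f"random_protein_{n} length={length}"
        records.append(to_fasta(title, random_protein(length)))
    return "\n".join(records)

if __name__ == "__main__":
    print(generate(length=120, count=2))
```

Passing a seeded `random.Random` instance as `rng` makes a run repeatable, which the web tool itself does not offer.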
🔬 Applications
  • Generating null-model sequences for evaluating the significance of motif or pattern search results
  • Creating random backgrounds for benchmarking alignment or annotation tools
  • Producing synthetic protein sequences with uniform residue composition for statistical testing
  • Testing sequence analysis pipelines with known-random inputs
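
As one illustration of the null-model use case, the sketch below estimates how often a short motif appears by chance in uniform random sequences and compares an observed count against that background. The helper names, the exact-match motif model, and the add-one smoothing in the p-value are choices made for this example, not features of the tool.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def count_motif(sequence, motif):
    """Count overlapping occurrences of an exact motif in a sequence."""
    count = 0
    start = sequence.find(motif)
    while start != -1:
        count += 1
        start = sequence.find(motif, start + 1)
    return count

def expected_count(length, motif):
    """Expected exact-match count under the uniform model: (L - k + 1) / 20**k."""
    k = len(motif)
    return max(length - k + 1, 0) / 20 ** k

def null_counts(motif, length, trials, seed=None):
    """Motif counts observed in `trials` independent random sequences."""
    rng = random.Random(seed)
    return [
        count_motif("".join(rng.choices(AMINO_ACIDS, k=length)), motif)
        for _ in range(trials)
    ]

def empirical_p_value(observed, null):
    """Fraction of null counts >= observed, with add-one smoothing."""
    return (1 + sum(c >= observed for c in null)) / (1 + len(null))

if __name__ == "__main__":
    null = null_counts("RGD", length=1000, trials=500, seed=1)
    print("expected per sequence:", expected_count(1000, "RGD"))
    print("p-value for 3 hits:", empirical_p_value(3, null))
```

Because the background is uniform, a motif that looks significant here may still be unremarkable against a composition-matched background.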
⚠️ Common Mistakes & Warnings
Sequences are randomly generated each run

Every click of Run produces a new random sequence, so copy any result you need to keep. Each residue position is chosen independently and uniformly at random from the 20 standard amino acids.

Large lengths or many sequences may be slow

Generating 100 sequences of 10,000,000 residues each creates 1,000,000,000 sequence characters in the browser, plus FASTA titles and line breaks. If the page becomes slow or unresponsive, reduce the count or the length.
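
The arithmetic behind this warning can be checked with a small estimate. The sketch assumes 60-character line wrapping and a nominal 40-character title per record; the title length is a guess, since the exact auto-generated title format is not specified here.

```python
import math

def estimated_output_chars(length, count, width=60, title_chars=40):
    """Rough character count of the FASTA output, including newlines."""
    residues = length * count
    sequence_lines = math.ceil(length / width) * count
    # Every sequence line and every title line ends with a newline.
    newlines = sequence_lines + count
    titles = title_chars * count  # assumed title length, not the tool's value
    return residues + newlines + titles

if __name__ == "__main__":
    total = estimated_output_chars(10_000_000, 100)
    print(f"{total:,} characters")
```

Line breaks add about 1.7% on top of the residues themselves, so the residue count dominates the total.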

Equal residue frequencies

All 20 amino acids are drawn with equal probability (5% each). The output does not reflect any organism's actual residue composition or codon usage bias.
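
Equal probability does not mean exactly equal counts: in a sequence of length L, each residue is expected about L/20 times, with sampling noise around that value. The sketch below tallies residue frequencies so the spread can be inspected. The function name and the way the example sequence is generated are illustrative assumptions.

```python
import random
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def residue_frequencies(sequence):
    """Return {residue: fraction} for the 20 standard amino acids."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    counts = Counter(sequence)
    total = len(sequence)
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}

if __name__ == "__main__":
    seq = "".join(random.choices(AMINO_ACIDS, k=100_000))
    for aa, freq in residue_frequencies(seq).items():
        print(f"{aa}: {freq:.3%}")  # values should scatter around 5%
```

Short sequences show much larger deviations from 5% than long ones, which matters when a random sequence is used as a statistical background.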

❓ Frequently Asked Questions

Are all 20 amino acids represented equally?
Yes. Each of the 20 standard amino acids is chosen with equal probability (5% each), so observed frequencies approach 5% per residue as the sequence gets longer. The output does not reflect any organism's actual residue composition.
Why does the output have 60-character lines?
Sequences are wrapped at 60 characters per line, which is a common convention for FASTA-format protein sequences.
Can I generate sequences longer than 10,000,000 residues?
Not with this tool. The limit is set to avoid excessive browser memory use. For longer sequences, consider a command-line tool.
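
One possible command-line approach is a small script that writes the sequence in chunks, so the full record never has to sit in memory. This is a minimal sketch under stated assumptions, not a pointer to a specific existing tool; the function name, default title, and chunk size are hypothetical.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def write_random_fasta(path, length, title="random_protein", width=60,
                       lines_per_chunk=10_000, seed=None):
    """Stream one random FASTA record to `path`, a batch of lines at a time."""
    rng = random.Random(seed)
    # Chunk size is a multiple of `width`, so line wrapping stays aligned.
    chunk_size = width * lines_per_chunk
    with open(path, "w", encoding="ascii") as handle:
        handle.write(f">{title}\n")
        remaining = length
        while remaining > 0:
            n = min(chunk_size, remaining)
            chunk = "".join(rng.choices(AMINO_ACIDS, k=n))
            for i in range(0, n, width):
                handle.write(chunk[i:i + width] + "\n")
            remaining -= n

if __name__ == "__main__":
    write_random_fasta("random_protein.fasta", length=1_000_000, seed=0)
```

Memory use is bounded by the chunk size rather than the total length, at the cost of writing to disk instead of producing a string.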