💡 Quick Summary
Shuffle Protein randomly reorders the residues of one or more protein sequences using a Fisher-Yates shuffle. Because shuffling is a permutation rather than random sampling, each output sequence has exactly the same residue composition as its input; only the order of residues changes. Shuffled sequences serve as composition- and length-matched controls for evaluating sequence analysis results.
📋 How to Use
- Paste one or more raw or FASTA sequences into the textarea. Multiple FASTA records are each shuffled independently. Input limit: 300,000,000 characters.
- Click Run. Each sequence is independently shuffled and output as a FASTA record. Use Copy to copy the plain-text result.
🧮 Formulas & Logic
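The tool's own source is not reproduced here, so the following is only a minimal Python sketch of a Fisher-Yates shuffle applied to a residue string. The function name `fisher_yates_shuffle` and the `seed` parameter are illustrative assumptions, not part of the tool.

```python
import random


def fisher_yates_shuffle(sequence, seed=None):
    """Return a random permutation of the residues in `sequence`.

    Hypothetical helper for illustration; `seed` only makes runs repeatable.
    """
    rng = random.Random(seed)
    residues = list(sequence)
    # Walk from the last position down to the second, swapping each position
    # with a randomly chosen position at or before it.
    for i in range(len(residues) - 1, 0, -1):
        j = rng.randint(0, i)  # randint is inclusive on both ends
        residues[i], residues[j] = residues[j], residues[i]
    return "".join(residues)


original = "MKTAYIAKQRQISFVKSHFSRQ"
shuffled = fisher_yates_shuffle(original, seed=1)
print(shuffled)
```

Every residue is only ever swapped, never drawn anew, which is why the output is a rearrangement of exactly the same residues.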
📊 Result Interpretation
Unlike sampling-based tools, shuffling does not change the count of any residue. If the input is 20% Leucine, the shuffled output is also exactly 20% Leucine.
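As a rough check of this property, the sketch below shuffles a ten-residue example containing two leucines and compares residue counts before and after. It uses Python's standard `random.shuffle` as a stand-in for the tool, and the example sequence is made up for illustration.

```python
import random
from collections import Counter

original = "MLKTAYLGSD"  # made-up example: 2 of 10 residues are leucine

residues = list(original)
random.shuffle(residues)  # in-place permutation from the standard library
shuffled = "".join(residues)

original_counts = Counter(original)
shuffled_counts = Counter(shuffled)

# The per-residue counts match, so every residue fraction matches too.
print(original_counts == shuffled_counts)
print(shuffled.count("L") / len(shuffled))
```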
All positional information (motif positions, domain boundaries, local sequence biases) is destroyed. Any similarity that remains between the shuffled sequence and the original is what would be expected by chance for that composition.
If you paste multiple FASTA records, each sequence is shuffled independently. The number of output records equals the number of valid input records.
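The per-record behaviour can be sketched as follows. This is an assumption-laden illustration rather than the tool's parser: `parse_fasta` and `shuffle_records` are hypothetical names, headerless input is given the placeholder title "sequence", a record with no residues is treated as invalid, and input headers are passed through unchanged.

```python
import random


def parse_fasta(text):
    """Split FASTA text into (header, sequence) pairs.

    Text with no '>' line is treated as a single headerless record.
    """
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None or chunks:
                records.append((header or "sequence", "".join(chunks)))
            header, chunks = line[1:].strip(), []
        else:
            chunks.append(line)
    if header is not None or chunks:
        records.append((header or "sequence", "".join(chunks)))
    return records


def shuffle_records(text, seed=None):
    """Shuffle each record on its own; residues never move between records."""
    rng = random.Random(seed)
    output = []
    for header, sequence in parse_fasta(text):
        if not sequence:
            continue  # assumption: a record with no residues is not valid
        residues = list(sequence)
        rng.shuffle(residues)
        output.append((header, "".join(residues)))
    return output


fasta_text = """>protA
MKTAYIAKQR
QISFVKSHFS
>protB
GGSLLKDEAW
"""
for header, sequence in shuffle_records(fasta_text, seed=7):
    print(f">{header}\n{sequence}")
```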
IUPAC ambiguity codes (B, J, X) and the rare residues U (selenocysteine) and O (pyrrolysine) are retained in the shuffle pool alongside the standard residues.
🔬 Applications
- Generating composition- and length-matched null sequences for statistical testing of motif enrichment
- Producing background sequences that share exactly the same amino acid composition as a query protein
- Testing analysis pipelines with sequences that have identical composition but no more than chance-level similarity to real proteins
- Evaluating whether an analysis result depends on sequence order rather than just residue composition
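One way to use shuffled controls for the motif-enrichment case is an empirical p-value: count the motif in the real sequence, then see how often a shuffled version has at least as many occurrences. The sketch below is a hedged illustration in Python, with `random.shuffle` standing in for repeated runs of the tool. The function names, the example sequence, and the add-one correction are choices made for this example.

```python
import random


def count_motif(sequence, motif):
    """Count occurrences of `motif` in `sequence`, allowing overlaps."""
    return sum(
        1
        for i in range(len(sequence) - len(motif) + 1)
        if sequence.startswith(motif, i)
    )


def empirical_p_value(sequence, motif, n_shuffles=1000, seed=None):
    """Estimate how often shuffled controls match or exceed the observed count."""
    rng = random.Random(seed)
    observed = count_motif(sequence, motif)
    residues = list(sequence)
    at_least_as_many = 0
    for _ in range(n_shuffles):
        rng.shuffle(residues)  # composition and length stay fixed
        if count_motif("".join(residues), motif) >= observed:
            at_least_as_many += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (at_least_as_many + 1) / (n_shuffles + 1)


query = "MKRGDSAARGDLLKRGDTTQRGDE"  # made-up sequence with repeated RGD
observed, p_value = empirical_p_value(query, "RGD", n_shuffles=500, seed=42)
print(observed, p_value)
```

A small p-value suggests the motif count depends on residue order and not only on composition, which is the question the shuffled control is meant to answer.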
⚠️ Common Mistakes & Warnings
Digits, spaces, and non-IUPAC characters are removed from each sequence before shuffling. The output length may be shorter than the input if invalid characters were present.
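The effect of this cleaning step on length can be sketched as below. The accepted alphabet here (the 20 standard residues plus B, J, O, U, X) and the uppercasing of input are assumptions made for the example; the tool's actual filter may differ.

```python
import re

# Assumed alphabet: 20 standard residues plus the extra codes B, J, O, U, X.
ALLOWED = "ACDEFGHIKLMNPQRSTVWY" + "BJOUX"
INVALID = re.compile(f"[^{ALLOWED}]")


def clean_sequence(raw):
    """Drop every character outside the assumed alphabet (hypothetical helper)."""
    return INVALID.sub("", raw.upper())


raw = "1 MKTAY IAKQR 11 QIS-FVK*"
cleaned = clean_sequence(raw)
print(len(raw), len(cleaned), cleaned)
```

The shuffled output is length-matched to the cleaned sequence, so numbered or space-padded input yields a shorter result than the raw character count suggests.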
Shuffle Protein produces a single permutation of the input. To generate many independent randomised sequences from a template, use Sample Protein instead.