💡 Quick Summary
Random Protein Regions selects specific positions or ranges within a protein sequence and either replaces those regions with random residues (leaving the rest intact) or preserves those regions and randomises everything else. Random sequences generated this way can serve as controls for evaluating the significance of sequence analysis results.
📋 How to Use
- Paste a raw protein sequence or one or more FASTA sequences into the textarea. Input limit: 500,000,000 characters.
- Enter the residue positions or ranges of interest. Use .. or - to specify a range and a comma to separate entries, for example: 1, 5, 10..20, 50-60. Positions are 1-based.
- Choose the mode: "Replace ranges with random residues" fills the specified positions with random amino acids and leaves the rest of the sequence unchanged. "Preserve ranges" keeps the specified positions intact and randomises everything else.
- Click Run. Each input sequence is processed independently and output as a FASTA record. Use Copy to copy the plain-text result.
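The range syntax described above can be sketched in Python. This is a minimal illustration, not the tool's actual parser: the function name `parse_ranges` is hypothetical, and swapping reversed ranges such as 20..10 is an assumption, since the page does not say how the tool treats them.

```python
import re

# One entry is either a single position ("5") or a range ("10..20" or "50-60").
ENTRY_PATTERN = re.compile(r"(\d+)(?:\s*(?:\.\.|-)\s*(\d+))?")

def parse_ranges(spec: str) -> list[tuple[int, int]]:
    """Parse '1, 5, 10..20, 50-60' into 1-based inclusive (start, end) pairs."""
    ranges = []
    for entry in spec.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate stray commas
        match = ENTRY_PATTERN.fullmatch(entry)
        if match is None:
            raise ValueError(f"Unrecognised entry: {entry!r}")
        start = int(match.group(1))
        # A single position is treated as a range of length one.
        end = int(match.group(2)) if match.group(2) else start
        if start > end:
            start, end = end, start  # assumption: reversed ranges are normalised
        ranges.append((start, end))
    return ranges

print(parse_ranges("1, 5, 10..20, 50-60"))
# [(1, 1), (5, 5), (10, 20), (50, 60)]
```

Representing single positions as one-residue ranges keeps the downstream logic uniform: everything is a 1-based inclusive interval.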
📊 Result Interpretation
In Replace mode, only the residues at the specified positions are changed to random amino acids. The sequence length does not change.
In Preserve mode, every residue that is not covered by a range is replaced with a random amino acid. The specified positions are output unchanged.
Overlapping or duplicate ranges are allowed; each residue position is considered only once.
Range endpoints are automatically clamped to the actual sequence length. Positions beyond the end of the sequence are silently ignored.
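The behaviour described in this section (two modes, overlapping ranges counted once, endpoints clamped to the sequence length) can be approximated with the following sketch. The function `randomise_regions` and its parameters are hypothetical names chosen for illustration, and the uniform draw over the 20 standard amino acids follows the description on this page rather than the tool's source code.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

def randomise_regions(sequence, ranges, mode="replace", rng=None):
    """Randomise residues inside (replace) or outside (preserve) the ranges.

    `ranges` holds 1-based inclusive (start, end) pairs.
    """
    if mode not in ("replace", "preserve"):
        raise ValueError(f"Unknown mode: {mode!r}")
    rng = rng or random.Random()
    length = len(sequence)

    # Collect positions in a set so overlaps and duplicates count once.
    selected = set()
    for start, end in ranges:
        first = max(start, 1)      # clamp to the start of the sequence
        last = min(end, length)    # clamp to the end of the sequence
        selected.update(range(first, last + 1))  # empty if first > last

    out = []
    for pos, residue in enumerate(sequence, start=1):
        in_range = pos in selected
        randomise = in_range if mode == "replace" else not in_range
        out.append(rng.choice(AMINO_ACIDS) if randomise else residue)
    return "".join(out)

seq = "MKTAYIAKQRQISFVKSHFSRQ"
ranges = [(1, 3), (2, 5), (20, 100)]  # overlapping, and past the end
print(randomise_regions(seq, ranges, mode="replace"))
print(randomise_regions(seq, ranges, mode="preserve"))
```

In this sketch a random draw can coincide with the original residue, so a randomised position is not guaranteed to differ from the input.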
🔬 Applications
- Generating negative-control sequences that preserve specific motifs (e.g. an active site or binding domain) while randomising the background
- Destroying specific features (signal peptides, transmembrane regions, conserved motifs) within a sequence for in silico experiments
- Producing synthetic sequences that retain fixed structural elements while randomising variable regions
- Testing pattern-search or alignment tools against partially-randomised protein inputs
⚠️ Common Mistakes & Warnings
Digits, spaces, and any non-IUPAC amino acid characters are removed from the input before ranges are applied. The position numbers you enter refer to the cleaned sequence.
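To see why positions refer to the cleaned sequence, consider this small sketch of a cleaning step. The helper `clean_sequence` is hypothetical, and two details are assumptions: the exact set of characters the tool retains is not listed on this page (the 20 standard letters are used here as a default), and so is the conversion to upper case.

```python
def clean_sequence(raw: str, alphabet: str = "ACDEFGHIKLMNPQRSTVWY") -> str:
    """Drop every character that is not in `alphabet` (assumed letter set)."""
    # Assumption: input is upper-cased before filtering.
    return "".join(ch for ch in raw.upper() if ch in alphabet)

raw = "1 MKTAYIAKQR 11 QISFVKSHFS"   # numbered, space-separated input
cleaned = clean_sequence(raw)

print(cleaned)        # MKTAYIAKQRQISFVKSHFS
print(len(cleaned))   # 20
print(cleaned[11])    # I  (position 12 of the cleaned sequence)
```

Position 12 in the cleaned sequence is I, whereas the twelfth character of the raw input is R, so ranges counted on the unformatted text would land on the wrong residues.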
Each randomised position is filled with one of the 20 standard amino acids (A, C, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W, Y), each chosen with equal probability.
Position 1 refers to the first residue of the cleaned sequence. If your ranges were derived from a tool that uses 0-based coordinates, add 1 to each boundary.
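The coordinate conversion can be sketched as below. Adding 1 to each boundary is correct for 0-based inclusive coordinates; for 0-based half-open intervals (the BED-style convention, where the end is exclusive) only the start needs shifting. The helper `to_one_based` is a hypothetical name used for illustration.

```python
def to_one_based(start: int, end: int, half_open: bool = False) -> tuple[int, int]:
    """Convert a 0-based interval to a 1-based inclusive (start, end) pair."""
    if half_open:
        # [start, end): the exclusive end already equals the 1-based inclusive end.
        return start + 1, end
    # Inclusive 0-based interval: shift both boundaries by one.
    return start + 1, end + 1

print(to_one_based(9, 19))                   # (10, 20)
print(to_one_based(9, 20, half_open=True))   # (10, 20)

# Format the result using the range syntax accepted by the tool.
first, last = to_one_based(9, 19)
print(f"{first}..{last}")                    # 10..20
```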
❓ Frequently Asked Questions
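What is the difference between "Replace" and "Preserve" mode?
Replace mode randomises the residues at the specified positions and leaves the rest of the sequence unchanged. Preserve mode does the opposite: it keeps the specified positions intact and randomises everything else.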
Can I use both ".." and "-" as range separators in the same input?
Yes. For example, 1..10, 20-30 is valid. Both separators are recognised and may be mixed freely.