💡 Quick Summary
Protein Pattern Find accepts one or more protein sequences along with a regular expression search pattern, and returns the number and positions of all sites that match the pattern. Use it to locate sequence regions matching a consensus sequence of interest, such as phosphorylation motifs, binding sites, or conserved domains.
📋 How to Use
- Paste one or more protein sequences (raw or FASTA) into the text area.
- Enter a search pattern in the pattern field. The default pattern
S[^S]{0,5}Sfinds pairs of serine residues that are no more than 5 residues apart. - Standard JavaScript regular expression syntax is supported: use
[ACE]for character classes,{n,m}for repeat counts,.for any residue, and[^X]for "not X". - Click Submit. Each matching region is reported with its position (1-based) within the sequence.
- Overlapping matches are reported — for example,
SSinSSSis found at positions 1 and 2. - Use Copy All to copy the full results report to your clipboard.
🧮 Formulas & Logic
📊 Result Interpretation
Residue positions are 1-based and count from the first residue of the cleaned sequence (after non-protein characters are stripped).
If your pattern can match overlapping regions (e.g. SS in SSS), all overlapping occurrences are reported.
The pattern did not match anywhere in that sequence. Check the pattern syntax and residue case (the search is case-insensitive).
🔬 Applications
- Locating known phosphorylation, glycosylation, or cleavage motifs across a sequence set
- Finding occurrences of a consensus binding site or recognition sequence
- Identifying conserved short linear motifs (SLiMs) in protein families
- Verifying the presence or absence of a tag, linker, or signal sequence after cloning
- Scanning translated sequences for protease cleavage sites before digest experiments
⚠️ Common Mistakes & Warnings
Characters outside the 20 standard amino acid letters (B, Z, X, *, gaps, digits, whitespace) are removed from the sequence before the pattern is applied. Reported positions refer to the cleaned sequence.
Any valid JavaScript regular expression is accepted. Unbalanced brackets, invalid quantifiers, or other syntax errors will be caught and reported before the search runs.
Patterns such as .* or .{0,} will match very large regions. Narrow the pattern with specific residue requirements or tighter repeat bounds.
❓ Frequently Asked Questions
What pattern syntax is supported?
[ACDE] matches any of A, C, D, or E; [^P] matches any residue except P; . matches any residue; {2,4} means 2–4 repeats; ^ and $ anchor to the start/end of the sequence. The search is always case-insensitive.What does the default pattern S[^S]{0,5}S mean?
Why do the positions differ from those in a database annotation?
Can I search for a PROSITE-style pattern?
x(2,4) instead of .{2,4}). You will need to convert them to regular expression syntax manually before searching here.