Window Extractor DNA - TheBiologyBro

Input Sequence

Paste a raw sequence or one or more FASTA sequences. Non-DNA characters are stripped automatically. Input limit: 500,000,000 characters.

Window size (bp): bases to extract

Position (1-based): anchor position in sequence

Anchor

Strand

Output mode

💡 Quick Summary

Window Extractor DNA takes a DNA sequence, a window size, a position, and an anchor mode (centered on / ending with / starting with). The bases within the window are returned as a new sequence, as uppercase text within the full sequence, or as lowercase text within the full sequence. It is useful for extracting a subsequence when you know its position.

📋 How to Use

Paste a raw DNA sequence or one or more FASTA sequences into the input area. Input limit is 500,000,000 characters.
Set the Window size — the number of bases to extract (default 61).
Choose an Anchor: Centered on places the window symmetrically around the position; Ending with makes the position the last base of the window; Starting with makes the position the first base of the window.
Set the Position — the 1-based anchor position within the sequence (default 50).
Choose a Strand: Direct returns the extracted window as-is; Complement returns the reverse complement of the extracted window (each base complemented, then the order reversed).
Choose an Output mode: New sequence returns only the window bases as a FASTA entry; Uppercased in context returns the full sequence with the window in uppercase; Lowercased in context returns the full sequence with the window in lowercase.
Click Run. Each input sequence produces one output FASTA entry.
Use the Copy button to copy the result to your clipboard.
Click Load Example to try with a sample 190-base sequence, window size 61, centered on position 50.
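
The three output modes described in the steps above can be illustrated with a short Python sketch. The function name render_window and the mode strings are hypothetical, chosen for this example; the tool's own code is not shown on this page. The sketch ignores the Strand option, because the page does not say how it combines with the in-context modes.

```python
def render_window(seq, start, end, mode):
    """Render a window given 1-based inclusive coordinates (start, end).

    The mode names below are illustrative, not the tool's identifiers.
    """
    # Convert 1-based inclusive coordinates to Python slices.
    before, window, after = seq[:start - 1], seq[start - 1:end], seq[end:]
    if mode == "new":
        return window
    if mode == "upper_in_context":
        return before.lower() + window.upper() + after.lower()
    if mode == "lower_in_context":
        return before.upper() + window.lower() + after.upper()
    raise ValueError(f"unknown output mode: {mode!r}")


seq = "acgtacgtac"
print(render_window(seq, 4, 6, "new"))               # tac
print(render_window(seq, 4, 6, "upper_in_context"))  # acgTACgtac
print(render_window(seq, 4, 6, "lower_in_context"))  # ACGtacGTAC
```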

🧮 Formulas & Logic

Centered on position P, window W

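start = P − ⌊W/2⌋, end = start + W − 1 (1-based, clamped to sequence bounds). For odd W the window is symmetric around P; for even W it holds W/2 bases before P and W/2 − 1 bases after it.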

Ending with position P, window W

start = P − W + 1, end = P (1-based, clamped)

Starting with position P, window W

start = P, end = P + W − 1 (1-based, clamped)
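
As a rough illustration, the three anchor formulas above can be written as one Python function. The name window_bounds and the anchor strings are assumptions made for this sketch, not the tool's actual identifiers.

```python
def window_bounds(seq_len, position, window, anchor):
    """Return (start, end) as 1-based inclusive coordinates, clamped to the sequence."""
    if anchor == "centered":
        start = position - window // 2
        end = start + window - 1
    elif anchor == "ending":
        start = position - window + 1
        end = position
    elif anchor == "starting":
        start = position
        end = position + window - 1
    else:
        raise ValueError(f"unknown anchor mode: {anchor!r}")
    # Clamp both ends to the valid range 1..seq_len.
    return max(1, start), min(seq_len, end)


# The Load Example settings: 190 bases, window 61, position 50.
print(window_bounds(190, 50, 61, "centered"))  # (20, 80)
print(window_bounds(190, 50, 61, "ending"))    # (1, 50), raw start of -10 clamped to 1
print(window_bounds(190, 50, 61, "starting"))  # (50, 110)
```

With the example settings, the centered window spans positions 20 to 80: 30 bases before position 50, the anchor base itself, and 30 bases after.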

Complement strand

output = reverse( IUPAC_complement( window ) )
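
A minimal Python sketch of this step, using the standard IUPAC nucleotide codes, might look as follows. Mapping U to A and preserving letter case are assumptions of this sketch; the page does not state how the tool handles either.

```python
# IUPAC codes and their complements, position by position.
_CODES = "ACGTUMRWSYKVHDBN"
_COMPS = "TGCAAKYWSRMBDHVN"  # assumption: U is complemented to A

# Build one translation table covering upper and lower case.
COMPLEMENT = str.maketrans(_CODES + _CODES.lower(), _COMPS + _COMPS.lower())


def reverse_complement(window):
    """Complement every base, then reverse the result."""
    return window.translate(COMPLEMENT)[::-1]


print(reverse_complement("ATGCR"))   # YGCAT
print(reverse_complement("AAGCTT"))  # AAGCTT (a palindromic site is unchanged)
```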

📊 Result Interpretation

Sequences Processed

Number of FASTA records successfully processed.

Window Info

The window size, anchor position, and anchor mode used for this run.

🔬 Applications

Extracting a fixed-size neighbourhood around a SNP or mutation site for primer design or visual inspection
Obtaining the sequence context around a restriction site or transcription factor binding position
Cutting out a defined window for motif scanning or local alignment
Generating uppercase-highlighted views of a feature within its genomic context for publication figures
Retrieving the complement-strand window around a known antisense element

⚠️ Common Mistakes & Warnings

Window is clamped at sequence boundaries

If the computed window extends beyond the start or end of the sequence, it is automatically clamped. The extracted length may therefore be shorter than the requested window size; the FASTA title of the output reflects this.
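
To see how clamping shortens the result, here is a small Python sketch with a made-up 100-base sequence. The helper name clamp_window is hypothetical, and the sketch prints the coordinates directly rather than guessing at the tool's FASTA title format.

```python
def clamp_window(seq, start, end):
    """Clamp 1-based inclusive (start, end) to the sequence and slice it."""
    start_c = max(1, start)
    end_c = min(len(seq), end)
    # 1-based inclusive coordinates map to the slice [start_c - 1 : end_c].
    return start_c, end_c, seq[start_c - 1:end_c]


seq = "ACGT" * 25                  # 100 bases, invented for the example
position, window = 95, 21          # a centered request near the 3' end
raw_start = position - window // 2     # 85
raw_end = raw_start + window - 1       # 105, which is past the end
start, end, bases = clamp_window(seq, raw_start, raw_end)
print(start, end, len(bases))      # 85 100 16
```

The request was for 21 bases, but only 16 exist between position 85 and the end of the sequence.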

Non-DNA characters are stripped

Any character that is not a valid IUPAC DNA/RNA letter is removed from the sequence before extraction. Digits, whitespace, and punctuation are all stripped automatically.
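
One way to approximate this cleaning step in Python is a single regular expression over the IUPAC letters. The exact character set the tool keeps (for example, whether gap symbols survive) and its treatment of letter case are not documented here, so both are assumptions of this sketch.

```python
import re

# Anything that is not an IUPAC nucleotide letter, in either case.
NON_IUPAC = re.compile(r"[^ACGTUMRWSYKVHDBN]", re.IGNORECASE)


def clean_sequence(raw):
    """Remove digits, whitespace, punctuation and other non-IUPAC characters."""
    return NON_IUPAC.sub("", raw)


# GenBank-style text with line numbers, spaces and stray symbols.
print(clean_sequence("1 acgtn NNRY-acg\n61 tt*"))  # acgtnNNRYacgtt
```

FASTA title lines would need to be separated from the sequence before cleaning, since a title usually contains letters that are also valid IUPAC codes.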

Position must be within the sequence

Positions start at 1. If the anchor position is greater than the sequence length, the record is skipped and a warning is shown.

❓ Frequently Asked Questions

What is the difference between the three anchor modes?

"Centered on" places the window symmetrically around the given position — for a window of size 61, 30 bases appear before and 30 after the position. "Ending with" returns the W bases that finish at the given position. "Starting with" returns the W bases that begin at the given position.

What happens if the window extends past the end of the sequence?

The window is clamped to the sequence boundaries. You will still receive output, but it will be shorter than the requested window size. The FASTA title records the actual start and end positions used.

What does the Complement strand option do?

The complement strand option takes the extracted window, computes the IUPAC nucleotide complement of every base, then reverses the result — producing the sequence as it would appear on the antisense strand read 5'→3'.

What is the difference between "New sequence" and the context modes?

"New sequence" returns only the bases within the window as a compact FASTA record. "Uppercased in context" returns the full source sequence in lowercase with the window region in UPPERCASE — useful for seeing where the window sits. "Lowercased in context" does the reverse: the full sequence is uppercase and the window is lowercase.

Can I process multiple sequences at once?

Yes. Paste any number of FASTA-formatted sequences, up to the 500,000,000-character input limit. The same window size, position, anchor, and output settings are applied to every sequence independently.
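
As a sketch of per-record processing, the Python below splits multi-FASTA text into records and applies one "starting with" window to each. The function names parse_fasta and extract_all are hypothetical, and skipping out-of-range positions follows the warning described earlier on this page; the tool's real parser and output titles may differ.

```python
def parse_fasta(text):
    """Yield (title, sequence) pairs; raw input without '>' gives one untitled record."""
    title, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            if title is not None or chunks:
                yield title or "", "".join(chunks)
            title, chunks = line[1:].strip(), []
        elif line:
            chunks.append(line)
    if title is not None or chunks:
        yield title or "", "".join(chunks)


def extract_all(text, position, window):
    """Apply the same 'starting with' window to every record independently."""
    results = []
    for title, seq in parse_fasta(text):
        if position > len(seq):
            continue  # position outside this record: skip it
        end = min(len(seq), position + window - 1)
        results.append((title, seq[position - 1:end]))
    return results


text = ">seq1\nACGTACGT\n>seq2\nTTTTGGGGCC\n>short\nAC\n"
print(extract_all(text, position=3, window=4))
# [('seq1', 'GTAC'), ('seq2', 'TTGG')]
```

The third record has only two bases, so position 3 lies outside it and it produces no output.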