Range Extractor DNA - TheBiologyBro

Input DNA

Paste a raw sequence or one or more FASTA sequences. Input limit: 500,000,000 characters.

Ranges:

Use ".." for a span (10..20), commas to separate entries. Keywords: start, end, center, length. Arithmetic allowed: (end-10)..end

💡 Quick Summary

Range Extractor DNA accepts a DNA sequence along with a set of positions or ranges, and returns the matching bases in your choice of four formats: merged into a new sequence, as separate FASTA records, uppercased within the full sequence, or lowercased within the full sequence. Ranges support numeric positions, spans (10..20), the keywords start/end/center/length, and arithmetic expressions such as (end-10)..end.

📋 How to Use

Paste a raw DNA sequence or one or more FASTA sequences into the top input area.
Enter positions or ranges in the Ranges field, separated by commas. Use x..y for a span. You can use the keywords start, end, center, and length in place of numbers, and arithmetic expressions such as (end-10)..end.
Choose a Strand: Direct returns the bases as-is; Complement returns the reverse complement of the extracted sequence.
Choose a Return as mode: New sequence joins all ranges into one FASTA entry. Separate FASTA records gives each range its own header. Uppercased in context returns the full sequence in lowercase with the ranges in uppercase; Lowercased in context returns it in uppercase with the ranges in lowercase.
Click Extract. Multiple FASTA input sequences are each processed independently.
Use Copy to copy the result to your clipboard.
Click Load Example to try the tool with a 190-base sample sequence using the default ranges 1, 5, 10..20, end.
Click Clear to reset.

🧮 Formulas & Logic

Single position

sequence[ position − 1 ] (1-based → 0-based)

Range

sequence.substring( start − 1, stop ) (both ends inclusive)
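The two formulas above can be sketched as follows. The function names baseAt and extractRange are hypothetical and chosen for illustration; only the indexing arithmetic comes from the formulas.

```javascript
// Hypothetical helpers illustrating the 1-based to 0-based conversion.

// Return the base at a 1-based position.
function baseAt(sequence, position) {
  return sequence[position - 1];
}

// Return the bases from start to stop (1-based, both ends inclusive).
// substring() excludes its end index, so passing `stop` unchanged
// still includes the base at 1-based position `stop`.
function extractRange(sequence, start, stop) {
  return sequence.substring(start - 1, stop);
}

const seq = "ACGTACGTAC"; // 10 bases
console.log(baseAt(seq, 1));            // "A"
console.log(extractRange(seq, 3, 6));   // "GTAC"
console.log(extractRange(seq, 8, 10));  // "TAC" (the last 3 bases)
```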

Complement strand

reverse( IUPAC_complement( extracted_sequence ) )
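A minimal sketch of the complement step, assuming the standard IUPAC pairings (A/T, C/G, R/Y, K/M, B/V, D/H, with S, W, and N self-complementary). The table and function names are hypothetical, and preserving letter case and passing unknown characters through unchanged are assumptions rather than documented behavior.

```javascript
// Standard IUPAC nucleotide complements (uppercase keys).
const IUPAC_COMPLEMENT = {
  A: "T", T: "A", C: "G", G: "C",
  R: "Y", Y: "R", S: "S", W: "W",
  K: "M", M: "K", B: "V", V: "B",
  D: "H", H: "D", N: "N",
};

// Complement one base, keeping its original case (assumption).
function complementBase(base) {
  const upper = base.toUpperCase();
  const comp = IUPAC_COMPLEMENT[upper];
  if (comp === undefined) return base; // unknown characters pass through (assumption)
  return base === upper ? comp : comp.toLowerCase();
}

// Complement every base, then reverse the result.
function reverseComplement(sequence) {
  return Array.from(sequence, complementBase).reverse().join("");
}

console.log(reverseComplement("AAGC")); // "GCTT"
console.log(reverseComplement("acgR")); // "Ycgt"
```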

Keyword: start

1

Keyword: end

sequence length

Keyword: center

Math.round( sequence.length / 2 )

Keyword: length

sequence length

Arithmetic

Simple expressions allowed, e.g. (end − 10)..end
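One way to resolve keywords and evaluate such expressions is a small recursive-descent evaluator. This is a sketch under assumptions: the page does not say which operators the tool supports, so only +, -, and parentheses are handled here, and the names keywordValues, evaluate, and parseEntry are hypothetical.

```javascript
// Keyword values for a sequence of the given length.
function keywordValues(length) {
  return new Map([
    ["start", 1], ["begin", 1],
    ["end", length], ["stop", length], ["length", length],
    ["center", Math.round(length / 2)],
  ]);
}

// Evaluate an expression made of integers, keywords, +, -, and parentheses.
function evaluate(expr, length) {
  const keywords = keywordValues(length);
  // Treat the typographic minus sign (U+2212) as an ordinary hyphen-minus.
  const text = expr.toLowerCase().replace(/\u2212/g, "-");
  const tokens = text.match(/\d+|[a-z]+|[-+()]/g) || [];
  let i = 0;

  function term() {
    const tok = tokens[i++];
    if (tok === "(") {
      const value = sum();
      if (tokens[i++] !== ")") throw new Error("Missing ')'");
      return value;
    }
    if (tok === "-") return -term(); // unary minus
    if (/^\d+$/.test(tok)) return Number(tok);
    if (keywords.has(tok)) return keywords.get(tok);
    throw new Error(`Unexpected token: ${tok}`);
  }

  function sum() {
    let value = term();
    while (tokens[i] === "+" || tokens[i] === "-") {
      const op = tokens[i++];
      const rhs = term();
      value = op === "+" ? value + rhs : value - rhs;
    }
    return value;
  }

  const result = sum();
  if (i !== tokens.length) throw new Error("Unexpected trailing input");
  return result;
}

// Turn one entry ("5" or "x..y") into a { start, stop } pair.
function parseEntry(entry, length) {
  const [from, to] = entry.split("..");
  const start = evaluate(from, length);
  return { start, stop: to === undefined ? start : evaluate(to, length) };
}

console.log(parseEntry("(end-10)..end", 190));            // { start: 180, stop: 190 }
console.log(parseEntry("(center-10)..(center+10)", 190)); // { start: 85, stop: 105 }
```

Parsing the tokens directly, rather than passing the string to eval, keeps arbitrary user input from being executed as code.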

📊 Result Interpretation

Sequences Processed

Number of FASTA records (or bare sequences) found in the input.

Ranges Parsed

Number of valid position or range entries extracted from the Ranges field.
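The two counts above could be produced roughly as follows. This is an illustrative sketch, not the tool's code: parseFasta and splitRangeEntries are hypothetical names, a bare sequence is given an empty header, and entries are only split here, not validated.

```javascript
// Split input into records. Lines starting with ">" open a new record;
// a bare sequence with no header becomes one record with an empty header.
function parseFasta(input) {
  const records = [];
  let current = null;
  for (const line of input.split(/\r?\n/)) {
    const trimmed = line.trim();
    if (trimmed === "") continue;
    if (trimmed.startsWith(">")) {
      current = { header: trimmed.slice(1), sequence: "" };
      records.push(current);
    } else {
      if (current === null) {
        current = { header: "", sequence: "" };
        records.push(current);
      }
      current.sequence += trimmed;
    }
  }
  return records;
}

// Split the Ranges field on commas, dropping empty entries.
function splitRangeEntries(field) {
  return field.split(",").map((s) => s.trim()).filter((s) => s !== "");
}

const records = parseFasta(">seq1\nACGT\nAC\n>seq2\nGGCC");
console.log(records.length);                                // 2
console.log(splitRangeEntries("1, 5, 10..20, end").length); // 4
```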

🔬 Applications

Extracting a specific exon, promoter, or restriction site by coordinate from a genomic sequence
Obtaining the last N bases of a sequence using the expression (end-N+1)..end
Pulling multiple non-contiguous regions and joining them as a synthetic sequence
Highlighting the position of a feature within its genomic context using Uppercased in context mode
Extracting the complement strand sequence of a specified region for primer design

⚠️ Common Mistakes & Warnings

Positions are 1-based

All coordinates follow the biological convention: position 1 is the first base. Coordinates are converted to 0-based indices internally.

Out-of-range positions are skipped

If a position or range end is less than 1 or greater than the sequence length, or if the start is greater than the stop, that range entry is skipped and a warning is shown.
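The skip rule can be sketched as a filter over already-resolved ranges. The function name filterRanges and the warning wording are hypothetical, and the sketch assumes both endpoints of a range must lie within 1..length.

```javascript
// Keep ranges that fit the sequence; collect a warning for each skipped one.
function filterRanges(ranges, length) {
  const valid = [];
  const warnings = [];
  for (const { start, stop } of ranges) {
    if (start < 1 || stop < 1 || start > length || stop > length) {
      warnings.push(`Skipped ${start}..${stop}: outside 1..${length}`);
    } else if (start > stop) {
      warnings.push(`Skipped ${start}..${stop}: start is greater than stop`);
    } else {
      valid.push({ start, stop });
    }
  }
  return { valid, warnings };
}

const { valid, warnings } = filterRanges(
  [
    { start: 1, stop: 5 },  // kept
    { start: 0, stop: 3 },  // start below 1
    { start: 8, stop: 12 }, // stop beyond the sequence
    { start: 6, stop: 2 },  // start greater than stop
  ],
  10
);
console.log(valid.length);    // 1
console.log(warnings.length); // 3
```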

Complement mode on context modes may look unexpected

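In the Uppercased and Lowercased in context modes, the Complement option reverse-complements the entire sequence after case-marking. The marked bases are still the ones you selected, but they appear at mirrored coordinates: a base marked at position p ends up at position (length − p + 1) in the output.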

Non-DNA characters are stripped from the sequence

Before extraction, the input sequence is cleaned to retain only valid IUPAC DNA characters. The range positions apply to the cleaned sequence.
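A minimal sketch of the cleaning step, assuming the retained alphabet is the 15 IUPAC nucleotide codes (ACGT plus RYSWKMBDHVN) in either case. The exact character set the tool keeps, and whether it preserves case, are assumptions; cleanSequence is a hypothetical name.

```javascript
// Remove every character that is not an IUPAC DNA code (case-insensitive).
function cleanSequence(raw) {
  return raw.replace(/[^ACGTRYSWKMBDHVN]/gi, "");
}

const raw = "ACG T-12\nNNx";
const cleaned = cleanSequence(raw);
console.log(cleaned);    // "ACGTNN"

// Coordinates refer to the cleaned sequence, so position 4 is "T",
// even though the 4th character of the raw input is a space.
console.log(cleaned[3]); // "T"
```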

❓ Frequently Asked Questions

Can I extract multiple ranges at once?

Yes. Enter ranges separated by commas, e.g. 1..50, 100..150, end. In "New sequence" mode all ranges are concatenated in order. In "Separate FASTA records" mode each range becomes its own FASTA entry.

What do the keywords start, end, center, and length mean?

start and begin equal 1. end and stop equal the sequence length. center equals Math.round(length / 2). length also equals the sequence length. These can be used in arithmetic expressions, e.g. (center-10)..(center+10).

What is the difference between the four output modes?

"New sequence" concatenates all extracted bases into a single FASTA entry — useful for joining exons or building a synthetic construct. "Separate FASTA records" gives each range its own >base X..Y header. "Uppercased in context" returns the full sequence in lowercase with the specified ranges in uppercase, making it easy to see where the ranges fall. "Lowercased in context" is the reverse — sequence in uppercase with ranges lowercased.
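The four modes can be sketched as three small functions over a list of validated 1-based ranges. All function names are hypothetical. The >base X..Y header follows the description above; the header used for single positions and for the joined sequence is not stated, so the sketch reuses X..Y for single positions and returns only the bases for the joined mode.

```javascript
// Slice a 1-based inclusive range out of the sequence.
function sliceRange(seq, r) {
  return seq.substring(r.start - 1, r.stop);
}

// "New sequence": concatenate all ranges in the order given.
function newSequence(seq, ranges) {
  return ranges.map((r) => sliceRange(seq, r)).join("");
}

// "Separate FASTA records": one header plus sequence per range.
function separateRecords(seq, ranges) {
  return ranges
    .map((r) => `>base ${r.start}..${r.stop}\n${sliceRange(seq, r)}`)
    .join("\n");
}

// "Uppercased in context" (upperInside = true) or
// "Lowercased in context" (upperInside = false).
function markInContext(seq, ranges, upperInside) {
  const inside = new Array(seq.length).fill(false);
  for (const r of ranges) {
    for (let p = r.start; p <= r.stop; p++) inside[p - 1] = true;
  }
  return Array.from(seq, (base, idx) =>
    inside[idx] === upperInside ? base.toUpperCase() : base.toLowerCase()
  ).join("");
}

const seq = "acgtacgtac";
const ranges = [{ start: 2, stop: 4 }, { start: 9, stop: 9 }];
console.log(newSequence(seq, ranges));          // "cgta"
console.log(separateRecords(seq, ranges));      // ">base 2..4\ncgt\n>base 9..9\na"
console.log(markInContext(seq, ranges, true));  // "aCGTacgtAc"
console.log(markInContext(seq, ranges, false)); // "AcgtACGTaC"
```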

What does Complement strand do?

When "Complement" is selected, the extracted sequence is IUPAC-complemented and then reversed, producing the reverse complement. In "New sequence" mode the complement is taken after joining all ranges. In the "Uppercased in context" and "Lowercased in context" modes the full sequence is reverse-complemented after case-marking.
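The order of operations in the two cases can be illustrated with a short sketch. The names are hypothetical, and the complement table assumes the standard IUPAC pairings with letter case preserved.

```javascript
// Build a case-preserving IUPAC complement lookup from base pairs.
const PAIRS = ["AT", "CG", "RY", "KM", "BV", "DH", "SS", "WW", "NN"];
const COMPLEMENT = new Map();
for (const [a, b] of PAIRS) {
  COMPLEMENT.set(a, b).set(b, a);
  COMPLEMENT.set(a.toLowerCase(), b.toLowerCase());
  COMPLEMENT.set(b.toLowerCase(), a.toLowerCase());
}

// Complement each base (unknown characters pass through), then reverse.
function reverseComplement(seq) {
  return Array.from(seq, (b) => COMPLEMENT.get(b) ?? b).reverse().join("");
}

const seq = "aaccggtttt"; // 10 bases

// New sequence mode: join ranges 1..2 and 5..6 first, then reverse-complement.
const joined = seq.substring(0, 2) + seq.substring(4, 6); // "aagg"
console.log(reverseComplement(joined)); // "cctt"

// Uppercased in context: mark range 1..2 first, then reverse-complement
// the whole sequence. The marked bases move to positions 9 and 10.
const marked = "AA" + seq.substring(2); // "AAccggtttt"
console.log(reverseComplement(marked)); // "aaaaccggTT"
```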