EMBL to FASTA
Convert EMBL records to FASTA format

Paste the contents of one or more EMBL files. Each record must begin with an "ID " line and end with "//". Input limit: 200,000,000 characters.

💡 Quick Summary

EMBL to FASTA accepts one or more EMBL records and returns the DNA sequence(s) in FASTA format, removing all annotation, feature table entries, and non-sequence data.

📋 How to Use
  1. Paste the contents of one or more EMBL files into the input area.
  2. Click Convert. The first DE (description) line of each record becomes the FASTA title; the SQ section is extracted, cleaned, and wrapped to 60 characters per line.
  3. The Records Converted and Total Length stats update immediately.
  4. Use the Copy button to copy the FASTA output to your clipboard.
  5. Click Load Example to try the tool with a sample EMBL record.
  6. Click Clear to reset and start again.
🧮 Formulas & Logic
Title
FASTA title = first DE line content, with "" removed
Sequence
Every character that is not a nucleotide code (spaces, digits, newlines, and so on) is stripped from the SQ block, which runs up to the // terminator
Wrapping
Sequence wrapped at 60 characters per line (addReturns)
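
The three rules above can be sketched in Python for a single record. This is a minimal illustration, not the tool's own code: the function names, the "Untitled" fallback, and the choice to strip the leading "DE" line code from the title are assumptions, and the retained character set is taken from the FAQ on this page.

```python
import re

# Assumed character set: the nucleotide codes this page lists as retained.
_NON_NUCLEOTIDE = re.compile(r"[^ACGTURYSWKMBDHVNX]", re.IGNORECASE)


def wrap(sequence: str, width: int = 60) -> str:
    """Break a sequence into lines of at most `width` characters."""
    return "\n".join(sequence[i:i + width] for i in range(0, len(sequence), width))


def embl_record_to_fasta(record: str) -> str:
    """Convert one EMBL record (ID ... //) into one FASTA entry."""
    title = None
    sequence_lines = []
    in_sequence = False
    for line in record.splitlines():
        if line.startswith("//"):
            break  # record terminator: stop reading
        if in_sequence:
            sequence_lines.append(line)
        elif line.startswith("SQ"):
            in_sequence = True  # the SQ header line itself carries no bases
        elif line.startswith("DE") and title is None:
            # First DE line only; dropping the "DE" line code is an assumption.
            title = line[2:].strip()
    if title is None:
        title = "Untitled"  # hypothetical fallback for records without a DE line
    sequence = _NON_NUCLEOTIDE.sub("", "".join(sequence_lines))
    return f">{title}\n{wrap(sequence)}\n"
```

Calling embl_record_to_fasta on a record whose SQ block contains "acgtacgtnn ac        12" keeps only the twelve letters and drops the spaces and the position counter.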
📊 Result Interpretation
Records Converted

Number of EMBL records (ID … //) successfully converted. Each record produces one FASTA entry.

Total Length

Sum of all extracted DNA sequence characters across all converted records.
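
Both statistics can be cross-checked from the FASTA output itself. The sketch below is one way to do that, assuming each entry starts with a ">" title line; fasta_stats is a hypothetical helper name, not part of the tool.

```python
from typing import Tuple


def fasta_stats(fasta_text: str) -> Tuple[int, int]:
    """Return (number of entries, total sequence length) for FASTA text."""
    records = 0
    total_length = 0
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            records += 1  # one title line per converted record
        else:
            total_length += len(line.strip())  # sequence characters only
    return records, total_length
```

The first value should equal Records Converted and the second should equal Total Length.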

🔬 Applications
  • Extracting bare DNA sequences from EMBL flat files for use in BLAST or alignment tools
  • Converting downloaded ENA/EMBL records to FASTA for downstream bioinformatics pipelines
  • Stripping annotation from EMBL files before loading sequences into a sequence editor
  • Batch-converting multiple EMBL records into a single multi-FASTA file
⚠️ Common Mistakes & Warnings
Only the DNA sequence is extracted

Feature annotations (CDS, mRNA, gene coordinates), cross-references, and taxonomy information are discarded. Use the EMBL Feature Extractor tool if you need feature data.

Multi-line DE fields are truncated

Only the first DE line is used as the FASTA title. If the description spans multiple lines, subsequent lines are not appended.
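
To see what the truncation drops, the DE lines of a record can be compared before conversion. This is a small sketch with hypothetical helper names; it assumes every description line starts with the "DE" line code and that the full description is the DE lines joined with single spaces.

```python
from typing import List


def de_lines(record: str) -> List[str]:
    """Return the content of every DE line, without the 'DE' line code."""
    return [line[2:].strip() for line in record.splitlines() if line.startswith("DE")]


def first_de_title(record: str) -> str:
    """Title as described above: the first DE line only."""
    lines = de_lines(record)
    return lines[0] if lines else ""


def full_de_title(record: str) -> str:
    """Full description: all DE lines joined with single spaces."""
    return " ".join(de_lines(record))
```

If the two results differ for a record, the FASTA title will be missing the remainder of the description.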

Input size limit: 200,000,000 characters

This matches the limit of the original Sequence Manipulation Suite (SMS). Very large whole-genome EMBL files may need to be split before processing.
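
One way to split an oversized file is to group whole records into chunks that each stay under the limit. The generator below is a sketch under that assumption; split_embl is a hypothetical name, and it accepts any iterable of lines so that an open file can be passed without loading everything at once.

```python
from typing import Iterable, Iterator


def split_embl(lines: Iterable[str], max_chars: int) -> Iterator[str]:
    """Yield chunks of whole EMBL records, each at most max_chars long.

    A single record longer than max_chars is still yielded on its own.
    Trailing lines that are not closed by "//" are ignored.
    """
    chunk, chunk_size = [], 0  # records collected for the current chunk
    record = []                # lines of the record being read
    for line in lines:
        record.append(line)
        if line.startswith("//"):  # end of one record
            text = "".join(record)
            record = []
            if chunk and chunk_size + len(text) > max_chars:
                yield "".join(chunk)
                chunk, chunk_size = [], 0
            chunk.append(text)
            chunk_size += len(text)
    if chunk:
        yield "".join(chunk)
```

For this tool, max_chars would be 200_000_000; each yielded chunk can then be pasted and converted separately.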

❓ Frequently Asked Questions

Where can I get EMBL-format sequences?
EMBL flat files can be downloaded from the European Nucleotide Archive (ENA) at www.ebi.ac.uk/ena. Search for an accession number and choose the "EMBL" download format.
What is the EMBL flat file format?
EMBL format is a plain-text flat-file format used by the EMBL Nucleotide Sequence Database, which is now part of the European Nucleotide Archive (ENA). Each record begins with "ID " and ends with "//". The DE line holds the description; the SQ block holds the nucleotide sequence.
Can I paste multiple EMBL records at once?
Yes. Paste the full contents of an EMBL file containing any number of records. Each record is converted to a separate FASTA entry in the output.
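
As an illustration of how pasted text divides into records, the sketch below collects every block that starts with a line beginning "ID " and ends at the next line beginning "//". The function name and the regular expression are assumptions for this example, not the tool's own parser.

```python
import re
from typing import List

# A record starts at a line beginning with "ID " and ends at the next line
# beginning with "//". DOTALL lets "." span line breaks; "?" keeps it minimal.
_RECORD = re.compile(r"^ID .*?^//", re.MULTILINE | re.DOTALL)


def split_records(text: str) -> List[str]:
    """Return each 'ID ... //' block found in the pasted text."""
    return _RECORD.findall(text)
```

Each returned block corresponds to one FASTA entry in the output; text outside any block is ignored.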
What characters are kept in the sequence?
Only nucleotide codes are retained: the IUPAC DNA/RNA letters (A T G C U R Y S W K M B D H V N) plus X. Spaces, digits, and formatting characters from the EMBL SQ block are stripped automatically.