MGEScan, which identifies LTR and non-LTR retrotransposons in genome sequences, is available on Galaxy, a web-based scientific workflow platform that supports data analysis with a variety of tools.


This tutorial is a quick start for using MGEScan on Galaxy with a sample dataset, the D. melanogaster genome. A public server at Indiana University (http://silo.cs.indiana.edu:38080) provides sample datasets and the MGEScan tools, so you can try MGEScan on Galaxy without installing anything.


Estimated time: approximately 3 hours and 30 minutes (including 3 hours of computation time)

Run MGEScan-LTR and MGEScan-nonLTR for D. melanogaster

In this tutorial, we will run both MGEScan-LTR and MGEScan-nonLTR on the D. melanogaster genome dataset. You can find the dataset in the Shared Data menu at the top of the page and the MGEScan tools in the left frame.

Access to Galaxy/MGEScan

Open Galaxy/MGEScan in your web browser, using the public server address given above (http://silo.cs.indiana.edu:38080).


Login or Register (Optional)

You can save your work if you have an account on Galaxy. The per-user history in Galaxy/MGEScan stores your data and launched tasks. A guest user can run the MGEScan tools without logging in, but results and history data are not kept once the web browser session is closed.


An email address is required to sign up.



If you already have an account, log in with your user ID and password on the User > Login page.


Get Dataset from Shared Data

You can find sample datasets (e.g. D. melanogaster) in the Shared Data menu at the top of the page. Click “Shared Data” > “Data Libraries” and find “Sample datasets for MGEScan”.

Example: Drosophila melanogaster

In the Data Library, tick the checkbox for d.melanogaster and choose “Select datasets for import into selected histories” from the down-arrow menu at the end.


You will find that 8 FASTA files are available. We need to import all of them, so check every file and click “Import library datasets” in the middle of the page.


Once you have imported the D. melanogaster datasets into your history, you are ready to run the MGEScan tools on Galaxy. Go to the main page and check that the imported datasets (8 files) appear in the right frame of the page.


You can select the history into which the datasets are imported.
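If you also keep local copies of the FASTA files, a quick sanity check of their contents can be sketched in Python as below. This is only an illustration and is not part of MGEScan or Galaxy: the function name `summarize_fasta` and the sample records are hypothetical, and the sketch assumes plain, uncompressed FASTA text.

```python
from typing import Iterable, List, Tuple


def summarize_fasta(lines: Iterable[str]) -> List[Tuple[str, int]]:
    """Return (record_id, sequence_length) for each record in FASTA text."""
    records: List[Tuple[str, int]] = []
    name = None
    length = 0
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                records.append((name, length))
            parts = line[1:].split()
            name = parts[0] if parts else ""  # id = first token after '>'
            length = 0
        elif name is not None:
            length += len(line)  # sequence may span several lines
    if name is not None:
        records.append((name, length))
    return records


# Hypothetical sample; for a real file, pass an open file handle instead.
sample = [">chrDemo1 hypothetical record", "ACGTAC", "GT", ">chrDemo2", "TTTT"]
for record_id, seq_len in summarize_fasta(sample):
    print(record_id, seq_len)
```

For a downloaded file, `summarize_fasta(open(path))` with your own `path` would report one line per sequence, which makes it easy to confirm that each of the 8 files is non-empty.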

Run MGEScan for LTR and nonLTR

In the new version of MGEScan, the two programs, MGEScan-LTR and MGEScan-nonLTR, can be run at the same time and produce a merged result. Open the page at “MGEScan > MGEScan”, where a simple tool is available for LTR and nonLTR runs, with an MPI option for parallel processing.


Use the separate LTR or nonLTR page if you would like to set more detailed options for running the MGEScan tools.

MGEScan Tool

MGEScan runs both LTR and nonLTR on a selected input genome sequence. Find the “MGEScan > MGEScan” tool in the left frame and confirm that the symlink dataset we created in the previous step is loaded in the “From” select form:


Enable MPI

To reduce processing time, select “Yes” in the “Enable MPI” select form and specify the “Number of MPI Processes”. If you have a multi-core system, you can use up to the number of cores. silo.cs.indiana.edu has 24 cores, but we will use 4 in this tutorial to avoid being a noisy neighbor.

Our options are:

  • From: Create a symlink to multiple datasets on data 2 and data 8, and others
  • MGEScan: Both
  • Enable MPI: Yes
  • Number of MPI Processes: 4

Then click “Execute”.
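On your own Galaxy/MGEScan installation, the rule of thumb for “Number of MPI Processes” (no more than the number of cores) can be sketched in Python as follows. The helper name `choose_mpi_processes` is hypothetical and is not part of MGEScan or Galaxy; it only illustrates capping a requested value at the core count.

```python
import os
from typing import Optional


def choose_mpi_processes(requested: int, available_cores: Optional[int] = None) -> int:
    """Cap the requested number of MPI processes at the available core count."""
    if requested < 1:
        raise ValueError("requested must be at least 1")
    if available_cores is None:
        # os.cpu_count() can return None, so fall back to a single core.
        available_cores = os.cpu_count() or 1
    return max(1, min(requested, available_cores))


# The tutorial's setting: 4 processes on a 24-core server.
print(choose_mpi_processes(4, available_cores=24))
# A request above the core count is capped at the core count.
print(choose_mpi_processes(32, available_cores=24))
```

On a shared server it is usually considerate to stay well below the cap, as we do here with 4 of 24 cores.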

Computation Time

Our test case took 3 hours to analyze LTR and nonLTR elements in D. melanogaster:

  • nonLTR: 19 minutes
  • LTR: 3 hours
  • Total: 3 hours
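The total above matches the LTR time alone rather than the sum of the two runs. The short Python sketch below works through both readings. It is only arithmetic on the figures listed here; the assumption that the two programs overlapped (rather than the total simply being rounded) is ours and was not measured.

```python
# Figures from the list above, in minutes.
ltr_minutes = 3 * 60
nonltr_minutes = 19


def fmt(minutes: int) -> str:
    """Format a number of minutes as 'H h MM min'."""
    hours, rest = divmod(minutes, 60)
    return f"{hours} h {rest:02d} min"


# Reading 1: the programs ran one after the other.
sequential_total = ltr_minutes + nonltr_minutes
# Reading 2 (assumption): the programs ran at the same time,
# so the longer run determines the wall-clock time.
concurrent_total = max(ltr_minutes, nonltr_minutes)

print("sequential:", fmt(sequential_total))
print("concurrent:", fmt(concurrent_total))
```

Either way, the LTR step dominates the run time, so plan on roughly 3 hours of waiting for this dataset.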


When the MGEScan tools complete, the output files are accessible via Galaxy in gff3 format, as plain text, or as an archive (e.g. tar.gz). You will notice that the color of your tools has changed to green:


You can download the output files to your local storage, or open them in a genome browser through the provided links.
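Once a gff3 file is downloaded, it can be inspected with a few lines of Python. The sketch below assumes only the standard GFF3 layout (nine tab-separated columns, with comment and directive lines starting with “#”). The function name `parse_gff3` and the sample lines are hypothetical; they are not real MGEScan output, so the actual source, type, and attribute values in your file will differ.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator

GFF3_COLUMNS = (
    "seqid", "source", "type", "start", "end",
    "score", "strand", "phase", "attributes",
)


def parse_gff3(lines: Iterable[str]) -> Iterator[Dict[str, object]]:
    """Yield one dict per feature line of GFF3 text."""
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith("##FASTA"):
            break  # an embedded sequence section follows; stop here
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, comments, and directives
        fields = line.split("\t")
        if len(fields) != len(GFF3_COLUMNS):
            continue  # skip malformed lines rather than guessing
        record: Dict[str, object] = dict(zip(GFF3_COLUMNS, fields))
        record["start"] = int(fields[3])
        record["end"] = int(fields[4])
        yield record


# Hypothetical sample lines, for illustration only.
sample = [
    "##gff-version 3",
    "chrDemo\texample\tfeatureA\t100\t250\t.\t+\t.\tID=demo_1",
    "chrDemo\texample\tfeatureB\t400\t900\t.\t-\t.\tID=demo_2",
]
features = list(parse_gff3(sample))
type_counts = Counter(f["type"] for f in features)
for feature_type, count in sorted(type_counts.items()):
    print(feature_type, count)
```

For a real result, replace `sample` with an open file handle for the downloaded gff3 file; the per-type counts give a quick overview of what was annotated.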

Visualization: UCSC or Ensembl Genome Browser

Your genomic data in Generic Feature Format Version 3 (gff3) can be displayed from Galaxy in a well-known visualization tool such as the UCSC or Ensembl Genome Browser, with the MGEScan results for LTR and nonLTR shown as custom annotations. Use the link provided for the gff3 output to open an interactive graphical display of the genome sequence data.


UCSC Genome Browser (Example View)


Ensembl (Example View)


Additional Options

Other options let you view the results in the web interface or locally.

  • View data: Content of the result file
  • Download: Download the file

Description of tools

Each tool in Galaxy includes a description that explains how to use it.