MGEScan on Galaxy Installation¶
MGEScan on Galaxy can be installed on a local machine or on the cloud e.g. Amazon EC2. The local installation is for Ubuntu 14.04+ distribution. Others (e.g. OpenSUSE, Fedora) are not verified.
Tip
approximate time: 20 minutes
Preparation¶
There are required software to be installed prior to run MGEScan. You need to
install system packages with sudo
command (admin root
privilege is
required). virtualenv
is used for Python package installation.
root
privilege to install packages with sudo
Quick Installation¶
One-liner command provides a quick installation of required software and configuration.
Warning
This one-liner installation script runs several commands without any further confirmation from you. If you’d like to verify each step, skip this quick installation and follow the installation instuctions below.
curl -L https://raw.githubusercontent.com/MGEScan/mgescan/master/one-liner/ubuntu | bash
Start a Galaxy/MGEscan web server with a default port 38080
.
source ~/.mgescanrc
cd $GALAXY_HOME
nohup sh run.sh &
Note
RepeatMasker is not included.
Note
Default admin account is mgescan_admin@mgescan.com
. Sign up with
this account name and your password.
Normal Installation¶
Software for Python¶
If virtualenv
, git
, and python-dev
are available on your system,
you can skip this step.
Ubuntu
sudo apt-get update
sudo apt-get install python-virtualenv python-dev git -y
Fedora
sudo yum update
sudo yum install python-virtualenv python-devel git -y
Environment Variables¶
MGEScan will be installed on a default directory $HOME/mgescan3
. You can
change it if you prefer other location to install MGEScan.
export MGESCAN_HOME=$HOME/mgescan3
export MGESCAN_SRC=$MGESCAN_HOME/src
export GALAXY_HOME=$MGESCAN_HOME/galaxy
export TRF_HOME=$MGESCAN_HOME/trf
export RM_HOME=$MGESCAN_HOME/RepeatMasker
export MGESCAN_VENV=$MGESCAN_HOME/virtualenv/mgescan
Tip
MGEScan on Galaxy uses version 3 in the naming like mgescan3.
Create a MGESCan start file .mgescanrc
cat <<EOF > $HOME/.mgescanrc
export MGESCAN_HOME=\$HOME/mgescan3
export MGESCAN_SRC=\$MGESCAN_HOME/src
export GALAXY_HOME=\$MGESCAN_HOME/galaxy
export TRF_HOME=\$MGESCAN_HOME/trf
export RM_HOME=\$MGESCAN_HOME/RepeatMasker
export MGESCAN_VENV=\$MGESCAN_HOME/virtualenv/mgescan
EOF
Then include it to your startup file (i.e. .bash_profile
).
echo "source ~/.mgescanrc" >> $HOME/.bash_profile
Create a main directory.
source ~/.mgescanrc
mkdir $MGESCAN_HOME
Software for MGEScan¶
Galaxy Workflow, HMMER (3.1b1), EMBOSS Suite and TRF are required. RepeatMasker is optional.
Galaxy¶
Tip
Make sure that $MGESCAN_HOME is set by echo $MGESCAN_HOME
command.
If you don’t see a path similar to /home/.../mgescan3/
, you have to
define environment variables again.
From Github repository (source code):
cd $MGESCAN_HOME
git clone https://github.com/galaxyproject/galaxy/
HMMER and EMBOSS¶
If you have HMMER
and EMBOSS
on your system, you can skip this step.
Ubuntu
sudo apt-get install hmmer emboss -y
Fedora
- HMMER v3.1b2
sudo yum install gcc -y
wget ftp://selab.janelia.org/pub/software/hmmer3/3.1b2/hmmer-3.1b2-linux-intel-x86_64.tar.gz
tar xvzf hmmer-3.1b2-linux-intel-x86_64.tar.gz
cd hmmer-3.1b2-linux-intel-x86_64
./configure
make
make check
make install
- EMBOSS 6.6.0 (latest)
wget ftp://emboss.open-bio.org/pub/EMBOSS/emboss-latest.tar.gz
tar xvzf emboss-latest.tar.gz
cd EMBOSS-*
./configure
make
make check
make install
Virtual Environments (virtualenv) for Python Packages¶
It is recommended to have an isolated environment for MGEScan Python libraries. virtualenv creates a separated space for MGEScan, and issues from dependencies and versions of Python libraries can be avoided. Note that you have to be in the virtualenv of MGEScan before to run any MGEScan command line tools. The following commands create a virtualenv for MGEScan and enable it on your account.
mkdir -p $MGESCAN_VENV
virtualenv $MGESCAN_VENV
source $MGESCAN_VENV/bin/activate
echo "source $MGESCAN_VENV/bin/activate" >> ~/.bash_profile
Note
Skip the last line echo "source ..."
, if you’d like to enable
mgescan
virtualenv manually.
Tandem Repeats Finder (trf)¶
trf
is a single binary executable file to locate and display tandem repeats
in DNA sequences. MGEScan-LTR requires trf
program.
mkdir -p $TRF_HOME
wget http://tandem.bu.edu/trf/downloads/trf407b.linux64 -P $TRF_HOME
RepeatMasker (Optional)¶
RepeatMasker is a program that screens DNA sequences for interspersed repeats and low complexity DNA sequences. MGEScan-LTR has an option to use RepeatMasker.
mkdir $RM_HOME
wget http://www.repeatmasker.org/RepeatMasker-open-4-0-5.tar.gz
tar xvzf RepeatMasker-open-4-0-5.tar.gz
mv RepeatMasker/* $RM_HOME
ln -s $RM_HOME/RepeatMasker $MGESCAN_VENV/bin/
MGEScan Installation¶
MGEScan can be installed from Github repository (source code):
cd $MGESCAN_HOME
git clone https://github.com/MGEScan/mgescan.git
ln -s mgescan src
cd $MGESCAN_SRC
python setup.py install
Configuration¶
Virtual Environments (virtualenv)¶
Make sure you have loaded your virtual environment for MGEScan by:
source $MGESCAN_VENV/bin/activate
You will see (mgescan)
label on your prompt.
Galaxy Configurations for MGEScan¶
MGEScan github repository contains codes and toolkits for MGEScan on Galaxy.
Prior to run a Galaxy Workflow web server, the codes and toolkits should be
installed in the galaxy
main directory.
cp -pr $MGESCAN_SRC/galaxy-modified/* $GALAXY_HOME
trf¶
To run trf
anywhere under mgescan
virtualenv, we create a symlink in
the bin
directory.
ln -s $TRF_HOME/trf407b.linux64 $MGESCAN_VENV/bin/trf
chmod 700 $MGESCAN_VENV/bin/trf
RepeatMasker¶
RepeatMasker also requires configuration.
Ubuntu
cd $RM_HOME
$RM_HOME/configure
Fedora
sudo yum install perl-Data-Dumper perl-Text-Soundex -y
cd $RM_HOME
$RM_HOME/configure
Outputs like so:
RepeatMasker Configuration Program
This program assists with the configuration of the
RepeatMasker program. The next set of screens will ask
you to enter information pertaining to your system
configuration. At the end of the program your RepeatMasker
installation will be ready to use.
<PRESS ENTER TO CONTINUE>
Galaxy Admin User¶
Declare your email address as a Galaxy admin user name.
export GALAXY_ADMIN=mgescan_admin@mgescan.com
Warning
REPLACE mgescan_admin@mgescan.com
with your email address. You
also have to sign up Galaxy with this email address.
sed -i "s/#admin_users = None/admin_users = $GALAXY_ADMIN/" $GALAXY_HOME/universe_wsgi.ini
Start Galaxy¶
Simple run.sh
script starts a Galaxy web server. First run of the script
takes some time to initialize database.
cd $GALAXY_HOME
nohup sh run.sh &
Note
Default port number : 38080 http://[IP ADDRESS]:38080