GloBLAST 1.0 Installation Guide

 

Introduction

This guide explains how to install GloBLAST on a Unix system. For illustration, our samples and screenshots show installation on a Mac OS X system; with slight modifications to the default directory names, the same steps apply to Linux and other Unix platforms.

System Requirements

GloBLAST is a Perl-driven CGI application that relies on several common open-source packages. The version numbers listed below are simply the versions used to build and test this release of GloBLAST. Because these packages are generally backward compatible, GloBLAST should also run with later versions, and may work with some earlier ones.

*       Perl (version 5.6.0)
Source is downloadable from http://www.cpan.org, and binary ports are available at http://www.cpan.org/ports/. The following non-default Perl modules are also required; each may be downloaded individually from http://search.cpan.org/:

o      DBI (version 1.40)
The Perl database interface

o      DBD::mysql (version 2.9003)
Perl5 database interface to MySQL database

o      CGI (version 3.04)
Perl5 library for writing www CGI scripts

o      CGI::Application (version 3.1)
Framework for building reusable web-applications

o      HTML::Template (version 2.6)
Reusable HTML Templates

o      File::Temp (version 0.14)
Perl module to generate temporary files

o      Bundle::libnet (version 1.00) (optional)
This module is required if a mailserver exists, to provide a programming interface to use Simple Mail Transfer Protocol for email notification.

*       Either WU-Blast 2.0 (http://blast.wustl.edu/) or
NCBI Blast2 (ftp://ftp.ncbi.nih.gov/blast/)

*       RepeatMasker (optional)

Masks common biological repeats before BLASTing, to reduce false hits. This package is not publicly available but may be requested from the author at http://ftp.genome.washington.edu/RM/RepeatMasker.html

*       Platform Load Sharing Facility (optional)

Distributes BLAST jobs across a clustered network. This commercial product may be purchased at http://www.platform.com/products/LSF/

*       Portable Batch System (optional)

Another well-established clustering system, available at http://www.openpbs.org/

*       Sun Grid Engine (optional)

A free, open-source alternative to LSF and PBS. Binaries for most platforms are available at http://gridengine.sunsource.net/

*       Apache HTTP Server (version 1.3.27)

Its source tarball can be obtained from http://httpd.apache.org/download.cgi

*       MySQL (version 3.23.53)

A very fast and popular open-source relational database server, downloadable from http://www.mysql.com/downloads/

[Note: GloBLAST has been designed and tested to work out-of-the-box with both Apache and MySQL, but with minimal changes will likely work with any SQL database and Perl/CGI enabled web-server]
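The Perl module requirements listed above can be checked from the shell before installation. The following is a minimal sketch, not part of the GloBLAST distribution: the helper name check_perl_module is hypothetical, and Net::SMTP is tested in place of the optional Bundle::libnet on the assumption that libnet is what supplies it on your system.

```shell
#!/bin/sh
# check_perl_module MODULE: print whether MODULE loads, with its version if it
# declares one. Returns 1 when the module (or perl itself) is missing.
# This is a hypothetical helper, not shipped with GloBLAST.
check_perl_module() {
    mod="$1"
    if perl -M"$mod" -e 1 2>/dev/null; then
        ver=$(perl -M"$mod" -e "print \$${mod}::VERSION" 2>/dev/null)
        echo "$mod ${ver:-unknown} found"
    else
        echo "$mod MISSING"
        return 1
    fi
}

# Net::SMTP stands in for the optional Bundle::libnet.
missing=0
for m in DBI DBD::mysql CGI CGI::Application HTML::Template File::Temp Net::SMTP; do
    check_perl_module "$m" || missing=1
done
[ "$missing" -eq 0 ] || echo "Install the missing modules from CPAN before continuing."
```

The reported versions can then be compared by eye against the tested versions in the list above.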

Obtaining GloBLAST

GloBLAST can be obtained at http://www.bioinformatics.tll.org.sg/GloBLAST as a tarball or zip file, or via anonymous CVS at cvs.bioinformatics.tll.org.sg. (If you do not have CVS installed, please download a binary from http://www.cvshome.org.)

1.     Identify a target directory for GloBLAST installation on your filesystem. This can technically be anywhere, as long as the user has write access into that directory. However, we recommend that the installation be placed close to the webroot and CGI executable directory of the web server, for easy management. On Mac OS X systems, web documents are placed in /Library/WebServer by default. On other Unix systems, the location is probably /usr/local/apache.

2.     We recommend creating a Modules subdirectory to contain the project sources, e.g.

> mkdir /Library/WebServer/Modules

3.     Go to that newly created directory (let's refer to it as $MOD_HOME), e.g.

> cd /Library/WebServer/Modules

4.     Untar or unzip the downloaded package into this directory, e.g.

> tar xzf GloBLAST-1.0.tar.gz          or

> unzip GloBLAST-1.0.zip

Alternatively, if you wish to try out the latest development version of GloBLAST, you may log in to our CVS server as user 'cvs' with password 'cvsuser':

> cvs -d :pserver:cvs@cvs.bioinformatics.tll.org.sg:/Users/cvs login

> cvs -d :pserver:cvs@cvs.bioinformatics.tll.org.sg:/Users/cvs checkout GloBLAST

5.     Within $MOD_HOME (which is /Library/WebServer/Modules in our example), a directory named GloBLAST would have been created, with the following structure:

$MOD_HOME/GloBLAST
|-- Bio                 (Perl modules under Bio::GloBLAST)
|-- conf                (User configuration files)
|-- cgi-bin             (Wrapper for the CGI Application)
|-- htdocs              (Project webroot, to be symlinked or copied)
|   |-- images
|   |-- jscript
|-- static
|   |-- include         (User modifiable static HTML pages)
|   |-- jscript
|   |-- templates
|-- sql                 (Database creation scripts)

6.     Create a symbolic link from a desired location within your web server's document root (let's call it $WEB_ROOT) to $MOD_HOME/GloBLAST/htdocs. On Mac OS X the document root is set to /Library/WebServer/Documents in /etc/httpd/httpd.conf; on other Unix systems it usually takes the default value /usr/local/apache/htdocs, set in /usr/local/apache/conf/httpd.conf. For example, if you wish the project's URL to be http://localhost/GloBLAST, create the following symbolic link:

> ln -s $MOD_HOME/GloBLAST/htdocs $WEB_ROOT/GloBLAST

7.     Similarly, create a symbolic link from a desired location within your web server's CGI directory (let's call it $CGI_DIR) to $MOD_HOME/GloBLAST/cgi-bin. (On Mac OS X, the default CGI directory is /Library/WebServer/CGI-Executables; on other Unix systems, it is usually /usr/local/apache/cgi-bin.) To make the CGI application available at http://localhost/cgi-bin/GloBLAST/service, create the link as follows:

> ln -s $MOD_HOME/GloBLAST/cgi-bin $CGI_DIR/GloBLAST

8.     The "service" component of the URL above is the filename of the CGI application script within $MOD_HOME/GloBLAST/cgi-bin; it can be renamed if necessary.

9.     Edit the service script to reflect the value of $MOD_HOME if it differs from the default value of /Library/WebServer/Modules. Just modify the unshift statement to read unshift @INC, '$MOD_HOME/GloBLAST'; with $MOD_HOME replaced by your actual path.

10.  Test the CGI application by pointing your browser to its URL (which depends on steps 6 and 7 above). In our example, that would be http://localhost/GloBLAST

11.  If the application fails to launch, take a look at the web server logs to figure out the exact nature of the error. The most likely problem is that symbolic links are not allowed within the document root and/or the CGI directory. If this is the case you might have to:

a.     remove the previous links, i.e.
> rm $WEB_ROOT/GloBLAST  and/or
> rm $CGI_DIR/GloBLAST

b.     physically create those directories, i.e.

> mkdir $WEB_ROOT/GloBLAST  and/or
> mkdir $CGI_DIR/GloBLAST

c.     manually copy over the contents of htdocs and cgi-bin into the directories (recursively, since htdocs contains subdirectories), i.e.
> cp -R $MOD_HOME/GloBLAST/htdocs/* $WEB_ROOT/GloBLAST   and/or
> cp -R $MOD_HOME/GloBLAST/cgi-bin/* $CGI_DIR/GloBLAST

12.  Now the CGI application should be visible from the browser, and you can proceed with creating a new MySQL database for storing user account information. You should already have a MySQL server running and accessible, if necessary via authentication options such as [-h hostname | -u username | -ppassword] (the MySQL client programs accept no space between -p and the password).

a.     Choose a name for the new database, e.g. "service", and create it
> mysqladmin [auth options] create service

b.     Load the schema in sql/service.sql into the newly created MySQL database
> mysql [auth options] service < $MOD_HOME/GloBLAST/sql/service.sql

c.     Populate it with default data
> mysqlimport [auth options] service $MOD_HOME/GloBLAST/sql/db.txt

d.     Create a GloBLAST administrator account to manage the GloBLAST app
> mysql [auth options] service
mysql> insert into user values (1,'admin_name',password('password'),'full_name','organization',1,'email');
mysql> insert into privilege values (1,1,'admin');

** Replace the placeholders admin_name, password, full_name, organization, and email with your own administrator particulars.

13.  If you'd like to, replace the header image (800 x 50 pixels) at $MOD_HOME/GloBLAST/htdocs/images/blastserver_title.jpg with your own image.
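Steps 6, 7, and 11 above can be sketched as a single shell helper: link by default, and fall back to a real copy on servers that refuse to follow symbolic links. This is only an illustrative sketch; the function name publish_dir and its "copy" argument are hypothetical and not part of GloBLAST, and the paths in the commented example are the Mac OS X defaults used in this guide.

```shell
#!/bin/sh
# publish_dir SRC DEST [copy]
# Expose SRC at DEST through a symbolic link (steps 6 and 7). With the third
# argument "copy", create DEST as a real directory and copy the contents of
# SRC into it instead (step 11).
# publish_dir is a hypothetical helper, not part of the GloBLAST distribution.
publish_dir() {
    src="$1"; dest="$2"; mode="${3:-link}"
    if [ ! -d "$src" ]; then
        echo "no such directory: $src" >&2
        return 1
    fi
    # Remove a link left over from an earlier attempt (step 11a).
    if [ -L "$dest" ]; then
        rm "$dest"
    fi
    if [ "$mode" = "copy" ]; then
        # "$src"/. copies the directory contents, including subdirectories.
        mkdir -p "$dest" && cp -R "$src"/. "$dest"/
    else
        ln -s "$src" "$dest"
    fi
}

# Example with the Mac OS X defaults from this guide (adjust for your system):
#   MOD_HOME=/Library/WebServer/Modules
#   WEB_ROOT=/Library/WebServer/Documents
#   CGI_DIR=/Library/WebServer/CGI-Executables
#   publish_dir "$MOD_HOME/GloBLAST/htdocs"  "$WEB_ROOT/GloBLAST"
#   publish_dir "$MOD_HOME/GloBLAST/cgi-bin" "$CGI_DIR/GloBLAST"
```

If the browser test in step 10 fails because of symbolic links, rerunning the same two calls with "copy" as a third argument replaces each link with a copied directory.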


GloBLAST Parameter Configuration

Edit the configuration file $MOD_HOME/GloBLAST/conf/config.pm to adjust the settings below according to your system configuration.

 

'web' settings

url      : the value of $WEB_HOME above

auto_acc : binary flag for auto-activation of new user accounts [0 | 1]

 

'mysql' settings

database : name of the database created above (e.g. service)

host     : host name of your MySQL server (may be localhost)

port     : port used by your MySQL server (usually 3306)

user     : user name used to connect to your MySQL server

password : password used to connect to your MySQL server
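
The host, port, and user settings above correspond to the standard mysql client options -h, -P, and -u, so the values can be tried from the shell before they are written into config.pm. The sketch below only assembles those options; the helper name mysql_opts and the example values (localhost, 3306, globlast) are hypothetical.

```shell
#!/bin/sh
# mysql_opts HOST PORT USER: print the mysql client options matching the
# host, port, and user settings.
# mysql_opts is a hypothetical helper; the values passed to it are examples.
mysql_opts() {
    printf '%s %s %s %s %s %s\n' -h "$1" -P "$2" -u "$3"
}

# Example, commented out because it needs a running MySQL server.
# A bare -p makes the client prompt for the password.
#   mysql $(mysql_opts localhost 3306 globlast) -p -e 'SELECT 1' service
```

If the commented command returns a single row, the same four values should be usable in the 'mysql' settings.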

 

'blast' settings

type      : BLAST variant to use [wublast | ncbi]

max_seq   : maximum number of BLAST sequences allowed at one time

blast_exe : full path to your BLAST executables

sys_batch : clustering system to use [LSF | PBS | SunGrid | local]

matrix    : a list of matrix files available in your system

filter    : filter files available to WU-BLAST (not required for NCBI)

 

'path' settings

module_dir       : the value of the $MOD_HOME variable above

tmp_dir          : directory to store all temp files

blast_dir        : globally mounted directory to store results (writable by every node)

users_dir        : globally mounted, writable directory to store user blast databases

matrix_dir       : full path to matrix location

filter_dir       : full path to filter location if wublast program is used

repeatmasker_dir : bin path of ProcessRepeat program, if any

session_dir      : directory to store session files

htdocs_loc       : webpath to GloBLAST htdocs (e.g. /GloBLAST)
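
Several of the path settings above name directories that GloBLAST writes into, so a quick check of each one can save a debugging round later. The following is a minimal sketch under two assumptions: that tmp_dir, blast_dir, users_dir, and session_dir are the directories that must be writable, and that the check is run as the account your web server uses. The helper name check_writable_dir and the example paths are hypothetical.

```shell
#!/bin/sh
# check_writable_dir NAME PATH: report whether PATH exists and is writable by
# the current user. NAME is only used to label the message.
# check_writable_dir is a hypothetical helper, not part of GloBLAST.
check_writable_dir() {
    if [ -d "$2" ] && [ -w "$2" ]; then
        echo "$1: ok ($2)"
    else
        echo "$1: missing or not writable ($2)"
        return 1
    fi
}

# Example paths only; substitute the values from your own config.pm.
#   check_writable_dir tmp_dir     /tmp
#   check_writable_dir blast_dir   /shared/globlast/results
#   check_writable_dir users_dir   /shared/globlast/users
#   check_writable_dir session_dir /tmp/globlast-sessions
```

On a cluster, blast_dir and users_dir would also need the same check on each compute node, since they are described as globally mounted.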

 

'LSF' settings

nodes      : number of nodes allocated for the queue name you specify below

queue_name : queue used to run the jobs in LSF platform

LSF_path   : bin path of LSF

LSF_etc    : etc path of LSF

LSF_lib    : lib path of LSF

LSF_uid    : uid path of LSF

LSF_conf   : conf path of LSF

           

'mail' settings

webmaster   : email address of webmaster

mail_server : mail server to use, if any