Please note that these documents are for an OBSOLETE version of the Globus Toolkit. For more information, see 5.2 End of Life.

Installing GT 5.2.2

Introduction

This guide is the starting point for everyone who wants to install Globus Toolkit 5.2.2. It will take you through a basic installation that installs the following basic services: a security infrastructure (GSI), GridFTP, and Execution Services (GRAM5).

This guide is also available as a PDF. However, each component includes online reference material, which this guide sometimes links to.


Chapter 1. Before you begin

Before you start installing the Globus Toolkit 5.2.2, there are a few things you should consider. The toolkit contains several subcomponents, and you may only be interested in some of them.

The Globus Toolkit version 5.2.2 includes:

  • GSI: security
  • GridFTP: file transfer
  • GRAM: job execution/resource management
  • MyProxy: credential repository/certificate authority
  • GSI-OpenSSH: GSI secure single sign-on remote shell

Important

These all run on Unix platforms only.

If you are new to the toolkit and want to experiment with the components, you may want to use a supported RedHat-based or Debian-based Linux system. With the newly supported native packaging installs, these are the simplest platforms on which to install GT services.

Chapter 2. Installing GT 5.2.2

1. Installing from Native Linux Packages

1.1. Enabling the Globus Repository for your distribution

The GT 5.2.2 release provides source and binary RPM packages for CentOS 4, 5, and 6; Fedora 15, 16, and 17; RedHat Enterprise Server 5 and 6; and Scientific Linux 5 and 6, and a set of .deb packages for several Debian and Ubuntu versions, including Debian 6.0 "squeeze" and 7.0 testing, and Ubuntu 10.04LTS, 10.10, 11.04, 11.10, and 12.04LTS.

This section shows how to set up and use the Globus package repositories (RPM and deb). If your distribution has Globus 5.2.2 packages within its own repository, you can skip to "Installing the Toolkit".

The repo configuration packages for the various binary (RPM and deb) repositories can be found at the package download directory. These packages contain the APT or YUM repository definition, the public key used to verify the packages, and a yum group or tasksel group definition file for more easily installing bundles of related packages.

The RPMS in the OS-specific repositories are located at stable repository and testing repository.

The Debian packages in the OS-specific repositories are located at stable repository and testing repository.

The stable repository contains the current release plus all published update packages. The testing repository contains the versions from the latest successful automated build. Packages in the testing repository will be updated more frequently than those in the stable repository.

To install from binary RPMs, get the appropriate configuration rpm from the link above and install it with:

# rpm -i Globus-5.2-stable-config.distro.noarch.rpm

To install from binary debs, get the appropriate configuration deb from the link above and install it with:

# dpkg -i globus-repository-5.2-stable-distro_0.0.3_all.deb
# apt-get update

1.2. Installing the Toolkit

The components of the toolkit can be installed separately or all at once. This section shows how to install various components on both RPM-based and Debian-based Linux systems.

1.2.1. Install Toolkit components on RPM-based systems

Using yum:

  • Install GridFTP client

    # yum install globus-data-management-client
  • Install GRAM client

    # yum install globus-resource-management-client
  • Install GridFTP server

    # yum install globus-data-management-server
  • Install GRAM server

    # yum install globus-resource-management-server

    This will install GRAM, but only with a fork LRM. To install a PBS LRM using the scheduler event generator, for example:

    # yum install globus-gram-job-manager-pbs-setup-seg

  • Install GridFTP server and client

    # yum install globus-gridftp
  • Install GRAM server and client

    # yum install globus-gram5

You can also install any given package or set of packages using

# yum install PACKAGENAME

1.2.2. Install Toolkit components on Debian-based systems

Using tasksel:

  • Install GridFTP client

    # tasksel install globus-data-management-client
  • Install GRAM client

    # tasksel install globus-resource-management-client
  • Install GridFTP server

    # tasksel install globus-data-management-server
  • Install GRAM server

    # tasksel install globus-resource-management-server

    This will install GRAM, but only with a fork LRM. To install a PBS LRM using the scheduler event generator, for example:

    # apt-get install globus-gram-job-manager-pbs-setup-seg
  • Install GridFTP server and client

    # tasksel install globus-gridftp
  • Install GRAM server and client

    # tasksel install globus-gram5

You can also install any given package or set of packages using

# apt-get install PACKAGENAME

1.2.3. Toplevel targets

The toplevel targets for a yum or tasksel install are

  • globus-gridftp
  • globus-gram5
  • globus-gsi
  • globus-data-management-server
  • globus-data-management-client
  • globus-data-management-sdk
  • globus-resource-management-server
  • globus-resource-management-client
  • globus-resource-management-sdk
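The install commands in the two preceding sections differ only in the front end: yum on RPM-based systems and tasksel on Debian-based ones. As a rough sketch, a small wrapper can print the matching command for a toplevel target. The function name globus_install_cmd and its arguments are hypothetical and not part of the toolkit, and the script only prints the command rather than running it.

```sh
#!/bin/sh
# Hypothetical helper: print (not run) the install command for a toplevel target.
# $1 = toplevel target, e.g. globus-gridftp
# $2 = optional front end ("yum" or "tasksel"); auto-detected when omitted.
globus_install_cmd() {
    target=$1
    frontend=${2:-}
    if [ -z "$frontend" ]; then
        if command -v yum >/dev/null 2>&1; then
            frontend=yum
        elif command -v tasksel >/dev/null 2>&1; then
            frontend=tasksel
        else
            echo "no supported package front end found" >&2
            return 1
        fi
    fi
    case $frontend in
        yum)     echo "yum install $target" ;;
        tasksel) echo "tasksel install $target" ;;
        *)       echo "unknown front end: $frontend" >&2; return 1 ;;
    esac
}

# Example: show the command for the GridFTP server-and-client bundle on an RPM system.
globus_install_cmd globus-gridftp yum
```

Printing the command first makes it easy to review before running it as root.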

Your next step is to set up security, which includes picking a CA to trust, getting host certificates and user certificates, and creating a grid-mapfile. The next three chapters cover these topics.

With security set up, you may start a GridFTP server and configure GRAM5. You may also start a GSI-OpenSSH daemon or set up a MyProxy server. The following chapters explain how to configure these technologies. If you follow the chapters in order, you will perform the tasks in dependency order.

1.3. Updating a Globus Installation

Starting with GT 5.2, the package repositories included with the repo configuration packages will have updates enabled. That means that all major bug fixes and security issues for GT 5.2.2 can be easily installed via yum or apt-get. These updates will be published in the GT 5.2 updates rss feed. Also, this means that when the next point release is made, collecting other minor bug fixes, the upgrade can be done via yum or apt-get without installing a new repository definition package.

Note

To disable the automatic update feature for Debian-based distributions, comment out the 5.2.2 Updates deb and deb-src lines in /etc/apt/sources.list.d/globus-stable.list. To disable the automatic update feature for RPM-based distributions, locate the Globus-updates section of the /etc/yum.repos.d/Globus-stable-config.OS.repo file, and modify it so the enabled=1 line reads enabled=0.
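For RPM-based systems, the change described in the note can be sketched with sed. This is only an illustration under assumptions: the layout of the sample file below is invented, the real file name contains an OS-specific component, and the function name disable_globus_updates is hypothetical. The function writes the modified text to standard output, so you can inspect the result before replacing the real file.

```sh
#!/bin/sh
# Flip the enable flag to 0 inside the [Globus-updates] section only.
# Accepts either spelling of the key ("enable" or "enabled").
# $1 = path to a yum .repo file; the modified text is written to stdout.
disable_globus_updates() {
    sed '/^\[Globus-updates\]/,/^\[/ s/^\(enable[d]*\)[ ]*=[ ]*1/\1=0/' "$1"
}

# Demonstration on a made-up sample file (every section name other than
# Globus-updates is invented for illustration).
sample=$(mktemp)
cat > "$sample" <<'EOF'
[Globus-stable]
enabled=1
[Globus-updates]
name=Globus updates
enabled=1
EOF
disable_globus_updates "$sample"
rm -f "$sample"
```

The address range limits the substitution to the Globus-updates section, so other repository sections in the same file stay enabled.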

2. Installation from Source Installer

Note

Installing using the Source Installer is only recommended on platforms for which native packages are not available. If you are installing onto a RedHat-based or Debian-based Linux system, please see the section above.

Note

Make sure you check out Platform Notes for specific installation information related to your platform.

2.1. Required software

To build the Globus Toolkit from the source installer, first download the source from the download page, and be sure you have all of the following prerequisites installed.

This table shows specific package names (where available) for systems supported by GT 5.2.2:

Prerequisite | Reason | RedHat-based Systems | Debian-based Systems | Solaris 11 | Mac OS X
C Compiler | Most of the toolkit is written in C, using C99 and POSIX.1 features and libraries. | gcc | gcc | pkg:/developer/gcc-45 or Solaris Studio 12.3 | XCode
GNU or BSD tar | GPT uses the -z option to manipulate compressed tar files. | tar | tar | pkg:/archiver/gnu-tar | (included in OS)
GNU or BSD sed | Standard sed does not support long enough lines to process autoconf-generated scripts and Makefiles. | sed | sed | pkg:/text/gnu-sed | (included in OS)
GNU Make | Standard make does not support long enough lines to process autoconf-generated makefiles. | make | make | pkg:/developer/build/gnu-make | (included in XCode)
libltdl | The Globus Toolkit uses this library to portably load shared libraries. | libtool-ltdl-devel | libltdl-dev | pkg:/library/libtool/libltdl | Included in XCode for MacOS X 10.5-10.7; for newer versions, you must install it yourself. See OS X Platform Notes for more information.
OpenSSL 0.9.7 or higher | GSI security uses OpenSSL's implementation of the SSL protocol and X.509 certificates. | openssl-devel | libssl-dev | pkg:/library/security/openssl | (included in base OS)
Perl 5.10 or higher | GPT and parts of GRAM5 are written in Perl. | perl | perl | pkg:/runtime/perl-512 | (included in base OS)
Archive::Tar 0.22 or higher | GPT uses Archive::Tar to manipulate packages. | perl-Archive-Tar | perl-modules | pkg:/runtime/perl-512 | (included in base OS)
Compress::Zlib 1.21 or higher | GPT uses Compress::Zlib to deal with compressed packages. | perl-Compress-Zlib | perl-modules | pkg:/runtime/perl-512 | (included in base OS)
Digest::MD5 2.20 or higher | GPT uses Digest::MD5 to compute package digests. | perl | perl | pkg:/runtime/perl-512 | (included in base OS)
File::Spec 0.8 or higher | GPT uses File::Spec indirectly via Pod::Parser. | perl | perl-base | pkg:/runtime/perl-512 | (included in base OS)
IO::Zlib 1.1 or higher | GPT uses IO::Zlib to deal with compressed packages. | perl-IO-Zlib | perl-modules | pkg:/runtime/perl-512 | (included in base OS)
Pod::Parser 1.18 or higher | GPT uses Pod::Parser to generate command-line help screens. | perl | perl-modules | pkg:/runtime/perl-512 | (included in base OS)
Test::Simple | Globus Toolkit tests use this. | perl-Test-Simple | perl-modules | Install Test::Simple from CPAN | (included in base OS)
XML::Parser | GPT uses this. | perl-XML-Parser | libxml-parser-perl | pkg:/library/perl-5/xml-parser-512 | (included in base OS)

Note

In order to use the GNU versions of sed, tar, and make on Solaris, put /usr/gnu/bin at the head of your path. Also, to use all of the perl executables, add /usr/perl5/bin to your path.
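Before starting a source build, it can help to check which of the prerequisites listed above are visible in the current environment. The following is a minimal sketch, not an official check: the function names are hypothetical, it looks for gcc even though another C compiler may be appropriate on your platform, and it tests neither version numbers nor development headers such as those for libltdl and OpenSSL.

```sh
#!/bin/sh
# Report which build tools are visible on PATH.
# Prints one "ok NAME" or "MISSING NAME" line per tool; returns 1 if any is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "ok $tool"
        else
            echo "MISSING $tool"
            missing=1
        fi
    done
    return $missing
}

# Report which Perl modules can be loaded by the perl found on PATH.
check_perl_modules() {
    missing=0
    for mod in "$@"; do
        if perl -M"$mod" -e 1 >/dev/null 2>&1; then
            echo "ok $mod"
        else
            echo "MISSING $mod"
            missing=1
        fi
    done
    return $missing
}

check_tools gcc tar sed make perl openssl || echo "some tools are missing"
check_perl_modules Archive::Tar Compress::Zlib Digest::MD5 File::Spec \
    IO::Zlib Pod::Parser Test::Simple XML::Parser || echo "some Perl modules are missing"
```

On Solaris, run the check after adjusting your path as described in the note above, so that the GNU tools and the perl executables are the ones found.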

2.2. Installing from Source Installer

  1. Create a user named globus. This non-privileged user will be used to perform administrative tasks such as deploying services. Pick an installation directory, and make sure this account has read and write permissions in that directory.

    Tip

    You might need to create the target directory as root, then chown it to the globus user:

    # mkdir /usr/local/globus-5.2.2
    # chown globus:globus /usr/local/globus-5.2.2

    Important

    If for some reason you do not create a user named "globus", be sure to run the installation as a non-root user. In that case, make sure to pick an install directory that your user account has write access to.

  2. Download the required software noted in Section 2.1, “Required software”.

  3. The Globus Toolkit Source Installer installs to /usr/local/globus-5.2.2 by default. To install to a different directory, set the prefix when you configure.

    As the globus user, run:

    globus$ ./configure --prefix=<YOUR_PREFIX_DIRECTORY>

    You can use command line arguments to ./configure for a more custom install. Here are the lines to enable features which are disabled by default:

    Optional Packages:
    [...]
    --with-gsiopensshargs="args"
    Arguments to pass to the build of GSI-OpenSSH, like
    --with-tcp-wrappers

    For a full list of options, see ./configure --help. For a list of GSI-OpenSSH options, see Optional Build-Time Configuration for GSI-OpenSSH. For more information about our packaging or about choosing a flavor, see Packaging Details for Installing GT.

  4. Run:

    globus$ make

    Note that this command can take several hours to complete. If you wish to have a log file of the build, use tee:

    globus$ make 2>&1 | tee build.log

    The syntax above assumes a Bourne shell. If you are using another shell, redirect stderr to stdout and then pipe it to tee.

    Note

    Using make in parallel mode (-j) is not entirely safe, and is not recommended.

  5. Finally, run:

    globus$ make install

    This completes your installation. Now you may move on to the configuration sections of the following chapters.

    We recommend that you install any security advisories available for your installation, which are available from the Advisories page. You may also be interested in subscribing to some mailing lists for general discussion and security-related announcements.
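The configure, make, and make install steps above can be collected into one script. The sketch below is an illustration only: the function names run and build_globus and the DRY_RUN switch are hypothetical, and with DRY_RUN=1 (the default here) the commands are printed rather than executed. Unlike the tee example above, it redirects the build output to a file so that the exit status of make is preserved.

```sh
#!/bin/sh
# Sketch of the configure / make / make install sequence.
# With DRY_RUN=1 (the default here) commands are printed instead of executed.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# $1 = installation prefix. Run from the unpacked source installer directory
# as the non-root globus user.
build_globus() {
    prefix=${1:-/usr/local/globus-5.2.2}
    run ./configure --prefix="$prefix" || return 1
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ make > build.log 2>&1"
    else
        # Log to a file; no -j, since parallel make is not recommended.
        make > build.log 2>&1 || return 1
    fi
    run make install
}

build_globus /usr/local/globus-5.2.2
```

Set DRY_RUN=0 in the environment only once the printed commands look right for your system.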

Your next step is to set up security, which includes picking a CA to trust, getting host certificates and user certificates, and creating a grid-mapfile. The next three chapters cover these topics.

With security set up, you may start a GridFTP server and configure GRAM5. You may also start a GSI-OpenSSH daemon or set up a MyProxy server. The following chapters explain how to configure these technologies. If you follow the chapters in order, you will perform the tasks in dependency order.

2.3. Updating an Installation

The updates available in the native packages described above are also published as GPT source packages on the updates page. To install update packages, use the command

globus$ gpt-build -update package-name flavors

For the update command, package-name is the full path to the update tarball you've downloaded, and flavors is the list of binary flavors that you have installed (typically gcc32dbg or gcc64dbg).

Chapter 3. Basic Security Configuration

1. Obtain host certificates

You must have X.509 certificates (referred to in this documentation as host certificates) to use the GT 5.2.2 software securely. For an overview of certificates for GSI (security), see GSI Configuration Information and GSI Environmental Variables.

If you will need to be interoperable with other sites, you will need to obtain certs from a trusted Certificate Authority, such as those that are included in IGTF. If you are simply testing the software on your own resources, SimpleCA offers an easy way to create your own certificates (see section below).

Host certificates must:

  • consist of the following two files: hostcert.pem and hostkey.pem
  • be in the appropriate directory for secure services: /etc/grid-security/
  • be for a machine which has a consistent name in DNS; you should not run it on a computer using DHCP where a different name could be assigned to your computer.
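The file requirements in the list above can be checked with a few lines of shell. This is a minimal sketch with a hypothetical function name; it only confirms that both files exist and says nothing about whether the certificate is valid or matches the host's DNS name.

```sh
#!/bin/sh
# Check that a directory holds both host credential files named in the list above.
# $1 = directory (defaults to /etc/grid-security); returns 0 only if both exist.
check_host_cert_files() {
    dir=${1:-/etc/grid-security}
    status=0
    for f in hostcert.pem hostkey.pem; do
        if [ -f "$dir/$f" ]; then
            echo "found $dir/$f"
        else
            echo "missing $dir/$f"
            status=1
        fi
    done
    return $status
}

check_host_cert_files /etc/grid-security || echo "host credentials are incomplete"
```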

You have the following options:

1.1.  Request a certificate from an existing CA

Your best option is to use an already existing CA. You may have access to one from the company you work for or an organization you are affiliated with. Some universities provide certificates for their members and affiliates. Contact your support organization for details about how to acquire a certificate. You may find your CA listed in the TERENA Repository.

If you already have a CA, you will need to follow its configuration directions. If the CA provides a setup package, follow the CA's instructions on how to install it. If it does not, you will need to create an /etc/grid-security/certificates directory and include the CA cert and signing policy in that directory. See Configuring a Trusted CA for more details.

This type of certificate is best for service deployment and Grid inter-operation.

1.2. SimpleCA

SimpleCA provides a wrapper around the OpenSSL CA functionality and is sufficient for simple Grid services. Alternatively, you can use OpenSSL's CA.sh command on its own. Instructions on how to use the SimpleCA can be found in Installing SimpleCA.

SimpleCA is suitable for testing or when a certificate authority is not available.

2. Add authorization

Installing Globus services on your resources doesn't automatically authorize your local users to use these services. Each user must have their own user certificate, and each user certificate must be mapped to a local account.

Add authorizations for users:

Create /etc/grid-security/grid-mapfile as root.

You need two pieces of information:

  • the subject name of a user
  • the account name it should map to.

The syntax is one line per user, with the certificate subject followed by the user account name.

Run grid-cert-info to get your subject name, and whoami to get the account name:

gtuser$ grid-cert-info -subject
/O=Grid/OU=GlobusTest/OU=simpleCA-mayed.mcs.anl.gov/OU=mcs.anl.gov/CN=GT User
gtuser$ whoami
gtuser

You may add the line by running the following as root:

root# $GLOBUS_LOCATION/sbin/grid-mapfile-add-entry -dn \
"/O=Grid/OU=GlobusTest/OU=simpleCA-mayed.mcs.anl.gov/OU=mcs.anl.gov/CN=GT User" \
-ln gtuser

The corresponding line in the grid-mapfile should look like:

"/O=Grid/OU=GlobusTest/OU=simpleCA-mayed.mcs.anl.gov/OU=mcs.anl.gov/CN=GT User" gtuser

Important

The quotes around the subject name are important, because it contains spaces.
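To make the line format concrete, the sketch below builds a grid-mapfile line and looks up the account for a subject. Both function names are hypothetical and are not toolkit commands; grid-mapfile-add-entry remains the supported way to edit the real file. The lookup handles only the simple form shown above, with one quoted subject followed by one account name.

```sh
#!/bin/sh
# Format one grid-mapfile line: quoted certificate subject, then local account.
# $1 = subject (distinguished name), $2 = local account name
gridmap_line() {
    printf '"%s" %s\n' "$1" "$2"
}

# Look up the account mapped to a subject in a grid-mapfile.
# $1 = grid-mapfile path, $2 = subject; prints the account, or returns 1 if absent.
gridmap_lookup() {
    prefix="\"$2\" "
    while IFS= read -r line; do
        case $line in
            "$prefix"*) echo "${line#"$prefix"}"; return 0 ;;
        esac
    done < "$1"
    return 1
}

gridmap_line "/O=Grid/OU=GlobusTest/OU=simpleCA-mayed.mcs.anl.gov/OU=mcs.anl.gov/CN=GT User" gtuser
```

Because the subject is always passed and printed inside quotes, the space in "GT User" does not split the line into extra fields.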

3. Verify Basic Security

Now that you have installed a trusted CA, acquired a hostcert and acquired a usercert, you may verify that your security setup is complete. As your user account, run the following command:

gtuser$ grid-proxy-init -verify -debug

User Cert File: /home/gtuser/.globus/usercert.pem
User Key File: /home/gtuser/.globus/userkey.pem

Trusted CA Cert Dir: /etc/grid-security/certificates

Output File: /tmp/x509up_u506
Your identity: /DC=org/DC=doegrids/OU=People/CN=GT User 332900
Enter GRID pass phrase for this identity:
Creating proxy ...++++++++++++
..................++++++++++++
 Done
Proxy Verify OK
Your proxy is valid until: Fri Jan 28 23:13:22 2005

The output shows several things. Your usercert and key are located in $HOME/.globus/. The proxy certificate is created in /tmp/; in the file name x509up_u506, "up" stands for "user proxy" and 506 is your UNIX user ID. The command also prints your distinguished name (DN), and the proxy is valid for 12 hours.
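The proxy file naming can be reproduced in a short script, for example to check whether a proxy already exists. This is a sketch with a hypothetical function name; it assumes the default /tmp location shown in the output above, and it honors the X509_USER_PROXY environment variable when that is set.

```sh
#!/bin/sh
# Print the proxy file location: X509_USER_PROXY if set, else /tmp/x509up_u<uid>.
default_proxy_path() {
    echo "${X509_USER_PROXY:-/tmp/x509up_u$(id -u)}"
}

proxy=$(default_proxy_path)
if [ -f "$proxy" ]; then
    echo "proxy file present: $proxy"
else
    echo "no proxy file at $proxy (run grid-proxy-init first)"
fi
```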

If this command succeeds, your single node is correctly configured.

If you get an error, or if you want to see more diagnostic information about your certificates, run the following:

gtuser$ grid-cert-diagnostics

For more troubleshooting information, see the GSI troubleshooting guide.

4. Firewall configuration

There are four possible firewall scenarios that might present themselves: restrictions on incoming and outgoing ports for both client and server scenarios.

This section divides sites into two categories: client sites, which have users that are acting as clients to Grid services, and server sites, which are running Grid services. Server sites also often act as client sites, either because they also have users on site or because jobs submitted to the site act as clients to other sites by retrieving data from them or spawning sub-jobs.

4.1. Client Site Firewall Requirements

This section describes the requirements placed on firewalls at sites containing Globus Toolkit clients. Note that often jobs submitted to sites running Globus services will act as clients (e.g. retrieving files needed by the job, spawning subjobs), so server sites will also have client site requirements.

4.1.1. Allowed Outgoing Ports

Clients need to be able to make outgoing connections freely from ephemeral ports on hosts at the client site to all ports at server sites.

4.1.2. Allowed Incoming Ports

As described in Section 3, “Job State Callbacks and Polling”, the Globus Toolkit GRAM service uses callbacks to communicate state changes to clients and, optionally, to stage files to/from the client. If connections are not allowed back to the Globus Toolkit clients, the following restrictions will be in effect:

  • You cannot do a job submission request and redirect the output back to the client. This means the globus-job-run command won't work. globus-job-submit will work, but you cannot use globus-job-get-output. globusrun with the -o option also will not work.
  • Staging to or from the client will also not work, which precludes the -s and -w options.
  • The client cannot be notified of state changes in the job, e.g. completion.

To allow these callbacks, client sites should allow incoming connections in the ephemeral port range. Client sites wishing to restrict incoming connections in the ephemeral port range should select a port range for their site. The size of this range should be approximately 10 ports per expected simultaneous user on a given host, though this may vary depending on the actual usage characteristics. Hosts on which clients run should have the GLOBUS_TCP_PORT_RANGE environment variable set for the users to reflect the site’s chosen range.
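As a worked example of the sizing rule, the sketch below turns a starting port, a user count, and a ports-per-user figure into a min,max value for GLOBUS_TCP_PORT_RANGE. The function name port_range and the starting port 45000 are assumptions chosen for illustration.

```sh
#!/bin/sh
# Compute a GLOBUS_TCP_PORT_RANGE value ("min,max") from a sizing rule.
# $1 = first port, $2 = expected simultaneous users, $3 = ports per user
port_range() {
    first=$1
    count=$(( $2 * $3 ))
    last=$(( first + count - 1 ))
    if [ "$count" -lt 1 ] || [ "$last" -gt 65535 ]; then
        echo "invalid range" >&2
        return 1
    fi
    echo "$first,$last"
}

# 10 users at roughly 10 ports each, starting at the hypothetical port 45000.
GLOBUS_TCP_PORT_RANGE=$(port_range 45000 10 10)
export GLOBUS_TCP_PORT_RANGE
echo "$GLOBUS_TCP_PORT_RANGE"
```

With these inputs the range covers 100 ports, 45000 through 45099.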

4.1.3. Network Address Translation (NAT)

Clients behind NATs will be restricted as described in Section 4.1.2, “Allowed Incoming Ports” unless the firewall and site hosts are configured to allow incoming connections.

This configuration involves:

  • Select a separate portion of the ephemeral port range for each host at the site on which clients will be running (e.g. 45000-45099 for host A, 45100-45199 for host B, etc.).
  • Configure the NAT to direct incoming connections in the port range for each host back to the appropriate host (e.g., configure 45000-45099 on the NAT to forward to 45000-45099 on host A).
  • Configure the Globus Toolkit clients on each site host to use the selected port range for the host using the techniques described in Section 2.1, “If client is behind a firewall”.
  • Configure Globus Toolkit clients to advertise the firewall as the hostname to use for callbacks from the server host. To do this, set the GLOBUS_HOSTNAME environment variable on the client to the hostname of the external side of the NAT firewall. The client software then advertises the firewall's hostname for callbacks, so connections from the server intended for the client go to the firewall, which redirects them to the client.
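The per-host allocation in the steps above is simple arithmetic, sketched below. The function names and the hostname nat.example.org are hypothetical; the script prints the environment settings for each client host and does not configure the NAT device itself.

```sh
#!/bin/sh
# Print the port range assigned to the Nth client host (N counts from 0).
# $1 = base port, $2 = ports per host, $3 = host index
host_port_range() {
    min=$(( $1 + $2 * $3 ))
    max=$(( min + $2 - 1 ))
    echo "$min,$max"
}

# Print the environment settings a client on that host would use.
# $1 = external hostname of the NAT firewall; remaining arguments as above.
nat_client_env() {
    echo "export GLOBUS_HOSTNAME=$1"
    echo "export GLOBUS_TCP_PORT_RANGE=$(host_port_range "$2" "$3" "$4")"
}

nat_client_env nat.example.org 45000 100 0   # host A
nat_client_env nat.example.org 45000 100 1   # host B
```

Host A receives 45000,45099 and host B receives 45100,45199, matching the example ranges in the list. The NAT must forward each range to the corresponding host.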

4.2. Server Site Firewall Requirements

This section describes firewall policy requirements at sites that host Grid services. Sites that host Grid services often host Grid clients as well; the policy requirements described in this section are adequate for those clients.

4.2.1. Allowed Incoming Ports

A server site should allow incoming connections to the well-known Grid Service Ports as well as ephemeral ports. These ports are 22/tcp (for GSI-enabled OpenSSH), 2119/tcp (for GRAM), and 2811/tcp (for GridFTP).

A server not allowing incoming connections in the ephemeral port range will have the following restrictions:

  • If port 2119/tcp is open, GRAM will allow jobs to be submitted, but further management of the jobs will not be possible.
  • While it will be possible to make GridFTP control connections if port 2811/tcp is open, it will not be possible to actually get or put files.

Server sites wishing to restrict incoming connections in the ephemeral port range should select a range of port numbers. The size of this range should be approximately 20 ports per expected simultaneous user on a given host, though this may vary depending on the actual usage characteristics. While it will take some operational experience to determine just how big this range needs to be, it is suggested that any major server site open a port range of at least a few hundred ports. Grid Services should be configured as described in Section to reflect the site’s chosen range.
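As one possible illustration of the incoming policy, the sketch below prints iptables rules for the three well-known ports plus a site-chosen ephemeral range. Using iptables is an assumption, since this guide does not prescribe a firewall product; the function name and the range 40000 to 40999 are hypothetical. The rules are printed, not applied, and would need to be adapted to your existing chains and default policy.

```sh
#!/bin/sh
# Print (not apply) iptables rules for the well-known Grid service ports
# plus a site-chosen ephemeral range.
# $1 = first port of the range, $2 = last port of the range
server_firewall_rules() {
    for port in 22 2119 2811; do
        echo "iptables -A INPUT -p tcp --dport $port -j ACCEPT"
    done
    echo "iptables -A INPUT -p tcp --dport $1:$2 -j ACCEPT"
}

server_firewall_rules 40000 40999
```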

4.2.2. Allowed Outgoing Ports

Server sites should allow outgoing connections freely from ephemeral ports at the server site to ephemeral ports at client sites as well as to Grid Service Ports at other sites.

4.2.3.  Network Address Translation (NAT)

Running Grid services behind NAT firewalls is not supported, because the security mechanisms employed by Globus require knowledge of the actual IP address of the host that is being connected to.

We do note there have been some successes in running GT services behind NAT firewalls.

4.3. Summary of Globus Toolkit Traffic

Table 3.1. Summary of Globus Toolkit Traffic

Application | Network Ports | Comments
GRAM Gatekeeper (to start jobs) | To 2119/tcp on server from controllable ephemeral port on client. | Connections back to client (controllable ephemeral port to controllable ephemeral port) required if executable or data staged from client or output from job sent back to client. Port 2119/tcp defined by IANA.
GRAM Job-Manager | From controllable ephemeral port on client to controllable ephemeral port on server. | Port on server selected when original connection made by the client to the Gatekeeper and returned to the client in a URL. May result in connection back to client from ephemeral port on server to controllable ephemeral port on client.
GridFTP | From controllable ephemeral port on client to port 2811/tcp on server for control channel. | Port 2811/tcp defined by IANA.
GSI-Enabled SSH | From ephemeral port on client to port 22/tcp on server. | Same as standard SSH. Port 22/tcp defined by IANA.
MyProxy | From ephemeral port on client to port 7512/tcp on server. | Default. Can be modified by site.

4.4. Controlling The Ephemeral Port Range

Controllable ephemeral ports in the Globus Toolkit can be restricted to a given range by setting the environment variable GLOBUS_TCP_PORT_RANGE. The value of this variable should be formatted as min,max (a comma-separated pair). This causes the GT libraries (specifically GlobusIO) to select port numbers for controllable ports in the specified range.

% GLOBUS_TCP_PORT_RANGE=40000,40010
% export GLOBUS_TCP_PORT_RANGE
% globus-gass-server
https://globicus.lbl.gov:40000
^C
%

This environment variable is respected by both clients and servers that are started from within the environment in which it is set. There are better ways, however, to configure a globus-job-manager or a GridFTP server to restrict its port range.

  • globus-job-manager has an option, -globus-tcp-port-range PORT_RANGE that acts in the same manner as the environment variable. It can be specified on the command line or in the configuration file. See the job manager documentation for all of its options.
  • See the GridFTP documentation for information about using GridFTP with firewalls.
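Since the min,max value is easy to mistype, a site script may want to validate it before exporting it. The following is a minimal sketch with a hypothetical function name; it checks only the comma-separated form described in this section and the usual TCP port bounds.

```sh
#!/bin/sh
# Validate a "min,max" port range string.
# Returns 0 if well-formed with 1 <= min <= max <= 65535.
valid_port_range() {
    case $1 in
        *[!0-9,]* | *,*,* | ,* | *, | "") return 1 ;;
        *,*) ;;
        *) return 1 ;;
    esac
    min=${1%,*}
    max=${1#*,}
    [ "$min" -ge 1 ] && [ "$min" -le "$max" ] && [ "$max" -le 65535 ]
}

if valid_port_range "40000,40010"; then
    GLOBUS_TCP_PORT_RANGE=40000,40010
    export GLOBUS_TCP_PORT_RANGE
    echo "using port range $GLOBUS_TCP_PORT_RANGE"
fi
```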

Chapter 4. Basic Setup for GT 5.2.2

The Quickstart Guide walks you through setting up basic services on multiple machines.

Chapter 5. Platform Notes

1. Platform Notes

1.1. Mac OS X 10.8 (Mountain Lion)

The libtool library is no longer distributed with Mac OS X as of Mountain Lion. Install the latest libtool from the GNU libtool source mirror before building Globus from the source installer. Configure libtool with --program-prefix=g so that the libtool script is named glibtool; this avoids a conflict with the OS X libtool program, which provides different functionality than GNU libtool.

If you install libtool in a directory other than your Globus installation directory, you'll need to add it to your build environment, by adding CPPFLAGS="-ILIBTOOL-INSTALLDIR/include" and LDFLAGS="-LLIBTOOL-INSTALLDIR/lib" to your environment when compiling with the installer.
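Put together, the two stages might look like the commands printed by the sketch below. The function name and both directories are hypothetical placeholders, and the script prints the commands instead of running them so that you can adapt the paths first.

```sh
#!/bin/sh
# Print (not run) the commands for building GNU libtool as glibtool and then
# pointing the Globus build at it.
# $1 = libtool install directory, $2 = Globus install prefix
print_osx_build_steps() {
    echo "# in the unpacked GNU libtool source directory:"
    echo "./configure --prefix=$1 --program-prefix=g && make && make install"
    echo "# in the Globus source installer directory:"
    echo "CPPFLAGS=\"-I$1/include\" LDFLAGS=\"-L$1/lib\" ./configure --prefix=$2"
}

print_osx_build_steps /usr/local/libtool /usr/local/globus-5.2.2
```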

Chapter 6. Appendix

The Install Guide appendix can be found here.

Glossary

G

Grid Security Infrastructure (GSI)

GSI stands for Grid Security Infrastructure and describes the original infrastructure of GT security, which comprises SSL, PKI, and proxy certificates.