GT 4.2.1 User's Guide

Abstract

You can download the PDF version here. This page contains information for commonly performed tasks using GT components. It assumes a default installation and covers the more basic tasks using common tools. Because of its size, the complete list of GT command line clients is provided separately here.

Note that GT itself is typically used as middleware and not intended to be used directly by end-users. Instead, grid developers use the GT to develop higher-level services and systems that are then used by end-users (where GT is essentially the plumbing). However, GT Release Manuals include User's Guides for each established component that describe how the public interfaces are intended to be used - whether it is by a human or a program.


Table of Contents

1. Starting higher-level GT services
1. Setting up your environment
2. Using Java WS Core
2.1. What is the Java WS Core container?
2.2. Starting the container
2.3. Stopping the container
2.4. GT web services based on Java WS Core
2.5. Querying a resource
3. Using C WS Core
3.1. Starting the C Container
3.2. Stopping the C Container
3.3. Accessing Resource Properties with C WS Core
2. Security
1. Obtaining certificates
2. Authenticating (who are you?)
2.1. Generate a valid proxy certificate
3. Authorizing (what are you allowed to do?)
4. Delegate user credentials
5. Basic procedure for using GSI C
6. Troubleshooting Certificates and GridMap Files
6.1. Some tools to validate certificate setup
6.1.1. grid-cert-diagnostics
6.1.2. Check that the user certificate is valid
6.1.3. Connect to the server using s_client
6.1.4. Check that the server certificate is valid
3. Data Management
1. File transfers with GridFTP
1.1. Basic procedure for using GridFTP (globus-url-copy)
1.1.1. Putting files
1.1.2. Getting files
1.1.3. Third party transfers
1.1.4. For more information
1.2. Accessing data from other data interfaces
1.2.1. Accessing data in a non-POSIX file data source that has a POSIX interface
1.2.2. GridFTP and DSIs
1.2.3. Latest information about HPSS
1.2.4. Latest information about SRB
1.3. Pipelining
1.4. GridFTP Where There Is FTP (GWTFTP)
1.5. Multicasting
1.5.1. Advanced multicasting options
1.5.2. Network Overlay
2. Transferring large datasets with Reliable File Transfer (RFT)
2.1. globus-crft
2.1.1. Submitting A Transfer
2.1.2. Non-blocking Transfer
2.1.3. Cleaning Up
2.1.4. More
3. Mapping replicas with Replica Location Service (RLS)
3.1. Ping the server
3.2. Creating replica location mappings
3.3. Adding replica location mappings
3.4. Querying replica location mappings
3.5. Deleting replica location mappings
3.6. Using bulk operations
3.7. Using interactive mode
4. Mapping replicas with WS Replica Location Service (WS RLS)
4.1. Create mappings
4.2. Add mappings
4.3. Define attribute definitions
4.4. Add attributes
4.5. Query mappings
4.6. Query attributes
5. Managing and Transferring Batches of Replicas with Batch Replication Service
5.1. Replication request file
5.2. Create replication resource
5.3. Start replication
5.4. Get replication resource properties
5.5. Find replication item status
5.6. Destroy replication resource
6. Managing and Transferring Replicas with the Replication Client
4. Monitoring your GT services and the Grid
1. Querying the Index Service
1.1. Simple usage
2. Using WebMDS
3. Triggering actions based on information gathered by Index Service
5. Submitting jobs to a job scheduler.
1. Delegating credentials
2. Local resource managers interfaced by a GRAM4 installation
2.1. Finding available local resource managers
2.2. Finding the default local resource manager
3. Submitting Jobs Specified in JDD
3.1. Simple interactive job
3.2. Streaming output
3.3. Using a contact string
3.4. Using a job description
3.5. Using a contact string in the job description
3.6. Specifying a local resource manager
3.6.1. Submitting to the default local resource manager
3.6.2. Submitting to a non-default local resource manager
3.7. Job with staging
3.8. Specifying a local user id in the job description
3.9. Using substitution variables
3.10. Using custom job description extensions
3.11. Multi-Job
4. Submitting jobs with metascheduling functionality
A. Globus Toolkit 4.2.1 Public Interface Guides
B. Globus Toolkit 4.2.1 Errors
Glossary

Chapter 1. Starting higher-level GT services

1. Setting up your environment

This step is usually a prerequisite for using GT commands. Make sure you have set GLOBUS_LOCATION to the location of your Toolkit installation. There are two environment scripts, $GLOBUS_LOCATION/etc/globus-user-env.sh and $GLOBUS_LOCATION/etc/globus-user-env.csh. Source whichever one corresponds to the type of shell you are using.

For example, in csh or tcsh, you would run:

source $GLOBUS_LOCATION/etc/globus-user-env.csh

In sh, bash, ksh, or zsh, you would run:

. $GLOBUS_LOCATION/etc/globus-user-env.sh

Set Globus location:

$ export GLOBUS_LOCATION='/opt/globus/apps/globus-4.2.1'

Source it:

source $GLOBUS_LOCATION/etc/globus-user-env.sh
source $GLOBUS_LOCATION/etc/globus-devel-env.sh

Start container (for default installation using Java WS Core):

globus-start-container

Create a new grid proxy certificate with grid-proxy-init:
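grid-proxy-init

(The security chapter below shows a full grid-proxy-init example, including its expected output.)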

2. Using Java WS Core

2.1. What is the Java WS Core container?

The Java WS Core container is the web services hosting environment based on Java on which the GT higher-level Java web services (such as RFT and CAS) are based.

2.2. Starting the container

To start the Java WS Core container in any default installation of GT, run globus-start-container:

$GLOBUS_LOCATION/bin/globus-start-container

If you want to run without transport-level security, use the -nosec option:

$GLOBUS_LOCATION/bin/globus-start-container -nosec

2.3. Stopping the container

To stop the container, run:

$GLOBUS_LOCATION/bin/globus-stop-container

2.4. GT web services based on Java WS Core

Several GT components are higher-level web services based on Java WS Core, including RFT, CAS, the Delegation Service, GRAM4, WS MDS (the Index and Trigger Services), WS RLS, and the Batch Replication Service.

2.5. Querying a resource

You can use the wsrf-query command to query any WSRF resource property document. For example, you can use the following command to query the WS MDS Index Service for all the resource properties collected by the default Index Service on your local host:

$GLOBUS_LOCATION/bin/wsrf-query -s https://localhost:8443/wsrf/services/DefaultIndexService '/*'

3. Using C WS Core

3.1. Starting the C Container

The globus-wsc-container command is an implementation of a Web Service container for hosting services written in C. By default, the container will run in the foreground and process SOAP requests until terminated by a signal. See globus-wsc-container documentation for a complete list of command-line options.

% globus-wsc-container

Contact: https://grid.example.org:8443/

3.2. Stopping the C Container

There is no special command for stopping a C container. If the container is running in the foreground (the default), stop it by sending an interrupt signal, typically by pressing Ctrl-C:

% globus-wsc-container

Contact: https://grid.example.org:8443

^C

Execution cancelled, cleaning up.

% 

If the container is in the background, it can be terminated with the POSIX-standard kill command. If the container was started with the -pidfile command-line option, that file can be read to determine which process to kill. For example:

% globus-wsc-container -bg -pidfile $GLOBUS_LOCATION/var/wsc.pid

Contact: https://grid.example.org:8443

% cat $GLOBUS_LOCATION/var/wsc.pid

19773

% kill 19773

% 

The container will automatically remove the PID file ($GLOBUS_LOCATION/var/wsc.pid in this example).

3.3. Accessing Resource Properties with C WS Core

WSRF services share information on resource state through resource properties. C WS Core provides several tools for inspecting these properties. A list of the properties provided by Globus Toolkit services is available in the developer's guide.

The globus-wsrf-get-property and globus-wsrf-get-properties commands provide two options for getting the value of a single resource property or multiple resource properties, respectively. For this example, we'll explore some of the properties provided by the GRAM4 service.

First, we'll check the version information of a GRAM4 service using globus-wsrf-get-property:

% globus-wsrf-get-property -s https://grid.example.org:8443/wsrf/services/ManagedJobFactoryService \
    '{http://mds.globus.org/metadata/2005/02}ServiceMetaDataInfo'

<ns1:ServiceMetaDataInfo xmlns:ns1="http://mds.globus.org/metadata/2005/02">
    <ns1:startTime>2008-06-19T16:34:31.248Z</ns1:startTime>
    <ns1:version>4.1.0</ns1:version>
    <ns1:serviceTypeName>ManagedJobFactoryService</ns1:serviceTypeName>
</ns1:ServiceMetaDataInfo>

% 

Now, we'll check for some system-specific information using globus-wsrf-get-properties:

% globus-wsrf-get-properties -s https://grid.example.org:8443/wsrf/services/ManagedJobFactoryService \
    '{http://www.globus.org/namespaces/2008/03/gram/job}hostCPUType' \
    '{http://www.globus.org/namespaces/2008/03/gram/job}hostOSName'


<ns1:hostCPUType xmlns:ns1="http://www.globus.org/namespaces/2008/03/gram/job">i686</ns1:hostCPUType>
<ns2:hostOSName xmlns:ns2="http://www.globus.org/namespaces/2008/03/gram/job">Linux</ns2:hostOSName>


% 

The globus-wsrf-query program can be used to perform more sophisticated queries of resource properties using XPath expressions. For example, we can check the number of local resource managers supported by this installation:

% globus-wsrf-query \
    -s https://grid.example.org:8443/wsrf/services/ManagedJobFactoryService \
    'count(//*[local-name() = "availableLocalResourceManagers"])'

2
% 

We can then get the names of the local resource managers:

% globus-wsrf-query \
    -s https://grid.example.org:8443/wsrf/services/ManagedJobFactoryService \
    '//*[local-name() = "availableLocalResourceManagers"]/*[1]/text()'

Fork

% globus-wsrf-query \
    -s https://grid.example.org:8443/wsrf/services/ManagedJobFactoryService \
    '//*[local-name() = "availableLocalResourceManagers"]/*[2]/text()'

Multi

% 

Chapter 2. Security

This chapter provides information about basic security tasks in GT 4.2.1.

1. Obtaining certificates

Security is at the heart of Globus. Unless you are running without security (recommended only for testing), you will not be able to use most of Globus until you have obtained a certificate for yourself. (Note that you may use GridFTP without certificates if you are only using the ftp:// or http:// protocols.)

For basic information about obtaining certificates, see Obtaining host certificates in the Installation Guide.

Remember to keep track of when your certificates expire. If your certificates expire, you may not be able to use your services until they are refreshed.
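One quick way to check the expiration date of your user certificate is grid-cert-info, which reads ~/.globus/usercert.pem by default:

% grid-cert-info -enddate

Similarly, grid-proxy-info reports on your current proxy certificate.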

2. Authenticating (who are you?)

2.1. Generate a valid proxy certificate

Before using many of the tools in GT, a user must generate a valid user proxy. Use grid-proxy-init. The following is an example:

% $GLOBUS_LOCATION/bin/grid-proxy-init
Your identity: /O=Grid/OU=GlobusTest/OU=simpleCA.mymachine/OU=mymachine/CN=John Doe
Enter GRID pass phrase for this identity:
Creating proxy ................................. Done
Your proxy is valid until: Tue Oct 26 01:33:42 2004

3. Authorizing (what are you allowed to do?)

Basic authorization in GT is enforced via a grid map file, a file that contains mappings of certificate subject names to local user names, like the following:

 "/O=Grid/O=Globus/OU=your.domain.edu/CN=Your Name"    youruser

For more information about gridmaps see Section 4, “Add authorization”, Section 4, “Configuring Credential Mappings” and Globus Toolkit Gridmap Processing.
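If you prefer not to edit the gridmap file by hand, the grid-mapfile-add-entry tool can append an entry like the one above (run it as a user that can write the gridmap file, typically root for /etc/grid-security/grid-mapfile):

% grid-mapfile-add-entry -dn "/O=Grid/O=Globus/OU=your.domain.edu/CN=Your Name" -ln youruser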

4. Delegate user credentials

Once you have generated a valid proxy, you must create a delegated credential. It is important to ensure that you give your delegated credential enough lifetime to support the running time of your replication activities. To delegate your credential, use globus-credential-delegate.

% $GLOBUS_LOCATION/bin/globus-credential-delegate -h myhostname \
 -p 8443 mycredential.epr
EPR will be written to: mycredential.epr
Delegated credential EPR:
Address: https://128.9.72.118:8443/wsrf/services/DelegationService
Reference property[0]:
<ns1:DelegationKey xmlns:ns1="http://www.globus.org/08/2004/delegationService"
>3b6cb210-e9b2-11d9-ab74-f7fa10f094cd</ns1:DelegationKey>

5. Basic procedure for using GSI C

In most cases, an individual will do the following:

  • Acquire a user certificate from a certification authority (CA) with grid-cert-request. This certificate will typically be valid for a year or more and will be stored in a file in the individual's home directory.

    It is important to keep in mind when your cert will expire - after your user certificate expires, you may not be able to use secure services in GT!

  • Use the end-user certificate to create a proxy certificate using grid-proxy-init. This will be used to authenticate the individual to grid services. Proxy certificates typically have a much shorter lifetime than end-user certificates (usually 12 hours). Once your proxy certificate expires, simply rerun grid-proxy-init.
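For example, to create a proxy with a 24-hour lifetime and then check how much time remains on it:

% grid-proxy-init -valid 24:00
% grid-proxy-info -timeleft

The -timeleft output is the remaining lifetime in seconds.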

6. Troubleshooting Certificates and GridMap Files

For common errors, see Certificates and Gridmap errors.

6.1. Some tools to validate certificate setup

6.1.1. grid-cert-diagnostics

The grid-cert-diagnostics program prints diagnostics about the user's certificates and host security environment.

% grid-cert-diagnostics -p

6.1.2. Check that the user certificate is valid

openssl verify -CApath /etc/grid-security/certificates \
  -purpose sslclient ~/.globus/usercert.pem

6.1.3. Connect to the server using s_client

openssl s_client -ssl3 -cert ~/.globus/usercert.pem -key \
  ~/.globus/userkey.pem -CApath /etc/grid-security/certificates \
  -connect <host:port>

Here <host:port> denotes the server and port you connect to.

If it prints an error and puts you back at the command prompt, then it typically means that the server has closed the connection, i.e. that the server was not happy with the client's certificate and verification. Check the SSL log on the server.

If the command "hangs" then it has actually opened a telnet style (but secure) socket, and you can "talk" to the server.

You should be able to scroll up and see the subject names of the server's verification chain:

depth=2 /DC=net/DC=ES/O=ESnet/OU=Certificate Authorities/CN=ESnet Root CA 1
verify return:1
depth=1 /DC=org/DC=DOEGrids/OU=Certificate Authorities/CN=DOEGrids CA 1
verify return:1
depth=0 /DC=org/DC=doegrids/OU=Services/CN=wiggum.mcs.anl.gov
verify return:1
    

In this case, there were no errors. Errors would give you an extra line next to the subject name of the certificate that caused the error.

6.1.4. Check that the server certificate is valid

Requires root login on server:

    openssl verify -CApath /etc/grid-security/certificates -purpose sslserver \
     /etc/grid-security/hostcert.pem

Chapter 3. Data Management

Table of Contents

1. File transfers with GridFTP
1.1. Basic procedure for using GridFTP (globus-url-copy)
1.1.1. Putting files
1.1.2. Getting files
1.1.3. Third party transfers
1.1.4. For more information
1.2. Accessing data from other data interfaces
1.2.1. Accessing data in a non-POSIX file data source that has a POSIX interface
1.2.2. GridFTP and DSIs
1.2.2.1. GridFTP Protocol Module
1.2.2.2. Data Transform Functionality
1.2.2.3. Data Storage Interface (DSI) / Data Transform module
1.2.3. Latest information about HPSS
1.2.4. Latest information about SRB
1.3. Pipelining
1.4. GridFTP Where There Is FTP (GWTFTP)
1.5. Multicasting
1.5.1. Advanced multicasting options
1.5.2. Network Overlay
2. Transferring large datasets with Reliable File Transfer (RFT)
2.1. globus-crft
2.1.1. Submitting A Transfer
2.1.2. Non-blocking Transfer
2.1.3. Cleaning Up
2.1.4. More
3. Mapping replicas with Replica Location Service (RLS)
3.1. Ping the server
3.2. Creating replica location mappings
3.3. Adding replica location mappings
3.4. Querying replica location mappings
3.5. Deleting replica location mappings
3.6. Using bulk operations
3.7. Using interactive mode
4. Mapping replicas with WS Replica Location Service (WS RLS)
4.1. Create mappings
4.2. Add mappings
4.3. Define attribute definitions
4.4. Add attributes
4.5. Query mappings
4.6. Query attributes
5. Managing and Transferring Batches of Replicas with Batch Replication Service
5.1. Replication request file
5.2. Create replication resource
5.3. Start replication
5.4. Get replication resource properties
5.5. Find replication item status
5.6. Destroy replication resource
6. Managing and Transferring Replicas with the Replication Client

1. File transfers with GridFTP

1.1. Basic procedure for using GridFTP (globus-url-copy)

If you just want the "rules of thumb" on getting started (without all the details), the following options using globus-url-copy will normally give acceptable performance:

globus-url-copy -vb -tcp-bs 2097152 -p 4 source_url destination_url

where:

-vb

specifies verbose mode and displays:

  • number of bytes transferred,
  • performance since the last update (currently every 5 seconds), and
  • average performance for the whole transfer.
-tcp-bs

specifies the size (in bytes) of the TCP buffer to be used by the underlying ftp data channels. This is critical to good performance over the WAN.

How do I pick a value? (A worked example follows this list.)

-p

Specifies the number of parallel data connections that should be used. This is one of the most commonly used options.

How do I pick a value?
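As a rule-of-thumb answer to both "How do I pick a value?" questions above (the numbers here are illustrative assumptions, not measurements): size -tcp-bs from the bandwidth-delay product of the path, divided by the number of parallel streams when -p is used. For a 1 Gbit/s bottleneck link with a 50 ms round-trip time:

bandwidth-delay product = 125 MB/s x 0.05 s = 6.25 MB
per-stream buffer       = 6.25 MB / 4 streams = about 1.5 MB (roughly the 2097152 bytes used on this page)

For -p, small values such as 2-8 usually capture most of the benefit of parallel streams.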

The source/destination URLs will normally be one of the following:

  • file:///path/to/my/file if you are accessing a file on a file system accessible by the host on which you are running your client.
  • gsiftp://hostname/path/to/remote/file if you are accessing a file from a GridFTP server.

1.1.1. Putting files

One of the most basic tasks in GridFTP is to "put" files, i.e., to move a file from your file system to the server. For example, if you want to move the file /tmp/foo from a file system accessible to the host on which you are running your client to the file /tmp/bar on a host named remote.machine.my.edu running a GridFTP server, you would use this command:

globus-url-copy -vb -tcp-bs 2097152 -p 4 file:///tmp/foo gsiftp://remote.machine.my.edu/tmp/bar

[Note]Note

In theory, remote.machine.my.edu could be the same host as the one on which you are running your client, but that is normally only done in testing situations.

1.1.2. Getting files

A get, i.e., moving a file from a server to your file system, would just reverse the source and destination URLs:

[Tip]Tip

Remember file: always refers to your file system.

globus-url-copy -vb -tcp-bs 2097152 -p 4 gsiftp://remote.machine.my.edu/tmp/bar file:///tmp/foo

1.1.3. Third party transfers

Finally, if you want to move a file between two GridFTP servers (a third party transfer), both URLs would use gsiftp: as the protocol:

globus-url-copy -vb -tcp-bs 2097152 -p 4 gsiftp://other.machine.my.edu/tmp/foo gsiftp://remote.machine.my.edu/tmp/bar

1.1.4. For more information

If you want more information and details on URLs and the command line options, the Key Concepts document gives basic definitions and an overview of the GridFTP protocol as well as our implementation of it.

1.2. Accessing data from other data interfaces

1.2.1. Accessing data in a non-POSIX file data source that has a POSIX interface

If you want to access data in a non-POSIX file data source that has a POSIX interface, the standard server will do just fine. Just make sure it is really POSIX-like (it handles out-of-order writes, contiguous byte writes, etc.).

1.2.2. GridFTP and DSIs

The following information is helpful if you want to use GridFTP to access data in DSIs (such as HPSS and SRB), and non-POSIX data sources.

Architecturally, the Globus GridFTP server can be divided into 3 modules:

  • the GridFTP protocol module,
  • the (optional) data transform module, and
  • the Data Storage Interface (DSI).

In the GT 4.2.1 implementation, the data transform module and the DSI have been merged, although we plan to have separate, chainable, data transform modules in the future.

[Note]Note

This architecture does NOT apply to the WU-FTPD implementation (GT3.2.1 and lower).

1.2.2.1. GridFTP Protocol Module

The GridFTP protocol module is the module that reads and writes to the network and implements the GridFTP protocol. This module should not need to be modified, since doing so would make the server non-protocol-compliant and unable to communicate with other servers.

1.2.2.2. Data Transform Functionality

The data transform functionality is invoked by using the ERET (extended retrieve) and ESTO (extended store) commands. It is seldom used and bears careful consideration before it is implemented, but in the right circumstances it can be very useful. In theory, any computation could be invoked this way, but it was primarily intended for cases where some simple pre-processing (such as a partial get or sub-sampling) can greatly reduce the network load. The disadvantage is that you remove any real option for planning, brokering, etc., and any significant computation could adversely affect the data transfer performance. Note that the client must also support the ESTO/ERET functionality.

1.2.2.3. Data Storage Interface (DSI) / Data Transform module

The Data Storage Interface (DSI) / Data Transform module knows how to read and write to the "local" storage system and can optionally transform the data. We put local in quotes because in a complicated storage system, the storage may not be directly attached, but for performance reasons, it should be relatively close (for instance on the same LAN).

The interface consists of functions to be implemented, such as send (get), receive (put), and command (simple commands that simply succeed or fail, like mkdir).

Once these functions have been implemented for a specific storage system, a client should not need to know or care what is actually providing the data. The server can be configured with a specific DSI, i.e., it knows how to interact with a single class of storage system; alternatively, one particularly useful feature for the ESTO/ERET functionality mentioned above is the ability to load and configure a DSI on the fly.

See Appendix A, Developing DSIs for GridFTP for more information.

1.2.3. Latest information about HPSS

Last Update: August 2005

Working with Los Alamos National Laboratory and the High Performance Storage System (HPSS) collaboration (http://www.hpss-collaboration.org), we have written a Data Storage Interface (DSI) for read/write access to HPSS. This DSI allows an existing application that uses a GridFTP compliant client to use HPSS data resources.

This DSI is currently in testing. Due to changes in the HPSS security mechanisms, it requires HPSS 6.2 or later, which is due to be released in Q4 2005. Distribution for the DSI has not been worked out yet, but it will *probably* be available from both Globus and the HPSS collaboration. While this code will be open source, it requires underlying HPSS libraries which are NOT open source (proprietary).

[Note]Note

This is a purely server-side change; the client does not know which DSI is running. Only a site that is already running HPSS and wants to allow GridFTP access needs to worry about access to these proprietary libraries.

1.2.4. Latest information about SRB

Last Update: August 2005

Working with the SRB team at the San Diego Supercomputing Center, we have written a Data Storage Interface (DSI) for read/write access to data in the Storage Resource Broker (SRB) (http://www.npaci.edu/DICE/SRB). This DSI will enable GridFTP compliant clients to read and write data to an SRB server, similar in functionality to the sput/sget commands.

This DSI is currently in testing and is not yet publicly available, but will be available from both the SRB web site (here) and the Globus web site (here). It will also be included in the next stable release of the toolkit. We are working on performance tests, but early results indicate that for wide area network (WAN) transfers, the performance is comparable.

You might want to use this functionality when:

  • You have existing tools that use GridFTP clients and you want to access data that is in SRB
  • You have distributed data sets that have some of the data in SRB and some of the data available from GridFTP servers.

1.3. Pipelining

Pipelining allows the client to have many outstanding, unacknowledged transfer commands at once. Instead of being forced to wait for the "Finished response" message, the client is free to send transfer commands at any time.

Pipelining is enabled by using the -pp option:

globus-url-copy -pp source_url destination_url

1.4. GridFTP Where There Is FTP (GWTFTP)

GridFTP Where There Is FTP (GWTFTP) is an intermediate program that acts as a proxy between existing FTP clients and GridFTP servers. Users can connect to GWTFTP with their favorite standard FTP client, and GWTFTP will then connect to a GridFTP server on the client's behalf. To clients, GWTFTP looks much like an FTP proxy server. When wishing to contact a GridFTP server, FTP clients instead contact GWTFTP.

Clients tell GWTFTP their ultimate destination via the FTP USER <username> command. Instead of entering just their username, client users send the following:

USER <GWTFTP username>::<GridFTP server URL>

This command tells GWTFTP the GridFTP endpoint with which the client wants to communicate. For example:

USER bresnaha::gsiftp://wiggum.mcs.anl.gov:2811/
[Note]Note

Requires GSI C security.

1.5. Multicasting

To transfer a single file to many destinations in a multicast/broadcast, use the new -mc option.

[Note]Note

To use this option, the admin must enable multicasting. Click here for more information.

globus-url-copy -vb -tcp-bs 2097152 -p 4 -mc filename source_url

The filename must contain a newline-separated list of destination urls. For example:

gsiftp://localhost:5000/home/user/tst1
gsiftp://localhost:5000/home/user/tst3
gsiftp://localhost:5000/home/user/tst4
 

For more flexibility, you can also specify a single destination url on the command line in addition to the urls in the file. Examples are:

globus-url-copy -mc multicast.file gsiftp://localhost/home/user/src_file

or

globus-url-copy -mc multicast.file gsiftp://localhost/home/user/src_file gsiftp://localhost/home/user/dest_file1

1.5.1. Advanced multicasting options

Along with specifying the list of destination urls in a file, a set of options for each url can be specified. This is done by appending a ? to the resource string in the url, followed by semicolon-separated key=value pairs. For example:

gsiftp://dst1.domain.com:5000/home/user/tst1?cc=1;tcpbs=10M;P=4

This indicates that the receiving host dst1.domain.com will use 4 parallel streams, a TCP buffer size of 10 MB, and will select 1 host when forwarding on data blocks. This url is specified in the -mc file as described above.

The following is a list of key=value options and their meanings:

P=integer
The number of parallel streams this node will use when forwarding.
cc=integer
The number of urls to which this node will forward data.
tcpbs=formatted integer
The TCP buffer size this node will use when forwarding.
urls=string list
The list of urls that must be children of this node when the spanning tree is complete.
local_write=boolean: y|n
Determines if this data will be written to a local disk, or just forwarded on to the next hop. This is explained more in the Network Overlay section.
subject=string
The DN name to expect from the servers this node is connecting to.

1.5.2. Network Overlay

In addition to allowing multicast, this function also allows for creating user-defined network routes.

If the local_write option is set to n, then no data will be written to the local disk; the data will only be forwarded on.

If the local_write option is set to n and is used with the cc=1 option, the data will be forwarded on to exactly 1 location.

This allows the user to create a network overlay of data hops using each GridFTP server as a router to the ultimate destination.
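As a hedged sketch of these semantics (host names are hypothetical, and the exact file layout should be verified against the multicasting documentation), a -mc destination file that forwards data through one intermediate server without writing it to disk there might look like:

gsiftp://hop.example.org:5000/dev/null?local_write=n;cc=1;urls=gsiftp://dest.example.org:5000/home/user/file
gsiftp://dest.example.org:5000/home/user/file

globus-url-copy -mc route.file gsiftp://src.example.org/home/user/file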

2. Transferring large datasets with Reliable File Transfer (RFT)

The Java clients rft and rft-delete are available for very simple transfers. For more options, use the programming instructions here.

2.1. globus-crft

Beginning with 4.2.0, RFT also offers a new C client, globus-crft.

2.1.1. Submitting A Transfer

To submit a transfer request, the user must first create a 'transfer file'. Each line of this ASCII text file is a source/destination URL pair. There can be any number of lines per file. An example file follows:

    gsiftp://localhost:2811/etc/group  gsiftp://localhost:2811/tmp/test_crft
    gsiftp://ftp.globus.org:2811/pub/README gsiftp://myhost.here/home/user/file

This file requests two transfers. The first will use the GridFTP server running on the localhost to transfer /etc/group to /tmp/test_crft. The second will transfer the file /pub/README on ftp.globus.org to the file /home/user/file located on myhost.here.

Once the transfer file is created, globus-crft can be used in a variety of ways to transfer a file. The simplest is the blocking transfer:

    % globus-crft -c -s -m -vb -f <transfer file> -e <container contact string>

Looking at each option individually, this command line does the following:

-c

Create a new RFT service.

-s

Submit the transfer request.

Because RFT uses a two-phase commit, the client is given the ability to perform these two steps separately; however, it is expected that the vast majority of the time -c and -s will be used together.

-m

Monitor the transfers. When this option is used the client will block until all transfers have completed. It monitors the status of the transfers along the way and can report it to the user.

-vb

Display verbose output. This just increases the level of diagnostic messages sent to stdout. When combined with -m it will allow the user to see the status of a transfer.

-f <transfer file>

This option is a pointer to the transfer file described above.

-e <container contact string>

The contact string is in the following form:

https://hostname.com:8443/wsrf/services/

The strings ___ and ___ will be appended to the given string in order for the client to interact with that container's delegation service and RFT service.

2.1.2. Non-blocking Transfer

The client can do non-blocking RFT submission. It can submit an RFT request and then terminate, returning later to monitor the status of the request. To accomplish this, the client saves the EPR of the newly created RFT service to disk.

% globus-crft -c -s -f <transfer file> -e <container contact string> \
                -ef <epr output file>

At some point later the client uses this same file to monitor the state of the transfer:

% globus-crft -ef <epr input file> --getOverallStatus

[Note]Note

Note that in both cases the option -ef is used. In the first case, since the -c option is used, we are creating a new service and the -ef option is a pointer to an output file. In all cases where -c is not used, the -ef switch is a pointer to an input file.

2.1.3. Cleaning Up

Once a transfer request completes, the user should destroy the resources associated with it. If the user stored the EPR of the service it created, this can be done with:

% globus-crft -ef <epr input file> --destroy

2.1.4. More

For a list of more options run:

globus-crft --help

3. Mapping replicas with Replica Location Service (RLS)

3.1. Ping the server

To check whether your server is active you may use the globus-rls-admin(1) ping command.

% $GLOBUS_LOCATION/bin/globus-rls-admin -p rls://localhost
ping rls://localhost: 0 seconds
        

3.2. Creating replica location mappings

When the RLS server is first installed, its database of replica location information will be empty, as expected. To create a replica location mapping, use the globus-rls-cli(1) create command. Replica information in RLS is represented as mappings from logical names to target names. Typically, the logical name will be a unique identifier for a given replicated data set and the target name will be a URL identifying a particular replica of the data set.

% $GLOBUS_LOCATION/bin/globus-rls-cli create my-logical-name-1 url-for-target-name-1 rls://localhost
        
[Note]Note

The create command is intended for creating the initial replica mapping entry for a given logical name. If the user attempts to create another entry using an existing logical name, RLS will report a user error. To map additional target names to an existing logical name, see Section 3.3, “Adding replica location mappings”.

3.3. Adding replica location mappings

To map additional target names to a logical name created by the previously described create command, use the globus-rls-cli(1) add command.

% $GLOBUS_LOCATION/bin/globus-rls-cli add my-logical-name-1 url-for-target-name-2 rls://localhost
        

3.4. Querying replica location mappings

Once your RLS server is populated with replica location mappings, you can query the server for useful information using the globus-rls-cli(1) query command.

% $GLOBUS_LOCATION/bin/globus-rls-cli query lrc lfn my-logical-name-1 rls://localhost
my-logical-name-1: url-for-target-name-1
my-logical-name-1: url-for-target-name-2
        

3.5. Deleting replica location mappings

To remove unwanted replica location mappings from your RLS server, use the globus-rls-cli(1) delete command. The delete operation works directly on the mapping and indirectly on the logical and target names. When the delete operation is performed by the RLS server the association between the specified logical name and the specified target name is eliminated. However, there may still be other target names associated with the logical name, and there could still be other logical names associated with the target name, though the latter scenario is less likely. Only when all mapping associations for a given logical name (or a given target name) are eliminated (i.e., the specified logical name has no target names associated with it) will the logical (or target) name be deleted from the RLS server.

% $GLOBUS_LOCATION/bin/globus-rls-cli delete my-logical-name-1 url-for-target-name-1 rls://localhost
% $GLOBUS_LOCATION/bin/globus-rls-cli query lrc lfn my-logical-name-1 rls://localhost
my-logical-name-1: url-for-target-name-2
% $GLOBUS_LOCATION/bin/globus-rls-cli delete my-logical-name-1 url-for-target-name-2 rls://localhost
% $GLOBUS_LOCATION/bin/globus-rls-cli query lrc lfn my-logical-name-1 rls://localhost
globus_rls_client: LFN doesn't exist: my-logical-name-1
        

3.6. Using bulk operations

The globus-rls-cli(1) supports a variety of bulk operations that enhance productivity for users and reduce network connection overhead from making multiple, separate invocations of the client. The general pattern for bulk operation support as implemented by the client is a parameter list consisting of bulk command-name [command-modifiers] param-1 param-2 param-N, such as bulk query lrc lfn my-logical-name-1 my-logical-name-2 my-logical-name-3.

% $GLOBUS_LOCATION/bin/globus-rls-cli bulk create my-logical-name-1 url-for-target-name-1-1 my-logical-name-2 url-for-target-name-2-1 rls://localhost
% $GLOBUS_LOCATION/bin/globus-rls-cli bulk add my-logical-name-1 url-for-target-name-1-2 my-logical-name-2 url-for-target-name-2-2 rls://localhost
% $GLOBUS_LOCATION/bin/globus-rls-cli bulk query lrc lfn my-logical-name-1 my-logical-name-2 my-logical-name-3 rls://localhost
my-logical-name-3: LFN doesn't exist
my-logical-name-2: url-for-target-name-2-1
my-logical-name-2: url-for-target-name-2-2
my-logical-name-1: url-for-target-name-1-1
my-logical-name-1: url-for-target-name-1-2
        

3.7. Using interactive mode

The globus-rls-cli(1) supports an interactive mode in addition to the general command-line mode. To enter the interactive mode, simply invoke the client without any command.

% $GLOBUS_LOCATION/bin/globus-rls-cli rls://localhost
rls> query lrc lfn my-logical-name-2
my-logical-name-2: url-for-target-name-2-1
my-logical-name-2: url-for-target-name-2-2
rls> query lrc lfn my-logical-name-1
my-logical-name-1: url-for-target-name-1-1
my-logical-name-1: url-for-target-name-1-2
rls> bulk delete my-logical-name-1 url-for-target-name-1-1 my-logical-name-1 
url-for-target-name-1-2 my-logical-name-2 url-for-target-name-2-1 
my-logical-name-2 url-for-target-name-2-2
rls> bulk query lrc lfn my-logical-name-2 my-logical-name-1
my-logical-name-1: LFN doesn't exist
my-logical-name-2: LFN doesn't exist
rls> exit
        

4. Mapping replicas with WS Replica Location Service (WS RLS)

4.1. Create mappings

Use the globus-replicalocation-createmappings(1) tool to create mappings.

% $GLOBUS_LOCATION/bin/globus-replicalocation-createmappings \
  -s https://localhost:8443/wsrf/services/ReplicaLocationCatalogService \
  mydata1 gsiftp://path/a/to/mydata1
        

No output is expected from this command when successful.

4.2. Add mappings

Use the globus-replicalocation-addmappings(1) tool to add mappings.

% $GLOBUS_LOCATION/bin/globus-replicalocation-addmappings \
  -s https://localhost:8443/wsrf/services/ReplicaLocationCatalogService \
  mydata1 gsiftp://path/b/to/mydata1
        

No output is expected from this command when successful.

4.3. Define attribute definitions

Use the globus-replicalocation-defineattributes(1) tool to define attribute definitions.

% $GLOBUS_LOCATION/bin/globus-replicalocation-defineattributes \
  -s https://localhost:8443/wsrf/services/ReplicaLocationCatalogService \
  myattr1 logical string
        

No output is expected from this command when successful.

4.4. Add attributes

Use the globus-replicalocation-addattributes(1) tool to add attributes.

% $GLOBUS_LOCATION/bin/globus-replicalocation-addattributes \
  -s https://localhost:8443/wsrf/services/ReplicaLocationCatalogService \
  mydata1 myattr1 logical string attribute-value-goes-here
        

No output is expected from this command when successful.

4.5. Query mappings

Use the wsrf-query tool to query mappings.

% $GLOBUS_LOCATION/bin/wsrf-query \
  -s https://localhost:8443/wsrf/services/ReplicaLocationCatalogService \
  "query-target: mydata1" \
  "http://globus.org/replica/location/06/01/QueryDialect"
<ns1:MappingStatusType ns1:logical="mydata1" 
ns1:target="gsiftp://path/a/to/mydata1" 
xmlns:ns1="http://www.globus.org/namespaces/2005/08/replica/location"/>
<ns1:MappingStatusType ns1:logical="mydata1" 
ns1:target="gsiftp://path/b/to/mydata1" 
xmlns:ns1="http://www.globus.org/namespaces/2005/08/replica/location"/>
        

4.6. Query attributes

Use the wsrf-query tool to query attributes.

% $GLOBUS_LOCATION/bin/wsrf-query \
  -s https://localhost:8443/wsrf/services/ReplicaLocationCatalogService \
  "query-logical-attributes: mydata1" \
  "http://globus.org/replica/location/06/01/QueryDialect"
<ns1:AttributeStatusType ns1:key="mydata1" ns1:name="myattr1"
 ns1:objtype="logical" ns1:status="attributeExists" ns1:valtype="string"
 xmlns:ns1="http://www.globus.org/namespaces/2005/08/replica/location">
 <_value xmlns="">attribute-value-goes-here</_value>
</ns1:AttributeStatusType>
        

5. Managing and Transferring Batches of Replicas with Batch Replication Service

5.1. Replication request file

A key parameter for any replication request is the request file. The replication request file is a text file containing CRLF-terminated rows of tab-delimited pairs of Logical File Names (LFNs) and destination (URL) locations. An example of such a file is shown below.

% cat testrun.req
testrun-1      gsiftp://myhost:9001/sandbox/files/testrun-1
testrun-2      gsiftp://myhost:9001/sandbox/files/testrun-2
testrun-3      gsiftp://myhost:9001/sandbox/files/testrun-3
testrun-4      gsiftp://myhost:9001/sandbox/files/testrun-4
testrun-5      gsiftp://myhost:9001/sandbox/files/testrun-5
        
[Note]Note

The LFNs in the left column of the request file (e.g., testrun-1, testrun-2, and so on shown in the example) must be registered in the RLS catalog and indexed in the RLS index service. This typically involves using the add RLI command (e.g., globus-rls-admin -a rls-receiver-url rls-sender-url) to tell the RLS to send updates to another (or the same) RLS, and then the create command (e.g., globus-rls-cli create testrun-1 gsiftp://sourcehost:9001/path/to/testrun-1 rls-sender-url) to register the LFN at the RLS catalog service. For details see globus-rls-admin(1) and globus-rls-cli(1).
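Putting the note's two commands together for the request file above (using a single RLS server as both sender and receiver; replace rls://myhost with your server's URL):

% globus-rls-admin -a rls://myhost rls://myhost
% globus-rls-cli create testrun-1 gsiftp://sourcehost:9001/path/to/testrun-1 rls://myhost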

5.2. Create replication resource

The initial step for any replication is to create the replication resource. Creating the resource depends on the availability of a Batch Replicator service, a delegated credential, and a properly formatted replication request file. The replication request file must be specified by its URL. Currently supported URL schemes for the request file include file, http, and ftp. If the replication client is run local to the service, the file scheme is appropriate; if the client is remote, then the latter schemes must be used. It is good practice to specify a filename to save the replication resource's endpoint reference. The endpoint reference is required for all other operations on the resource, such as getting resource properties, starting/stopping, and destroying it. Numerous options are available to influence the behavior of the data replication activities (see globus-replication-create(1)). One option of particular interest is the --start option, which immediately starts the replication activities following creation of the replication resource. An example of using the globus-replication-create(1) tool is shown below.

% $GLOBUS_LOCATION/bin/globus-replication-create -s \
 https://myhost:8443/wsrf/services/ReplicationService \
 -C mycredential.epr -V myreplicator.epr file:///scratch/testrun.req
        

This command does not write to stdout when successful unless the --debug option is specified.

5.3. Start replication

Once a replication resource has been created, the replication activities may be started. As mentioned in Create replication resource, the replication may be started immediately after it is created. If the immediate start option is not specified, the globus-replication-start(1) tool must be used to start the replication.

% $GLOBUS_LOCATION/bin/globus-replication-start -e myreplicator.epr
        

No output is expected from this command when successful.

5.4. Get replication resource properties

Throughout the lifecycle of the replication resource and after its completion, it may be important to look up its resource properties. The standard WS-RF port types are supported, and the supplied tools (e.g., wsrf-get-property) may be used with the Batch Replicator and its resources.

% $GLOBUS_LOCATION/bin/wsrf-get-property -e myreplicator.epr \
 "{http://www.globus.org/namespaces/2005/05/replica/replicator}status"
<ns1:status xmlns:ns1="http://www.globus.org/namespaces/2005/05/replica/replicator">
Active</ns1:status>
        

And,

% $GLOBUS_LOCATION/bin/wsrf-get-property -e myreplicator.epr \ 
 "{http://www.globus.org/namespaces/2005/05/replica/replicator}count"
<ns1:count xmlns:ns1="http://www.globus.org/namespaces/2005/05/replica/replicator">
 <ns1:total>10</ns1:total>
 <ns1:finished>0</ns1:finished>
 <ns1:failed>0</ns1:failed>
 <ns1:terminated>0</ns1:terminated>
</ns1:count>
        

5.5. Find replication item status

Throughout the lifecycle of the replication resource and after its completion, it may be helpful to find individual replication items in the replication resource to inspect the detailed status of the replication activities. The globus-replication-finditems(1) tool is used to find replication items. The following example demonstrates finding a limited number of items, offset into the lookup result set, for a specified status.

% $GLOBUS_LOCATION/bin/globus-replication-finditems -e myreplicator.epr -S Pending -O 1 -L 2
<ns1:FindItemsResponse xmlns:ns1="http://www.globus.org/namespaces/2005/05/replica/replicator">
 <ns1:items xsi:type="ns1:ReplicationItemType" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <ns1:uri xsi:type="xsd:string" xmlns:xsd="http://www.w3.org/2001/XMLSchema">testrun-2</ns1:uri>
  <ns1:priority xsi:type="xsd:int" xmlns:xsd="http://www.w3.org/2001/XMLSchema">1000</ns1:priority>
  <ns1:status xsi:type="ns1:ReplicationItemStatusEnumerationType">Pending</ns1:status>
  <ns1:destinations xsi:type="ns1:DestinationType">
   <ns1:uri xsi:type="xsd:string" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
     gsiftp://myhost:9001/sandbox/files/testrun-2</ns1:uri>
   <ns1:status xsi:type="ns1:DestinationStatusEnumerationType">Pending</ns1:status>
  </ns1:destinations>
 </ns1:items>
 <ns1:items xsi:type="ns1:ReplicationItemType" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <ns1:uri xsi:type="xsd:string" xmlns:xsd="http://www.w3.org/2001/XMLSchema">testrun-3</ns1:uri>
  <ns1:priority xsi:type="xsd:int" xmlns:xsd="http://www.w3.org/2001/XMLSchema">1000</ns1:priority>
  <ns1:status xsi:type="ns1:ReplicationItemStatusEnumerationType">Pending</ns1:status>
  <ns1:destinations xsi:type="ns1:DestinationType">
   <ns1:uri xsi:type="xsd:string" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
     gsiftp://myhost:9001/sandbox/files/testrun-3</ns1:uri>
   <ns1:status xsi:type="ns1:DestinationStatusEnumerationType">Pending</ns1:status>
  </ns1:destinations>
 </ns1:items>
</ns1:FindItemsResponse>
        

5.6. Destroy replication resource

When the replication is complete, the replication resource may be destroyed. Destroying the replication resource helps to free up system resources (namely, memory), especially when the replication entailed a large number of individual replication activities (i.e., many files specified in the replication request file). The standard WS-RF port types are supported, and the supplied wsrf-destroy tool may be used to destroy the Batch Replicator resource.

% $GLOBUS_LOCATION/bin/wsrf-destroy -e myreplicator.epr
Destroy operation was successful
        

6. Managing and Transferring Replicas with the Replication Client

[fixme]

Chapter 4. Monitoring your GT services and the Grid

1. Querying the Index Service

To view the information contained in an Index Service, you can use either Java WS Core commands (such as wsrf-query) or WebMDS.

1.1. Simple usage

A typical example of using the default Index Service is with the wsrf-query Java WS Core command. For example:

$GLOBUS_LOCATION/bin/wsrf-query -s https://localhost:8443/wsrf/services/DefaultIndexService '/*'

displays all the resource properties collected by the default Index Service on your local host.

You can also use an XPath query to narrow your search, as well as other Java WS Core commands such as wsrf-get-property and wsrf-get-properties. For more information, review the Java WS Core User's Guide.
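For instance, a narrower XPath query might count the aggregated entries instead of dumping the whole document. This is a sketch; the element name Entry is an assumption about what your Index Service's resource property document contains, so adjust it to what your Index actually publishes:

$GLOBUS_LOCATION/bin/wsrf-query -s https://localhost:8443/wsrf/services/DefaultIndexService \
    "count(//*[local-name()='Entry'])"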

2. Using WebMDS

Once you've deployed the WebMDS servlet, simply point your web browser at http://your-tomcat-host:your-tomcat-port/webmds and click on the link labelled "A list of resources registered to the local default index service". For more information, see Chapter 2, Graphical User Interface.

For more detailed information about changing the look of WebMDS and more advanced configuration, see the WebMDS Admin Guide.

3. Triggering actions based on information gathered by Index Service

End-users will typically interact with the Trigger Service indirectly, using some mechanism specific to the triggered executable program (for example, an executable program may send mail to an end-user or write a structured log file that will later be read by some other program).

For more detailed information, see the Trigger Basic How To.

Chapter 5. Submitting jobs to a job scheduler.

1. Delegating credentials

There are three different uses of delegated credentials:

  1. for use by the MEJS to create a remote user proxy
  2. for use by the MEJS to contact RFT
  3. for use by RFT to contact the GridFTP servers.

The EPRs of these credentials are specified in three job description elements: jobCredentialEndpoint, stagingCredentialEndpoint, and transferCredentialEndpoint, respectively. Please see the Job Description Schema Reference and the RFT transfer request schema documentation for more details about these elements.

The globusrun-ws client can either delegate these credentials automatically for a particular job, or it can reuse pre-delegated credentials (see next paragraph) through the use of command-line arguments for specifying the credentials' EPR files. Please see the GRAM4 Commands for details on these command-line arguments.

It is possible to use delegation command-line tools to obtain and refresh delegated credentials in order to use them when submitting jobs to GRAM4. This, for instance, enables the submission of many jobs using a shared set of delegated credentials. This can significantly decrease the number of remote calls for a set of jobs, thus improving performance.

The following example shows how to delegate credentials. globus-credential-delegate delegates to the specified delegation factory on lucky0.mcs.anl.gov, prints some information, and stores the endpoint reference of the delegated credentials into the file delegCred.epr.

[martin@osg-test1 ~]$ globus-credential-delegate \
> -s https://lucky0.mcs.anl.gov:8443/wsrf/services/DelegationFactoryService \
> delegCred.epr
Delegated credential EPR:
Address: https://lucky0.mcs.anl.gov:8443/wsrf/services/DelegationService
Reference property[0]:
<ns1:DelegationKey xmlns:ns1="http://www.globus.org/08/2004/delegationService">
  55e2a450-58be-11dd-b83c-e4ec640dfe13
</ns1:DelegationKey>
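The saved EPR file can then be reused when submitting jobs. As a hedged sketch (see the GRAM4 Commands documentation for the authoritative flags; -Jf names the job-credential EPR file there):

[martin@osg-test1 ~]$ globusrun-ws -submit \
> -F https://lucky0.mcs.anl.gov:8443/wsrf/services/ManagedJobFactoryService \
> -Jf delegCred.epr \
> -c /bin/date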

To destroy the delegated credential use wsrf-destroy:

[martin@osg-test1 jobs]$  wsrf-destroy -e delegCred.epr 
Destroy operation was successful
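To extend the lifetime of a delegated credential instead of destroying it (for example, after running grid-proxy-init again), the delegation tools include globus-credential-refresh. A sketch, assuming it accepts the EPR file the same way globus-credential-delegate does (check globus-credential-refresh -help for the exact usage):

[martin@osg-test1 jobs]$ globus-credential-refresh delegCred.epr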

For more information about the delegation command-line tools, see Command-line tools.

2. Local resource managers interfaced by a GRAM4 installation

A GRAM4 instance can interface to more than one local resource manager (LRM), as the following subsections show. A user can explicitly specify which LRM should be used for a job, but in a larger Grid it might be confusing for users to remember which LRMs are available on which machines.

That is why GRAM4 configures a default local resource manager, which is used for job submission if the client does not explicitly specify one.

2.1. Finding available local resource managers

You can check the resource property availableLocalResourceManagers of a GRAM4 factory service to get that information. Replace host and port in the example below to query against other containers:

[martin@osg-test1 ~]$ globus-wsrf-get-property \
  -s https://osg-test1.unl.edu:8443/wsrf/services/ManagedJobFactoryService \
  "{http://www.globus.org/namespaces/2008/03/gram/job}availableLocalResourceManagers"

The result on that machine (formatted for better readability) shows that the local resource managers Fork, Multi, Condor, and PBS are available:

<ns1:availableLocalResourceManagers 
      xmlns:ns1="http://www.globus.org/namespaces/2008/03/gram/job">
  <ns1:localResourceManager>Fork</ns1:localResourceManager>
  <ns1:localResourceManager>Multi</ns1:localResourceManager>
  <ns1:localResourceManager>Condor</ns1:localResourceManager>
  <ns1:localResourceManager>PBS</ns1:localResourceManager>
</ns1:availableLocalResourceManagers>

A more typical result in a production environment is probably Fork, Multi and just one additional LRM like Condor, PBS or LSF.

2.2. Finding the default local resource manager

You can check the resource property localResourceManager of a GRAM4 factory service to get that information. Replace host and port in the example below to query against other containers:

[martin@osg-test1 ~]$ globus-wsrf-get-property \
  -s https://osg-test1.unl.edu:8443/wsrf/services/ManagedJobFactoryService \
  "{http://www.globus.org/namespaces/2008/03/gram/job}localResourceManager"

The result on that machine shows that PBS is the default local resource manager:

<ns1:localResourceManager xmlns:ns1="http://www.globus.org/namespaces/2008/03/gram/job">
    PBS
</ns1:localResourceManager>

3. Submitting Jobs Specified in JDD

3.1. Simple interactive job

Use the globusrun-ws program to submit a simple job without writing a job description document. With the -c argument, a job description is generated assuming that the first argument is the executable and the remaining ones are its arguments. For example:

% globusrun-ws -submit -c /bin/touch touched_it
Submitting job...Done.
Job ID: uuid:4a92c06c-b371-11d9-9601-0002a5ad41e5
Termination time: 04/23/2005 20:58 GMT
Current job state: Active
Current job state: CleanUp
Current job state: Done
Destroying job...Done.

Confirm on the server-side that the job worked by verifying the file was touched:

% ls -l ~/touched_it 
-rw-r--r--  1 smartin globdev 0 Apr 22 15:59 /home/smartin/touched_it

% date
Fri Apr 22 15:59:20 CDT 2005

Note: You did not tell globusrun-ws where to run your job, so the default of localhost was used.

Also note that globusrun-ws destroyed the job after it was fully processed.

We call this kind of job interactive because globusrun-ws does not return after submission. It subscribes to status update notifications for the job and informs the user about each status change as soon as it occurs. Once it gets the information that the job has been fully processed, it destroys the job, which means that internal state belonging to the job is cleaned up on the server side.

3.2. Streaming output

A user can request that the output of the program be sent back directly to the client as soon as it is available. This is useful if a user does not want to do additional file staging for a quick job. To enable this, specify the -s option.

[martin@osg-test1 ~]$ globusrun-ws -submit \
    -F https://lucky0.mcs.anl.gov:8443/wsrf/services/ManagedJobFactoryService \
    -s -c /bin/echo hello world!
Delegating user credentials...Done.
Submitting job...Done.
Job ID: uuid:1731f602-22fe-11dd-879c-0013d4c3b957
Termination time: 05/16/3008 04:10 GMT
Current job state: Active
Current job state: CleanUp-Hold
hello world!
Current job state: CleanUp
Current job state: Done
Destroying job...Done.
Cleaning up any delegated credentials...Done.

If you want the output of the job to be written to a local file instead of the terminal, you'll have to add the -so option:

[martin@osg-test1 ~]$ globusrun-ws -submit \
    -F https://lucky0.mcs.anl.gov:8443/wsrf/services/ManagedJobFactoryService \
    -s -so job.out -c /bin/echo hello world!
Delegating user credentials...Done.
Submitting job...Done.
Job ID: uuid:1731f602-22fe-11dd-879c-0013d4c3b957
Termination time: 05/16/3008 04:10 GMT
Current job state: Active
Current job state: CleanUp-Hold
Current job state: CleanUp
Current job state: Done
Destroying job...Done.
Cleaning up any delegated credentials...Done.

Check the output in the specified file:

[martin@osg-test1 ws-gram]$ cat job.out 
hello world!

Note that a GridFTP server must be running on the remote machine (lucky0) to enable streaming.

Note that streaming output adds some overhead to the submission and will probably be significantly slower compared to a job without streaming. An alternative to streaming is to use staging to transport the output of the executable back to the client. This, however, requires that a GridFTP server is running on the client machine.
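As a hedged sketch of that staging alternative (the fileStageOut structure follows our reading of the GRAM4 job description schema, and client.machine.org is a hypothetical host running a GridFTP server), a job description could ask GRAM4 to ship the output file back after the job completes:

<job>
    <executable>/bin/echo</executable>
    <argument>hello world!</argument>
    <stdout>${GLOBUS_USER_HOME}/job.out</stdout>
    <fileStageOut>
        <transfer>
            <!-- hypothetical hosts; both must run GridFTP servers -->
            <sourceUrl>gsiftp://lucky0.mcs.anl.gov:2811/${GLOBUS_USER_HOME}/job.out</sourceUrl>
            <destinationUrl>gsiftp://client.machine.org:2811/home/user/job.out</destinationUrl>
        </transfer>
    </fileStageOut>
</job>

See Section 3.7, "Job with staging", for the authoritative form.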

3.3. Using a contact string

Use globusrun-ws to submit the same touch job, but this time tell globusrun-ws to run the job on another machine (lucky0.mcs.anl.gov:8443). A GT4 server with GRAM4 installed must run on that machine and listen on port 8443.

% globusrun-ws -submit \
   -F https://lucky0.mcs.anl.gov:8443/wsrf/services/ManagedJobFactoryService \
   -c /bin/touch touched_it
Submitting job...Done.
Job ID: uuid:3050ad64-b375-11d9-be11-0002a5ad41e5
Termination time: 04/23/2005 21:26 GMT
Current job state: Active
Current job state: CleanUp
Current job state: Done
Destroying job...Done.

Type globusrun-ws -help to learn the details about the contact string.

3.4. Using a job description

A job to submit is specified by the user in a job description XML file.

Here is an example of a simple job description:

<job>
    <executable>/bin/echo</executable>
    <argument>this is an example_string </argument>
    <argument>Globus was here</argument>
    <stdout>${GLOBUS_USER_HOME}/stdout</stdout>
    <stderr>${GLOBUS_USER_HOME}/stderr</stderr>
</job>

Tell globusrun-ws to read the job description from a file, using the -f argument:

% bin/globusrun-ws -submit -f simple.xml
    Submitting job...Done.
    Job ID: uuid:c51fe35a-4fa3-11d9-9cfc-000874404099
    Termination time: 12/17/2004 20:47 GMT
    Current job state: Active
    Current job state: CleanUp
    Current job state: Done
    Destroying job...Done.
    

Note the use of the substitution variable ${GLOBUS_USER_HOME}, which resolves to the user's home directory.

Here is an example with more job description parameters:

<?xml version="1.0" encoding="UTF-8"?>
<job>
    <executable>/bin/echo</executable>
    <directory>/tmp</directory>
    <argument>12</argument>
    <argument>abc</argument>
    <argument>34</argument>
    <argument>this is an example_string </argument>
    <argument>Globus was here</argument>
    <environment>
        <name>PI</name>
        <value>3.141</value>
    </environment>
    <stdin>/dev/null</stdin>
    <stdout>stdout</stdout>
    <stderr>stderr</stderr>
    <count>2</count>
</job>

Note that in this example, the <directory> element sets the working directory for the command on the execution machine to /tmp, and the standard output is specified as the relative path stdout. The output is therefore written to /tmp/stdout:

% cat /tmp/stdout
    12 abc 34 this is an example_string  Globus was here
    

3.5. Using a contact string in the job description

Instead of specifying the contact string on the command-line, you can also put it in the job description:

<job xmlns:wsa="http://www.w3.org/2005/08/addressing">
    <factoryEndpoint>
      <wsa:Address>
          https://osg-test1.unl.edu:8443/wsrf/services/ManagedJobFactoryService
      </wsa:Address>
    </factoryEndpoint>
    <executable>/bin/date</executable>
</job>

Submit the job with the following command (assuming the above description has been stored in the file job.xml):

% bin/globusrun-ws -submit -f job.xml
Note: This time you don't have to specify the -F option.

3.6. Specifying a local resource manager

Note that so far you have not specified any local resource manager information. If a user does not specify one, the job is run by the default local resource manager defined on the server side. If, for example, an administrator configured Condor as the default local resource manager, then the jobs submitted so far will be managed by Condor on the server side.

Check the section Local resource managers interfaced by a GRAM4 installation to find out which local resource managers are available in a GRAM4 installation and which one is configured as the default.
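
As a quick check, you can also query the factory's resource properties directly. The following sketch uses the wsrf-query client and assumes the relevant resource property is named availableLocalResourceManagers (adjust the factory URL for your installation):

wsrf-query -s https://lucky0.mcs.anl.gov:8443/wsrf/services/ManagedJobFactoryService \
    "//*[local-name()='availableLocalResourceManagers']"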

3.6.1. Submitting to the default local resource manager

If you want to submit a job to the default local resource manager, simply do not specify any local resource manager in your submission, neither in the job description nor on the command line. The examples above show how to do this.

3.6.2. Submitting to a non-default local resource manager

If you want to submit a job to a non-default local resource manager, or if you just want to be explicit, you must specify the local resource manager in your submission. Using globusrun-ws, there are two ways to do so:

  • as a command-line argument of globusrun-ws (-Ft <lrm>)
  • in the factoryEndpoint element in the job description

Example: the following job will be submitted to Condor:

globusrun-ws -submit \
  -F osg-test1.unl.edu:8443/wsrf/services/ManagedJobFactoryService \
  -Ft Condor \
  -c /bin/date

Or with a job description that contains a factoryEndpoint:

<job xmlns:wsa="http://www.w3.org/2005/08/addressing"
    xmlns:gram="http://www.globus.org/namespaces/2008/03/gram/job">
    <factoryEndpoint>
      <wsa:Address>
          https://osg-test1.unl.edu:8443/wsrf/services/ManagedJobFactoryService
      </wsa:Address>
      <wsa:ReferenceParameters>
        <gram:ResourceID>Condor</gram:ResourceID>
      </wsa:ReferenceParameters>
    </factoryEndpoint>
    <executable>/bin/date</executable>
</job>

Submit that job (assuming the description is stored in the file myJob.xml):

globusrun-ws -submit -f myJob.xml

3.7. Job with staging

In order to do file staging, one must add specific elements to the job description and delegate credentials appropriately (see Section 2, “Delegating credentials”). The file transfer directives follow the RFT syntax, which allows only third-party transfers. Each file transfer must therefore specify a source URL and a destination URL. URLs are specified as GridFTP URLs (for remote files) or as file URLs (for files local to the service; these are converted internally to full GridFTP URLs by the service).

For instance, in the case of staging a file in, the source URL would be a GridFTP URL (for instance gsiftp://job.submitting.host:2811/tmp/mySourceFile) resolving to a source document accessible on the file system of the job submission machine (for instance /tmp/mySourceFile). At run time, the Reliable File Transfer service used by the MEJS on the remote machine would reliably fetch the remote file using the GridFTP protocol and write it to the specified local file (for instance file:///${GLOBUS_USER_HOME}/my_transfered_file, which resolves to ~/my_transfered_file). Here is what the stage-in directive would look like:

<fileStageIn>
    <transfer>
        <sourceUrl>gsiftp://job.submitting.host:2811/tmp/mySourceFile</sourceUrl>
        <destinationUrl>file:///${GLOBUS_USER_HOME}/my_transfered_file</destinationUrl>
    </transfer>
</fileStageIn>

Note: additional RFT-defined quality of service requirements can be specified for each transfer. See the RFT documentation for more information.

Here is an example job description with file stage-in and stage-out:

<job>
    <executable>my_echo</executable>
    <directory>${GLOBUS_USER_HOME}</directory>
    <argument>Hello</argument>
    <argument>World!</argument>
    <stdout>${GLOBUS_USER_HOME}/stdout</stdout>
    <stderr>${GLOBUS_USER_HOME}/stderr</stderr>
    <fileStageIn>
        <transfer>
            <sourceUrl>gsiftp://job.submitting.host:2811/bin/echo</sourceUrl>
            <destinationUrl>file:///${GLOBUS_USER_HOME}/my_echo</destinationUrl>
        </transfer>
    </fileStageIn>
    <fileStageOut>
        <transfer>
            <sourceUrl>file:///${GLOBUS_USER_HOME}/stdout</sourceUrl>
            <destinationUrl>gsiftp://job.submitting.host:2811/tmp/stdout</destinationUrl>
        </transfer>
    </fileStageOut>
    <fileCleanUp>
        <deletion>
            <file>file:///${GLOBUS_USER_HOME}/my_echo</file>
        </deletion>
    </fileCleanUp>
</job>

Note that the job description XML does not need to include a reference to the schema that describes its syntax. In fact, it is possible to omit the namespace on the GRAM job description XML elements as well. The submission of this job to the GRAM services causes the following sequence of actions:

  1. The /bin/echo executable is transferred from the submission machine to the GRAM host file system. The destination location is the HOME directory of the user on whose behalf the job is executed by the GRAM services (see <fileStageIn>).
  2. The transferred executable is used to print a test string (see the <executable>, <directory> and <argument> elements) on the standard output, which is redirected to a local file (see <stdout>).
  3. The standard output file is transferred to the submission machine (see <fileStageOut>).
  4. The file that was initially transferred during the stage-in phase is removed from the file system of the GRAM installation (see <fileCleanUp>).

Submit that job (assuming the description is stored in the file myJob.xml):

globusrun-ws -submit -S -f myJob.xml

The -S flag tells globusrun-ws to delegate credentials so that GRAM4 can call the file transfer service RFT on behalf of the submitting user, and so that RFT can interact with the GridFTP servers on behalf of the submitting user.

If you have already delegated credentials (see Delegating credentials for how to do this), have the endpoint reference of the delegated credential stored in the file delegCred.epr, and want it to be used for the transfers instead of having globusrun-ws delegate new credentials, you can tell globusrun-ws to use your credential:

globusrun-ws -submit -Sf delegCred.epr -Tf delegCred.epr -f myJob.xml

The -Sf flag specifies the credential to be used by GRAM4 to call RFT on behalf of the user; the -Tf flag specifies the credential to be used by RFT to interact with the GridFTP servers.

3.8. Specifying a local user id in the job description

If a user has more than one user account on a server and the distinguished name (DN) of the user's certificate is mapped to all of these accounts, the user can specify which local account GRAM4 should use for the job submission. By default the first local user account that is defined is used. If this is not the desired account, the user must explicitly specify the one to use. The following dummy job description shows how to do this:

<job>
    <localUserId>stu</localUserId>
    <executable>/bin/date</executable>
    <stdout>${GLOBUS_USER_HOME}/stdout</stdout>
    <stderr>${GLOBUS_USER_HOME}/stderr</stderr>
</job>
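
For illustration, a grid map file entry that maps one DN to several local accounts might look like the following line (the DN and account names are hypothetical). With such an entry, GRAM4 would use stu by default; the <localUserId> element above selects it explicitly:

"/C=US/O=Example/OU=People/CN=Stu Martin" stu,martin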

3.9. Using substitution variables

To allow values such as paths to be customized on a per-job basis, the job description substitution variable GLOBUS_JOB_ID can be used.

For example:

<job>
    <executable>/bin/date</executable>
    <stdout>/tmp/stdout.${GLOBUS_JOB_ID}</stdout>
    <stderr>/tmp/stderr.${GLOBUS_JOB_ID}</stderr>
    <fileStageOut>
        <transfer>
            <sourceUrl>file:///tmp/stdout.${GLOBUS_JOB_ID}</sourceUrl>
            <destinationUrl>gsiftp://mymachine.mydomain.com/out.${GLOBUS_JOB_ID}</destinationUrl>
        </transfer>
    </fileStageOut>
</job>

More information about substitution variables can be found here.

3.10. Using custom job description extensions

Basic support is provided for specifying custom extensions to the job description. There are plans to improve the usability of this feature, but at this time it involves a bit of work.

Specifying the actual custom elements in the job description is trivial: simply add any elements you need between the opening and closing extensions tags at the bottom of the job description, as in the following basic example:

    <job>
        <executable>/home/user1/myapp</executable>
        <extensions>
            <myData>
                <flag1>on</flag1>
                <flag2>off</flag2>
            </myData>
        </extensions>
    </job>
    

To handle this data, you will have to alter the appropriate Perl scheduler script (e.g., $GLOBUS_LOCATION/lib/perl/Globus/GRAM/JobManager/fork.pm for the Fork scheduler) to parse the data returned by the $description->extensions() subroutine.

For more information about extensions see the Extensions section.

3.11. Multi-Job

The job description XML schema allows for the specification of a multijob, i.e., a job that is itself composed of several executable jobs, which we will refer to as subjobs (note: subjobs cannot themselves be multijobs, so the structure is not recursive). This is useful, for instance, to bundle a group of jobs together and submit them as a whole to a remote GRAM installation.

Note that no relationship can be specified between the subjobs of a multijob. The subjobs are submitted to job factory services in their order of appearance in the multijob description.

Within a multijob description, each subjob description must include an endpoint for the factory to submit the subjob to. This enables several jobs to be submitted at once to different hosts. The factory to which the multijob is submitted acts as an intermediary tier between the client and the eventual executable job factories.

Here is an example of a multijob description:

<?xml version="1.0" encoding="UTF-8"?>
<multiJob xmlns:wsa="http://www.w3.org/2005/08/addressing">
    <factoryEndpoint>
       <wsa:Address>
          https://localhost:8443/wsrf/services/ManagedJobFactoryService
      </wsa:Address>
    </factoryEndpoint>
    <directory>${GLOBUS_LOCATION}</directory>
    <count>1</count>

    <job>
       <factoryEndpoint>
         <wsa:Address>https://localhost:8443/wsrf/services/ManagedJobFactoryService</wsa:Address>
       </factoryEndpoint>
       <executable>/bin/date</executable>
       <stdout>${GLOBUS_USER_HOME}/stdout.p1</stdout>
       <stderr>${GLOBUS_USER_HOME}/stderr.p1</stderr>
       <count>2</count>
    </job>

    <job>
       <factoryEndpoint>
         <wsa:Address>https://localhost:8443/wsrf/services/ManagedJobFactoryService</wsa:Address>
       </factoryEndpoint>
       <executable>/bin/echo</executable>
       <argument>Hello World!</argument>        
       <stdout>${GLOBUS_USER_HOME}/stdout.p2</stdout>
       <stderr>${GLOBUS_USER_HOME}/stderr.p2</stderr>
       <count>1</count>
    </job>
</multiJob>

Submit the multi-job with the following command:

% bin/globusrun-ws -submit -f test_multi.xml
    Delegating user credentials...Done.
    Submitting job...Done.
    Job ID: uuid:bd9cd634-4fc0-11d9-9ee1-000874404099
    Termination time: 12/18/2004 00:15 GMT
    Current job state: Active
    Current job state: CleanUp
    Current job state: Done
    Destroying job...Done.
    Cleaning up any delegated credentials...Done.
Note: When you submit a multi-job you do not have to specify the local resource manager, though you may. The fact that it is a multi-job is detected on the server side, and the special "local resource manager" Multi is used automatically.

Note: In this multi-job description the sub-jobs are submitted to the default local resource manager. If you want them to be submitted to a non-default local resource manager, specify that in an additional ReferenceParameters element in the factoryEndpoint element of each sub-job, as in the sketch below. See here for more information about this.
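
For example, a sub-job bound to a hypothetical PBS local resource manager would carry its own ReferenceParameters. This sketch assumes the gram namespace is declared on the multiJob root element, as in the single-job example of Section 3.6.2:

<job>
   <factoryEndpoint>
     <wsa:Address>https://localhost:8443/wsrf/services/ManagedJobFactoryService</wsa:Address>
     <wsa:ReferenceParameters>
       <gram:ResourceID>PBS</gram:ResourceID>
     </wsa:ReferenceParameters>
   </factoryEndpoint>
   <executable>/bin/date</executable>
</job>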

A multijob resource is created by the factory and exposes a set of WSRF resource properties different from the resource properties of an executable job. The state machine of a multijob is also different, since the multijob represents the overall execution of all the executable jobs it is composed of.

4. Submitting jobs with metascheduling functionality

Check GT 4.2.1 GridWay: User's Guide if you are looking for information on metascheduling functionality.

Appendix A. Globus Toolkit 4.2.1 Public Interface Guides

This page contains links to each GT 4.2.1 component's Public Interfaces Guide.

Appendix B. Globus Toolkit 4.2.1 Errors

Table B.1. Java WS Core Errors

Error Code | Definition | Possible Solutions
Failed to acquire notification consumer home instance from registry | Caused by a javax.naming.NameNotFoundException: Name services is not bound in this Context error. | Please see Running client programs from any directory if a client fails with this error.
The WS-Addressing 'To' request header is missing | This warning is logged by the container if the request did not contain the necessary WS-Addressing headers. The client either did not attempt to send those headers at all or is somehow misconfigured. | If you are using a Java client and launching it directly using the java executable, take a look at Appendix B, Running client programs from any directory.
java.io.IOException: Token length X > 33554432 | If you see this error in the container log, it usually means you are trying to connect to an HTTPS server using HTTP. For example, the service address specifies 8443 as the port number and http as the protocol name. | In general, use port 8443 with the https protocol and port 8080 with the http protocol.
java.lang.NoSuchFieldError: DOCUMENT | This error usually indicates a mismatch between the version of Apache Axis that the code was compiled with and the version of Axis that the code is currently running with. | Make sure that the same version of Axis is used at compile time and at runtime.
org.globus.wsrf.InvalidResourceKeyException: Argument key is null / Resource key is missing | These errors usually indicate that a resource key was not passed with the request or that an invalid resource key was passed with the request (that is, the element QName of the resource key did not match what the service expected). | Make sure that the EPR used to invoke the service contains the appropriate resource key. If you are using a command-line tool, make sure to specify the resource key using the -k option or pass a complete EPR from a file using the -e option.
Unable to connect to localhost:xxx | Cannot resolve localhost. The machine's /etc/hosts isn't set up correctly and/or you do not have DNS for these machines. | There should always be an entry in /etc/hosts (or /etc/hostname/ on Debian) for localhost in the following format (IP address/fully qualified domain name/short name):
140.221.8.109   cognito.mcs.anl.gov cognito
org.globus.common.ChainedIOException: Failed to initialize security context | This may indicate that the user's proxy is invalid. | To correct the error, the user must properly initialize the user proxy. See grid-proxy-init for more information on proxy initialization.
Error: org.xml.sax.SAXException: Unregistered type: class xxx | This may indicate that an Axis-generated XML type, defined by the WS RLS XSD, was not properly registered. While all the XML types should get registered upon deployment without intervention by the user, sometimes they do not. | To remedy the situation, add a typeMapping to the server-config.wsdd file under globus_wsrf_replicalocation_service. Use the format shown here.
No socket factory for 'https' protocol

When a client fails with the following exception:

 java.io.IOException: No socket factory for 'https' protocol at
        org.apache.axis.transport.http.HTTPSender.getSocket(HTTPSender.java:179) at
        org.apache.axis.transport.http.HTTPSender.writeToSocket(HTTPSender.java:397) at
        org.apache.axis.transport.http.HTTPSender.invoke(HTTPSender.java:135)

This typically happens when the client has not registered the Globus HTTPS transport with Axis, for example because the GT client configuration (client-config.wsdd) was not on the client's classpath.

Add the following to the client:

 import org.globus.axis.util.Util;
 ...
 static {
     Util.registerTransport();
 }
No client transport named 'https' found

When a client fails with the following exception:

No client transport named 'https' found at
        org.apache.axis.client.AxisClient.invoke(AxisClient.java:170) at
        org.apache.axis.client.Call.invokeEngine(Call.java:2726)

The client is most likely loading an incorrect client-config.wsdd configuration file.

Ensure that the GT4 installation directory is listed as the first entry in the CLASSPATH of the client. For example:

CLASSPATH=/usr/local/globus-4.2.0:/foo/bar/others.jar:...

If you are seeing this problem in Tomcat, copy the client-config.wsdd from the GT4 installation directory to the Web application's WEB-INF/classes directory.
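
For example, assuming client-config.wsdd sits at the top of the GT installation directory and a hypothetical web application named myapp:

cp $GLOBUS_LOCATION/client-config.wsdd $CATALINA_HOME/webapps/myapp/WEB-INF/classes/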

ConcurrentModificationException in Tomcat 5.0.x

If the following exception is visible in the Tomcat logs at startup, it might cause the HTTPSValve to fail:

java.util.ConcurrentModificationException at
        java.util.HashMap$HashIterator.nextEntry(HashMap.java:782) at
        java.util.HashMap$EntryIterator.next(HashMap.java:824) at
        java.util.HashMap.putAllForCreate(HashMap.java:424) at
        java.util.HashMap.clone(HashMap.java:656) at
        mx4j.server.DefaultMBeanRepository.clone(DefaultMBeanRepository.java:56)

The HTTPSValve might fail with the following exception:

java.lang.NullPointerException at
        org.apache.coyote.tomcat5.CoyoteRequest.setAttribute(CoyoteRequest.java:1472) at
        org.apache.coyote.tomcat5.CoyoteRequestFacade.setAttribute(CoyoteRequestFacade.java:351) at
        org.globus.tomcat.coyote.valves.HTTPSValve.expose(HTTPSValve.java:99)

These exceptions will prevent the transport security from working properly in Tomcat.

This is a Tomcat bug. Keep restarting Tomcat until it starts without the ConcurrentModificationException or switch to a different version of Tomcat.

java.net.SocketException: Invalid argument or cannot assign requested address

This error is often seen when the JVM attempts to use IPv6 on a host whose IPv6 networking is not fully configured.

If you see the java.net.SocketException: Invalid argument or cannot assign requested address error in the container log or on the client side, try setting the following property:

 $ export GLOBUS_OPTIONS="-Djava.net.preferIPv4Stack=true"
GAR deploy/undeploy fails with container is running error

A GAR file can only be deployed or undeployed locally while the container is off. However, GAR deployment/undeployment might still sometimes fail with this error even if the container is off. This usually happens if the container crashed or was stopped improperly, preventing it from cleaning up its state files.

To resolve this problem, delete any files under the $GLOBUS_LOCATION/var/state directory and try to deploy/undeploy the GAR file again.
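
A minimal sketch of the recovery, assuming a default GT layout and a hypothetical my_service.gar (make sure no container is running first):

rm -rf "$GLOBUS_LOCATION"/var/state/*
globus-deploy-gar my_service.gar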

Table B.2. C WS Core Errors

Error Code | Definition | Possible Solutions
globus_soap_message_module: Failed sending request http://widgets.com/WidgetPortType/createWidgetRequest. globus_xio: Unable to connect to grid.example.org:8080 globus_xio: System error in connect: Connection refused globus_xio: A system call failed: Connection refused | Unable to contact the service container. | Check that the service endpoint refers to a running container.
globus_soap_message_module: Failed sending request http://widgets.com/WidgetPortType/createWidgetRequest. globus_xio_gsi: gss_init_sec_context failed. GSS Major Status: Unexpected Gatekeeper or Service Name globus_gsi_gssapi: Authorization denied: The name of the remote entity (/C=US/O=Globus Alliance/OU=Service/CN=host/grid.example.org), and the expected name for the remote entity (/C=US/O=Globus Alliance/OU=Service/CN=host/cloud.example.org) do not match | Service is not running with the expected security credential. | Verify that the service credential being presented by the service (the first parenthesized name) is a reasonable certificate name for the service. If so, set the GLOBUS_SOAP_MESSAGE_PEER_IDENTITY_KEY attribute on the SOAP message handle to that identity. For most command-line wsrf tools, this can be done by passing it as an argument to the -z command-line parameter.
globus_soap_message_module: SOAP Fault Fault code: Client Fault string: globus_service_engine_module: Failed to find operation: {XXXX}YYYY for service: {ZZZZ}BBBB | The service port type {ZZZZ}BBBB does not contain a {XXXX}YYYY operation. | Verify that the client bindings are built from the same WSDL and XML Schema documents as the service.
globus_soap_message_module: Failed receiving response http://widgets.com/WidgetPortType/createWidgetResponse. ws_addressing: Addressing header is a draft version of WS Addressing: "http://schemas.xmlsoap.org/ws/2004/03/addressing". This could be a GT version mismatch, client is GT 4.2.x and response is from GT 4.0.x server | The service is running in a container that uses a draft version of the WS-Addressing specification, which was used by GT 4.0.x. | Update the service to work with GT 4.2.x or compile your client with GT 4.0.x libraries.
globus_soap_message_module: Failed sending request http://widgets.com/WidgetPortType/createWidgetRequest. globus_xio: The GSI XIO driver failed to establish a secure connection. The failure occured during a handshake read. globus_xio: An end of file occurred | The service container either did not support SSL authentication, or the service container did not trust the client certificate. | Consult the service administrator to verify that the service container supports SSL and that your certificate is issued by a certificate authority trusted by the service.

Table B.3. XIO Errors

Error Code | Definition | Possible Solutions
Operation was canceled | An I/O operation has been canceled by a close or a cancel. | In most cases this is intentionally performed by the application developer. In unexpected cases, the application developer should verify that there is not a race condition relating to closing a handle.
Operation timed out | Occurs when the application developer associates a timeout with a handle's I/O operations. If no I/O is performed before the timeout expires, this error is triggered. | The remote side of the connection might be hung or busy. The network could have higher latencies than expected. The filesystem might be overworked.
An end of file occurred | This occurs when an EOF is detected on the file descriptor. | When doing file I/O this likely means you read to the end of the file, and thus you are finished and should now close it. On network connections, however, it means the socket was closed on the remote end. This can happen if the remote side suddenly dies (a seg-fault is common here) or if the remote side chooses to close the connection.
Contact string invalid | A poorly formed contact string was passed in to open. | Verify the format of the contact string against the documentation of the drivers in use.
Memory allocation failed on XXXX | malloc failed; the system is likely quite overloaded. | Free up memory in your application.
System error in XXXX | A low-level system error occurred. | The errno and error string should indicate more information.
Invalid stack | The requested stack does not meet XIO standards. | Most likely a transport driver is not on the bottom of the stack, or two transport drivers are in the stack.
Operation already registered | With certain common drivers like TCP and FILE, only one operation of each type can be registered at a time (1 read, 1 write). If another operation of the same type is posted to the handle before the previous operation's callback is received, this error can occur. | Restructure the application code so that it waits for the callback before registering the next I/O operation.
Unexpected state | The internal logic of XIO came across a logical path that should not be possible. Often this is due to application memory corruption or trying to perform an I/O operation on a closed or otherwise invalid handle. | Use valgrind or some other memory management tool to verify there is no memory corruption. Try to recreate the problem in a small program. Submit the program and the memory trace at bugzilla.globus.org.
Driver in handle has been unloaded | A driver associated with the offending operation has already been unloaded by the application code. | Verify that you are not unloading drivers while they are still in use.
Module not activated | globus_module_activate(GLOBUS_XIO_MODULE) has not been called. | Call it before making any other XIO API calls.

Table B.4. Credential Errors

Error Code | Definition | Possible Solutions
Your proxy credential may have expired | Your proxy credential may have expired. | Use grid-proxy-info to check whether the proxy credential has actually expired. If it has, generate a new proxy with grid-proxy-init.
The system clock on either the local or remote system is wrong | This may cause the server or client to conclude that a credential has expired. | Check the system clocks on the local and remote system.
Your end-user certificate may have expired | Your end-user certificate may have expired. | Use grid-cert-info to check your certificate's expiration date. If it has expired, follow your CA's procedures to get a new one.
The permissions may be wrong on your proxy file | If the permissions on your proxy file are too lax (for example, if others can read your proxy file), Globus Toolkit clients will not use that file to authenticate. | You can "fix" this problem by changing the permissions on the file or by destroying it (with grid-proxy-destroy) and creating a new one (with grid-proxy-init).

Important: However, it is still possible that someone else has made a copy of that file during the time that the permissions were wrong. In that case, they will be able to impersonate you until the proxy file expires or your permissions or end-user certificate are revoked, whichever happens first.

The permissions may be wrong on your private key file | If the permissions on your end-user certificate private key file are too lax (for example, if others can read the file), grid-proxy-init will refuse to create a proxy certificate. | You can "fix" this by changing the permissions on the private key file.

Important: However, you will still have a much more serious problem: it is possible that someone has made a copy of your private key file. Although this file is encrypted, it is possible that someone will be able to decrypt the private key, at which point they will be able to impersonate you as long as your end user certificate is valid. You should contact your CA to have your end-user certificate revoked and get a new one.

The remote system may not trust your CA | The remote system may not trust your CA. | Verify that the remote system is configured to trust the CA that issued your end-entity certificate. See Installing GT 4.2.1 for details.
You may not trust the remote system's CA | You may not trust the remote system's CA. | Verify that your system is configured to trust the remote CA (or that your environment is set up to trust the remote CA). See Installing GT 4.2.1 for details.
There may be something wrong with the remote service's credentials | There may be something wrong with the remote service's credentials. | It is sometimes difficult to distinguish between errors reported by the remote service regarding your credentials and errors reported by the client interface regarding the remote service's credentials. If you cannot find anything wrong with your credentials, check for the same conditions on the remote system (or ask a remote administrator to do so).

Table B.5. Gridmap Errors

Error Code | Definition | Possible Solutions
The content of the grid map file does not conform to the expected format | The content of the grid map file does not conform to the expected format. | Run grid-mapfile-check-consistency to make sure that your gridmap file conforms to the expected format.
The grid map file does not contain an entry for your DN | The grid map file does not contain an entry for your DN. | Use grid-mapfile-add-entry to add the relevant entry.

Table B.6. Java WS A&A Errors

Error Code | Definition | Possible Solutions
[JWSSEC-248] Secure container requires valid credentials | This error occurs when globus-start-container is run without any valid credentials. Either a proxy certificate or a service/host certificate needs to be configured for the container to start up.
  1. If you do not want to start a container that uses GSI Secure Transport (used by the container by default), use globus-start-container -nosec. You will be able to use insecure clients and services. However, this also implies that if you have not configured individual services with credentials, you will not be able to access those services securely.

  2. If you are running a personal container, generate a proxy certificate with grid-proxy-init. If the proxy certificate is not in the default location, configure the container security descriptor as described in Configuring Container Security Descriptor.

  3. If you want to use host certificates, configure the container security descriptor as described Configuring Credentials.

Failed to start container: Container failed to initialize [Caused by: [JWSSEC-250] Failed to load certificate/key file] | This error occurs if the configured file path to the container certificate and key is invalid.
  1. The path to the container certificate and key is configured in $GLOBUS_LOCATION/etc/globus_wsrf_core/global_security_descriptor.xml. This file is loaded as described [here - fixme link]. Ensure that the path is correct.

Failed to start container: Container failed to initialize [Caused by: [JWSSEC-249] Failed to load proxy file] | This error occurs if the configured container proxy file is invalid.
  1. The path to the container proxy certificate is configured in $GLOBUS_LOCATION/etc/globus_wsrf_core/global_security_descriptor.xml. This file is loaded as described [here - fixme link]. Ensure that the path is correct.

Failed to start container: Container failed to initialize [Caused by: [JWSSEC-245] Error parsing file: "etc/globus_wsrf_core/global_security_descriptor.xml" [Caused by: ...] | This error occurs if the configured container security descriptor is invalid.
  1. The container security descriptor should conform to the Container Security Descriptor Schema.

  2. Refer to the "Caused by: " section for details on the specific element that is not correct.

[JGLOBUS-77] Unknown CA | This error occurs if the CA certificate for the credentials being used is not installed correctly.
  1. If this issue occurs on the server side, the container is not configured with CA certificates. The container looks for trusted certificates in the default location as described in the Java CoG Toolkit FAQ.

  2. On the server side, the trusted certificates can be configured as described in Trusted Certificates

  3. On the client side, trusted certificates can be configured as described in Configuring Trusted Credentials

Table B.7. GridShib Errors

Error Code | Definition | Possible Solutions
error1 | description1 | solutions or links to solutions

Table B.8. MyProxy Errors

Error Code | Definition | Possible Solutions
MyProxy server name does not match expected name

This error appears as a mutual authentication failure or a server authentication failure, and the error message should list two names: the expected name of the MyProxy server and the actual authenticated name.

By default, the MyProxy clients expect the MyProxy server to be running with a host certificate that matches the target hostname. This error can occur when running the MyProxy server under a non-host certificate or if the server is running on a machine with multiple hostnames.

The MyProxy clients authenticate the identity of the MyProxy server to avoid sending passphrases and credentials to rogue servers.

If the expected name contains an IP address, your system is unable to do a reverse lookup on that address to get the canonical hostname of the server, indicating either a problem with that machine's DNS record or a problem with the resolver on your system.

If the server name shown in the error message is acceptable, set the MYPROXY_SERVER_DN environment variable to that name to resolve the problem.
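
For example (the DN shown is hypothetical; use the exact name reported in the error message):

export MYPROXY_SERVER_DN="/C=US/O=Example/OU=Services/CN=host/myproxy.example.org"
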
Error in bind(): Address already in use | This error indicates that the myproxy-server port (default: 7512) is in use by another process, probably another myproxy-server instance. | You cannot run multiple instances of the myproxy-server on the same network port. If you want to run multiple instances of the myproxy-server on a machine, you can specify different ports with the -p option, and then give the same -p option to the MyProxy commands to tell them to use the myproxy-server on that port.
grid-proxy-init failed | This error indicates that the grid-proxy-init command failed when myproxy-init attempted to run it, which implies a problem with the underlying Globus installation. | Run
grid-proxy-init -debug -verify
for more information.
User not authorized | An error from the myproxy-server saying you are "not authorized" to complete an operation typically indicates that the myproxy-server.config file settings are restricting your access to the myproxy-server. | It is possible that the myproxy-server is running with the default myproxy-server.config file, which does not authorize any operations. See Configuring for more information.

Table B.9. GSI-OpenSSH Errors

Error Code | Definition | Possible Solutions
GSS-API error Failure acquiring GSSAPI credentials: GSS_S_CREDENTIALS_EXPIRED | This means that your proxy certificate has expired. | Run grid-proxy-init to acquire a new proxy certificate, then run gsissh again.
...no proxy credentials... | Failing to run grid-proxy-init to create a user proxy with which to connect will result in the client notifying you that no local credentials exist. Any attempt to authenticate using GSI will fail in this case. | Verify that your GSI proxy has been properly initialized via grid-proxy-info. If you need to initialize the proxy, use the command grid-proxy-init.
...bad file system permissions on private key; key must only be readable by the user... | The host key that the SSH server is using for GSI authentication must only be readable by the user which owns it. Any other permissions will cause this error. | Make sure that the host key's UNIX permissions are mode 400 (that is, it should only be readable by the user that owns the file, and no other mode bits should be set).
...gssapi received empty username; failed to set username from gssapi context; Failed external-keyx for <user> from <host> <port>... | If the server was passed an "implicit username" (i.e., requested to map the incoming connection to a username based on contextual clues such as the certificate's subject), and no entry exists in the grid-mapfile for the incoming connection's certificate subject, the server outputs a clue that it is unable to set the username against which to authenticate. | Add an entry for the user to the [grid-mapfile fixme link].
...INTERNAL ERROR: authenticated invalid user xxx... | If the subject name given in the system's grid-mapfile points to a non-existent user, the server will give an internal error, which is best caught when it is running in debugging mode. | Add a new account to the system matching the username pointed at by the user's subject in the grid-mapfile.
...gssapi received empty username; no suitable client data; failed to set username from gssapi context; Failed external-keyx for <user> from <host> <port>... | Should the user attempt to connect without first creating a proxy certificate, or if the user is connecting via an SSH client that does not support GSI authentication, the server will note that no GSSAPI data was sent to it. | Verify that the client is able to connect through another GSI service (such as the gatekeeper) to make sure that the user's proxy has been created correctly. Verify that you are using a GSI-enabled SSH client and that your GSI proxy has been properly initialized via grid-proxy-info. If you need to initialize this proxy, use the command grid-proxy-init.

Table B.10. GridFTP Errors

Error Code | Definition | Possible Solutions
globus_ftp_client: the server responded with an error 530 530-globus_xio: Authentication Error 530-OpenSSL Error: s3_srvr.c:2525: in library: SSL routines, function SSL3_GET_CLIENT_CERTIFICATE: no certificate returned 530-globus_gsi_callback_module: Could not verify credential 530-globus_gsi_callback_module: Can't get the local trusted CA certificate: Untrusted self-signed certificate in chain with hash d1b603c3 530 End. | This error message indicates that the GridFTP server doesn't trust the certificate authority (CA) that issued your certificate. | You need to ask the GridFTP server administrator to install your CA certificate chain in the GridFTP server's trusted certificates directory.
globus_ftp_control: gss_init_sec_context failed OpenSSL Error: s3_clnt.c:951: in library: SSL routines, function SSL3_GET_SERVER_CERTIFICATE: certificate verify failed globus_gsi_callback_module: Could not verify credential globus_gsi_callback_module: Can't get the local trusted CA certificate: Untrusted self-signed certificate in chain with hash d1b603c3 | This error message indicates that your local system doesn't trust the certificate authority (CA) that issued the certificate on the resource you are connecting to. | You need to ask the resource administrator which CA issued their certificate and install the CA certificate in the local trusted certificates directory.

Table B.11. Reliable File Transfer (RFT) Errors

Error Code | Definition | Possible Solutions
Error creating RFT Home: Failed to connect to database ... Until this is corrected all RFT requests will fail and all GRAM jobs that require staging will fail | This occurs when you start the container if RFT is not configured properly to talk to a PostgreSQL database. | The usual cause is that Postmaster is not accepting TCP connections, which means that you must restart Postmaster with the -i option (see Configuring RFT).
ERROR service.RFTResourceManager [Thread-13,transferCompleted:517] Unable to update on finished org.globus.transfer.reliable.service.database.RftDBException: RFT database update error [Caused by: Syntax error: Encountered ")" at line 1, column 47.] | This error occurs as a result of a dynamically built SQL update string. The update occurs when a transfer completes and is used to notify transfer requests using the same hosts that resources on that host have been freed. The error message occurs when no rows in the database match that host. | Users of RFT may safely ignore this error. The message is harmless to the functionality of RFT and will not affect the results of a transfer in any way. The exception is safely caught. Future versions of RFT will have optimizations to avoid this step.

Table B.12. Replica Locator Service (RLS) Errors

Error Code | Definition | Possible Solutions
Error with credential: The proxy credential: <credential> with subject: <subject> expired <minutes> minutes ago | Expired proxy credential. | Create a new proxy with grid-proxy-init.
Unable to connect to localhost:xxxx | Unable to connect to the local host. This can be due to a variety of reasons, including a wrong address or port number in the RLS connection URL or an issue with a firewall configuration.
  • Double-check that the address and port number in the RLS connection URL are correct.

  • If a firewall configuration is preventing connections to the target host for a particular port, you may need to consult the system administrator.

"connection timeout"At times, a client may experience a connection timeout when interacting with the RLS server due to a variety of reasons:
  • One reason could simply be wide-area network latency or congestion.

  • Another situation that users eventually encounter is the scaling of the system. As the RLS server's database of replica location mappings grows in size, some query operations, such as bulk queries involving large quantities of mappings or wildcard queries that match a large subset of mappings, will take more time both to process the query and to return the large result set to the client over the network.

If timeouts are experienced with increasing frequency, increase the RLS server's timeout configuration parameter found in the $GLOBUS_LOCATION/var/globus-rls-server.conf file. You may also use the -t timeout option of the globus-rls-cli tool.

Table B.13. WS Replica Location Service (WS RLS) Errors

Error Code | Definition | Possible Solutions
Error: java.lang.NullPointerException | When invoking the WS RLS command-line clients, a system-level exception like the one above may be encountered. | The admin should check the container logs for the exact error.
Error: A server error occurred while processing the request | When invoking the WS RLS command-line clients, a server error like the one above may be encountered. | The admin should check the container logs for the exact error.
java.lang.UnsatisfiedLinkError | This exception when using the WS RLS may indicate that the native RLS libraries that WS RLS depends on could not be located. | To correct this problem, ensure that the $GLOBUS_LOCATION/lib directory is in the library search path (on some systems this is the LD_LIBRARY_PATH variable).
Unable to connect to localhost:39281 | The WS RLS is an interface layer that depends on the RLS for the replica location functionality. You must install and run RLS and configure WS RLS to use the RLS via its JNDI configuration. | Check that RLS is installed and running, and that the WS RLS JNDI configuration uses the correct hostname and port to connect to the RLS.
org.globus.common.ChainedIOException: Failed to initialize security context | If this exception occurs while using WS RLS, it may indicate that the user's proxy is invalid. | To correct the error, the user must properly initialize the user proxy. See grid-proxy-init for more information on proxy initialization.
Error: org.xml.sax.SAXException: Unregistered type: class xxx | If this exception occurs when using the WS RLS, it may indicate that an Axis-generated XML type, defined by the WS RLS XSD, was not properly registered. While all the XML types should get registered upon deployment without intervention by the user, sometimes they do not. | To remedy the situation, add a typeMapping to the server-config.wsdd file under globus_wsrf_replicalocation_service. Use the format shown here.

Table B.14. Batch Replicator Errors

Error Code | Definition | Possible Solutions
Authorization failed. Expected <hostname1> target but received <hostname2> | Did not receive the expected hostname. | When authorization is enabled on the container, you may need to use the proper hostname when referencing the Batch Replicator service rather than using localhost.
org.globus.wsrf.ResourceException: Failed to create Replication: /scratch/testrun (No such file or directory) | Cannot find the request file. | Ensure that the request file's filename is correct, that it is reachable by the Batch Replicator service, and that it has the appropriate permissions for the Batch Replicator service to access it.
org.globus.wsrf.ResourceException: Failed to create Replication: String index out of range: -1 | The request file is malformed (for example, by using spaces instead of a delimiting tab character), which results in a runtime exception. | Make sure your request file is in the correct form as described here.

Table B.15. Replication Client Errors

Error Code | Definition | Possible Solutions
fixme | fixme | fixme

Table B.16. WS MDS Index Service Error Messages

Error Code | Definition | Possible Solutions
error | what causes this | possible solutions
WS MDS is built on Java WS Core; please see Java WS Core Error Codes for more error code documentation.

Table B.17. WS MDS Trigger Service Error Messages

Error Code | Definition | Possible Solutions
Error ; nested exception is: org.apache.commons.httpclient.NoHttpResponseException: The server xxx.x.x.x failed to respond | Happens when trying to create a trigger for the Trigger Service. The above error is accompanied by the following error in the container: [JWSCORE-192] Error processing request java.io.IOException: Token length 1347375956 > 33554432. As with the Token length error in Table B.1, this usually indicates that a plain HTTP request was sent to a secure (HTTPS) port. | Be sure that you have properly edited the client-config-settings file under globus_wsrf_mds_trigger. The DefaultServiceAddress parameter should properly reflect the service prefix from your container, e.g.: https://127.0.0.1:8444/wsrf/services/. The services you wish to monitor should also be consistent.
WS MDS is built on Java WS Core; please see Java WS Core Error Codes for more error code documentation.

Table B.18. WS MDS Aggregator Error Messages

Error Code | Definition | Possible Solutions
error | what causes this | possible solutions
WS MDS is built on Java WS Core; please see Java WS Core Error Codes for more error code documentation.

Table B.19. WebMDS Error Messages

Error Code | Definition | Possible Solutions
java.net.ConnectException: Connection refused | If you attempt to use WebMDS to collect information from a service that is not running, you will see a stack trace that begins with:
org.globus.mds.webmds.xmlSources.resourceProperties.ResourcePropertySourceException: ; nested exception is: 
	java.net.ConnectException: Connection refused
Make sure the service you are trying to collect information from is running.
faultString: org.globus.common.ChainedIOException: Authentication failed [Caused by: Failure unspecified at GSS-API level [Caused by: Unknown CA]] | When WebMDS sends resource property queries to a secure WSRF service instance (such as a WS MDS Index Server), the WebMDS server must trust the certificate authority that issued the certificate used by the WSRF service instance. If the WebMDS server does not trust the CA used by the remote service, WebMDS queries will produce a stack trace that includes this message. | This can be solved by configuring the Tomcat server that hosts WebMDS to trust the appropriate CA, by either:
  • placing the CA certificate in /etc/grid-security/certificates, or

  • placing the CA certificate somewhere else, and setting the Tomcat process's X509_CERT_DIR system parameter to the directory in which the CA certificate was installed. One way to do this is to set the CATALINA_OPTS environment variable and then restart Tomcat:

    export CATALINA_OPTS=-DX509_CERT_DIR=/path/to/cert/dir
    $CATALINA_HOME/bin/shutdown.sh
    $CATALINA_HOME/bin/startup.sh

WebMDS connections to secure Index Servers (or other secure WSRF servers) just hang | If the JVM used by Tomcat is configured to use a blocking random-number source, WebMDS connections to secure Index Servers (or other secure WSRF servers) can hang. This is the default configuration for many installations. | One solution is to set the CATALINA_OPTS environment variable to ensure that Tomcat's JVM uses a non-blocking random-number source:
export CATALINA_OPTS=-Djava.security.egd=/dev/urandom
$CATALINA_HOME/bin/shutdown.sh
$CATALINA_HOME/bin/startup.sh
Note: If you encounter this problem with WebMDS, you may also encounter a similar problem with the Globus container on the same system.

Table B.20. GRAM4 Errors

Error Code | Definition | Possible Solutions
globusrun-ws - error querying job state

During job submission, an error like this occurs:

globusrun-ws failed: Delegating user credentials...Done. Submitting job...Done. Job ID: xxxx Termination time: xxxx Current job state: Unsubmitted globusrun-ws: Error querying job state globus_soap_message_module: Failed sending request ManagedJobPortType_GetMultipleResourceProperties. globus_xio: An end of file occurred
Periodically, globusrun-ws will query the GRAM service to check on the job state. The "End of file" indicates that the GRAM server dropped a connection when globusrun-ws tried to read a response. This could be caused by temporary network issues between the client and service, or possibly caused by an overloaded service host.
globusrun-ws - error querying job state

During job submission, an error like this occurs:

globusrun-ws failed: Delegating user credentials...Done. Submitting job...Done. Job ID: xxxx Termination time: xxxx Current job state: Unsubmitted globusrun-ws: Error querying job state globus_soap_message_module: Failed sending request ManagedJobPortType_GetMultipleResourceProperties. globus_xio: System error in read: Connection reset by peer globus_xio: A system call failed: Connection reset by peer
Periodically, globusrun-ws will query the GRAM service to check on the job state. The
System error in read: Connection reset by peer
indicates that the GRAM server dropped the connection while trying to write the response. This could be caused by temporary network issues between the client and service, or possibly caused by an overloaded service host.
globusrun-ws - error submitting job

During job submission, an error like this occurs:

globusrun-ws -Ft PBS -F https://host.teragrid.org:8444 -submit -b -f /tmp/wsgram.rsl -o /tmp/wsgram.epr failed: Submitting job...Failed. globusrun-ws: Error submitting job globus_soap_message_module: Failed sending request ManagedJobFactoryPortType_createManagedJob. globus_xio: Operation was canceled globus_xio: Operation timed out
The
Operation timed out
indicates that the GRAM service was not able to accept the job request and respond in time. This could be caused by temporary network issues between the client and service, or possibly caused by an overloaded service host.

Table B.21. GRAM2 Errors

Error Code | Definition | Possible Solutions
error1 | description1 | solutions or links to solutions

Table B.22. Gridway Errors

Error Code | Definition | Possible Solutions
Lock file exists | Another GWD may be running. | Be sure that no other GWD is running, then remove the lock file and try again.
Error in MAD initialization | There may be problems with the proxy certificate, the bin directory, or the location of a MAD executable. | Check that you have generated a valid proxy (for example, with the grid-proxy-info command). Also check that the directory $GW_LOCATION/bin is in your path and that the executable names of all the MADs are defined in gwd.conf.
Could not connect to gwd | GridWay may not be running or there may be something wrong with the connection. | Be sure that GWD is running; for example:
pgrep -l gwd
If it is running, check that you can connect to GWD; for example:
telnet `cat $GW_LOCATION/var/gwd.port`

Glossary

C

client

A process that sends commands and receives responses. Note that in GridFTP, the client may or may not take part in the actual movement of data.

M

Managed Executable Job Service (MEJS)

The GRAM4 service that manages the execution of a single (executable) job: it exposes the job as a WSRF resource, submits it to the local resource manager, and coordinates staging, monitoring, and cleanup.

P

proxy certificate

A short-lived certificate issued using an EEC. A proxy certificate typically has the same effective subject as the EEC that issued it and can thus be used in its place. GSI uses proxy certificates for single sign-on and delegation of rights to other entities.

For more information about types of proxy certificates and their compatibility in different versions of GT, see http://dev.globus.org/wiki/Security/ProxyCertTypes.

R

resource properties

A resource is composed of zero or more resource properties which describe the resource. For example, a resource can have the following three resource properties: Filename, Size, and Descriptors. The resource properties are defined in the web service's WSDL interface description.

S

server

A process that receives commands and sends responses to those commands. Since it is a server or service, and it receives commands, it must be listening on a port somewhere to receive the commands. Both FTP and GridFTP have IANA registered ports. For FTP it is port 21, for GridFTP it is port 2811. This is normally handled via inetd or xinetd on Unix variants. However, it is also possible to implement a daemon that listens on the specified port. This is described more fully in the Architecture section of the GridFTP Developer's Guide.

T

third party transfers

In the simplest terms, a third party transfer moves a file between two GridFTP servers.

The following is a more detailed, programmatic description.

In a third party transfer, there are three entities involved: the client, which only orchestrates and does not actually take part in the data transfer, and two servers, one of which sends data to the other. This scenario is common in Grid applications where you may wish to stage data from a data store somewhere to a supercomputer you have reserved. The commands are quite similar to a client/server transfer, except that the client must establish two control channels, one to each server. It then chooses one server to listen and sends it the PASV command. When that server responds with the IP/port it is listening on, the client sends that IP/port as part of the PORT command to the other server. This causes the second server to connect to the first server, rather than to the client. To initiate the actual movement of the data, the client then sends the RETR “filename” command to the server that will read from disk and write to the network (the “sending” server) and sends the STOR “filename” command to the other server, which will read from the network and write to the disk (the “receiving” server).
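
The control-channel exchange might look like the following illustrative sketch (addresses and file names are hypothetical; 227 is the standard FTP reply to PASV):

Client -> Server A: PASV
Server A -> Client: 227 Entering Passive Mode (192,0,2,10,78,52)
Client -> Server B: PORT 192,0,2,10,78,52
(Server B opens a data connection to Server A)
Client -> Server A: RETR /tmp/myfile      (A reads from disk, writes to the network)
Client -> Server B: STOR /data/myfile     (B reads from the network, writes to disk)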

See Also client/server transfer.

U

user certificate

An EEC belonging to a user. When using GSI, this certificate is typically stored in $HOME/.globus/usercert.pem. For more information on possible user certificate locations, see this.

W

Web Services Addressing (WSA)

The WS-Addressing specification defines transport-neutral mechanisms to address web services and messages. Specifically, it defines XML elements to identify web service endpoints and to secure end-to-end endpoint identification in messages. See the W3C WS Addressing Working Group for details.

X

XML Path Language (XPath)

XPath is a language for finding information in an XML document. XPath is used to navigate through elements and attributes in an XML document. See the XPath specification for details.
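
For example, applied to a job description document like those shown earlier in this guide:

/job/executable                    selects the <executable> element
//*[local-name()='executable']     selects every element named executable, in any namespace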