Please note that these documents are for an OBSOLETE version of the Globus Toolkit. For more information see 5.2 End of Life

GT 5.2.2 Release Notes: GRAM5


1. Component Overview

The Grid Resource Allocation and Management (GRAM5) component is used to locate, submit, monitor, and cancel jobs on Grid computing resources. GRAM5 is not a Local Resource Manager, but rather a set of services and clients for communicating with a range of different batch/cluster job schedulers using a common protocol. GRAM5 is meant to address a range of jobs where reliable operation, stateful monitoring, credential management, and file staging are important.

2. Feature summary

New features since 5.2.1:

  • Improved memory management and process management
  • Improved scalability and reliability

Other Standard Supported Features

  • Remote job execution and management
  • Uniform and flexible interface to local resource managers
  • File staging before and after job execution
  • File and directory clean up after job termination
  • Service auditing for each submitted job

Removed Features

  • Condor SEG module is no longer included. Its functionality has been moved into the core of the job manager program.

3. Summary of Changes in GRAM5

3.1. New Features: GRAM5

None.

3.2. Improvements: GRAM5

  • GT-154: Kill off perl processes when idle
  • GT-185: globus-personal-gatekeeper creates too-long paths on MacOS
  • GT-205: gatekeeper should log a message when it exits due to the presence of /etc/nologin
  • GT-224: Manage GRAM execution per client host for scalability for different clients

4. Fixed Bugs for GRAM5

  • GT-149: Memory leaks in globus-job-manager
  • GT-155: Job manager deletes job dir sometimes
  • GT-163: Condor fake-SEG loses track of job
  • GT-198: globusrun crashes when authentication fails for status check
  • GT-199: GRAM audit checks result username incorrectly
  • GT-209: job manager crash in query
  • GT-212: Missing debian packages
  • GT-236: gram audit makefile has missing parameter to mkdir
  • GT-252: Missing dependency in gass cache program
  • GT-253: gatekeeper and job manager don't build on hurd

5. Known Problems in GRAM5

  • GT-48: Held Condor jobs should be reported as SUSPENDED
  • GT-53: RSL eval doesn't indicate what symbol was not found

6. Technology dependencies

GRAM depends on the following GT components:

  • Globus Common
  • GSI C
  • GridFTP server

7. Tested platforms

Tested platforms for GRAM5:

  • Linux

    • CentOS 4 x86_64
    • CentOS 5, 6 x86_64, i386
    • Fedora 15, 16, 17 x86_64, i386
    • Red Hat Enterprise Linux 5, 6 x86_64, i386
    • Scientific Linux 5, 6 x86_64, i386
    • Debian 6, 7 (testing) x86_64, i386
    • Ubuntu 10.04LTS, 10.10, 11.04, 11.10, 12.04LTS x86_64, i386

  • Mac OS X

    • Mac OS X 10.7 (Lion)

  • Solaris

    • Solaris 11

8. Backward compatibility summary

Protocol changes in GRAM since the GT4 series:

  • The GRAM5 service uses a superset of the GRAM2 protocol for communication between the client and service. The extensions supported in GRAM5 are implemented in such a way that they are ignored by GRAM2 services or clients. These extensions provide improved error messages and version detection.
  • GRAM5 does not support task coallocation using DUROC and its related protocols. Jobs submitted using DUROC directives will fail.
  • GRAM5 does not support file streaming. The standard output and standard error streams are sent after the job completes instead of during execution. As a special case, support for the Condor grid monitor program implements a small subset of the streaming capabilities of GRAM2 in GT 4.2.x.
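
The first item above describes a common compatibility pattern: a newer peer adds attributes to a message, and an older peer skips any attribute it does not recognize. The following is a minimal sketch of that pattern only, not the GRAM protocol implementation. The function parse_message, the "name: value" line layout, and all attribute names (including example-extension) are hypothetical and chosen for illustration.

```python
# Illustrative sketch only. The attribute names and message layout are
# assumptions for this example, not taken from the GRAM protocol definition.
KNOWN_ATTRIBUTES = {"protocol-version", "status", "failure-code"}


def parse_message(body, known=KNOWN_ATTRIBUTES):
    """Parse 'name: value' lines, keeping known attributes and skipping others.

    Returns a (parsed, ignored) pair: a dict of recognized attributes and a
    list of attribute names that were skipped.
    """
    parsed = {}
    ignored = []
    for line in body.splitlines():
        if not line.strip():
            continue  # tolerate blank lines
        name, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"malformed line: {line!r}")
        name = name.strip().lower()
        if name in known:
            parsed[name] = value.strip()
        else:
            # An older peer does not fail on an unknown attribute; it skips it.
            ignored.append(name)
    return parsed, ignored


# A reply using only the attributes the older peer knows about.
old_style_reply = "protocol-version: 2\nstatus: 4\nfailure-code: 7\n"

# The same reply with one (hypothetical) extension attribute appended.
extended_reply = old_style_reply + "example-extension: more detail\n"

print(parse_message(old_style_reply))
print(parse_message(extended_reply))
```

Under these assumptions, both replies yield the same recognized attributes, which is why an extension of this kind does not break an older client or service.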

9. Associated Standards

None.

10. For More Information

See GRAM5 for more information about this component.

Glossary

L

Local Resource Manager (LRM)

A system that controls access to a compute resource, such as a compute cluster or parallel computer. Such systems provide batch execution interfaces, which GRAM uses to execute jobs. Condor, Portable Batch System, and GridEngine are examples of local resource managers.

See Also Condor, Portable Batch System, Oracle GridEngine.