Appendix A. Building and Installing RLS

The following procedures include the optional steps to set up an RLS server using MySQL or PostgreSQL and the ODBC libraries of your choice. Post-setup configuration (tuning the server parameters, etc.) is not covered in this document.

1. Requirements

You need to download and install the following software:

  • Installation of GT 4.0
  • A Relational Database Server (RDBMS) that supports ODBC. We provide instructions for PostgreSQL and MySQL.

    • If you use PostgreSQL, you'll also need psqlODBC (the ODBC driver for PostgreSQL).
    • If you use MySQL, you'll also need the MyODBC (Connector/ODBC) packages. MySQL's top-level installation directory must be specified; by default it is assumed to be $GLOBUS_LOCATION.
  • The iODBC package, which is used to interface to the ODBC layer of the RDBMS. The locations of iODBC and the odbc.ini file must be specified before installing the RLS server.

2. Setting environment variables

The following environment variables can be used to override the default locations. These should be set prior to installing the RLS server.

The locations of iODBC and the odbc.ini file must be specified before installing the RLS server. If you're using MySQL, its top-level installation directory must also be specified. By default, all of these are assumed to be under $GLOBUS_LOCATION.

In addition, if you're building from source and wish to build the client Java API (included in the server bundles), you need to set the path to the Java Development Kit (JDK), version 1.4 or later.

Table A.1. RLS Build Environment Variables

Variable             Default
GLOBUS_IODBC_PATH    $GLOBUS_LOCATION
ODBCINI              $GLOBUS_LOCATION/var/odbc.ini
JAVA_HOME            none
GLOBUS_MYSQL_PATH    $GLOBUS_LOCATION (if using MySQL)

You can use the following commands to set these variables. They are needed only for RLS installation; they are not used when running RLS. This document assumes you are using csh or one of its variants. If you're using sh or a similar shell (e.g., bash), change the setenv commands to export variable=value.

  • setenv GLOBUS_IODBC_PATH $GLOBUS_LOCATION
  • setenv ODBCINI $GLOBUS_LOCATION/var/odbc.ini
  • setenv JAVA_HOME /usr/jdk/1.4
  • setenv GLOBUS_MYSQL_PATH $GLOBUS_LOCATION # if using MySQL
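
For sh or bash users, the setenv commands above might translate as in the following sketch. The fallback value for GLOBUS_LOCATION is an assumption made only so the example is self-contained; use your actual installation directory.

```shell
# Bourne-shell equivalents of the setenv commands above.
# The fallback path for GLOBUS_LOCATION is an illustrative assumption.
GLOBUS_LOCATION="${GLOBUS_LOCATION:-/usr/local/globus}"
export GLOBUS_LOCATION

export GLOBUS_IODBC_PATH="$GLOBUS_LOCATION"
export ODBCINI="$GLOBUS_LOCATION/var/odbc.ini"
export JAVA_HOME=/usr/jdk/1.4
export GLOBUS_MYSQL_PATH="$GLOBUS_LOCATION"   # only if using MySQL
```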

3. Installing iODBC

Caution: At the time of the GT 4.0 release, incompatibility issues were identified between iODBC and MyODBC. Our brief evaluation indicated that iODBC 3.52.2 is incompatible with MyODBC 3.51.11 and possibly earlier versions as well. We have used iODBC 3.51.1 and 3.51.2 in combination with MyODBC 3.51.06. Installing incompatible iODBC and MyODBC versions from binary packages may not produce an error until runtime. Building these libraries from source packages may be the best way to ensure that you have installed a compatible combination.

Important: Recommended version: 3.51.2

3.1. Run the install commands

The following commands were used during RLS development to install iODBC.

% cd $IODBCSRC
% ./configure --prefix=$GLOBUS_IODBC_PATH --disable-gtktest --with-pthreads --disable-gui \
    --with-iodbc-inidir=$ODBCINIDIR
% gmake
% gmake install

where:

  • $IODBCSRC is the directory where you untarred the iODBC sources
  • $ODBCINIDIR is the directory where you plan to install the odbc.ini file (which you will create in the next step).

3.2. Create the odbc.ini file

Create the odbc.ini file in $ODBCINIDIR.

Its contents should include the path where you intend to install the ODBC driver for your RDBMS (such as psqlodbc.so or libmyodbc3.so).

The following is an example that should work with psqlODBC. It assumes you will name your LRC and RLI databases lrc1000 and rli1000:

[ODBC Data Sources]
lrc1000=lrc database
rli1000=rli database


[lrc1000]
Description=LRC database
DSN=lrc1000
Servertype=postgres
Servername=localhost
Database=lrc1000
ReadOnly=no


[rli1000]
Description=RLI database
DSN=rli1000
Servertype=postgres
Servername=localhost
Database=rli1000
ReadOnly=no


[Default]
Driver=/path/to/psqlodbc.so
Port=5432 

Note: You do not need an RLI database if you plan to use Bloom filters for LRC to RLI updates (Bloom filters are kept in memory). In this case you can omit the RLI entries above.

Bug: psqlODBC will not find a Data Source Name (DSN) in the system odbc.ini file $ODBCINIDIR/odbc.ini. It will find DSNs in the user's odbc.ini file if it exists at $HOME/.odbc.ini.

One workaround is to copy or symlink the system odbc.ini file to each user's home directory as $HOME/.odbc.ini. psqlODBC does find system DSNs in a file called odbcinst.ini, which it looks for in the etc subdirectory of the iODBC installation, $GLOBUS_IODBC_PATH/etc/odbcinst.ini. So another option, besides creating per-user .odbc.ini files, is to copy or symlink the system odbc.ini file to $GLOBUS_IODBC_PATH/etc/odbcinst.ini. There may be a better solution than either of these.
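
The symlink workaround could be scripted roughly as follows. This is a minimal sketch: the helper name link_odbc_ini is hypothetical, and it deliberately refuses to overwrite an existing file.

```shell
# link_odbc_ini SYSTEM_INI TARGET
# Hypothetical helper: symlink TARGET to SYSTEM_INI unless TARGET already exists.
link_odbc_ini() {
    system_ini=$1
    target=$2
    if [ -e "$target" ] || [ -L "$target" ]; then
        echo "skipping: $target already exists" >&2
        return 1
    fi
    ln -s "$system_ini" "$target"
}

# Usage, with the paths described above:
#   link_odbc_ini "$ODBCINIDIR/odbc.ini" "$HOME/.odbc.ini"
#   link_odbc_ini "$ODBCINIDIR/odbc.ini" "$GLOBUS_IODBC_PATH/etc/odbcinst.ini"
```

A symlink rather than a copy keeps a single odbc.ini to maintain, at the cost of requiring that each user can read the system file.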

3.3. Changing how clients connect to the server (for MySQL only)

If you're using MySQL and have changed how MySQL clients connect to the MySQL server in my.cnf (e.g., the port number or socket name), then you should set the option to 65536 in odbc.ini for each database. This tells MyODBC to read the client section of my.cnf to find the changed connection parameters.

[lrc1000]
option = 65536

[rli1000]
option = 65536

4. Installing the relational database

We include instructions for both PostgreSQL (Section 4.1, “Using PostgreSQL”) and MySQL (Section 4.2, “Using MySQL”).

4.1. Using PostgreSQL

If your relational database of choice is PostgreSQL, you need to install and configure both PostgreSQL and psqlODBC (the ODBC driver for PostgreSQL) as follows:

4.1.1. Installing PostgreSQL

4.1.1.1. Running the install commands

The commands used to install PostgreSQL 7.2.3 on the RLS development system are as follows.

% cd $POSTGRESSRC
% ./configure --prefix=$GLOBUS_LOCATION
% gmake
% gmake install

$POSTGRESSRC is the directory where the PostgreSQL source was untarred.

4.1.1.2. Initializing PostgreSQL

Initialize PostgreSQL and start the server by running:

initdb -D /path/to/postgres-datadir
postmaster -D /path/to/postgres-datadir -i -o -F 

The -o -F flags to postmaster disable fsync() calls after transactions. This improves performance but raises the risk of database corruption.

4.1.1.3. Creating the user and password

Create the database user (in our example, called dbuser) and password that RLS will use:

createuser -P dbuser 

Important: Be sure to run the vacuum and analyze commands periodically on all your PostgreSQL databases. The PostgreSQL documentation recommends doing this daily from cron. Failure to do so can seriously degrade performance, to the point where routine RLS operations (such as LRC to RLI soft state updates) time out and fail. Please see the PostgreSQL documentation for further details.
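
One possible way to automate the daily vacuum and analyze is a small script around PostgreSQL's vacuumdb utility, sketched below. The function name, the VACUUMDB override variable, and the crontab path are illustrative assumptions, not part of RLS or PostgreSQL.

```shell
# vacuum_all_databases: run VACUUM and ANALYZE on every database in the cluster.
# VACUUMDB may be set to a full path; this override is an illustrative hook.
vacuum_all_databases() {
    "${VACUUMDB:-vacuumdb}" --all --analyze --quiet
}

# Hypothetical crontab entry (daily at 03:00) for a script that calls the
# function above as the PostgreSQL administrative user:
#   0 3 * * * /path/to/vacuum-rls.sh
```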

4.1.2. Installing psqlODBC

Install psqlODBC by running the following commands (which were used to install psqlODBC 7.2.5):

% cd $PSQLODBCSRC 
% setenv CPPFLAGS -I$GLOBUS_IODBC_PATH/include 
% ./configure --prefix=$GLOBUS_LOCATION --enable-pthreads 
% gmake 
% gmake install 

where $PSQLODBCSRC is the directory where you untarred the psqlODBC source.

Note: The configure script that comes with psqlODBC supports a --with-iodbc option. However, when the RLS developers used this it resulted in RLS servers with corrupt memory that would dump core while opening the database connection. It seems to work fine (with iODBC) without this option.

You can now continue to Section 5, “Installing the RLS Server”.

4.2. Using MySQL

If your relational database of choice is MySQL, you'll need to install and configure both MySQL and the MyODBC (Connector/ODBC) packages as follows:

4.2.1. Installing MySQL

Once you've installed and configured MySQL you must start the database server and create the database user/password that RLS will use to connect to the database.

4.2.1.1. Starting database server

Start the database server by running:

mysqld_safe [--defaults-file=/path/to/your/my.cnf]

4.2.1.2. Creating the user and password

To create the database user and password that RLS will use you must run the MySQL command line tool mysql, and specify the following commands:

mysql>  use mysql; 
mysql>  grant all on lrc1000.* to dbuser@localhost identified by 'dbpassword'; 
mysql>  grant all on rli1000.* to dbuser@localhost identified by 'dbpassword'; 

These commands assume the username you will create for RLS is dbuser with password dbpassword, and the database(s) you will create for your LRC and/or RLI server are lrc1000 and rli1000.

Creation of the LRC and/or RLI databases is covered below in Section 6, “Configuring the RLS Database”.

4.2.2. Installing MyODBC

[Important]Important

Recommended Version: 3.51.06

Please read the note under Section 3, “Installing iODBC”.

If you cannot locate this version on a public site or mirror, you can find it here.

These instructions assume that iODBC was installed in $GLOBUS_LOCATION. This may be changed by changing the --with-iodbc-includes and --with-iodbc-libs options or the --with-iodbc option.

4.2.2.1. Running install commands

Install MyODBC in $GLOBUS_LOCATION (you may choose a different directory if you wish, by changing the --prefix option to configure below):

% cd $MYODBCSRC 
% ./configure --prefix=$GLOBUS_LOCATION \
     --with-mysql-libs=$GLOBUS_MYSQL_PATH/lib/mysql \
     --with-mysql-includes=$GLOBUS_MYSQL_PATH/include/mysql \
     --with-iodbc=$GLOBUS_LOCATION \
     --with-odbc-ini=$ODBCINIDIR
% gmake
% gmake install

where:

  • $MYODBCSRC is the directory where you untarred the MyODBC sources.
  • $ODBCINIDIR is the directory where you created the odbc.ini file.

Bug: There is a bug in MyODBC version 3.51.05 and earlier. The debug code is not thread safe, and the RLS server will get a segmentation violation and die if this code is enabled. In versions 3.51.05 and later the debug code can be disabled with the configure option --without-debug. In earlier versions it is disabled by defining DBUG_OFF, as in the following example:

setenv CFLAGS -DDBUG_OFF 

You can now continue to Section 5, “Installing the RLS Server”.

5. Installing the RLS Server

Download the appropriate bundle. RLS is included as part of the Globus Toolkit bundle. See the Globus Toolkit Development Downloads for a listing of available software.

RLS is installed as a part of the standard install. For basic installation instructions, see the Installation Guide.

6. Configuring the RLS Database

RLS server configuration is specified in $GLOBUS_LOCATION/etc/globus-rls-server.conf; please see the man page for globus-rls-server(8) for complete details. Some of the configuration options (such as database user/password) are mentioned below.

6.1. Creating a user and password

Create a database user that the RLS server will use to connect to the DBMS.

The database user and password you pick must be specified in the RLS server configuration file, as well as the name of the database(s) you will create (see below).

db_user dbuser 
db_pwd dbpassword 
lrc_dbname lrc1000  # optional (if LRC server) 
rli_dbname rli1000  # optional (if RLI server) 

6.2. Choosing database for RLS server

Decide which database(s) the RLS server will use (and that you will create in Section 6.4):

  • If the RLS server is a Local Replica Catalog (LRC) server, you will need to create the LRC database.
  • If the server is a Replica Location Index (RLI) server, you may need to create an RLI database.

An RLI server can receive updates from LRC servers in one of two forms: as LFN lists (in which case the RLI database must be created) or as highly compressed Bloom filters. Because Bloom filters are so small, they are kept in memory and no database is required. An RLS server can be configured as both an LRC and an RLI server.

6.3. Configuring database schema

Configure the schema file(s) for the database(s) you will create.

GT 4.0 installed the schema files for the LRC and RLI databases in $GLOBUS_LOCATION/setup/globus.

For PostgreSQL, use:

  • globus-rls-lrc-postgres.sql
  • globus-rls-rli-postgres.sql

For MySQL, use:

  • globus-rls-lrc-mysql.sql
  • globus-rls-rli-mysql.sql

Edit these files to set the name of the database user you created for RLS and the names of the databases configured in $GLOBUS_LOCATION/etc/globus-rls-server.conf.

By default the database user is dbuser, the LRC database name is lrc1000 and the RLI database name is rli1000.
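
If you chose names other than the defaults, the edit can be scripted. The following is a sketch only: it assumes the default names appear literally in the schema file, that the new names contain no characters special to sed (such as /), and the helper name customize_schema is hypothetical. It writes a modified copy and leaves the installed file untouched.

```shell
# customize_schema SRC DEST DBUSER DBNAME DEFAULT_DBNAME
# Hypothetical helper: copy SRC to DEST, replacing the default user name
# "dbuser" with DBUSER and DEFAULT_DBNAME (e.g. lrc1000) with DBNAME.
customize_schema() {
    src=$1; dest=$2; dbuser=$3; dbname=$4; default_dbname=$5
    sed -e "s/dbuser/$dbuser/g" \
        -e "s/$default_dbname/$dbname/g" \
        "$src" > "$dest"
}

# Usage (the new names rlsuser and lrc2000 are placeholders):
#   customize_schema $GLOBUS_LOCATION/setup/globus/globus-rls-lrc-mysql.sql \
#       /tmp/my-lrc-mysql.sql rlsuser lrc2000 lrc1000
```

Review the generated file before loading it, since a blind substitution may touch text you did not intend to change.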

6.4. Creating the database(s)

Create the database(s) with the following commands (note once again that you do not need to create an RLI database if you are configuring an RLI server updated by Bloom filters):

For PostgreSQL, run:

createdb -O dbuser  -U dbuser  -W lrc1000 
createdb -O dbuser  -U dbuser  -W rli1000 
psql -W -U dbuser  -d lrc1000  -f $GLOBUS_LOCATION/setup/globus/globus-rls-lrc-postgres.sql 
psql -W -U dbuser  -d rli1000  -f $GLOBUS_LOCATION/setup/globus/globus-rls-rli-postgres.sql 

For MySQL, run:

mysql -p -u dbuser  < $GLOBUS_LOCATION/setup/globus/globus-rls-lrc-mysql.sql 
mysql -p -u dbuser  < $GLOBUS_LOCATION/setup/globus/globus-rls-rli-mysql.sql

Important: Before continuing, it is recommended that you first test the database configuration using the iodbctest utility provided with a typical iODBC installation.

To test with iODBC, run:

% $GLOBUS_IODBC_PATH/bin/iodbctest "DSN=lrc1000;UID=dbuser;PWD=dbpassword"
iODBC Demonstration program
This program shows an interactive SQL processor
Driver Manager: 03.51.0002.0224
Driver: 03.51.06

SQL>show tables;
 
Tables_in_lrc1000
-----------------
t_attribute
t_date_attr
t_flt_attr
t_int_attr
t_lfn
t_map
t_pfn
t_rli
t_rlipartition
t_str_attr
 
 result set 1 returned 10 rows.

SQL>quit
 
Have a nice day.

Use the show tables command if you are using a MySQL database. Use the PostgreSQL equivalent if you are using a PostgreSQL database. Also, the driver version number (03.51.06 above) will vary depending on the ODBC driver you are using.

Warning: If the above test fails, RLS will not run properly. You must have a valid database configuration before proceeding with RLS installation and configuration.

7. Configuring the RLS Server

Review the server configuration file $GLOBUS_LOCATION/etc/globus-rls-server.conf and change any options you want. The server man page globus-rls-server(8) has complete details on all the options.

A minimal configuration file for both an LRC and RLI server would be:

# Configure the database connection info 
  db_user dbuser 
  db_pwd dbpassword 
   
# If the server is an LRC server 
  lrc_server true 
  lrc_dbname lrc1000 
   
# If the server is an RLI server 
  rli_server true 
  rli_dbname rli1000 # Not needed if updated by Bloom filters 
   
# Configure who can make requests of the server 
  acl .*: all 

# RE matching grid-mapfile users or DNs from x509 certs 

The server uses a host certificate to identify itself to clients. By default this certificate is located in the files /etc/grid-security/hostcert.pem and /etc/grid-security/hostkey.pem. Host certificates have a distinguished name of the form /CN=host/FQDN. If the host you plan to run the RLS server on does not have a host certificate, you must obtain one from your Certificate Authority. The RLS server must be run as the same user who owns the host certificate files (typically root). The location of the host certificate files may be specified in $GLOBUS_LOCATION/etc/globus-rls-server.conf:

rlscertfile path-to-cert-file # default /etc/grid-security/hostcert.pem 
rlskeyfile path-to-key-file # default /etc/grid-security/hostkey.pem 
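
Because the server must run as the owner of the host certificate files, a quick pre-flight check can catch a mismatch before startup. The sketch below uses a hypothetical helper name and relies on the shell's -O file test, which bash and most sh implementations provide.

```shell
# check_cert_owner FILE...
# Hypothetical helper: succeed only if every FILE is readable and owned by
# the user running this shell (the user that will start the RLS server).
check_cert_owner() {
    status=0
    for f in "$@"; do
        if [ ! -r "$f" ]; then
            echo "not readable: $f" >&2
            status=1
        elif [ ! -O "$f" ]; then
            echo "not owned by current user: $f" >&2
            status=1
        fi
    done
    return $status
}

# Usage, with the default locations named above:
#   check_cert_owner /etc/grid-security/hostcert.pem /etc/grid-security/hostkey.pem
```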

It is possible to run the RLS server without authentication by starting it with the -N option and using URLs of the form rlsn://server to connect to it. If authentication is enabled, an RLI server must include acl configuration options that match the identities of the LRC servers that update it and that grant the rli_update permission to those LRCs.

8. Starting the RLS Server

Start the RLS Server by running:

$GLOBUS_LOCATION/sbin/SXXrls start

8.1. Notes on RLS Initialization

Please be aware (and advise others responsible for bringing up RLS) that startup initialization may take a few minutes before the server becomes accessible. Initialization involves two key operations that may consume significant resources, causing the server to appear temporarily unresponsive. Users may mistakenly conclude that RLS failed to start, kill the server, and start over. Some repeat this cycle, believing that RLS cannot start properly.

If the RLS is configured to send compressed updates (Bloom filters) to other RLIs, startup includes initializing the Bloom filter that represents the current contents of the local replica catalog (LRC). This step must finish before any other operation is allowed, so no client connections are permitted until initialization is complete. In our test environment, we have seen a delay of over 30 seconds while creating the Bloom filter for 1 million LFN names on a system with dual 1 GHz CPUs and 1.5 GB of RAM. You may experience greater delays at larger scales or when running RLS with more limited system resources.

If the RLS is configured to send uncompressed updates (LFN lists) to other RLIs, startup does not involve any additional initialization delay. However, the RLS will spawn an initial full catalog update to all the RLIs it updates. Although these updates run on separate threads after the system has initialized, they consume a great deal of processor time. Depending on the size of the local replica catalog (LRC), this activity may initially interfere with client operations. In our test environment, the initial "globus-rls-admin ping..." operation may be delayed and time out after 30 seconds; the second "ping" may be delayed for a few seconds but returns successfully; and the third and every subsequent "ping" returns immediately for the duration of the update. The system exhibits the same behavior for any other client operation, such as a "globus-rls-cli query..." operation.
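
Rather than killing a server that appears unresponsive during initialization, an operator could retry a ping for a few minutes. The following is a minimal sketch of such a retry loop; the helper name wait_until is hypothetical, and the exact ping invocation shown in the usage comment (the -p flag and the host name) is an assumption to be checked against the globus-rls-admin man page.

```shell
# wait_until ATTEMPTS DELAY COMMAND [ARGS...]
# Hypothetical helper: run COMMAND up to ATTEMPTS times, sleeping DELAY
# seconds after each failure; succeed as soon as COMMAND succeeds.
wait_until() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        echo "attempt $i of $attempts failed; retrying in ${delay}s" >&2
        sleep "$delay"
        i=$((i + 1))
    done
    return 1
}

# Usage: up to 10 pings, 30 seconds apart (flag and host are assumptions):
#   wait_until 10 30 globus-rls-admin -p rls://localhost
```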

9. Stopping the RLS Server

Stop the RLS Server by running:

$GLOBUS_LOCATION/sbin/SXXrls stop 

10. Configuring the RLS Server for the MDS2 GRIS

The server package includes a program called globus-rls-reporter that will report information about an RLS server to the MDS2 GRIS. Use this procedure to enable this program:

  1. To enable Index Service reporting, add the contents of the file $GLOBUS_LOCATION/setup/globus/rls-ldif.conf to the MDS2 GRIS configuration file $GLOBUS_LOCATION/etc/grid-info-resource-ldif.conf.
  2. If necessary, set your virtual organization (VO) name in $GLOBUS_LOCATION/setup/globus/rls-ldif.conf. The default value is local. The VO name is referenced twice, on the lines beginning dn: and args:.
  3. You must restart your MDS (GRIS) server after modifying $GLOBUS_LOCATION/etc/grid-info-resource-ldif.conf. You can use the following commands to do so:
$GLOBUS_LOCATION/sbin/SXXgris stop 
$GLOBUS_LOCATION/sbin/SXXgris start 

11. Configuring the RLS Server for the WS MDS Index Service

The server package includes a script, $GLOBUS_LOCATION/libexec/aggrexec/globus-rls-aggregatorsource.pl, that may be used as an Execution Aggregator Source by MDS. See GT 4.0 Index Services for more information on setting up and using Execution Aggregator Source scripts in MDS. The script may be invoked as follows and generates output in the format shown.

% $GLOBUS_LOCATION/libexec/aggrexec/globus-rls-aggregatorsource.pl rls://mysite
<?xml version="1.0" encoding="UTF-8"?>
<rlsStats>
  <site>rls://mysite</site>
  <version>4.0</version>
  <uptime>03:08:15</uptime>
  <serviceList>
    <service>lrc</service>
    <service>rli</service>
  </serviceList>
  <lrc>
    <updateMethodList>
      <updateMethod>lfnlist</updateMethod>
      <updateMethod>bloomfilter</updateMethod>
    </updateMethodList>
    <updatesList>
      <updates>
        <site>rls://myothersite:39281</site>
        <method>bloomfilter</method>
        <date>08/01/05</date>
        <time>16:16:38</time>
      </updates>
    </updatesList>
    <numlfn>283902</numlfn>
    <numpfn>593022</numpfn>
    <nummap>593022</nummap>
  </lrc>
  <rli>
    <updatedViaList>
      <updatedVia>bloomfilters</updatedVia>
    </updatedViaList>
    <updatedByList>
      <updatedBy>
        <site>rls://myothersite:39281</site>
        <date>08/01/05</date>
        <time>10:03:21</time>
      </updatedBy>
    </updatedByList>
  </rli>
</rlsStats>

Important: Be sure to configure the security context of the container running the MDS, and be sure that the security configuration on the RLS host recognizes the MDS security context.

When following the instructions provided by the GT 4.0 Index Services, you will need to consider the security context used by the MDS to invoke the Execution Aggregator Source script provided by RLS. Most deployments of RLS run the service with security enabled. Therefore any client connections, including administrative status operations, require authentication and authorization. In order for MDS to use the provided script to check RLS status, it must invoke the script with a valid user proxy or user certificate and key. The RLS must recognize the DN from the user certificate (i.e., the DN should be in the gridmap file).

One way to configure the MDS security context for use with RLS monitoring is to set the environment variables X509_USER_CERT and X509_USER_KEY to point to the container certificate and key. Run the MDS with these environment settings. Also, add the DN from the container certificate to the gridmap file on the host running the RLS.

Alternatively, you could modify the provided script so that it sets the environment variables to another user certificate and key (or proxy) as desired before calling the RLS.
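
As a sketch of the first approach, the environment could be prepared as follows before starting the container that runs MDS. The certificate and key paths are assumptions; substitute the container credentials used at your site.

```shell
# Paths are assumptions; substitute your container certificate and key.
export X509_USER_CERT=/etc/grid-security/containercert.pem
export X509_USER_KEY=/etc/grid-security/containerkey.pem

# Warn early if either file is missing or unreadable by this user.
for f in "$X509_USER_CERT" "$X509_USER_KEY"; do
    [ -r "$f" ] || echo "warning: cannot read $f" >&2
done

# Then start the container that runs MDS from this same shell.
```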

12. RedHat 9 Incompatibility

This note applies to RedHat 9 but could also apply to other Linux distributions.

There have been occurrences of RLS servers hanging on RedHat 9 systems. The external symptoms are:

  1. The server does not accept new connections from clients, with an error message similar to:

    connect(rls://XXXXX): globus_rls_client: IO timeout:
    globus_io_tcp_register_connect() timed out after 30 seconds

  2. Often, the server continues to receive and send updates as configured and to respond to signals. You can check this by querying other servers that interact with the one that's hung.

Under gdb, all the server threads are waiting to be signaled on a condition variable. Sometimes this is in globus_io functions, particularly in globus_io_cancel().

12.1. Probable cause

This seems to be due to a problem in the new kernel and thread libraries of RedHat 9. A problem in pthread_cond_wait() causes threads not to wake up correctly.

This problem has been seen with the following kernels and glibc packages:

  • Kernels:

    • 2.4.20-30.9
    • 2.4.20-8

  • glibc:

    • glibc-2.3.2-27.9.7

12.2. Suggested workaround

The problems don't seem to arise when RLS is linked with older pthread libraries. This can be done by adding a couple of lines to the RLS startup script in $GLOBUS_LOCATION/sbin/SXXrls, as shown:

<--- START --->
#!/bin/sh

GLOBUS_LOCATION=/opt/gt3.2
MYSQL=/opt/mysql
IODBC=/opt/iodbc

export GLOBUS_LOCATION

#RedHat 9 workaround
LD_ASSUME_KERNEL=2.4.1
export LD_ASSUME_KERNEL
<--- END --->

On i586 systems, set:

LD_ASSUME_KERNEL=2.2.5
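
If the same startup script is shared across machines, the value could be chosen from the machine type instead of being hard-coded. This is a sketch only: the helper name rls_assume_kernel is hypothetical, and the setting should be exported only on systems that actually show the hang described above.

```shell
# rls_assume_kernel MACHINE
# Hypothetical helper: print the LD_ASSUME_KERNEL value suggested above
# for the given machine type (as reported by `uname -m`).
rls_assume_kernel() {
    case "$1" in
        i586) echo 2.2.5 ;;
        *)    echo 2.4.1 ;;
    esac
}

# Usage inside the startup script, on affected systems only:
#   LD_ASSUME_KERNEL=$(rls_assume_kernel "$(uname -m)")
#   export LD_ASSUME_KERNEL
```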