Resource Agents

From Linux-HA

A resource agent is a standardized interface for a cluster resource. It translates a standard set of operations into steps specific to the resource or application, and interprets their results as success or failure.

Resource Agents have been managed as a separate Linux-HA sub-project since their 1.0 release, which coincided with the Heartbeat 2.99 release. Previously, they were a part of the then-monolithic Heartbeat project, and had no collective name.

Supported Operations

Operations which a resource agent may perform on a resource instance include:

  • start: enable or start the given resource
  • stop: disable or stop the given resource
  • status: return the status of the given resource (running or not running)
  • monitor: like status, but also check specifically for unexpected "not running" states
  • validate: validate the resource's configuration
  • meta-data: return information about the resource agent itself (used by GUIs and other management utilities, and documentation tools)
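
As a rough illustration of how these operations fit together, the sketch below dispatches each operation name to a shell function. It is not taken from any shipped agent: the function names, the STATE_FILE variable, and the state-file approach are assumptions made for this example. The exit codes (0 for success, 7 for "not running", 3 for an unimplemented operation) follow the OCF convention, which this page does not spell out. The OCF name validate-all is accepted alongside validate.

```shell
#!/bin/sh
# Sketch of a resource agent that dispatches the operations listed above.
# All names here (STATE_FILE, ra_*, dispatch) are hypothetical.
# Exit codes follow the OCF convention (an assumption; not listed on this page).
OCF_SUCCESS=0
OCF_ERR_GENERIC=1
OCF_ERR_UNIMPLEMENTED=3
OCF_NOT_RUNNING=7

STATE_FILE="${STATE_FILE:-${TMPDIR:-/tmp}/example-ra.state}"

ra_monitor() {
    # "Running" simply means the state file exists.
    if [ -f "$STATE_FILE" ]; then
        return $OCF_SUCCESS
    fi
    return $OCF_NOT_RUNNING
}

ra_start() {
    # Starting an already running resource counts as success.
    ra_monitor && return $OCF_SUCCESS
    touch "$STATE_FILE" || return $OCF_ERR_GENERIC
    return $OCF_SUCCESS
}

ra_stop() {
    # Stopping an already stopped resource counts as success.
    rm -f "$STATE_FILE" || return $OCF_ERR_GENERIC
    return $OCF_SUCCESS
}

ra_validate() {
    # The only "configuration" here is a writable directory for the state file.
    [ -w "$(dirname "$STATE_FILE")" ] || return $OCF_ERR_GENERIC
    return $OCF_SUCCESS
}

ra_meta_data() {
    # A real agent prints a full XML self-description; this is a stub.
    cat <<EOF
<?xml version="1.0"?>
<resource-agent name="example">
<shortdesc>Example agent</shortdesc>
<longdesc>Tracks a state file; for illustration only.</longdesc>
</resource-agent>
EOF
}

dispatch() {
    case "$1" in
        start)                 ra_start ;;
        stop)                  ra_stop ;;
        status|monitor)        ra_monitor ;;
        validate|validate-all) ra_validate ;;
        meta-data)             ra_meta_data ;;
        *)                     return $OCF_ERR_UNIMPLEMENTED ;;
    esac
}
```

A real agent would end with something like `dispatch "$1"; exit $?` so that the caller sees the result as the script's exit status.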

Implementation

Most resource agents are coded as shell scripts. This, however, is by no means a necessity – the defined interface is language agnostic.

Resource agents are synchronous in nature: the caller starts an operation and is expected to wait until it completes. Certain operations (notably start, stop and monitor) may take considerable time to complete, from seconds to many minutes in some cases.
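
Because operations are synchronous, a caller typically bounds how long it is willing to wait. The sketch below shows one way to do that by hand, assuming the timeout utility from GNU coreutils is available. run_op is a hypothetical helper; a cluster manager enforces its own per-operation timeouts rather than using a wrapper like this.

```shell
#!/bin/sh
# run_op LIMIT COMMAND [ARGS...]
# Hypothetical helper: run one operation, wait for it to finish, and
# give up after LIMIT seconds. Relies on timeout(1) from GNU coreutils.
run_op() {
    limit=$1
    shift
    rc=0
    timeout "$limit" "$@" || rc=$?
    # timeout(1) exits with status 124 when the time limit was hit.
    if [ "$rc" -eq 124 ]; then
        echo "operation timed out after ${limit}s: $*" >&2
    fi
    return "$rc"
}
```

For example, `run_op 120 ./my-agent start` (with ./my-agent standing in for a real agent path) blocks until the start operation finishes or two minutes pass.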

Source Code Repository

Source code for Resource Agents is maintained in the Mercurial repository at http://hg.linux-ha.org/agents.

Available Resource Agents (current as at 2010-05-28)

anything Manages an arbitrary service
This is a generic OCF RA to manage almost anything.
AoEtarget Manages ATA-over-Ethernet (AoE) target exports
This resource agent manages an ATA-over-Ethernet (AoE) target using vblade.
It exports any block device, or file, as an AoE target using the 
specified Ethernet device, shelf, and slot number.
apache Manages an Apache web server instance
This is the resource agent for the Apache web server.
This resource agent operates both version 1.x and version 2.x Apache
servers.

The start operation ends with a loop in which monitor is
repeatedly called to make sure that the server started and that
it is operational. Hence, if the monitor operation does not
succeed within the start operation timeout, the apache resource
will end with an error status.

The monitor operation by default loads the server status page
which depends on the mod_status module and the corresponding
configuration file (usually /etc/apache2/mod_status.conf).
Make sure that the server status page works and that the access
is allowed *only* from localhost (address 127.0.0.1).
See the statusurl and testregex attributes for more details.

See also http://httpd.apache.org/
AudibleAlarm Emits audible beeps at a configurable interval
Resource script for AudibleAlarm. It sets an audible alarm running by beeping 
at a set interval. 
ClusterMon Runs crm_mon in the background, recording the cluster status to an HTML file
This is a ClusterMon Resource Agent.
It outputs the current cluster status to an HTML file.
CTDB CTDB Resource Agent
This resource agent manages CTDB, allowing one to use Clustered Samba
in a Linux-HA/Pacemaker cluster.  You need a shared filesystem
(e.g. OCFS2) on which CTDB lock and Samba state will be stored.
Configure shares in smb.conf on all nodes, and create /etc/ctdb/nodes
containing a list of private IP addresses of each node in the cluster.
Configure this RA as a clone, and it will take care of the rest.
For more information see http://linux-ha.org/wiki/CTDB_(resource_agent)
db2 Manages an IBM DB2 Universal Database instance
Resource script for db2. It manages a DB2 Universal Database instance as an HA resource.
Delay Waits for a defined timespan
This script is a test resource for introducing delay.
drbd Manages a DRBD resource (deprecated)
Deprecation warning: This agent is deprecated and may be removed from
a future release. See the ocf:linbit:drbd resource agent for a
supported alternative. --
This resource agent manages a Distributed
Replicated Block Device (DRBD) object as a master/slave
resource. DRBD is a mechanism for replicating storage; please see the
documentation for setup details.
Dummy Example stateless resource agent
This is a Dummy Resource Agent. It does absolutely nothing except 
keep track of whether it's running or not.
Its purpose in life is for testing and to serve as a template for RA writers.

NB: Please pay attention to the timeouts specified in the actions
section below. They should be meaningful for the kind of resource
the agent manages. They should be the minimum advised timeouts,
but they shouldn't/cannot cover _all_ possible resource
instances. So, try to be neither overly generous nor too stingy,
but moderate. The minimum timeouts should never be below 10 seconds.
eDir88 Manages a Novell eDirectory directory server
Resource script for managing an eDirectory instance. Manages a single instance
of eDirectory as an HA resource. The "multiple instances" feature of
eDirectory was added in version 8.8. This script will not work for any
version of eDirectory prior to 8.8. This RA can be used to load multiple
eDirectory instances on the same host.

It is very strongly recommended to put eDir configuration files (as per the
eDir_config_file parameter) on local storage on each node. This is necessary for
this RA to be able to handle situations where the shared storage has become
unavailable. If the eDir configuration file is not available, this RA will fail,
and heartbeat will be unable to manage the resource. Side effects include
STONITH actions, unmanageable resources, etc...

Setting a high action timeout value is _very_ _strongly_ recommended. eDir
with IDM can take in excess of 10 minutes to start. If heartbeat times out
before eDir has had a chance to start properly, mayhem _WILL ENSUE_.

The LDAP module seems to be one of the very last to start. So this script will
take even longer to start on installations with IDM and LDAP if the monitoring
of IDM and/or LDAP is enabled, as the start command will wait for IDM and LDAP
to be available.
Evmsd Controls clustered EVMS volume management (deprecated)
Deprecation warning: EVMS is no longer actively maintained and should not be used. This agent is deprecated and may be removed from a future release. --
This is an Evmsd Resource Agent.
EvmsSCC Manages EVMS Shared Cluster Containers (SCCs) (deprecated)
Deprecation warning: EVMS is no longer actively maintained and should not be used. This agent is deprecated and may be removed from a future release. --
Resource script for EVMS shared cluster container. It runs evms_activate on one node in the cluster.
exportfs Manages NFS exports
Exportfs uses the exportfs command to add/remove NFS exports.
It does NOT manage the NFS server daemon.
It depends on Linux specific NFS implementation details,
so is considered not portable to other platforms yet.
Filesystem Manages filesystem mounts
Resource script for Filesystem. It manages a Filesystem on a
shared storage medium. 

The standard monitor operation of depth 0 (also known as probe)
checks if the filesystem is mounted. If you want deeper tests,
set OCF_CHECK_LEVEL to one of the following values:

10: read first 16 blocks of the device (raw read)

This doesn't exercise the filesystem at all, but the device on
which the filesystem lives. This is a no-op for non-block devices
such as NFS, SMBFS, or bind mounts.

20: test if a status file can be written and read

The status file must be writable by root. This is not always the
case with an NFS mount, as NFS exports usually have the
"root_squash" option set. In such a setup, you must either use
read-only monitoring (depth=10), export with "no_root_squash" on
your NFS server, or grant world write permissions on the
directory where the status file is to be placed.
fio fio IO load generator
fio is a generic I/O load generator. This RA allows start/stop of fio
instances to simulate load on a cluster without configuring complex
services.
ICP Manages an ICP Vortex clustered host drive
Resource script for ICP. It manages an ICP Vortex clustered host drive as an
HA resource.
ids Manages an Informix Dynamic Server (IDS) instance
OCF resource agent to manage an IBM Informix Dynamic Server (IDS) instance as a high-availability resource.
IPaddr Manages virtual IPv4 addresses (portable version)
This script manages IP alias IP addresses
It can add an IP alias, or remove one.
IPaddr2 Manages virtual IPv4 addresses (Linux specific version)
This Linux-specific resource manages IP alias IP addresses.
It can add an IP alias, or remove one.
In addition, it can implement Cluster Alias IP functionality
if invoked as a clone resource.
IPsrcaddr Manages the preferred source address for outgoing IP packets
Resource script for IPsrcaddr. It manages the preferred source address
modification. 
iscsi Manages a local iSCSI initiator and its connections to iSCSI targets
OCF Resource Agent for iSCSI. Add (start) or remove (stop) iSCSI
targets.
iSCSILogicalUnit Manages iSCSI Logical Units (LUs)
Manages an iSCSI Logical Unit. An iSCSI Logical Unit is a subdivision of
a SCSI Target, exported via a daemon that speaks the iSCSI protocol.
iSCSITarget iSCSI target export agent
Manages iSCSI targets. An iSCSI target is a collection of SCSI Logical
Units (LUs) exported via a daemon that speaks the iSCSI protocol.
LinuxSCSI Enables and disables SCSI devices through the kernel SCSI hot-plug subsystem (deprecated)
Deprecation warning: This agent makes use of Linux SCSI hot-plug
functionality which has been superseded by SCSI reservations. It is
deprecated and may be removed from a future release. See the
scsi2reservation and sfex agents for alternatives. --
This is a resource agent for LinuxSCSI. It manages the availability of a
SCSI device from the point of view of the Linux kernel. It makes Linux
believe the device has gone away, and it can make it come back again.
LVM Controls the availability of an LVM Volume Group
Resource script for LVM. It manages a Linux Volume Manager (LVM) volume
as an HA resource.
MailTo Notifies recipients by email in the event of resource takeover
This is a resource agent for MailTo. It sends email to a sysadmin whenever 
a takeover occurs.
ManageRAID Manages RAID devices
Manages starting, stopping and monitoring of RAID devices which
are preconfigured in /etc/conf.d/HB-ManageRAID.
ManageVE Manages an OpenVZ Virtual Environment (VE)
This OCF-compliant resource agent manages OpenVZ VEs and thus requires
a proper OpenVZ installation including a recent vzctl utility.
mysql Manages a MySQL database instance
Resource script for MySQL. 
May manage a standalone MySQL database, a clone set with externally
managed replication, or a complete master/slave replication setup.
mysql-proxy Manages a MySQL Proxy daemon
This script manages MySQL Proxy as an OCF resource in a high-availability setup.
Tested with MySQL Proxy 0.7.0 on Debian 5.0.
nfsserver Manages an NFS server
Nfsserver helps to manage the Linux nfs server as a failover-able resource in Linux-HA.
It depends on Linux specific NFS implementation details, so is considered not portable to other platforms yet.
oracle Manages an Oracle Database instance
Resource script for oracle. Manages an Oracle Database instance
as an HA resource.
oralsnr Manages an Oracle TNS listener
Resource script for Oracle Listener. It manages an
Oracle Listener instance as an HA resource.
pgsql Manages a PostgreSQL database instance
Resource script for PostgreSQL. It manages a PostgreSQL instance as an HA resource.
pingd Monitors connectivity to specific hosts or IP addresses ("ping nodes") (deprecated)
Deprecation warning: This agent is deprecated and may be removed from
a future release. See the ocf:pacemaker:pingd resource agent for a
supported alternative. --
This is a pingd Resource Agent.
It records (in the CIB) the current number of ping nodes a node can connect to.
portblock Blocks and unblocks access to TCP and UDP ports
Resource script for portblock. It is used to temporarily block ports 
using iptables. In addition, it may allow for faster TCP reconnects
for clients on failover. Use that if there are long lived TCP
connections to an HA service. This feature is enabled by setting the
tickle_dir parameter and only in concert with action set to unblock.
Note that the tickle ACK function is new as of version 3.0.2 and
hasn't yet seen widespread use.
postfix Manages a highly available Postfix mail server instance
This script manages Postfix as an OCF resource in a high-availability setup.
Tested with Postfix 2.5.5 on Debian 5.0.
proftpd OCF Resource Agent compliant FTP script.
This script manages Proftpd in an Active-Passive setup
Pure-FTPd Manages a Pure-FTPd FTP server instance
This script manages Pure-FTPd in an Active-Passive setup
Raid1 Manages a software RAID1 device on shared storage
Resource script for RAID1. It manages a software Raid1 device on a shared 
storage medium. 
Route Manages network routes
Enables and disables network routes.

Supports host and net routes, routes via a gateway address,
and routes using specific source addresses.

This resource agent is useful if a node's routing table
needs to be manipulated based on node role assignment.
Consider the following example use case:
-  One cluster node serves as an IPsec tunnel endpoint.
-  All other nodes use the IPsec tunnel to reach hosts
in a specific remote network.
Then, here is how you would implement this scheme making use
of the Route resource agent:
-  Configure an ipsec LSB resource.
-  Configure a cloned Route OCF resource.
-  Create an order constraint to ensure 
that ipsec is started before Route.
-  Create a colocation constraint between the
ipsec and Route resources, to make sure no instance
of your cloned Route resource is started on the
tunnel endpoint itself.
rsyncd Manages an rsync daemon
This script manages the rsync daemon.
SAPDatabase Manages any SAP database (based on Oracle, MaxDB, or DB2)
Resource script for SAP databases. It manages a SAP database of any type as an HA resource.
SAPInstance Manages a SAP instance
Resource script for SAP. It manages a SAP Instance as an HA resource.
scsi2reservation scsi-2 reservation
The scsi-2-reserve resource agent is a placeholder for SCSI-2 reservation.
A healthy instance of the scsi-2-reserve resource indicates ownership of the specified SCSI device.
This resource agent depends on scsi_reserve from the scsires package, which is Linux specific.
SendArp Broadcasts unsolicited ARP announcements
This script sends out gratuitous ARP for an IP address.
ServeRAID Enables and disables shared ServeRAID merge groups
Resource script for ServeRAID. It enables/disables shared ServeRAID merge groups.
sfex Manages exclusive access to shared storage using Shared Disk File EXclusiveness (SF-EX)
Resource script for SF-EX. It manages a shared storage medium exclusively.
SphinxSearchDaemon Manages the Sphinx search daemon.
This is a searchd Resource Agent. It manages the Sphinx Search Daemon.
Squid Manages a Squid proxy server instance
The resource agent of Squid.
This manages a Squid instance as an HA resource.
Stateful Example stateful resource agent
This is an example resource agent that implements two states.
SysInfo Records various node attributes in the CIB
This is a SysInfo Resource Agent.
It records (in the CIB) various attributes of a node
Sample Linux output:
arch:   i686
os:     Linux-2.4.26-gentoo-r14
free_swap:      1999
cpu_info:       Intel(R) Celeron(R) CPU 2.40GHz
cpu_speed:      4771.02
cpu_cores:      1
cpu_load:       0.00
ram_total:      513
ram_free:       117
root_free:      2.4

Sample Darwin output:
arch:   i386
os:     Darwin-8.6.2
cpu_info:       Intel Core Duo
cpu_speed:      2.16
cpu_cores:      2
cpu_load:       0.18
ram_total:      2016
ram_free:       787
root_free:      13

Units:
free_swap: MB
ram_*:     MB
root_free: GB
cpu_speed (Linux): bogomips
cpu_speed (Darwin): GHz

syslog-ng Syslog-ng resource agent
This script manages a syslog-ng instance as an HA resource.
tomcat Manages a Tomcat servlet environment instance
Resource script for tomcat. It manages a Tomcat instance as an HA resource.
VIPArip Manages a virtual IP address through RIP2
Virtual IP Address by RIP2 protocol.
This script manages an IP alias in a different subnet with quagga/ripd.
It can add an IP alias, or remove one.
VirtualDomain Manages virtual domains through the libvirt virtualization framework
Resource agent for a virtual domain (a.k.a. domU, virtual machine,
virtual environment etc., depending on context) managed by libvirtd.
vmware Manages VMware Server 2.0 virtual machines
OCF compliant script to control VMware Server 2.0 virtual machines.
WAS Manages a WebSphere Application Server instance
Resource script for WAS. It manages a Websphere Application Server (WAS) as 
an HA resource.
WAS6 Manages a WebSphere Application Server 6 instance
Resource script for WAS6. It manages a Websphere Application Server (WAS6) as
an HA resource.
WinPopup Sends an SMB notification message to selected hosts
Resource script for WinPopup. It sends a WinPopup message to a
sysadmin's workstation whenever a takeover occurs.
Xen Manages Xen unprivileged domains (DomUs)
Resource Agent for the Xen Hypervisor.
Manages Xen virtual machine instances by mapping cluster resource
start and stop to Xen create and shutdown, respectively.

A note on names

We will try to extract the name from the config file (the xmfile
attribute). If you use a simple assignment statement, then you
should be fine. Otherwise, if there's some Python acrobatics
involved, such as dynamically assigning names depending on other
variables (we will try to detect this), then please set the
name attribute. You should also do that if there is any chance of
a pathological situation where a config file might be missing,
for example if it resides on shared storage. If all fails, we
finally fall back to the instance id to preserve backward
compatibility.

Para-virtualized guests can also be migrated by enabling the
meta_attribute allow-migrate.

Xinetd Manages an Xinetd service
Resource script for Xinetd. It starts/stops services managed
by xinetd.

Note that the xinetd daemon itself must be running: we are not
going to start it or stop it ourselves.

Important: in case the services managed by the cluster are the
only ones enabled, you should specify the -stayalive option for
xinetd or it will exit on Heartbeat stop. Alternatively, you may
enable some internal service such as echo.
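
The deeper Filesystem monitor levels described in the table can be pictured roughly as follows. This is a sketch, not the agent's actual code: the helper names, the 512-byte block size, the .example_status file name, and the choice to run the level 10 check as part of level 20 are all assumptions made for illustration, and the level 0 mount check is left out.

```shell
#!/bin/sh
# Hypothetical helpers illustrating the two deeper check levels of the
# Filesystem entry. None of these names come from the real agent.

# Level 10: raw-read the first 16 blocks of the device.
# The 512-byte block size is an assumption for this sketch.
check_raw_read() {
    dd if="$1" of=/dev/null bs=512 count=16 2>/dev/null
}

# Level 20: write a status file, read it back, and compare.
# The file name .example_status is made up for this sketch.
check_status_file() {
    status_file="$1/.example_status"
    token="probe $$"
    echo "$token" > "$status_file" || return 1
    [ "$(cat "$status_file")" = "$token" ] || return 1
    rm -f "$status_file"
}

# deep_monitor DEVICE DIRECTORY
# Runs every check up to the level given in OCF_CHECK_LEVEL (default 0).
deep_monitor() {
    device=$1
    dir=$2
    level=${OCF_CHECK_LEVEL:-0}
    if [ "$level" -ge 10 ]; then
        check_raw_read "$device" || return 1
    fi
    if [ "$level" -ge 20 ]; then
        check_status_file "$dir" || return 1
    fi
    return 0
}
```

With root_squash on an NFS export, the write in check_status_file is the step that would fail, which is why the table entry suggests staying at level 10 or relaxing the export options in that setup.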

To regenerate the table above, use:

 #!/bin/bash
 #
 # Run this from the root of the agents source tree to get a mediawiki
 # table of all RAs that will be built.  Requires xmlstarlet.
 #
 echo '{|'
 AGENTS=$(sed -n '/^ocf_SCRIPTS/,/^$/p' heartbeat/Makefile.am | sed -e 's/ocf_SCRIPTS[[:space:]]*=[[:space:]]*//' -e 's/\\//' | sort)
 for ra in $AGENTS; do
 	echo '|-'
 	echo "|[http://linux-ha.org/doc/re-ra-$ra.html $ra]"
 	sed -n '/<?xml/,/<\/resource-agent>/p' heartbeat/$ra | \
 		xml sel -T -t -o '|' -v '//shortdesc' -n -o '|-' -n \
 			-o '|||&lt;pre&gt;' \
 			-v '//longdesc' \
 			-o '&lt;/pre&gt;' -n |
 		sed -e 's/^[[:space:]]*//g'
 done
 echo '|}'
 