Resource Agents
A resource agent is a standardized interface for a cluster resource. It translates a standard set of operations into steps specific to the resource or application, and interprets their results as success or failure.
Resource Agents have been managed as a separate Linux-HA sub-project since their 1.0 release, which coincided with the Heartbeat 2.99 release. Previously, they were a part of the then-monolithic Heartbeat project, and had no collective name.
Supported Operations
Operations which a resource agent may perform on a resource instance include:
- start: enable or start the given resource
- stop: disable or stop the given resource
- status: return the status of the given resource (running or not running)
- monitor: like status, but also check specifically for unexpected not running states
- validate: validate the resource's configuration
- meta-data: return information about the resource agent itself (used by GUIs and other management utilities, and documentation tools)
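The operations listed above can be sketched as a small shell dispatcher. This is only an illustration under stated assumptions: the "resource" is a hypothetical state file, the demo_* function names and ra_dispatch are invented for this example, and the numeric exit codes follow the OCF resource agent API rather than anything stated on this page.

```shell
#!/bin/bash
# Sketch of a resource agent: maps the standard operations onto a
# trivial "resource" that is just a state file (an assumption made
# for illustration, not a real service).

# Exit codes as defined by the OCF resource agent API.
OCF_SUCCESS=0
OCF_ERR_GENERIC=1
OCF_ERR_UNIMPLEMENTED=3
OCF_NOT_RUNNING=7

# Hypothetical state file; its presence means "running".
STATEFILE="${TMPDIR:-/tmp}/demo-ra.$$.state"

demo_monitor() {
    [ -f "$STATEFILE" ] && return $OCF_SUCCESS
    return $OCF_NOT_RUNNING
}

demo_start() {
    demo_monitor && return $OCF_SUCCESS      # already running
    touch "$STATEFILE" || return $OCF_ERR_GENERIC
    return $OCF_SUCCESS
}

demo_stop() {
    rm -f "$STATEFILE" || return $OCF_ERR_GENERIC
    return $OCF_SUCCESS
}

demo_validate() {
    # The only "configuration" here is the state file's directory.
    local dir
    dir=$(dirname "$STATEFILE")
    [ -d "$dir" ] && [ -w "$dir" ] && return $OCF_SUCCESS
    return $OCF_ERR_GENERIC
}

demo_meta_data() {
    cat <<'EOF'
<?xml version="1.0"?>
<resource-agent name="demo">
<longdesc lang="en">Demonstration agent backed by a state file.</longdesc>
<shortdesc lang="en">Example agent</shortdesc>
</resource-agent>
EOF
}

# Translate the requested operation into resource-specific steps.
ra_dispatch() {
    case "$1" in
        start)          demo_start ;;
        stop)           demo_stop ;;
        status|monitor) demo_monitor ;;
        validate)       demo_validate ;;
        meta-data)      demo_meta_data ;;
        *)              return $OCF_ERR_UNIMPLEMENTED ;;
    esac
}

# Demonstration: run a monitor/start/stop cycle and print each result.
for op in monitor start monitor stop monitor; do
    rc=0
    ra_dispatch "$op" || rc=$?
    echo "$op -> $rc"
done
```

Run as is, the loop prints 7 (not running) for the first and last monitor call and 0 for everything in between. A real agent would call the dispatcher with its first command-line argument instead of the demonstration loop.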
Implementation
Most resource agents are coded as shell scripts. This, however, is by no means a necessity – the defined interface is language agnostic.
They are synchronous in nature: once an operation is invoked, the caller is expected to wait until it completes. Certain operations (notably start, stop and monitor) may take considerable time to complete, ranging from seconds to many minutes in some cases.
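Because operations block until they finish, a caller typically bounds the wait with a timeout. The following is a minimal sketch of that idea, assuming the GNU coreutils timeout command is available (it exits with status 124 when the limit is hit). The run_op helper is hypothetical, and sleep stands in for a slow agent operation.

```shell
#!/bin/bash
# Sketch: invoke an operation synchronously, but give up after a limit.
# Assumes GNU coreutils "timeout"; "run_op" is a hypothetical helper.

run_op() {
    # $1 = timeout in seconds, remaining arguments = command to run
    local limit=$1
    shift
    local rc=0
    timeout "$limit" "$@" || rc=$?
    if [ "$rc" -eq 124 ]; then
        echo "timed out after ${limit}s"
    else
        echo "completed with exit code $rc"
    fi
    return "$rc"
}

# "sleep" stands in for an agent operation such as start or stop.
run_op 2 sleep 1            # finishes within the limit
run_op 1 sleep 2 || true    # exceeds the limit and is terminated
```

In practice the limit would be chosen per operation and per resource, since a start that takes seconds for one service may take minutes for another.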
Source Code Repository
Source code for Resource Agents is maintained in the Mercurial repository at http://hg.linux-ha.org/agents.
Available Resource Agents (current as at 2010-05-28)
| anything | Manages an arbitrary service |
This is a generic OCF RA to manage almost anything. | |
| AoEtarget | Manages ATA-over-Ethernet (AoE) target exports |
This resource agent manages an ATA-over-Ethernet (AoE) target using vblade. It exports any block device, or file, as an AoE target using the specified Ethernet device, shelf, and slot number. | |
| apache | Manages an Apache web server instance |
This is the resource agent for the Apache web server. This resource agent operates both version 1.x and version 2.x Apache servers. The start operation ends with a loop in which monitor is repeatedly called to make sure that the server started and that it is operational. Hence, if the monitor operation does not succeed within the start operation timeout, the apache resource will end with an error status. The monitor operation by default loads the server status page which depends on the mod_status module and the corresponding configuration file (usually /etc/apache2/mod_status.conf). Make sure that the server status page works and that the access is allowed *only* from localhost (address 127.0.0.1). See the statusurl and testregex attributes for more details. See also http://httpd.apache.org/ | |
| AudibleAlarm | Emits audible beeps at a configurable interval |
Resource script for AudibleAlarm. It sets an audible alarm running by beeping at a set interval. | |
| ClusterMon | Runs crm_mon in the background, recording the cluster status to an HTML file |
This is a ClusterMon Resource Agent. It outputs the current cluster status to an HTML file. | |
| CTDB | CTDB Resource Agent |
This resource agent manages CTDB, allowing one to use Clustered Samba in a Linux-HA/Pacemaker cluster. You need a shared filesystem (e.g. OCFS2) on which CTDB lock and Samba state will be stored. Configure shares in smb.conf on all nodes, and create /etc/ctdb/nodes containing a list of private IP addresses of each node in the cluster. Configure this RA as a clone, and it will take care of the rest. For more information see http://linux-ha.org/wiki/CTDB_(resource_agent) | |
| db2 | Manages an IBM DB2 Universal Database instance |
Resource script for db2. It manages a DB2 Universal Database instance as an HA resource. | |
| Delay | Waits for a defined timespan |
This script is a test resource for introducing delay. | |
| drbd | Manages a DRBD resource (deprecated) |
Deprecation warning: This agent is deprecated and may be removed from a future release. See the ocf:linbit:drbd resource agent for a supported alternative. -- This resource agent manages a Distributed Replicated Block Device (DRBD) object as a master/slave resource. DRBD is a mechanism for replicating storage; please see the documentation for setup details. | |
| Dummy | Example stateless resource agent |
This is a Dummy Resource Agent. It does absolutely nothing except keep track of whether it's running or not. Its purpose in life is for testing and to serve as a template for RA writers. NB: Please pay attention to the timeouts specified in the actions section below. They should be meaningful for the kind of resource the agent manages. They should be the minimum advised timeouts, but they shouldn't/cannot cover _all_ possible resource instances. So, try to be neither overly generous nor too stingy, but moderate. The minimum timeouts should never be below 10 seconds. | |
| eDir88 | Manages a Novell eDirectory directory server |
Resource script for managing an eDirectory instance. Manages a single instance of eDirectory as an HA resource. The "multiple instances" feature of eDirectory has been added in version 8.8. This script will not work for any version of eDirectory prior to 8.8. This RA can be used to load multiple eDirectory instances on the same host. It is very strongly recommended to put eDir configuration files (as per the eDir_config_file parameter) on local storage on each node. This is necessary for this RA to be able to handle situations where the shared storage has become unavailable. If the eDir configuration file is not available, this RA will fail, and heartbeat will be unable to manage the resource. Side effects include STONITH actions, unmanageable resources, etc... Setting a high action timeout value is _very_ _strongly_ recommended. eDir with IDM can take in excess of 10 minutes to start. If heartbeat times out before eDir has had a chance to start properly, mayhem _WILL ENSUE_. The LDAP module seems to be one of the very last to start. So this script will take even longer to start on installations with IDM and LDAP if the monitoring of IDM and/or LDAP is enabled, as the start command will wait for IDM and LDAP to be available. | |
| Evmsd | Controls clustered EVMS volume management (deprecated) |
Deprecation warning: EVMS is no longer actively maintained and should not be used. This agent is deprecated and may be removed from a future release. -- This is an Evmsd Resource Agent. | |
| EvmsSCC | Manages EVMS Shared Cluster Containers (SCCs) (deprecated) |
Deprecation warning: EVMS is no longer actively maintained and should not be used. This agent is deprecated and may be removed from a future release. -- Resource script for EVMS shared cluster container. It runs evms_activate on one node in the cluster. | |
| exportfs | Manages NFS exports |
Exportfs uses the exportfs command to add/remove nfs exports. It does NOT manage the nfs server daemon. It depends on Linux specific NFS implementation details, so is considered not portable to other platforms yet. | |
| Filesystem | Manages filesystem mounts |
Resource script for Filesystem. It manages a Filesystem on a shared storage medium. The standard monitor operation of depth 0 (also known as probe) checks if the filesystem is mounted. If you want deeper tests, set OCF_CHECK_LEVEL to one of the following values: 10: read first 16 blocks of the device (raw read) This doesn't exercise the filesystem at all, but the device on which the filesystem lives. This is noop for non-block devices such as NFS, SMBFS, or bind mounts. 20: test if a status file can be written and read The status file must be writable by root. This is not always the case with an NFS mount, as NFS exports usually have the "root_squash" option set. In such a setup, you must either use read-only monitoring (depth=10), export with "no_root_squash" on your NFS server, or grant world write permissions on the directory where the status file is to be placed. | |
| fio | fio IO load generator |
fio is a generic I/O load generator. This RA allows start/stop of fio instances to simulate load on a cluster without configuring complex services. | |
| ICP | Manages an ICP Vortex clustered host drive |
Resource script for ICP. It manages an ICP Vortex clustered host drive as an HA resource. | |
| ids | Manages an Informix Dynamic Server (IDS) instance |
OCF resource agent to manage an IBM Informix Dynamic Server (IDS) instance as a High-Availability resource. | |
| IPaddr | Manages virtual IPv4 addresses (portable version) |
This script manages IP alias IP addresses. It can add an IP alias, or remove one. | |
| IPaddr2 | Manages virtual IPv4 addresses (Linux specific version) |
This Linux-specific resource manages IP alias IP addresses. It can add an IP alias, or remove one. In addition, it can implement Cluster Alias IP functionality if invoked as a clone resource. | |
| IPsrcaddr | Manages the preferred source address for outgoing IP packets |
Resource script for IPsrcaddr. It manages the preferred source address modification. | |
| iscsi | Manages a local iSCSI initiator and its connections to iSCSI targets |
OCF Resource Agent for iSCSI. Add (start) or remove (stop) iSCSI targets. | |
| iSCSILogicalUnit | Manages iSCSI Logical Units (LUs) |
Manages iSCSI Logical Unit. An iSCSI Logical unit is a subdivision of an SCSI Target, exported via a daemon that speaks the iSCSI protocol. | |
| iSCSITarget | iSCSI target export agent |
Manages iSCSI targets. An iSCSI target is a collection of SCSI Logical Units (LUs) exported via a daemon that speaks the iSCSI protocol. | |
| LinuxSCSI | Enables and disables SCSI devices through the kernel SCSI hot-plug subsystem (deprecated) |
Deprecation warning: This agent makes use of Linux SCSI hot-plug functionality which has been superseded by SCSI reservations. It is deprecated and may be removed from a future release. See the scsi2reservation and sfex agents for alternatives. -- This is a resource agent for LinuxSCSI. It manages the availability of a SCSI device from the point of view of the Linux kernel. It makes Linux believe the device has gone away, and it can make it come back again. | |
| LVM | Controls the availability of an LVM Volume Group |
Resource script for LVM. It manages a Linux Volume Manager (LVM) volume as an HA resource. | |
| MailTo | Notifies recipients by email in the event of resource takeover |
This is a resource agent for MailTo. It sends email to a sysadmin whenever a takeover occurs. | |
| ManageRAID | Manages RAID devices |
Manages starting, stopping and monitoring of RAID devices which are preconfigured in /etc/conf.d/HB-ManageRAID. | |
| ManageVE | Manages an OpenVZ Virtual Environment (VE) |
This OCF compliant resource agent manages OpenVZ VEs and thus requires a proper OpenVZ installation including a recent vzctl util. | |
| mysql | Manages a MySQL database instance |
Resource script for MySQL. May manage a standalone MySQL database, a clone set with externally managed replication, or a complete master/slave replication setup. | |
| mysql-proxy | Manages a MySQL Proxy daemon |
This script manages MySQL Proxy as an OCF resource in a high-availability setup. Tested with MySQL Proxy 0.7.0 on Debian 5.0. | |
| nfsserver | Manages an NFS server |
Nfsserver helps to manage the Linux nfs server as a failover-able resource in Linux-HA. It depends on Linux specific NFS implementation details, so is considered not portable to other platforms yet. | |
| oracle | Manages an Oracle Database instance |
Resource script for oracle. Manages an Oracle Database instance as an HA resource. | |
| oralsnr | Manages an Oracle TNS listener |
Resource script for Oracle Listener. It manages an Oracle Listener instance as an HA resource. | |
| pgsql | Manages a PostgreSQL database instance |
Resource script for PostgreSQL. It manages a PostgreSQL instance as an HA resource. | |
| pingd | Monitors connectivity to specific hosts or IP addresses ("ping nodes") (deprecated) |
Deprecation warning: This agent is deprecated and may be removed from a future release. See the ocf:pacemaker:pingd resource agent for a supported alternative. -- This is a pingd Resource Agent. It records (in the CIB) the current number of ping nodes a node can connect to. | |
| portblock | Blocks and unblocks access to TCP and UDP ports |
Resource script for portblock. It is used to temporarily block ports using iptables. In addition, it may allow for faster TCP reconnects for clients on failover. Use that if there are long lived TCP connections to an HA service. This feature is enabled by setting the tickle_dir parameter and only in concert with action set to unblock. Note that the tickle ACK function is new as of version 3.0.2 and hasn't yet seen widespread use. | |
| postfix | Manages a highly available Postfix mail server instance |
This script manages Postfix as an OCF resource in a high-availability setup. Tested with Postfix 2.5.5 on Debian 5.0. | |
| proftpd | OCF Resource Agent compliant FTP script. |
This script manages Proftpd in an Active-Passive setup | |
| Pure-FTPd | Manages a Pure-FTPd FTP server instance |
This script manages Pure-FTPd in an Active-Passive setup | |
| Raid1 | Manages a software RAID1 device on shared storage |
Resource script for RAID1. It manages a software Raid1 device on a shared storage medium. | |
| Route | Manages network routes |
Enables and disables network routes. Supports host and net routes, routes via a gateway address, and routes using specific source addresses. This resource agent is useful if a node's routing table needs to be manipulated based on node role assignment. Consider the following example use case: - One cluster node serves as an IPsec tunnel endpoint. - All other nodes use the IPsec tunnel to reach hosts in a specific remote network. Then, here is how you would implement this scheme making use of the Route resource agent: - Configure an ipsec LSB resource. - Configure a cloned Route OCF resource. - Create an order constraint to ensure that ipsec is started before Route. - Create a colocation constraint between the ipsec and Route resources, to make sure no instance of your cloned Route resource is started on the tunnel endpoint itself. | |
| rsyncd | Manages an rsync daemon |
This script manages an rsync daemon. | |
| SAPDatabase | Manages any SAP database (based on Oracle, MaxDB, or DB2) |
Resource script for SAP databases. It manages a SAP database of any type as an HA resource. | |
| SAPInstance | Manages a SAP instance |
Resource script for SAP. It manages a SAP Instance as an HA resource. | |
| scsi2reservation | scsi-2 reservation |
The scsi-2-reserve resource agent is a placeholder for SCSI-2 reservation. A healthy instance of the scsi-2-reserve resource indicates ownership of the specified SCSI device. This resource agent depends on scsi_reserve from the scsires package, which is Linux specific. | |
| SendArp | Broadcasts unsolicited ARP announcements |
This script sends out gratuitous ARP for an IP address. | |
| ServeRAID | Enables and disables shared ServeRAID merge groups |
Resource script for ServeRAID. It enables/disables shared ServeRAID merge groups. | |
| sfex | Manages exclusive access to shared storage using Shared Disk File EXclusiveness (SF-EX) |
Resource script for SF-EX. It manages a shared storage medium exclusively. | |
| SphinxSearchDaemon | Manages the Sphinx search daemon. |
This is a searchd Resource Agent. It manages the Sphinx Search Daemon. | |
| Squid | Manages a Squid proxy server instance |
The resource agent of Squid. This manages a Squid instance as an HA resource. | |
| Stateful | Example stateful resource agent |
This is an example resource agent that implements two states. | |
| SysInfo | Records various node attributes in the CIB |
This is a SysInfo Resource Agent. It records (in the CIB) various attributes of a node Sample Linux output: arch: i686 os: Linux-2.4.26-gentoo-r14 free_swap: 1999 cpu_info: Intel(R) Celeron(R) CPU 2.40GHz cpu_speed: 4771.02 cpu_cores: 1 cpu_load: 0.00 ram_total: 513 ram_free: 117 root_free: 2.4 Sample Darwin output: arch: i386 os: Darwin-8.6.2 cpu_info: Intel Core Duo cpu_speed: 2.16 cpu_cores: 2 cpu_load: 0.18 ram_total: 2016 ram_free: 787 root_free: 13 Units: free_swap: Mb ram_*: Mb root_free: Gb cpu_speed (Linux): bogomips cpu_speed (Darwin): Ghz | |
| syslog-ng | Syslog-ng resource agent |
This script manages a syslog-ng instance as an HA resource. | |
| tomcat | Manages a Tomcat servlet environment instance |
Resource script for tomcat. It manages a Tomcat instance as an HA resource. | |
| VIPArip | Manages a virtual IP address through RIP2 |
Virtual IP Address by RIP2 protocol. This script manages an IP alias in a different subnet with quagga/ripd. It can add an IP alias, or remove one. | |
| VirtualDomain | Manages virtual domains through the libvirt virtualization framework |
Resource agent for a virtual domain (a.k.a. domU, virtual machine, virtual environment etc., depending on context) managed by libvirtd. | |
| vmware | Manages VMware Server 2.0 virtual machines |
OCF compliant script to control VMware Server 2.0 virtual machines. | |
| WAS | Manages a WebSphere Application Server instance |
Resource script for WAS. It manages a WebSphere Application Server (WAS) as an HA resource. | |
| WAS6 | Manages a WebSphere Application Server 6 instance |
Resource script for WAS6. It manages a WebSphere Application Server (WAS6) as an HA resource. | |
| WinPopup | Sends an SMB notification message to selected hosts |
Resource script for WinPopup. It sends a WinPopup message to a sysadmin's workstation whenever a takeover occurs. | |
| Xen | Manages Xen unprivileged domains (DomUs) |
Resource Agent for the Xen Hypervisor. Manages Xen virtual machine instances by mapping cluster resource start and stop to Xen create and shutdown, respectively. A note on names: we will try to extract the name from the config file (the xmfile attribute). If you use a simple assignment statement, then you should be fine. Otherwise, if there are some Python acrobatics involved, such as dynamically assigning names depending on other variables (we will try to detect this), then please set the name attribute. You should also do that if there is any chance of a pathological situation where a config file might be missing, for example if it resides on shared storage. If all else fails, we finally fall back to the instance id to preserve backward compatibility. Para-virtualized guests can also be migrated by enabling the meta_attribute allow-migrate. | |
| Xinetd | Manages an Xinetd service |
Resource script for Xinetd. It starts/stops services managed by xinetd. Note that the xinetd daemon itself must be running: we are not going to start it or stop it ourselves. Important: in case the services managed by the cluster are the only ones enabled, you should specify the -stayalive option for xinetd or it will exit on Heartbeat stop. Alternatively, you may enable some internal service such as echo. |
To regenerate the table above, use:
#!/bin/bash
#
# Run this from the root of the agents source tree to get a mediawiki
# table of all RAs that will be built. Requires xmlstarlet.
#
echo '{|'
AGENTS=$(sed -n '/^ocf_SCRIPTS/,/^$/p' heartbeat/Makefile.am | sed -e 's/ocf_SCRIPTS[[:space:]]*=[[:space:]]*//' -e 's/\\//' | sort)
for ra in $AGENTS; do
    echo '|-'
    echo "|[http://linux-ha.org/doc/re-ra-$ra.html $ra]"
    sed -n '/<?xml/,/<\/resource-agent>/p' "heartbeat/$ra" | \
        xml sel -T -t -o '|' -v '//shortdesc' -n -o '|-' -n \
            -o '|||<pre>' \
            -v '//longdesc' \
            -o '</pre>' -n |
        sed -e 's/^[[:space:]]*//g'
done
echo '|}'
