type=page
status=published
title=High Availability in {productName}
next=ssh-setup.html
prev=preface.html
~~~~~~

= High Availability in {productName}

[[GSHAG00002]][[abdaq]]


[[high-availability-in-glassfish-server]]
== 1 High Availability in {productName}

This chapter describes the high availability features in {productName} 7.

The following topics are addressed here:

* link:#abdar[Overview of High Availability]
* link:#gaymr[How {productName} Provides High Availability]
* link:#gbcot[Recovering from Failures]
* link:#abdaz[More Information]

[[abdar]][[GSHAG00168]][[overview-of-high-availability]]

=== Overview of High Availability

High availability applications and services provide their functionality
continuously, regardless of hardware and software failures. To make such
reliability possible, {productName} provides mechanisms for
maintaining application state data between clustered {productName}
instances. Application state data, such as HTTP session data, stateful
EJB sessions, and dynamic cache information, is replicated in real time
across server instances. If any one server instance goes down, the
session state is available to the next failover server, resulting in
minimum application downtime and enhanced transactional security.

{productName} provides the following high availability features:

* link:#gksdm[Load Balancing With the Apache `mod_jk` or `mod_proxy_ajp` Module]
* link:#gaynn[High Availability Session Persistence]
* link:#gayna[High Availability Java Message Service]
* link:#gaymz[RMI-IIOP Load Balancing and Failover]

[[gksdm]][[GSHAG00252]][[load-balancing-with-the-apache-mod_jk-or-mod_proxy_ajp-module]]

==== Load Balancing With the Apache `mod_jk` or `mod_proxy_ajp` Module

A common load balancing configuration for {productName} 7 is to use
the Apache HTTP Server as the web server front-end, and the Apache
`mod_jk` or `mod_proxy_ajp` module as the connector between the web
server and {productName}. See
link:http-load-balancing.html#gksdt[Configuring {productName} with
Apache HTTP Server and `mod_jk`] and
link:http-load-balancing.html#CHDCCGDC[Configuring {productName} with
Apache HTTP Server and `mod_proxy_ajp`] for more information.

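On the {productName} side, this configuration requires an AJP-capable
network listener on the instances that the web server forwards to. The
following is a minimal sketch, not the complete procedure from the linked
chapter; the target cluster name `cluster1`, the listener name
`jk-connector`, and port 8009 are illustrative values.

----
# Create a jk-enabled (AJP) listener so that Apache's mod_jk or
# mod_proxy_ajp module can forward requests to the cluster's instances.
# Target, listener name, and port are illustrative; adjust to your topology.
asadmin create-network-listener --target cluster1 \
  --protocol http-listener-1 --listenerport 8009 \
  --jkenabled true jk-connector
----
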
[[gaynn]][[GSHAG00253]][[high-availability-session-persistence]]

==== High Availability Session Persistence

{productName} provides high availability of HTTP requests and session
data (both HTTP session data and stateful session bean data).

Jakarta EE applications typically have significant amounts of session state
data. A web shopping cart is the classic example of session state.
Also, an application can cache frequently needed data in the session
object. In fact, almost all applications with significant user
interactions need to maintain session state. Both HTTP sessions and
stateful session beans (SFSBs) have session state data.

Preserving session state across server failures can be important to end
users. If the {productName} instance hosting the user session
experiences a failure, the session state can be recovered, and the
session can continue without loss of information. High availability is
implemented in {productName} by means of in-memory session
replication on {productName} instances running in a cluster.

For more information about in-memory session replication in
{productName}, see link:#gaymr[How {productName} Provides High
Availability]. For detailed instructions on configuring high
availability session persistence, see
link:session-persistence-and-failover.html#abdkz[Configuring High
Availability Session Persistence and Failover].

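As a minimal sketch of enabling this for an application, availability can
be requested at deployment time and the web container's persistence type
inspected for the cluster's configuration. The cluster name `cluster1` and
the application archive `clusterjsp.war` are illustrative; the full
procedure is in the linked chapter.

----
# Deploy with availability enabled so that HTTP session and SFSB state
# are replicated across the cluster (names are illustrative).
asadmin deploy --target cluster1 --availabilityenabled=true clusterjsp.war

# Check that the web container is set to in-memory (replicated) persistence.
asadmin get cluster1-config.availability-service.web-container-availability.persistence-type
----

Note that the web application itself must also be marked `<distributable/>`
in its `web.xml` deployment descriptor for its sessions to be replicated.
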
[[gayna]][[GSHAG00254]][[high-availability-java-message-service]]

==== High Availability Java Message Service

{productName} supports the Java Message Service (JMS) API and JMS
messaging through its built-in `jmsra` resource adapter communicating with
Open Message Queue as the JMS provider. This combination is often called
the JMS Service.

The JMS service makes JMS messaging highly available as follows:

Message Queue Broker Clusters::
By default, when a GlassFish cluster is created, the JMS service
automatically configures a Message Queue broker cluster to provide JMS
messaging services, with one clustered broker assigned to each cluster
instance. This automatically created broker cluster is configurable to
take advantage of the two types of broker clusters, conventional and
enhanced, supported by Message Queue (a configuration sketch follows
this list). +
Additionally, Message Queue broker clusters created and managed using
Message Queue itself can be used as external, or remote, JMS hosts.
Using external broker clusters provides additional deployment options,
such as deploying Message Queue brokers on different hosts from the
GlassFish instances they service, or deploying different numbers of
Message Queue brokers and GlassFish instances. +
For more information about Message Queue clustering, see
link:jms.html#abdbx[Using Message Queue Broker Clusters With {productName}].
Connection Failover::
The use of Message Queue broker clusters allows connection failover in
the event of a broker failure. If the primary JMS host (Message Queue
broker) in use by a GlassFish instance fails, connections to the
failed JMS host will automatically fail over to another host in the
JMS host list, allowing messaging operations to continue and
maintaining JMS messaging semantics. +
For more information about JMS connection failover, see
link:jms.html#abdbv[Connection Failover].

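As a rough sketch of reconfiguring the automatically created broker
cluster, the `configure-jms-cluster` subcommand selects the Message Queue
cluster type for a GlassFish cluster. The cluster name and option values
below are illustrative, and the subcommand has constraints on when it may
be run; see the linked chapter for the full set of options.

----
# Configure cluster1's JMS service as a conventional MQ broker cluster
# that keeps its cluster configuration on a master broker (illustrative values).
asadmin configure-jms-cluster --clustertype=conventional \
  --configstoretype=masterbroker cluster1
----
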
[[gaymz]][[GSHAG00255]][[rmi-iiop-load-balancing-and-failover]]

==== RMI-IIOP Load Balancing and Failover

With RMI-IIOP load balancing, IIOP client requests are distributed to
different server instances or name servers, which spreads the load
evenly across the cluster, providing scalability. IIOP load balancing
combined with EJB clustering and availability also provides EJB
failover.

When a client performs a JNDI lookup for an object, the Naming Service
essentially binds the request to a particular server instance. From then
on, all lookup requests made from that client are sent to the same
server instance, and thus all `EJBHome` objects will be hosted on the
same target server. Any bean references obtained henceforth are also
created on the same target host. This effectively provides load
balancing, since all clients randomize the list of target servers when
performing JNDI lookups. If the target server instance goes down, the
lookup or EJB method invocation will fail over to another server
instance.

IIOP load balancing and failover happen transparently. No special steps
are needed during application deployment. If the {productName}
instance on which the application client is deployed participates in a
cluster, {productName} finds all currently active IIOP endpoints
in the cluster automatically. However, a client should have at least two
endpoints specified for bootstrapping purposes, in case one of the
endpoints has failed.

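For example, a standalone application client can be given two or more
bootstrap endpoints through the `com.sun.appserv.iiop.endpoints` system
property. The host names and client JAR below are illustrative; 3700 is
the default IIOP listener port.

----
# Launch an application client with two IIOP bootstrap endpoints so the
# initial naming lookup can fail over if one host is down (illustrative names).
java -Dcom.sun.appserv.iiop.endpoints=host1.example.com:3700,host2.example.com:3700 \
  -jar my-app-client.jar
----
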
For more information on RMI-IIOP load balancing and failover, see
link:rmi-iiop.html#fxxqs[RMI-IIOP Load Balancing and Failover].

[[gaymr]][[GSHAG00169]][[how-glassfish-server-provides-high-availability]]

=== How {productName} Provides High Availability

{productName} provides high availability through the following
subcomponents and features:

* link:#gjghv[Storage for Session State Data]
* link:#abdax[Highly Available Clusters]

[[gjghv]][[GSHAG00256]][[storage-for-session-state-data]]

==== Storage for Session State Data

Storing session state data enables the session state to be recovered
after the failover of a server instance in a cluster. Recovering the
session state enables the session to continue without loss of
information. {productName} supports in-memory session replication on
other servers in the cluster for maintaining HTTP session and stateful
session bean data.

In-memory session replication is implemented in {productName} 7 as
an OSGi module. Internally, the replication module uses a consistent
hash algorithm to pick a replica server instance within a cluster of
instances. This allows the replication module to easily locate the
replica or replicated data when a container needs to retrieve the data.

The use of in-memory replication requires the Group Management Service
(GMS) to be enabled. For more information about GMS, see
link:clusters.html#gjfnl[Group Management Service].

If server instances in a cluster are located on different hosts, ensure
that the following prerequisites are met:

* To ensure that GMS and in-memory replication function correctly, the
hosts must be on the same subnet (a quick check is sketched after this
list).
* To ensure that in-memory replication functions correctly, the system
clocks on all hosts in the cluster must be synchronized as closely as
possible.

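The subnet prerequisite can be checked with the `validate-multicast`
utility, because GMS relies on multicast traffic by default, and GMS
health can be inspected once the cluster is running. The cluster name
`cluster1` is illustrative.

----
# Run on each host in turn to confirm that multicast traffic used by GMS
# reaches the other hosts on the subnet.
asadmin validate-multicast

# With the cluster running, report the GMS health view of each instance
# (cluster name is illustrative).
asadmin get-health cluster1
----
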
[[abdax]][[GSHAG00257]][[highly-available-clusters]]

==== Highly Available Clusters

A highly available cluster integrates a state replication service with
clusters and a load balancer.


[NOTE]
====
When implementing a highly available cluster, use a load balancer that
includes session-based stickiness as part of its load-balancing
algorithm. Otherwise, session data can be misdirected or lost.
An example of a load balancer that includes session-based stickiness is the
Loadbalancer Plug-In available in {productName}.
====


[[abday]][[GSHAG00218]][[clusters-instances-sessions-and-load-balancing]]

===== Clusters, Instances, Sessions, and Load Balancing

Clusters, server instances, load balancers, and sessions are related as
follows:

* A server instance is not required to be part of a cluster. However, an
instance that is not part of a cluster cannot take advantage of high
availability through transfer of session state from one instance to
other instances.
* The server instances within a cluster can be hosted on one or multiple
hosts. You can group server instances across different hosts into a
cluster.
* A particular load balancer can forward requests to server instances on
multiple clusters. You can use this ability of the load balancer to
perform an online upgrade without loss of service. For more information,
see link:rolling-upgrade.html#abdin[Upgrading in Multiple Clusters].
* A single cluster can receive requests from multiple load balancers. If
a cluster is served by more than one load balancer, you must configure
the cluster in exactly the same way on each load balancer.
* Each session is tied to a particular cluster. Therefore, although you
can deploy an application on multiple clusters, session failover will
occur only within a single cluster.

The cluster thus acts as a safe boundary for session failover for the
server instances within the cluster. You can use the load balancer and
upgrade components within {productName} without loss of service.

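A minimal sketch of creating such a cluster across two hosts follows.
The cluster, node, and instance names are illustrative, and the nodes
are assumed to have been created already.

----
# Create a cluster with one instance on each of two existing nodes,
# then start the whole cluster (all names are illustrative).
asadmin create-cluster cluster1
asadmin create-instance --cluster cluster1 --node node1 instance1
asadmin create-instance --cluster cluster1 --node node2 instance2
asadmin start-cluster cluster1
----
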
[[gktax]][[GSHAG00219]][[protocols-for-centralized-cluster-administration]]

===== Protocols for Centralized Cluster Administration

{productName} uses the Distributed Component Object Model (DCOM)
remote protocol or secure shell (SSH) to ensure that clusters that span
multiple hosts can be administered centrally. To perform administrative
operations on {productName} instances that are remote from the domain
administration server (DAS), the DAS must be able to communicate with
those instances. If an instance is running, the DAS connects to the
running instance directly. For example, when you deploy an application
to an instance, the DAS connects to the instance and deploys the
application to the instance.

However, the DAS cannot connect to an instance to perform operations on
an instance that is not running, such as creating or starting the
instance. For these operations, the DAS uses DCOM or SSH to contact a
remote host and administer instances there. DCOM or SSH provides
confidentiality and security for data that is exchanged between the DAS
and remote hosts.


[NOTE]
====
The use of DCOM or SSH to enable centralized administration of remote
instances is optional. If the use of DCOM or SSH is not feasible in your
environment, you can administer remote instances locally.
====


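As a minimal sketch of the SSH case, key distribution and node creation
might look as follows; the host name, installation directory, and node
name are illustrative.

----
# Distribute the DAS administrator's SSH key to a remote host so the DAS
# can reach it over SSH (host name is illustrative).
asadmin setup-ssh host2.example.com

# Register the remote host as an SSH node that the DAS administers centrally.
asadmin create-node-ssh --nodehost host2.example.com \
  --installdir /opt/glassfish7 node2
----
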
For more information, see link:ssh-setup.html#gkshg[Enabling Centralized
Administration of {productName} Instances].

[[gbcot]][[GSHAG00170]][[recovering-from-failures]]

=== Recovering from Failures

You can use various techniques to manually recover individual
subcomponents after hardware failures such as disk crashes.

The following topics are addressed here:

* link:#gcmkp[Recovering the Domain Administration Server]
* link:#gcmkc[Recovering {productName} Instances]
* link:#gcmjs[Recovering the HTTP Load Balancer and Web Server]
* link:#gcmjr[Recovering Message Queue]

[[gcmkp]][[GSHAG00258]][[recovering-the-domain-administration-server]]

==== Recovering the Domain Administration Server

Loss of the Domain Administration Server (DAS) affects only
administration. {productName} clusters and standalone instances, and
the applications deployed to them, continue to run as before, even if
the DAS is not reachable.

Use any of the following methods to recover the DAS:

* Back up the domain periodically, so that you have snapshots of its
configuration (as sketched after this list). After a hardware failure,
re-create the DAS on a new host, as described in
"link:administration-guide/domains.html#GSADG00542[Re-Creating the Domain Administration Server (DAS)]"
in {productName} Administration Guide.
* Put the domain installation and configuration on a shared and robust
file system (NFS, for example). If the primary DAS host fails, a second
host is brought up with the same IP address and takes over with
manual intervention or user-supplied automation.
* Zip the {productName} installation and domain root directory.
Restore it on the new host, assigning it the same network identity.

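A minimal sketch of the backup-based approach uses the `backup-domain`
and `restore-domain` subcommands. The domain name and backup directory
are illustrative, and both subcommands have documented constraints (for
example, on whether the domain may be running) that apply.

----
# Take a periodic snapshot of the domain's configuration
# (domain name and backup directory are illustrative).
asadmin backup-domain --backupdir /backups domain1

# On the replacement DAS host, restore the most recent backup of the domain.
asadmin restore-domain --backupdir /backups domain1
----
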
[[gcmkc]][[GSHAG00259]][[recovering-glassfish-server-instances]]

==== Recovering {productName} Instances

{productName} provides tools for backing up and restoring {productName}
instances. For more information, see link:instances.html#gksdy[To
Resynchronize an Instance and the DAS Offline].

[[gcmjs]][[GSHAG00260]][[recovering-the-http-load-balancer-and-web-server]]

==== Recovering the HTTP Load Balancer and Web Server

There are no explicit commands to back up only a web server
configuration. Simply zip the web server installation directory. After a
failure, unzip the saved backup on a new host with the same network
identity. If the new host has a different IP address, update the DNS
server or the routers.

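For example, with a web server installed under `/opt/SUNWwbsvr` (the
path is illustrative), the backup and restore steps might be:

----
# On the original host: archive the entire web server installation directory.
zip -r webserver-backup.zip /opt/SUNWwbsvr

# On the replacement host (same network identity): restore it in place.
unzip webserver-backup.zip -d /
----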

[NOTE]
====
This assumes that the web server is either reinstalled or restored from
an image first.
====


The Load Balancer Plug-In (`plugins` directory) and configurations are
in the web server installation directory, typically `/opt/SUNWwbsvr`.
The web-install/web-instance/config directory contains the
`loadbalancer.xml` file.

[[gcmjr]][[GSHAG00261]][[recovering-message-queue]]

==== Recovering Message Queue

When a Message Queue broker becomes unavailable, the method you use to
restore the broker to operation depends on the nature of the failure
that caused the broker to become unavailable:

* Power failure or failure other than disk storage
* Failure of disk storage

Additionally, the urgency of restoring an unavailable broker to
operation depends on the type of the broker:

* Standalone Broker. When a standalone broker becomes unavailable, both
service availability and data availability are interrupted. Restore the
broker to operation as soon as possible to restore availability.
* Broker in a Conventional Cluster. When a broker in a conventional
cluster becomes unavailable, service availability continues to be
provided by the other brokers in the cluster. However, data availability
of the persistent data stored by the unavailable broker is interrupted.
Restore the broker to operation to restore availability of its
persistent data.
* Broker in an Enhanced Cluster. When a broker in an enhanced cluster
becomes unavailable, service availability and data availability continue
to be provided by the other brokers in the cluster. Restore the broker
to operation to return the cluster to its previous capacity.

[[glaiv]][[GSHAG00220]][[recovering-from-power-failure-and-failures-other-than-disk-storage]]

===== Recovering From Power Failure and Failures Other Than Disk Storage

When a host is affected by a power failure or failure of a non-disk
component such as memory, processor or network card, restore Message
Queue brokers on the affected host by starting the brokers after the
failure has been remedied.

To start brokers serving as Embedded or Local JMS hosts, start the
GlassFish instances the brokers are servicing. To start brokers serving
as Remote JMS hosts, use the `imqbrokerd` Message Queue utility.

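For example, a broker serving as a Remote JMS host might be restarted
as follows; the broker instance name is illustrative, and 7676 is the
default broker port.

----
# Restart a standalone (Remote JMS host) broker after the host is back up
# (broker instance name and port are illustrative).
imqbrokerd -name broker1 -port 7676
----
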
[[glaiu]][[GSHAG00221]][[recovering-from-failure-of-disk-storage]]

===== Recovering from Failure of Disk Storage

Message Queue uses disk storage for software, configuration files and
persistent data stores. In a default GlassFish installation, all three
of these are generally stored on the same disk: the Message Queue
software in as-install-parent/mq, and broker configuration files and
persistent data stores (except for the persistent data stores of
enhanced clusters, which are housed in highly available databases) in
domain-dir/imq. If this disk fails, restoring brokers to operation is
impossible unless you have previously created a backup of these items.
To create such a backup, use a utility such as `zip`, `gzip` or `tar` to
create archives of these directories and all their content. When
creating the backup, you should first quiesce all brokers and physical
destinations, as described in "link:../openmq/mq-admin-guide/broker-management.html#GMADG00522[Quiescing a Broker]" and
"link:../openmq/mq-admin-guide/message-delivery.html#GMADG00533[Pausing and Resuming a Physical Destination]" in Open
Message Queue Administration Guide, respectively. Then, after the failed
disk is replaced and put into service, expand the backup archive into
the same location.

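A minimal sketch of creating such a backup, assuming default locations
under an installation at `/opt/glassfish7` with a domain named `domain1`
(both illustrative), and with the brokers and destinations already
quiesced:

----
# Archive the Message Queue software, broker configuration files, and
# file-based persistent data stores (paths are illustrative defaults).
tar czf mq-backup.tar.gz \
  /opt/glassfish7/mq \
  /opt/glassfish7/glassfish/domains/domain1/imq
----
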
Restoring the Persistent Data Store From Backup. For many messaging
applications, restoring a persistent data store from backup does not
produce the desired results because the backed-up store does not
represent the content of the store when the disk failure occurred. In
some applications, the persistent data changes rapidly enough to make
backups obsolete as soon as they are created. To avoid issues in
restoring a persistent data store, consider using a RAID or SAN data
storage solution that is fault tolerant, especially for data stores in
production environments.

[[abdaz]][[GSHAG00171]][[more-information]]

=== More Information

For information about planning a high-availability deployment, including
assessing hardware requirements, planning network configuration, and
selecting a topology, see the
link:deployment-planning-guide.html#GSPLG[{productName} Deployment
Planning Guide]. This manual also provides a high-level introduction to
concepts such as:

* {productName} components such as node agents, domains, and clusters
* IIOP load balancing in a cluster
* Message queue failover

For more information about developing applications that take advantage
of high availability features, see the
link:application-development-guide.html#GSDVG[{productName} Application
Development Guide].

For information on how to configure and tune applications and
{productName} for best performance with high availability, see the
link:performance-tuning-guide.html#GSPTG[{productName} Performance Tuning
Guide], which discusses topics such as:

* Tuning persistence frequency and persistence scope
* Checkpointing stateful session beans
* Configuring the JDBC connection pool
* Session size
* Configuring load balancers for best performance