| DRAFT! DRAFT! DRAFT! DRAFT! DRAFT! DRAFT! DRAFT! DRAFT! DRAFT! DRAFT! DRAFT! |
| |
| NOTICE: Some ideas in this paper aren't yet well sorted. Some ideas aren't |
| complete. Some phrasings I'm myself not happy with yet. Some ideas need |
| further explanation. Most of the ideas presented are not final yet. It is |
| mostly a braindump. |
| |
| And did I say yet that this is still a DRAFT!!!!!! ? |
| |
| |
| Title: The Executioner |
| Author: Lars Marowsky-Brée |
| Acknowledgements: David Brower, Oracle |
| Alan Robertson, IBM |
| |
| |
| 1. Summary |
| |
| Every node runs an instance of the fencing daemon ("executioner"). This daemon |
| knows which fencing devices are currently reachable and which nodes can be |
| fenced by them - ie, the current topology of the fencing mechanisms - and will |
| execute such requests on behalf of the CRM and report success or failure. |
| |
| A succesful fencing operation shall imply that the target node of the request |
| can no longer access any shared resources in the cluster, until it has |
| "properly rejoined the cluster". |
| |
| |
| 2. Fencing topology information |
| |
| (Note: modelled after the STONITH model by alanr in heartbeat) |
| |
| 2.1. Static configuration data |
| |
| The mechanisms available for fencing need to be configured on each node. |
| |
| Provision should be made that this file can be the same on all nodes (to ease |
| configuration deployment). It shall be made easy to configure a device for a |
| list of nodes or all nodes. |
| |
| TODO: Can this configuration also be stored in the CIB configuration part? |
| This would collide with the concept that every node is the authoritive source |
| of information about itself. |
| |
| |
| 2.2. Runtime topology |
| |
| Every device needs to support a low-latency "ping" operation, which shall |
| verify whether it can currently be reached from the local node; this shall |
| preferrably not be affected by concurrent access. |
| |
| The devices can either autodiscover their targets (ie, like via STONITH |
| devices), or have to provide means of configuring this list; the list shall |
| only be queried from the device on explicit request by the CRM. |
| |
| The CRM shall be allowed to assume that the same device can fence the same set |
| of nodes for all clients. |
| |
| |
| 3. Interaction with the CRM |
| |
| 3.1. Policies |
| |
| The CRM will be responsible for retrying failed commands; the Executioner |
| shall only make exactly one attempt. It shall not retry the request on another |
| device in particular; it is permissible to retry the command on the same |
| device if it seems like an intermediate failure. |
| |
| |
| 3.2. Queries/Commands issued |
| |
| 3.2.1. Device reachable |
| |
| The Executioner shall verify whether the given device is still reachable by |
| the local node at this point in time. |
| |
| The verification shall be low-latency and low weight and allow for concurrent |
| access from multiple nodes (if appropriate, ie for network switches). |
| |
| |
| 3.2.2. Targets fenceable via device Y |
| |
| The node shall contact the device and return the list of nodes which it can |
| fence. |
| |
| The CRM will ensure that no other node in the partition is accessing device Y |
| right now. |
| |
| Results: |
| 0 Success; list of targets included |
| 1 Failed; device not reached |
| 2 Failed; device failed to return list of targets |
| |
| |
| 3.2.3. Fencing request to fence node X via device Y |
| |
| This is a blocking, synchronous call. The CRM will ensure that no other node |
| in the partition is accessing device Y right now. |
| |
| (This is an issue for certain network powerswitches) |
| |
| The result code need to distinguish between: |
| |
| 0 Fencing request succeeded |
| 1 Failed: device could not be reached, potential network issue |
| 2 Failed: device tried, but failed to acknowledge success |
| 3 Failed: interal device failure |
| |
| TODO: So much differentiation really necessary? Yes, it can help identify |
| quickly whether it is sensible to retry the fencing request via |
| another node to the same device, or whether the next fencing device |
| should be tried immediately; maybe that is overkill. |
| |
| |
| |